| ## Comparing Performance: Loops vs. Iterators |
| |
| To determine whether to use loops or iterators, you need to know which |
| implementation is faster: the version of the `search` function with an explicit |
| `for` loop or the version with iterators. |
| |
| We ran a benchmark by loading the entire contents of _The Adventures of Sherlock |
| Holmes_ by Sir Arthur Conan Doyle into a `String` and looking for the word _the_ |
| in the contents. Here are the results of the benchmark on the version of |
| `search` using the `for` loop and the version using iterators: |
| |
| ```text |
| test bench_search_for ... bench: 19,620,300 ns/iter (+/- 915,700) |
| test bench_search_iter ... bench: 19,234,900 ns/iter (+/- 657,200) |
| ``` |
| |
| The iterator version was slightly faster! We won’t explain the benchmark code |
| here, because the point is not to prove that the two versions are equivalent but |
| to get a general sense of how these two implementations compare |
| performance-wise. |
| |
| For a more comprehensive benchmark, you should check using various texts of |
| various sizes as the `contents`, different words and words of different lengths |
| as the `query`, and all kinds of other variations. The point is this: iterators, |
| although a high-level abstraction, get compiled down to roughly the same code as |
| if you’d written the lower-level code yourself. Iterators are one of Rust’s |
| _zero-cost abstractions_, by which we mean using the abstraction imposes no |
| additional runtime overhead. This is analogous to how Bjarne Stroustrup, the |
| original designer and implementor of C++, defines _zero-overhead_ in |
| “Foundations of C++” (2012): |
| |
| > In general, C++ implementations obey the zero-overhead principle: What you |
| > don’t use, you don’t pay for. And further: What you do use, you couldn’t hand |
| > code any better. |
| |
| As another example, the following code is taken from an audio decoder. The |
| decoding algorithm uses the linear prediction mathematical operation to estimate |
| future values based on a linear function of the previous samples. This code uses |
| an iterator chain to do some math on three variables in scope: a `buffer` slice |
| of data, an array of 12 `coefficients`, and an amount by which to shift data in |
| `qlp_shift`. We’ve declared the variables within this example but not given them |
| any values; although this code doesn’t have much meaning outside of its context, |
| it’s still a concise, real-world example of how Rust translates high-level ideas |
| to low-level code. |
| |
| ```rust,ignore |
| let buffer: &mut [i32]; |
| let coefficients: [i64; 12]; |
| let qlp_shift: i16; |
| |
| for i in 12..buffer.len() { |
| let prediction = coefficients.iter() |
| .zip(&buffer[i - 12..i]) |
| .map(|(&c, &s)| c * s as i64) |
| .sum::<i64>() >> qlp_shift; |
| let delta = buffer[i]; |
| buffer[i] = prediction as i32 + delta; |
| } |
| ``` |
| |
| To calculate the value of `prediction`, this code iterates through each of the |
| 12 values in `coefficients` and uses the `zip` method to pair the coefficient |
| values with the previous 12 values in `buffer`. Then, for each pair, we multiply |
| the values together, sum all the results, and shift the bits in the sum |
| `qlp_shift` bits to the right. |
| |
| Calculations in applications like audio decoders often prioritize performance |
| most highly. Here, we’re creating an iterator, using two adapters, and then |
| consuming the value. What assembly code would this Rust code compile to? Well, |
| as of this writing, it compiles down to the same assembly you’d write by hand. |
| There’s no loop at all corresponding to the iteration over the values in |
| `coefficients`: Rust knows that there are 12 iterations, so it “unrolls” the |
| loop. _Unrolling_ is an optimization that removes the overhead of the loop |
| controlling code and instead generates repetitive code for each iteration of the |
| loop. |
| |
| All of the coefficients get stored in registers, which means accessing the |
| values is very fast. There are no bounds checks on the array access at runtime. |
| All these optimizations that Rust is able to apply make the resulting code |
| extremely efficient. Now that you know this, you can use iterators and closures |
| without fear! They make code seem like it’s higher level but don’t impose a |
| runtime performance penalty for doing so. |
| |
| ## Summary |
| |
| Closures and iterators are Rust features inspired by functional programming |
| language ideas. They contribute to Rust’s capability to clearly express |
| high-level ideas at low-level performance. The implementations of closures and |
| iterators are such that runtime performance is not affected. This is part of |
| Rust’s goal to strive to provide zero-cost abstractions. |
| |
| Now that we’ve improved the expressiveness of our I/O project, let’s look at |
| some more features of `cargo` that will help us share the project with the |
| world. |