blob: 35881dfab709a6ce65f201e3ed004352d64d9da9 [file] [log] [blame] [view] [edit]
## Comparing Performance: Loops vs. Iterators
To determine whether to use loops or iterators, you need to know which
implementation is faster: the version of the `search` function with an explicit
`for` loop or the version with iterators.
We ran a benchmark by loading the entire contents of _The Adventures of Sherlock
Holmes_ by Sir Arthur Conan Doyle into a `String` and looking for the word _the_
in the contents. Here are the results of the benchmark on the version of
`search` using the `for` loop and the version using iterators:
```text
test bench_search_for ... bench: 19,620,300 ns/iter (+/- 915,700)
test bench_search_iter ... bench: 19,234,900 ns/iter (+/- 657,200)
```
The iterator version was slightly faster! We wont explain the benchmark code
here, because the point is not to prove that the two versions are equivalent but
to get a general sense of how these two implementations compare
performance-wise.
For a more comprehensive benchmark, you should check using various texts of
various sizes as the `contents`, different words and words of different lengths
as the `query`, and all kinds of other variations. The point is this: iterators,
although a high-level abstraction, get compiled down to roughly the same code as
if youd written the lower-level code yourself. Iterators are one of Rusts
_zero-cost abstractions_, by which we mean using the abstraction imposes no
additional runtime overhead. This is analogous to how Bjarne Stroustrup, the
original designer and implementor of C++, defines _zero-overhead_ in
Foundations of C++” (2012):
> In general, C++ implementations obey the zero-overhead principle: What you
> dont use, you dont pay for. And further: What you do use, you couldnt hand
> code any better.
As another example, the following code is taken from an audio decoder. The
decoding algorithm uses the linear prediction mathematical operation to estimate
future values based on a linear function of the previous samples. This code uses
an iterator chain to do some math on three variables in scope: a `buffer` slice
of data, an array of 12 `coefficients`, and an amount by which to shift data in
`qlp_shift`. Weve declared the variables within this example but not given them
any values; although this code doesnt have much meaning outside of its context,
its still a concise, real-world example of how Rust translates high-level ideas
to low-level code.
```rust,ignore
let buffer: &mut [i32];
let coefficients: [i64; 12];
let qlp_shift: i16;
for i in 12..buffer.len() {
let prediction = coefficients.iter()
.zip(&buffer[i - 12..i])
.map(|(&c, &s)| c * s as i64)
.sum::<i64>() >> qlp_shift;
let delta = buffer[i];
buffer[i] = prediction as i32 + delta;
}
```
To calculate the value of `prediction`, this code iterates through each of the
12 values in `coefficients` and uses the `zip` method to pair the coefficient
values with the previous 12 values in `buffer`. Then, for each pair, we multiply
the values together, sum all the results, and shift the bits in the sum
`qlp_shift` bits to the right.
Calculations in applications like audio decoders often prioritize performance
most highly. Here, were creating an iterator, using two adapters, and then
consuming the value. What assembly code would this Rust code compile to? Well,
as of this writing, it compiles down to the same assembly youd write by hand.
Theres no loop at all corresponding to the iteration over the values in
`coefficients`: Rust knows that there are 12 iterations, so it unrolls the
loop. _Unrolling_ is an optimization that removes the overhead of the loop
controlling code and instead generates repetitive code for each iteration of the
loop.
All of the coefficients get stored in registers, which means accessing the
values is very fast. There are no bounds checks on the array access at runtime.
All these optimizations that Rust is able to apply make the resulting code
extremely efficient. Now that you know this, you can use iterators and closures
without fear! They make code seem like its higher level but dont impose a
runtime performance penalty for doing so.
## Summary
Closures and iterators are Rust features inspired by functional programming
language ideas. They contribute to Rusts capability to clearly express
high-level ideas at low-level performance. The implementations of closures and
iterators are such that runtime performance is not affected. This is part of
Rusts goal to strive to provide zero-cost abstractions.
Now that weve improved the expressiveness of our I/O project, lets look at
some more features of `cargo` that will help us share the project with the
world.