blob: b902d3251a34b7d3bada8f2b9d6623a587f8d279 [file] [log] [blame] [view] [edit]
## Using `Box<T>` to Point to Data on the Heap
The most straightforward smart pointer is a _box_, whose type is written
`Box<T>`. Boxes allow you to store data on the heap rather than the stack. What
remains on the stack is the pointer to the heap data. Refer to Chapter 4 to
review the difference between the stack and the heap.
Boxes dont have performance overhead, other than storing their data on the heap
instead of on the stack. But they dont have many extra capabilities either.
Youll use them most often in these situations:
* When you have a type whose size cant be known at compile time and you want to
use a value of that type in a context that requires an exact size
* When you have a large amount of data and you want to transfer ownership but
ensure the data wont be copied when you do so
* When you want to own a value and you care only that its a type that
implements a particular trait rather than being of a specific type
Well demonstrate the first situation in the
[“Enabling Recursive Types with
Boxes”](#enabling-recursive-types-with-boxes)<!-- ignore --> section. In the
second case, transferring ownership of a large amount of data can take a long
time because the data is copied around on the stack. To improve performance in
this situation, we can store the large amount of data on the heap in a box.
Then, only the small amount of pointer data is copied around on the stack, while
the data it references stays in one place on the heap. The third case is known
as a _trait object_, and Chapter 18 devotes an entire section,
[“Using
Trait Objects That Allow for Values of Different Types,”][trait-objects]<!--
ignore --> just to that topic. So what you learn here youll apply again in
Chapter 18!
### Using a `Box<T>` to Store Data on the Heap
Before we discuss the heap storage use case for `Box<T>`, well cover the syntax
and how to interact with values stored within a `Box<T>`.
Listing 15-1 shows how to use a box to store an `i32` value on the heap:
<Listing number="15-1" file-name="src/main.rs" caption="Storing an `i32` value on the heap using a box">
```rust
{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-01/src/main.rs}}
```
</Listing>
We define the variable `b` to have the value of a `Box` that points to the value
`5`, which is allocated on the heap. This program will print `b = 5`; in this
case, we can access the data in the box similar to how we would if this data
were on the stack. Just like any owned value, when a box goes out of scope, as
`b` does at the end of `main`, it will be deallocated. The deallocation happens
both for the box (stored on the stack) and the data it points to (stored on the
heap).
Putting a single value on the heap isnt very useful, so you wont use boxes by
themselves in this way very often. Having values like a single `i32` on the
stack, where theyre stored by default, is more appropriate in the majority of
situations. Lets look at a case where boxes allow us to define types that we
wouldnt be allowed to if we didnt have boxes.
### Enabling Recursive Types with Boxes
A value of _recursive type_ can have another value of the same type as part of
itself. Recursive types pose an issue because at compile time Rust needs to know
how much space a type takes up. However, the nesting of values of recursive
types could theoretically continue infinitely, so Rust cant know how much space
the value needs. Because boxes have a known size, we can enable recursive types
by inserting a box in the recursive type definition.
As an example of a recursive type, lets explore the _cons list_. This is a data
type commonly found in functional programming languages. The cons list type
well define is straightforward except for the recursion; therefore, the
concepts in the example well work with will be useful any time you get into
more complex situations involving recursive types.
#### More Information About the Cons List
A _cons list_ is a data structure that comes from the Lisp programming language
and its dialects and is made up of nested pairs, and is the Lisp version of a
linked list. Its name comes from the `cons` function (short for construct
function”) in Lisp that constructs a new pair from its two arguments. By calling
`cons` on a pair consisting of a value and another pair, we can construct cons
lists made up of recursive pairs.
For example, heres a pseudocode representation of a cons list containing the
list 1, 2, 3 with each pair in parentheses:
```text
(1, (2, (3, Nil)))
```
Each item in a cons list contains two elements: the value of the current item
and the next item. The last item in the list contains only a value called `Nil`
without a next item. A cons list is produced by recursively calling the `cons`
function. The canonical name to denote the base case of the recursion is `Nil`.
Note that this is not the same as the null or nil concept in Chapter 6,
which is an invalid or absent value.
The cons list isnt a commonly used data structure in Rust. Most of the time
when you have a list of items in Rust, `Vec<T>` is a better choice to use.
Other, more complex recursive data types _are_ useful in various situations, but
by starting with the cons list in this chapter, we can explore how boxes let us
define a recursive data type without much distraction.
Listing 15-2 contains an enum definition for a cons list. Note that this code
wont compile yet because the `List` type doesnt have a known size, which well
demonstrate.
<Listing number="15-2" file-name="src/main.rs" caption="The first attempt at defining an enum to represent a cons list data structure of `i32` values">
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-02/src/main.rs:here}}
```
</Listing>
> Note: Were implementing a cons list that holds only `i32` values for the
> purposes of this example. We could have implemented it using generics, as we
> discussed in Chapter 10, to define a cons list type that could store values of
> any type.
Using the `List` type to store the list `1, 2, 3` would look like the code in
Listing 15-3:
<Listing number="15-3" file-name="src/main.rs" caption="Using the `List` enum to store the list `1, 2, 3`">
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-03/src/main.rs:here}}
```
</Listing>
The first `Cons` value holds `1` and another `List` value. This `List` value is
another `Cons` value that holds `2` and another `List` value. This `List` value
is one more `Cons` value that holds `3` and a `List` value, which is finally
`Nil`, the non-recursive variant that signals the end of the list.
If we try to compile the code in Listing 15-3, we get the error shown in Listing
15-4:
<Listing number="15-4" file-name="output.txt" caption="The error we get when attempting to define a recursive enum">
```console
{{#include ../listings/ch15-smart-pointers/listing-15-03/output.txt}}
```
</Listing>
The error shows this type has infinite size.” The reason is that weve defined
`List` with a variant that is recursive: it holds another value of itself
directly. As a result, Rust cant figure out how much space it needs to store a
`List` value. Lets break down why we get this error. First, well look at how
Rust decides how much space it needs to store a value of a non-recursive type.
#### Computing the Size of a Non-Recursive Type
Recall the `Message` enum we defined in Listing 6-2 when we discussed enum
definitions in Chapter 6:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/listing-06-02/src/main.rs:here}}
```
To determine how much space to allocate for a `Message` value, Rust goes through
each of the variants to see which variant needs the most space. Rust sees that
`Message::Quit` doesnt need any space, `Message::Move` needs enough space to
store two `i32` values, and so forth. Because only one variant will be used, the
most space a `Message` value will need is the space it would take to store the
largest of its variants.
Contrast this with what happens when Rust tries to determine how much space a
recursive type like the `List` enum in Listing 15-2 needs. The compiler starts
by looking at the `Cons` variant, which holds a value of type `i32` and a value
of type `List`. Therefore, `Cons` needs an amount of space equal to the size of
an `i32` plus the size of a `List`. To figure out how much memory the `List`
type needs, the compiler looks at the variants, starting with the `Cons`
variant. The `Cons` variant holds a value of type `i32` and a value of type
`List`, and this process continues infinitely, as shown in Figure 15-1.
<img alt="An infinite Cons list" src="img/trpl15-01.svg" class="center" style="width: 50%;" />
<span class="caption">Figure 15-1: An infinite `List` consisting of infinite
`Cons` variants</span>
#### Using `Box<T>` to Get a Recursive Type with a Known Size
Because Rust cant figure out how much space to allocate for recursively defined
types, the compiler gives an error with this helpful suggestion:
<!-- manual-regeneration
after doing automatic regeneration, look at listings/ch15-smart-pointers/listing-15-03/output.txt and copy the relevant line
-->
```text
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
|
2 | Cons(i32, Box<List>),
| ++++ +
```
In this suggestion, indirection means that instead of storing a value
directly, we should change the data structure to store the value indirectly by
storing a pointer to the value instead.
Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>`
needs: a pointers size doesnt change based on the amount of data its pointing
to. This means we can put a `Box<T>` inside the `Cons` variant instead of
another `List` value directly. The `Box<T>` will point to the next `List` value
that will be on the heap rather than inside the `Cons` variant. Conceptually, we
still have a list, created with lists holding other lists, but this
implementation is now more like placing the items next to one another rather
than inside one another.
We can change the definition of the `List` enum in Listing 15-2 and the usage of
the `List` in Listing 15-3 to the code in Listing 15-5, which will compile:
<Listing number="15-5" file-name="src/main.rs" caption="Definition of `List` that uses `Box<T>` in order to have a known size">
```rust
{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-05/src/main.rs}}
```
</Listing>
The `Cons` variant needs the size of an `i32` plus the space to store the boxs
pointer data. The `Nil` variant stores no values, so it needs less space than
the `Cons` variant. We now know that any `List` value will take up the size of
an `i32` plus the size of a boxs pointer data. By using a box, weve broken the
infinite, recursive chain, so the compiler can figure out the size it needs to
store a `List` value. Figure 15-2 shows what the `Cons` variant looks like now.
<img alt="A finite Cons list" src="img/trpl15-02.svg" class="center" />
<span class="caption">Figure 15-2: A `List` that is not infinitely sized because
`Cons` holds a `Box`</span>
Boxes provide only the indirection and heap allocation; they dont have any
other special capabilities, like those well see with the other smart pointer
types. They also dont have the performance overhead that these special
capabilities incur, so they can be useful in cases like the cons list where the
indirection is the only feature we need. Well look at more use cases for boxes
in Chapter 18, too.
The `Box<T>` type is a smart pointer because it implements the `Deref` trait,
which allows `Box<T>` values to be treated like references. When a `Box<T>`
value goes out of scope, the heap data that the box is pointing to is cleaned up
as well because of the `Drop` trait implementation. These two traits will be
even more important to the functionality provided by the other smart pointer
types well discuss in the rest of this chapter. Lets explore these two traits
in more detail.
[trait-objects]: ch18-02-trait-objects.html#using-trait-objects-that-allow-for-values-of-different-types