src/ch15-01-box.md - book - Git at Google

 ## Using `Box<T>` to Point to Data on the Heap

 The most straightforward smart pointer is a _box_, whose type is written
 `Box<T>`. Boxes allow you to store data on the heap rather than the stack. What
 remains on the stack is the pointer to the heap data. Refer to Chapter 4 to
 review the difference between the stack and the heap.

 Boxes don’t have performance overhead, other than storing their data on the heap
 instead of on the stack. But they don’t have many extra capabilities either.
 You’ll use them most often in these situations:

 * When you have a type whose size can’t be known at compile time and you want to
   use a value of that type in a context that requires an exact size
 * When you have a large amount of data and you want to transfer ownership but
   ensure the data won’t be copied when you do so
 * When you want to own a value and you care only that it’s a type that
   implements a particular trait rather than being of a specific type

 We’ll demonstrate the first situation in the
 [“Enabling Recursive Types with
 Boxes”](#enabling-recursive-types-with-boxes)<!-- ignore --> section. In the
 second case, transferring ownership of a large amount of data can take a long
 time because the data is copied around on the stack. To improve performance in
 this situation, we can store the large amount of data on the heap in a box.
 Then, only the small amount of pointer data is copied around on the stack, while
 the data it references stays in one place on the heap. The third case is known
 as a _trait object_, and Chapter 18 devotes an entire section,
 [“Using
 Trait Objects That Allow for Values of Different Types,”][trait-objects]<!--
 ignore --> just to that topic. So what you learn here you’ll apply again in
 Chapter 18!

 ### Using a `Box<T>` to Store Data on the Heap

 Before we discuss the heap storage use case for `Box<T>`, we’ll cover the syntax
 and how to interact with values stored within a `Box<T>`.

 Listing 15-1 shows how to use a box to store an `i32` value on the heap:

 <Listing number="15-1" file-name="src/main.rs" caption="Storing an `i32` value on the heap using a box">

 ```rust
 {{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-01/src/main.rs}}
 ```

 </Listing>

 We define the variable `b` to have the value of a `Box` that points to the value
 `5`, which is allocated on the heap. This program will print `b = 5`; in this
 case, we can access the data in the box similar to how we would if this data
 were on the stack. Just like any owned value, when a box goes out of scope, as
 `b` does at the end of `main`, it will be deallocated. The deallocation happens
 both for the box (stored on the stack) and the data it points to (stored on the
 heap).

 Putting a single value on the heap isn’t very useful, so you won’t use boxes by
 themselves in this way very often. Having values like a single `i32` on the
 stack, where they’re stored by default, is more appropriate in the majority of
 situations. Let’s look at a case where boxes allow us to define types that we
 wouldn’t be allowed to if we didn’t have boxes.

 ### Enabling Recursive Types with Boxes

 A value of _recursive type_ can have another value of the same type as part of
 itself. Recursive types pose an issue because at compile time Rust needs to know
 how much space a type takes up. However, the nesting of values of recursive
 types could theoretically continue infinitely, so Rust can’t know how much space
 the value needs. Because boxes have a known size, we can enable recursive types
 by inserting a box in the recursive type definition.

 As an example of a recursive type, let’s explore the _cons list_. This is a data
 type commonly found in functional programming languages. The cons list type
 we’ll define is straightforward except for the recursion; therefore, the
 concepts in the example we’ll work with will be useful any time you get into
 more complex situations involving recursive types.

 #### More Information About the Cons List

 A _cons list_ is a data structure that comes from the Lisp programming language
 and its dialects and is made up of nested pairs, and is the Lisp version of a
 linked list. Its name comes from the `cons` function (short for “construct
 function”) in Lisp that constructs a new pair from its two arguments. By calling
 `cons` on a pair consisting of a value and another pair, we can construct cons
 lists made up of recursive pairs.

 For example, here’s a pseudocode representation of a cons list containing the
 list 1, 2, 3 with each pair in parentheses:

 ```text
 (1, (2, (3, Nil)))
 ```

 Each item in a cons list contains two elements: the value of the current item
 and the next item. The last item in the list contains only a value called `Nil`
 without a next item. A cons list is produced by recursively calling the `cons`
 function. The canonical name to denote the base case of the recursion is `Nil`.
 Note that this is not the same as the “null” or “nil” concept in Chapter 6,
 which is an invalid or absent value.

 The cons list isn’t a commonly used data structure in Rust. Most of the time
 when you have a list of items in Rust, `Vec<T>` is a better choice to use.
 Other, more complex recursive data types _are_ useful in various situations, but
 by starting with the cons list in this chapter, we can explore how boxes let us
 define a recursive data type without much distraction.

 Listing 15-2 contains an enum definition for a cons list. Note that this code
 won’t compile yet because the `List` type doesn’t have a known size, which we’ll
 demonstrate.

 <Listing number="15-2" file-name="src/main.rs" caption="The first attempt at defining an enum to represent a cons list data structure of `i32` values">

 ```rust,ignore,does_not_compile
 {{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-02/src/main.rs:here}}
 ```

 </Listing>

 > Note: We’re implementing a cons list that holds only `i32` values for the
 > purposes of this example. We could have implemented it using generics, as we
 > discussed in Chapter 10, to define a cons list type that could store values of
 > any type.

 Using the `List` type to store the list `1, 2, 3` would look like the code in
 Listing 15-3:

 <Listing number="15-3" file-name="src/main.rs" caption="Using the `List` enum to store the list `1, 2, 3`">

 ```rust,ignore,does_not_compile
 {{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-03/src/main.rs:here}}
 ```

 </Listing>

 The first `Cons` value holds `1` and another `List` value. This `List` value is
 another `Cons` value that holds `2` and another `List` value. This `List` value
 is one more `Cons` value that holds `3` and a `List` value, which is finally
 `Nil`, the non-recursive variant that signals the end of the list.

 If we try to compile the code in Listing 15-3, we get the error shown in Listing
 15-4:

 <Listing number="15-4" file-name="output.txt" caption="The error we get when attempting to define a recursive enum">

 ```console
 {{#include ../listings/ch15-smart-pointers/listing-15-03/output.txt}}
 ```

 </Listing>

 The error shows this type “has infinite size.” The reason is that we’ve defined
 `List` with a variant that is recursive: it holds another value of itself
 directly. As a result, Rust can’t figure out how much space it needs to store a
 `List` value. Let’s break down why we get this error. First, we’ll look at how
 Rust decides how much space it needs to store a value of a non-recursive type.

 #### Computing the Size of a Non-Recursive Type

 Recall the `Message` enum we defined in Listing 6-2 when we discussed enum
 definitions in Chapter 6:

 ```rust
 {{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/listing-06-02/src/main.rs:here}}
 ```

 To determine how much space to allocate for a `Message` value, Rust goes through
 each of the variants to see which variant needs the most space. Rust sees that
 `Message::Quit` doesn’t need any space, `Message::Move` needs enough space to
 store two `i32` values, and so forth. Because only one variant will be used, the
 most space a `Message` value will need is the space it would take to store the
 largest of its variants.

 Contrast this with what happens when Rust tries to determine how much space a
 recursive type like the `List` enum in Listing 15-2 needs. The compiler starts
 by looking at the `Cons` variant, which holds a value of type `i32` and a value
 of type `List`. Therefore, `Cons` needs an amount of space equal to the size of
 an `i32` plus the size of a `List`. To figure out how much memory the `List`
 type needs, the compiler looks at the variants, starting with the `Cons`
 variant. The `Cons` variant holds a value of type `i32` and a value of type
 `List`, and this process continues infinitely, as shown in Figure 15-1.

 <img alt="An infinite Cons list" src="img/trpl15-01.svg" class="center" style="width: 50%;" />

 <span class="caption">Figure 15-1: An infinite `List` consisting of infinite
 `Cons` variants</span>

 #### Using `Box<T>` to Get a Recursive Type with a Known Size

 Because Rust can’t figure out how much space to allocate for recursively defined
 types, the compiler gives an error with this helpful suggestion:

 <!-- manual-regeneration
 after doing automatic regeneration, look at listings/ch15-smart-pointers/listing-15-03/output.txt and copy the relevant line
 -->

 ```text
 help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
   |
 2 |     Cons(i32, Box<List>),
   |               ++++    +
 ```

 In this suggestion, “indirection” means that instead of storing a value
 directly, we should change the data structure to store the value indirectly by
 storing a pointer to the value instead.

 Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>`
 needs: a pointer’s size doesn’t change based on the amount of data it’s pointing
 to. This means we can put a `Box<T>` inside the `Cons` variant instead of
 another `List` value directly. The `Box<T>` will point to the next `List` value
 that will be on the heap rather than inside the `Cons` variant. Conceptually, we
 still have a list, created with lists holding other lists, but this
 implementation is now more like placing the items next to one another rather
 than inside one another.

 We can change the definition of the `List` enum in Listing 15-2 and the usage of
 the `List` in Listing 15-3 to the code in Listing 15-5, which will compile:

 <Listing number="15-5" file-name="src/main.rs" caption="Definition of `List` that uses `Box<T>` in order to have a known size">

 ```rust
 {{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-05/src/main.rs}}
 ```

 </Listing>

 The `Cons` variant needs the size of an `i32` plus the space to store the box’s
 pointer data. The `Nil` variant stores no values, so it needs less space than
 the `Cons` variant. We now know that any `List` value will take up the size of
 an `i32` plus the size of a box’s pointer data. By using a box, we’ve broken the
 infinite, recursive chain, so the compiler can figure out the size it needs to
 store a `List` value. Figure 15-2 shows what the `Cons` variant looks like now.

 <img alt="A finite Cons list" src="img/trpl15-02.svg" class="center" />

 <span class="caption">Figure 15-2: A `List` that is not infinitely sized because
 `Cons` holds a `Box`</span>

 Boxes provide only the indirection and heap allocation; they don’t have any
 other special capabilities, like those we’ll see with the other smart pointer
 types. They also don’t have the performance overhead that these special
 capabilities incur, so they can be useful in cases like the cons list where the
 indirection is the only feature we need. We’ll look at more use cases for boxes
 in Chapter 18, too.

 The `Box<T>` type is a smart pointer because it implements the `Deref` trait,
 which allows `Box<T>` values to be treated like references. When a `Box<T>`
 value goes out of scope, the heap data that the box is pointing to is cleaned up
 as well because of the `Drop` trait implementation. These two traits will be
 even more important to the functionality provided by the other smart pointer
 types we’ll discuss in the rest of this chapter. Let’s explore these two traits
 in more detail.

 [trait-objects]: ch18-02-trait-objects.html#using-trait-objects-that-allow-for-values-of-different-types
	## Using `Box<T>` to Point to Data on the Heap

	The most straightforward smart pointer is a _box_, whose type is written
	`Box<T>`. Boxes allow you to store data on the heap rather than the stack. What
	remains on the stack is the pointer to the heap data. Refer to Chapter 4 to
	review the difference between the stack and the heap.

	Boxes don’t have performance overhead, other than storing their data on the heap
	instead of on the stack. But they don’t have many extra capabilities either.
	You’ll use them most often in these situations:

	* When you have a type whose size can’t be known at compile time and you want to
	use a value of that type in a context that requires an exact size
	* When you have a large amount of data and you want to transfer ownership but
	ensure the data won’t be copied when you do so
	* When you want to own a value and you care only that it’s a type that
	implements a particular trait rather than being of a specific type

	We’ll demonstrate the first situation in the
	[“Enabling Recursive Types with
	Boxes”](#enabling-recursive-types-with-boxes)<!-- ignore --> section. In the
	second case, transferring ownership of a large amount of data can take a long
	time because the data is copied around on the stack. To improve performance in
	this situation, we can store the large amount of data on the heap in a box.
	Then, only the small amount of pointer data is copied around on the stack, while
	the data it references stays in one place on the heap. The third case is known
	as a _trait object_, and Chapter 18 devotes an entire section,
	[“Using
	Trait Objects That Allow for Values of Different Types,”][trait-objects]<!--
	ignore --> just to that topic. So what you learn here you’ll apply again in
	Chapter 18!

	### Using a `Box<T>` to Store Data on the Heap

	Before we discuss the heap storage use case for `Box<T>`, we’ll cover the syntax
	and how to interact with values stored within a `Box<T>`.

	Listing 15-1 shows how to use a box to store an `i32` value on the heap:

	<Listing number="15-1" file-name="src/main.rs" caption="Storing an `i32` value on the heap using a box">

	```rust
	{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-01/src/main.rs}}
	```

	</Listing>

	We define the variable `b` to have the value of a `Box` that points to the value
	`5`, which is allocated on the heap. This program will print `b = 5`; in this
	case, we can access the data in the box similar to how we would if this data
	were on the stack. Just like any owned value, when a box goes out of scope, as
	`b` does at the end of `main`, it will be deallocated. The deallocation happens
	both for the box (stored on the stack) and the data it points to (stored on the
	heap).

	Putting a single value on the heap isn’t very useful, so you won’t use boxes by
	themselves in this way very often. Having values like a single `i32` on the
	stack, where they’re stored by default, is more appropriate in the majority of
	situations. Let’s look at a case where boxes allow us to define types that we
	wouldn’t be allowed to if we didn’t have boxes.

	### Enabling Recursive Types with Boxes

	A value of _recursive type_ can have another value of the same type as part of
	itself. Recursive types pose an issue because at compile time Rust needs to know
	how much space a type takes up. However, the nesting of values of recursive
	types could theoretically continue infinitely, so Rust can’t know how much space
	the value needs. Because boxes have a known size, we can enable recursive types
	by inserting a box in the recursive type definition.

	As an example of a recursive type, let’s explore the _cons list_. This is a data
	type commonly found in functional programming languages. The cons list type
	we’ll define is straightforward except for the recursion; therefore, the
	concepts in the example we’ll work with will be useful any time you get into
	more complex situations involving recursive types.

	#### More Information About the Cons List

	A _cons list_ is a data structure that comes from the Lisp programming language
	and its dialects and is made up of nested pairs, and is the Lisp version of a
	linked list. Its name comes from the `cons` function (short for “construct
	function”) in Lisp that constructs a new pair from its two arguments. By calling
	`cons` on a pair consisting of a value and another pair, we can construct cons
	lists made up of recursive pairs.

	For example, here’s a pseudocode representation of a cons list containing the
	list 1, 2, 3 with each pair in parentheses:

	```text
	(1, (2, (3, Nil)))
	```

	Each item in a cons list contains two elements: the value of the current item
	and the next item. The last item in the list contains only a value called `Nil`
	without a next item. A cons list is produced by recursively calling the `cons`
	function. The canonical name to denote the base case of the recursion is `Nil`.
	Note that this is not the same as the “null” or “nil” concept in Chapter 6,
	which is an invalid or absent value.

	The cons list isn’t a commonly used data structure in Rust. Most of the time
	when you have a list of items in Rust, `Vec<T>` is a better choice to use.
	Other, more complex recursive data types _are_ useful in various situations, but
	by starting with the cons list in this chapter, we can explore how boxes let us
	define a recursive data type without much distraction.

	Listing 15-2 contains an enum definition for a cons list. Note that this code
	won’t compile yet because the `List` type doesn’t have a known size, which we’ll
	demonstrate.

	<Listing number="15-2" file-name="src/main.rs" caption="The first attempt at defining an enum to represent a cons list data structure of `i32` values">

	```rust,ignore,does_not_compile
	{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-02/src/main.rs:here}}
	```

	</Listing>

	> Note: We’re implementing a cons list that holds only `i32` values for the
	> purposes of this example. We could have implemented it using generics, as we
	> discussed in Chapter 10, to define a cons list type that could store values of
	> any type.

	Using the `List` type to store the list `1, 2, 3` would look like the code in
	Listing 15-3:

	<Listing number="15-3" file-name="src/main.rs" caption="Using the `List` enum to store the list `1, 2, 3`">

	```rust,ignore,does_not_compile
	{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-03/src/main.rs:here}}
	```

	</Listing>

	The first `Cons` value holds `1` and another `List` value. This `List` value is
	another `Cons` value that holds `2` and another `List` value. This `List` value
	is one more `Cons` value that holds `3` and a `List` value, which is finally
	`Nil`, the non-recursive variant that signals the end of the list.

	If we try to compile the code in Listing 15-3, we get the error shown in Listing
	15-4:

	<Listing number="15-4" file-name="output.txt" caption="The error we get when attempting to define a recursive enum">

	```console
	{{#include ../listings/ch15-smart-pointers/listing-15-03/output.txt}}
	```

	</Listing>

	The error shows this type “has infinite size.” The reason is that we’ve defined
	`List` with a variant that is recursive: it holds another value of itself
	directly. As a result, Rust can’t figure out how much space it needs to store a
	`List` value. Let’s break down why we get this error. First, we’ll look at how
	Rust decides how much space it needs to store a value of a non-recursive type.

	#### Computing the Size of a Non-Recursive Type

	Recall the `Message` enum we defined in Listing 6-2 when we discussed enum
	definitions in Chapter 6:

	```rust
	{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/listing-06-02/src/main.rs:here}}
	```

	To determine how much space to allocate for a `Message` value, Rust goes through
	each of the variants to see which variant needs the most space. Rust sees that
	`Message::Quit` doesn’t need any space, `Message::Move` needs enough space to
	store two `i32` values, and so forth. Because only one variant will be used, the
	most space a `Message` value will need is the space it would take to store the
	largest of its variants.

	Contrast this with what happens when Rust tries to determine how much space a
	recursive type like the `List` enum in Listing 15-2 needs. The compiler starts
	by looking at the `Cons` variant, which holds a value of type `i32` and a value
	of type `List`. Therefore, `Cons` needs an amount of space equal to the size of
	an `i32` plus the size of a `List`. To figure out how much memory the `List`
	type needs, the compiler looks at the variants, starting with the `Cons`
	variant. The `Cons` variant holds a value of type `i32` and a value of type
	`List`, and this process continues infinitely, as shown in Figure 15-1.

	<img alt="An infinite Cons list" src="img/trpl15-01.svg" class="center" style="width: 50%;" />

	<span class="caption">Figure 15-1: An infinite `List` consisting of infinite
	`Cons` variants</span>

	#### Using `Box<T>` to Get a Recursive Type with a Known Size

	Because Rust can’t figure out how much space to allocate for recursively defined
	types, the compiler gives an error with this helpful suggestion:

	<!-- manual-regeneration
	after doing automatic regeneration, look at listings/ch15-smart-pointers/listing-15-03/output.txt and copy the relevant line
	-->

	```text
	help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
	\|
	2 \| Cons(i32, Box<List>),
	\| ++++ +
	```

	In this suggestion, “indirection” means that instead of storing a value
	directly, we should change the data structure to store the value indirectly by
	storing a pointer to the value instead.

	Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>`
	needs: a pointer’s size doesn’t change based on the amount of data it’s pointing
	to. This means we can put a `Box<T>` inside the `Cons` variant instead of
	another `List` value directly. The `Box<T>` will point to the next `List` value
	that will be on the heap rather than inside the `Cons` variant. Conceptually, we
	still have a list, created with lists holding other lists, but this
	implementation is now more like placing the items next to one another rather
	than inside one another.

	We can change the definition of the `List` enum in Listing 15-2 and the usage of
	the `List` in Listing 15-3 to the code in Listing 15-5, which will compile:

	<Listing number="15-5" file-name="src/main.rs" caption="Definition of `List` that uses `Box<T>` in order to have a known size">

	```rust
	{{#rustdoc_include ../listings/ch15-smart-pointers/listing-15-05/src/main.rs}}
	```

	</Listing>

	The `Cons` variant needs the size of an `i32` plus the space to store the box’s
	pointer data. The `Nil` variant stores no values, so it needs less space than
	the `Cons` variant. We now know that any `List` value will take up the size of
	an `i32` plus the size of a box’s pointer data. By using a box, we’ve broken the
	infinite, recursive chain, so the compiler can figure out the size it needs to
	store a `List` value. Figure 15-2 shows what the `Cons` variant looks like now.

	<img alt="A finite Cons list" src="img/trpl15-02.svg" class="center" />

	<span class="caption">Figure 15-2: A `List` that is not infinitely sized because
	`Cons` holds a `Box`</span>

	Boxes provide only the indirection and heap allocation; they don’t have any
	other special capabilities, like those we’ll see with the other smart pointer
	types. They also don’t have the performance overhead that these special
	capabilities incur, so they can be useful in cases like the cons list where the
	indirection is the only feature we need. We’ll look at more use cases for boxes
	in Chapter 18, too.

	The `Box<T>` type is a smart pointer because it implements the `Deref` trait,
	which allows `Box<T>` values to be treated like references. When a `Box<T>`
	value goes out of scope, the heap data that the box is pointing to is cleaned up
	as well because of the `Drop` trait implementation. These two traits will be
	even more important to the functionality provided by the other smart pointer
	types we’ll discuss in the rest of this chapter. Let’s explore these two traits
	in more detail.

	[trait-objects]: ch18-02-trait-objects.html#using-trait-objects-that-allow-for-values-of-different-types