| <!-- DO NOT EDIT THIS FILE. |
| |
| This file is periodically generated from the content in the `/src/` |
| directory, so all fixes need to be made in `/src/`. |
| --> |
| |
| [TOC] |
| |
| # Smart Pointers |
| |
| A *pointer* is a general concept for a variable that contains an address in |
| memory. This address refers to, or “points at,” some other data. The most |
| common kind of pointer in Rust is a reference, which you learned about in |
| Chapter 4. References are indicated by the `&` symbol and borrow the value they |
| point to. They don’t have any special capabilities other than referring to |
| data, and they have no overhead. |
| |
| *Smart pointers*, on the other hand, are data structures that act like a |
| pointer but also have additional metadata and capabilities. The concept of |
| smart pointers isn’t unique to Rust: smart pointers originated in C++ and exist |
| in other languages as well. Rust has a variety of smart pointers defined in the |
| standard library that provide functionality beyond that provided by references. |
| To explore the general concept, we’ll look at a couple of different examples of |
| smart pointers, including a *reference counting* smart pointer type. This |
| pointer enables you to allow data to have multiple owners by keeping track of |
| the number of owners and, when no owners remain, cleaning up the data. |
| |
| Rust, with its concept of ownership and borrowing, has an additional difference |
| between references and smart pointers: while references only borrow data, in |
| many cases smart pointers *own* the data they point to. |
| |
| Though we didn’t call them as such at the time, we’ve already encountered a few |
| smart pointers in this book, including `String` and `Vec<T>` in Chapter 8. Both |
| of these types count as smart pointers because they own some memory and allow |
| you to manipulate it. They also have metadata and extra capabilities or |
| guarantees. `String`, for example, stores its capacity as metadata and has the |
| extra ability to ensure its data will always be valid UTF-8. |
| |
| Smart pointers are usually implemented using structs. Unlike an ordinary |
| struct, smart pointers implement the `Deref` and `Drop` traits. The `Deref` |
| trait allows an instance of the smart pointer struct to behave like a reference |
| so you can write your code to work with either references or smart pointers. |
| The `Drop` trait allows you to customize the code that’s run when an instance |
| of the smart pointer goes out of scope. In this chapter, we’ll discuss both |
| traits and demonstrate why they’re important to smart pointers. |
| |
| Given that the smart pointer pattern is a general design pattern used |
| frequently in Rust, this chapter won’t cover every existing smart pointer. Many |
| libraries have their own smart pointers, and you can even write your own. We’ll |
| cover the most common smart pointers in the standard library: |
| |
| * `Box<T>`, for allocating values on the heap |
| * `Rc<T>`, a reference counting type that enables multiple ownership |
| * `Ref<T>` and `RefMut<T>`, accessed through `RefCell<T>`, a type that enforces |
| the borrowing rules at runtime instead of compile time |
| |
| In addition, we’ll cover the *interior mutability* pattern where an immutable |
| type exposes an API for mutating an interior value. We’ll also discuss |
| *reference cycles*: how they can leak memory and how to prevent them. |
| |
| Let’s dive in! |
| |
| ## Using Box<T> to Point to Data on the Heap |
| |
| The most straightforward smart pointer is a *box*, whose type is written |
| `Box<T>`. Boxes allow you to store data on the heap rather than the stack. What |
| remains on the stack is the pointer to the heap data. Refer to Chapter 4 to |
| review the difference between the stack and the heap. |
| |
| Boxes don’t have performance overhead, other than storing their data on the |
| heap instead of on the stack. But they don’t have many extra capabilities |
| either. You’ll use them most often in these situations: |
| |
| * When you have a type whose size can’t be known at compile time and you want |
| to use a value of that type in a context that requires an exact size |
| * When you have a large amount of data and you want to transfer ownership but |
| ensure the data won’t be copied when you do so |
| * When you want to own a value and you care only that it’s a type that |
| implements a particular trait rather than being of a specific type |
| |
| We’ll demonstrate the first situation in “Enabling Recursive Types with Boxes” |
| on page XX. In the second case, transferring ownership of a large amount of |
| data can take a long time because the data is copied around on the stack. To |
| improve performance in this situation, we can store the large amount of data on |
| the heap in a box. Then, only the small amount of pointer data is copied around |
| on the stack, while the data it references stays in one place on the heap. The |
| third case is known as a *trait object*, and “Using Trait Objects That Allow |
| for Values of Different Types” on page XX is devoted to that topic. So what you |
| learn here you’ll apply again in that section! |
| |
| ### Using Box<T> to Store Data on the Heap |
| |
| Before we discuss the heap storage use case for `Box<T>`, we’ll cover the |
| syntax and how to interact with values stored within a `Box<T>`. |
| |
| Listing 15-1 shows how to use a box to store an `i32` value on the heap. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let b = Box::new(5); |
| println!("b = {b}"); |
| } |
| ``` |
| |
| Listing 15-1: Storing an `i32` value on the heap using a box |
| |
| We define the variable `b` to have the value of a `Box` that points to the |
| value `5`, which is allocated on the heap. This program will print `b = 5`; in |
| this case, we can access the data in the box similar to how we would if this |
| data were on the stack. Just like any owned value, when a box goes out of |
| scope, as `b` does at the end of `main`, it will be deallocated. The |
| deallocation happens both for the box (stored on the stack) and the data it |
| points to (stored on the heap). |
| |
| Putting a single value on the heap isn’t very useful, so you won’t use boxes by |
| themselves in this way very often. Having values like a single `i32` on the |
| stack, where they’re stored by default, is more appropriate in the majority of |
| situations. Let’s look at a case where boxes allow us to define types that we |
| wouldn’t be allowed to define if we didn’t have boxes. |
| |
| ### Enabling Recursive Types with Boxes |
| |
| A value of a *recursive type* can have another value of the same type as part |
| of itself. Recursive types pose an issue because at compile time Rust needs to |
| know how much space a type takes up. However, the nesting of values of |
| recursive types could theoretically continue infinitely, so Rust can’t know how |
| much space the value needs. Because boxes have a known size, we can enable |
| recursive types by inserting a box in the recursive type definition. |
| |
| As an example of a recursive type, let’s explore the *cons list*. This is a |
| data type commonly found in functional programming languages. The cons list |
| type we’ll define is straightforward except for the recursion; therefore, the |
| concepts in the example we’ll work with will be useful any time you get into |
| more complex situations involving recursive types. |
| |
| #### More Information About the Cons List |
| |
| A *cons list* is a data structure that comes from the Lisp programming language |
| and its dialects, is made up of nested pairs, and is the Lisp version of a |
| linked list. Its name comes from the `cons` function (short for *construct |
| function*) in Lisp that constructs a new pair from its two arguments. By |
| calling `cons` on a pair consisting of a value and another pair, we can |
| construct cons lists made up of recursive pairs. |
| |
| For example, here’s a pseudocode representation of a cons list containing the |
| list `1, 2, 3` with each pair in parentheses: |
| |
| ``` |
| (1, (2, (3, Nil))) |
| ``` |
| |
| Each item in a cons list contains two elements: the value of the current item |
| and the next item. The last item in the list contains only a value called `Nil` |
| without a next item. A cons list is produced by recursively calling the `cons` |
| function. The canonical name to denote the base case of the recursion is `Nil`. |
| Note that this is not the same as the “null” or “nil” concept in Chapter 6, |
| which is an invalid or absent value. |
| |
| The cons list isn’t a commonly used data structure in Rust. Most of the time |
| when you have a list of items in Rust, `Vec<T>` is a better choice to use. |
| Other, more complex recursive data types *are* useful in various situations, |
| but by starting with the cons list in this chapter, we can explore how boxes |
| let us define a recursive data type without much distraction. |
| |
| Listing 15-2 contains an enum definition for a cons list. Note that this code |
| won’t compile yet because the `List` type doesn’t have a known size, which |
| we’ll demonstrate. |
| |
| Filename: src/main.rs |
| |
| ``` |
| enum List { |
| Cons(i32, List), |
| Nil, |
| } |
| ``` |
| |
| Listing 15-2: The first attempt at defining an enum to represent a cons list |
| data structure of `i32` values |
| |
| > Note: We’re implementing a cons list that holds only `i32` values for the |
| purposes of this example. We could have implemented it using generics, as we |
| discussed in Chapter 10, to define a cons list type that could store values of |
| any type. |
| |
| Using the `List` type to store the list `1, 2, 3` would look like the code in |
| Listing 15-3. |
| |
| Filename: src/main.rs |
| |
| ``` |
| --snip-- |
| |
| use crate::List::{Cons, Nil}; |
| |
| fn main() { |
| let list = Cons(1, Cons(2, Cons(3, Nil))); |
| } |
| ``` |
| |
| Listing 15-3: Using the `List` enum to store the list `1, 2, 3` |
| |
| The first `Cons` value holds `1` and another `List` value. This `List` value is |
| another `Cons` value that holds `2` and another `List` value. This `List` value |
| is one more `Cons` value that holds `3` and a `List` value, which is finally |
| `Nil`, the non-recursive variant that signals the end of the list. |
| |
| If we try to compile the code in Listing 15-3, we get the error shown in |
| Listing 15-4. |
| |
| ``` |
| error[E0072]: recursive type `List` has infinite size |
| --> src/main.rs:1:1 |
| | |
| 1 | enum List { |
| | ^^^^^^^^^ recursive type has infinite size |
| 2 | Cons(i32, List), |
| | ---- recursive without indirection |
| | |
| help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List` |
| representable |
| | |
| 2 | Cons(i32, Box<List>), |
| | ++++ + |
| ``` |
| |
| Listing 15-4: The error we get when attempting to define a recursive enum |
| |
| The error shows this type “has infinite size.” The reason is that we’ve defined |
| `List` with a variant that is recursive: it holds another value of itself |
| directly. As a result, Rust can’t figure out how much space it needs to store a |
| `List` value. Let’s break down why we get this error. First we’ll look at how |
| Rust decides how much space it needs to store a value of a non-recursive type. |
| |
| #### Computing the Size of a Non-Recursive Type |
| |
| Recall the `Message` enum we defined in Listing 6-2 when we discussed enum |
| definitions in Chapter 6: |
| |
| ``` |
| enum Message { |
| Quit, |
| Move { x: i32, y: i32 }, |
| Write(String), |
| ChangeColor(i32, i32, i32), |
| } |
| ``` |
| |
| To determine how much space to allocate for a `Message` value, Rust goes |
| through each of the variants to see which variant needs the most space. Rust |
| sees that `Message::Quit` doesn’t need any space, `Message::Move` needs enough |
| space to store two `i32` values, and so forth. Because only one variant will be |
| used, the most space a `Message` value will need is the space it would take to |
| store the largest of its variants. |
| |
| Contrast this with what happens when Rust tries to determine how much space a |
| recursive type like the `List` enum in Listing 15-2 needs. The compiler starts |
| by looking at the `Cons` variant, which holds a value of type `i32` and a value |
| of type `List`. Therefore, `Cons` needs an amount of space equal to the size of |
| an `i32` plus the size of a `List`. To figure out how much memory the `List` |
| type needs, the compiler looks at the variants, starting with the `Cons` |
| variant. The `Cons` variant holds a value of type `i32` and a value of type |
| `List`, and this process continues infinitely, as shown in Figure 15-1. |
| |
| Figure 15-1: An infinite `List` consisting of infinite `Cons` variants |
| |
| #### Using Box<T> to Get a Recursive Type with a Known Size |
| |
| Because Rust can’t figure out how much space to allocate for recursively |
| defined types, the compiler gives an error with this helpful suggestion: |
| |
| ``` |
| help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to make `List` |
| representable |
| | |
| 2 | Cons(i32, Box<List>), |
| | ++++ + |
| ``` |
| |
| In this suggestion, *indirection* means that instead of storing a value |
| directly, we should change the data structure to store the value indirectly by |
| storing a pointer to the value instead. |
| |
| Because a `Box<T>` is a pointer, Rust always knows how much space a `Box<T>` |
| needs: a pointer’s size doesn’t change based on the amount of data it’s |
| pointing to. This means we can put a `Box<T>` inside the `Cons` variant instead |
| of another `List` value directly. The `Box<T>` will point to the next `List` |
| value that will be on the heap rather than inside the `Cons` variant. |
| Conceptually, we still have a list, created with lists holding other lists, but |
| this implementation is now more like placing the items next to one another |
| rather than inside one another. |
| |
| We can change the definition of the `List` enum in Listing 15-2 and the usage |
| of the `List` in Listing 15-3 to the code in Listing 15-5, which will compile. |
| |
| Filename: src/main.rs |
| |
| ``` |
| enum List { |
| Cons(i32, Box<List>), |
| Nil, |
| } |
| |
| use crate::List::{Cons, Nil}; |
| |
| fn main() { |
| let list = Cons( |
| 1, |
| Box::new(Cons( |
| 2, |
| Box::new(Cons( |
| 3, |
| Box::new(Nil) |
| )) |
| )) |
| ); |
| } |
| ``` |
| |
| Listing 15-5: Definition of `List` that uses `Box<T>` in order to have a known |
| size |
| |
| The `Cons` variant needs the size of an `i32` plus the space to store the box’s |
| pointer data. The `Nil` variant stores no values, so it needs less space than |
| the `Cons` variant. We now know that any `List` value will take up the size of |
| an `i32` plus the size of a box’s pointer data. By using a box, we’ve broken |
| the infinite, recursive chain, so the compiler can figure out the size it needs |
| to store a `List` value. Figure 15-2 shows what the `Cons` variant looks like |
| now. |
| |
| Figure 15-2: A `List` that is not infinitely sized, because `Cons` holds a `Box` |
| |
| Boxes provide only the indirection and heap allocation; they don’t have any |
| other special capabilities, like those we’ll see with the other smart pointer |
| types. They also don’t have the performance overhead that these special |
| capabilities incur, so they can be useful in cases like the cons list where the |
| indirection is the only feature we need. We’ll look at more use cases for boxes |
| in Chapter 17. |
| |
| The `Box<T>` type is a smart pointer because it implements the `Deref` trait, |
| which allows `Box<T>` values to be treated like references. When a `Box<T>` |
| value goes out of scope, the heap data that the box is pointing to is cleaned |
| up as well because of the `Drop` trait implementation. These two traits will be |
| even more important to the functionality provided by the other smart pointer |
| types we’ll discuss in the rest of this chapter. Let’s explore these two traits |
| in more detail. |
| |
| ## Treating Smart Pointers Like Regular References with Deref |
| |
| Implementing the `Deref` trait allows you to customize the behavior of the |
| *dereference operator* `*` (not to be confused with the multiplication or glob |
| operator). By implementing `Deref` in such a way that a smart pointer can be |
| treated like a regular reference, you can write code that operates on |
| references and use that code with smart pointers too. |
| |
| Let’s first look at how the dereference operator works with regular references. |
| Then we’ll try to define a custom type that behaves like `Box<T>`, and see why |
| the dereference operator doesn’t work like a reference on our newly defined |
| type. We’ll explore how implementing the `Deref` trait makes it possible for |
| smart pointers to work in ways similar to references. Then we’ll look at Rust’s |
| *deref coercion* feature and how it lets us work with either references or |
| smart pointers. |
| |
| > Note: There’s one big difference between the `MyBox<T>` type we’re about to |
| build and the real `Box<T>`: our version will not store its data on the heap. |
| We are focusing this example on `Deref`, so where the data is actually stored |
| is less important than the pointer-like behavior. |
| |
| ### Following the Pointer to the Value |
| |
| A regular reference is a type of pointer, and one way to think of a pointer is |
| as an arrow to a value stored somewhere else. In Listing 15-6, we create a |
| reference to an `i32` value and then use the dereference operator to follow the |
| reference to the value. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| 1 let x = 5; |
| 2 let y = &x; |
| |
| 3 assert_eq!(5, x); |
| 4 assert_eq!(5, *y); |
| } |
| ``` |
| |
| Listing 15-6: Using the dereference operator to follow a reference to an `i32` |
| value |
| |
| The variable `x` holds an `i32` value `5` [1]. We set `y` equal to a reference |
| to `x` [2]. We can assert that `x` is equal to `5` [3]. However, if we want to |
| make an assertion about the value in `y`, we have to use `*y` to follow the |
| reference to the value it’s pointing to (hence *dereference*) so the compiler |
| can compare the actual value [4]. Once we dereference `y`, we have access to |
| the integer value `y` is pointing to that we can compare with `5`. |
| |
| If we tried to write `assert_eq!(5, y);` instead, we would get this compilation |
| error: |
| |
| ``` |
| error[E0277]: can't compare `{integer}` with `&{integer}` |
| --> src/main.rs:6:5 |
| | |
| 6 | assert_eq!(5, y); |
| | ^^^^^^^^^^^^^^^^ no implementation for `{integer} == |
| &{integer}` |
| | |
| = help: the trait `PartialEq<&{integer}>` is not implemented |
| for `{integer}` |
| ``` |
| |
| Comparing a number and a reference to a number isn’t allowed because they’re |
| different types. We must use the dereference operator to follow the reference |
| to the value it’s pointing to. |
| |
| ### Using Box<T> Like a Reference |
| |
| We can rewrite the code in Listing 15-6 to use a `Box<T>` instead of a |
| reference; the dereference operator used on the `Box<T>` in Listing 15-7 |
| functions in the same way as the dereference operator used on the reference in |
| Listing 15-6. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let x = 5; |
| 1 let y = Box::new(x); |
| |
| assert_eq!(5, x); |
| 2 assert_eq!(5, *y); |
| } |
| ``` |
| |
| Listing 15-7: Using the dereference operator on a `Box<i32>` |
| |
| The main difference between Listing 15-7 and Listing 15-6 is that here we set |
| `y` to be an instance of a box pointing to a copied value of `x` rather than a |
| reference pointing to the value of `x` [1]. In the last assertion [2], we can |
| use the dereference operator to follow the box’s pointer in the same way that |
| we did when `y` was a reference. Next, we’ll explore what is special about |
| `Box<T>` that enables us to use the dereference operator by defining our own |
| box type. |
| |
| ### Defining Our Own Smart Pointer |
| |
| Let’s build a smart pointer similar to the `Box<T>` type provided by the |
| standard library to experience how smart pointers behave differently from |
| references by default. Then we’ll look at how to add the ability to use the |
| dereference operator. |
| |
| The `Box<T>` type is ultimately defined as a tuple struct with one element, so |
| Listing 15-8 defines a `MyBox<T>` type in the same way. We’ll also define a |
| `new` function to match the `new` function defined on `Box<T>`. |
| |
| Filename: src/main.rs |
| |
| ``` |
| 1 struct MyBox<T>(T); |
| |
| impl<T> MyBox<T> { |
| 2 fn new(x: T) -> MyBox<T> { |
| 3 MyBox(x) |
| } |
| } |
| ``` |
| |
| Listing 15-8: Defining a `MyBox<T>` type |
| |
| We define a struct named `MyBox` and declare a generic parameter `T` [1] |
| because we want our type to hold values of any type. The `MyBox` type is a |
| tuple struct with one element of type `T`. The `MyBox::new` function takes one |
| parameter of type `T` [2] and returns a `MyBox` instance that holds the value |
| passed in [3]. |
| |
| Let’s try adding the `main` function in Listing 15-7 to Listing 15-8 and |
| changing it to use the `MyBox<T>` type we’ve defined instead of `Box<T>`. The |
| code in Listing 15-9 won’t compile because Rust doesn’t know how to dereference |
| `MyBox`. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let x = 5; |
| let y = MyBox::new(x); |
| |
| assert_eq!(5, x); |
| assert_eq!(5, *y); |
| } |
| ``` |
| |
| Listing 15-9: Attempting to use `MyBox<T>` in the same way we used references |
| and `Box<T>` |
| |
| Here’s the resultant compilation error: |
| |
| ``` |
| error[E0614]: type `MyBox<{integer}>` cannot be dereferenced |
| --> src/main.rs:14:19 |
| | |
| 14 | assert_eq!(5, *y); |
| | ^^ |
| ``` |
| |
| Our `MyBox<T>` type can’t be dereferenced because we haven’t implemented that |
| ability on our type. To enable dereferencing with the `*` operator, we |
| implement the `Deref` trait. |
| |
| ### Implementing the Deref Trait |
| |
| As discussed in “Implementing a Trait on a Type” on page XX, to implement a |
| trait we need to provide implementations for the trait’s required methods. The |
| `Deref` trait, provided by the standard library, requires us to implement one |
| method named `deref` that borrows `self` and returns a reference to the inner |
| data. Listing 15-10 contains an implementation of `Deref` to add to the |
| definition of `MyBox``<T>`. |
| |
| Filename: src/main.rs |
| |
| ``` |
| use std::ops::Deref; |
| |
| impl<T> Deref for MyBox<T> { |
| 1 type Target = T; |
| |
| fn deref(&self) -> &Self::Target { |
| 2 &self.0 |
| } |
| } |
| ``` |
| |
| Listing 15-10: Implementing `Deref` on `MyBox<T>` |
| |
| The `type Target = T;` syntax [1] defines an associated type for the `Deref` |
| trait to use. Associated types are a slightly different way of declaring a |
| generic parameter, but you don’t need to worry about them for now; we’ll cover |
| them in more detail in Chapter 19. |
| |
| We fill in the body of the `deref` method with `&self.0` so `deref` returns a |
| reference to the value we want to access with the `*` operator [2]; recall from |
| “Using Tuple Structs Without Named Fields to Create Different Types” on page XX |
| that `.0` accesses the first value in a tuple struct. The `main` function in |
| Listing 15-9 that calls `*` on the `MyBox<T>` value now compiles, and the |
| assertions pass! |
| |
| Without the `Deref` trait, the compiler can only dereference `&` references. |
| The `deref` method gives the compiler the ability to take a value of any type |
| that implements `Deref` and call the `deref` method to get a `&` reference that |
| it knows how to dereference. |
| |
| When we entered `*y` in Listing 15-9, behind the scenes Rust actually ran this |
| code: |
| |
| ``` |
| *(y.deref()) |
| ``` |
| |
| Rust substitutes the `*` operator with a call to the `deref` method and then a |
| plain dereference so we don’t have to think about whether or not we need to |
| call the `deref` method. This Rust feature lets us write code that functions |
| identically whether we have a regular reference or a type that implements |
| `Deref`. |
| |
| The reason the `deref` method returns a reference to a value, and that the |
| plain dereference outside the parentheses in `*(y.deref())` is still necessary, |
| has to do with the ownership system. If the `deref` method returned the value |
| directly instead of a reference to the value, the value would be moved out of |
| `self`. We don’t want to take ownership of the inner value inside `MyBox<T>` in |
| this case or in most cases where we use the dereference operator. |
| |
| Note that the `*` operator is replaced with a call to the `deref` method and |
| then a call to the `*` operator just once, each time we use a `*` in our code. |
| Because the substitution of the `*` operator does not recurse infinitely, we |
| end up with data of type `i32`, which matches the `5` in `assert_eq!` in |
| Listing 15-9. |
| |
| ### Implicit Deref Coercions with Functions and Methods |
| |
| *Deref coercion* converts a reference to a type that implements the `Deref` |
| trait into a reference to another type. For example, deref coercion can convert |
| `&String` to `&str` because `String` implements the `Deref` trait such that it |
| returns `&str`. Deref coercion is a convenience Rust performs on arguments to |
| functions and methods, and works only on types that implement the `Deref` |
| trait. It happens automatically when we pass a reference to a particular type’s |
| value as an argument to a function or method that doesn’t match the parameter |
| type in the function or method definition. A sequence of calls to the `deref` |
| method converts the type we provided into the type the parameter needs. |
| |
| Deref coercion was added to Rust so that programmers writing function and |
| method calls don’t need to add as many explicit references and dereferences |
| with `&` and `*`. The deref coercion feature also lets us write more code that |
| can work for either references or smart pointers. |
| |
| To see deref coercion in action, let’s use the `MyBox<T>` type we defined in |
| Listing 15-8 as well as the implementation of `Deref` that we added in Listing |
| 15-10. Listing 15-11 shows the definition of a function that has a string slice |
| parameter. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn hello(name: &str) { |
| println!("Hello, {name}!"); |
| } |
| ``` |
| |
| Listing 15-11: A `hello` function that has the parameter `name` of type `&str` |
| |
| We can call the `hello` function with a string slice as an argument, such as |
| `hello("Rust");`, for example. Deref coercion makes it possible to call `hello` |
| with a reference to a value of type `MyBox<String>`, as shown in Listing 15-12. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let m = MyBox::new(String::from("Rust")); |
| hello(&m); |
| } |
| ``` |
| |
| Listing 15-12: Calling `hello` with a reference to a `MyBox<String>` value, |
| which works because of deref coercion |
| |
| Here we’re calling the `hello` function with the argument `&m`, which is a |
| reference to a `MyBox<String>` value. Because we implemented the `Deref` trait |
| on `MyBox<T>` in Listing 15-10, Rust can turn `&MyBox<String>` into `&String` |
| by calling `deref`. The standard library provides an implementation of `Deref` |
| on `String` that returns a string slice, and this is in the API documentation |
| for `Deref`. Rust calls `deref` again to turn the `&String` into `&str`, which |
| matches the `hello` function’s definition. |
| |
| If Rust didn’t implement deref coercion, we would have to write the code in |
| Listing 15-13 instead of the code in Listing 15-12 to call `hello` with a value |
| of type `&MyBox<String>`. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let m = MyBox::new(String::from("Rust")); |
| hello(&(*m)[..]); |
| } |
| ``` |
| |
| Listing 15-13: The code we would have to write if Rust didn’t have deref |
| coercion |
| |
| The `(*m)` dereferences the `MyBox<String>` into a `String`. Then the `&` and |
| `[..]` take a string slice of the `String` that is equal to the whole string to |
| match the signature of `hello`. This code without deref coercions is harder to |
| read, write, and understand with all of these symbols involved. Deref coercion |
| allows Rust to handle these conversions for us automatically. |
| |
| When the `Deref` trait is defined for the types involved, Rust will analyze the |
| types and use `Deref::deref` as many times as necessary to get a reference to |
| match the parameter’s type. The number of times that `Deref::deref` needs to be |
| inserted is resolved at compile time, so there is no runtime penalty for taking |
| advantage of deref coercion! |
| |
| ### How Deref Coercion Interacts with Mutability |
| |
| Similar to how you use the `Deref` trait to override the `*` operator on |
| immutable references, you can use the `DerefMut` trait to override the `*` |
| operator on mutable references. |
| |
| Rust does deref coercion when it finds types and trait implementations in three |
| cases: |
| |
| * From `&T` to `&U` when `T: Deref<Target=U>` |
| * From `&mut T` to `&mut U` when `T: DerefMut<Target=U>` |
| * From `&mut T` to `&U` when `T: Deref<Target=U>` |
| |
| The first two cases are the same except that the second implements mutability. |
| The first case states that if you have a `&T`, and `T` implements `Deref` to |
| some type `U`, you can get a `&U` transparently. The second case states that |
| the same deref coercion happens for mutable references. |
| |
| The third case is trickier: Rust will also coerce a mutable reference to an |
| immutable one. But the reverse is *not* possible: immutable references will |
| never coerce to mutable references. Because of the borrowing rules, if you have |
| a mutable reference, that mutable reference must be the only reference to that |
| data (otherwise, the program wouldn’t compile). Converting one mutable |
| reference to one immutable reference will never break the borrowing rules. |
| Converting an immutable reference to a mutable reference would require that the |
| initial immutable reference is the only immutable reference to that data, but |
| the borrowing rules don’t guarantee that. Therefore, Rust can’t make the |
| assumption that converting an immutable reference to a mutable reference is |
| possible. |
| |
| ## Running Code on Cleanup with the Drop Trait |
| |
| The second trait important to the smart pointer pattern is `Drop`, which lets |
| you customize what happens when a value is about to go out of scope. You can |
| provide an implementation for the `Drop` trait on any type, and that code can |
| be used to release resources like files or network connections. |
| |
| We’re introducing `Drop` in the context of smart pointers because the |
| functionality of the `Drop` trait is almost always used when implementing a |
| smart pointer. For example, when a `Box<T>` is dropped it will deallocate the |
| space on the heap that the box points to. |
| |
| In some languages, for some types, the programmer must call code to free memory |
| or resources every time they finish using an instance of those types. Examples |
| include file handles, sockets, and locks. If they forget, the system might |
| become overloaded and crash. In Rust, you can specify that a particular bit of |
| code be run whenever a value goes out of scope, and the compiler will insert |
| this code automatically. As a result, you don’t need to be careful about |
| placing cleanup code everywhere in a program that an instance of a particular |
| type is finished with—you still won’t leak resources! |
| |
| You specify the code to run when a value goes out of scope by implementing the |
| `Drop` trait. The `Drop` trait requires you to implement one method named |
| `drop` that takes a mutable reference to `self`. To see when Rust calls `drop`, |
| let’s implement `drop` with `println!` statements for now. |
| |
| Listing 15-14 shows a `CustomSmartPointer` struct whose only custom |
| functionality is that it will print `Dropping CustomSmartPointer!` when the |
| instance goes out of scope, to show when Rust runs the `drop` method. |
| |
| Filename: src/main.rs |
| |
| ``` |
| struct CustomSmartPointer { |
| data: String, |
| } |
| |
| 1 impl Drop for CustomSmartPointer { |
| fn drop(&mut self) { |
| 2 println!( |
| "Dropping CustomSmartPointer with data `{}`!", |
| self.data |
| ); |
| } |
| } |
| |
| fn main() { |
| 3 let c = CustomSmartPointer { |
| data: String::from("my stuff"), |
| }; |
| 4 let d = CustomSmartPointer { |
| data: String::from("other stuff"), |
| }; |
| 5 println!("CustomSmartPointers created."); |
| 6 } |
| ``` |
| |
| Listing 15-14: A `CustomSmartPointer` struct that implements the `Drop` trait |
| where we would put our cleanup code |
| |
| The `Drop` trait is included in the prelude, so we don’t need to bring it into |
| scope. We implement the `Drop` trait on `CustomSmartPointer` [1] and provide an |
| implementation for the `drop` method that calls `println!` [2]. The body of the |
| `drop` method is where you would place any logic that you wanted to run when an |
| instance of your type goes out of scope. We’re printing some text here to |
| demonstrate visually when Rust will call `drop`. |
| |
| In `main`, we create two instances of `CustomSmartPointer` at [3] and [4] and |
| then print `CustomSmartPointers created` [5]. At the end of `main` [6], our |
| instances of `CustomSmartPointer` will go out of scope, and Rust will call the |
| code we put in the `drop` method [2], printing our final message. Note that we |
| didn’t need to call the `drop` method explicitly. |
| |
| When we run this program, we’ll see the following output: |
| |
| ``` |
| CustomSmartPointers created. |
| Dropping CustomSmartPointer with data `other stuff`! |
| Dropping CustomSmartPointer with data `my stuff`! |
| ``` |
| |
| Rust automatically called `drop` for us when our instances went out of scope, |
| calling the code we specified. Variables are dropped in the reverse order of |
| their creation, so `d` was dropped before `c`. This example’s purpose is to |
| give you a visual guide to how the `drop` method works; usually you would |
| specify the cleanup code that your type needs to run rather than a print |
| message. |
| |
| Unfortunately, it’s not straightforward to disable the automatic `drop` |
| functionality. Disabling `drop` isn’t usually necessary; the whole point of the |
| `Drop` trait is that it’s taken care of automatically. Occasionally, however, |
| you might want to clean up a value early. One example is when using smart |
| pointers that manage locks: you might want to force the `drop` method that |
| releases the lock so that other code in the same scope can acquire the lock. |
| Rust doesn’t let you call the `Drop` trait’s `drop` method manually; instead, |
| you have to call the `std::mem::drop` function provided by the standard library |
| if you want to force a value to be dropped before the end of its scope. |
| |
| If we try to call the `Drop` trait’s `drop` method manually by modifying the |
| `main` function from Listing 15-14, as shown in Listing 15-15, we’ll get a |
| compiler error. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let c = CustomSmartPointer { |
| data: String::from("some data"), |
| }; |
| println!("CustomSmartPointer created."); |
| c.drop(); |
| println!( |
| "CustomSmartPointer dropped before the end of main." |
| ); |
| } |
| ``` |
| |
| Listing 15-15: Attempting to call the `drop` method from the `Drop` trait |
| manually to clean up early |
| |
| When we try to compile this code, we’ll get this error: |
| |
| ``` |
| error[E0040]: explicit use of destructor method |
| --> src/main.rs:16:7 |
| | |
| 16 | c.drop(); |
| | --^^^^-- |
| | | | |
| | | explicit destructor calls not allowed |
| | help: consider using `drop` function: `drop(c)` |
| ``` |
| |
| This error message states that we’re not allowed to explicitly call `drop`. The |
| error message uses the term *destructor*, which is the general programming term |
| for a function that cleans up an instance. A *destructor* is analogous to a |
| *constructor*, which creates an instance. The `drop` function in Rust is one |
| particular destructor. |
| |
| Rust doesn’t let us call `drop` explicitly because Rust would still |
| automatically call `drop` on the value at the end of `main`. This would cause a |
| *double free* error because Rust would be trying to clean up the same value |
| twice. |
| |
| We can’t disable the automatic insertion of `drop` when a value goes out of |
| scope, and we can’t call the `drop` method explicitly. So, if we need to force |
| a value to be cleaned up early, we use the `std::mem::drop` function. |
| |
| The `std::mem::drop` function is different from the `drop` method in the `Drop` |
| trait. We call it by passing as an argument the value we want to force-drop. |
| The function is in the prelude, so we can modify `main` in Listing 15-15 to |
| call the `drop` function, as shown in Listing 15-16. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let c = CustomSmartPointer { |
| data: String::from("some data"), |
| }; |
| println!("CustomSmartPointer created."); |
| drop(c); |
| println!( |
| "CustomSmartPointer dropped before the end of main." |
| ); |
| } |
| ``` |
| |
| Listing 15-16: Calling `std::mem::drop` to explicitly drop a value before it |
| goes out of scope |
| |
| Running this code will print the following: |
| |
| ``` |
| CustomSmartPointer created. |
| Dropping CustomSmartPointer with data `some data`! |
| CustomSmartPointer dropped before the end of main. |
| ``` |
| |
| The text `Dropping CustomSmartPointer with data `some data`!` is printed |
| between the `CustomSmartPointer created.` and `CustomSmartPointer dropped |
| before the end of main.` text, showing that the `drop` method code is called to |
| drop `c` at that point. |
| |
| You can use code specified in a `Drop` trait implementation in many ways to |
| make cleanup convenient and safe: for instance, you could use it to create your |
| own memory allocator! With the `Drop` trait and Rust’s ownership system, you |
| don’t have to remember to clean up because Rust does it automatically. |
| |
| You also don’t have to worry about problems resulting from accidentally |
| cleaning up values still in use: the ownership system that makes sure |
| references are always valid also ensures that `drop` gets called only once when |
| the value is no longer being used. |
| |
| Now that we’ve examined `Box<T>` and some of the characteristics of smart |
| pointers, let’s look at a few other smart pointers defined in the standard |
| library. |
| |
| ## Rc<T>, the Reference Counted Smart Pointer |
| |
| In the majority of cases, ownership is clear: you know exactly which variable |
| owns a given value. However, there are cases when a single value might have |
| multiple owners. For example, in graph data structures, multiple edges might |
| point to the same node, and that node is conceptually owned by all of the edges |
| that point to it. A node shouldn’t be cleaned up unless it doesn’t have any |
| edges pointing to it and so has no owners. |
| |
| You have to enable multiple ownership explicitly by using the Rust type |
| `Rc<T>`, which is an abbreviation for *reference counting*. The `Rc<T>` type |
| keeps track of the number of references to a value to determine whether or not |
| the value is still in use. If there are zero references to a value, the value |
| can be cleaned up without any references becoming invalid. |
| |
| Imagine `Rc<T>` as a TV in a family room. When one person enters to watch TV, |
| they turn it on. Others can come into the room and watch the TV. When the last |
| person leaves the room, they turn off the TV because it’s no longer being used. |
| If someone turns off the TV while others are still watching it, there would be |
| an uproar from the remaining TV watchers! |
| |
| We use the `Rc<T>` type when we want to allocate some data on the heap for |
| multiple parts of our program to read and we can’t determine at compile time |
| which part will finish using the data last. If we knew which part would finish |
| last, we could just make that part the data’s owner, and the normal ownership |
| rules enforced at compile time would take effect. |
| |
| Note that `Rc<T>` is only for use in single-threaded scenarios. When we discuss |
| concurrency in Chapter 16, we’ll cover how to do reference counting in |
| multithreaded programs. |
| |
| ### Using Rc<T> to Share Data |
| |
| Let’s return to our cons list example in Listing 15-5. Recall that we defined |
| it using `Box<T>`. This time, we’ll create two lists that both share ownership |
| of a third list. Conceptually, this looks similar to Figure 15-3. |
| |
| Figure 15-3: Two lists, `b` and `c`, sharing ownership of a third list, `a` |
| |
| We’ll create list `a` that contains `5` and then `10`. Then we’ll make two more |
| lists: `b` that starts with `3` and `c` that starts with `4`. Both `b` and `c` |
| lists will then continue on to the first `a` list containing `5` and `10`. In |
| other words, both lists will share the first list containing `5` and `10`. |
| |
| Trying to implement this scenario using our definition of `List` with `Box<T>` |
| won’t work, as shown in Listing 15-17. |
| |
| Filename: src/main.rs |
| |
| ``` |
| enum List { |
| Cons(i32, Box<List>), |
| Nil, |
| } |
| |
| use crate::List::{Cons, Nil}; |
| |
| fn main() { |
| let a = Cons(5, Box::new(Cons(10, Box::new(Nil)))); |
| 1 let b = Cons(3, Box::new(a)); |
| 2 let c = Cons(4, Box::new(a)); |
| } |
| ``` |
| |
| Listing 15-17: Demonstrating that we’re not allowed to have two lists using |
| `Box<T>` that try to share ownership of a third list |
| |
| When we compile this code, we get this error: |
| |
| ``` |
| error[E0382]: use of moved value: `a` |
| --> src/main.rs:11:30 |
| | |
| 9 | let a = Cons(5, Box::new(Cons(10, Box::new(Nil)))); |
| | - move occurs because `a` has type `List`, which |
| does not implement the `Copy` trait |
| 10 | let b = Cons(3, Box::new(a)); |
| | - value moved here |
| 11 | let c = Cons(4, Box::new(a)); |
| | ^ value used here after move |
| ``` |
| |
| The `Cons` variants own the data they hold, so when we create the `b` list [1], |
| `a` is moved into `b` and `b` owns `a`. Then, when we try to use `a` again when |
| creating `c` [2], we’re not allowed to because `a` has been moved. |
| |
| We could change the definition of `Cons` to hold references instead, but then |
| we would have to specify lifetime parameters. By specifying lifetime |
| parameters, we would be specifying that every element in the list will live at |
| least as long as the entire list. This is the case for the elements and lists |
| in Listing 15-17, but not in every scenario. |
| |
| Instead, we’ll change our definition of `List` to use `Rc<T>` in place of |
| `Box<T>`, as shown in Listing 15-18. Each `Cons` variant will now hold a value |
| and an `Rc<T>` pointing to a `List`. When we create `b`, instead of taking |
| ownership of `a`, we’ll clone the `Rc<List>` that `a` is holding, thereby |
| increasing the number of references from one to two and letting `a` and `b` |
| share ownership of the data in that `Rc<List>`. We’ll also clone `a` when |
| creating `c`, increasing the number of references from two to three. Every time |
| we call `Rc::clone`, the reference count to the data within the `Rc<List>` will |
| increase, and the data won’t be cleaned up unless there are zero references to |
| it. |
| |
| Filename: src/main.rs |
| |
| ``` |
| enum List { |
| Cons(i32, Rc<List>), |
| Nil, |
| } |
| |
| use crate::List::{Cons, Nil}; |
| 1 use std::rc::Rc; |
| |
| fn main() { |
| 2 let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); |
| 3 let b = Cons(3, Rc::clone(&a)); |
| 4 let c = Cons(4, Rc::clone(&a)); |
| } |
| ``` |
| |
| Listing 15-18: A definition of `List` that uses `Rc<T>` |
| |
| We need to add a `use` statement to bring `Rc<T>` into scope [1] because it’s |
| not in the prelude. In `main`, we create the list holding `5` and `10` and |
| store it in a new `Rc<List>` in `a` [2]. Then, when we create `b` [3] and `c` |
| [4], we call the `Rc::clone` function and pass a reference to the `Rc<List>` in |
| `a` as an argument. |
| |
| We could have called `a.clone()` rather than `Rc::clone(&a)`, but Rust’s |
| convention is to use `Rc::clone` in this case. The implementation of |
| `Rc::clone` doesn’t make a deep copy of all the data like most types’ |
| implementations of `clone` do. The call to `Rc::clone` only increments the |
| reference count, which doesn’t take much time. Deep copies of data can take a |
| lot of time. By using `Rc::clone` for reference counting, we can visually |
| distinguish between the deep-copy kinds of clones and the kinds of clones that |
| increase the reference count. When looking for performance problems in the |
| code, we only need to consider the deep-copy clones and can disregard calls to |
| `Rc::clone`. |
| |
| ### Cloning an Rc<T> Increases the Reference Count |
| |
| Let’s change our working example in Listing 15-18 so we can see the reference |
| counts changing as we create and drop references to the `Rc<List>` in `a`. |
| |
| In Listing 15-19, we’ll change `main` so it has an inner scope around list `c`; |
| then we can see how the reference count changes when `c` goes out of scope. |
| |
| Filename: src/main.rs |
| |
| ``` |
| --snip-- |
| |
| fn main() { |
| let a = Rc::new(Cons(5, Rc::new(Cons(10, Rc::new(Nil))))); |
| println!( |
| "count after creating a = {}", |
| Rc::strong_count(&a) |
| ); |
| let b = Cons(3, Rc::clone(&a)); |
| println!( |
| "count after creating b = {}", |
| Rc::strong_count(&a) |
| ); |
| { |
| let c = Cons(4, Rc::clone(&a)); |
| println!( |
| "count after creating c = {}", |
| Rc::strong_count(&a) |
| ); |
| } |
| println!( |
| "count after c goes out of scope = {}", |
| Rc::strong_count(&a) |
| ); |
| } |
| ``` |
| |
| Listing 15-19: Printing the reference count |
| |
| At each point in the program where the reference count changes, we print the |
| reference count, which we get by calling the `Rc::strong_count` function. This |
| function is named `strong_count` rather than `count` because the `Rc<T>` type |
| also has a `weak_count`; we’ll see what `weak_count` is used for in “Preventing |
| Reference Cycles Using Weak<T>” on page XX. |
| |
| This code prints the following: |
| |
| ``` |
| count after creating a = 1 |
| count after creating b = 2 |
| count after creating c = 3 |
| count after c goes out of scope = 2 |
| ``` |
| |
| We can see that the `Rc<List>` in `a` has an initial reference count of 1; then |
| each time we call `clone`, the count goes up by 1. When `c` goes out of scope, |
| the count goes down by 1. We don’t have to call a function to decrease the |
| reference count like we have to call `Rc::clone` to increase the reference |
| count: the implementation of the `Drop` trait decreases the reference count |
| automatically when an `Rc<T>` value goes out of scope. |
| |
| What we can’t see in this example is that when `b` and then `a` go out of scope |
| at the end of `main`, the count is then 0, and the `Rc<List>` is cleaned up |
| completely. Using `Rc<T>` allows a single value to have multiple owners, and |
| the count ensures that the value remains valid as long as any of the owners |
| still exist. |
| |
| Via immutable references, `Rc<T>` allows you to share data between multiple |
| parts of your program for reading only. If `Rc<T>` allowed you to have multiple |
| mutable references too, you might violate one of the borrowing rules discussed |
| in Chapter 4: multiple mutable borrows to the same place can cause data races |
| and inconsistencies. But being able to mutate data is very useful! In the next |
| section, we’ll discuss the interior mutability pattern and the `RefCell<T>` |
| type that you can use in conjunction with an `Rc<T>` to work with this |
| immutability restriction. |
| |
| ## RefCell<T> and the Interior Mutability Pattern |
| |
| *Interior mutability* is a design pattern in Rust that allows you to mutate |
| data even when there are immutable references to that data; normally, this |
| action is disallowed by the borrowing rules. To mutate data, the pattern uses |
| `unsafe` code inside a data structure to bend Rust’s usual rules that govern |
| mutation and borrowing. Unsafe code indicates to the compiler that we’re |
| checking the rules manually instead of relying on the compiler to check them |
| for us; we will discuss unsafe code more in Chapter 19. |
| |
| We can use types that use the interior mutability pattern only when we can |
| ensure that the borrowing rules will be followed at runtime, even though the |
| compiler can’t guarantee that. The `unsafe` code involved is then wrapped in a |
| safe API, and the outer type is still immutable. |
| |
| Let’s explore this concept by looking at the `RefCell<T>` type that follows the |
| interior mutability pattern. |
| |
| ### Enforcing Borrowing Rules at Runtime with RefCell<T> |
| |
| Unlike `Rc<T>`, the `RefCell<T>` type represents single ownership over the data |
| it holds. So what makes `RefCell<T>` different from a type like `Box<T>`? |
| Recall the borrowing rules you learned in Chapter 4: |
| |
| * At any given time, you can have *either* one mutable reference or any number |
| of immutable references (but not both). |
| * References must always be valid. |
| |
| With references and `Box<T>`, the borrowing rules’ invariants are enforced at |
| compile time. With `RefCell<T>`, these invariants are enforced *at runtime*. |
| With references, if you break these rules, you’ll get a compiler error. With |
| `RefCell<T>`, if you break these rules, your program will panic and exit. |
| |
| The advantages of checking the borrowing rules at compile time are that errors |
| will be caught sooner in the development process, and there is no impact on |
| runtime performance because all the analysis is completed beforehand. For those |
| reasons, checking the borrowing rules at compile time is the best choice in the |
| majority of cases, which is why this is Rust’s default. |
| |
| The advantage of checking the borrowing rules at runtime instead is that |
| certain memory-safe scenarios are then allowed, where they would’ve been |
| disallowed by the compile-time checks. Static analysis, like the Rust compiler, |
| is inherently conservative. Some properties of code are impossible to detect by |
| analyzing the code: the most famous example is the Halting Problem, which is |
| beyond the scope of this book but is an interesting topic to research. |
| |
| Because some analysis is impossible, if the Rust compiler can’t be sure the |
| code complies with the ownership rules, it might reject a correct program; in |
| this way, it’s conservative. If Rust accepted an incorrect program, users |
| wouldn’t be able to trust in the guarantees Rust makes. However, if Rust |
| rejects a correct program, the programmer will be inconvenienced, but nothing |
| catastrophic can occur. The `RefCell<T>` type is useful when you’re sure your |
| code follows the borrowing rules but the compiler is unable to understand and |
| guarantee that. |
| |
| Similar to `Rc<T>`, `RefCell<T>` is only for use in single-threaded scenarios |
| and will give you a compile-time error if you try using it in a multithreaded |
| context. We’ll talk about how to get the functionality of `RefCell<T>` in a |
| multithreaded program in Chapter 16. |
| |
| Here is a recap of the reasons to choose `Box<T>`, `Rc<T>`, or `RefCell<T>`: |
| |
| * `Rc<T>` enables multiple owners of the same data; `Box<T>` and `RefCell<T>` |
| have single owners. |
| * `Box<T>` allows immutable or mutable borrows checked at compile time; `Rc<T>` |
| allows only immutable borrows checked at compile time; `RefCell<T>` allows |
| immutable or mutable borrows checked at runtime. |
| * Because `RefCell<T>` allows mutable borrows checked at runtime, you can |
| mutate the value inside the `RefCell<T>` even when the `RefCell<T>` is |
| immutable. |
| |
| Mutating the value inside an immutable value is the *interior mutability* |
| pattern. Let’s look at a situation in which interior mutability is useful and |
| examine how it’s possible. |
| |
| ### Interior Mutability: A Mutable Borrow to an Immutable Value |
| |
| A consequence of the borrowing rules is that when you have an immutable value, |
| you can’t borrow it mutably. For example, this code won’t compile: |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let x = 5; |
| let y = &mut x; |
| } |
| ``` |
| |
| If you tried to compile this code, you’d get the following error: |
| |
| ``` |
| error[E0596]: cannot borrow `x` as mutable, as it is not declared |
| as mutable |
| --> src/main.rs:3:13 |
| | |
| 2 | let x = 5; |
| | - help: consider changing this to be mutable: `mut x` |
| 3 | let y = &mut x; |
| | ^^^^^^ cannot borrow as mutable |
| ``` |
| |
| However, there are situations in which it would be useful for a value to mutate |
| itself in its methods but appear immutable to other code. Code outside the |
| value’s methods would not be able to mutate the value. Using `RefCell<T>` is |
| one way to get the ability to have interior mutability, but `RefCell<T>` |
| doesn’t get around the borrowing rules completely: the borrow checker in the |
| compiler allows this interior mutability, and the borrowing rules are checked |
| at runtime instead. If you violate the rules, you’ll get a `panic!` instead of |
| a compiler error. |
| |
| Let’s work through a practical example where we can use `RefCell<T>` to mutate |
| an immutable value and see why that is useful. |
| |
| #### A Use Case for Interior Mutability: Mock Objects |
| |
| Sometimes during testing a programmer will use a type in place of another type, |
| in order to observe particular behavior and assert that it’s implemented |
| correctly. This placeholder type is called a *test double*. Think of it in the |
| sense of a stunt double in filmmaking, where a person steps in and substitutes |
| for an actor to do a particularly tricky scene. Test doubles stand in for other |
| types when we’re running tests. *Mock objects* are specific types of test |
| doubles that record what happens during a test so you can assert that the |
| correct actions took place. |
| |
| Rust doesn’t have objects in the same sense as other languages have objects, |
| and Rust doesn’t have mock object functionality built into the standard library |
| as some other languages do. However, you can definitely create a struct that |
| will serve the same purposes as a mock object. |
| |
| Here’s the scenario we’ll test: we’ll create a library that tracks a value |
| against a maximum value and sends messages based on how close to the maximum |
| value the current value is. This library could be used to keep track of a |
| user’s quota for the number of API calls they’re allowed to make, for example. |
| |
| Our library will only provide the functionality of tracking how close to the |
| maximum a value is and what the messages should be at what times. Applications |
| that use our library will be expected to provide the mechanism for sending the |
| messages: the application could put a message in the application, send an |
| email, send a text message, or do something else. The library doesn’t need to |
| know that detail. All it needs is something that implements a trait we’ll |
| provide called `Messenger`. Listing 15-20 shows the library code. |
| |
| Filename: src/lib.rs |
| |
| ``` |
| pub trait Messenger { |
| 1 fn send(&self, msg: &str); |
| } |
| |
| pub struct LimitTracker<'a, T: Messenger> { |
| messenger: &'a T, |
| value: usize, |
| max: usize, |
| } |
| |
| impl<'a, T> LimitTracker<'a, T> |
| where |
| T: Messenger, |
| { |
| pub fn new( |
| messenger: &'a T, |
| max: usize |
| ) -> LimitTracker<'a, T> { |
| LimitTracker { |
| messenger, |
| value: 0, |
| max, |
| } |
| } |
| |
| 2 pub fn set_value(&mut self, value: usize) { |
| self.value = value; |
| |
| let percentage_of_max = |
| self.value as f64 / self.max as f64; |
| |
| if percentage_of_max >= 1.0 { |
| self.messenger |
| .send("Error: You are over your quota!"); |
| } else if percentage_of_max >= 0.9 { |
| self.messenger |
| .send("Urgent: You're at 90% of your quota!"); |
| } else if percentage_of_max >= 0.75 { |
| self.messenger |
| .send("Warning: You're at 75% of your quota!"); |
| } |
| } |
| } |
| ``` |
| |
| Listing 15-20: A library to keep track of how close a value is to a maximum |
| value and warn when the value is at certain levels |
| |
| One important part of this code is that the `Messenger` trait has one method |
| called `send` that takes an immutable reference to `self` and the text of the |
| message [1]. This trait is the interface our mock object needs to implement so |
| that the mock can be used in the same way a real object is. The other important |
| part is that we want to test the behavior of the `set_value` method on the |
| `LimitTracker` [2]. We can change what we pass in for the `value` parameter, |
| but `set_value` doesn’t return anything for us to make assertions on. We want |
| to be able to say that if we create a `LimitTracker` with something that |
| implements the `Messenger` trait and a particular value for `max`, when we pass |
| different numbers for `value` the messenger is told to send the appropriate |
| messages. |
| |
| We need a mock object that, instead of sending an email or text message when we |
| call `send`, will only keep track of the messages it’s told to send. We can |
| create a new instance of the mock object, create a `LimitTracker` that uses the |
| mock object, call the `set_value` method on `LimitTracker`, and then check that |
| the mock object has the messages we expect. Listing 15-21 shows an attempt to |
| implement a mock object to do just that, but the borrow checker won’t allow it. |
| |
| Filename: src/lib.rs |
| |
| ``` |
| #[cfg(test)] |
| mod tests { |
| use super::*; |
| |
| 1 struct MockMessenger { |
| 2 sent_messages: Vec<String>, |
| } |
| |
| impl MockMessenger { |
| 3 fn new() -> MockMessenger { |
| MockMessenger { |
| sent_messages: vec![], |
| } |
| } |
| } |
| |
| 4 impl Messenger for MockMessenger { |
| fn send(&self, message: &str) { |
| 5 self.sent_messages.push(String::from(message)); |
| } |
| } |
| |
| #[test] |
| 6 fn it_sends_an_over_75_percent_warning_message() { |
| let mock_messenger = MockMessenger::new(); |
| let mut limit_tracker = LimitTracker::new( |
| &mock_messenger, |
| 100 |
| ); |
| |
| limit_tracker.set_value(80); |
| |
| assert_eq!(mock_messenger.sent_messages.len(), 1); |
| } |
| } |
| ``` |
| |
| Listing 15-21: An attempt to implement a `MockMessenger` that isn’t allowed by |
| the borrow checker |
| |
| This test code defines a `MockMessenger` struct [1] that has a `sent_messages` |
| field with a `Vec` of `String` values [2] to keep track of the messages it’s |
| told to send. We also define an associated function `new` [3] to make it |
| convenient to create new `MockMessenger` values that start with an empty list |
| of messages. We then implement the `Messenger` trait for `MockMessenger` [4] so |
| we can give a `MockMessenger` to a `LimitTracker`. In the definition of the |
| `send` method [5], we take the message passed in as a parameter and store it in |
| the `MockMessenger` list of `sent_messages`. |
| |
| In the test, we’re testing what happens when the `LimitTracker` is told to set |
| `value` to something that is more than 75 percent of the `max` value [6]. First |
| we create a new `MockMessenger`, which will start with an empty list of |
| messages. Then we create a new `LimitTracker` and give it a reference to the |
| new `MockMessenger` and a `max` value of `100`. We call the `set_value` method |
| on the `LimitTracker` with a value of `80`, which is more than 75 percent of |
| 100. Then we assert that the list of messages that the `MockMessenger` is |
| keeping track of should now have one message in it. |
| |
| However, there’s one problem with this test, as shown here: |
| |
| ``` |
| error[E0596]: cannot borrow `self.sent_messages` as mutable, as it is behind a |
| `&` reference |
| --> src/lib.rs:58:13 |
| | |
| 2 | fn send(&self, msg: &str); |
| | ----- help: consider changing that to be a mutable reference: |
| `&mut self` |
| ... |
| 58 | self.sent_messages.push(String::from(message)); |
| | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ `self` is a |
| `&` reference, so the data it refers to cannot be borrowed as mutable |
| ``` |
| |
| We can’t modify the `MockMessenger` to keep track of the messages because the |
| `send` method takes an immutable reference to `self`. We also can’t take the |
| suggestion from the error text to use `&mut self` instead because then the |
| signature of `send` wouldn’t match the signature in the `Messenger` trait |
| definition (feel free to try it and see what error message you get). |
| |
| This is a situation in which interior mutability can help! We’ll store the |
| `sent_messages` within a `RefCell<T>`, and then the `send` method will be able |
| to modify `sent_messages` to store the messages we’ve seen. Listing 15-22 shows |
| what that looks like. |
| |
| Filename: src/lib.rs |
| |
| ``` |
| #[cfg(test)] |
| mod tests { |
| use super::*; |
| use std::cell::RefCell; |
| |
| struct MockMessenger { |
| 1 sent_messages: RefCell<Vec<String>>, |
| } |
| |
| impl MockMessenger { |
| fn new() -> MockMessenger { |
| MockMessenger { |
| 2 sent_messages: RefCell::new(vec![]), |
| } |
| } |
| } |
| |
| impl Messenger for MockMessenger { |
| fn send(&self, message: &str) { |
| self.sent_messages |
| 3 .borrow_mut() |
| .push(String::from(message)); |
| } |
| } |
| |
| #[test] |
| fn it_sends_an_over_75_percent_warning_message() { |
| --snip-- |
| |
| assert_eq!( |
| 4 mock_messenger.sent_messages.borrow().len(), |
| 1 |
| ); |
| } |
| } |
| ``` |
| |
| Listing 15-22: Using `RefCell<T>` to mutate an inner value while the outer |
| value is considered immutable |
| |
| The `sent_messages` field is now of type `RefCell<Vec<String>>` [1] instead of |
| `Vec<String>`. In the `new` function, we create a new `RefCell<Vec<String>>` |
| instance around the empty vector [2]. |
| |
| For the implementation of the `send` method, the first parameter is still an |
| immutable borrow of `self`, which matches the trait definition. We call |
| `borrow_mut` on the `RefCell<Vec<String>>` in `self.sent_messages` [3] to get a |
| mutable reference to the value inside the `RefCell<Vec<String>>`, which is the |
| vector. Then we can call `push` on the mutable reference to the vector to keep |
| track of the messages sent during the test. |
| |
| The last change we have to make is in the assertion: to see how many items are |
| in the inner vector, we call `borrow` on the `RefCell<Vec<String>>` to get an |
| immutable reference to the vector [4]. |
| |
| Now that you’ve seen how to use `RefCell<T>`, let’s dig into how it works! |
| |
| #### Keeping Track of Borrows at Runtime with RefCell<T> |
| |
| When creating immutable and mutable references, we use the `&` and `&mut` |
| syntax, respectively. With `RefCell<T>`, we use the `borrow` and `borrow_mut` |
| methods, which are part of the safe API that belongs to `RefCell<T>`. The |
| `borrow` method returns the smart pointer type `Ref<T>`, and `borrow_mut` |
| returns the smart pointer type `RefMut<T>`. Both types implement `Deref`, so we |
| can treat them like regular references. |
| |
| The `RefCell<T>` keeps track of how many `Ref<T>` and `RefMut<T>` smart |
| pointers are currently active. Every time we call `borrow`, the `RefCell<T>` |
| increases its count of how many immutable borrows are active. When a `Ref<T>` |
| value goes out of scope, the count of immutable borrows goes down by 1. Just |
| like the compile-time borrowing rules, `RefCell<T>` lets us have many immutable |
| borrows or one mutable borrow at any point in time. |
| |
| If we try to violate these rules, rather than getting a compiler error as we |
| would with references, the implementation of `RefCell<T>` will panic at |
| runtime. Listing 15-23 shows a modification of the implementation of `send` in |
| Listing 15-22. We’re deliberately trying to create two mutable borrows active |
| for the same scope to illustrate that `RefCell<T>` prevents us from doing this |
| at runtime. |
| |
| Filename: src/lib.rs |
| |
| ``` |
| impl Messenger for MockMessenger { |
| fn send(&self, message: &str) { |
| let mut one_borrow = self.sent_messages.borrow_mut(); |
| let mut two_borrow = self.sent_messages.borrow_mut(); |
| |
| one_borrow.push(String::from(message)); |
| two_borrow.push(String::from(message)); |
| } |
| } |
| ``` |
| |
| Listing 15-23: Creating two mutable references in the same scope to see that |
| `RefCell<T>` will panic |
| |
| We create a variable `one_borrow` for the `RefMut<T>` smart pointer returned |
| from `borrow_mut`. Then we create another mutable borrow in the same way in the |
| variable `two_borrow`. This makes two mutable references in the same scope, |
| which isn’t allowed. When we run the tests for our library, the code in Listing |
| 15-23 will compile without any errors, but the test will fail: |
| |
| ``` |
| ---- tests::it_sends_an_over_75_percent_warning_message stdout ---- |
| thread 'main' panicked at 'already borrowed: BorrowMutError', src/lib.rs:60:53 |
| note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace |
| ``` |
| |
| Notice that the code panicked with the message `already borrowed: |
| BorrowMutError`. This is how `RefCell<T>` handles violations of the borrowing |
| rules at runtime. |
| |
| Choosing to catch borrowing errors at runtime rather than compile time, as |
| we’ve done here, means you’d potentially be finding mistakes in your code later |
| in the development process: possibly not until your code was deployed to |
| production. Also, your code would incur a small runtime performance penalty as |
| a result of keeping track of the borrows at runtime rather than compile time. |
| However, using `RefCell<T>` makes it possible to write a mock object that can |
| modify itself to keep track of the messages it has seen while you’re using it |
| in a context where only immutable values are allowed. You can use `RefCell<T>` |
| despite its trade-offs to get more functionality than regular references |
| provide. |
| |
| ### Allowing Multiple Owners of Mutable Data with Rc<T> and RefCell<T> |
| |
| A common way to use `RefCell<T>` is in combination with `Rc<T>`. Recall that |
| `Rc<T>` lets you have multiple owners of some data, but it only gives immutable |
| access to that data. If you have an `Rc<T>` that holds a `RefCell<T>`, you can |
| get a value that can have multiple owners *and* that you can mutate! |
| |
| For example, recall the cons list example in Listing 15-18 where we used |
| `Rc<T>` to allow multiple lists to share ownership of another list. Because |
| `Rc<T>` holds only immutable values, we can’t change any of the values in the |
| list once we’ve created them. Let’s add in `RefCell<T>` for its ability to |
| change the values in the lists. Listing 15-24 shows that by using a |
| `RefCell<T>` in the `Cons` definition, we can modify the value stored in all |
| the lists. |
| |
| Filename: src/main.rs |
| |
| ``` |
| #[derive(Debug)] |
| enum List { |
| Cons(Rc<RefCell<i32>>, Rc<List>), |
| Nil, |
| } |
| |
| use crate::List::{Cons, Nil}; |
| use std::cell::RefCell; |
| use std::rc::Rc; |
| |
| fn main() { |
| 1 let value = Rc::new(RefCell::new(5)); |
| |
| 2 let a = Rc::new(Cons(Rc::clone(&value), Rc::new(Nil))); |
| |
| let b = Cons(Rc::new(RefCell::new(3)), Rc::clone(&a)); |
| let c = Cons(Rc::new(RefCell::new(4)), Rc::clone(&a)); |
| |
| 3 *value.borrow_mut() += 10; |
| |
| println!("a after = {:?}", a); |
| println!("b after = {:?}", b); |
| println!("c after = {:?}", c); |
| } |
| ``` |
| |
| Listing 15-24: Using `Rc<RefCell<i32>>` to create a `List` that we can mutate |
| |
| We create a value that is an instance of `Rc<RefCell<i32>>` and store it in a |
| variable named `value` [1] so we can access it directly later. Then we create a |
| `List` in `a` with a `Cons` variant that holds `value` [2]. We need to clone |
| `value` so both `a` and `value` have ownership of the inner `5` value rather |
| than transferring ownership from `value` to `a` or having `a` borrow from |
| `value`. |
| |
| We wrap the list `a` in an `Rc<T>` so when we create lists `b` and `c`, they |
| can both refer to `a`, which is what we did in Listing 15-18. |
| |
| After we’ve created the lists in `a`, `b`, and `c`, we want to add 10 to the |
| value in `value` [3]. We do this by calling `borrow_mut` on `value`, which uses |
| the automatic dereferencing feature we discussed in “Where’s the -> Operator?” |
| on page XX to dereference the `Rc<T>` to the inner `RefCell<T>` value. The |
| `borrow_mut` method returns a `RefMut<T>` smart pointer, and we use the |
| dereference operator on it and change the inner value. |
| |
| When we print `a`, `b`, and `c`, we can see that they all have the modified |
| value of `15` rather than `5`: |
| |
| ``` |
| a after = Cons(RefCell { value: 15 }, Nil) |
| b after = Cons(RefCell { value: 3 }, Cons(RefCell { value: 15 }, Nil)) |
| c after = Cons(RefCell { value: 4 }, Cons(RefCell { value: 15 }, Nil)) |
| ``` |
| |
| This technique is pretty neat! By using `RefCell<T>`, we have an outwardly |
| immutable `List` value. But we can use the methods on `RefCell<T>` that provide |
| access to its interior mutability so we can modify our data when we need to. |
| The runtime checks of the borrowing rules protect us from data races, and it’s |
| sometimes worth trading a bit of speed for this flexibility in our data |
| structures. Note that `RefCell<T>` does not work for multithreaded code! |
| `Mutex<T>` is the thread-safe version of `RefCell<T>`, and we’ll discuss |
| `Mutex<T>` in Chapter 16. |
| |
| ## Reference Cycles Can Leak Memory |
| |
| Rust’s memory safety guarantees make it difficult, but not impossible, to |
| accidentally create memory that is never cleaned up (known as a *memory leak*). |
| Preventing memory leaks entirely is not one of Rust’s guarantees, meaning |
| memory leaks are memory safe in Rust. We can see that Rust allows memory leaks |
| by using `Rc<T>` and `RefCell<T>`: it’s possible to create references where |
| items refer to each other in a cycle. This creates memory leaks because the |
| reference count of each item in the cycle will never reach 0, and the values |
| will never be dropped. |
| |
| ### Creating a Reference Cycle |
| |
| Let’s look at how a reference cycle might happen and how to prevent it, |
| starting with the definition of the `List` enum and a `tail` method in Listing |
| 15-25. |
| |
| Filename: src/main.rs |
| |
| ``` |
| use crate::List::{Cons, Nil}; |
| use std::cell::RefCell; |
| use std::rc::Rc; |
| |
| #[derive(Debug)] |
| enum List { |
| 1 Cons(i32, RefCell<Rc<List>>), |
| Nil, |
| } |
| |
| impl List { |
| 2 fn tail(&self) -> Option<&RefCell<Rc<List>>> { |
| match self { |
| Cons(_, item) => Some(item), |
| Nil => None, |
| } |
| } |
| } |
| ``` |
| |
| Listing 15-25: A cons list definition that holds a `RefCell<T>` so we can |
| modify what a `Cons` variant is referring to |
| |
| We’re using another variation of the `List` definition from Listing 15-5. The |
| second element in the `Cons` variant is now `RefCell<Rc<List>>` [1], meaning |
| that instead of having the ability to modify the `i32` value as we did in |
| Listing 15-24, we want to modify the `List` value a `Cons` variant is pointing |
| to. We’re also adding a `tail` method [2] to make it convenient for us to |
| access the second item if we have a `Cons` variant. |
| |
| In Listing 15-26, we’re adding a `main` function that uses the definitions in |
| Listing 15-25. This code creates a list in `a` and a list in `b` that points to |
| the list in `a`. Then it modifies the list in `a` to point to `b`, creating a |
| reference cycle. There are `println!` statements along the way to show what the |
| reference counts are at various points in this process. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| 1 let a = Rc::new(Cons(5, RefCell::new(Rc::new(Nil)))); |
| |
| println!("a initial rc count = {}", Rc::strong_count(&a)); |
| println!("a next item = {:?}", a.tail()); |
| |
| 2 let b = Rc::new(Cons(10, RefCell::new(Rc::clone(&a)))); |
| |
| println!( |
| "a rc count after b creation = {}", |
| Rc::strong_count(&a) |
| ); |
| println!("b initial rc count = {}", Rc::strong_count(&b)); |
| println!("b next item = {:?}", b.tail()); |
| |
| 3 if let Some(link) = a.tail() { |
| 4 *link.borrow_mut() = Rc::clone(&b); |
| } |
| |
| println!( |
| "b rc count after changing a = {}", |
| Rc::strong_count(&b) |
| ); |
| println!( |
| "a rc count after changing a = {}", |
| Rc::strong_count(&a) |
| ); |
| |
| // Uncomment the next line to see that we have a cycle; |
| // it will overflow the stack |
| // println!("a next item = {:?}", a.tail()); |
| } |
| ``` |
| |
| Listing 15-26: Creating a reference cycle of two `List` values pointing to each |
| other |
| |
| We create an `Rc<List>` instance holding a `List` value in the variable `a` |
| with an initial list of `5, Nil` [1]. We then create an `Rc<List>` instance |
| holding another `List` value in the variable `b` that contains the value `10` |
| and points to the list in `a` [2]. |
| |
| We modify `a` so it points to `b` instead of `Nil`, creating a cycle. We do |
| that by using the `tail` method to get a reference to the `RefCell<Rc<List>>` |
| in `a`, which we put in the variable `link` [3]. Then we use the `borrow_mut` |
| method on the `RefCell<Rc<List>>` to change the value inside from an `Rc<List>` |
| that holds a `Nil` value to the `Rc<List>` in `b` [4]. |
| |
| When we run this code, keeping the last `println!` commented out for the |
| moment, we’ll get this output: |
| |
| ``` |
| a initial rc count = 1 |
| a next item = Some(RefCell { value: Nil }) |
| a rc count after b creation = 2 |
| b initial rc count = 1 |
| b next item = Some(RefCell { value: Cons(5, RefCell { value: Nil }) }) |
| b rc count after changing a = 2 |
| a rc count after changing a = 2 |
| ``` |
| |
| The reference count of the `Rc<List>` instances in both `a` and `b` is 2 after |
| we change the list in `a` to point to `b`. At the end of `main`, Rust drops the |
| variable `b`, which decreases the reference count of the `b` `Rc<List>` |
| instance from 2 to 1. The memory that `Rc<List>` has on the heap won’t be |
| dropped at this point because its reference count is 1, not 0. Then Rust drops |
| `a`, which decreases the reference count of the `a` `Rc<List>` instance from 2 |
| to 1 as well. This instance’s memory can’t be dropped either, because the other |
| `Rc<List>` instance still refers to it. The memory allocated to the list will |
| remain uncollected forever. To visualize this reference cycle, we’ve created a |
| diagram in Figure 15-4. |
| |
| Figure 15-4: A reference cycle of lists `a` and `b` pointing to each other |
| |
| If you uncomment the last `println!` and run the program, Rust will try to |
| print this cycle with `a` pointing to `b` pointing to `a` and so forth until it |
| overflows the stack. |
| |
| Compared to a real-world program, the consequences of creating a reference |
| cycle in this example aren’t very dire: right after we create the reference |
| cycle, the program ends. However, if a more complex program allocated lots of |
| memory in a cycle and held onto it for a long time, the program would use more |
| memory than it needed and might overwhelm the system, causing it to run out of |
| available memory. |
| |
| Creating reference cycles is not easily done, but it’s not impossible either. |
| If you have `RefCell<T>` values that contain `Rc<T>` values or similar nested |
| combinations of types with interior mutability and reference counting, you must |
| ensure that you don’t create cycles; you can’t rely on Rust to catch them. |
| Creating a reference cycle would be a logic bug in your program that you should |
| use automated tests, code reviews, and other software development practices to |
| minimize. |
| |
| Another solution for avoiding reference cycles is reorganizing your data |
| structures so that some references express ownership and some references don’t. |
| As a result, you can have cycles made up of some ownership relationships and |
| some non-ownership relationships, and only the ownership relationships affect |
| whether or not a value can be dropped. In Listing 15-25, we always want `Cons` |
| variants to own their list, so reorganizing the data structure isn’t possible. |
| Let’s look at an example using graphs made up of parent nodes and child nodes |
| to see when non-ownership relationships are an appropriate way to prevent |
| reference cycles. |
| |
| ### Preventing Reference Cycles Using Weak<T> |
| |
| So far, we’ve demonstrated that calling `Rc::clone` increases the |
| `strong_count` of an `Rc<T>` instance, and an `Rc<T>` instance is only cleaned |
| up if its `strong_count` is 0. You can also create a *weak reference* to the |
| value within an `Rc<T>` instance by calling `Rc::downgrade` and passing a |
| reference to the `Rc<T>`. Strong references are how you can share ownership of |
| an `Rc<T>` instance. Weak references don’t express an ownership relationship, |
| and their count doesn’t affect when an `Rc<T>` instance is cleaned up. They |
| won’t cause a reference cycle because any cycle involving some weak references |
| will be broken once the strong reference count of values involved is 0. |
| |
| When you call `Rc::downgrade`, you get a smart pointer of type `Weak<T>`. |
| Instead of increasing the `strong_count` in the `Rc<T>` instance by 1, calling |
| `Rc::downgrade` increases the `weak_count` by 1. The `Rc<T>` type uses |
| `weak_count` to keep track of how many `Weak<T>` references exist, similar to |
| `strong_count`. The difference is the `weak_count` doesn’t need to be 0 for the |
| `Rc<T>` instance to be cleaned up. |
| |
| Because the value that `Weak<T>` references might have been dropped, to do |
| anything with the value that a `Weak<T>` is pointing to you must make sure the |
| value still exists. Do this by calling the `upgrade` method on a `Weak<T>` |
| instance, which will return an `Option<Rc<T>>`. You’ll get a result of `Some` |
| if the `Rc<T>` value has not been dropped yet and a result of `None` if the |
| `Rc<T>` value has been dropped. Because `upgrade` returns an `Option<Rc<T>>`, |
| Rust will ensure that the `Some` case and the `None` case are handled, and |
| there won’t be an invalid pointer. |
| |
| As an example, rather than using a list whose items know only about the next |
| item, we’ll create a tree whose items know about their children items *and* |
| their parent items. |
| |
| #### Creating a Tree Data Structure: A Node with Child Nodes |
| |
| To start, we’ll build a tree with nodes that know about their child nodes. |
| We’ll create a struct named `Node` that holds its own `i32` value as well as |
| references to its children `Node` values: |
| |
| Filename: src/main.rs |
| |
| ``` |
| use std::cell::RefCell; |
| use std::rc::Rc; |
| |
| #[derive(Debug)] |
| struct Node { |
| value: i32, |
| children: RefCell<Vec<Rc<Node>>>, |
| } |
| ``` |
| |
| We want a `Node` to own its children, and we want to share that ownership with |
| variables so we can access each `Node` in the tree directly. To do this, we |
| define the `Vec<T>` items to be values of type `Rc<Node>`. We also want to |
| modify which nodes are children of another node, so we have a `RefCell<T>` in |
| `children` around the `Vec<Rc<Node>>`. |
| |
| Next, we’ll use our struct definition and create one `Node` instance named |
| `leaf` with the value `3` and no children, and another instance named `branch` |
| with the value `5` and `leaf` as one of its children, as shown in Listing 15-27. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let leaf = Rc::new(Node { |
| value: 3, |
| children: RefCell::new(vec![]), |
| }); |
| |
| let branch = Rc::new(Node { |
| value: 5, |
| children: RefCell::new(vec![Rc::clone(&leaf)]), |
| }); |
| } |
| ``` |
| |
| Listing 15-27: Creating a `leaf` node with no children and a `branch` node with |
| `leaf` as one of its children |
| |
| We clone the `Rc<Node>` in `leaf` and store that in `branch`, meaning the |
| `Node` in `leaf` now has two owners: `leaf` and `branch`. We can get from |
| `branch` to `leaf` through `branch.children`, but there’s no way to get from |
| `leaf` to `branch`. The reason is that `leaf` has no reference to `branch` and |
| doesn’t know they’re related. We want `leaf` to know that `branch` is its |
| parent. We’ll do that next. |
| |
| #### Adding a Reference from a Child to Its Parent |
| |
| To make the child node aware of its parent, we need to add a `parent` field to |
| our `Node` struct definition. The trouble is in deciding what the type of |
| `parent` should be. We know it can’t contain an `Rc<T>` because that would |
| create a reference cycle with `leaf.parent` pointing to `branch` and |
| `branch.children` pointing to `leaf`, which would cause their `strong_count` |
| values to never be 0. |
| |
| Thinking about the relationships another way, a parent node should own its |
| children: if a parent node is dropped, its child nodes should be dropped as |
| well. However, a child should not own its parent: if we drop a child node, the |
| parent should still exist. This is a case for weak references! |
| |
| So, instead of `Rc<T>`, we’ll make the type of `parent` use `Weak<T>`, |
| specifically a `RefCell<Weak<Node>>`. Now our `Node` struct definition looks |
| like this: |
| |
| Filename: src/main.rs |
| |
| ``` |
| use std::cell::RefCell; |
| use std::rc::{Rc, Weak}; |
| |
| #[derive(Debug)] |
| struct Node { |
| value: i32, |
| parent: RefCell<Weak<Node>>, |
| children: RefCell<Vec<Rc<Node>>>, |
| } |
| ``` |
| |
| A node will be able to refer to its parent node but doesn’t own its parent. In |
| Listing 15-28, we update `main` to use this new definition so the `leaf` node |
| will have a way to refer to its parent, `branch`. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let leaf = Rc::new(Node { |
| value: 3, |
| 1 parent: RefCell::new(Weak::new()), |
| children: RefCell::new(vec![]), |
| }); |
| |
| 2 println!( |
| "leaf parent = {:?}", |
| leaf.parent.borrow().upgrade() |
| ); |
| |
| let branch = Rc::new(Node { |
| value: 5, |
| 3 parent: RefCell::new(Weak::new()), |
| children: RefCell::new(vec![Rc::clone(&leaf)]), |
| }); |
| |
| 4 *leaf.parent.borrow_mut() = Rc::downgrade(&branch); |
| |
| 5 println!( |
| "leaf parent = {:?}", |
| leaf.parent.borrow().upgrade() |
| ); |
| } |
| ``` |
| |
| Listing 15-28: A `leaf` node with a weak reference to its parent node, `branch` |
| |
| Creating the `leaf` node looks similar to Listing 15-27 with the exception of |
| the `parent` field: `leaf` starts out without a parent, so we create a new, |
| empty `Weak<Node>` reference instance [1]. |
| |
| At this point, when we try to get a reference to the parent of `leaf` by using |
| the `upgrade` method, we get a `None` value. We see this in the output from the |
| first `println!` statement [2]: |
| |
| ``` |
| leaf parent = None |
| ``` |
| |
| When we create the `branch` node, it will also have a new `Weak<Node>` |
| reference in the `parent` field [3] because `branch` doesn’t have a parent |
| node. We still have `leaf` as one of the children of `branch`. Once we have the |
| `Node` instance in `branch`, we can modify `leaf` to give it a `Weak<Node>` |
| reference to its parent [4]. We use the `borrow_mut` method on the |
| `RefCell<Weak<Node>>` in the `parent` field of `leaf`, and then we use the |
| `Rc::downgrade` function to create a `Weak<Node>` reference to `branch` from |
| the `Rc<Node>` in `branch`. |
| |
| When we print the parent of `leaf` again [5], this time we’ll get a `Some` |
| variant holding `branch`: now `leaf` can access its parent! When we print |
| `leaf`, we also avoid the cycle that eventually ended in a stack overflow like |
| we had in Listing 15-26; the `Weak<Node>` references are printed as `(Weak)`: |
| |
| ``` |
| leaf parent = Some(Node { value: 5, parent: RefCell { value: (Weak) }, |
| children: RefCell { value: [Node { value: 3, parent: RefCell { value: (Weak) }, |
| children: RefCell { value: [] } }] } }) |
| ``` |
| |
| The lack of infinite output indicates that this code didn’t create a reference |
| cycle. We can also tell this by looking at the values we get from calling |
| `Rc::strong_count` and `Rc::weak_count`. |
| |
| #### Visualizing Changes to strong_count and weak_count |
| |
| Let’s look at how the `strong_count` and `weak_count` values of the `Rc<Node>` |
| instances change by creating a new inner scope and moving the creation of |
| `branch` into that scope. By doing so, we can see what happens when `branch` is |
| created and then dropped when it goes out of scope. The modifications are shown |
| in Listing 15-29. |
| |
| Filename: src/main.rs |
| |
| ``` |
| fn main() { |
| let leaf = Rc::new(Node { |
| value: 3, |
| parent: RefCell::new(Weak::new()), |
| children: RefCell::new(vec![]), |
| }); |
| |
| 1 println!( |
| "leaf strong = {}, weak = {}", |
| Rc::strong_count(&leaf), |
| Rc::weak_count(&leaf), |
| ); |
| |
| 2 { |
| let branch = Rc::new(Node { |
| value: 5, |
| parent: RefCell::new(Weak::new()), |
| children: RefCell::new(vec![Rc::clone(&leaf)]), |
| }); |
| |
| *leaf.parent.borrow_mut() = Rc::downgrade(&branch); |
| |
| 3 println!( |
| "branch strong = {}, weak = {}", |
| Rc::strong_count(&branch), |
| Rc::weak_count(&branch), |
| ); |
| |
| 4 println!( |
| "leaf strong = {}, weak = {}", |
| Rc::strong_count(&leaf), |
| Rc::weak_count(&leaf), |
| ); |
| 5 } |
| |
| 6 println!( |
| "leaf parent = {:?}", |
| leaf.parent.borrow().upgrade() |
| ); |
| 7 println!( |
| "leaf strong = {}, weak = {}", |
| Rc::strong_count(&leaf), |
| Rc::weak_count(&leaf), |
| ); |
| } |
| ``` |
| |
| Listing 15-29: Creating `branch` in an inner scope and examining strong and |
| weak reference counts |
| |
| After `leaf` is created, its `Rc<Node>` has a strong count of 1 and a weak |
| count of 0 [1]. In the inner scope [2], we create `branch` and associate it |
| with `leaf`, at which point when we print the counts [3], the `Rc<Node>` in |
| `branch` will have a strong count of 1 and a weak count of 1 (for `leaf.parent` |
| pointing to `branch` with a `Weak<Node>`). When we print the counts in `leaf` |
| [4], we’ll see it will have a strong count of 2 because `branch` now has a |
| clone of the `Rc<Node>` of `leaf` stored in `branch.children`, but will still |
| have a weak count of 0. |
| |
| When the inner scope ends [5], `branch` goes out of scope and the strong count |
| of the `Rc<Node>` decreases to 0, so its `Node` is dropped. The weak count of 1 |
| from `leaf.parent` has no bearing on whether or not `Node` is dropped, so we |
| don’t get any memory leaks! |
| |
| If we try to access the parent of `leaf` after the end of the scope, we’ll get |
| `None` again [6]. At the end of the program [7], the `Rc<Node>` in `leaf` has a |
| strong count of 1 and a weak count of 0 because the variable `leaf` is now the |
| only reference to the `Rc<Node>` again. |
| |
| All of the logic that manages the counts and value dropping is built into |
| `Rc<T>` and `Weak<T>` and their implementations of the `Drop` trait. By |
| specifying that the relationship from a child to its parent should be a |
| `Weak<T>` reference in the definition of `Node`, you’re able to have parent |
| nodes point to child nodes and vice versa without creating a reference cycle |
| and memory leaks. |
| |
| ## Summary |
| |
| This chapter covered how to use smart pointers to make different guarantees and |
| trade-offs from those Rust makes by default with regular references. The |
| `Box<T>` type has a known size and points to data allocated on the heap. The |
| `Rc<T>` type keeps track of the number of references to data on the heap so |
| that data can have multiple owners. The `RefCell<T>` type with its interior |
| mutability gives us a type that we can use when we need an immutable type but |
| need to change an inner value of that type; it also enforces the borrowing |
| rules at runtime instead of at compile time. |
| |
| Also discussed were the `Deref` and `Drop` traits, which enable a lot of the |
| functionality of smart pointers. We explored reference cycles that can cause |
| memory leaks and how to prevent them using `Weak<T>`. |
| |
| If this chapter has piqued your interest and you want to implement your own |
| smart pointers, check out “The Rustonomicon” at |
| *https://doc.rust-lang.org/stable/nomicon* for more useful information. |
| |
| Next, we’ll talk about concurrency in Rust. You’ll even learn about a few new |
| smart pointers. |
| |