| ## What Is Ownership? |
| |
| _Ownership_ is a set of rules that govern how a Rust program manages memory. |
| All programs have to manage the way they use a computer’s memory while running. |
| Some languages have garbage collection that regularly looks for no-longer-used |
| memory as the program runs; in other languages, the programmer must explicitly |
| allocate and free the memory. Rust uses a third approach: memory is managed |
| through a system of ownership with a set of rules that the compiler checks. If |
| any of the rules are violated, the program won’t compile. None of the features |
| of ownership will slow down your program while it’s running. |
| |
| Because ownership is a new concept for many programmers, it does take some time |
| to get used to. The good news is that the more experienced you become with Rust |
| and the rules of the ownership system, the easier you’ll find it to naturally |
| develop code that is safe and efficient. Keep at it! |
| |
| When you understand ownership, you’ll have a solid foundation for understanding |
| the features that make Rust unique. In this chapter, you’ll learn ownership by |
| working through some examples that focus on a very common data structure: |
| strings. |
| |
| > ### The Stack and the Heap |
| > |
| > Many programming languages don’t require you to think about the stack and the |
| > heap very often. But in a systems programming language like Rust, whether a |
| > value is on the stack or the heap affects how the language behaves and why |
| > you have to make certain decisions. Parts of ownership will be described in |
| > relation to the stack and the heap later in this chapter, so here is a brief |
| > explanation in preparation. |
| > |
| > Both the stack and the heap are parts of memory available to your code to use |
| > at runtime, but they are structured in different ways. The stack stores |
| > values in the order it gets them and removes the values in the opposite |
| > order. This is referred to as _last in, first out_. Think of a stack of |
| > plates: when you add more plates, you put them on top of the pile, and when |
| > you need a plate, you take one off the top. Adding or removing plates from |
| > the middle or bottom wouldn’t work as well! Adding data is called _pushing |
| > onto the stack_, and removing data is called _popping off the stack_. All |
| > data stored on the stack must have a known, fixed size. Data with an unknown |
| > size at compile time or a size that might change must be stored on the heap |
| > instead. |
| > |
| > The heap is less organized: when you put data on the heap, you request a |
| > certain amount of space. The memory allocator finds an empty spot in the heap |
| > that is big enough, marks it as being in use, and returns a _pointer_, which |
| > is the address of that location. This process is called _allocating on the |
| > heap_ and is sometimes abbreviated as just _allocating_ (pushing values onto |
| > the stack is not considered allocating). Because the pointer to the heap is a |
| > known, fixed size, you can store the pointer on the stack, but when you want |
| > the actual data, you must follow the pointer. Think of being seated at a |
| > restaurant. When you enter, you state the number of people in your group, and |
| > the host finds an empty table that fits everyone and leads you there. If |
| > someone in your group comes late, they can ask where you’ve been seated to |
| > find you. |
| > |
| > Pushing to the stack is faster than allocating on the heap because the |
| > allocator never has to search for a place to store new data; that location is |
| > always at the top of the stack. Comparatively, allocating space on the heap |
| > requires more work because the allocator must first find a big enough space |
| > to hold the data and then perform bookkeeping to prepare for the next |
| > allocation. |
| > |
| > Accessing data in the heap is generally slower than accessing data on the |
| > stack because you have to follow a pointer to get there. Contemporary processors are faster |
| > if they jump around less in memory. Continuing the analogy, consider a server |
| > at a restaurant taking orders from many tables. It’s most efficient to get |
| > all the orders at one table before moving on to the next table. Taking an |
| > order from table A, then an order from table B, then one from A again, and |
| > then one from B again would be a much slower process. By the same token, a |
| > processor can do its job better if it works on data that’s close to other |
| > data (as it is on the stack) rather than farther away (as it can be on the |
| > heap). |
| > |
| > When your code calls a function, the values passed into the function |
| > (including, potentially, pointers to data on the heap) and the function’s |
| > local variables get pushed onto the stack. When the function is over, those |
| > values get popped off the stack. |
| > |
| > Keeping track of what parts of code are using what data on the heap, |
| > minimizing the amount of duplicate data on the heap, and cleaning up unused |
| > data on the heap so you don’t run out of space are all problems that ownership |
| > addresses. Once you understand ownership, you won’t need to think about the |
| > stack and the heap very often, but knowing that the main purpose of ownership |
| > is to manage heap data can help explain why it works the way it does. |
| |
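As a rough illustration of the distinction described above, the following sketch contrasts a fixed-size array with a growable vector. `Vec` is not covered until Chapter 8, and the helper names `fixed_size_bytes` and `grow` are made up for this example. Treat it as a sketch, not as a statement about where the compiler must place any particular value.

```rust
use std::mem::size_of;

// Hypothetical helper: the size of `[i32; 3]` is part of its type, so it is
// known at compile time (three 4-byte integers).
fn fixed_size_bytes() -> usize {
    size_of::<[i32; 3]>()
}

// Hypothetical helper: a `Vec<i32>` keeps its elements on the heap, so the
// number of elements can be decided while the program runs.
fn grow(count: i32) -> Vec<i32> {
    let mut values = Vec::new();
    for i in 0..count {
        values.push(i);
    }
    values
}

fn main() {
    let fixed = [1, 2, 3];
    println!("array of {} elements, {} bytes", fixed.len(), fixed_size_bytes());

    let growable = grow(4);
    println!("vector grew to {} elements at runtime", growable.len());
}
```

The array's size never changes, while the vector requests heap memory from the allocator as elements are pushed.
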
| ### Ownership Rules |
| |
| First, let’s take a look at the ownership rules. Keep these rules in mind as we |
| work through the examples that illustrate them: |
| |
| - Each value in Rust has an _owner_. |
| - There can only be one owner at a time. |
| - When the owner goes out of scope, the value will be dropped. |
| |
| ### Variable Scope |
| |
| Now that we’re past basic Rust syntax, we won’t include all the `fn main() {` |
| code in examples, so if you’re following along, make sure to put the following |
| examples inside a `main` function manually. As a result, our examples will be a |
| bit more concise, letting us focus on the actual details rather than |
| boilerplate code. |
| |
| As a first example of ownership, we’ll look at the _scope_ of some variables. A |
| scope is the range within a program for which an item is valid. Take the |
| following variable: |
| |
| ```rust |
| let s = "hello"; |
| ``` |
| |
| The variable `s` refers to a string literal, where the value of the string is |
| hardcoded into the text of our program. The variable is valid from the point at |
| which it’s declared until the end of the current _scope_. Listing 4-1 shows a |
| program with comments annotating where the variable `s` would be valid. |
| |
| <Listing number="4-1" caption="A variable and the scope in which it is valid"> |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-01/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| In other words, there are two important points in time here: |
| |
| - When `s` comes _into_ scope, it is valid. |
| - It remains valid until it goes _out of_ scope. |
| |
| At this point, the relationship between scopes and when variables are valid is |
| similar to that in other programming languages. Now we’ll build on top of this |
| understanding by introducing the `String` type. |
| |
| ### The `String` Type |
| |
| To illustrate the rules of ownership, we need a data type that is more complex |
| than those we covered in the [“Data Types”][data-types]<!-- ignore --> section |
| of Chapter 3. The types covered previously are of a known size, can be stored |
| on the stack and popped off the stack when their scope is over, and can be |
| quickly and trivially copied to make a new, independent instance if another |
| part of code needs to use the same value in a different scope. But we want to |
| look at data that is stored on the heap and explore how Rust knows when to |
| clean up that data, and the `String` type is a great example. |
| |
| We’ll concentrate on the parts of `String` that relate to ownership. These |
| aspects also apply to other complex data types, whether they are provided by |
| the standard library or created by you. We’ll discuss `String` in more depth in |
| [Chapter 8][ch8]<!-- ignore -->. |
| |
| We’ve already seen string literals, where a string value is hardcoded into our |
| program. String literals are convenient, but they aren’t suitable for every |
| situation in which we may want to use text. One reason is that they’re |
| immutable. Another is that not every string value can be known when we write |
| our code: for example, what if we want to take user input and store it? For |
| these situations, Rust has a second string type, `String`. This type manages |
| data allocated on the heap and as such is able to store an amount of text that |
| is unknown to us at compile time. You can create a `String` from a string |
| literal using the `from` function, like so: |
| |
| ```rust |
| let s = String::from("hello"); |
| ``` |
| |
| The double colon `::` operator allows us to namespace this particular `from` |
| function under the `String` type rather than using some sort of name like |
| `string_from`. We’ll discuss this syntax more in the [“Method |
| Syntax”][method-syntax]<!-- ignore --> section of Chapter 5, and when we talk |
| about namespacing with modules in [“Paths for Referring to an Item in the |
| Module Tree”][paths-module-tree]<!-- ignore --> in Chapter 7. |
| |
| This kind of string _can_ be mutated: |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-01-can-mutate-string/src/main.rs:here}} |
| ``` |
| |
| So, what’s the difference here? Why can `String` be mutated but literals |
| cannot? The difference is in how these two types deal with memory. |
| |
| ### Memory and Allocation |
| |
| In the case of a string literal, we know the contents at compile time, so the |
| text is hardcoded directly into the final executable. This is why string |
| literals are fast and efficient. But these properties only come from the string |
| literal’s immutability. Unfortunately, we can’t put a blob of memory into the |
| binary for each piece of text whose size is unknown at compile time and whose |
| size might change while running the program. |
| |
| With the `String` type, in order to support a mutable, growable piece of text, |
| we need to allocate an amount of memory on the heap, unknown at compile time, |
| to hold the contents. This means: |
| |
| - The memory must be requested from the memory allocator at runtime. |
| - We need a way of returning this memory to the allocator when we’re done with |
| our `String`. |
| |
| That first part is done by us: when we call `String::from`, its implementation |
| requests the memory it needs. This is pretty much universal in programming |
| languages. |
| |
| However, the second part is different. In languages with a _garbage collector |
| (GC)_, the GC keeps track of and cleans up memory that isn’t being used |
| anymore, and we don’t need to think about it. In most languages without a GC, |
| it’s our responsibility to identify when memory is no longer being used and to |
| call code to explicitly free it, just as we did to request it. Doing this |
| correctly has historically been a difficult programming problem. If we forget, |
| we’ll waste memory. If we do it too early, we’ll have an invalid variable. If |
| we do it twice, that’s a bug too. We need to pair exactly one `allocate` with |
| exactly one `free`. |
| |
| Rust takes a different path: the memory is automatically returned once the |
| variable that owns it goes out of scope. Here’s a version of our scope example |
| from Listing 4-1 using a `String` instead of a string literal: |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-02-string-scope/src/main.rs:here}} |
| ``` |
| |
| There is a natural point at which we can return the memory our `String` needs |
| to the allocator: when `s` goes out of scope. When a variable goes out of |
| scope, Rust calls a special function for us. This function is called |
| [`drop`][drop]<!-- ignore -->, and it’s where the author of `String` can put |
| the code to return the memory. Rust calls `drop` automatically at the closing |
| curly bracket. |
| |
| > Note: In C++, this pattern of deallocating resources at the end of an item’s |
| > lifetime is sometimes called _Resource Acquisition Is Initialization (RAII)_. |
| > The `drop` function in Rust will be familiar to you if you’ve used RAII |
| > patterns. |
| |
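To watch the cleanup described above happen, here is a minimal sketch that uses a made-up type, `Noisy`, whose `Drop` implementation prints a message. The type and the `run_demo` function are hypothetical and exist only for illustration; the `&mut self` parameter uses reference syntax that has not been introduced yet at this point in the chapter.

```rust
// Hypothetical type for illustration: its `Drop` implementation prints a
// message so we can see when Rust cleans the value up.
struct Noisy {
    name: String,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.name);
    }
}

fn run_demo() {
    let _outer = Noisy { name: String::from("outer") };
    {
        let _inner = Noisy { name: String::from("inner") };
        println!("inner scope is about to end");
    } // `_inner` goes out of scope here, so its `drop` runs
    println!("function is about to end");
} // `_outer` goes out of scope here, so its `drop` runs

fn main() {
    run_demo();
}
```

Running this sketch prints `dropping inner` right after the inner block ends and `dropping outer` only when `run_demo` returns, which matches the closing curly brackets of their respective scopes.
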
| This pattern has a profound impact on the way Rust code is written. It may seem |
| simple right now, but the behavior of code can be unexpected in more |
| complicated situations when we want to have multiple variables use the data |
| we’ve allocated on the heap. Let’s explore some of those situations now. |
| |
| <!-- Old heading. Do not remove or links may break. --> |
| |
| <a id="ways-variables-and-data-interact-move"></a> |
| |
| #### Variables and Data Interacting with Move |
| |
| Multiple variables can interact with the same data in different ways in Rust. |
| Let’s look at an example using an integer in Listing 4-2. |
| |
| <Listing number="4-2" caption="Assigning the integer value of variable `x` to `y`"> |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-02/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We can probably guess what this is doing: “bind the value `5` to `x`; then make |
| a copy of the value in `x` and bind it to `y`.” We now have two variables, `x` |
| and `y`, and both equal `5`. This is indeed what is happening, because integers |
| are simple values with a known, fixed size, and these two `5` values are pushed |
| onto the stack. |
| |
| Now let’s look at the `String` version: |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-03-string-move/src/main.rs:here}} |
| ``` |
| |
| This looks very similar, so we might assume that the way it works would be the |
| same: that is, the second line would make a copy of the value in `s1` and bind |
| it to `s2`. But this isn’t quite what happens. |
| |
| Take a look at Figure 4-1 to see what is happening to `String` under the |
| covers. A `String` is made up of three parts, shown on the left: a pointer to |
| the memory that holds the contents of the string, a length, and a capacity. |
| This group of data is stored on the stack. On the right is the memory on the |
| heap that holds the contents. |
| |
| <img alt="Two tables: the first table contains the representation of s1 on the |
| stack, consisting of its length (5), capacity (5), and a pointer to the first |
| value in the second table. The second table contains the representation of the |
| string data on the heap, byte by byte." src="img/trpl04-01.svg" class="center" |
| style="width: 50%;" /> |
| |
| <span class="caption">Figure 4-1: Representation in memory of a `String` |
| holding the value `"hello"` bound to `s1`</span> |
| |
| The length is how much memory, in bytes, the contents of the `String` are |
| currently using. The capacity is the total amount of memory, in bytes, that the |
| `String` has received from the allocator. The difference between length and |
| capacity matters, but not in this context, so for now, it’s fine to ignore the |
| capacity. |
| |
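The length and capacity described above can be inspected with the `len` and `capacity` methods on `String`. The sketch below is illustrative only: `length_and_capacity` is a made-up helper, and the exact capacity is not something to rely on beyond it being at least as large as the length.

```rust
// Hypothetical helper: takes ownership of a `String` and reports its
// length and capacity, both measured in bytes.
fn length_and_capacity(text: String) -> (usize, usize) {
    (text.len(), text.capacity())
}

fn main() {
    let (len, cap) = length_and_capacity(String::from("hello"));
    println!("hello: len = {}, capacity = {}", len, cap);

    // Length counts bytes, not characters: `é` takes two bytes in UTF-8.
    let (len, cap) = length_and_capacity(String::from("héllo"));
    println!("héllo: len = {}, capacity = {}", len, cap);

    // Reserving room up front leaves the length at 0 while the capacity is
    // at least the requested amount.
    let (len, cap) = length_and_capacity(String::with_capacity(16));
    println!("empty: len = {}, capacity = {}", len, cap);
}
```
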
| When we assign `s1` to `s2`, the `String` data is copied, meaning we copy the |
| pointer, the length, and the capacity that are on the stack. We do not copy the |
| data on the heap that the pointer refers to. In other words, the data |
| representation in memory looks like Figure 4-2. |
| |
| <img alt="Three tables: tables s1 and s2 representing those strings on the |
| stack, respectively, and both pointing to the same string data on the heap." |
| src="img/trpl04-02.svg" class="center" style="width: 50%;" /> |
| |
| <span class="caption">Figure 4-2: Representation in memory of the variable `s2` |
| that has a copy of the pointer, length, and capacity of `s1`</span> |
| |
| The representation does _not_ look like Figure 4-3, which is what memory would |
| look like if Rust instead copied the heap data as well. If Rust did this, the |
| operation `s2 = s1` could be very expensive in terms of runtime performance if |
| the data on the heap were large. |
| |
| <img alt="Four tables: two tables representing the stack data for s1 and s2, |
| and each points to its own copy of string data on the heap." |
| src="img/trpl04-03.svg" class="center" style="width: 50%;" /> |
| |
| <span class="caption">Figure 4-3: Another possibility for what `s2 = s1` might |
| do if Rust copied the heap data as well</span> |
| |
| Earlier, we said that when a variable goes out of scope, Rust automatically |
| calls the `drop` function and cleans up the heap memory for that variable. But |
| Figure 4-2 shows both data pointers pointing to the same location. This is a |
| problem: when `s2` and `s1` go out of scope, they will both try to free the |
| same memory. This is known as a _double free_ error and is one of the memory |
| safety bugs we mentioned previously. Freeing memory twice can lead to memory |
| corruption, which can potentially lead to security vulnerabilities. |
| |
| To ensure memory safety, after the line `let s2 = s1;`, Rust considers `s1` as |
| no longer valid. Therefore, Rust doesn’t need to free anything when `s1` goes |
| out of scope. Check out what happens when you try to use `s1` after `s2` is |
| created; it won’t work: |
| |
| ```rust,ignore,does_not_compile |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-04-cant-use-after-move/src/main.rs:here}} |
| ``` |
| |
| You’ll get an error like this because Rust prevents you from using the |
| invalidated variable: |
| |
| ```console |
| {{#include ../listings/ch04-understanding-ownership/no-listing-04-cant-use-after-move/output.txt}} |
| ``` |
| |
| If you’ve heard the terms _shallow copy_ and _deep copy_ while working with |
| other languages, the concept of copying the pointer, length, and capacity |
| without copying the data probably sounds like making a shallow copy. But |
| because Rust also invalidates the first variable, instead of being called a |
| shallow copy, it’s known as a _move_. In this example, we would say that `s1` |
| was _moved_ into `s2`. So, what actually happens is shown in Figure 4-4. |
| |
| <img alt="Three tables: tables s1 and s2 representing those strings on the |
| stack, respectively, and both pointing to the same string data on the heap. |
| Table s1 is grayed out because s1 is no longer valid; only s2 can be used to |
| access the heap data." src="img/trpl04-04.svg" class="center" style="width: |
| 50%;" /> |
| |
| <span class="caption">Figure 4-4: Representation in memory after `s1` has been |
| invalidated</span> |
| |
| That solves our problem! With only `s2` valid, when it goes out of scope it |
| alone will free the memory, and we’re done. |
| |
| In addition, there’s a design choice that’s implied by this: Rust will never |
| automatically create “deep” copies of your data. Therefore, any _automatic_ |
| copying can be assumed to be inexpensive in terms of runtime performance. |
| |
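One way to convince yourself that a move copies only the stack data is to compare the heap address before and after it. The following is a small sketch under that assumption; `move_keeps_heap_address` is a made-up name, and `as_ptr` is used here only to peek at the address, not as something ordinary code needs.

```rust
// Hypothetical helper: returns true if the heap buffer stays at the same
// address after the `String` is moved to a new variable.
fn move_keeps_heap_address() -> bool {
    let s1 = String::from("hello");
    // Remember where the heap buffer lives before the move.
    let address_before = s1.as_ptr();

    let s2 = s1; // `s1` is moved into `s2` and can no longer be used

    // Only the pointer, length, and capacity were copied, so `s2` points at
    // the very same heap buffer.
    address_before == s2.as_ptr()
}

fn main() {
    println!("same heap address after move: {}", move_keeps_heap_address());
}
```
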
| #### Scope and Assignment |
| |
| The relationship between scoping, ownership, and memory being freed via the |
| `drop` function also works in the other direction. When you assign a completely |
| new value to an existing variable, Rust will call `drop` and free the original |
| value’s memory immediately. Consider this code, for example: |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-04b-replacement-drop/src/main.rs:here}} |
| ``` |
| |
| We initially declare a variable `s` and bind it to a `String` with the value |
| `"hello"`. Then we immediately create a new `String` with the value `"ahoy"` and |
| assign it to `s`. At this point, nothing is referring to the original value on |
| the heap at all. |
| |
| <img alt="One table s representing the string value on the stack, pointing to |
| the second piece of string data (ahoy) on the heap, with the original string |
| data (hello) grayed out because it cannot be accessed anymore." |
| src="img/trpl04-05.svg" |
| class="center" |
| style="width: 50%;" |
| /> |
| |
| <span class="caption">Figure 4-5: Representation in memory after the initial |
| value has been replaced in its entirety</span> |
| |
| The original string no longer has an owner, so Rust runs the `drop` function on |
| it and frees its memory right away. When we print the value at the end, it will |
| be `"ahoy, world!"`. |
| |
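The timing of that cleanup can be made visible with a type that announces when it is dropped. This is a sketch only: `Label` and `replace_label` are hypothetical names invented for the example, and the printing `Drop` implementation stands in for the memory release that `String` performs.

```rust
// Hypothetical type for illustration: the `Drop` implementation reports
// when a value is cleaned up.
struct Label {
    text: String,
}

impl Drop for Label {
    fn drop(&mut self) {
        println!("dropping {}", self.text);
    }
}

// Hypothetical helper: replaces a value and returns the text that is left.
fn replace_label() -> String {
    let mut label = Label { text: String::from("hello") };
    println!("before assignment, label holds {}", label.text);

    // Assigning a new value drops the old "hello" value right here.
    label = Label { text: String::from("ahoy") };
    println!("after assignment, label holds {}", label.text);

    label.text.clone()
} // the "ahoy" value is dropped here

fn main() {
    replace_label();
}
```

The `dropping hello` message appears between the two other messages, not at the end of the function, because the old value is released at the point of assignment.
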
| <!-- Old heading. Do not remove or links may break. --> |
| |
| <a id="ways-variables-and-data-interact-clone"></a> |
| |
| #### Variables and Data Interacting with Clone |
| |
| If we _do_ want to deeply copy the heap data of the `String`, not just the |
| stack data, we can use a common method called `clone`. We’ll discuss method |
| syntax in Chapter 5, but because methods are a common feature in many |
| programming languages, you’ve probably seen them before. |
| |
| Here’s an example of the `clone` method in action: |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-05-clone/src/main.rs:here}} |
| ``` |
| |
| This works just fine and explicitly produces the behavior shown in Figure 4-3, |
| where the heap data _does_ get copied. |
| |
| When you see a call to `clone`, you know that some arbitrary code is being |
| executed and that code may be expensive. It’s a visual indicator that something |
| different is going on. |
| |
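To see that a clone really is independent of the original, the sketch below changes the clone and then checks both values. The helper name `clone_then_modify` is made up for this example.

```rust
// Hypothetical helper: clones a `String`, modifies the clone, and returns
// both values so they can be compared.
fn clone_then_modify() -> (String, String) {
    let s1 = String::from("hello");
    let mut s2 = s1.clone();

    // Each `String` owns its own heap buffer, so the addresses differ.
    assert_ne!(s1.as_ptr(), s2.as_ptr());

    // Changing the clone leaves the original untouched.
    s2.push_str(", world!");
    (s1, s2)
}

fn main() {
    let (original, copy) = clone_then_modify();
    println!("original = {}, copy = {}", original, copy);
}
```
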
| #### Stack-Only Data: Copy |
| |
| There’s another wrinkle we haven’t talked about yet. This code using |
| integers—part of which was shown in Listing 4-2—works and is valid: |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/no-listing-06-copy/src/main.rs:here}} |
| ``` |
| |
| But this code seems to contradict what we just learned: we don’t have a call to |
| `clone`, but `x` is still valid and wasn’t moved into `y`. |
| |
| The reason is that types such as integers that have a known size at compile |
| time are stored entirely on the stack, so copies of the actual values are quick |
| to make. That means there’s no reason we would want to prevent `x` from being |
| valid after we create the variable `y`. In other words, there’s no difference |
| between deep and shallow copying here, so calling `clone` wouldn’t do anything |
| different from the usual shallow copying, and we can leave it out. |
| |
| Rust has a special annotation called the `Copy` trait that we can place on |
| types that are stored on the stack, as integers are (we’ll talk more about |
| traits in [Chapter 10][traits]<!-- ignore -->). If a type implements the `Copy` |
| trait, variables that use it do not move, but rather are trivially copied, |
| making them still valid after assignment to another variable. |
| |
| Rust won’t let us annotate a type with `Copy` if the type, or any of its parts, |
| has implemented the `Drop` trait. If the type needs something special to happen |
| when the value goes out of scope and we add the `Copy` annotation to that type, |
| we’ll get a compile-time error. To learn about how to add the `Copy` annotation |
| to your type to implement the trait, see [“Derivable |
| Traits”][derivable-traits]<!-- ignore --> in Appendix C. |
| |
| So, what types implement the `Copy` trait? You can check the documentation for |
| the given type to be sure, but as a general rule, any group of simple scalar |
| values can implement `Copy`, and nothing that requires allocation or is some |
| form of resource can implement `Copy`. Here are some of the types that |
| implement `Copy`: |
| |
| - All the integer types, such as `u32`. |
| - The Boolean type, `bool`, with values `true` and `false`. |
| - All the floating-point types, such as `f64`. |
| - The character type, `char`. |
| - Tuples, if they only contain types that also implement `Copy`. For example, |
| `(i32, i32)` implements `Copy`, but `(i32, String)` does not. |
| |
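The rules above can be sketched with a small user-defined type. `Point` and `copy_point` are hypothetical names for illustration, and the `derive` attribute is the annotation mechanism covered in Appendix C; `Clone` has to be derived alongside `Copy`.

```rust
// Hypothetical type for illustration: every field is `Copy`, so the struct
// can derive `Copy` as well.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Hypothetical helper: returns both the original and the copy to show that
// the original is still usable after assignment.
fn copy_point() -> (Point, Point) {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // copied, not moved: `p1` is still valid below
    (p1, p2)
}

fn main() {
    let (first, second) = copy_point();
    println!("first = {:?}, second = {:?}", first, second);
    println!("x = {}, y = {}", first.x, second.y);

    let pair = (5, String::from("hello"));
    let moved = pair; // `(i32, String)` is not `Copy`, so this is a move
    println!("moved = {:?}", moved);
    // println!("{:?}", pair); // would not compile: `pair` was moved
}
```
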
| ### Ownership and Functions |
| |
| The mechanics of passing a value to a function are similar to those when |
| assigning a value to a variable. Passing a variable to a function will move or |
| copy, just as assignment does. Listing 4-3 has an example with some annotations |
| showing where variables go into and out of scope. |
| |
| <Listing number="4-3" file-name="src/main.rs" caption="Functions with ownership and scope annotated"> |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-03/src/main.rs}} |
| ``` |
| |
| </Listing> |
| |
| If we tried to use `s` after the call to `takes_ownership`, Rust would report a |
| compile-time error. These static checks protect us from mistakes. Try adding |
| code to `main` that uses `s` and `x` to see where you can use them and where |
| the ownership rules prevent you from doing so. |
| |
| ### Return Values and Scope |
| |
| Returning values can also transfer ownership. Listing 4-4 shows an example of a |
| function that returns some value, with similar annotations as those in Listing |
| 4-3. |
| |
| <Listing number="4-4" file-name="src/main.rs" caption="Transferring ownership of return values"> |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-04/src/main.rs}} |
| ``` |
| |
| </Listing> |
| |
| The ownership of a variable follows the same pattern every time: assigning a |
| value to another variable moves it. When a variable that includes data on the |
| heap goes out of scope, the value will be cleaned up by `drop` unless ownership |
| of the data has been moved to another variable. |
| |
| While this works, taking ownership and then returning ownership with every |
| function is a bit tedious. What if we want to let a function use a value but |
| not take ownership? It’s quite annoying that anything we pass in also needs to |
| be passed back if we want to use it again, in addition to any data resulting |
| from the body of the function that we might want to return as well. |
| |
| Rust does let us return multiple values using a tuple, as shown in Listing 4-5. |
| |
| <Listing number="4-5" file-name="src/main.rs" caption="Returning ownership of parameters"> |
| |
| ```rust |
| {{#rustdoc_include ../listings/ch04-understanding-ownership/listing-04-05/src/main.rs}} |
| ``` |
| |
| </Listing> |
| |
| But this is too much ceremony and a lot of work for a concept that should be |
| common. Luckily for us, Rust has a feature for using a value without |
| transferring ownership, called _references_. |
| |
| [data-types]: ch03-02-data-types.html#data-types |
| [ch8]: ch08-02-strings.html |
| [traits]: ch10-02-traits.html |
| [derivable-traits]: appendix-03-derivable-traits.html |
| [method-syntax]: ch05-03-method-syntax.html#method-syntax |
| [paths-module-tree]: ch07-03-paths-for-referring-to-an-item-in-the-module-tree.html |
| [drop]: ../std/ops/trait.Drop.html#tymethod.drop |