blob: 8a13a252e0ef1e5b65c50be39afe6fd7bf977483 [file] [log] [blame] [view] [edit]
## Generic Data Types
We use generics to create definitions for items like function signatures or
structs, which we can then use with many different concrete data types. Lets
first look at how to define functions, structs, enums, and methods using
generics. Then, well discuss how generics affect code performance.
### In Function Definitions
When defining a function that uses generics, we place the generics in the
signature of the function where we would usually specify the data types of the
parameters and return value. Doing so makes our code more flexible and provides
more functionality to callers of our function while preventing code duplication.
Continuing with our `largest` function, Listing 10-4 shows two functions that
both find the largest value in a slice. Well then combine these into a single
function that uses generics.
<Listing number="10-4" file-name="src/main.rs" caption="Two functions that differ only in their names and in the types in their signatures">
```rust
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-04/src/main.rs:here}}
```
</Listing>
The `largest_i32` function is the one we extracted in Listing 10-3 that finds
the largest `i32` in a slice. The `largest_char` function finds the largest
`char` in a slice. The function bodies have the same code, so lets eliminate
the duplication by introducing a generic type parameter in a single function.
To parameterize the types in a new single function, we need to name the type
parameter, just as we do for the value parameters to a function. You can use
any identifier as a type parameter name. But well use `T` because, by
convention, type parameter names in Rust are short, often just one letter, and
Rusts type-naming convention is UpperCamelCase. Short for _type_, `T` is the
default choice of most Rust programmers.
When we use a parameter in the body of the function, we have to declare the
parameter name in the signature so that the compiler knows what that name
means. Similarly, when we use a type parameter name in a function signature, we
have to declare the type parameter name before we use it. To define the generic
`largest` function, we place type name declarations inside angle brackets,
`<>`, between the name of the function and the parameter list, like this:
```rust,ignore
fn largest<T>(list: &[T]) -> &T {
```
We read this definition as The function `largest` is generic over some type
`T`.” This function has one parameter named `list`, which is a slice of values
of type `T`. The `largest` function will return a reference to a value of the
same type `T`.
Listing 10-5 shows the combined `largest` function definition using the generic
data type in its signature. The listing also shows how we can call the function
with either a slice of `i32` values or `char` values. Note that this code wont
compile yet.
<Listing number="10-5" file-name="src/main.rs" caption="The `largest` function using generic type parameters; this doesn’t compile yet">
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-05/src/main.rs}}
```
</Listing>
If we compile this code right now, well get this error:
```console
{{#include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-05/output.txt}}
```
The help text mentions `std::cmp::PartialOrd`, which is a trait, and were
going to talk about traits in the next section. For now, know that this error
states that the body of `largest` wont work for all possible types that `T`
could be. Because we want to compare values of type `T` in the body, we can
only use types whose values can be ordered. To enable comparisons, the standard
library has the `std::cmp::PartialOrd` trait that you can implement on types
(see Appendix C for more on this trait). To fix Listing 10-5, we can follow the
help texts suggestion and restrict the types valid for `T` to only those that
implement `PartialOrd`. The listing will then compile, because the standard
library implements `PartialOrd` on both `i32` and `char`.
### In Struct Definitions
We can also define structs to use a generic type parameter in one or more
fields using the `<>` syntax. Listing 10-6 defines a `Point<T>` struct to hold
`x` and `y` coordinate values of any type.
<Listing number="10-6" file-name="src/main.rs" caption="A `Point<T>` struct that holds `x` and `y` values of type `T`">
```rust
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-06/src/main.rs}}
```
</Listing>
The syntax for using generics in struct definitions is similar to that used in
function definitions. First, we declare the name of the type parameter inside
angle brackets just after the name of the struct. Then, we use the generic type
in the struct definition where we would otherwise specify concrete data types.
Note that because weve used only one generic type to define `Point<T>`, this
definition says that the `Point<T>` struct is generic over some type `T`, and
the fields `x` and `y` are _both_ that same type, whatever that type may be. If
we create an instance of a `Point<T>` that has values of different types, as in
Listing 10-7, our code wont compile.
<Listing number="10-7" file-name="src/main.rs" caption="The fields `x` and `y` must be the same type because both have the same generic data type `T`.">
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-07/src/main.rs}}
```
</Listing>
In this example, when we assign the integer value `5` to `x`, we let the
compiler know that the generic type `T` will be an integer for this instance of
`Point<T>`. Then, when we specify `4.0` for `y`, which weve defined to have
the same type as `x`, well get a type mismatch error like this:
```console
{{#include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-07/output.txt}}
```
To define a `Point` struct where `x` and `y` are both generics but could have
different types, we can use multiple generic type parameters. For example, in
Listing 10-8, we change the definition of `Point` to be generic over types `T`
and `U` where `x` is of type `T` and `y` is of type `U`.
<Listing number="10-8" file-name="src/main.rs" caption="A `Point<T, U>` generic over two types so that `x` and `y` can be values of different types">
```rust
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-08/src/main.rs}}
```
</Listing>
Now all the instances of `Point` shown are allowed! You can use as many generic
type parameters in a definition as you want, but using more than a few makes
your code hard to read. If youre finding you need lots of generic types in
your code, it could indicate that your code needs restructuring into smaller
pieces.
### In Enum Definitions
As we did with structs, we can define enums to hold generic data types in their
variants. Lets take another look at the `Option<T>` enum that the standard
library provides, which we used in Chapter 6:
```rust
enum Option<T> {
Some(T),
None,
}
```
This definition should now make more sense to you. As you can see, the
`Option<T>` enum is generic over type `T` and has two variants: `Some`, which
holds one value of type `T`, and a `None` variant that doesnt hold any value.
By using the `Option<T>` enum, we can express the abstract concept of an
optional value, and because `Option<T>` is generic, we can use this abstraction
no matter what the type of the optional value is.
Enums can use multiple generic types as well. The definition of the `Result`
enum that we used in Chapter 9 is one example:
```rust
enum Result<T, E> {
Ok(T),
Err(E),
}
```
The `Result` enum is generic over two types, `T` and `E`, and has two variants:
`Ok`, which holds a value of type `T`, and `Err`, which holds a value of type
`E`. This definition makes it convenient to use the `Result` enum anywhere we
have an operation that might succeed (return a value of some type `T`) or fail
(return an error of some type `E`). In fact, this is what we used to open a
file in Listing 9-3, where `T` was filled in with the type `std::fs::File` when
the file was opened successfully and `E` was filled in with the type
`std::io::Error` when there were problems opening the file.
When you recognize situations in your code with multiple struct or enum
definitions that differ only in the types of the values they hold, you can
avoid duplication by using generic types instead.
### In Method Definitions
We can implement methods on structs and enums (as we did in Chapter 5) and use
generic types in their definitions too. Listing 10-9 shows the `Point<T>`
struct we defined in Listing 10-6 with a method named `x` implemented on it.
<Listing number="10-9" file-name="src/main.rs" caption="Implementing a method named `x` on the `Point<T>` struct that will return a reference to the `x` field of type `T`">
```rust
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-09/src/main.rs}}
```
</Listing>
Here, weve defined a method named `x` on `Point<T>` that returns a reference
to the data in the field `x`.
Note that we have to declare `T` just after `impl` so that we can use `T` to
specify that were implementing methods on the type `Point<T>`. By declaring
`T` as a generic type after `impl`, Rust can identify that the type in the
angle brackets in `Point` is a generic type rather than a concrete type. We
could have chosen a different name for this generic parameter than the generic
parameter declared in the struct definition, but using the same name is
conventional. If you write a method within an `impl` that declares a generic
type, that method will be defined on any instance of the type, no matter what
concrete type ends up substituting for the generic type.
We can also specify constraints on generic types when defining methods on the
type. We could, for example, implement methods only on `Point<f32>` instances
rather than on `Point<T>` instances with any generic type. In Listing 10-10, we
use the concrete type `f32`, meaning we dont declare any types after `impl`.
<Listing number="10-10" file-name="src/main.rs" caption="An `impl` block that only applies to a struct with a particular concrete type for the generic type parameter `T`">
```rust
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-10/src/main.rs:here}}
```
</Listing>
This code means the type `Point<f32>` will have a `distance_from_origin`
method; other instances of `Point<T>` where `T` is not of type `f32` will not
have this method defined. The method measures how far our point is from the
point at coordinates (0.0, 0.0) and uses mathematical operations that are
available only for floating-point types.
Generic type parameters in a struct definition arent always the same as those
you use in that same structs method signatures. Listing 10-11 uses the generic
types `X1` and `Y1` for the `Point` struct and `X2` and `Y2` for the `mixup`
method signature to make the example clearer. The method creates a new `Point`
instance with the `x` value from the `self` `Point` (of type `X1`) and the `y`
value from the passed-in `Point` (of type `Y2`).
<Listing number="10-11" file-name="src/main.rs" caption="A method that uses generic types that are different from its struct’s definition">
```rust
{{#rustdoc_include ../listings/ch10-generic-types-traits-and-lifetimes/listing-10-11/src/main.rs}}
```
</Listing>
In `main`, weve defined a `Point` that has an `i32` for `x` (with value `5`)
and an `f64` for `y` (with value `10.4`). The `p2` variable is a `Point` struct
that has a string slice for `x` (with value `"Hello"`) and a `char` for `y`
(with value `c`). Calling `mixup` on `p1` with the argument `p2` gives us `p3`,
which will have an `i32` for `x` because `x` came from `p1`. The `p3` variable
will have a `char` for `y` because `y` came from `p2`. The `println!` macro
call will print `p3.x = 5, p3.y = c`.
The purpose of this example is to demonstrate a situation in which some generic
parameters are declared with `impl` and some are declared with the method
definition. Here, the generic parameters `X1` and `Y1` are declared after
`impl` because they go with the struct definition. The generic parameters `X2`
and `Y2` are declared after `fn mixup` because theyre only relevant to the
method.
### Performance of Code Using Generics
You might be wondering whether there is a runtime cost when using generic type
parameters. The good news is that using generic types wont make your program
run any slower than it would with concrete types.
Rust accomplishes this by performing monomorphization of the code using
generics at compile time. _Monomorphization_ is the process of turning generic
code into specific code by filling in the concrete types that are used when
compiled. In this process, the compiler does the opposite of the steps we used
to create the generic function in Listing 10-5: The compiler looks at all the
places where generic code is called and generates code for the concrete types
the generic code is called with.
Lets look at how this works by using the standard librarys generic
`Option<T>` enum:
```rust
let integer = Some(5);
let float = Some(5.0);
```
When Rust compiles this code, it performs monomorphization. During that
process, the compiler reads the values that have been used in `Option<T>`
instances and identifies two kinds of `Option<T>`: One is `i32` and the other
is `f64`. As such, it expands the generic definition of `Option<T>` into two
definitions specialized to `i32` and `f64`, thereby replacing the generic
definition with the specific ones.
The monomorphized version of the code looks similar to the following (the
compiler uses different names than what were using here for illustration):
<Listing file-name="src/main.rs">
```rust
enum Option_i32 {
Some(i32),
None,
}
enum Option_f64 {
Some(f64),
None,
}
fn main() {
let integer = Option_i32::Some(5);
let float = Option_f64::Some(5.0);
}
```
</Listing>
The generic `Option<T>` is replaced with the specific definitions created by
the compiler. Because Rust compiles generic code into code that specifies the
type in each instance, we pay no runtime cost for using generics. When the code
runs, it performs just as it would if we had duplicated each definition by
hand. The process of monomorphization makes Rusts generics extremely efficient
at runtime.