blob: 4d56bec8059b1035077e152959b95a40ba44ce20 [file] [log] [blame] [view] [edit]
## Defining an Enum
Where structs give you a way of grouping together related fields and data, like
a `Rectangle` with its `width` and `height`, enums give you a way of saying a
value is one of a possible set of values. For example, we may want to say that
`Rectangle` is one of a set of possible shapes that also includes `Circle` and
`Triangle`. To do this, Rust allows us to encode these possibilities as an enum.
Lets look at a situation we might want to express in code and see why enums are
useful and more appropriate than structs in this case. Say we need to work with
IP addresses. Currently, two major standards are used for IP addresses: version
four and version six. Because these are the only possibilities for an IP address
that our program will come across, we can _enumerate_ all possible variants,
which is where enumeration gets its name.
Any IP address can be either a version four or a version six address, but not
both at the same time. That property of IP addresses makes the enum data
structure appropriate because an enum value can only be one of its variants.
Both version four and version six addresses are still fundamentally IP
addresses, so they should be treated as the same type when the code is handling
situations that apply to any kind of IP address.
We can express this concept in code by defining an `IpAddrKind` enumeration and
listing the possible kinds an IP address can be, `V4` and `V6`. These are the
variants of the enum:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-01-defining-enums/src/main.rs:def}}
```
`IpAddrKind` is now a custom data type that we can use elsewhere in our code.
### Enum Values
We can create instances of each of the two variants of `IpAddrKind` like this:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-01-defining-enums/src/main.rs:instance}}
```
Note that the variants of the enum are namespaced under its identifier, and we
use a double colon to separate the two. This is useful because now both values
`IpAddrKind::V4` and `IpAddrKind::V6` are of the same type: `IpAddrKind`. We can
then, for instance, define a function that takes any `IpAddrKind`:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-01-defining-enums/src/main.rs:fn}}
```
And we can call this function with either variant:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-01-defining-enums/src/main.rs:fn_call}}
```
Using enums has even more advantages. Thinking more about our IP address type,
at the moment we dont have a way to store the actual IP address _data_; we only
know what _kind_ it is. Given that you just learned about structs in Chapter 5,
you might be tempted to tackle this problem with structs as shown in Listing
6-1.
<Listing number="6-1" caption="Storing the data and `IpAddrKind` variant of an IP address using a `struct`">
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/listing-06-01/src/main.rs:here}}
```
</Listing>
Here, weve defined a struct `IpAddr` that has two fields: a `kind` field that
is of type `IpAddrKind` (the enum we defined previously) and an `address` field
of type `String`. We have two instances of this struct. The first is `home`, and
it has the value `IpAddrKind::V4` as its `kind` with associated address data of
`127.0.0.1`. The second instance is `loopback`. It has the other variant of
`IpAddrKind` as its `kind` value, `V6`, and has address `::1` associated with
it. Weve used a struct to bundle the `kind` and `address` values together, so
now the variant is associated with the value.
However, representing the same concept using just an enum is more concise:
rather than an enum inside a struct, we can put data directly into each enum
variant. This new definition of the `IpAddr` enum says that both `V4` and `V6`
variants will have associated `String` values:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-02-enum-with-data/src/main.rs:here}}
```
We attach data to each variant of the enum directly, so there is no need for an
extra struct. Here, its also easier to see another detail of how enums work:
the name of each enum variant that we define also becomes a function that
constructs an instance of the enum. That is, `IpAddr::V4()` is a function call
that takes a `String` argument and returns an instance of the `IpAddr` type. We
automatically get this constructor function defined as a result of defining the
enum.
Theres another advantage to using an enum rather than a struct: each variant
can have different types and amounts of associated data. Version four IP
addresses will always have four numeric components that will have values between
0 and 255. If we wanted to store `V4` addresses as four `u8` values but still
express `V6` addresses as one `String` value, we wouldnt be able to with a
struct. Enums handle this case with ease:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-03-variants-with-different-data/src/main.rs:here}}
```
Weve shown several different ways to define data structures to store version
four and version six IP addresses. However, as it turns out, wanting to store IP
addresses and encode which kind they are is so common that
[the standard library has a definition we can use!][IpAddr]<!-- ignore --> Lets
look at how the standard library defines `IpAddr`: it has the exact enum and
variants that weve defined and used, but it embeds the address data inside the
variants in the form of two different structs, which are defined differently for
each variant:
```rust
struct Ipv4Addr {
// --snip--
}
struct Ipv6Addr {
// --snip--
}
enum IpAddr {
V4(Ipv4Addr),
V6(Ipv6Addr),
}
```
This code illustrates that you can put any kind of data inside an enum variant:
strings, numeric types, or structs, for example. You can even include another
enum! Also, standard library types are often not much more complicated than what
you might come up with.
Note that even though the standard library contains a definition for `IpAddr`,
we can still create and use our own definition without conflict because we
havent brought the standard librarys definition into our scope. Well talk
more about bringing types into scope in Chapter 7.
Lets look at another example of an enum in Listing 6-2: this one has a wide
variety of types embedded in its variants.
<Listing number="6-2" caption="A `Message` enum whose variants each store different amounts and types of values">
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/listing-06-02/src/main.rs:here}}
```
</Listing>
This enum has four variants with different types:
* `Quit` has no data associated with it at all.
* `Move` has named fields, like a struct does.
* `Write` includes a single `String`.
* `ChangeColor` includes three `i32` values.
Defining an enum with variants such as the ones in Listing 6-2 is similar to
defining different kinds of struct definitions, except the enum doesnt use the
`struct` keyword and all the variants are grouped together under the `Message`
type. The following structs could hold the same data that the preceding enum
variants hold:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-04-structs-similar-to-message-enum/src/main.rs:here}}
```
But if we used the different structs, each of which has its own type, we
couldnt as easily define a function to take any of these kinds of messages as
we could with the `Message` enum defined in Listing 6-2, which is a single type.
There is one more similarity between enums and structs: just as were able to
define methods on structs using `impl`, were also able to define methods on
enums. Heres a method named `call` that we could define on our `Message` enum:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-05-methods-on-enums/src/main.rs:here}}
```
The body of the method would use `self` to get the value that we called the
method on. In this example, weve created a variable `m` that has the value
`Message::Write(String::from("hello"))`, and that is what `self` will be in the
body of the `call` method when `m.call()` runs.
Lets look at another enum in the standard library that is very common and
useful: `Option`.
### The `Option` Enum and Its Advantages Over Null Values
This section explores a case study of `Option`, which is another enum defined by
the standard library. The `Option` type encodes the very common scenario in
which a value could be something or it could be nothing.
For example, if you request the first item in a non-empty list, you would get a
value. If you request the first item in an empty list, you would get nothing.
Expressing this concept in terms of the type system means the compiler can check
whether youve handled all the cases you should be handling; this functionality
can prevent bugs that are extremely common in other programming languages.
Programming language design is often thought of in terms of which features you
include, but the features you exclude are important too. Rust doesnt have the
null feature that many other languages have. _Null_ is a value that means there
is no value there. In languages with null, variables can always be in one of two
states: null or not-null.
In his 2009 presentation Null References: The Billion Dollar Mistake,” Tony
Hoare, the inventor of null, has this to say:
> I call it my billion-dollar mistake. At that time, I was designing the first
> comprehensive type system for references in an object-oriented language. My
> goal was to ensure that all use of references should be absolutely safe, with
> checking performed automatically by the compiler. But I couldnt resist the
> temptation to put in a null reference, simply because it was so easy to
> implement. This has led to innumerable errors, vulnerabilities, and system
> crashes, which have probably caused a billion dollars of pain and damage in
> the last forty years.
The problem with null values is that if you try to use a null value as a
not-null value, youll get an error of some kind. Because this null or not-null
property is pervasive, its extremely easy to make this kind of error.
However, the concept that null is trying to express is still a useful one: a
null is a value that is currently invalid or absent for some reason.
The problem isnt really with the concept but with the particular
implementation. As such, Rust does not have nulls, but it does have an enum that
can encode the concept of a value being present or absent. This enum is
`Option<T>`, and it is [defined by the standard library][option]<!-- ignore -->
as follows:
```rust
enum Option<T> {
None,
Some(T),
}
```
The `Option<T>` enum is so useful that its even included in the prelude; you
dont need to bring it into scope explicitly. Its variants are also included in
the prelude: you can use `Some` and `None` directly without the `Option::`
prefix. The `Option<T>` enum is still just a regular enum, and `Some(T)` and
`None` are still variants of type `Option<T>`.
The `<T>` syntax is a feature of Rust we havent talked about yet. Its a
generic type parameter, and well cover generics in more detail in Chapter 10.
For now, all you need to know is that `<T>` means that the `Some` variant of the
`Option` enum can hold one piece of data of any type, and that each concrete
type that gets used in place of `T` makes the overall `Option<T>` type a
different type. Here are some examples of using `Option` values to hold number
types and string types:
```rust
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-06-option-examples/src/main.rs:here}}
```
The type of `some_number` is `Option<i32>`. The type of `some_char` is
`Option<char>`, which is a different type. Rust can infer these types because
weve specified a value inside the `Some` variant. For `absent_number`, Rust
requires us to annotate the overall `Option` type: the compiler cant infer the
type that the corresponding `Some` variant will hold by looking only at a `None`
value. Here, we tell Rust that we mean for `absent_number` to be of type
`Option<i32>`.
When we have a `Some` value, we know that a value is present and the value is
held within the `Some`. When we have a `None` value, in some sense it means the
same thing as null: we dont have a valid value. So why is having `Option<T>`
any better than having null?
In short, because `Option<T>` and `T` (where `T` can be any type) are different
types, the compiler wont let us use an `Option<T>` value as if it were
definitely a valid value. For example, this code wont compile, because its
trying to add an `i8` to an `Option<i8>`:
```rust,ignore,does_not_compile
{{#rustdoc_include ../listings/ch06-enums-and-pattern-matching/no-listing-07-cant-use-option-directly/src/main.rs:here}}
```
If we run this code, we get an error message like this one:
```console
{{#include ../listings/ch06-enums-and-pattern-matching/no-listing-07-cant-use-option-directly/output.txt}}
```
Intense! In effect, this error message means that Rust doesnt understand how to
add an `i8` and an `Option<i8>`, because theyre different types. When we have a
value of a type like `i8` in Rust, the compiler will ensure that we always have
a valid value. We can proceed confidently without having to check for null
before using that value. Only when we have an `Option<i8>` (or whatever type of
value were working with) do we have to worry about possibly not having a value,
and the compiler will make sure we handle that case before using the value.
In other words, you have to convert an `Option<T>` to a `T` before you can
perform `T` operations with it. Generally, this helps catch one of the most
common issues with null: assuming that something isnt null when it actually is.
Eliminating the risk of incorrectly assuming a not-null value helps you to be
more confident in your code. In order to have a value that can possibly be null,
you must explicitly opt in by making the type of that value `Option<T>`. Then,
when you use that value, you are required to explicitly handle the case when the
value is null. Everywhere that a value has a type that isnt an `Option<T>`, you
_can_ safely assume that the value isnt null. This was a deliberate design
decision for Rust to limit nulls pervasiveness and increase the safety of Rust
code.
So how do you get the `T` value out of a `Some` variant when you have a value of
type `Option<T>` so that you can use that value? The `Option<T>` enum has a
large number of methods that are useful in a variety of situations; you can
check them out in [its documentation][docs]<!-- ignore -->. Becoming familiar
with the methods on `Option<T>` will be extremely useful in your journey with
Rust.
In general, in order to use an `Option<T>` value, you want to have code that
will handle each variant. You want some code that will run only when you have a
`Some(T)` value, and this code is allowed to use the inner `T`. You want some
other code to run only if you have a `None` value, and that code doesnt have a
`T` value available. The `match` expression is a control flow construct that
does just this when used with enums: it will run different code depending on
which variant of the enum it has, and that code can use the data inside the
matching value.
[IpAddr]: ../std/net/enum.IpAddr.html
[option]: ../std/option/enum.Option.html
[docs]: ../std/option/enum.Option.html