| ## Refactoring to Improve Modularity and Error Handling |
| |
| To improve our program, we’ll fix four problems that have to do with the |
| program’s structure and how it’s handling potential errors. First, our `main` |
| function now performs two tasks: it parses arguments and reads files. As our |
| program grows, the number of separate tasks the `main` function handles will |
| increase. As a function gains responsibilities, it becomes more difficult to |
| reason about, harder to test, and harder to change without breaking one of its |
| parts. It’s best to separate functionality so each function is responsible for |
| one task. |
| |
| This issue also ties into the second problem: although `query` and `file_path` |
| are configuration variables to our program, variables like `contents` are used |
| to perform the program’s logic. The longer `main` becomes, the more variables |
| we’ll need to bring into scope; the more variables we have in scope, the harder |
| it will be to keep track of the purpose of each. It’s best to group the |
| configuration variables into one structure to make their purpose clear. |
| |
| The third problem is that we’ve used `expect` to print an error message when |
| reading the file fails, but the error message just says `Should have been |
| able to read the file`. Reading a file can fail in a number of ways: for |
| example, the file could be missing, or we might not have permission to open it. |
| Right now, regardless of the situation, we’d print the same error message for |
| everything, which wouldn’t give the user any information! |
| |
| Fourth, we use `expect` to handle an error, and if the user runs our program |
| without specifying enough arguments, they’ll get an `index out of bounds` error |
| from Rust that doesn’t clearly explain the problem. It would be best if all the |
| error-handling code were in one place so future maintainers had only one place |
| to consult the code if the error-handling logic needed to change. Having all the |
| error-handling code in one place will also ensure that we’re printing messages |
| that will be meaningful to our end users. |
| |
| Let’s address these four problems by refactoring our project. |
| |
| ### Separation of Concerns for Binary Projects |
| |
| The organizational problem of allocating responsibility for multiple tasks to |
| the `main` function is common to many binary projects. As a result, the Rust |
| community has developed guidelines for splitting the separate concerns of a |
| binary program when `main` starts getting large. This process has the following |
| steps: |
| |
| - Split your program into a _main.rs_ file and a _lib.rs_ file and move your |
| program’s logic to _lib.rs_. |
| - As long as your command line parsing logic is small, it can remain in |
| _main.rs_. |
| - When the command line parsing logic starts getting complicated, extract it |
| from _main.rs_ and move it to _lib.rs_. |
| |
| The responsibilities that remain in the `main` function after this process |
| should be limited to the following: |
| |
| - Calling the command line parsing logic with the argument values |
| - Setting up any other configuration |
| - Calling a `run` function in _lib.rs_ |
| - Handling the error if `run` returns an error |
| |
| This pattern is about separating concerns: _main.rs_ handles running the |
| program and _lib.rs_ handles all the logic of the task at hand. Because you |
| can’t test the `main` function directly, this structure lets you test all of |
| your program’s logic by moving it into functions in _lib.rs_. The code that |
| remains in _main.rs_ will be small enough to verify its correctness by reading |
| it. Let’s rework our program by following this process. |
| |
| #### Extracting the Argument Parser |
| |
| We’ll extract the functionality for parsing arguments into a function that |
| `main` will call to prepare for moving the command line parsing logic to |
| _src/lib.rs_. Listing 12-5 shows the new start of `main` that calls a new |
| function `parse_config`, which we’ll define in _src/main.rs_ for the moment. |
| |
| <Listing number="12-5" file-name="src/main.rs" caption="Extracting a `parse_config` function from `main`"> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-05/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We’re still collecting the command line arguments into a vector, but instead of |
| assigning the argument value at index 1 to the variable `query` and the |
| argument value at index 2 to the variable `file_path` within the `main` |
| function, we pass the whole vector to the `parse_config` function. The |
| `parse_config` function then holds the logic that determines which argument |
| goes in which variable and passes the values back to `main`. We still create |
| the `query` and `file_path` variables in `main`, but `main` no longer has the |
| responsibility of determining how the command line arguments and variables |
| correspond. |
| |
| This rework may seem like overkill for our small program, but we’re refactoring |
| in small, incremental steps. After making this change, run the program again to |
| verify that the argument parsing still works. It’s good to check your progress |
| often, to help identify the cause of problems when they occur. |
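
As a rough sketch of the step just described, the extracted function might look like the following. The hard-coded argument vector (with the made-up values `needle` and `poem.txt`) is an assumption that stands in for `env::args().collect()` so the example runs without command line arguments.

```rust
// Sketch of the extracted parser: it borrows the argument vector and
// hands the two values back to the caller as string slices.
fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}

fn main() {
    // Assumed stand-in for `env::args().collect()`.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];

    // `main` still creates both variables, but no longer decides
    // which index belongs to which variable.
    let (query, file_path) = parse_config(&args);

    println!("Searching for {}", query);
    println!("In file {}", file_path);
}
```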
| |
| #### Grouping Configuration Values |
| |
| We can take another small step to improve the `parse_config` function further. |
| At the moment, we’re returning a tuple, but then we immediately break that |
| tuple into individual parts again. This is a sign that perhaps we don’t have |
| the right abstraction yet. |
| |
| Another indicator that shows there’s room for improvement is the `config` part |
| of `parse_config`, which implies that the two values we return are related and |
| are both part of one configuration value. We’re not currently conveying this |
| meaning in the structure of the data other than by grouping the two values into |
| a tuple; we’ll instead put the two values into one struct and give each of the |
| struct fields a meaningful name. Doing so will make it easier for future |
| maintainers of this code to understand how the different values relate to each |
| other and what their purpose is. |
| |
| Listing 12-6 shows the improvements to the `parse_config` function. |
| |
| <Listing number="12-6" file-name="src/main.rs" caption="Refactoring `parse_config` to return an instance of a `Config` struct"> |
| |
| ```rust,should_panic,noplayground |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-06/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We’ve added a struct named `Config` defined to have fields named `query` and |
| `file_path`. The signature of `parse_config` now indicates that it returns a |
| `Config` value. In the body of `parse_config`, where we used to return |
| string slices that reference `String` values in `args`, we now define `Config` |
| to contain owned `String` values. The `args` variable in `main` is the owner of |
| the argument values and is only letting the `parse_config` function borrow |
| them, which means we’d violate Rust’s borrowing rules if `Config` tried to take |
| ownership of the values in `args`. |
| |
| There are a number of ways we could manage the `String` data; the easiest, |
| though somewhat inefficient, route is to call the `clone` method on the values. |
| This will make a full copy of the data for the `Config` instance to own, which |
| takes more time and memory than storing a reference to the string data. |
| However, cloning the data also makes our code very straightforward because we |
| don’t have to manage the lifetimes of the references; in this circumstance, |
| giving up a little performance to gain simplicity is a worthwhile trade-off. |
| |
| > ### The Trade-Offs of Using `clone` |
| > |
| > There’s a tendency among many Rustaceans to avoid using `clone` to fix |
| > ownership problems because of its runtime cost. In |
| > [Chapter 13][ch13]<!-- ignore -->, you’ll learn how to use more efficient |
| > methods in this type of situation. But for now, it’s okay to copy a few |
| > strings to continue making progress because you’ll make these copies only |
| > once and your file path and query string are very small. It’s better to have |
| > a working program that’s a bit inefficient than to try to hyperoptimize code |
| > on your first pass. As you become more experienced with Rust, it’ll be |
| > easier to start with the most efficient solution, but for now, it’s |
| > perfectly acceptable to call `clone`. |
| |
| We’ve updated `main` so it places the instance of `Config` returned by |
| `parse_config` into a variable named `config`, and we updated the code that |
| previously used the separate `query` and `file_path` variables so it now uses |
| the fields on the `Config` struct instead. |
| |
| Now our code more clearly conveys that `query` and `file_path` are related and |
| that their purpose is to configure how the program will work. Any code that |
| uses these values knows to find them in the `config` instance in the fields |
| named for their purpose. |
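
The grouping described in this section can be sketched roughly as follows. As before, the hard-coded argument values are assumptions that stand in for real command line input.

```rust
// The two related values now live in one struct with named fields.
struct Config {
    query: String,
    file_path: String,
}

fn parse_config(args: &[String]) -> Config {
    // `clone` copies the string data so that `Config` owns its values
    // while `args` keeps ownership of the originals.
    let query = args[1].clone();
    let file_path = args[2].clone();

    Config { query, file_path }
}

fn main() {
    // Assumed stand-in for `env::args().collect()`.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];

    let config = parse_config(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);
    // `args` is still usable here because nothing was moved out of it.
    println!("Received {} arguments", args.len());
}
```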
| |
| #### Creating a Constructor for `Config` |
| |
| So far, we’ve extracted the logic responsible for parsing the command line |
| arguments from `main` and placed it in the `parse_config` function. Doing so |
| helped us see that the `query` and `file_path` values were related, and that |
| relationship should be conveyed in our code. We then added a `Config` struct to |
| name the related purpose of `query` and `file_path` and to return the values |
| from the `parse_config` function under meaningful struct field names. |
| |
| So now that the purpose of the `parse_config` function is to create a `Config` |
| instance, we can change `parse_config` from a plain function to a function |
| named `new` that is associated with the `Config` struct. Making this change |
| will make the code more idiomatic. We can create instances of types in the |
| standard library, such as `String`, by calling `String::new`. Similarly, by |
| changing `parse_config` into a `new` function associated with `Config`, we’ll |
| be able to create instances of `Config` by calling `Config::new`. Listing 12-7 |
| shows the changes we need to make. |
| |
| <Listing number="12-7" file-name="src/main.rs" caption="Changing `parse_config` into `Config::new`"> |
| |
| ```rust,should_panic,noplayground |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-07/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We’ve updated `main` where we were calling `parse_config` to instead call |
| `Config::new`. We’ve changed the name of `parse_config` to `new` and moved it |
| within an `impl` block, which associates the `new` function with `Config`. Try |
| compiling this code again to make sure it works. |
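
A minimal sketch of this change, assuming the same `Config` fields as before and a hard-coded argument vector in place of `env::args()`, might look like this:

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // Formerly the free function `parse_config`; placing it in an `impl`
    // block makes it an associated function called as `Config::new`.
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}

fn main() {
    // Assumed stand-in for `env::args().collect()`.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];

    let config = Config::new(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);
}
```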
| |
| ### Fixing the Error Handling |
| |
| Now we’ll work on fixing our error handling. Recall that attempting to access |
| the values in the `args` vector at index 1 or index 2 will cause the program to |
| panic if the vector contains fewer than three items. Try running the program |
| without any arguments; it will look like this: |
| |
| ```console |
| {{#include ../listings/ch12-an-io-project/listing-12-07/output.txt}} |
| ``` |
| |
| The line `index out of bounds: the len is 1 but the index is 1` is an error |
| message intended for programmers. It won’t help our end users understand what |
| they should do instead. Let’s fix that now. |
| |
| #### Improving the Error Message |
| |
| In Listing 12-8, we add a check in the `new` function that will verify that the |
| slice is long enough before accessing index 1 and index 2. If the slice isn’t |
| long enough, the program panics and displays a better error message. |
| |
| <Listing number="12-8" file-name="src/main.rs" caption="Adding a check for the number of arguments"> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-08/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| This code is similar to [the `Guess::new` function we wrote in Listing |
| 9-13][ch9-custom-types]<!-- ignore -->, where we called `panic!` when the |
| `value` argument was out of the range of valid values. Instead of checking for |
| a range of values here, we’re checking that the length of `args` is at least |
| `3`, so the rest of the function can operate under the assumption that this |
| condition has been met. If `args` has fewer than three items, the check fails, |
| and we call the `panic!` macro to end the program immediately. |
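
One way such a check could be written is sketched below. The exact message text is an assumption chosen to match the `"not enough arguments"` string mentioned later in this chapter, and the hard-coded arguments again replace `env::args()`.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn new(args: &[String]) -> Config {
        // Guard before indexing: the program name plus two arguments
        // means we need at least three items.
        if args.len() < 3 {
            panic!("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}

fn main() {
    // Assumed stand-in for `env::args().collect()`; shortening this
    // vector to one item would trigger the panic above.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];

    let config = Config::new(&args);

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);
}
```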
| |
| With these extra few lines of code in `new`, let’s run the program without any |
| arguments again to see what the error looks like now: |
| |
| ```console |
| {{#include ../listings/ch12-an-io-project/listing-12-08/output.txt}} |
| ``` |
| |
| This output is better: we now have a reasonable error message. However, we also |
| have extraneous information we don’t want to give to our users. Perhaps the |
| technique we used in Listing 9-13 isn’t the best one to use here: a call to |
| `panic!` is more appropriate for a programming problem than a usage problem, |
| [as discussed in Chapter 9][ch9-error-guidelines]<!-- ignore -->. Instead, |
| we’ll use the other technique you learned about in Chapter 9—[returning a |
| `Result`][ch9-result]<!-- ignore --> that indicates either success or an error. |
| |
| <!-- Old headings. Do not remove or links may break. --> |
| |
| <a id="returning-a-result-from-new-instead-of-calling-panic"></a> |
| |
| #### Returning a `Result` Instead of Calling `panic!` |
| |
| We can instead return a `Result` value that will contain a `Config` instance in |
| the successful case and will describe the problem in the error case. We’re also |
| going to change the function name from `new` to `build` because many |
| programmers expect `new` functions to never fail. When `Config::build` is |
| communicating to `main`, we can use the `Result` type to signal there was a |
| problem. Then we can change `main` to convert an `Err` variant into a more |
| practical error for our users without the surrounding text about `thread |
| 'main'` and `RUST_BACKTRACE` that a call to `panic!` causes. |
| |
| Listing 12-9 shows the changes we need to make to the return value of the |
| function we’re now calling `Config::build` and the body of the function needed |
| to return a `Result`. Note that this won’t compile until we update `main` as |
| well, which we’ll do in the next listing. |
| |
| <Listing number="12-9" file-name="src/main.rs" caption="Returning a `Result` from `Config::build`"> |
| |
| ```rust,ignore,does_not_compile |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-09/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| Our `build` function returns a `Result` with a `Config` instance in the success |
| case and a string literal in the error case. Our error values will always be |
| string literals that have the `'static` lifetime. |
| |
| We’ve made two changes in the body of the function: instead of calling `panic!` |
| when the user doesn’t pass enough arguments, we now return an `Err` value, and |
| we’ve wrapped the `Config` return value in an `Ok`. These changes make the |
| function conform to its new type signature. |
| |
| Returning an `Err` value from `Config::build` allows the `main` function to |
| handle the `Result` value returned from the `build` function and exit the |
| process more cleanly in the error case. |
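
The two changes can be sketched as follows. To keep the example self-contained, the `main` function here inspects the result with a plain `match`, which is an assumption of this sketch rather than the approach the chapter takes next.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // The error type is a string literal, which has the `'static` lifetime.
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            // Report the problem to the caller instead of panicking.
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        // The success value is wrapped in `Ok` to match the signature.
        Ok(Config { query, file_path })
    }
}

fn main() {
    // Assumed input with too few arguments, to exercise the error case.
    let too_few = vec![String::from("minigrep")];

    match Config::build(&too_few) {
        Ok(config) => println!("Searching for {} in {}", config.query, config.file_path),
        Err(message) => println!("Could not build a Config: {}", message),
    }
}
```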
| |
| <!-- Old headings. Do not remove or links may break. --> |
| |
| <a id="calling-confignew-and-handling-errors"></a> |
| |
| #### Calling `Config::build` and Handling Errors |
| |
| To handle the error case and print a user-friendly message, we need to update |
| `main` to handle the `Result` being returned by `Config::build`, as shown in |
| Listing 12-10. We’ll also take the responsibility of exiting the command line |
| tool with a nonzero error code away from `panic!` and instead implement it by |
| hand. A nonzero exit status is a convention to signal to the process that |
| called our program that the program exited with an error state. |
| |
| <Listing number="12-10" file-name="src/main.rs" caption="Exiting with an error code if building a `Config` fails"> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-10/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| In this listing, we’ve used a method we haven’t covered in detail yet: |
| `unwrap_or_else`, which is defined on `Result<T, E>` by the standard library. |
| Using `unwrap_or_else` allows us to define some custom, non-`panic!` error |
| handling. If the `Result` is an `Ok` value, this method’s behavior is similar |
| to `unwrap`: it returns the inner value that `Ok` is wrapping. However, if the |
| value is an `Err` value, this method calls the code in the _closure_, which is |
| an anonymous function we define and pass as an argument to `unwrap_or_else`. |
| We’ll cover closures in more detail in [Chapter 13][ch13]<!-- ignore -->. For |
| now, you just need to know that `unwrap_or_else` will pass the inner value of |
| the `Err`, which in this case is the static string `"not enough arguments"` |
| that we added in Listing 12-9, to our closure in the argument `err` that |
| appears between the vertical pipes. The code in the closure can then use the |
| `err` value when it runs. |
| |
| We’ve added a new `use` line to bring `process` from the standard library into |
| scope. The code in the closure that will be run in the error case is only two |
| lines: we print the `err` value and then call `process::exit`. The |
| `process::exit` function will stop the program immediately and return the |
| number that was passed as the exit status code. This is similar to the |
| `panic!`-based handling we used in Listing 12-8, but we no longer get all the |
| extra output. Let’s try it: |
| |
| ```console |
| {{#include ../listings/ch12-an-io-project/listing-12-10/output.txt}} |
| ``` |
| |
| Great! This output is much friendlier for our users. |
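
For reference, the `unwrap_or_else` handling described in this section can be sketched like this. The wording of the printed message and the hard-coded arguments are assumptions of the sketch.

```rust
use std::process;

struct Config {
    query: String,
    file_path: String,
}

impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        Ok(Config {
            query: args[1].clone(),
            file_path: args[2].clone(),
        })
    }
}

fn main() {
    // Assumed stand-in for `env::args().collect()`.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("poem.txt"),
    ];

    // On `Ok`, the inner `Config` is returned. On `Err`, the closure
    // receives the error value as `err`, prints it, and exits with a
    // nonzero status instead of panicking.
    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {}", err);
        process::exit(1);
    });

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);
}
```

Because `process::exit` never returns, the closure does not need to produce a `Config` value in the error case.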
| |
| ### Extracting Logic from `main` |
| |
| Now that we’ve finished refactoring the configuration parsing, let’s turn to |
| the program’s logic. As we stated in [“Separation of Concerns for Binary |
| Projects”](#separation-of-concerns-for-binary-projects)<!-- ignore -->, we’ll |
| extract a function named `run` that will hold all the logic currently in the |
| `main` function that isn’t involved with setting up configuration or handling |
| errors. When we’re done, `main` will be concise and easy to verify by |
| inspection, and we’ll be able to write tests for all the other logic. |
| |
| Listing 12-11 shows the extracted `run` function. For now, we’re just making |
| the small, incremental improvement of extracting the function. We’re still |
| defining the function in _src/main.rs_. |
| |
| <Listing number="12-11" file-name="src/main.rs" caption="Extracting a `run` function containing the rest of the program logic"> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-11/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| The `run` function now contains all the remaining logic from `main`, starting |
| from reading the file. The `run` function takes the `Config` instance as an |
| argument. |
| |
| #### Returning Errors from the `run` Function |
| |
| With the remaining program logic separated into the `run` function, we can |
| improve the error handling, as we did with `Config::build` in Listing 12-9. |
| Instead of allowing the program to panic by calling `expect`, the `run` |
| function will return a `Result<T, E>` when something goes wrong. This will let |
| us further consolidate the logic around handling errors into `main` in a |
| user-friendly way. Listing 12-12 shows the changes we need to make to the |
| signature and body of `run`. |
| |
| <Listing number="12-12" file-name="src/main.rs" caption="Changing the `run` function to return `Result`"> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-12/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We’ve made three significant changes here. First, we changed the return type of |
| the `run` function to `Result<(), Box<dyn Error>>`. This function previously |
| returned the unit type, `()`, and we keep that as the value returned in the |
| `Ok` case. |
| |
| For the error type, we used the _trait object_ `Box<dyn Error>` (and we’ve |
| brought `std::error::Error` into scope with a `use` statement at the top). |
| We’ll cover trait objects in [Chapter 18][ch18]<!-- ignore -->. For now, just |
| know that `Box<dyn Error>` means the function will return a type that |
| implements the `Error` trait, but we don’t have to specify what particular type |
| the return value will be. This gives us flexibility to return error values that |
| may be of different types in different error cases. The `dyn` keyword is short |
| for _dynamic_. |
| |
| Second, we’ve removed the call to `expect` in favor of the `?` operator, as we |
| talked about in [Chapter 9][ch9-question-mark]<!-- ignore -->. Rather than |
| `panic!` on an error, `?` will return the error value from the current function |
| for the caller to handle. |
| |
| Third, the `run` function now returns an `Ok` value in the success case. |
| We’ve declared the `run` function’s success type as `()` in the signature, |
| which means we need to wrap the unit type value in the `Ok` value. This |
| `Ok(())` syntax might look a bit strange at first, but using `()` like this is |
| the idiomatic way to indicate that we’re calling `run` for its side effects |
| only; it doesn’t return a value we need. |
| |
| When you run this code, it will compile but will display a warning: |
| |
| ```console |
| {{#include ../listings/ch12-an-io-project/listing-12-12/output.txt}} |
| ``` |
| |
| Rust tells us that our code ignored the `Result` value and the `Result` value |
| might indicate that an error occurred. But we’re not checking to see whether or |
| not there was an error, and the compiler reminds us that we probably meant to |
| have some error-handling code here! Let’s rectify that problem now. |
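
Before we do, here is a minimal sketch of the three changes to `run` discussed above. It is not the listing itself: the `main` function below inspects the result with a plain `match`, so this version does not produce the warning, and the file name is a made-up path that we assume does not exist.

```rust
use std::error::Error;
use std::fs;

struct Config {
    query: String,
    file_path: String,
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    // On failure, `?` returns the error to the caller (boxed as a trait
    // object) instead of panicking.
    let contents = fs::read_to_string(config.file_path)?;

    println!("Searching for {}", config.query);
    println!("With text:\n{}", contents);

    // Success carries no value, so the unit type goes inside `Ok`.
    Ok(())
}

fn main() {
    let config = Config {
        query: String::from("needle"),
        // Hypothetical path, assumed to be missing.
        file_path: String::from("no-such-file-for-this-sketch.txt"),
    };

    match run(config) {
        Ok(()) => println!("run finished without error"),
        Err(e) => println!("run reported an error: {}", e),
    }
}
```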
| |
| #### Handling Errors Returned from `run` in `main` |
| |
| We’ll check for errors and handle them using a technique similar to one we used |
| with `Config::build` in Listing 12-10, but with a slight difference: |
| |
| <span class="filename">Filename: src/main.rs</span> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/no-listing-01-handling-errors-in-main/src/main.rs:here}} |
| ``` |
| |
| We use `if let` rather than `unwrap_or_else` to check whether `run` returns an |
| `Err` value and to call `process::exit(1)` if it does. The `run` function |
| doesn’t return a value that we want to `unwrap` in the same way that |
| `Config::build` returns the `Config` instance. Because `run` returns `()` in |
| the success case, we only care about detecting an error, so we don’t need |
| `unwrap_or_else` to return the unwrapped value, which would only be `()`. |
| |
| The body of the `if let` block and the body of the closure passed to |
| `unwrap_or_else` do the same thing: we print the error and exit. |
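
The `if let` handling can be sketched as follows. So that the example has something to read, `main` first writes a small temporary file. That setup step, the file name, and the message wording are assumptions of this sketch and not part of the program we are building.

```rust
use std::error::Error;
use std::fs;
use std::process;

struct Config {
    query: String,
    file_path: String,
}

fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;
    println!("Searching for {} in:\n{}", config.query, contents);
    Ok(())
}

fn main() {
    // Assumed setup: create a file for the sketch to read.
    let path = std::env::temp_dir().join("minigrep_if_let_sketch.txt");
    fs::write(&path, "needle in a haystack").expect("temp dir should be writable");

    let config = Config {
        query: String::from("needle"),
        file_path: path.to_string_lossy().into_owned(),
    };

    // Only the `Err` case matters here: on success there is nothing to
    // unwrap except `()`.
    if let Err(e) = run(config) {
        println!("Application error: {}", e);
        process::exit(1);
    }
}
```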
| |
| ### Splitting Code into a Library Crate |
| |
| Our `minigrep` project is looking good so far! Now we’ll split the |
| _src/main.rs_ file and put some code into the _src/lib.rs_ file. That way, we |
| can test the code and have a _src/main.rs_ file with fewer responsibilities. |
| |
| Let’s move all the code that isn’t in the `main` function from _src/main.rs_ to |
| _src/lib.rs_: |
| |
| - The `run` function definition |
| - The relevant `use` statements |
| - The definition of `Config` |
| - The `Config::build` function definition |
| |
| The contents of _src/lib.rs_ should have the signatures shown in Listing 12-13 |
| (we’ve omitted the bodies of the functions for brevity). Note that this won’t |
| compile until we modify _src/main.rs_ in Listing 12-14. |
| |
| <Listing number="12-13" file-name="src/lib.rs" caption="Moving `Config` and `run` into *src/lib.rs*"> |
| |
| ```rust,ignore,does_not_compile |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-13/src/lib.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We’ve made liberal use of the `pub` keyword: on `Config`, on its fields and its |
| `build` function, and on the `run` function. We now have a library crate that |
| has a public API we can test! |
| |
| Now we need to bring the code we moved to _src/lib.rs_ into the scope of the |
| binary crate in _src/main.rs_, as shown in Listing 12-14. |
| |
| <Listing number="12-14" file-name="src/main.rs" caption="Using the `minigrep` library crate in *src/main.rs*"> |
| |
| ```rust,ignore |
| {{#rustdoc_include ../listings/ch12-an-io-project/listing-12-14/src/main.rs:here}} |
| ``` |
| |
| </Listing> |
| |
| We add a `use minigrep::Config` line to bring the `Config` type from the |
| library crate into the binary crate’s scope, and we prefix the `run` function |
| with our crate name. Now all the functionality should be connected and should |
| work. Run the program with `cargo run` and make sure everything works correctly. |
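
To see the whole arrangement at a glance, here is a single-file sketch. It uses an inline module named `minigrep` as a stand-in for the library crate in _src/lib.rs_; in the real project that code lives in its own file and no `mod` block is needed. The temporary file and the hard-coded arguments are likewise assumptions made so the sketch can run on its own.

```rust
use std::process;

// Stand-in for src/lib.rs. Everything `main` needs is marked `pub`.
mod minigrep {
    use std::error::Error;
    use std::fs;

    pub struct Config {
        pub query: String,
        pub file_path: String,
    }

    impl Config {
        pub fn build(args: &[String]) -> Result<Config, &'static str> {
            if args.len() < 3 {
                return Err("not enough arguments");
            }
            Ok(Config {
                query: args[1].clone(),
                file_path: args[2].clone(),
            })
        }
    }

    pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
        let contents = fs::read_to_string(config.file_path)?;
        println!("Searching for {} in:\n{}", config.query, contents);
        Ok(())
    }
}

// Stand-in for src/main.rs from here on.
use minigrep::Config;

fn main() {
    // Assumed setup replacing real command line input.
    let path = std::env::temp_dir().join("minigrep_lib_sketch.txt");
    std::fs::write(&path, "needle in a haystack").expect("temp dir should be writable");
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        path.to_string_lossy().into_owned(),
    ];

    let config = Config::build(&args).unwrap_or_else(|err| {
        println!("Problem parsing arguments: {}", err);
        process::exit(1);
    });

    if let Err(e) = minigrep::run(config) {
        println!("Application error: {}", e);
        process::exit(1);
    }
}
```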
| |
| Whew! That was a lot of work, but we’ve set ourselves up for success in the |
| future. Now it’s much easier to handle errors, and we’ve made the code more |
| modular. Almost all of our work will be done in _src/lib.rs_ from here on out. |
| |
| Let’s take advantage of this newfound modularity by doing something that would |
| have been difficult with the old code but is easy with the new code: we’ll |
| write some tests! |
| |
| [ch13]: ch13-00-functional-features.html |
| [ch9-custom-types]: ch09-03-to-panic-or-not-to-panic.html#creating-custom-types-for-validation |
| [ch9-error-guidelines]: ch09-03-to-panic-or-not-to-panic.html#guidelines-for-error-handling |
| [ch9-result]: ch09-02-recoverable-errors-with-result.html |
| [ch18]: ch18-00-oop.html |
| [ch9-question-mark]: ch09-02-recoverable-errors-with-result.html#a-shortcut-for-propagating-errors-the--operator |