Implementing an object factory pattern in Rust using closures

Coming from a Java background, I’m used to using the object factory pattern to inject behavior into an existing component. For this tutorial we’ll use the example of building a connection pool that can be used with different types of database connections.

Disclaimer: The entire approach taken here might not be the best way of doing things in Rust at all, but I made it work with some help from the Rust IRC channel (shout out to Quxxy!). If you are a Rust guru and know of a more idiomatic way of solving this problem I would love to hear about it!

Anyway, without further ado, here is our naive connection pool class. Oops, did I just say “class”? I meant “struct”. It’s hard to undo so much Java.
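A sketch of what the struct looks like (the field names here are my choice based on the description that follows, not necessarily the exact original listing):

```rust
// A naive connection pool: T is the connection type, F is a factory
// closure that takes no arguments and returns a T.
struct ConnectionPool<T, F>
where
    F: Fn() -> T,
{
    create_conn: F,      // closure used to create new connections
    connections: Vec<T>, // the pooled connections
}
```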

This struct uses generics so that we can create connection pools for different implementations of T (the connection type), although F must always be a function that takes no arguments and returns a T.

The struct stores the F closure for creating new connections and has a Vec<T> to hold the connections.

Next, we implement some methods on this struct.
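A sketch of the impl block (the struct definition is repeated so the snippet stands on its own; the pooling logic is deliberately naive):

```rust
struct ConnectionPool<T, F>
where
    F: Fn() -> T,
{
    create_conn: F,
    connections: Vec<T>,
}

impl<T, F> ConnectionPool<T, F>
where
    F: Fn() -> T,
{
    fn new(create_conn: F) -> Self {
        ConnectionPool {
            create_conn,
            connections: Vec::new(),
        }
    }

    // Hand out a pooled connection if we have one, otherwise create a
    // new one by invoking the stored closure: (self.create_conn)()
    fn get_connection(&mut self) -> T {
        match self.connections.pop() {
            Some(conn) => conn,
            None => (self.create_conn)(),
        }
    }

    // Return a connection to the pool.
    fn release(&mut self, conn: T) {
        self.connections.push(conn);
    }
}
```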

Ignoring the dubious logic in here (luckily, we’re not trying to build a real connection pool), the main point of interest is where we create new connections by invoking the closure using the syntax (self.create_conn)().

Now that the generic connection pool is defined, we can go ahead and define our concrete connection struct that we want to store in the pool.
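The OracleConnection name comes up again below; a real connection type would hold connection state, but an empty struct is enough for this sketch:

```rust
// A hypothetical concrete connection type with no fields.
#[derive(Debug)]
struct OracleConnection {}
```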

And finally, we can implement some code to create and use the connection pool.
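Something along these lines (earlier definitions repeated so the snippet compiles on its own; create_oracle_connection is a hypothetical name for the factory function):

```rust
struct ConnectionPool<T, F>
where
    F: Fn() -> T,
{
    create_conn: F,
    connections: Vec<T>,
}

impl<T, F> ConnectionPool<T, F>
where
    F: Fn() -> T,
{
    fn new(create_conn: F) -> Self {
        ConnectionPool { create_conn, connections: Vec::new() }
    }

    fn get_connection(&mut self) -> T {
        match self.connections.pop() {
            Some(conn) => conn,
            None => (self.create_conn)(),
        }
    }
}

#[derive(Debug)]
struct OracleConnection {}

// A plain function can be passed wherever an Fn() -> T is expected.
fn create_oracle_connection() -> OracleConnection {
    OracleConnection {}
}

fn use_the_pool() -> OracleConnection {
    // Pass a reference to the function as the factory closure.
    let mut pool = ConnectionPool::new(create_oracle_connection);
    pool.get_connection()
}
```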

For the closure that we pass to ConnectionPool::new(), we are just passing a reference to a function. It would have been possible to use closure syntax instead, e.g. ConnectionPool::new(|| OracleConnection {}), but using the function is much cleaner.

Source code for this tutorial is available here.

Implementing a Pratt parser in Rust

Following on from the previous tutorial that demonstrated how to implement a tokenizer, it is now time to implement a parser to transform the list of tokens into an Abstract Syntax Tree, hereafter referred to as an AST.

We will be using a Pratt parser design, which is a recursive descent parser that is very simple to reason about. I discovered this design a few years ago (largely thanks to this blog post) and I have been using it extensively since then in a number of programming languages. It works very nicely in languages with functional influences such as Rust and Scala.

Following the theme of the previous tutorial we will be parsing simple math expressions such as 1+2*3. We will need to handle operator precedence correctly to interpret 1+2*3 as 1+(2*3) rather than (1+2)*3.

The Pratt parser uses the concepts of a prefix parser and an infix parser. A prefix parser is responsible for parsing the start of an expression. In our case that would be an integer. The infix parser is responsible for parsing an infix operator (in our case the infix operator would be one of the math operators: +, -, *, or /).

Before we dive into the main logic, we need some data structures to store the parsed expressions:
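A plausible shape for those types (the variant names are my assumption):

```rust
// The operator in its parsed form.
#[derive(Debug, PartialEq)]
enum Op {
    Add,
    Subtract,
    Multiply,
    Divide,
}

// An expression is either an integer literal or a binary operation.
// Box is required because the type refers to itself.
#[derive(Debug, PartialEq)]
enum Expr {
    Int(u64),
    Binary(Box<Expr>, Op, Box<Expr>),
}
```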

We’ll also need some logic for determining the precedence of the different operators. Note that this code operates on the Symbol token rather than the Op enum that will contain the parsed operator.
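Assuming a Symbol enum from the tokenizer tutorial (repeated here so the snippet stands alone; exact names are an assumption), precedence can be a simple match where a higher number binds tighter:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol {
    Plus,
    Minus,
    Times,
    Divide,
}

// Multiplication and division bind tighter than addition and
// subtraction. Note that this works on the Symbol token, not on Op.
fn get_precedence(symbol: Symbol) -> u8 {
    match symbol {
        Symbol::Plus | Symbol::Minus => 1,
        Symbol::Times | Symbol::Divide => 2,
    }
}
```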

The top-level parsing code is actually very simple to follow. First, the prefix parser is used to parse a prefix expression. Then the code loops, handing the expression to the infix parser for as long as the next operator binds tighter than the current precedence. We’ll see where the recursion comes in shortly when we write the parse_infix method.

The parse_prefix method is very simple for our use case. There is only one supported prefix token and that is an integer.

The infix parser is slightly more involved. It gets passed the “left expression” and is then responsible for parsing the infix operator and then recursing back into the top level parse_expr method to parse the right-hand side of the binary expression.
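Putting the three pieces together, here is a self-contained sketch of the parser. The token and AST types are repeated from earlier so this compiles on its own, and the function signatures (working over a Peekable iterator of tokens) are my assumption rather than the original code:

```rust
use std::iter::Peekable;
use std::vec::IntoIter;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol { Plus, Minus, Times, Divide }

#[derive(Debug, PartialEq)]
enum Token { Int(u64), Symbol(Symbol) }

#[derive(Debug, PartialEq)]
enum Op { Add, Subtract, Multiply, Divide }

#[derive(Debug, PartialEq)]
enum Expr { Int(u64), Binary(Box<Expr>, Op, Box<Expr>) }

fn get_precedence(symbol: Symbol) -> u8 {
    match symbol {
        Symbol::Plus | Symbol::Minus => 1,
        Symbol::Times | Symbol::Divide => 2,
    }
}

// Precedence of the next operator in the stream, or 0 at end of input.
fn next_precedence(it: &mut Peekable<IntoIter<Token>>) -> u8 {
    match it.peek() {
        Some(&Token::Symbol(symbol)) => get_precedence(symbol),
        _ => 0,
    }
}

fn parse_expr(it: &mut Peekable<IntoIter<Token>>, precedence: u8) -> Expr {
    // Parse a prefix expression first, then keep parsing infix
    // operators while the next operator binds tighter.
    let mut left = parse_prefix(it);
    while precedence < next_precedence(it) {
        left = parse_infix(it, left);
    }
    left
}

fn parse_prefix(it: &mut Peekable<IntoIter<Token>>) -> Expr {
    // The only supported prefix token is an integer.
    match it.next() {
        Some(Token::Int(n)) => Expr::Int(n),
        t => panic!("unexpected token: {:?}", t),
    }
}

fn parse_infix(it: &mut Peekable<IntoIter<Token>>, left: Expr) -> Expr {
    match it.next() {
        Some(Token::Symbol(symbol)) => {
            let op = match symbol {
                Symbol::Plus => Op::Add,
                Symbol::Minus => Op::Subtract,
                Symbol::Times => Op::Multiply,
                Symbol::Divide => Op::Divide,
            };
            // Recurse back into parse_expr for the right-hand side.
            let right = parse_expr(it, get_precedence(symbol));
            Expr::Binary(Box::new(left), op, Box::new(right))
        }
        t => panic!("unexpected token: {:?}", t),
    }
}
```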

To make the logic clearer, let’s walk through two different scenarios and show what is happening with debug output in each function.

First, let’s parse 123+456*789.

Now let’s swap the operators so we are now parsing 123*456+789.

We can see from the debug output that the logic determined the operator precedence correctly for both of these use cases.

The full source code for this example is available on GitHub, and I encourage you to try this code out yourself to get a deeper understanding of how it works.

Writing a Tokenizer with Iterator and Peekable

In this tutorial we will write a low-level tokenizer for tokenizing simple math expressions like “12+34*56” using Rust’s Iterator and Peekable traits.

First, we need to declare some data structures for storing the results. Rust has just two kinds of user-defined data structure: struct and enum. If you’re coming from a Java or Scala background you might be looking for classes with an object-oriented hierarchy, but enums are the way to go here.

Rust enumerations are a great way of modeling data structures and can be nested and recursive with the help of Box and Vec. However, in this example we will keep things simple and avoid recursion. We’ll tackle that in part two of this tutorial where we will build a parser for math expressions.

We will use the #[derive(…)] syntax to implement the Debug and PartialEq traits for these enumerations to make it easy to write unit tests.
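A plausible pair of enums for this (the exact shape of the original may differ):

```rust
// The math operators as raw symbols.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol {
    Plus,
    Minus,
    Times,
    Divide,
}

// A token is either an integer literal or an operator symbol.
#[derive(Debug, PartialEq)]
enum Token {
    Int(u64),
    Symbol(Symbol),
}
```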

We now need a tokenize function that can take a String as input and produce a Vec&lt;Token&gt; as output. There are two main approaches for doing this in Rust: we can write a standalone function that accepts a String parameter, or we can define a trait and implement the trait directly on the String type.

The latter is generally the better approach since it ties functionality to a particular type, and if we want to be able to tokenize other types in the future we can implement the trait for those types too.

Here is our Tokenizer trait definition:
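Something along these lines (the Token and Symbol enums are repeated from above so the snippet stands alone):

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol { Plus, Minus, Times, Divide }

#[derive(Debug, PartialEq)]
enum Token { Int(u64), Symbol(Symbol) }

// Anything implementing this trait can be tokenized.
trait Tokenizer {
    fn tokenize(&self) -> Vec<Token>;
}
```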

We now use the following syntax to implement the trait for the String type:
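The impl block looks like this, with a stub body for now (earlier definitions repeated so the snippet compiles on its own):

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol { Plus, Minus, Times, Divide }

#[derive(Debug, PartialEq)]
enum Token { Int(u64), Symbol(Symbol) }

trait Tokenizer {
    fn tokenize(&self) -> Vec<Token>;
}

impl Tokenizer for String {
    fn tokenize(&self) -> Vec<Token> {
        vec![] // stub body for now
    }
}
```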

We can get an iterator over the chars in the string by calling the chars() method, but we want the ability to peek at the next character without consuming it, so we call peekable() on the iterator to give us a Peekable iterator.

We also need to prepare a vector to store the tokens as they are produced.

Next, we need to loop until all characters have been processed. Here’s the starting code for this.
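A sketch of that starting point, with a placeholder where the token-building logic will go (earlier definitions repeated so the snippet compiles on its own):

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol { Plus, Minus, Times, Divide }

#[derive(Debug, PartialEq)]
enum Token { Int(u64), Symbol(Symbol) }

trait Tokenizer {
    fn tokenize(&self) -> Vec<Token>;
}

impl Tokenizer for String {
    fn tokenize(&self) -> Vec<Token> {
        // A peekable iterator over the characters, plus a vector
        // to collect the tokens.
        let mut it = self.chars().peekable();
        let mut tokens: Vec<Token> = vec![];
        loop {
            match it.peek() {
                Some(&ch) => {
                    // Token-building logic will go here; for now just
                    // consume the character so that the loop terminates.
                    let _ = ch;
                    it.next();
                }
                None => break,
            }
        }
        tokens
    }
}
```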

There are some important things to explain here. Firstly, the loop keyword is equivalent to while(true) in other languages and will loop until hitting a break or return statement (or a panic).

Next, we’re performing pattern matching on the result of it.peek(). Because this method returns an Option&lt;&amp;T&gt; (an optional reference, rather than an optional value) we need to match with Some(&amp;ch) rather than Some(ch), otherwise we start running into lifetime issues. If peek() returns Some(&amp;ch) then we can start building a specific token, and if it returns None we can quit the loop since there are no characters left to process.

Here is the full implementation of the tokenize() function:
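A self-contained sketch of what that implementation looks like (earlier definitions repeated; the my_take_while signature is my assumption):

```rust
use std::iter::Peekable;
use std::str::Chars;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Symbol { Plus, Minus, Times, Divide }

#[derive(Debug, PartialEq)]
enum Token { Int(u64), Symbol(Symbol) }

trait Tokenizer {
    fn tokenize(&self) -> Vec<Token>;
}

// Like Iterator::take_while, but only consumes characters that match
// the predicate; it peeks before consuming, so the first non-matching
// character is left in the iterator.
fn my_take_while<F>(it: &mut Peekable<Chars<'_>>, pred: F) -> String
where
    F: Fn(char) -> bool,
{
    let mut s = String::new();
    while let Some(&ch) = it.peek() {
        if pred(ch) {
            s.push(ch);
            it.next();
        } else {
            break;
        }
    }
    s
}

impl Tokenizer for String {
    fn tokenize(&self) -> Vec<Token> {
        let mut it = self.chars().peekable();
        let mut tokens: Vec<Token> = vec![];
        loop {
            match it.peek() {
                Some(&ch) => match ch {
                    '0'..='9' => {
                        // Collect the whole run of digits into one Int.
                        let num = my_take_while(&mut it, |c| c.is_ascii_digit());
                        tokens.push(Token::Int(num.parse().unwrap()));
                    }
                    '+' | '-' | '*' | '/' => {
                        it.next();
                        let symbol = match ch {
                            '+' => Symbol::Plus,
                            '-' => Symbol::Minus,
                            '*' => Symbol::Times,
                            _ => Symbol::Divide,
                        };
                        tokens.push(Token::Symbol(symbol));
                    }
                    _ => panic!("unexpected character: {}", ch),
                },
                None => break,
            }
        }
        tokens
    }
}
```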

A design goal here is to keep the tokenizing code as functional as possible, so we make use of a function named my_take_while to iterate over the string, collecting characters for as long as the supplied closure returns true. This is very similar to the take_while() method on Iterator, with one major difference: our version only consumes characters that match, whereas the Iterator implementation also consumes the first character that does not match, making it unsuitable for our purposes.

When I started writing this tutorial I had hoped it would be much simpler, yet we’re already getting into closures and functional programming. However, if you stumbled across this because you are trying to implement a tokenizer in Rust, hopefully it will give you some good ideas.

The full source code for this example is available on GitHub here:

Part two of this tutorial is now available: