What is the idiomatic way to write a linked list with a tail pointer?

1.6k views Asked by At

As a learning project for Rust, I have a very simple (working, if incomplete) implementation of a singly linked list. The declaration of the structs looks like this:

type NodePtr<T> = Option<Box<Node<T>>>;

struct Node<T> {
    data: T,
    next: NodePtr<T>,
}

pub struct LinkedList<T> {
    head: NodePtr<T>,
}

Implementing size and push_front were both reasonably straight-forward, although doing size iteratively did involve some "fighting with the borrow checker."

The next thing I wanted to try was adding a tail pointer to the LinkedList structure. to enable an efficient push_back operation. Here I've run into a bit of a wall. At first I attempted to use Option<&Box<Node<T>>> and then Option<&Node<T>>. Both of those led to sprinkling 'as everywhere, but still eventually being unable to promise the lifetime checker that tail would be valid.

I have since come to the tentative conclusion that it is impossible with these definitions: that there is no way to guarantee to the compiler that tail would be valid in the places that I think it is valid. The only way I can possibly accomplish this is to have all my pointers be Rc<_> or Rc<RefCell<_>>, because those are the only safe ways to have two things pointing at the same object (the final node).

My question: is this the correct conclusion? More generally: what is the idiomatic Rust solution for unowned pointers inside data structures? To my mind, reference counting seems awfully heavy-weight for something so simple, so I think I must be missing something. (Or perhaps I just haven't gotten my mind into the right mindset for memory safety yet.)

1

There are 1 answers

4
Alexis Beingessner On BEST ANSWER

Yes, if you want to write a singly-linked-list with a tail-pointer you have three choices:

  • Safe and Mutable: Use NodePtr = Option<Rc<RefCell<Node<T>>>>
  • Safe and Immutable: Use NodePtr = Option<Rc<Node<T>>>
  • Unsafe and Mutable: Use tail: *mut Node<T>

The *mut is going to be more efficient, and it's not like the Rc is actually going to prevent you from producing completely nonsense states (as you correctly deduced). It's just going to guarantee that they don't cause segfaults (and with RefCell it may still cause runtime crashes though...).

Ultimately, any linked list more complex than vanilla singly-linked has an ownership story that's too complex to encode in Rust's ownership system safely and efficiently (it's not a tree). I personally favour just embracing the unsafety at that point and leaning on unit tests to get to the finish-line in one piece (why write a suboptimal data structure...?).