Convenient 'Option<Box<Any>>' access when success is assured?


When writing callbacks for generic interfaces, it can be useful for them to define their own local data which they are responsible for creating and accessing.

In C I would just use a void pointer. A C-like example:

struct SomeTool {
    int type;
    void *custom_data;
};

void invoke(struct SomeTool *tool) {
    struct StructOnlyForThisTool *data = malloc(sizeof(*data));
    /* ... fill in the data ... */
    tool->custom_data = data;
}
void execute(struct SomeTool *tool) {
    struct StructOnlyForThisTool *data = tool->custom_data;
    if (data->foo_bar) { /* do something */ }
}

I'm writing something similar in Rust, replacing void * with Option<Box<Any>>. However, I'm finding that accessing the data is unreasonably verbose, e.g.:

use std::any::Any;

struct SomeTool {
    tool_type: i32, // `type` is a reserved word in Rust
    custom_data: Option<Box<dyn Any>>, // written Box<Any> before the 2018 edition
}

fn invoke(tool: &mut SomeTool) {
    let data = StructOnlyForThisTool { /* my custom data */ };
    /* ... fill in the data ... */
    tool.custom_data = Some(Box::new(data));
}
fn execute(tool: &mut SomeTool) {
    let data = tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap();
    if data.foo_bar { /* do something */ }
}

There is one line here which I'd like to be able to write in a more compact way:

  • tool.custom_data.as_ref().unwrap().downcast_ref::<StructOnlyForThisTool>().unwrap()
  • tool.custom_data.as_mut().unwrap().downcast_mut::<StructOnlyForThisTool>().unwrap()

While each method makes sense on its own, in practice it's not something I'd want to write throughout a code-base, and not something I'm going to want to type out often or remember easily.

By convention, the uses of unwrap here aren't dangerous because:

  • While only some tools define custom data, the ones that do always define it.
  • When the data is set, by convention the tool only ever sets its own data. So there is no chance of having the wrong data.
  • Any time these conventions aren't followed, it's a bug and should panic.

Given these conventions, and assuming accessing custom-data from a tool is something that's done often - what would be a good way to simplify this expression?


Some possible options:

  • Remove the Option and just use Box<Any>, with Box::new(()) representing None, so access can be simplified a little.
  • Use a macro or function that takes the Option<Box<Any>> to hide the verbosity. This will work of course, but I'd prefer not to and would only use it as a last resort.
  • Add a trait to Option<Box<Any>> which exposes a method such as tool.custom_data.unwrap_box::<StructOnlyForThisTool>() with matching unwrap_box_mut.
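The third option could look roughly like the sketch below. The trait name UnwrapBox, the tool_type field, and the expect messages are illustrative assumptions; only the method names unwrap_box and unwrap_box_mut come from the question. The code uses the current Box<dyn Any> spelling of Box<Any>.

```rust
use std::any::Any;

/// Hypothetical extension trait (the name `UnwrapBox` is an assumption).
trait UnwrapBox {
    fn unwrap_box<T: Any>(&self) -> &T;
    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T;
}

impl UnwrapBox for Option<Box<dyn Any>> {
    fn unwrap_box<T: Any>(&self) -> &T {
        self.as_ref()
            .expect("custom_data is not set")
            .downcast_ref::<T>()
            .expect("custom_data has an unexpected type")
    }

    fn unwrap_box_mut<T: Any>(&mut self) -> &mut T {
        self.as_mut()
            .expect("custom_data is not set")
            .downcast_mut::<T>()
            .expect("custom_data has an unexpected type")
    }
}

struct StructOnlyForThisTool {
    foo_bar: bool,
}

struct SomeTool {
    tool_type: i32,
    custom_data: Option<Box<dyn Any>>,
}

fn invoke(tool: &mut SomeTool) {
    tool.custom_data = Some(Box::new(StructOnlyForThisTool { foo_bar: true }));
}

fn execute(tool: &mut SomeTool) -> bool {
    // Only the `custom_data` field is borrowed here, not the whole tool.
    let data = tool.custom_data.unwrap_box_mut::<StructOnlyForThisTool>();
    data.foo_bar = !data.foo_bar;
    tool.tool_type += 1; // a different field, still usable while `data` is live
    data.foo_bar
}

fn main() {
    let mut tool = SomeTool { tool_type: 0, custom_data: None };
    invoke(&mut tool);
    println!("foo_bar after execute: {}", execute(&mut tool));
}
```

Both methods panic when the data is missing or has the wrong type, which matches the conventions listed above.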

Update 1): Since asking this question, a point I didn't include seems relevant: there may be multiple callback functions like execute, which must all be able to access the custom_data. At the time I didn't think this was important to point out.

Update 2): Wrapping this in a function which takes tool isn't practical, since the borrow checker then prevents further access to members of tool until the cast variable goes out of scope. I found the only reliable way to do this was to write a macro.
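A minimal sketch of what such a macro might look like, since the question does not show the one actually used. The macro names custom_data! and custom_data_mut!, the Counter struct, and the tool_type field are assumptions for illustration. Because the macro expands in place to an access of the custom_data field, only that field is borrowed, and the other members of tool remain usable.

```rust
use std::any::Any;

// Hypothetical helper macros: expand to the verbose expression in place.
macro_rules! custom_data {
    ($tool:expr, $t:ty) => {
        $tool
            .custom_data
            .as_ref()
            .expect("custom_data is not set")
            .downcast_ref::<$t>()
            .expect("custom_data has an unexpected type")
    };
}

macro_rules! custom_data_mut {
    ($tool:expr, $t:ty) => {
        $tool
            .custom_data
            .as_mut()
            .expect("custom_data is not set")
            .downcast_mut::<$t>()
            .expect("custom_data has an unexpected type")
    };
}

struct Counter {
    hits: i32,
}

struct SomeTool {
    tool_type: i32,
    custom_data: Option<Box<dyn Any>>,
}

fn execute(tool: &mut SomeTool) -> i32 {
    let data = custom_data_mut!(tool, Counter);
    data.hits += 1;
    // A function taking `&mut SomeTool` and returning `&mut Counter` would
    // keep all of `tool` borrowed at this point; the macro borrows one field.
    tool.tool_type += data.hits;
    tool.tool_type
}

fn main() {
    let mut tool = SomeTool {
        tool_type: 10,
        custom_data: Some(Box::new(Counter { hits: 0 })),
    };
    println!("tool_type = {}", execute(&mut tool));
    println!("hits = {}", custom_data!(tool, Counter).hits);
}
```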


1 answer

Answer by user4815162342

If the implementation really only has a single method with a name like execute, that is a strong indication to consider using a closure to capture the implementation data. SomeTool can incorporate an arbitrary callable in a type-erased manner using a boxed FnMut, as shown in this answer. execute() then boils down to invoking the closure stored in the struct field, using (self.impl_)(). For a more general approach that also works when the implementation has more methods, read on.
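A rough sketch of the closure variant, under a few assumptions: the closure returns a String so the effect is observable, and the constructor new and the helper make_counter_tool are hypothetical names not taken from the answer.

```rust
struct SomeTool {
    // Type-erased callable; the tool-specific data lives inside the closure.
    impl_: Box<dyn FnMut() -> String>,
}

impl SomeTool {
    fn new<F>(f: F) -> SomeTool
    where
        F: FnMut() -> String + 'static,
    {
        SomeTool { impl_: Box::new(f) }
    }

    fn execute(&mut self) -> String {
        // The parentheses are needed to call a closure stored in a field.
        (self.impl_)()
    }
}

// Hypothetical tool whose private data is a call counter.
fn make_counter_tool() -> SomeTool {
    let mut calls = 0u32; // captured and owned by the closure
    SomeTool::new(move || {
        calls += 1;
        format!("called {} time(s)", calls)
    })
}

fn main() {
    let mut tool = make_counter_tool();
    println!("{}", tool.execute());
    println!("{}", tool.execute());
}
```

The captured variable plays the role of custom_data, and no cast or unwrap is needed to reach it.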

An idiomatic and type-safe equivalent of the type+dataptr C pattern is to store the implementation type and pointer to data together as a trait object. The SomeTool struct can contain a single field, a boxed SomeToolImpl trait object, where the trait specifies tool-specific methods such as execute. This has the following characteristics:

  • You no longer need an explicit type field because the run-time type information is incorporated in the trait object.

  • Each tool's implementation of the trait methods can access its own data in a type-safe manner without casts or unwraps. This is because the trait object's vtable automatically invokes the correct function for the correct trait implementation, and it is a compile-time error to try to invoke a different one.

  • The "fat pointer" representation of the trait object has the same performance characteristics as the type+dataptr pair - for example, the size of SomeTool will be two pointers, and accessing the implementation data will still involve a single pointer dereference.
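The size claim in the last point can be checked with std::mem::size_of. This is a small sketch that reflects the layout current Rust compilers use for boxed trait objects (one data pointer plus one vtable pointer). The Box<u64> line is added only for comparison with an ordinary "thin" pointer.

```rust
use std::mem::size_of;

trait SomeToolImpl {
    fn execute(&mut self);
}

struct SomeTool {
    impl_: Box<dyn SomeToolImpl>,
}

fn main() {
    let word = size_of::<usize>();
    // A box of a sized type is a single pointer.
    println!("Box<u64>: {} word(s)", size_of::<Box<u64>>() / word);
    // A boxed trait object is a fat pointer: data pointer + vtable pointer.
    println!(
        "Box<dyn SomeToolImpl>: {} word(s)",
        size_of::<Box<dyn SomeToolImpl>>() / word
    );
    // The wrapper struct adds nothing on top of its only field.
    println!("SomeTool: {} word(s)", size_of::<SomeTool>() / word);
}
```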

Here is an example implementation:

struct SomeTool {
    impl_: Box<dyn SomeToolImpl>, // `dyn` is required for trait objects in current Rust
}

impl SomeTool {
    fn execute(&mut self) {
        self.impl_.execute();
    }
}

trait SomeToolImpl {
    fn execute(&mut self);
}

struct SpecificTool1 {
    foo_bar: bool
}

impl SpecificTool1 {
    pub fn new(foo_bar: bool) -> SomeTool {
        let my_data = SpecificTool1 { foo_bar: foo_bar };
        SomeTool { impl_: Box::new(my_data) }
    }
}

impl SomeToolImpl for SpecificTool1 {
    fn execute(&mut self) {
        println!("I am {}", self.foo_bar);
    }
}

struct SpecificTool2 {
    num: u64
}

impl SpecificTool2 {
    pub fn new(num: u64) -> SomeTool {
        let my_data = SpecificTool2 { num: num };
        SomeTool { impl_: Box::new(my_data) }
    }
}

impl SomeToolImpl for SpecificTool2 {
    fn execute(&mut self) {
        println!("I am {}", self.num);
    }
}

pub fn main() {
    let mut tool1: SomeTool = SpecificTool1::new(true);
    let mut tool2: SomeTool = SpecificTool2::new(42);
    tool1.execute();
    tool2.execute();
}

Note that, in this design, it doesn't make sense to make the implementation an Option, because we always associate the tool type with the implementation. While it is perfectly valid to have an implementation without data, it must always have a type associated with it.
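As a sketch of both points, the variant below adds a second trait method so that several callbacks share the same data, and includes a tool with no data at all. The invoke method, the String return value, and the StatelessTool and CountingTool names are assumptions made for illustration.

```rust
trait SomeToolImpl {
    fn invoke(&mut self);
    fn execute(&mut self) -> String;
}

struct SomeTool {
    impl_: Box<dyn SomeToolImpl>,
}

impl SomeTool {
    fn invoke(&mut self) {
        self.impl_.invoke();
    }
    fn execute(&mut self) -> String {
        self.impl_.execute()
    }
}

// A tool without data: the unit struct still gives it a type.
struct StatelessTool;

impl SomeToolImpl for StatelessTool {
    fn invoke(&mut self) {}
    fn execute(&mut self) -> String {
        String::from("stateless")
    }
}

// A tool whose callbacks all reach the same data through `self`.
struct CountingTool {
    invocations: u32,
}

impl SomeToolImpl for CountingTool {
    fn invoke(&mut self) {
        self.invocations += 1;
    }
    fn execute(&mut self) -> String {
        format!("invoked {} time(s)", self.invocations)
    }
}

fn main() {
    let mut tools = vec![
        SomeTool { impl_: Box::new(StatelessTool) },
        SomeTool { impl_: Box::new(CountingTool { invocations: 0 }) },
    ];
    for tool in tools.iter_mut() {
        tool.invoke();
        println!("{}", tool.execute());
    }
}
```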