What is `offset` in wasmtime Memory.read?


Problem

I've got wasmtime up-and-running, calling a TinyGo WASM/WASI module from a Rust host. All is well until I try to return a string from the Go WASI module, which seems like something everyone struggles with. I understand the concept of accessing the WASM module's memory at a particular location and reading for a particular length; what I don't understand is how to do that with an offset instead of a pointer.

I think a clarification of wasmtime's own example from its docs may point me in the right direction:

use wasmtime::{Memory, Store, MemoryAccessError};

fn safe_examples(mem: Memory, store: &mut Store<()>) -> Result<(), MemoryAccessError> {
    let offset = 5;
    mem.write(&mut *store, offset, b"hello")?;
    let mut buffer = [0u8; 5];
    mem.read(&store, offset, &mut buffer)?;
    assert_eq!(b"hello", &buffer);

    assert_eq!(&mem.data(&store)[offset..offset + 5], b"hello");
    mem.data_mut(&mut *store)[offset..offset + 5].copy_from_slice(b"bye!!");

    Ok(())
}

Questions

  1. What is offset? My impression is that it is NOT a pointer address, but a usize offset from the beginning of the WASM module's memory.
  2. Assuming that is correct, how do I get the offset of a particular variable? I see plenty of examples where an arbitrary value (say 5 or 10) is used, but in my own examples, anything greater than 0 segfaults. I think I may be misunderstanding what offset is.
  3. Does shared WASM memory need to be allocated by the host? My assumption has been that the WASM module itself would expand its own memory naturally (as it would if running natively). If I have to allocate on the host, how can I be sure of how much memory to allocate, if it's the WASM module that is creating the variables that use the memory?

Answer by linuskmr (best answer)

Regarding your questions

  1. Yes, offset is an offset from the beginning of the WASM memory. It is like a pointer inside the WASM memory. Trying to access offset as a normal pointer inside the host Rust application will likely result in a segfault.
  2. Variables of your host Rust application are separated from the WASM memory. They are not automatically part of the WASM memory's contents. This is the whole trick with WASM. Everything has to be explicitly copied into the WASM memory, or read from the WASM memory and written to a variable of the host application.
  3. If you need memory inside of the WASM instance, you have to manually allocate it there, e.g. by using an exported malloc() function (more on that later). For every allocation, you need to know how many bytes you need.
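
To make the first point more concrete, here is a minimal sketch that models WASM linear memory as a plain byte vector. It does not use wasmtime at all: `LinearMemory` and its `read`/`write` methods are hypothetical names that only mirror the shape of `Memory::read` and `Memory::write`, under the assumption that an offset behaves like an index into that vector.

```rust
/// Hypothetical stand-in for a WASM linear memory: just a byte array.
struct LinearMemory {
    bytes: Vec<u8>,
}

impl LinearMemory {
    /// Copy `buffer.len()` bytes starting at `offset` out of the memory.
    /// Returns `None` instead of crashing when the range is out of bounds.
    fn read(&self, offset: usize, buffer: &mut [u8]) -> Option<()> {
        let end = offset.checked_add(buffer.len())?;
        buffer.copy_from_slice(self.bytes.get(offset..end)?);
        Some(())
    }

    /// Copy `data` into the memory starting at `offset`.
    fn write(&mut self, offset: usize, data: &[u8]) -> Option<()> {
        let end = offset.checked_add(data.len())?;
        self.bytes.get_mut(offset..end)?.copy_from_slice(data);
        Some(())
    }
}

fn main() {
    // One 64 KiB page, the unit in which WASM memories are sized.
    let mut mem = LinearMemory { bytes: vec![0u8; 65536] };

    // Offset 5 simply means "the sixth byte of the memory".
    mem.write(5, b"hello").expect("in bounds");
    let mut buffer = [0u8; 5];
    mem.read(5, &mut buffer).expect("in bounds");
    assert_eq!(&buffer, b"hello");

    // An offset past the end is reported as an error, not dereferenced.
    assert!(mem.read(65534, &mut buffer).is_none());
}
```

Seen this way, an offset is only meaningful relative to one particular memory, which is why the host cannot treat it as one of its own pointers.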

How to read a string from a TinyGo WASM module

  1. TinyGo encodes a string as a (ptr, len) tuple when crossing the WASM boundary [0, 1].
  2. Since returning multiple values from a WASM function is a rather new feature [2], TinyGo uses a workaround (likely for backwards compatibility): instead of returning a (ptr, len) tuple, it asks you to pass a pointer to a free memory segment/buffer where it can store the (ptr, len) tuple. Because ptr and len are each a 4-byte i32, you need to pass an 8-byte buffer.
  3. Where to get the buffer from? You need to allocate it first, and this has to happen inside the WASM memory, so you need to call the exported malloc function of the module.
  4. Now you can call the function returning the string while passing the buffer as an argument.
  5. Then you have to read the (ptr, len) tuple from the WASM memory.
  6. Finally, you read [ptr..ptr+len] from the WASM memory and convert the bytes to a Rust String.
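
Steps 5 and 6 can be sketched without any runtime by treating the WASM memory as a byte slice. This is only an illustration under that assumption: `read_i32` and `read_go_string` are hypothetical helpers, and the malloc and function calls from steps 3 and 4 are replaced by writing the bytes into a fake memory by hand.

```rust
use std::convert::TryFrom;

/// Read a little-endian i32 at `offset`, or `None` if it is out of bounds.
fn read_i32(memory: &[u8], offset: usize) -> Option<i32> {
    let bytes = memory.get(offset..offset.checked_add(4)?)?;
    Some(i32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Read the (ptr, len) pair stored at `header_addr`, then the string bytes
/// it points to. `memory` stands in for the WASM linear memory.
fn read_go_string(memory: &[u8], header_addr: usize) -> Option<String> {
    // Negative values cannot be valid offsets or lengths, so reject them.
    let ptr = usize::try_from(read_i32(memory, header_addr)?).ok()?;
    let len = usize::try_from(read_i32(memory, header_addr.checked_add(4)?)?).ok()?;
    let bytes = memory.get(ptr..ptr.checked_add(len)?)?;
    String::from_utf8(bytes.to_vec()).ok()
}

fn main() {
    // Fake memory: string bytes at offset 32, the 8-byte header at offset 8.
    let mut memory = vec![0u8; 64];
    let text = b"hello";
    memory[32..32 + text.len()].copy_from_slice(text);
    memory[8..12].copy_from_slice(&32i32.to_le_bytes());
    memory[12..16].copy_from_slice(&(text.len() as i32).to_le_bytes());

    assert_eq!(read_go_string(&memory, 8).as_deref(), Some("hello"));
}
```

Returning `None` for out-of-range or negative values is a design choice of this sketch; the values come from the guest module, so the host should not trust them blindly.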

A simple example:

  1. Create a basic TinyGo WASM module exporting a function returning a string:
package main

//export ReturnString
func ReturnString() string {
    return "hello from TinyGo/WASM"
}

func main() {}

Compile it to WASM using TinyGo: tinygo build -o return_string.wasm -target wasm ./return_string.go

  2. Rust code:
use wasmtime::*;
use wasmtime_wasi::sync::WasiCtxBuilder;
use std::mem;

/// Go's string representation for export.
/// 
/// According to <https://tinygo.org/docs/concepts/compiler-internals/datatypes/#string> and
/// <https://github.com/tinygo-org/tinygo/blob/731532cd2b6353b60b443343b51296ec0fafae09/src/runtime/string.go#L10-L13>
#[derive(Debug)]
#[repr(C)]
struct GoStringParameters {
    ptr: i32,
    len: i32,
}

fn main() {
    // Create wasmtime runtime with WASI support, according to <https://docs.wasmtime.dev/examples-rust-wasi.html#wasirs>
    let engine = Engine::default();
    let module = Module::from_file(&engine, "../return_string.wasm").expect("Create module");
    let mut linker = Linker::new(&engine);
    let wasi = WasiCtxBuilder::new()
        .inherit_stdio()
        .inherit_args().expect("WASI: inherit args")
        .build();
    let mut store = Store::new(&engine, wasi);
    wasmtime_wasi::add_to_linker(&mut linker, |s| s).expect("Add WASI to linker");
    let instance = linker.instantiate(&mut store, &module).expect("Create instance");


    // malloc a GoStringParameters in WASM memory
    let go_str_addr = {
        let malloc = instance.get_func(&mut store, "malloc").expect("Couldn't get malloc function");
        let mut result = [wasmtime::Val::I32(0)];
        malloc.call(&mut store, &[wasmtime::Val::I32(mem::size_of::<GoStringParameters>() as i32)], &mut result).expect("malloc GoStringParameters");
        result[0].unwrap_i32()
    };

    // Call ReturnString() and pass a pointer where it should store the GoStringParameters
    let wasm_return_string_function = instance.get_func(&mut store, "ReturnString").expect("Couldn't get function");
    wasm_return_string_function.call(&mut store, &[wasmtime::Val::I32(go_str_addr)], &mut []).expect("Call ReturnString");

    // Read the GoStringParameters from WASM memory
    let mut buf = [0u8; mem::size_of::<GoStringParameters>()];
    let mem = instance.get_memory(&mut store, "memory").unwrap();
    mem.read(&mut store, go_str_addr as usize, &mut buf).expect("Get WASM memory");
    // SAFETY: This hack (mem::transmute) only works on little endian machines, because WASM memory is always in little endian
    let go_str_parameters: GoStringParameters = unsafe { mem::transmute(buf) };
    dbg!(&go_str_parameters);
    
    // Read the actual bytes of the string from WASM memory
    let mut str_bytes = vec![0u8; go_str_parameters.len as usize];
    mem.read(&mut store, go_str_parameters.ptr as usize, &mut str_bytes).expect("Read string bytes");
    let rust_str = String::from_utf8(str_bytes).unwrap();
    dbg!(rust_str);

    // TODO: Call exported free() function on the GoStringParameters address
}
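
If you would rather avoid the `mem::transmute` hack, the 8-byte buffer can also be decoded explicitly. The following is a sketch of one possible alternative, not part of the program above: `decode_go_string_parameters` is a hypothetical helper, and the example bytes are the little-endian encoding of ptr = 65736 and len = 22.

```rust
#[derive(Debug, PartialEq)]
struct GoStringParameters {
    ptr: i32,
    len: i32,
}

/// Decode the 8-byte buffer as two little-endian i32 values.
/// Unlike transmute, this gives the same result on big-endian hosts.
fn decode_go_string_parameters(buf: [u8; 8]) -> GoStringParameters {
    GoStringParameters {
        ptr: i32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]),
        len: i32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
    }
}

fn main() {
    // 65736 = 0x000100C8 and 22 = 0x00000016, least significant byte first.
    let buf = [0xC8, 0x00, 0x01, 0x00, 0x16, 0x00, 0x00, 0x00];
    let params = decode_go_string_parameters(buf);
    assert_eq!(params, GoStringParameters { ptr: 65736, len: 22 });
}
```

With this approach the struct no longer needs `#[repr(C)]`, because its in-memory layout is never relied upon.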

Output:

$ cargo run -q --release
[src/main.rs:36] &go_str_parameters = GoStringParameters {
    ptr: 65736,
    len: 22,
}
[src/main.rs:42] rust_str = "hello from TinyGo/WASM"
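
Note that the printed ptr is a plain offset into the WASM memory. As a small worked example, assuming the standard WASM page size of 64 KiB, the sketch below splits that offset into a page index and a position within the page; `locate` is a hypothetical helper used only for this arithmetic.

```rust
/// Size of one WASM memory page in bytes (64 KiB).
const WASM_PAGE_SIZE: usize = 65536;

/// Split an offset into (page index, byte offset within that page).
fn locate(offset: usize) -> (usize, usize) {
    (offset / WASM_PAGE_SIZE, offset % WASM_PAGE_SIZE)
}

fn main() {
    // ptr from the output above: 65736 = 1 * 65536 + 200.
    let (page, within) = locate(65736);
    println!("page {}, byte {}", page, within);
    assert_eq!((page, within), (1, 200));

    // The last byte of the 22-byte string lies in the same page.
    assert_eq!(locate(65736 + 22 - 1), (1, 221));
}
```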