@adlrocha - Playing with Wasmtime and Web Assembly's linear memory

Towards a Universal Runtime

I’ve been flirting for a while with the idea of building a Universal Runtime (a.k.a. the InterPlanetary Runtime). If you have been a loyal reader of this newsletter, you may have spotted glimpses of this idea in pieces like this one or this one. My recent research endeavors have allowed me to invest some time exploring this idea further. After some preliminary testing, one thing I am sure of by now is that any implementation of this Universal Runtime needs to leverage the developments being made in the WebAssembly ecosystem.

Today I won’t discuss the design of this “InterPlanetary Runtime” in detail (let’s leave that for when I have a clearer view of the system’s architecture). Instead, I’ll share some of the tests I’ve been doing with WebAssembly to validate the feasibility of my design ideas. So without further ado, let’s talk about WebAssembly’s linear memory, and Wasmtime.

Wasm and Wasmtime: Foundational modules of the system

We’ve discussed WebAssembly (Wasm) several times already in this newsletter. Wasm “is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications”. Initially, Wasm could only be run in the browser, but the release of WASI (the WebAssembly System Interface) opened the door to the seamless execution of Wasm binaries outside the browser.

“WASI is a modular system interface for WebAssembly. As described in the initial announcement, it’s focused on security and portability.”

So far so good. We have a universal bytecode, and a set of standards for the seamless execution of binaries over different target architectures. Now we need a runtime to execute these binaries, and here is where Wasmtime comes into play. Wasmtime is a small and efficient standalone runtime for WebAssembly developed by the Bytecode Alliance. Why Wasmtime? There are other Wasm runtimes out there, such as Wasmer, but the fact that the Bytecode Alliance is behind Wasmtime, and that the developer community seems more active on Wasmtime than on Wasmer (according to their repos’ GitHub activity), made me choose Wasmtime as the core runtime for the initial explorations of my system.

So we have the core runtime for our Universal Runtime, what’s next?

A rough diagram of the runtime module

In the following diagram I depicted a rough idea of how I imagine the core module of the InterPlanetary Runtime would look. The runtime would expect some data and the Wasm bytecode as inputs, and it would return the result of the execution as an output. Along with the bytecode we would have to provide some kind of ABI (Application Binary Interface) to specify the input data format the bytecode expects. For this project to be a success, any device should be able to embed the InterPlanetary Runtime and run the computational loads targeted at it (for now, a Wasm binary with some input data).

There are still many things to figure out in the design. Actually, my idea of an InterPlanetary Runtime is not limited to the execution of Wasm as a universal bytecode (that would be too easy); it is a slightly more ambitious idea. But again, let’s leave that for the near future. For now I just wanted to introduce the diagram below so you have some additional context for what comes next.

With this simple diagram in mind, I decided to start building a simple proof of concept to explore the feasibility of the model. The two main ideas I wanted to validate were:

  • The ability to embed a Wasm runtime (in my case wasmtime) in any (or almost any) device.

  • The feasibility of running a Wasm binary, introducing data into the Wasm runtime and fetching the result of the execution from the host machine.

Before jumping into the code, let’s introduce one last concept we need to understand for the proof of concept: Wasm’s linear memory.

Wasm’s Linear Memory as an interface

Wasm binaries expose a linear memory to the host. The linear memory is a contiguous buffer of unsigned bytes that can be read from and written to by both Wasm and the host’s runtime. In other words, Wasm memory is an expandable array of bytes that the host and Wasm can synchronously read and modify.

The linear memory can be used for many things, one of them being passing values back and forth between Wasm and the host. So the linear memory is exactly the interface we need to introduce input data into the Wasm binaries of our runtime.
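As a mental model (a pure-Rust sketch, not actual Wasmtime code), linear memory behaves like a byte vector that grows in 64 KiB pages and that both sides address by plain integer offsets:

```rust
// Conceptual sketch of linear memory: a byte array that grows in 64 KiB pages
// and that both the "guest" and the "host" address by integer offsets.
const PAGE: usize = 65_536; // Wasm page size

fn main() {
    let mut memory = vec![0u8; PAGE]; // starts with one page
    // The "guest" writes at an offset...
    memory[0x100..0x105].copy_from_slice(b"hello");
    // ...and the "host" reads the very same bytes, synchronously:
    assert_eq!(&memory[0x100..0x105], b"hello");
    // Growing adds whole pages at the end; existing offsets stay valid.
    memory.resize(memory.len() + PAGE, 0);
    assert_eq!(memory.len(), 2 * PAGE);
}
```

The offsets here are made up for illustration; the point is only that both sides see one shared, growable byte array.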

The proof of concept

This proof of concept allowed me to do two things: to validate the ideas mentioned above around the InterPlanetary Runtime, and to clearly understand how Wasm’s linear memory model works with Wasmtime (a key thing for future developments).

I decided to build a simple Rust application embedding the Wasmtime runtime. The application loads a Wasm module, introduces some data into the Wasm environment through the linear memory, runs a WebAssembly function, and fetches the result of the execution, again, using the linear memory. For the Wasm module I didn’t overthink it. I coded a simple function that takes a string as input, appends a new string to it, and returns the result as output. That’s it!

I thought this would be pretty straightforward, but many of the technologies I used aren’t extensively documented and I ended up facing several blockers. I won’t bore you with them and will jump straight to the code that worked, but do not hesitate to ping me if you want to chat about all the things I tried before reaching this “complete solution”.

The WebAssembly module

I included two functions in my Wasm module:

  • An “alloc” function to allocate memory in the Wasm runtime. The memory allocated will be used afterwards by the host application to introduce the input data for the execution. This function receives as input the size of the memory chunk to be allocated, and returns the address of the chunk in Wasm’s linear memory.

  • An “append” function which takes the input data and runs the actual code we want to execute. The function receives as inputs the address where the input data is stored in the linear memory, and the size of the data. The result of the function is the size of the output. This output is stored in the memory allocated through the “alloc” function (i.e. we use the same allocated memory chunk used to introduce the input data to communicate the result to the host machine).

In this case, as we are using a single function with a string as input, our ABI is just the address of the data and its size for the only function available in the module. We don’t need to give any additional information about the functions, inputs, or outputs of the Wasm module.

One of Wasm’s current limitations is its inability to use complex types as function inputs. Until the Interface Types standard is finalized, we can only use simple numeric types (such as i32) as inputs and outputs of Wasm functions. This is one of the reasons why I chose to use strings for the proof of concept (the ABI for strings is simple: the size of the string and the address where it is stored). There are already tools to ease the generation of interfaces for the use of complex data types between Wasm and the host system, such as wasm-bindgen in Rust. Unfortunately, wasm-bindgen can’t be used with Wasmtime in Rust (as far as I know), as the interfaces it automatically generates are meant to be used with JavaScript hosts (i.e. the browser).
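To make this constraint concrete, here is a minimal plain-Rust sketch (with a made-up offset, not real Wasmtime code) of how a string crosses the boundary as two integers:

```rust
// Sketch: only integer values cross the Wasm boundary, so a string travels
// as a pair of i32s: an offset into linear memory and a length.
fn main() {
    let mut memory = vec![0u8; 64]; // stand-in for linear memory
    let input = "hello";
    let ptr: i32 = 8; // offset a hypothetical `alloc` might have returned
    memory[ptr as usize..ptr as usize + input.len()].copy_from_slice(input.as_bytes());
    let len = input.len() as i32;
    // The callee receives only (ptr, len) and reconstructs the string:
    let s = std::str::from_utf8(&memory[ptr as usize..][..len as usize]).unwrap();
    assert_eq!(s, "hello");
}
```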

use std::mem;
use std::os::raw::c_void;
use std::ptr::copy;
use std::slice;
use std::str;

/* Allocate a chunk of memory of `size` bytes in the Wasm module. */
#[no_mangle]
pub extern "C" fn alloc(size: usize) -> *mut c_void {
    let mut buf: Vec<u8> = Vec::with_capacity(size);
    let ptr = buf.as_mut_ptr();
    // Leak the buffer so its memory stays allocated after this function returns.
    mem::forget(buf);
    ptr as *mut c_void
}

/* Append a suffix to the input string stored at `data_ptr`, write the result
   back to the same location, and return its new length. */
#[no_mangle]
pub extern "C" fn append(data_ptr: *mut c_void, size: u32) -> i32 {
    let slice = unsafe { slice::from_raw_parts(data_ptr as _, size as _) };
    let in_str = str::from_utf8(slice).unwrap();
    let mut out_str = String::new();
    out_str += in_str;
    out_str += "<---- This is your string";
    // Copy the result back into the caller-allocated buffer.
    unsafe { copy(out_str.as_ptr(), data_ptr as *mut u8, out_str.len()) };
    out_str.len() as i32
}

So we have the code of the Wasm module to be run in our Wasm runtime ready to go. We just have to compile it with the standard `cargo build --release --target=wasm32-wasi` or `cargo build --release --target=wasm32-unknown-unknown`, depending on whether we want to compile the binary with the WASI interface or not, respectively.

Embedding the runtime in Rust

The next step of the proof of concept was to build a simple Rust application to embed the runtime, load the binary and interact with it through the linear memory. This is the code for the Rust application:

use anyhow::Result;
use std::ptr::copy;
use wasmtime::*;

fn main() -> Result<()> {
    // Create our `Store` context, then compile the module and create an
    // instance from the compiled module all in one go.
    let wasmtime_store = Store::default();
    // Load the Wasm module.
    let module = Module::from_file(wasmtime_store.engine(), "../modules/memory.wasm")?;
    let instance = Instance::new(&wasmtime_store, &module, &[])?;
    // Expose the `alloc` function from the Wasm module.
    let alloc = instance
        .get_func("alloc")
        .ok_or(anyhow::format_err!("failed to find `alloc` export"))?
        .get1::<i32, i32>()?;
    // Get the linear memory.
    let memory = instance
        .get_memory("memory")
        .ok_or(anyhow::format_err!("failed to find `memory` export"))?;
    // Input string.
    let text = String::from("The input string");
    let size = text.len() as i32;
    // Allocate memory (with headroom for the result) and visualize it.
    let mem_ptr = alloc(size + 30)?;
    println!("Pointer received {:#x}, {}", mem_ptr, size);
    println!("Host memory pointer {:#x?}", memory.data_ptr());
    println!(
        "wasm allocated memory {:#x?}",
        unsafe { memory.data_unchecked_mut().as_ptr() }
    );
    // Translate the Wasm offset into a host address.
    let pointer = unsafe { memory.data_ptr().add(mem_ptr as usize) };
    println!("address for wasm object in rust: {:#x?}", pointer);
    // Copy the input data into linear memory.
    unsafe {
        let bytes = text.as_bytes();
        copy(bytes.as_ptr(), pointer, bytes.len());
    }
    // Expose the `append` function from the Wasm module.
    let append = instance
        .get_func("append")
        .ok_or(anyhow::format_err!("failed to find `append` export"))?
        .get2::<i32, i32, i32>()?;
    // Execute the function.
    let new_size = append(mem_ptr, size)?;
    println!("New Size received: {}", new_size);
    // Fetch the output data from linear memory.
    let byte3 = unsafe {
        String::from_utf8(
            memory.data_unchecked()[mem_ptr as usize..][..new_size as usize].to_vec(),
        )
        .unwrap()
    };
    println!("Result: {}", byte3);
    Ok(())
}

The flow of the program is simple:

  • We first prepare a new environment to run our Wasm module with `wasmtime::Store`, and we load the Wasm module `memory.wasm`. With this we have the runtime environment and the bytecode ready for execution.

  • We then expose the “alloc” function from the Wasm module and the linear memory to the host machine, through the `get_func("alloc")` and `get_memory("memory")` calls, respectively.

  • As input we use the following string: “The input string”. We compute the size of the string and use it as a baseline for allocating memory in Wasm’s linear memory. The execution of the bytecode will generate some additional data, and the allocated memory will also be used to store the result of the execution. This is why we allocate 30 bytes more than the size of the string in `alloc(size + 30)?;`: to accommodate the potential size of the result.

  • The output of the “alloc” function is the address of the allocated memory inside the Wasm environment’s linear memory. To store data in the linear memory from the host machine, we need to transform the Wasm address received from “alloc” into a host machine address. This is done by taking the base address of the linear memory and adding the Wasm pointer received (the Wasm environment’s linear memory always starts at address 0):

memory.data_ptr().add(mem_ptr as usize)

  • Memory allocated: ☑️ Linear memory exposed to the host machine: ☑️ Now we just need to copy the input data into the linear memory using this pointer. This is easily done using the copy function:

unsafe {
    let bytes = text.as_bytes();
    copy(bytes.as_ptr(), pointer, bytes.len());
}
  • We have the binary, we have the input data; let’s run some code over the data. We expose the “append” function to the host machine, and run it using as inputs the size of the string and the Wasm pointer to the data (now using its Wasm address and not its host address, i.e. the output of the “alloc” function). As a result of the function we receive the size of the output string.

  • The only thing left is to fetch the result from Wasm’s linear memory for its use in the host machine. This is straightforward: we just need to index the linear memory with the Wasm address where we placed the input data, using the size of the result to fetch it from memory.

let byte3 = unsafe {
    String::from_utf8(
        memory.data_unchecked()[mem_ptr as usize..][..new_size as usize].to_vec(),
    )
    .unwrap()
};
  • The application also includes some additional traces to visualize the Wasm linear memory and its addresses (the perfect way to understand the concepts presented in this publication). I highly recommend running the program and playing with the code to see the linear memory in action.
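If you want to reason about the flow above without a Wasm toolchain at hand, the whole interaction can be simulated in plain Rust. Everything below is a stand-in: a `Vec<u8>` plays the role of linear memory, and the two methods mirror the module’s `alloc` and `append` exports.

```rust
// Conceptual simulation of the host/module interaction: a Vec<u8> stands in
// for Wasm linear memory, and the two methods mirror the module's exports.
struct LinearMemory(Vec<u8>);

impl LinearMemory {
    // Mirrors the module's `alloc`: reserve `size` bytes, return their offset.
    fn alloc(&mut self, size: usize) -> usize {
        let ptr = self.0.len();
        self.0.resize(ptr + size, 0);
        ptr
    }
    // Mirrors `append`: read the input string at (ptr, len), append a suffix,
    // write the result back to the same location, return the new length.
    fn append(&mut self, ptr: usize, len: usize) -> usize {
        let input = std::str::from_utf8(&self.0[ptr..ptr + len]).unwrap().to_string();
        let output = format!("{}<---- This is your string", input);
        self.0[ptr..ptr + output.len()].copy_from_slice(output.as_bytes());
        output.len()
    }
}

fn main() {
    let mut mem = LinearMemory(Vec::new());
    let text = "The input string";
    // Host side: allocate with extra headroom, then copy the input in.
    let ptr = mem.alloc(text.len() + 30);
    mem.0[ptr..ptr + text.len()].copy_from_slice(text.as_bytes());
    // "Call" the module, then read the result back from the same region.
    let new_len = mem.append(ptr, text.len());
    let result = std::str::from_utf8(&mem.0[ptr..ptr + new_len]).unwrap();
    println!("{}", result);
    // → The input string<---- This is your string
}
```

Of course this skips the real subtlety (the host/Wasm address translation), but it captures the contract between the two sides: one shared buffer, one offset, two lengths.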

The output of the Rust application should look something like this:

Wrapping up!

The more I dive into WebAssembly’s internals, and the more I play with Rust, the more potential I see in these technologies. I am growing increasingly excited about the idea of an InterPlanetary Runtime and the impact it may have, so expect more publications exploring this idea. See you next week!

PS: Substack’s code formatting is horrible. This is one of the reasons why I started adlrocha.github.io. All my publications will be available there the Monday after the release of the newsletter, so if you can’t stand this code formatting, see you tomorrow at adlrocha.github.io.