Snowhash in Rust with WASM

Published on to joshleeb's blog

A little while ago I came across this Snowflake generator. It’s a project by Raph Levein that takes a hash string and uses it to procedurally generate a unique snowflake. He explains that

the original motivation was as a cryptographically secure visual hash, so that people would reliably be able to tell by visual inspection whether two hashes were identical.

I thought that was a pretty cool idea. Having someway to quickly tell if two similar strings were distinct has some real benefits. But to be honest, I thought the snowflakes that were generated were the coolest part (pun completely intended).

I set about rewriting the project in Rust using the algorithm Raph described in a follow up article he posted. As I was writing it I thought generating snowflakes would also be a great way to play around with WASM.

Targeting WASM from Rust code has been something I’ve been meaning to check out. And this project was perfect for it. The Rust code could handle all the procedural generation stuff, and then I could write a small bit of JS to call the WASM function and draw each point onto a Canvas.

impl Snowhash

Let’s start from the top: the main generate function.

/// Generate points from the provided hash string.
pub fn generate(hash: &str) -> Vec<Point> {
    let seed: &[_] = &[hash_sum(hash)];
    let mut rng: StdRng = SeedableRng::from_seed(seed);

    let mut open = vec![Point::origin()];
    let mut closed = vec![];
    let mut filled = vec![];

    for bit in BitStr::from_str(hash) {
        let index = rng.gen_range(0, open.len());
        let point = open.remove(index);

        if point.on_axis() || bit.as_bool() {
            let mut reflection = point.reflection();
            filled.append(&mut reflection);

            let mut extension = extend(&point, &closed);
            open.append(&mut extension);
        }
        closed.push(point);
    }
    filled
}

That’s not a lot of code, but let’s run through it bit by bit.

let seed: &[_] = &[hash_sum(hash)];
let mut rng: StdRng = SeedableRng::from_seed(seed);

First we seed the random number generator (RNG). hash_sum takes the u8 value of each character in hash and adds them all together. I found that this resulted in greater changes between two similar hashes, when compared to having a fixed seed.

We also don’t want this to be a non-reproducible seed, such as Unix time since epoch. Using that would mean that the same hash would produce two distinct snowflakes. It would still look cool, but that would go against the whole point of generating the snowflake from the hash.

Next, we iterate through all the bits in the hash with BitStr::from_str(hash). I’ll come back to the BitStr implementation a bit later.

let index = rng.gen_range(0, open.len());
let point = open.remove(index);

And for each iteration we get a random point from open and remove it from the vector.

if point.on_axis() || bit.as_bool() {
    let mut reflection = point.reflection();
    filled.append(&mut reflection);

    let mut extension = extend(&point, &closed);
    open.append(&mut extension);
}
closed.push(point);

The final section is the loop iteration. This is just the algorithm that Raph described. The extend function grabs all neighbors of point, in our specific slice, that we haven’t seen yet (isn’t in closed).

At the end of this function we return filled: Vec<Point> which is, as the name suggests, all the points we need to fill to show the snowflake.

Point

The Point struct is exactly what you’d expect. It’s a just a tuple struct of two i32s.

pub struct Point(i32, i32);

It also has a bunch of methods to make the generate function a bit cleaner, namely Point::reflection and Point::neighbours.

BitStr

Implementing BitStr required a bit more thought. Here’s the struct.

struct BitStr<'a> {
    bytes: Bytes<'a>,
    cur: Option<u8>,
    cur_idx: u8,
}

bytes are the bytes we want to iterate though. cur is the current byte, or another way of putting it is the most recent byte taken from bytes. And cur_inx is the index into cur, used for extracting individual bits from the byte.

This struct just implements the Iterator trait so we can iterated over all the bits. Then, for each iteration we grab the next bit, with some bit shifting. If we find we have gone through all the bits in cur then we reset cur_idx and grab the next byte from bytes.

impl<'a> Iterator for BitStr<'a> {
    type Item = Bit;

    fn next(&mut self) -> Option<Self::Item> {
        if self.cur.is_none() || self.cur_idx >= 8 {
            self.cur = self.bytes.next();
            self.cur_idx = 0;
        }

        if let Some(byte) = self.cur {
            let bit = Bit::from_u8((byte << self.cur_idx) & 128);
            self.cur_idx += 1;
            return Some(bit);
        }
        None
    }
}

Bit

If I’m implementing my own BitStr I may as well implement my own Bit as well. For this I also wanted to take advantage of Rust’s empty tuple, which has a size of 0.

struct Bit(Option<()>);

Even though we’re now using a Rust specific type, the Bit implementation is still super simple. If bit.0.is_some() then it’s a high bit, and if bit.0.is_none() then it’s a low bit.

impl From for Png

That’s the entire implementation. Not too much logic for some really interesting snowflakes. Now I just need to display them.

Before jumping straight into compiling to WASM, I went for something a bit more familiar and rendered the points as a PNG. For this I used cairo-rs, a wrapper around the Cairo Graphics clib.

Coming this far we get some pretty interesting looking snowflakes, like this one.

$ snowhash_png 2552a8f69a6d45dbd786a8452d14dc841c665f04e763bb2c6b2157b8b95a231a

Or this one from a randomly generated hash.

$ snowhash_png dd76ecd83a8b238b58507b13096f6727cabf33ad90530c9a97de07b55d3efa3b

Plug those hashes into the Snowhash generator on this site and you should see the exact same snowflake. But this generator makes no guarantees between machines when running snowhash_png.

To achieve consistency between snowhash_wasm, and snowhash_png on multiple devices I imagine all that is required is some small changes to the RNG. But that’s something to look into another day.

impl From for WASM

For now, onto WASM.

This turned out to be much easier than I was expecting. In fact, excluding imports and a helper function, this is all that we need.

#[derive(Serialize)]
struct CanvasPoint(f64, f64);

#[wasm_bindgen]
pub fn generate(hash: &str) -> String {
    let points: Vec<CanvasPoint> = snowhash::generate(hash)
        .iter()
        .map(|point| hex_to_cartesian(point.x() as f64, point.y() as f64))
        .map(|(x, y)| CanvasPoint(x, y))
        .collect();
    serde_json::to_string(&points).unwrap()
}

I created a new struct CanvasPoint just so I didn’t have to make the Snowhash library depend on serde.

And the generate function just takes the points from snowhash::generate(hash), converts them from Hex grid coordinates into Cartesian grid coordinates, and turns the resulting Vec into a JSON list.

The #[wasm_bindgen] macro above the generate function tells the rust compiler that we want to make this function available as part of the WASM module. Without it, the JS code wouldn’t be able to call snowhash.generate().

Something else worth mentioning is that WASM, at the time of writing, only supports numeric types. I thought that this would allow me to return a Vec<(f64, f64)> from generate but that isn’t supported just yet.

As a workaround I went with serializing Vec<CanvasPoint> to JSON. In JS land, there’s already a module to parse JSON strings without needing to include any external libs.

Reducing the Bundle

Compiling to WASM and adding some JS to draw the hexagons to the canvas worked out pretty well. Except that the bundle started out as being ~710KB.

I’d heard that this is a bit of an issue. There seems to be some good work going into improving this but it’s still a fair way to go.

Still I wanted to reduce the size at least a little if I could.

I was surprised by how little effort I had to put in to drop the bundle size by nearly 50%, to 358KB. All I did was install wasm-opt, as part of the WebAssembly/binaryen toolchain, and run wasm-opt -Os snowhash.wasm. The -Os flag tells wasm-opt to optimize for size over speed and, well it seemed to do the job.

Snowhash

So after that explanation you can find the snowhash crate here. And you can also play around with it right in your browser here.