Thinking About an ECS in Rust

Published on 2018-04-18 to joshleeb's blog

A few weeks ago I started looking into Game Development and exploring building games in Rust. Specifically I’ve been thinking about ECS game engines.

These are just some initial thoughts and ideas. I’m not nearly at a stage to start building generic game engines for people to actually make games with. This is more of a project to explore game engines, maybe build a game, but most importantly write some Rust.

Entities

Entities are really simple. To start with each entity is just going to be a unique ID.

Initially I thought about generating a UUID, and there are some other methods out there for generating unique IDs. But I’m currently leaning towards using a u32, and incrementing and reusing IDs.

2^32 - 1 = 4,294,967,295 which is a pretty big number. And whatever engine I build is likely not going to need to handle that many simultaneous entities (even if it could handle that many). So using a ID type of u16 is probably even enough.

The reason this matters is CPU caching and prefetching. The less memory this ID takes up, the more entity IDs can be prefetched and kept in cache. This means faster iterations over lists of entities.

So setting up an Entity, with an EntityCreator, to handle the IDs might look something like:

type EntityId = u32;

struct Entity {
    eid: EntityId,
}

And to create each entity, we can have an EntityCreator:

pub struct EntityCreator {
    max_eid: EntityId,
}

impl EntityCreator {
    pub fn new() -> Self {
        EntityCreator { max_eid: 0 }
    }

    pub fn create(&mut self) -> Entity {
        let new_entity = Entity {
            eid: self.max_eid,
            components: vec![],
        };
        self.max_eid += 1;
        new_entity
    }
}

This implementation doesn’t reuse IDs, but that’s something that could be added pretty easily by giving the EntityCreator a queue of IDs.

When an Entity is killed, add it’s ID to queue. If the queue is empty, resume using the max_eid as shown above.

Components

Structuring components is what I’ve been thinking about most so far. Which is probably pretty obvious from comparing the lengths of these sections.

I’ve been trying to hit three targets:

Efficiently add a component to an entity.
Efficiently remove a component from an entity.
Store component data contiguously in memory to take advantage of prefetching.

This last one has been the trickiest because it removes using data structures such as maps, and requires that the underlying data is stored in a Vec or similar.

Here’s where I started, for the Positionable components:

struct Position {
    pos_x: i32,
    pos_y: i32,
}

struct Positionable {
    data: Vec<Position>
}

impl Positionable {
    pub fn new() -> Self {
        Positionable {
            entities: Vec::with_capacity(128),
        }
    }
}

And similarly for the Moveable component with

struct Moveable {
    data: Vec<Movement>
}

Mutate/Iterate Tradeoff

I was thinking about how a system would use these two components. Maybe it’s called MoveSystem and wants to operate on entities that have at least the Moveable and Positionable components.

For convenience, let’s call an instance of the Positionable struct positionable and let N = positionable.data.len(). Similarly, let’s call an instance of the Moveable struct moveable and let M = moveable.data.len().

So, the ideal would be:

Mutate (add or remove) in O(1).
Iterate over both components in O(min(N, M)).

But keep in mind that we need entity IDs to line up when iterating. Otherwise we could see the position of one entity, and modify the position of a completely different entity on each iteration.

Ordered by Entity ID

One assumption we could make is that positionable.data and moveable.data are sorted by entity ID. Iteration then becomes fairly trivial and it satisfies our ideal for iterating in O(min(N, M)).

But it does require that we update our component data structs:

struct Position {
    eid: EntityId,
    pos_x: i32,
    pos_y: i32,
} // And similarly for the Movement struct.

Then we setup the iteration as:

let p = 0;
let m = 0;
while p < N && m < M {
    if positionable.data[p].eid < movable.data[m].eid {
        p += 1
    } else if positionable.data[p].eid > moveable.data[m].eid {
        m += 1
    } else {
        // Entity has both components so do some stuff...
        p += 1
        m += 1
    }
}

But this comes at a cost as we have to keep positionable.data and moveable.data sorted, which is at least O(nlogn) every time we add or remove a component from an entity, where n is the size of the Vec being sorted.

So this approach favors iterating, but takes a pretty big hit on mutating.

Unordered

If we try to use our component managers as they are, iterating over two or more unordered component vectors is really slow.

Naively, we would start by iterating over all the elements in one Vec, say positionable.data. Then, for each iteration, we would have to find the entry in moveable.data with the same entity ID.

for pos in positionable.data {
    for mov in moveable.data {
        if pos.eid == mov.eid {
            // Entity has both components so do some stuff...
        }
    }
}

This has a runtime complexity of O(NM). But I think by modifying our Entity struct a little bit we can do much better.

I though about starting with an enum for the components.

enum Component {
    Positionable(usize),
    Moveable(usize),
}

Here, each enum variant has a parameter of type usize. This parameter is the index of the entity in the component manager’s data: Vec.

To use this to our advantage, let’s add it to the Entity struct, and (once again) update our structs.

pub struct Entity {
    eid: EntityId,
    components: Vec<Component>,
}

pub struct Position {
    entity: Entity, // Should be Rc<RefCell<Entity>>.
    pos_x: i32,
    pos_y: i32,
} // And similarly for the Movement struct.

Great, so now that we don’t care about ordering we can really easily hit our ideal for adding and removing components from entities.

Adding a component is simply positionable.data.push(...), which is O(1). And removing a component is positionable.data.swap_remove(...), which is also O(1).

But iterating is now a little trickier to do.

Just like in the naive solution, we want to iterate over the shortest list. Then for each entry, instead of searching we can look up the index of the other components we care about from the entity.

// Assuming N < M.
for entry in positionable.data {
    use Component::*;

    if let Some(Movement(idx)) = entry.Entity.get_component(Movement(_)) {
        let movement_entry = moveable.data[idx];
        // Entity has both components so do some stuff...
    }
}

Now we also get our ideal runtime complexity for iteration! It’s not as performant as the sorted data vectors, but the benchmarks show it to be pretty damn close!

Systems

I haven’t given systems a whole lot of thought just yet but so far I’ve been using simple structs to represent them.

pub struct MovementSystem {
    positionable: Positionable,
    moveable: Moveable,
}

Then when we want to perform actions on entities with both the Positionable and Moveable components, we can do as I described above, depending on if your component manager list is ordered or not.

I’m sure there’s a more elegant way to represent these in Rust, but I’ll have to think about that some more in the future.

Quick Note on the Borrow Checker

So far, I’ve omitted any mention of the borrow checker, and any code that would make it not explode. So if you try to compile these structs with an actual implementation, Rust will yell at you. A lot.

For example, in the unordered scenario Position and Movement must contain an Entity. But we don’t want it to be a copy, we want it to be entity: Rc<RefCell<Entity>>.

Rc is a reference counter allowing for multiple shared references, but not allowing mutability. And RefCell allows for the Interior Mutability Pattern.

I’m sure there’s also a more elegant way of doing this but so far it’s meant I can get this working without putting explicit lifetime parameters everywhere. But it does now mean that the actual implementation is littered with .borrow() and .borrow_mut() calls everywhere.

Ordered/Unordered Benchmark

Iterating through two unordered component managers, is analogous to a much simple implementation that can be easily benchmarked.

We start with some vectors, each containing an Entry struct.

struct Entry {
    value: usize,
    index: usize,
}

Then we iterate through all the vectors and add up all entry.value.

For the ordered benchmark we iterate through one vector and grab elements from both vectors. This isn’t the exact same as iterating through components ordered by entity Id but it’s pretty close.

fn ordered(a1: &Vec<Entry>, a2: &Vec<Entry>) -> usize {
    let mut sum = 0;
    for i in 0..a1.len() {
        sum += a1[i].value + a2[i].value;
    }
    sum
}

And the unordered benchmark we iterate through one vector and jump around the second vector based on the entry.index.

fn unordered(a1: &Vec<Entry>, a2: &Vec<Entry>) -> usize {
    let mut sum = 0;
    for entry in a1 {
        sum += entry.value + a2[entry.index].value;
    }
    sum
}

There’s also some logic to populate the vectors. But this won’t be included in the benchmark.

fn setup(n: usize) -> (Vec<Entry>, Vec<Entry>) {
    let mut a1 = Vec::with_capacity(n);
    let mut a2 = Vec::with_capacity(n);

    for i in 0..n {
        a1.push(Entry {
            value: i,
            index: (n - 1) - i,
        });
        a2.push(Entry { value: i, index: 0 });
    }
    (a1, a2)
}

And finally the benchmarks themselves:

#[bench]
fn bench_ordered(b: &mut Bencher) {
    let (a1, a2) = setup(MAX_ITEMS);
    b.iter(|| ordered(&a1, &a2));
}

#[bench]
fn bench_unordered(b: &mut Bencher) {
    let (a1, a2) = setup(MAX_ITEMS);
    b.iter(|| unordered(&a1, &a2));
}

Results

Just for fun, I included a HashMap implementation to benchmark as well.

`N_VECS = 2, MAX_ITEMS = 1000`

test tests::bench_hashmap   ... bench:      24,785 ns/iter (+/- 4,563)
test tests::bench_ordered   ... bench:         558 ns/iter (+/- 63)
test tests::bench_unordered ... bench:         575 ns/iter (+/- 98)

`N_VECS = 2, MAX_ITEMS = 10000`

test tests::bench_hashmap   ... bench:     296,271 ns/iter (+/- 35,722)
test tests::bench_ordered   ... bench:       5,989 ns/iter (+/- 749)
test tests::bench_unordered ... bench:       9,286 ns/iter (+/- 2,877)

`N_VECS = 3, MAX_ITEMS = 1000`

test tests::bench_hashmap   ... bench:      45,850 ns/iter (+/- 6,411)
test tests::bench_ordered   ... bench:         843 ns/iter (+/- 79)
test tests::bench_unordered ... bench:         854 ns/iter (+/- 90)

`N_VECS = 3, MAX_ITEMS = 10000`

test tests::bench_hashmap   ... bench:     575,907 ns/iter (+/- 85,579)
test tests::bench_ordered   ... bench:       8,959 ns/iter (+/- 1,305)
test tests::bench_unordered ... bench:      10,693 ns/iter (+/- 1,815)