Published on to joshleeb's blog
A few weeks ago I started looking into Game Development and exploring building games in Rust. Specifically I’ve been thinking about ECS game engines.
These are just some initial thoughts and ideas. I’m not nearly at a stage to start building generic game engines for people to actually make games with. This is more of a project to explore game engines, maybe build a game, but most importantly write some Rust.
Entities
Entities are really simple. To start with each entity is just going to be a unique ID.
Initially I thought about generating a UUID
, and there are some other
methods out there for generating unique IDs. But I’m currently leaning towards
using a u32
, and incrementing and reusing IDs.
2^32 - 1 = 4,294,967,295
which is a pretty big number. And whatever engine I
build is likely not going to need to handle that many simultaneous entities
(even if it could handle that many). So using a ID type of u16
is probably
even enough.
The reason this matters is CPU caching and prefetching. The less memory this ID takes up, the more entity IDs can be prefetched and kept in cache. This means faster iterations over lists of entities.
So setting up an Entity
, with an EntityCreator
, to handle the IDs might
look something like:
type EntityId = u32;
struct Entity {
eid: EntityId,
}
And to create each entity, we can have an EntityCreator
:
pub struct EntityCreator {
max_eid: EntityId,
}
impl EntityCreator {
pub fn new() -> Self {
EntityCreator { max_eid: 0 }
}
pub fn create(&mut self) -> Entity {
let new_entity = Entity {
eid: self.max_eid,
components: vec![],
};
self.max_eid += 1;
new_entity
}
}
This implementation doesn’t reuse IDs, but that’s something that could be
added pretty easily by giving the EntityCreator
a queue of IDs.
When an Entity
is killed, add it’s ID to queue. If the queue is empty,
resume using the max_eid
as shown above.
Components
Structuring components is what I’ve been thinking about most so far. Which is probably pretty obvious from comparing the lengths of these sections.
I’ve been trying to hit three targets:
- Efficiently add a component to an entity.
- Efficiently remove a component from an entity.
- Store component data contiguously in memory to take advantage of prefetching.
This last one has been the trickiest because it removes using data structures
such as maps, and requires that the underlying data is stored in a Vec
or
similar.
Here’s where I started, for the Positionable
components:
struct Position {
pos_x: i32,
pos_y: i32,
}
struct Positionable {
data: Vec<Position>
}
impl Positionable {
pub fn new() -> Self {
Positionable {
entities: Vec::with_capacity(128),
}
}
}
And similarly for the Moveable
component with
struct Moveable {
data: Vec<Movement>
}
Mutate/Iterate Tradeoff
I was thinking about how a system would use these two components. Maybe it’s
called MoveSystem
and wants to operate on entities that have at least the
Moveable
and Positionable
components.
For convenience, let’s call an instance of the Positionable
struct
positionable
and let N = positionable.data.len()
. Similarly, let’s call an
instance of the Moveable
struct moveable
and let M = moveable.data.len()
.
So, the ideal would be:
- Mutate (add or remove) in
O(1)
. - Iterate over both components in
O(min(N, M))
.
But keep in mind that we need entity IDs to line up when iterating. Otherwise we could see the position of one entity, and modify the position of a completely different entity on each iteration.
Ordered by Entity ID
One assumption we could make is that positionable.data
and moveable.data
are sorted by entity ID. Iteration then becomes fairly trivial and it
satisfies our ideal for iterating in O(min(N, M))
.
But it does require that we update our component data structs:
struct Position {
eid: EntityId,
pos_x: i32,
pos_y: i32,
} // And similarly for the Movement struct.
Then we setup the iteration as:
let p = 0;
let m = 0;
while p < N && m < M {
if positionable.data[p].eid < movable.data[m].eid {
p += 1
} else if positionable.data[p].eid > moveable.data[m].eid {
m += 1
} else {
// Entity has both components so do some stuff...
p += 1
m += 1
}
}
But this comes at a cost as we have to keep positionable.data
and
moveable.data
sorted, which is at least O(nlogn)
every time we add or
remove a component from an entity, where n
is the size of the Vec
being
sorted.
So this approach favors iterating, but takes a pretty big hit on mutating.
Unordered
If we try to use our component managers as they are, iterating over two or more unordered component vectors is really slow.
Naively, we would start by iterating over all the elements in one Vec
, say
positionable.data
. Then, for each iteration, we would have to find the entry
in moveable.data
with the same entity ID.
for pos in positionable.data {
for mov in moveable.data {
if pos.eid == mov.eid {
// Entity has both components so do some stuff...
}
}
}
This has a runtime complexity of O(NM)
. But I think by modifying our
Entity
struct a little bit we can do much better.
I though about starting with an enum
for the components.
enum Component {
Positionable(usize),
Moveable(usize),
}
Here, each enum
variant has a parameter of type usize
. This parameter is
the index of the entity in the component manager’s data: Vec
.
To use this to our advantage, let’s add it to the Entity
struct, and (once
again) update our structs.
pub struct Entity {
eid: EntityId,
components: Vec<Component>,
}
pub struct Position {
entity: Entity, // Should be Rc<RefCell<Entity>>.
pos_x: i32,
pos_y: i32,
} // And similarly for the Movement struct.
Great, so now that we don’t care about ordering we can really easily hit our ideal for adding and removing components from entities.
Adding a component is simply positionable.data.push(...)
, which
is O(1)
. And removing a component is
positionable.data.swap_remove(...)
, which is also O(1)
.
But iterating is now a little trickier to do.
Just like in the naive solution, we want to iterate over the shortest list. Then for each entry, instead of searching we can look up the index of the other components we care about from the entity.
// Assuming N < M.
for entry in positionable.data {
use Component::*;
if let Some(Movement(idx)) = entry.Entity.get_component(Movement(_)) {
let movement_entry = moveable.data[idx];
// Entity has both components so do some stuff...
}
}
Now we also get our ideal runtime complexity for iteration! It’s not as
performant as the sorted data
vectors, but the benchmarks
show it to be pretty damn close!
Systems
I haven’t given systems a whole lot of thought just yet but so far I’ve been using simple structs to represent them.
pub struct MovementSystem {
positionable: Positionable,
moveable: Moveable,
}
Then when we want to perform actions on entities with both the Positionable
and Moveable
components, we can do as I described above, depending on if
your component manager list is ordered or not.
I’m sure there’s a more elegant way to represent these in Rust, but I’ll have to think about that some more in the future.
Quick Note on the Borrow Checker
So far, I’ve omitted any mention of the borrow checker, and any code that would make it not explode. So if you try to compile these structs with an actual implementation, Rust will yell at you. A lot.
For example, in the unordered scenario Position
and Movement
must contain
an Entity
. But we don’t want it to be a copy, we want it to be entity: Rc<RefCell<Entity>>
.
Rc
is a reference counter allowing for multiple shared references, but not
allowing mutability. And RefCell
allows for the Interior Mutability
Pattern.
I’m sure there’s also a more elegant way of doing this but so far it’s meant I
can get this working without putting explicit lifetime parameters everywhere.
But it does now mean that the actual implementation is littered with
.borrow()
and .borrow_mut()
calls everywhere.
Ordered/Unordered Benchmark
Iterating through two unordered component managers, is analogous to a much simple implementation that can be easily benchmarked.
We start with some vectors, each containing an Entry
struct.
struct Entry {
value: usize,
index: usize,
}
Then we iterate through all the vectors and add up all entry.value
.
For the ordered benchmark we iterate through one vector and grab elements from both vectors. This isn’t the exact same as iterating through components ordered by entity Id but it’s pretty close.
fn ordered(a1: &Vec<Entry>, a2: &Vec<Entry>) -> usize {
let mut sum = 0;
for i in 0..a1.len() {
sum += a1[i].value + a2[i].value;
}
sum
}
And the unordered benchmark we iterate through one vector and jump around the
second vector based on the entry.index
.
fn unordered(a1: &Vec<Entry>, a2: &Vec<Entry>) -> usize {
let mut sum = 0;
for entry in a1 {
sum += entry.value + a2[entry.index].value;
}
sum
}
There’s also some logic to populate the vectors. But this won’t be included in the benchmark.
fn setup(n: usize) -> (Vec<Entry>, Vec<Entry>) {
let mut a1 = Vec::with_capacity(n);
let mut a2 = Vec::with_capacity(n);
for i in 0..n {
a1.push(Entry {
value: i,
index: (n - 1) - i,
});
a2.push(Entry { value: i, index: 0 });
}
(a1, a2)
}
And finally the benchmarks themselves:
#[bench]
fn bench_ordered(b: &mut Bencher) {
let (a1, a2) = setup(MAX_ITEMS);
b.iter(|| ordered(&a1, &a2));
}
#[bench]
fn bench_unordered(b: &mut Bencher) {
let (a1, a2) = setup(MAX_ITEMS);
b.iter(|| unordered(&a1, &a2));
}
Results
Just for fun, I included a HashMap
implementation to benchmark as well.
N_VECS = 2, MAX_ITEMS = 1000
test tests::bench_hashmap ... bench: 24,785 ns/iter (+/- 4,563)
test tests::bench_ordered ... bench: 558 ns/iter (+/- 63)
test tests::bench_unordered ... bench: 575 ns/iter (+/- 98)
N_VECS = 2, MAX_ITEMS = 10000
test tests::bench_hashmap ... bench: 296,271 ns/iter (+/- 35,722)
test tests::bench_ordered ... bench: 5,989 ns/iter (+/- 749)
test tests::bench_unordered ... bench: 9,286 ns/iter (+/- 2,877)
N_VECS = 3, MAX_ITEMS = 1000
test tests::bench_hashmap ... bench: 45,850 ns/iter (+/- 6,411)
test tests::bench_ordered ... bench: 843 ns/iter (+/- 79)
test tests::bench_unordered ... bench: 854 ns/iter (+/- 90)
N_VECS = 3, MAX_ITEMS = 10000
test tests::bench_hashmap ... bench: 575,907 ns/iter (+/- 85,579)
test tests::bench_ordered ... bench: 8,959 ns/iter (+/- 1,305)
test tests::bench_unordered ... bench: 10,693 ns/iter (+/- 1,815)