I started from the premise that the most atomic, the smallest addressable piece on the venerable x86 PC is a byte. However, this algorithm (cellular automaton) really manipulates single bits. It requires some flexibility - it relies on fairly sophisticated mechanisms, however I am not sure if using Java's integer doesn't slow it down.
I think my question is would switching to assembly opened doors for some more robust techniques - would bitwise operations for example help it a lot?
The algorithm in question was found in a repository of another ingenious programmer: https://bitbucket.org/BWerness/voxel-automata-terrain/src/master/ThreeState3dBitbucket.pde