nasty fun automatas
I took an Intro to Automatas class last semester, and I really wanted to apply some of the theory I learned. I also really wanted to try out WebAssembly (WASM) on a small project. So, I decided to implement my own regex engine using Non-deterministic Finite Automatas (NFAs).
The biggest challenge was deciding on how to represent the data in Rust because I’m still a beginner. I thought about using reference counting pointers (Rc/Arc) or something of that nature, but I kind of felt like Rc would be overkill for the problem. After constructing the NFA, we do not really need to edit the transitions/pointers between nodes, so I ended up deciding to use a states (nodes) array with the indices acting as pointers. The challenge was how do I go about constructing the NFA. After looking at Russ Cox’s elegant C pointer implementation, it turns out we can use postfix notation. Postfix notation allows us to build up the NFA based on dependency, and in a way, it's sort of like building the NFA via topological sort on dependency.
After writing tests and building out the regex engine, I tried compiling it to WASM, which ended up being super simple because of wasm-pack and other Rust WASM tooling. WASM seems much easier to work with than I initially thought. The only issue I ran into was forgetting to initialize the WASM.
It was super cool being able to apply the concepts I learned in class. I’d love to come back sometime soon and expand the Regex features supported. Thanks for reading!