What an undertaking! This can only have been done in assembler. How many lines of code was this in the end and what was the hardest bit? I can imagine level generation was tricky...would you ever consider releasing the asm files?
The hardest bit was not the level generation; that part is well documented in the original source code.
The hardest part was performing all the collision checks between the player, enemies, backgrounds, items and traps without overloading the CPU.
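For anyone wondering what "without overloading the CPU" can mean in practice, here is a rough sketch in C rather than assembler. It is not the code from this project, only an illustration of two common tricks: a cheap bounding-box overlap test, and spreading the checks over alternating frames so only half the enemies are tested each frame. The `Box` struct, `MAX_ENEMIES`, and both function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENEMIES 8  /* hypothetical cap, not taken from the game */

/* Hypothetical object record: 8-bit coordinates, as on a small machine. */
typedef struct {
    uint8_t x, y;   /* top-left corner in pixels */
    uint8_t w, h;   /* size in pixels */
    bool active;    /* inactive objects are skipped */
} Box;

/* Axis-aligned bounding-box overlap test.
   The uint8_t operands are promoted to int, so x + w cannot wrap at 255. */
static bool boxes_overlap(const Box *a, const Box *b)
{
    return a->x < b->x + b->w && b->x < a->x + a->w &&
           a->y < b->y + b->h && b->y < a->y + a->h;
}

/* Test the player against every second enemy only: even indices on even
   frames, odd indices on odd frames. This halves the per-frame work at the
   cost of up to one frame of latency. Returns the index of the first hit,
   or -1 if nothing was hit this frame. */
static int check_player_vs_enemies(const Box *player, const Box enemies[],
                                   int count, unsigned frame)
{
    assert(count >= 0 && count <= MAX_ENEMIES);
    for (int i = (int)(frame & 1u); i < count; i += 2) {
        if (enemies[i].active && boxes_overlap(player, &enemies[i]))
            return i;
    }
    return -1;
}
```

Calling `check_player_vs_enemies(&player, enemies, count, frame_counter)` once per frame covers every enemy every two frames. Whether that latency is acceptable depends on how fast things move, so treat it as a starting point rather than a recipe.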