ramblings...!

Can you believe someone wrote this?


Actual Engineering

Here's a markdown page

Parsing markdown is really difficult, but possible. There's a reason every parser out there has its quirks. Markdown is not a context-free grammar. All those fancy recursive descent parsers and parser combinators would probably work if you disallowed nesting style tags, since that would turn it into a context-free language, but that's no fun, man. The original was implemented with regexes, and honestly I think that's just how it's meant to be.

Markdown is really nice to write. It's a (deceptively) simple format, and the way the syntax naturally integrates into the text makes it less obstructive than other markup languages. HTML is more machine readable than human readable, and BBCode is literally just an XSS waiting to happen.

People say markdown has a lot of quirks, but HTML h1 tags literally change font size when they're nested inside <section> tags, thanks to the browsers' default stylesheets. wtf, literally no one knows about this.


Hey guess what I made a parser

I wrote my own markdown compiler in Common Lisp. It's not open source yet, and I haven't fully integrated it into the backend yet. The current one is my third attempt: the first two were in Python, and this one is in Lisp.

With the first one, I tried to do actual computer science and use a state machine for lexing and parsing, iterating over the file character by character. That was a nightmare to extend, and I realized that it didn't support tokens longer than one character, meaning the strike and header tags (###) broke it.
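To show why one-character tokens fall over, here's a minimal Python sketch. It is not the original code; the `MARKERS` table and the `lex_chars` name are made up for illustration, and I'm assuming the strike tag is written as `~~`.

```python
# Hypothetical single-character lexer: every marker is exactly one character.
MARKERS = {"*": "STAR", "#": "HASH", "~": "TILDE"}


def lex_chars(text):
    """Walk the input one character at a time and emit (kind, value) tokens."""
    tokens = []
    buffer = []  # plain text collected since the last marker
    for ch in text:
        if ch in MARKERS:
            if buffer:
                tokens.append(("TEXT", "".join(buffer)))
                buffer = []
            tokens.append((MARKERS[ch], ch))
        else:
            buffer.append(ch)
    if buffer:
        tokens.append(("TEXT", "".join(buffer)))
    return tokens


# "###" comes out as three unrelated HASH tokens instead of one header marker,
# and "~~" as two TILDE tokens instead of one strike marker.
print(lex_chars("### Title"))
print(lex_chars("~~gone~~"))
```

Nothing downstream can tell a level-three header from three stray hashes without extra lookahead, and that is the kind of thing that gets painful to bolt on.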

The second one threw all that away and instead used a whole load of regular expressions for both lexing and parsing. No actual abstract syntax tree was generated; it just replaced tags inline. That one was a bit better, but I felt it could be cleaner. It also did weird stuff around the end of the file, which required some special-case handling.
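The regex-only approach looks roughly like the Python sketch below. This is an illustration and not the original code: the rule table, the `render` function, and the choice of `**`, `*`, and `~~` as tags are all assumptions.

```python
import re

# Hypothetical rule table: (pattern, replacement), applied in order.
# Bold has to run before emphasis so "**" is not eaten as two "*".
INLINE_RULES = [
    (re.compile(r"\*\*(.+?)\*\*"), r"<strong>\1</strong>"),
    (re.compile(r"\*(.+?)\*"), r"<em>\1</em>"),
    (re.compile(r"~~(.+?)~~"), r"<s>\1</s>"),
]

# Headers are line-based, so this pattern anchors on line start and line end.
HEADER = re.compile(r"^(#{1,6}) +(.+)$", re.MULTILINE)


def render_header(match):
    """Turn a matched header line into the matching <hN> element."""
    level = len(match.group(1))
    return f"<h{level}>{match.group(2)}</h{level}>"


def render(text):
    """Rewrite markdown tags in place; no syntax tree is ever built."""
    text = HEADER.sub(render_header, text)
    for pattern, replacement in INLINE_RULES:
        text = pattern.sub(replacement, text)
    return text


print(render("### Hi **there**"))
print(render("a *b* ~~c~~"))
```

The order of the rules carries all the logic, which is why this style gets messy as more tags are added.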

The third attempt, the one I'm somewhat proud of, uses regexes for lexing and a state machine for parsing. The regexes are very simple and are aware of newlines and the start and end of the file. Technically I'm just doing the special-case handling in the regexes instead. It's not very Lisp-like: it uses a lot of imperative loops and string concatenation of questionable performance.
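Here's a rough sketch of that split, in Python rather than Lisp. It is not the actual compiler: the token names, `TOKEN_RE`, `lex`, and `parse` are hypothetical, only headers, bold, and strike are covered, and unclosed or misnested tags are not handled.

```python
import re

# Hypothetical token set. Each alternative is simple, and the header rule is
# aware of line starts thanks to re.MULTILINE.
TOKEN_RE = re.compile(
    r"(?P<HEADER>^#{1,6} +)"       # header marker, only at the start of a line
    r"|(?P<STRIKE>~~)"
    r"|(?P<BOLD>\*\*)"
    r"|(?P<NEWLINE>\n)"
    r"|(?P<TEXT>[^~*\n]+|[~*])",   # a run of plain text, or one stray marker character
    re.MULTILINE,
)

INLINE = {"STRIKE": "s", "BOLD": "strong"}


def lex(text):
    """Regex lexing: multi-character tokens such as '###' or '~~' are one token."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def parse(tokens):
    """State-machine parsing: the state is the open header plus the open inline tags."""
    out = []
    open_tags = set()   # inline tags currently open
    header_level = 0    # 0 means "not inside a header line"
    for kind, value in tokens:
        if kind == "HEADER":
            header_level = len(value.rstrip())  # value looks like "### "
            out.append(f"<h{header_level}>")
        elif kind in INLINE:
            tag = INLINE[kind]
            if tag in open_tags:
                open_tags.remove(tag)
                out.append(f"</{tag}>")
            else:
                open_tags.add(tag)
                out.append(f"<{tag}>")
        elif kind == "NEWLINE":
            if header_level:
                out.append(f"</h{header_level}>")
                header_level = 0
            out.append("\n")
        else:
            out.append(value)
    if header_level:  # end of input closes a header that has no trailing newline
        out.append(f"</h{header_level}>")
    return "".join(out)


print(parse(lex("## Hi ~~old~~ **new**\nplain")))
```

The sketch collects pieces in a list and joins them once at the end, which sidesteps the repeated string concatenation mentioned above.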

One thing that's still missing is parsing anchor tags. Once that's done, I'll start actually using the Lisp markdown compiler to compile the pages you see here.
