I'm currently trying to write a (very) small interpreter/compiler for a programming language. I have set the syntax for the language, and I now need to write down the grammar for the language. I intend to use an LL(1) parser because, after a bit of research, it seems that it is the easiest to use.
I am new to this domain, but from what I gathered, formalising the syntax using BNF or EBNF is highly recommended. However, it seems that not all grammars are suitable for implementation using an LL(1) parser. Therefore, I was wondering what was the correct (or recommended) approach to writing grammars in LL(1) form.
Thank you for your help, Charlie.
PS: I intend to write the parser using Haskell's Parsec library.
EDIT: Also, according to SK-logic, Parsec can handle an infinite lookahead (LL(k) ?) - but I guess the question still stands for that type of grammar.
I'm not an expert on this as I have only made a similar small project with an LR(0) parser. The general approach I would recommend:
1. Get the arithmetic working. That is, write rules and derivations for +, -, /, * and so on, and make sure the parser produces a correct abstract syntax tree. Evaluate the tree on a range of inputs to check that it computes the right results. Work step by step: if you run into a conflict, resolve it before moving on.
2. Get simple constructs such as if-then-else or case expressions working.
3. Going further depends more on the language you're writing the grammar for.
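As a rough illustration of the first step with Parsec, here is a minimal sketch of an arithmetic grammar written so that a predictive parser can handle it. A left-recursive rule such as Expr ::= Expr '+' Term cannot be parsed LL(1), so it is rewritten as Expr ::= Term (('+' | '-') Term)*, which is what Parsec's chainl1 combinator expresses. The names Expr, BinOp, lexeme, symbol, binOp, eval and parseExpr are made up for this example, and integer division for '/' is an assumption, not something from the question.

```haskell
module Main where

import Text.Parsec
import Text.Parsec.String (Parser)

-- Abstract syntax tree for arithmetic expressions (hypothetical type).
data Expr
  = Num Integer
  | BinOp Char Expr Expr
  deriving (Show, Eq)

-- Run a parser, then skip any trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

-- Parse a single-character token such as '(' or ')'.
symbol :: Char -> Parser Char
symbol = lexeme . char

-- Expr ::= Term (('+' | '-') Term)*
-- chainl1 replaces the left-recursive rule and keeps left associativity.
expr :: Parser Expr
expr = term `chainl1` binOp "+-"

-- Term ::= Factor (('*' | '/') Factor)*
term :: Parser Expr
term = factor `chainl1` binOp "*/"

-- Factor ::= Number | '(' Expr ')'
-- The two alternatives start with different tokens, so one token of
-- lookahead is enough to choose between them.
factor :: Parser Expr
factor = number <|> between (symbol '(') (symbol ')') expr

number :: Parser Expr
number = Num . read <$> lexeme (many1 digit)

-- Parse one operator from the given set and build the tree node for it.
binOp :: String -> Parser (Expr -> Expr -> Expr)
binOp ops = BinOp <$> lexeme (oneOf ops)

-- Evaluate the tree; '/' is treated as integer division (an assumption).
eval :: Expr -> Integer
eval (Num n)         = n
eval (BinOp '+' a b) = eval a + eval b
eval (BinOp '-' a b) = eval a - eval b
eval (BinOp '*' a b) = eval a * eval b
eval (BinOp _ a b)   = eval a `div` eval b

-- Parse a whole input string, allowing leading whitespace.
parseExpr :: String -> Either ParseError Expr
parseExpr = parse (spaces *> expr <* eof) "<input>"

main :: IO ()
main = print (eval <$> parseExpr "1 + 2 * (3 - 1)")
```

Running main should print Right 5. Splitting the grammar into expr, term and factor levels is what encodes operator precedence, and testing inputs such as "8 - 3 - 2" is a quick way to confirm that subtraction associates to the left before adding more constructs.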
Definitely check out the grammars of other programming languages as a reference. Unfortunately, a quick search did not turn up a complete LL grammar for any language online, but LR grammars should be useful as a reference too. For example:
ANSI C grammar
Python grammar
and, of course, the small examples of LL grammars in the Wikipedia article on LL parsers, which you have probably already seen.
I hope you find some of this useful.