%{
#include <stdio.h>
#include <stdlib.h>
void yyerror(const char*);
%}
%token WORD
%token EOL
%%
input:
/* empty */
|
input line
;
line:
EOL { exit(1); }
| words EOL { printf("correct!\n"); }
;
words:
words WORD
| WORD
;
%%
void yyerror(const char *str)
{
fprintf(stderr," error: %s\n",str);
}
main()
{
yyparse();
}
Can someone help me understand non-terminal symbols and symbol rules like line input words expr
.Is expr
a reserved word??
It looks like you are a complete beginner with using parser generating tools like bison. Although you have a small working example program you are wanting to extend it but do not fully understand how it works or understand some of the terms used in the documentation you are using. I'll to explain these points in a tutorial ending with code which extends your example with expressions.
To make the tutorial more readable some of the explanations are hidden until you hover with a mouse. This enables anyone to only read the bits they need and makes the page less of a wall of text.
Overview of Bison
The Bison Manual is a good source of reference for questions on using bison. The first thing to note is the overall layout of bison code which is like this:
The first section is the Prologue which is used to put the declarations and imports for the generated code (which is usually in C.
The second section are the bison declarations which introduce the terminal symbols used in the grammar, and optionally indicate the start rule for the grammar.
The third section contains the (context free) grammar rules and actions to be taken when a rule is matched. These rules are converted into a parser in the output language (usually C) by the bison tool. This is usually the main body of any bison file.
The fourth and last section contains the epilogue. This is used to place any code that should be copied into the output file. It usually contains any code for functions required in the rule actions used earlier.
Grammar Rules
Understanding the grammar rules used by bison involves some computer science study. It is often one of the areas that programmers who have not studied computer science have trouble with and therefore sometimes is in need of further explanation. Grammar rules are used to define what sequences of symbols are valid in a language. Parsing is the process of matching input to those rules. Bison's job is to build a parser that implements those rules. The rules specify sequences of input entities known as tokens which are defined in the bison declaration section. The tokens and other input symbols are known as terminal symbols in the grammar. The rules are given names, and the names for the rules are called the non-terminals of the grammar. The rules have the form of:
By convention TERMINAL names are capitalised and non terminal names are not.
An example:
Those rules can have actions added to them, which define code to be executed when the rule match occurs. Actions are put in the
{ ... }
pair at the end of each rule, like this:For advanced usage we can also extract information from the tokens to include in our action:
Now we have covered the basics of what is a terminal and non-terminal symbol, let's use that to decode the example program you used...
Annotated Example
Extending the parser
You asked about
expr
expecting it to be some command or built-in feature of bison I suspect. It is not clear what you are asking, as you never clarified that point. However, in my experience, many students are given a simple parser, like the one you illustrated, in some class notes, and asked, by way of an exercise, to extend it to recognise lines containing arithmetic expressions. That is what I am going to assume, and explain.Let's imagine that we wish to change (or expand) the parser to match simple expressions like WORD + WORD instead of lines of words.
We could redefine the rule
line
like this:and the rule
expr
like this:We could even make this into a simple calculator, if our lexical analyser matched NUMBER instead of WORD. (The lexical analyser is define elsewhere). We could now do things like:
Further Reading
If you want to go further, I have several hours of (unexciting) YouTube video tutorials on building simple parsers that support my classes. They might be helpful.