*G=<N,T,P,Prog>* where *N={Prog,Expr,Term,Storable,Factor}*,
*T={number,+,-,*,/,S,R,(,),EOF}* the start symbol is *Prog*,
and the set of productions is given by these rules

`Prog -> Expr EOF`

`Expr -> Expr + Term | Expr - Term | Term`

`Term -> Term * Storable | Term / Storable
| Storable`

`Storable -> Factor S | Factor`

`Factor -> number | R | ( Expr )`

Using this grammar we can check to see that the example expression given above is valid by constructing a derivation of the sentence. A derivation starts with the start symbol and replace one non-terminal at each step to generate the sentence. For instance, a derivation that generates the example above is:

`Prog => Expr EOF =>Term EOF =>Term * Storable
EOF => Storable * Storable EOF =>Factor * Storable EOF => ( Expr ) * Storable
=>EOF ( Expr + Term ) * Storable EOF => ( Term + Term ) * Storable EOF
=> ( Storable + Term ) * Storable EOF => ( Factor S + Term ) * Storable
EOF => ( 5 S + Term ) * Storable EOF => ( 5 S + Storable ) * Storable EOF
=> ( 5 S + Factor ) * Storable EOF => ( 5 S + Factor ) * Storable EOF =>
( 5 S + 4 ) * Storable EOF => ( 5 S + 4 ) * Factor EOF => ( 5 S + 4 ) *
R EOF`

This derivation proves that *(5S+4)*R EOF* is a syntactically valid
expression in the language of the grammar. The grammar presented above
is what is called an LALR(1) grammar. This kind of grammar can be used
by a program called a parser generator that given the grammar, will automatically
generate a program called a parser. A parser is a program that given a
sentence in its language, will construct a derivation of that sentence
to check it for syntactic correctness. The derivation can also be expressed
in tree form, called a parse tree. There may be many different derivations
for a sentence in a language, but only one parse tree (if the grammar is
unambiguous). Parse trees are outside the scope of this document. Take
CS67 if you want to learn about parse trees.

Although parsers can be generated by parser generators, it is still sometimes convenient to write a parser by hand. However, LALR(1) grammars are not easy to use to manually construct parsers. Instead, we want an LL(1) grammar if we are going to manually construct a parser. An LL(1) grammar can be used to construct a top-down or recursive descent parser where an LALR(1) grammar is typically used to construct a bottom-up parser. A top-down parser constructs (or at least traverses) the parse tree starting at the root of the tree and proceeding downward. A bottom-up parser constructs or traverses the parse tree in a bottom-up fashion. Again, for more details, take CS67.

In a recursive descent parser, each non-terminal in the grammar becomes a function in a program. The right hand side of the productions become the bodies of the functions. An LALR(1) grammar is not appropriate for constructing a recursive descent parser. To create a recursive-descent parser (the topic of this page) we must convert the LALR(1) grammar above to an LL(1) grammar. Typically, there are two steps involved.

- Eliminate left recursion
- Perform left factorization where appropriate

*Expr -> Term RestExpr*

*RestExpr -> + Term RestExpr |
- Term RestExpr | <null>*

We must also eliminate left recursion in the *Term Term * Factor |
Term / Factor* productions in the same way. We end up with an **LL(1)
grammar **that looks like this:

*Prog -> Expr EOF*

*Expr -> Term RestExpr*

*RestExpr -> + Term RestExpr |
- Term RestExpr | <null>*

*Term -> Storable RestTerm*

*RestTerm -> * Storable RestTerm
| / Storable RestTerm | <null>*

*Storable -> Factor S | Factor*

*Factor -> number | R | ( Expr
)*

Once you have an LL(1) grammar you use it to build a parser as follows. The following construction causes the parser to return an abstract syntax tree or expression tree for the sentence being parsed.

- Construct a function for each non-terminal. Each of these functions should return a node in the abstract syntax tree.
- Depending on your grammar, some non-terminal functions may require an input parameter of an abstract syntax tree (ast) to be able to complete a partial ast that is recognized by the non-terminal function.
- Each non-terminal function should call a function to get the next token as needed. If the next token is not needed, the non-terminal function should call another function to put back the token. Your parser (if it is based on an LL(1) grammar, should never have to get or put back more than one token at a time.
- The body of each non-terminal function should be be a series of if statements that choose which production right-handside to expand depending on the value of the next token.

Assume that you have a binary tree class in Java called BTNode that can hold an Object and that operators in the expression are Characters and numbers in the expression are Doubles. Also, assume there are two functions that get the next Token in the input (getToken) and push the last token back on the input (pushBackToken). Then, we can write the first part of the parser as follows:

private BTNode Prog() { BTNode tmp = Expr(); if (!getNextToken().toString().equals("$")) throw new IllegalStateException("Error in expression"); return tmp; } private BTNode Expr() { return RestExpr(Term()); } private BTNode RestExpr(BTNode E1) { Object on = getNextToken(); if (on.toString().equals("+") || on.toString().equals("-")) return RestExpr(new BTNode(on,E1,Term())); pushBackToken(); return E1; }