335lect02-2-projected

Please send questions to st10@humboldt.edu .

another CFG:

S -> 0S1 | epsilon

*   for an epsilon production, you replace the non-terminal
    with the empty string -- (you don't type the epsilon in
    the derivation step result...!) (nor a space)

    ...essentially, replace the non-terminal with nothing

*   for example, here is a derivation using this CFG:

S => 0S1
  => 00S11
  => 000S111
  => 000111

...this grammar describes the language
{ 0^n 1^n | n >= 0 }
(a number of 0's followed by that same number of 1's)

(which is a CFL that is not a regular language...)
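*   a small Python sketch of the derivation above (not part of the
    original notes; the function names derive and in_language are
    made up for illustration):
    derive applies S -> 0S1 n times and then S -> epsilon,
    and in_language checks membership by mirroring the two rules

```python
def derive(n):
    """Return the derivation steps for 0^n 1^n using S -> 0S1 | epsilon."""
    steps = ["S"]
    current = "S"
    for _ in range(n):
        current = current.replace("S", "0S1")   # apply S -> 0S1
        steps.append(current)
    current = current.replace("S", "")          # apply S -> epsilon (replace S with nothing)
    steps.append(current)
    return steps


def in_language(w):
    """Membership test that mirrors the two rules of the grammar."""
    if w == "":
        return True                              # S -> epsilon
    # S -> 0S1: first symbol 0, last symbol 1, and the middle derives from S
    return len(w) >= 2 and w[0] == "0" and w[-1] == "1" and in_language(w[1:-1])


for step in derive(3):
    print(step)
```

    derive(3) walks through the same sentential forms as the
    derivation shown above, ending with 000111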

Backus-Naur Form (BNF)
*   in the 1950s, while linguists were studying
    CFGs, computer scientists began to describe
    programming languages using BNF,
    "which is the context-free grammar notation with
    minor changes in format and some shorthand"
    (Hopcroft, Ullman, p. 78)

*   BIG step in simplifying the definition and
    description of programming language syntax!

    BNF is a meta-language
    (a language for describing a language)
    (named after Backus and Naur, for their work
    describing Algol-58 and Algol-60)

*   syntactic categories of strings -> non-terminals
    write angle brackets around a descriptive name
       for the syntactic category

    <decimal fraction>
    <while-statement>
    <statement>
    <identifier>
    <named constant>

   you follow the syntactic category you are describing
   (or one rule in its description) with
   ::=
  
   can be read as, "is defined as"

   and on the right-hand side (RHS) of the ::=,
   you concatenate the terminals and syntactic categories
   used in that definition

   And | can be used to "condense" two or more rules
   with the same LHS into one

   <integer> ::= +<unsigned integer> | -<unsigned integer> 
                 | <unsigned integer>

   *   here, + and - are terminals
   
consider <unsigned integer>
*  <unsigned integer> ::= <digit> | <digit><unsigned integer> 

   <digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
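*   one way to see these rules in action is a hand-written
    recognizer with one function per syntactic category
    (a sketch only; the names is_unsigned_integer and is_integer
    are assumptions, not from the notes):

```python
DIGITS = "0123456789"   # <digit> ::= 0 | 1 | ... | 9


def is_unsigned_integer(s):
    """<unsigned integer> ::= <digit> | <digit><unsigned integer>"""
    if len(s) == 0:
        return False                 # no rule produces the empty string
    if s[0] not in DIGITS:
        return False                 # both alternatives start with a <digit>
    # either a single <digit>, or a <digit> followed by an <unsigned integer>
    return len(s) == 1 or is_unsigned_integer(s[1:])


def is_integer(s):
    """<integer> ::= +<unsigned integer> | -<unsigned integer> | <unsigned integer>"""
    if s[:1] in ("+", "-"):          # + and - are terminals
        return is_unsigned_integer(s[1:])
    return is_unsigned_integer(s)


print(is_integer("+42"), is_integer("-7"), is_integer("305"), is_integer("+"))
```

    note how the recursion in is_unsigned_integer follows the
    recursive rule: a digit, then whatever is left must again be
    an <unsigned integer>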

*   describing an expression in a simple language...

<expr> ::= <expr> + <expr> |
           <expr> * <expr> |
           <expr> - <expr> |
           <expr> / <expr> |
           (<expr>) |
           <id>

<expr> => <expr> + <expr>
       => (<expr>) + <expr>
       => (<expr> * <expr>) + <expr>
       => (<id> * <expr>) + <expr>
       => (<id> * <id>) + <expr>
       => (<id> * <id>) + <id>
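
*   the derivation above always expands the leftmost <expr>;
    here is a sketch that replays it in Python, keeping each
    sentential form as a list of symbols
    (the helper expand_leftmost and the list choices are
    hypothetical names for illustration, and <id> is left
    unexpanded, as in the derivation above):

```python
def expand_leftmost(form, nonterminal, rhs):
    """Replace the leftmost occurrence of nonterminal in form with the symbols in rhs."""
    i = form.index(nonterminal)              # raises ValueError if nonterminal is absent
    return form[:i] + list(rhs) + form[i + 1:]


# the right-hand side chosen at each derivation step, in order
choices = [
    ["<expr>", "+", "<expr>"],
    ["(", "<expr>", ")"],
    ["<expr>", "*", "<expr>"],
    ["<id>"],
    ["<id>"],
    ["<id>"],
]

form = ["<expr>"]
steps = [" ".join(form)]
for rhs in choices:
    form = expand_leftmost(form, "<expr>", rhs)
    steps.append(" ".join(form))

for s in steps:
    print(s)
```

    the last form printed contains no <expr>, so the derivation
    is finished at that point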

another form of a derivation is a derivation tree
(sometimes called a parse tree)

root: is the start non-terminal
(label the root node with the desired start 
    non-terminal)
for each interior node in the tree,
if that node is labeled with the non-terminal A,
then its children (left to right) can be X1 X2 ... Xn
     only if there is a rule A -> X1 X2 ... Xn
note that, when you do this,
    eventually you'll see that the leaves of the
    resulting derivation tree are terminals,
    all the interior nodes are non-terminals,
    and if you read the leaves left-to-right,
    that's a string in the language defined by
    that grammar;
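
a sketch of such a derivation tree in Python, for the <expr>
grammar and the string (<id> * <id>) + <id>
(the representation is an assumption made for illustration:
a node is a pair (label, children), and <id> is treated as a
leaf because the notes give no rule for it):

```python
# rules of the <expr> grammar; each right-hand side is a tuple of symbols
RULES = {
    "<expr>": [
        ("<expr>", "+", "<expr>"),
        ("<expr>", "*", "<expr>"),
        ("<expr>", "-", "<expr>"),
        ("<expr>", "/", "<expr>"),
        ("(", "<expr>", ")"),
        ("<id>",),
    ],
}


def node(label, *children):
    """Build a tree node as (label, list of child nodes)."""
    return (label, list(children))


def is_valid(tree):
    """Check that every interior node A with children X1 ... Xn matches a rule A -> X1 ... Xn."""
    label, children = tree
    if not children:
        return label not in RULES            # a leaf must not be an unexpanded non-terminal
    child_labels = tuple(child[0] for child in children)
    return child_labels in RULES.get(label, []) and all(is_valid(c) for c in children)


def leaves(tree):
    """Read the leaves of the tree left to right."""
    label, children = tree
    if not children:
        return [label]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result


def id_expr():
    return node("<expr>", node("<id>"))      # <expr> -> <id>


product = node("<expr>", id_expr(), node("*"), id_expr())
paren = node("<expr>", node("("), product, node(")"))
tree = node("<expr>", paren, node("+"), id_expr())

print(is_valid(tree))
print(" ".join(leaves(tree)))
```

reading the leaves left to right gives back ( <id> * <id> ) + <id>,
the same string reached by the step-by-step derivation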