335lect03-1-projected

Please send questions to st10@humboldt.edu .

*   Extended BNF (EBNF)

*   reference: MacLennan, p. 153-154

(please use BNF for Homework 1....)

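*   pretty much just for convenience: EBNF can't describe
    any language that plain BNF can't; it just lets you
    write some rules more compactly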

*   * - Kleene star  (zero or more of the preceding item)
    + - Kleene cross (one or more of the preceding item)

<unsigned integer> ::= <digit>+
...
<identifier> ::= <letter><alphanumeric>*

*   some versions of EBNF let you use { } to
    indicate a grouping that the * or + applies to;

<identifier> ::= <letter>{<letter>|<digit>|_}*

    (you can also stack the options inside a big
    { } instead of using | for or)

*   also common: to use [ ] to indicate something
    that is optional

    <integer> ::= [+ | -]<unsigned integer>

    (and you can stack options inside a big [ ],
    too)
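
The three EBNF rules above happen to describe regular sets of strings, so each can be written as a regular expression. A minimal sketch, assuming <digit> means 0-9 and <letter> means an ASCII letter (the notes do not define either one); the names UNSIGNED_INTEGER, IDENTIFIER, INTEGER and matches are made up for illustration:

```python
import re

# EBNF rule                                         rough regex equivalent
# <unsigned integer> ::= <digit>+                   [0-9]+
# <identifier> ::= <letter>{<letter>|<digit>|_}*    [A-Za-z][A-Za-z0-9_]*
# <integer> ::= [+ | -]<unsigned integer>           [+-]?[0-9]+
# Assumption: <digit> is 0-9 and <letter> is an ASCII letter.
UNSIGNED_INTEGER = re.compile(r"[0-9]+")
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
INTEGER = re.compile(r"[+-]?[0-9]+")


def matches(pattern, text):
    """Return True if the WHOLE string fits the rule, not just a prefix."""
    return pattern.fullmatch(text) is not None


print(matches(UNSIGNED_INTEGER, "42"))  # + needs at least one digit
print(matches(UNSIGNED_INTEGER, ""))    # ...so the empty string fails
print(matches(IDENTIFIER, "x_1"))       # * allows zero or more followers
print(matches(INTEGER, "-7"))           # [ ] makes the sign optional
```

Note that + and * here mean the same thing as the Kleene cross and Kleene star in the rules, and ? plays the role of the optional [ ].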

*   definition: the language generated by a CFG

    *   assume that =*=> can be used to indicate
        a sequence of zero or more derivation steps

    *   that is, if you had:   u => u1 => u2 => v
        you could write:       u =*=> v

    *   then, you could say (Sipser, p. 94 and HU, p. 80)
  
        The language of the grammar G, L(G), is:

        {w | (w contains only characters from G's alphabet
              of terminals)
             and (S =*=> w) }

        (where S is G's start symbol)
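
To make the definition of L(G) concrete, here is a minimal sketch that lists every string w with S =*=> w up to a length bound. The toy grammar S -> a S b | a b is a made-up example, not one from the references, and the function name language_up_to is hypothetical. The length pruning is only valid under the stated assumption that the grammar has no empty productions.

```python
from collections import deque

# Hypothetical toy grammar: S -> a S b | a b
# Dictionary keys are non-terminals; every other character is a terminal.
GRAMMAR = {"S": ["aSb", "ab"]}


def language_up_to(grammar, start, max_len):
    """Collect every terminal string w with start =*=> w and len(w) <= max_len.

    Assumes no production has an empty right-hand side, so a sentential
    form never gets shorter and long forms can be discarded safely.
    """
    found = set()
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost non-terminal in this sentential form.
        pos = next((i for i, ch in enumerate(form) if ch in grammar), None)
        if pos is None:
            found.add(form)  # only terminals remain, so form is in L(G)
            continue
        for rhs in grammar[form[pos]]:
            new_form = form[:pos] + rhs + form[pos + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(found, key=lambda w: (len(w), w))


print(language_up_to(GRAMMAR, "S", 6))  # ['ab', 'aabb', 'aaabbb']
```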

*   leftmost and rightmost derivations

    *   for a given derivation tree, there may
        be more than one derivation...

        (drew derivations + derivation tree for
        <expr> + <expr> on the board)

    *   so -- a string having two different
        derivations doesn't tell you much;

        BUT -- if a string has two different
        derivation TREES -- then its grammar
        is said to be AMBIGUOUS
        (there's at least one string which
        might essentially have two different
        meanings/semantics)

        *   ambiguity isn't ideal for a programming
            language -- we need a given statement
            to have a consistent meaning/semantics;

            (sometimes handled with additional
            assumptions, such as operator precedence
            and associativity rules...)

*   a leftmost derivation:
    you ALWAYS substitute for the leftmost non-terminal
    in each step;

*   a rightmost derivation:
    you ALWAYS substitute for the rightmost non-terminal
    in each step;
     
    ...it turns out that if a grammar has a string
       with two different LEFTMOST derivations,
       it can be proven to be ambiguous;

    ...it turns out that if a grammar has a string
       with two different RIGHTMOST derivations,
       it can be proven to be ambiguous;

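...so that's 3 ways to prove a grammar ambiguous:
   find a string with two different derivation trees,
   two different leftmost derivations, or
   two different rightmost derivations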
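
The leftmost-derivation test can be tried out mechanically. A minimal sketch, assuming the usual ambiguous expression grammar E -> E + E | E * E | id (the exact grammar drawn on the board is not recorded in these notes); count_leftmost is a hypothetical helper that counts the distinct leftmost derivations of a token list by brute force:

```python
# Assumed grammar (not necessarily the one from the board):
#   E -> E + E | E * E | id
GRAMMAR = {"E": [["E", "+", "E"], ["E", "*", "E"], ["id"]]}


def count_leftmost(form, target, grammar=GRAMMAR):
    """Count the leftmost derivations that take form to target.

    Assumes no empty productions, so a sentential form longer than
    the target can never derive it.
    """
    form, target = tuple(form), tuple(target)
    if len(form) > len(target):
        return 0
    # Leftmost derivation: ALWAYS substitute for the leftmost non-terminal.
    pos = next((i for i, sym in enumerate(form) if sym in grammar), None)
    if pos is None:
        return 1 if form == target else 0
    if form[:pos] != target[:pos]:
        return 0  # terminals left of that non-terminal can no longer change
    total = 0
    for rhs in grammar[form[pos]]:
        new_form = form[:pos] + tuple(rhs) + form[pos + 1:]
        total += count_leftmost(new_form, target, grammar)
    return total


print(count_leftmost(["E"], ["id", "+", "id"]))             # 1
print(count_leftmost(["E"], ["id", "+", "id", "*", "id"]))  # 2
```

A count of 2 for id + id * id means two different leftmost derivations, which is enough to show that this grammar is ambiguous.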

*   abstract syntax tree
    (Louden, p. 89)
   
    ...if you condense a derivation tree down, sort of
       replacing each non-terminal with its terminal leaf,
       you get a different tree, an abstract syntax
       tree;

       (there are only terminals in an abstract
       syntax tree)

    *   it turns out it is reasonable to go from
        a statement to an abstract syntax tree
        version of that statement;

        (as opposed to going from a grammar to a
        derivation of a statement using that grammar);

        ...it is not uncommon for a compiler
        to turn a statement into abstract
        syntax tree form, and then evaluate that;

    *   it is pretty reasonable to use a parenthesized
        notation to represent such abstract syntax
        trees...

        (for the two trees on the board...)

        (+ (* <id> <id>) <id>)

        (* <id> (+ <id> <id>))

(yes, this IS Lisp's/Scheme's syntax...!)
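
Going from a statement to an abstract syntax tree, writing the tree in parenthesized form, and then evaluating it can be sketched in a few lines. This is only an illustration under stated assumptions: * binds tighter than +, parentheses group, and identifiers are plain names (the grammar from the board is not recorded here). The names tokenize, parse, to_sexpr and evaluate are hypothetical.

```python
import re

# One token: a name, or one of + * ( ), with optional leading whitespace.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[+*()])")


def tokenize(text):
    text, tokens, pos = text.rstrip(), [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse(tokens):
    """Build a nested-tuple AST: (op, left, right), or a bare name for a leaf."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():    # expr -> term { + term }
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():    # term -> factor { * factor }
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():  # factor -> name | ( expr )
        tok = peek()
        if tok is None or tok in ("+", "*", ")"):
            raise ValueError(f"unexpected token: {tok!r}")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise ValueError("expected ')'")
            advance()
            return node
        return tok

    tree = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return tree


def to_sexpr(node):
    """Write the AST in parenthesized prefix notation."""
    if isinstance(node, tuple):
        op, left, right = node
        return f"({op} {to_sexpr(left)} {to_sexpr(right)})"
    return node


def evaluate(node, env):
    """Evaluate the AST, looking names up in the dictionary env."""
    if isinstance(node, tuple):
        op, left, right = node
        a, b = evaluate(left, env), evaluate(right, env)
        return a + b if op == "+" else a * b
    return env[node]


print(to_sexpr(parse(tokenize("a * b + c"))))    # (+ (* a b) c)
print(to_sexpr(parse(tokenize("a * (b + c)"))))  # (* a (+ b c))
print(evaluate(parse(tokenize("a * b + c")), {"a": 2, "b": 3, "c": 4}))  # 10
```

The two printed trees have the same shapes as the two parenthesized trees above, with names in place of <id>.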

and that segues us to Functional Programming/Lisp/Scheme

*   now, LISP and Scheme are multi-paradigm -- they
    can be used for functional and non-functional
    programming;

    (they are not "pure" functional languages)

    ...but they can be used in a quite functional
    manner;

*   a little LISP history (some of this is in MacLennan,
    but I also used Webber, pp. 535-538)