CS 279 - Week 8 Lecture 1 - 2022-10-10

TODAY WE WILL:
*   announcements
*   follow-up: a few more examples of UNIX/Linux signals
*   more fun with patterns, part 1:
    more file globbing options

*   more fun with patterns, part 2:
    start our discussion of UNIX/Linux' implementation
        of Regular Expressions (REs)

*   prep for next class

=====
follow-up: *some* EXAMPLES of Linux/UNIX signals:
=====
*   signal: a message from the OS to a process;
    one web site called it a software interrupt;

    *   example: when the user types ^Z, the Linux/UNIX operating system
        sends a signal (SIGTSTP) to the foreground process to stop
        (suspend) it; the bg command can then resume it in
        the background

*   in UNIX/Linux, each signal has a numerical code and a
    name

   num
   code name
   ---- -------
*    1  SIGHUP   terminal hangup
*    2  SIGINT   terminal interrupt
*    3  SIGQUIT  terminal quit (with a memory dump in a file core)
*    9  SIGKILL  process killed

     *   notice that numerical code of 9!
	 *   when you run the command:
	
	     kill -9 <processid-or-job-num>

             ...you are asking for SIGKILL
	    
*   13  SIGPIPE  broken pipe (writing when the reader has terminated)
*   14  SIGALRM  alarm clock interrupt
*   15  SIGTERM  software termination
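*   23  SIGCONT  continue job if stopped
        (NOTE: the codes for this and the job-control signals below
        vary by system -- on x86 Linux, for example, SIGCONT is 18
        and SIGSTOP is 19 -- so prefer the names; kill -l lists
        your system's codes)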

    *   the fg command sends this signal, for example
         
*   24  SIGSTOP  noninteractive stop signal

    *   stops the job -- suspends its activity but
        does not terminate it; it can be resumed later

*   26  SIGTTIN  read attempted by a background job
*   27  SIGTTOU  write attempted by a background job

    *   by default, these signals stop a job that
        is running in the background
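
*   a quick sketch for checking name/number pairs on your own
    system (assuming a POSIX-style shell such as bash; the variable
    names are just for illustration): kill -l followed by a number
    prints that signal's name without the SIG prefix, and kill -l
    by itself lists the signals your system knows about

```shell
# Ask the shell which signal names go with codes 9 and 15.
name_9=$(kill -l 9)
name_15=$(kill -l 15)
echo "9 is SIG$name_9, 15 is SIG$name_15"   # 9 is SIGKILL, 15 is SIGTERM

# With no argument, kill -l lists every signal known on this system;
# the codes above 15 are the ones most likely to differ between systems.
kill -l
```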

*   programs such as kill normally use SIGTERM to kill 
    another process --

    *   receiving process can catch it and choose to
        continue;

    *   BUT a SIGKILL cannot be caught, blocked, or ignored
        (and that's kill with a -9 option) --
        the kernel terminates that process itself
        (although the kill may not appear to work right away
        if, say, the process is stuck waiting on I/O inside
        the kernel...)
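
*   the SIGTERM-versus-SIGKILL difference above can be sketched
    like this (a minimal demo, assuming a POSIX-style shell; the
    looping child process and the variable names are made up for
    illustration):

```shell
# Hypothetical demo child: catches SIGTERM, reports it, and keeps looping.
sh -c 'trap "echo caught SIGTERM, continuing" TERM; while :; do sleep 1; done' &
pid=$!
sleep 1                        # give the child time to install its trap

kill -TERM "$pid"              # same as plain: kill "$pid"
sleep 2
if kill -0 "$pid" 2>/dev/null; then    # kill -0 only tests that the process exists
    after_term="still running"
else
    after_term="gone"
fi

kill -KILL "$pid"              # same as: kill -9 "$pid"
after_kill=0
wait "$pid" || after_kill=$?   # 128 + 9 = 137 means "killed by signal 9"

echo "after SIGTERM: $after_term"
echo "exit status after SIGKILL: $after_kill"
```

    *   the child survives the SIGTERM because it installed a
        handler; there is no way for it to install one for SIGKILL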

=====
more fun with patterns, part 1:
    more GLOBBING options!
=====
*   fun facts:
    *   some shells call file globbing by different
        names:
        *   pathname expansion
        *   filename expansion

    *   globbing is NOT built into the UNIX file mechanism;
        it is recognized by various shells!

    *   the *shell* expands the file globbing pattern
        into a space-separated list of matching filenames
	BEFORE the command is run
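
    *   a small sketch of that expansion step (assuming a POSIX-style
        shell; the scratch directory and file names are made up for
        illustration) -- echo never sees the pattern itself, only
        the names the shell substituted for it:

```shell
orig=$PWD
demo=$(mktemp -d)              # scratch directory for the demo
cd "$demo" || exit 1
touch a.txt b.txt notes.md

expanded=$(echo *.txt)         # the shell runs: echo a.txt b.txt
literal=$(echo '*.txt')        # quoting turns globbing off: echo gets *.txt

set -- *.txt                   # load the expansion into $1, $2, ...
count=$#                       # how many names the pattern became

echo "$expanded"               # a.txt b.txt
echo "$literal"                # *.txt
echo "$count"                  # 2

cd "$orig" || exit 1
rm -rf "$demo"
```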

*   you already know:
    * - matches any ZERO OR MORE characters

    Here are a few more:
    ? - matches EXACTLY ONE character (any character)

    [ ... ] - this matches a SINGLE character within the [ ]
    if you prefer:
    [cset]  - this matches any single character in cset

    *   [moxie] - matches a single m or o or x or i or e

        [0123456789] - matches a single digit

        *.[ch]* - here, the file's suffix must start with c or h
	          (nice for matching C++ source code files...!)

    *   ranges are also supported in [ ]!
        [0-9] same as [0123456789]
	[a-z] same as [abcdefghijklmnopqrstuvwxyz]
	...etc.
	*   these ranges ARE inclusive!
	    [d-f] DOES match d or e or f

    *   there are also a number of predefined sets (!!)
        *   within the usual [ ],
	    you put another [, then a :, then the set name,
	                    then another :, then another ]

        [[:digit:]] - same as [0-9] and [0123456789]

        [[:alpha:]] - will match the set of [a-zA-Z] PLUS
	              any other characters considered letters
		      in your locale
        similarly, [[:upper:]] and [[:lower:]]

        [[:space:]] - matches characters such as space, newline,
	              and more
	[[:blank:]] - matches just space and tab

        [[:cntrl:]] - matches control characters

    *   within [ ], can use an ! to indicate you want to
        match something that ISN'T one of those

        *   starts with an uppercase letter,
	    ends with anything BUT a digit

            [A-Z]*[![:digit:]]

        *   adding after class: ! does not HAVE to be before
            a predefined set; these also work:

            [A-Z]*[!0123456789]
            [A-Z]*[!0-9]

    *   NOTE the following LIMITATIONS on these patterns:
        *   a / in the actual pathname MUST be matched
	    by an explicit / in the pattern (NOT a wildcard)

        *   a . in the actual pathname that comes at the
	    beginning or follows a / must similarly
	    be matched by an explicit . in the pattern
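
*   the patterns above can be tried out with a sketch like this
    (assuming a POSIX-style shell such as bash; the file names are
    made up for illustration, and LC_ALL=C is set as an assumption
    so that ranges like [A-Z] and the sort order of the results
    follow plain ASCII):

```shell
LC_ALL=C; export LC_ALL        # assumption: ASCII ranges and ordering
orig=$PWD
demo=$(mktemp -d)              # scratch directory for the demo
cd "$demo" || exit 1
touch a1 a2 ab abc Report7 Reportx README main.c main.cpp util.h

one_char=$(echo a?)                   # a plus exactly one more character
digits=$(echo a[0-9])                 # a plus one digit
digit_class=$(echo a[[:digit:]])      # same thing, using the predefined set
c_or_h=$(echo *.[ch]*)                # suffix starts with c or h
not_digit=$(echo [A-Z]*[![:digit:]])  # uppercase first, non-digit last

echo "$one_char"       # a1 a2 ab
echo "$digits"         # a1 a2
echo "$digit_class"    # a1 a2
echo "$c_or_h"         # main.c main.cpp util.h
echo "$not_digit"      # README Reportx

cd "$orig" || exit 1
rm -rf "$demo"
```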

*   playing around with this a bit:
    gn*.l   - gnu.l    MATCHES
              gneiss.l MATCHES
	      gn.l     MATCHES (* matches 0 or more)
	      gnu      nope!
	      gn/x.l   nope!   (have to explicitly include / to
	                        match it)

    ~/.[[:alpha:]]* - ~/.login    MATCHES
                      ~/..login   nope! (the second . is not a letter,
		                         so [[:alpha:]] cannot match it)
		      ~/.mailrc   MATCHES
		      ~/login     nope! (no . after ~/)

    */doit*        -  one/doit    MATCHES
                      two/doit.c  MATCHES (. must be explicitly
		                           included at beginning
					   or after a /, other locations
					   CAN be matched with wildcards)
                      three/doit.cpp MATCHES
		      doit        nope (no / )

    [A-Z]*[![:digit:]] - Gz    MATCHES
                         X     nope ([A-Z] uses up the X, and the final
                                     [ ] still has to match ONE more)

    *.[acAC]           - file.a MATCHES
                         .a     no?! <-- oh, .a can only match
			                 a pattern starting with
					 an explicit .

					 ls .[acAC]

					 ...WOULD list .a in its output
					 
                         file     nope
			 stuff.ac no -- [acAC] matches exactly 1 character
			                in the set, stuff.ac has TWO
					characters after its .
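
*   several of the cases above can be re-created with a sketch
    like this (assuming a POSIX-style shell; the scratch directory
    stands in for a real directory, its home/ subdirectory stands
    in for ~, and LC_ALL=C is set as an assumption so the results
    come back in ASCII order):

```shell
LC_ALL=C; export LC_ALL        # assumption: ASCII ordering of results
orig=$PWD
demo=$(mktemp -d)              # scratch directory for the demo
cd "$demo" || exit 1
mkdir gn home one two three
touch gnu.l gneiss.l gn.l gnu gn/x.l
touch home/.login home/..login home/.mailrc home/login
touch one/doit two/doit.c three/doit.cpp doit

gn_matches=$(echo gn*.l)               # gn/x.l is left out: * never matches /
dot_matches=$(echo home/.[[:alpha:]]*) # needs the explicit . after the /
doit_matches=$(echo */doit*)           # top-level doit is left out: no /

echo "$gn_matches"     # gn.l gneiss.l gnu.l
echo "$dot_matches"    # home/.login home/.mailrc
echo "$doit_matches"   # one/doit three/doit.cpp two/doit.c

cd "$orig" || exit 1
rm -rf "$demo"
```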

=====
more fun with patterns, part 2
    UNIX/Linux regular expressions
    BUT FIRST: more on the grep command
=====
*   you'd like even more expressive power?
    ...that you can use in other contexts besides
       representing lists of filenames?

*   UNIX/Linux has it! Supports:
    BRE - Basic Regular Expression
    ERE - Extended Regular Expression

*   we'll start with BREs in the context of the grep
    command

    grep - named after the ed editor command g/re/p:
           globally search for a regular expression and print

    *   has options! WITHOUT options,

        grep pattern filelist

        ...returns the lines of files in the filelist
	   that include the pattern; when the filelist has more than
	   one file, each line is shown as the filename, a colon, then
	   the entire line that includes the pattern

        grep "oink" *.txt

        ...returns all lines including oink within files
	   ending in .txt

    *   just a TASTE of some grep options:
    
    *   -l (before the pattern)
	returns JUST the names of files with that pattern

	grep -l "oink" *

    *   -E - lets grep accept EREs as its pattern

    *   -c - only show a count of matching lines
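
*   a small sketch of plain grep and the options above (assuming
    a POSIX-style shell and a standard grep; the file names and
    contents are made up for illustration):

```shell
orig=$PWD
demo=$(mktemp -d)              # scratch directory for the demo
cd "$demo" || exit 1
printf 'the pig says oink\nthe cow says moo\n' > farm.txt
printf 'oink oink\nno match here\noink again\n' > pigs.txt
printf 'quack\n' > pond.txt

# No options, several files: each matching line is shown as filename:line
lines=$(grep "oink" *.txt)

# -l: just the names of the files that contain the pattern
files=$(grep -l "oink" *.txt)

# -c: just a count of the matching lines (one file, so no filename prefix)
count=$(grep -c "oink" pigs.txt)

# -E: the pattern is an ERE, so | means "or"
either=$(grep -E -c "oink|moo" farm.txt)

echo "$lines"
echo "$files"
echo "$count"                  # 2
echo "$either"                 # 2

cd "$orig" || exit 1
rm -rf "$demo"
```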