279lect09-2-projected


CS 279 - Week 9 Lecture 2 - 2022-10-19

TODAY WE WILL
*   announcements/reminders
*   a BIT on more EREs
*   a couple of significant device files:
    /dev/tty and /dev/null
*   interlude: which; cmp and diff; bit more on wc; tee
*   [did not get to] intro to bash arrays
*   if time: intro to the BASH_REMATCH array
*   prep for next class

=====
a bit more on EREs
=====
*   mentioned last time that you need to use the =~ operator in [[ ]] --
    did not mention that =~ in this setting DOES support EREs

    as noted in "Bash Reference Manual", Section 3.2.5.2, "Conditional
        Constructs":
	"When you use =~, the string to the right of the operator
	 is considered a POSIX extended regular expression pattern
	 and matched accordingly"

*   in EREs, UNQUOTED parentheses can be used for GROUPING subexpressions,
    and e+ matches ONE or MORE occurrences of an ERE e

    *   does that mean we could use an additional set of ( ) to
        say we want 1 or more of an ERE?

        ([A-Z]|[0-9])+

        this SHOULD match 1 or more uppercase-letters-or-digits
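
    a quick sketch of trying this pattern with =~ inside [[ ]]
    (the ^ and $ anchors, the sample strings, and the check helper
    are additions for illustration, not from lecture; without the
    anchors, =~ succeeds if the pattern matches ANYWHERE in the string):

```bash
#!/bin/bash
# pattern from the notes, anchored here so the WHOLE string must match
pattern='^([A-Z]|[0-9])+$'

# hypothetical helper for this sketch: prints "match" or "no match"
check() {
    # the pattern variable is left UNQUOTED on the right of =~,
    # so bash treats it as an ERE rather than a literal string
    if [[ $1 =~ $pattern ]]; then
        echo "match"
    else
        echo "no match"
    fi
}

check "CS279"      # only uppercase letters and digits
check "CS-279"     # the hyphen is in neither bracket expression
check ""           # + requires at least one occurrence
```

    keeping the pattern in a variable is a common way to avoid
    quoting surprises on the right-hand side of =~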

*   In EREs, UNQUOTED curly braces are used for INTERVAL expressions
    *   so, for example, an ERE for a line with exactly 4
        lowercase-letters-or-digits:

	^([a-z]|[0-9]){4}$

*   In EREs, e? matches zero-or-one occurrences of the ERE e
    (this is good for OPTIONAL things)

    lines that contain JUST an integer?

    ^(\+|-)?[0-9]+$    # since + is special in EREs, need to escape +
                       #   if you want to actually match a plus sign

*   so, note: since, in EREs, + ? | ( ) { } are now also special
    characters -- to match just the character, escape it with a backslash
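
    the patterns above can be tried on lines of text with grep -E,
    which reads EREs; a minimal sketch, assuming sample input lines
    that are invented here for illustration:

```bash
#!/bin/bash
# hypothetical sample input for this sketch, one item per line
sample() {
    printf '%s\n' 'ab12' 'abc' '+42' '-7' '3+4' '42'
}

# lines made of exactly 4 lowercase-letters-or-digits
four=$(sample | grep -E '^([a-z]|[0-9]){4}$')

# lines containing JUST an integer, with an optional leading sign
ints=$(sample | grep -E '^(\+|-)?[0-9]+$')

# escaped +, so it matches a literal plus sign anywhere in the line
plus=$(sample | grep -E '\+')

echo "exactly 4:"; echo "$four"
echo "integers:";  echo "$ints"
echo "has a +:";   echo "$plus"
```

    note the single quotes around each pattern, so the shell passes
    the special characters through to grep untouched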

=====
a couple of significant device files: /dev/tty and /dev/null
=====
*   so: recall:
    UNIX/Linux has 3 main categories of files:
    *   regular files
    *   directory files
    *   special files

    *   what we might not have mentioned earlier:
        device files are ONE of the subcategories
	within the "special files" category;

    *   device: [from an old UNIX text -- "UNIX for the Impatient", I think]
        "a piece of equipment for storing or communicating data"
	*   e.g., printers, memory of various types, terminals

        *   UNIX/Linux provides access to a device by associating
	    one or more device files with it

            read from a device? You accomplish that (on the shell level)
	        by reading from its device file

            write to a device? You accomplish that (on the shell level)
	        by writing to its device file

            (under the hood, a device driver associated with that
	        device actually handles the operation, with instructions
		specific to that device's hardware)

    *   Conventionally, device files "live" in the /dev directory

    *   two common/special device files:
        *   /dev/tty refers to your terminal, whichever one it
	    happens to be

            *   your terminal also has a specific designation (the name
                of its own device file), which you can find with the
                tty command

        *   /dev/null - the null device, sometimes called a "bit bucket"

            *   anything you send to it, is SIMPLY THROWN AWAY

            *   when you attempt to read from it, you get an end-of-file
	        (EOF)

            *   sometimes called a pseudo-device, because it does
	        not correspond to any actual hardware

            *   you'll see:
	    	some-command 2> /dev/null

            *   you will see UNIX/Linux humor/snark based on this...
	        ("send your complaints to /dev/null"...)
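
    a small sketch of the /dev/null behaviors listed above, plus the
    tty command (the directory name is made up and assumed not to
    exist; the variable names are just for this example):

```bash
#!/bin/bash
# stderr is sent to /dev/null, so the error message from ls never appears
ls_status=0
ls /no/such/directory 2> /dev/null || ls_status=$?
echo "ls exit status: $ls_status"    # the failure is still reported here

# reading from /dev/null gives end-of-file right away: zero bytes
null_bytes=$(wc -c < /dev/null)
echo "bytes read from /dev/null: $null_bytes"

# writing to /dev/null succeeds, and the data is simply thrown away
write_status=0
echo "send your complaints here" > /dev/null || write_status=$?
echo "write exit status: $write_status"

# tty prints the specific device file for your terminal; when standard
# input is not a terminal it says so and exits with a nonzero status
tty || true
```

    redirecting only stderr (2>) keeps a command's normal output
    while hiding its error messages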

=====
a smattering of additional Linux/UNIX commands
=====
which
-----
*   followed by 1 or more program names
*   outputs the full pathname for an executable program
    (so, WHICH version of that are you running)

*   actually does simulate the search your shell would
    make through the current PATH
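
    to illustrate the PATH search that which simulates, here is a
    rough sketch; first_in_path is a hypothetical helper written for
    this example (not a standard command), and it only looks at files
    in PATH, ignoring aliases, functions, and shell builtins:

```bash
#!/bin/bash
# hypothetical helper: print the first executable file called NAME
# found while walking the directories in PATH from left to right
first_in_path() {
    local name=$1 dir
    local IFS=:                 # PATH entries are separated by colons
    for dir in $PATH; do
        # an empty PATH entry means the current directory
        if [ -z "$dir" ]; then dir=.; fi
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1                    # not found anywhere in PATH
}

first_in_path ls
# compare with the real command, if it is installed on this system
which ls || echo "which is not available here"
```

    on most systems the two lines of output should name the same file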

-----
cmp and diff
-----
*   cmp - lets you do a quick comparison of two files, and JUST
    be able to find out if they are different or not

    *   if the same, it has no output, but has an exit status of 0

    *   if not the same, it may have a message (depending on your
        distro), but also has an exit status of 1

    *   if one or both could not be accessed, should have an exit
        status of 2 (and may also have a message)
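
    a minimal sketch of checking those exit statuses, assuming scratch
    files created just for this example (the file names are made up):

```bash
#!/bin/bash
# scratch files for this sketch
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a.txt"
printf 'one\ntwo\n' > "$dir/same.txt"
printf 'one\n2\n'   > "$dir/changed.txt"

# -s (silent) suppresses the usual message about differing files,
# so the exit status is the only result
same_status=0
cmp -s "$dir/a.txt" "$dir/same.txt" || same_status=$?

differ_status=0
cmp -s "$dir/a.txt" "$dir/changed.txt" || differ_status=$?

missing_status=0
cmp -s "$dir/a.txt" "$dir/missing.txt" || missing_status=$?

echo "identical files:   $same_status"
echo "different files:   $differ_status"
echo "inaccessible file: $missing_status"

# the exit status makes cmp convenient as an if condition
if cmp -s "$dir/a.txt" "$dir/same.txt"; then
    echo "a.txt and same.txt have the same contents"
fi

rm -r "$dir"
```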

*   BUT what if you want MORE details about how they differ?
    diff command can be helpful;

    diff analyzes the differences between 2 files
    diff file1 file2

    *   no differences? No output.
    *   if there are differences, it tries to
        produce a list of instructions for transforming file1
        to file2
	(it WILL break down if the files are TOO different)

        for example (spaces added here for readability; the actual
        output has none, as in 2,4d1):
	n1, n2 d n3 - delete lines n1 through n2 of file1
	n1 a n3, n4 - append lines n3 through n4 of file2 after line n1 of
	              file1
        n1, n2 c n3, n4 - replace (change) lines n1 through n2 of file1
	                  with n3 through n4 of file2

        *   also shows ALL lines involved in the transformation,
	    with
	    < indicating lines deleted from file1
	    > indicating lines TAKEN from the original file2

    *   -c option: provides CONTEXT for the differences (3 lines of
        surrounding text per difference)

        -b option: ignore trailing blanks/tabs, treat sequences
	of blanks/tabs as a single blank
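
    a small sketch of diff's default output, assuming two short
    scratch files invented for this example:

```bash
#!/bin/bash
# scratch files for this sketch
dir=$(mktemp -d)
printf '%s\n' apple banana cherry         > "$dir/file1"
printf '%s\n' apple blueberry cherry date > "$dir/file2"

# diff exits with status 1 when the files differ, so record it
diff_status=0
diff_out=$(diff "$dir/file1" "$dir/file2") || diff_status=$?

echo "$diff_out"
echo "exit status: $diff_status"

rm -r "$dir"
```

    here line 2 of file1 is changed to line 2 of file2 (reported as
    2c2), and line 4 of file2 is appended after line 3 of file1
    (reported as 3a4); the < and > lines show the text involved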

-----
a bit more wc
-----
*   wordcount command!
*   no options: number of lines, words, and characters (reported in
    that order) in a file or set of files
    -c   - just # of characters! (strictly, bytes)
    -l   - just # of lines!
    -w   - just # of words!

    ...these can be nice in a pipe!

    number of people logged in?

    who | wc -l

    how many (non-hidden) files in the current directory?

    ls | wc -l
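
    a minimal sketch of wc at the end of a pipe, assuming sample text
    and a scratch directory that are made up for this example:

```bash
#!/bin/bash
# sample text for this sketch: 2 lines, 5 words, 24 characters
sample() {
    printf 'hello world\nfoo bar baz\n'
}

lines=$(sample | wc -l)
words=$(sample | wc -w)
chars=$(sample | wc -c)
echo "lines=$lines words=$words chars=$chars"

# counting directory entries: ls writes one name per line when piped
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"
entries=$(ls "$dir" | wc -l)
echo "entries=$entries"

rm -r "$dir"
```

    when wc reads from a pipe there is no file name to report, so
    each result is just the number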