CS 279 - Week 11 Lecture 1 - 2022-10-31

TODAY WE WILL:
*   announcements
*   intro to the find command
*   intro to tar and gzip/gunzip
*   (if time) intro to head and tail
*   [did not get to today] (if time) intro to sed
*   prep for next class

*   Should be working on Homework 7!
    *   attempts due by 11:59 pm November 4

*   Exam 2 will be given in class on Wednesday, November 9

    *   So: example solutions will hopefully be posted
        by 12:01 am on Monday, November 7

    *   So: any improved versions of non-Canvas-question
        problems on Homeworks 4-7 need to be submitted
	by 11:59 pm on Sunday, November 6

*   (We'll review for Exam 2 in class on Wednesday, November 2,
    then there is NO LAB on THURSDAY, NOV 3rd,
    because I'll be traveling to a conference)

=====
intro to find!
=====
*   find can be used to locate files on a UNIX/Linux system

    It will search any set of directories you specify
    for files that match the supplied search criteria

*   You can search for files by name, owner, group, type,
    permissions, date, and other criteria

*   its search is recursive -- it WILL search subdirectories,
    subdirectories of subdirectories, etc.

*   The basic structure of its syntax:

    find where-to-look desired-criteria what-to-do

    *   there are often defaults if you don't specify all 3
        of where-to-look, desired-criteria, and what-to-do --

        BUT these defaults can vary on different Linux/UNIX
	versions/distributions

    *   for example: if not specified, where-to-look often
        defaults to .  <-- the current directory

        criteria often defaults to all files

        what-to-do defaults to -print (display the names of
	found files to standard output)
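
*   a minimal sketch of the three-part structure above, run in a
    throwaway directory made with mktemp -d (the names a.txt and
    sub/b.txt are made up for illustration); it only leaves out
    what-to-do, since the where-to-look default is one of the
    things that can vary between versions:

```shell
# Work in a scratch directory so no real files are touched.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/sub"
touch "$sandbox/a.txt" "$sandbox/sub/b.txt"
cd "$sandbox" || exit 1

# All three parts spelled out: where-to-look, desired-criteria, what-to-do.
explicit=$(find . -name '*.txt' -print | sort)

# what-to-do left out: -print is used as the default action.
implicit=$(find . -name '*.txt' | sort)

echo "$explicit"
```

    Both commands should list the same two files, one in the current
    directory and one in the subdirectory.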

*   a popular criteria option:

    -name desired-filename   # searches for files with that filename

*   FOR EXAMPLE...

    find ~ -name moo.txt -print

    *   this WILL complain for directories you don't have permission
        to read --

	2> is useful for redirecting these error messages elsewhere
	(indeed, to /dev/null if you really don't care about ever
	checking out all of them...)

	find ~ -name moo.txt -print 2> /dev/null

    find ~/public_html -name index.php -print | wc -l

    *   counts the number of files named index.php in the
        public_html directory and all of its subdirectories
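
*   the counting example above can be tried safely with a sketch like
    this one; the directory layout (blog, shop/cart, other.php) is
    invented for illustration and lives in a scratch directory:

```shell
# Build a small fake public_html tree in a scratch directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/public_html/blog" "$sandbox/public_html/shop/cart"
touch "$sandbox/public_html/index.php" \
      "$sandbox/public_html/blog/index.php" \
      "$sandbox/public_html/shop/cart/index.php" \
      "$sandbox/public_html/shop/other.php"

# Count index.php files at any depth; any permission complaints
# are sent to /dev/null. tr strips padding some wc versions add.
count=$(find "$sandbox/public_html" -name index.php -print 2> /dev/null \
        | wc -l | tr -d ' ')

echo "index.php files found: $count"    # 3 for the tree built above
```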

*   NOTE: you CAN use file globbing wildcards in the
    files you want to look for,
    
    BUT you need to escape them or quote them (single and double
    quotes both work, since the shell does not expand * inside
    either kind) so they won't be expanded too soon
    (so they won't be expanded before "giving" them to the find command)

    find ~ -name \*.cpp -print
    find ~ -name "*.cpp" -print
    find ~ -name '*.cpp' -print
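
*   to see why the quoting matters, here is a small sketch in a
    scratch directory (main.cpp, src/list.cpp, and src/util/io.cpp
    are made-up names); it assumes an sh/bash-style shell:

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/util"
touch "$sandbox/main.cpp" "$sandbox/src/list.cpp" "$sandbox/src/util/io.cpp"
cd "$sandbox" || exit 1

# Quoted: find itself receives the pattern *.cpp and applies it
# in every directory it visits.
quoted=$(find . -name '*.cpp' -print | wc -l | tr -d ' ')

# Unquoted: the shell expands *.cpp to main.cpp BEFORE find runs,
# so find only looks for files named exactly main.cpp.
unquoted=$(find . -name *.cpp -print | wc -l | tr -d ' ')

echo "quoted: $quoted   unquoted: $unquoted"    # quoted: 3   unquoted: 1
```

    If the current directory held several .cpp files, the unquoted
    form would expand to several words and find would complain
    instead of quietly finding fewer files.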

*   a taste of a few more find criteria:

    -type - look for files of the specified type

        find . -type d -print  # listing directories reachable from
                               #    the current directory

        and -type f would search for regular files, etc.

    -perm - look for files with specified permissions

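    -mtime - look for files by last-modified time, measured in days
             (for example, -mtime -7: modified less than 7 days ago)

    -mmin - like -mtime, but measured in minutes
            (for example, -mmin -30: modified less than 30 minutes ago)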
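
*   a sketch trying each of these criteria in a scratch directory;
    the file names are invented, and it assumes a find that supports
    -mmin (GNU and BSD find both do, but it is not in every version):

```shell
sandbox=$(mktemp -d)
mkdir -p "$sandbox/docs/old"
touch "$sandbox/notes.txt" "$sandbox/docs/secret.txt"
chmod 644 "$sandbox/notes.txt"
chmod 600 "$sandbox/docs/secret.txt"
cd "$sandbox" || exit 1

# -type d: directories only (., ./docs, ./docs/old)
dir_count=$(find . -type d -print | wc -l | tr -d ' ')

# -type f: regular files only
file_count=$(find . -type f -print | wc -l | tr -d ' ')

# -perm 600: permissions are exactly rw-------
private=$(find . -type f -perm 600 -print)

# -mmin -5: modified less than 5 minutes ago (both files were just made)
recent_count=$(find . -type f -mmin -5 -print | wc -l | tr -d ' ')

# -mtime -1: modified less than 1 day ago
today_count=$(find . -type f -mtime -1 -print | wc -l | tr -d ' ')

echo "dirs=$dir_count files=$file_count private=$private"
```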

=====
tar
=====
*   (in UNIX/Linux)
    archive - a collection of files in which each file is
    labelled with information about its origin

    two main purposes:
    *   backing up a group of files so they can be restored if
        they are lost or damaged

    *   packaging a group of files (including whole directories of
        files) for transmission to another computer

*   tar is ONE such program for creating archives

    tar - the name DOES stand for "tape archive"

    *   because it WAS originally used to read and write archives
        stored on magnetic tape...! BUT tapes are not required to
	use tar!!!

*   another command with MANY options --
    I am focusing on the 4 I've used most often

    My most frequent use of tar: to package a directory of files
    to send somebody

    tar cvf desired-archive-name.tar original-directory
    *   c - create this new archive desired-archive-name using this
            file/directory original-directory

    *   v - optional option - verbose - output info about
            the tarring in progress

    *   f - take the archive pathname from the next argument

*   to list the contents of a tar file/tarball,
    can use the t and f options (t: list the contents of the archive)

    tar tf blah.tar

*   to extract all of the files from a tar file?
    option x, for extract, is what you want

    tar xvf desired_tar.tar
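
*   putting create, list, and extract together: a round-trip sketch in
    a scratch directory (project, README, and src/main.c are made-up
    names); v is left out here only to keep the output short:

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir -p project/src
echo 'hello' > project/README
echo 'int main() { return 0; }' > project/src/main.c

# c: create a new archive; f: the archive name is the next argument
tar cf project.tar project

# t: list the archive's contents without extracting anything
listing=$(tar tf project.tar)
echo "$listing"

# x: extract; files are recreated under the current directory
mkdir restore
cd restore || exit 1
tar xf ../project.tar
restored=$(cat project/README)
echo "restored README says: $restored"
```

    Note that extracting recreates the project directory itself,
    because that is the path that was stored in the archive.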

*   ...and there are more options...! see man tar

*   fun fact: tar by itself does not compress the archive it builds
    (although many versions, GNU tar included, have options such as z
    that run a compressor like gzip for you);
    either way, you CAN compress the resulting archive!!

*   for fun:
    https://xkcd.com/1168/

=====
gzip, gunzip
=====
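*   the GNU commands for compressing and uncompressing files
    (gzip is short for "GNU zip", BUT .gz files are NOT in the same
    format as the .zip archives that zip and unzip handle)
    (two of SEVERAL programs for file compression)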
    
    *   uses Lempel-Ziv compression (in case you'd like to
        Google that!)

*   "UNIX for the Impatient" claims typical compression rates
    are 60-70% for natural language files and computer source code
    files

*   gzip desired_file
    ...results in a compressed file desired_file.gz

    gunzip desired_file.gz
    ...results in the uncompressed file desired_file

    (if the gzipped file happens to be a tar file,
    tar xvf in many versions of tar, GNU tar included, IS "smart"
    enough to notice the compression and uncompress it, also...)
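
*   a sketch of the gzip/gunzip round trip, plus a gzipped tar file,
    in a scratch directory (fox.txt, hw, and notes.txt are made-up
    names); the tar part uses a gunzip -c pipeline, which does not
    depend on tar noticing the compression by itself:

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

# Make a repetitive text file so there is something to compress.
i=0
while [ "$i" -lt 200 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > fox.txt
cp fox.txt fox.orig
before=$(wc -c < fox.txt | tr -d ' ')

gzip fox.txt                 # fox.txt is replaced by fox.txt.gz
after=$(wc -c < fox.txt.gz | tr -d ' ')
echo "bytes before: $before   bytes after: $after"

gunzip fox.txt.gz            # fox.txt.gz is replaced by fox.txt

# A gzipped tar file: archive first, then compress the archive.
mkdir -p hw/src
echo 'notes' > hw/src/notes.txt
tar cf hw.tar hw
gzip hw.tar                  # leaves hw.tar.gz

# Uncompress to standard output and let tar read the archive from stdin.
mkdir unpack
gunzip -c hw.tar.gz | (cd unpack && tar xf -)
```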

*   a few options:
    *   there are options for also keeping the original file unchanged,
        changing the suffix, specifying not to keep the original date
	and time stamp while compressing, etc.

    *   -v - (verbose) display the name of each file compressed or
        expanded and the percentage by which it was reduced
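
*   a sketch of two of these in a scratch directory (notes.txt and
    report.txt are made-up names); keeping the original is done here
    with -c, which writes the compressed data to standard output, and
    the exact wording of the -v report is assumed to differ between
    gzip versions:

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

i=0
while [ "$i" -lt 100 ]; do
    echo "all work and no play"
    i=$((i + 1))
done > notes.txt

# -c: send compressed output to stdout, so notes.txt is left unchanged
gzip -c notes.txt > notes.txt.gz

# -v: report each file name and how much it shrank; the report is
# captured in report.txt (both output streams, to be safe)
cp notes.txt copy.txt
gzip -v copy.txt > report.txt 2>&1
cat report.txt
```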

=====
head and tail!
=====
*   head extracts the beginning of a file, or several files

    head [-n num] [files....]

    *   by default, it grabs the first 10 lines
        (and with -n num, it grabs the first num lines)

*   tail extracts the end of a file, or several files

    tail [-n num] [files...]

    *   by default, it grabs the last 10 lines of the file
        (and with -n num, it grabs the last num lines)
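
*   a sketch of head and tail on a 25-line file in a scratch
    directory (lines.txt is a made-up name); the last command shows
    one way to combine the two to pull lines out of the middle:

```shell
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1

# Build a file whose lines are "line 1" through "line 25".
i=1
while [ "$i" -le 25 ]; do
    echo "line $i"
    i=$((i + 1))
done > lines.txt

default_count=$(head lines.txt | wc -l | tr -d ' ')   # 10 lines by default
first3=$(head -n 3 lines.txt)                         # line 1 .. line 3
last2=$(tail -n 2 lines.txt)                          # line 24, line 25

# Lines 11-12: keep the first 12 lines, then the last 2 of those.
middle=$(head -n 12 lines.txt | tail -n 2)

echo "$middle"
```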