CS 279 - Week 8 Lecture 1 - 2022-10-10

TODAY WE WILL:
*   announcements
*   follow-up: a few more examples of UNIX/Linux signals
*   more fun with patterns, part 1: more file globbing options
*   more fun with patterns, part 2: start our discussion of
    UNIX/Linux' implementation of Regular Expressions (REs)
*   prep for next class

=====
follow-up: *some* EXAMPLES of Linux/UNIX signals:
=====
*   signal: a message from the OS to a process;
    one web site called it a software interrupt;

    *   example: when the user types ^Z,
        the Linux/UNIX operating system sends a signal (SIGTSTP)
        to the foreground process to suspend (stop) it;
        the job can then be resumed in the background or the foreground

*   in UNIX/Linux, each signal has a numerical code and a name

    num code   name
    --------   -------
*   1          SIGHUP     terminal hangup
*   2          SIGINT     terminal interrupt
*   3          SIGQUIT    terminal quit (with a memory dump in a file core)
*   9          SIGKILL    process killed
    *   notice that numerical code of 9!
    *   when you run the command:

        kill -9 <processid-or-job-num>

        ...you are asking for SIGKILL

*   13         SIGPIPE    broken pipe (writing when the reader has terminated)
*   14         SIGALRM    alarm clock interrupt
*   15         SIGTERM    software termination
*   23         SIGCONT    continue job if stopped
    *   the fg command sends this signal, for example

*   24         SIGSTOP    noninteractive stop signal
    *   stops the job -- suspends its activity but does not
        terminate it; it can be resumed later

*   26         SIGTTIN    read attempted by a background job
*   27         SIGTTOU    write attempted by a background job
    *   by default, these signals stop a job that is running
        in the background

*   NOTE: the numerical codes for 1 through 15 above are the same
    nearly everywhere, but the codes for the job-control signals
    (SIGCONT, SIGSTOP, SIGTTIN, SIGTTOU) differ between UNIX/Linux
    versions; on x86 Linux, for example, they are 18, 19, 21, and 22.
    *   the command kill -l lists the signal names and numbers
        on the system you are using

*   programs such as kill normally use SIGTERM to kill another process --
    *   the receiving process can catch it and choose to continue;
    *   BUT a SIGKILL cannot be caught or ignored
        (and that's kill with a -9 option) --
        that process will be killed
        (although a process stuck in an uninterruptible system call
        may not go away until that call returns)

=====
more fun with patterns, part 1: more GLOBBING options!
=====
*   fun facts:
    *   some shells call file globbing by different names:
        *   pathname expansion
        *   filename expansion

    *   globbing is NOT built into the UNIX file mechanism;
        it is recognized by various shells!
        *   the *shell* expands the file globbing pattern
            into a space-separated list of matching filenames
            BEFORE the command is run

*   you already know:
    *       - matches ZERO OR MORE characters (any characters)

    Here are a few more:
    ?       - matches EXACTLY ONE character (any character)
    [ ... ] - matches a SINGLE character from those within the [ ]

              if you prefer:
              [cset] - matches any single character in cset

    *   [moxie]      - matches a single m or o or x or i or e
        [0123456789] - matches a single digit
        *.[ch]*      - here, the file's suffix must start with c or h
                       (nice for matching C++ source code files...!)

    *   ranges are also supported in [ ]!
        [0-9]   same as [0123456789]
        [a-z]   same as [abcdefghijklmnopqrstuvwxyz]
        ...etc.

        *   these ranges ARE inclusive!
            [d-f] DOES match d or e or f

    *   there are also a number of predefined sets (!!)
        *   within the usual [ ], you put another [, then a :,
            then the set name, then another :, then another ]

        [[:digit:]] - same as [0-9] and [0123456789]
        [[:alpha:]] - will match the set of [a-zA-Z] PLUS any other
                      characters considered letters in your locale
                      similarly, [[:upper:]] and [[:lower:]]
        [[:space:]] - matches characters such as space, newline, and more
        [[:blank:]] - matches just space and tab
        [[:cntrl:]] - matches control characters

    *   within [ ], can use a ! as the first character to indicate
        you want to match something that ISN'T one of those

        *   starts with an uppercase letter, ends with anything BUT a digit

            [A-Z]*[![:digit:]]

        *   adding after class: ! does not HAVE to be before
            a predefined set; these also work:

            [A-Z]*[!0123456789]
            [A-Z]*[!0-9]

*   NOTE the following LIMITATIONS on these patterns:
    *   a / in the actual pathname MUST be matched by an explicit /
        in the pattern (NOT a wildcard)

    *   a . in the actual pathname that comes at the beginning
        or follows a / must similarly be matched by an explicit .
        in the pattern

*   playing around with this a bit:

    gn*.l -
        gnu.l           MATCHES
        gneiss.l        MATCHES
        gn.l            MATCHES (* matches 0 or more)
        gnu             nope!
        gn/x.l          nope! (have to explicitly include / to match it)

    ~/.[[:alpha:]]* -
        ~/.login        MATCHES
        ~/..login       nope! (the second . is not a letter,
                               so [[:alpha:]] cannot match it)
        ~/.mailrc       MATCHES
        ~/login         nope! (no . after ~/)

    */doit* -
        one/doit        MATCHES
        two/doit.c      MATCHES (. must be explicitly included at the
                                 beginning or after a /, other locations
                                 CAN be matched with wildcards)
        three/doit.cpp  MATCHES
        doit            nope (no / )

    [A-Z]*[![:digit:]] -
        Gz              MATCHES
        X               nope (each [ ] has to match exactly ONE character,
                              so the name needs at least two characters)

    *.[acAC] -
        file.a          MATCHES
        .a              no?! <-- oh, .a can only match a pattern
                                 starting with an explicit .

                                 ls .[acAC]
                                 ...WOULD list .a in its output
        file            nope
        stuff.ac        no -- [acAC] matches exactly 1 character in the set,
                              stuff.ac has TWO characters after its .

=====
more fun with patterns, part 2
UNIX/Linux regular expressions
BUT FIRST: more on the grep command
=====
*   you'd like even more expressive power?
    ...that you can use in other contexts besides representing
    lists of filenames?

*   UNIX/Linux has it! Supports:
    BRE - Basic Regular Expression
    ERE - Extended Regular Expression

*   we'll start with BREs in the context of the grep command

    grep - its name comes from the ed editor command g/re/p:
           globally search for a regular expression and print

    *   has options!
        WITHOUT options,

        grep pattern filelist

        ...returns the lines of files in the filelist that include
        the pattern; when filelist names more than one file, each
        line is shown as the filename, a colon, then the entire line
        that includes the pattern

        grep "oink" *.txt

        ...return all lines including oink within files ending in .txt

    *   just a TASTE of some grep options:
        *   -l (before the pattern) returns JUST the names of files
            with that pattern

            grep -l "oink" *

        *   -E - lets grep accept EREs as its pattern

        *   -c - only show a count of matching lines
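
*   a small sketch you could try, to see the grep behaviors above in action
    *   ASSUMPTIONS: the scratch directory and the files farm.txt, city.txt,
        and notes.log are made up for this demo (they are not from class),
        and the oink|moo pattern is just one example of something
        an ERE allows

```shell
#!/bin/sh
# Hypothetical sandbox: work in a scratch directory with made-up files.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

printf 'the pig said oink\nthe cow said moo\noink oink\n' > farm.txt
printf 'no pigs here\n' > city.txt
printf 'oink in a non-.txt file\n' > notes.log

# No options, and *.txt expands to TWO files, so each matching
# line is shown as filename, colon, then the whole line.
plain=$(grep "oink" *.txt)
echo "$plain"

# -l: JUST the names of the files that contain the pattern.
names=$(grep -l "oink" *)
echo "$names"

# -c: a count of matching LINES (a line with two oinks counts once).
count=$(grep -c "oink" farm.txt)
echo "$count"

# -E: the pattern is treated as an ERE; the | (alternation) used here
# is an illustrative choice, not something covered in these notes yet.
ere=$(grep -E "oink|moo" farm.txt)
echo "$ere"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

    *   notice that the shell, NOT grep, expands *.txt and * into the list
        of filenames before grep ever runs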