325lect13-2-projected


=====
CS 325 - Week 13 Asynchronous Material - Part 1
=====

=====
In this part we will:
=====
*   announcements
*   start discussion Transaction Management - Part 1
*   prep for next class? (or maybe that will be in a later part)

*   There ARE CS 325 labs on Friday, November 19th!

*   Remember that you Project DESIGN Draft Milestone is due
    by 11:59 pm on Friday, November 19th!

*   (Homework 10 will come out this weekend, and be due 11:59 pm
    on Friday, DECEMBER 3rd.)

*   PLEASE remember to register for Spring 2022 when registration
    window/enrollment appointment opens!

=====
*   Transaction Management and Concurrency Control - part 1
=====
*   concept of a TRANSACTION

*   in database terms, we define this more specifically than you
    may have seen before:
    *   starts with the concept of a transaction being
        a LOGICAL UNIT of WORK (luw)

    *   the idea here: we'd like a transaction NOT to be
        "partially" done; logically, we would like it to be
	as IF it was one, indivisible unit, either completely
	completed or as if it were never started;

    *   logical unit? ...the transaction might actually consist
        of numerous separate steps, but we want it in effect to be
	as if it were one single step;

    *   a transaction can be a real-world event (significant in some
        scenario) that we want to be treated as a "single" logical piece;
	*   a sale of an item
	*   a deposit into an account
	*   a registration for a course
	*   a reservation for a plane seat
	*   a sale of a concert ticket
	*   etc.!

    *   consider a transfer of $100 from a bank customer's savings accounts
        to their checking account:
	*   might actually consist of a set of steps

        1.  read savings account balance (into a local variable or
	    register)
	2.  reduce savings balance by 100 (in that local variable or
	    register)
	3.  write the new savings balance back into the database
	4.  read the checking account balance (into a local variable or
            register)
	5.  increase the checking balance by 100 (in that local variable or
            register)
	6.  write the new checking balance back into the database

        What if, after the first 4 steps, something went wrong?
	(power outage, hard drive failure, network issue, etc.)

	...customer would NOT be happy! They've lost $100 from their
        savings account!

    *   SO: we'd like a transaction to be either COMPLETED (all its steps
        done), or completely not done (at least, be as if NONE of its
	steps were even started)

*   so, being more specific,
    a transaction could be defined as a logical unit of work
    that must be either entirely completed or entirely aborted;
    no intermediate/partial states are acceptable.

    or if you prefer:
    A transaction, also known as an atomic transaction or logical unit
    of work, is a series of actions (representing a single "logical" unit
    of work) taken on the database such that either ALL the actions are
    done successfully, or NONE are done and the database remains UNCHANGED.

*   Defining a series of stesp as a transaction is an important of
    concurrency control.

*   Also as it relates to transactions, we want to revisit another concept:
    CONSISTENCY

    *   A consistent database state is one in which all data integrity
        constraints are satisfied.

        So, a transaction should change the contents of the database
	from one consistent state to another consistent state;

        The DBMS needs to do all it can to help support this.

=====
ASIDE - some of the levels of DBMS support for data integrity
=====
*   remember at least some of these?
*   (and how DBMSs may support some of these more than others...)
    *   entity integrity
        *   if the DBMS requires unique primary keys for every relation
	    (and enforces them, does not allow two rows to have the same
	    value of that table's primary key,
	    THAT is an example of enforcing entity integrity

        *   interestingly, NOT permitting null values in any attribute
	    making up a primary key is also considered as enforcing
	    entity integrity!

    *   domain integrity
        *   if a DBMS allows you to specify that a column be from a
	    particular set of values (from a domain, even if just a
	    physical domain), and the DBMS won't allow a value for
	    that column that is not from that domain,
	    then it is considered as enforcing at least some degree of
	    domain integrity

        *   DBMSs can vary quite a bit in their support for domain integrity;

    *   referential integrity
        *   if a DBMS allows you specify that column in a table
	    is a foreign key referencing the primary key in another
	    table, and it enforces that, that DBMS is providing support
	    for referential integrity

	    e.g., you can't put in a row with a foreign key NOT in
	    the referenced table;
	    e.g., you can't update a row to have a foreign key that
	    is not in the referenced table;
	    e.g., you can't delete a row in the referenced table
	    if there are rows in other tables referencing that row;

    *   transaction integrity
        *   a DBMS supports this if it supports enforcing atomicity
	    of transactions -- it supports defining a logical set of steps
	    as a transaction, and supports those steps as being ALL completed
	    or as if they had never been started

        *   We'll be discussiong some ways that SQL and DBMSs might help
	    enforce transaction integrity

=====
The five main database transaction properties: ACIDS
=====
*   there are four (or five, depending on the textbook...!)
    main database transaction properties that we'd like transactions
    to meet:

    *   *A*tomicity
    *   *C*onsistency
    *   *I*solation
    *   *D*urability
    *   *S*erializability

    *   Atomicity - what you think it is! requires that all parts of
        a transaction be completed; and if not, any partial actions
	be able to be "rolled back" such that the result is as if NONE
	of those steps were done

    *   Consistency - a transaction must transform the database from one
        consistent state to another consistent state

        *   so, before the transaction, the database is in a consistent state;
	*   AFTER the transaction, the database has been transformed to
	    a NEW consistent state (or the transaction's actions have
	    to be rolled back... so the database is back to its previous
	    consistent state)

    *   Isolation - means the data used during the execution of a transaction
        cannot be used by a second transaction until the first one is
	completed (at least, the EFFECT must be as if this was the case)

        That is, transactions should execute independently of one another --
	partial effects of incomplete transactions should not be visible
	to other transactions.
	^ it is the responsibility of the concurrency contorl subsystem
	  to ensure isolation

    *   Durability - indicates the permanence of the database' consistent
        state.

        don't confuse this with the ACIDS property of consistency --
	that consistency property means that a *transaction* *transforms*
	the database FROM one consistent state to ANOTHER state;

        DURABILITY means the consistent state of the database is
	maintained BETWEEN transactions;

        That is, the effects of committed transactions are permanently
	recorded so that between transactions they are not lost,
	even if power failures, fires, equipment malfunctions, etc.
	cause memory failures;

        If durability is supported by a DBMS, the DBMS provides facilities
	to recover the changes from committed transactions as needed in'
	the face of such issues

   *   Serializability - you have this property when the concurrent
       execution of transactions is EQUIVALENT to any of the cases
       where the transactions executed SERIALLY in some arbitrary order.

       If I have transactions A, B, and C all done at the SAME time,
       interleaving their steps doesn't mess up their results -
       the results when completed are as if you had done ONE of the
       serial orderings of these (as if they had been done one after the
       other, rather than at the same time)
       the results are as if you did A then B then C
                                  OR B then A then C
				  OR C then A then B
				  OR C then B then A
				  OR B then C then A
				  OR A then C then B

=====
how CAN SQL support transaction atomicity?
=====
*   ...well, the two statements:
    commit;
    rollback;

    commit; <-- intended to permanently record all changes since
                the previous commit of the database

    rollback; <-- "roll" back, or undo, all changes made since the previous
                  commit (so the database is back at the state of the
		  previous commit)

    *   so the idea is: do a commit, do the actions of the transaction,
                        ONLY when all the actions are successfully done
			do you do a final commit (to complete that
			transaction);
			if anything goes wrong along the way,
			a rollback should be done so it is as if that
			transaction had never even been started;

=====
ONE example of a way to support DURABILITY:
   transaction logs
=====
*   a transaction log keeps track of the effects of all transactions
    that change database since the previous complete **backup**

*   before each change to the database,
    record that change to the log
    (in chronological order)

    IF a failure occurs, ...
    in a ROLLFORWARD approach,
    *   you'd restore the database to its PREVIOUS backup state,
        then go through the transaction log, ONLY redoing, in order,
	    the steps from transactions that had been COMMITTED at
	    the time of the failure.

        (you DO NOT redo steps from ROLLED BACK or NOT-YET-COMMITTED
	     transactions!)

    (after the next complete backup, you start a new transaction log)

    (BY the way, there can be all sorts of approaches here --
    rollback: you instead, after a failure,
        just ROLL BACK the changes from transactions that had
	not yet been committed at the time of the failure.)

====
in part 2, we'll talk about just a few examples of algorithms
for CONCURRENCY CONTROL from three of the common classes of
such algorithms:
*   locks
*   time stamping
*   optimistic methods