=====
CS 325 - Week 13 Asynchronous Material - Part 1
=====
=====
In this part we will:
=====
* announcements
* start discussion Transaction Management - Part 1
* prep for next class? (or maybe that will be in a later part)
* There ARE CS 325 labs on Friday, November 19th!
* Remember that you Project DESIGN Draft Milestone is due
by 11:59 pm on Friday, November 19th!
* (Homework 10 will come out this weekend, and be due 11:59 pm
on Friday, DECEMBER 3rd.)
* PLEASE remember to register for Spring 2022 when registration
window/enrollment appointment opens!
=====
* Transaction Management and Concurrency Control - part 1
=====
* concept of a TRANSACTION
* in database terms, we define this more specifically than you
may have seen before:
* starts with the concept of a transaction being
a LOGICAL UNIT of WORK (luw)
* the idea here: we'd like a transaction NOT to be
"partially" done; logically, we would like it to be
as IF it was one, indivisible unit, either completely
completed or as if it were never started;
* logical unit? ...the transaction might actually consist
of numerous separate steps, but we want it in effect to be
as if it were one single step;
* a transaction can be a real-world event (significant in some
scenario) that we want to be treated as a "single" logical piece;
* a sale of an item
* a deposit into an account
* a registration for a course
* a reservation for a plane seat
* a sale of a concert ticket
* etc.!
* consider a transfer of $100 from a bank customer's savings accounts
to their checking account:
* might actually consist of a set of steps
1. read savings account balance (into a local variable or
register)
2. reduce savings balance by 100 (in that local variable or
register)
3. write the new savings balance back into the database
4. read the checking account balance (into a local variable or
register)
5. increase the checking balance by 100 (in that local variable or
register)
6. write the new checking balance back into the database
What if, after the first 4 steps, something went wrong?
(power outage, hard drive failure, network issue, etc.)
...customer would NOT be happy! They've lost $100 from their
savings account!
* SO: we'd like a transaction to be either COMPLETED (all its steps
done), or completely not done (at least, be as if NONE of its
steps were even started)
* so, being more specific,
a transaction could be defined as a logical unit of work
that must be either entirely completed or entirely aborted;
no intermediate/partial states are acceptable.
or if you prefer:
A transaction, also known as an atomic transaction or logical unit
of work, is a series of actions (representing a single "logical" unit
of work) taken on the database such that either ALL the actions are
done successfully, or NONE are done and the database remains UNCHANGED.
* Defining a series of stesp as a transaction is an important of
concurrency control.
* Also as it relates to transactions, we want to revisit another concept:
CONSISTENCY
* A consistent database state is one in which all data integrity
constraints are satisfied.
So, a transaction should change the contents of the database
from one consistent state to another consistent state;
The DBMS needs to do all it can to help support this.
=====
ASIDE - some of the levels of DBMS support for data integrity
=====
* remember at least some of these?
* (and how DBMSs may support some of these more than others...)
* entity integrity
* if the DBMS requires unique primary keys for every relation
(and enforces them, does not allow two rows to have the same
value of that table's primary key,
THAT is an example of enforcing entity integrity
* interestingly, NOT permitting null values in any attribute
making up a primary key is also considered as enforcing
entity integrity!
* domain integrity
* if a DBMS allows you to specify that a column be from a
particular set of values (from a domain, even if just a
physical domain), and the DBMS won't allow a value for
that column that is not from that domain,
then it is considered as enforcing at least some degree of
domain integrity
* DBMSs can vary quite a bit in their support for domain integrity;
* referential integrity
* if a DBMS allows you specify that column in a table
is a foreign key referencing the primary key in another
table, and it enforces that, that DBMS is providing support
for referential integrity
e.g., you can't put in a row with a foreign key NOT in
the referenced table;
e.g., you can't update a row to have a foreign key that
is not in the referenced table;
e.g., you can't delete a row in the referenced table
if there are rows in other tables referencing that row;
* transaction integrity
* a DBMS supports this if it supports enforcing atomicity
of transactions -- it supports defining a logical set of steps
as a transaction, and supports those steps as being ALL completed
or as if they had never been started
* We'll be discussiong some ways that SQL and DBMSs might help
enforce transaction integrity
=====
The five main database transaction properties: ACIDS
=====
* there are four (or five, depending on the textbook...!)
main database transaction properties that we'd like transactions
to meet:
* *A*tomicity
* *C*onsistency
* *I*solation
* *D*urability
* *S*erializability
* Atomicity - what you think it is! requires that all parts of
a transaction be completed; and if not, any partial actions
be able to be "rolled back" such that the result is as if NONE
of those steps were done
* Consistency - a transaction must transform the database from one
consistent state to another consistent state
* so, before the transaction, the database is in a consistent state;
* AFTER the transaction, the database has been transformed to
a NEW consistent state (or the transaction's actions have
to be rolled back... so the database is back to its previous
consistent state)
* Isolation - means the data used during the execution of a transaction
cannot be used by a second transaction until the first one is
completed (at least, the EFFECT must be as if this was the case)
That is, transactions should execute independently of one another --
partial effects of incomplete transactions should not be visible
to other transactions.
^ it is the responsibility of the concurrency contorl subsystem
to ensure isolation
* Durability - indicates the permanence of the database' consistent
state.
don't confuse this with the ACIDS property of consistency --
that consistency property means that a *transaction* *transforms*
the database FROM one consistent state to ANOTHER state;
DURABILITY means the consistent state of the database is
maintained BETWEEN transactions;
That is, the effects of committed transactions are permanently
recorded so that between transactions they are not lost,
even if power failures, fires, equipment malfunctions, etc.
cause memory failures;
If durability is supported by a DBMS, the DBMS provides facilities
to recover the changes from committed transactions as needed in'
the face of such issues
* Serializability - you have this property when the concurrent
execution of transactions is EQUIVALENT to any of the cases
where the transactions executed SERIALLY in some arbitrary order.
If I have transactions A, B, and C all done at the SAME time,
interleaving their steps doesn't mess up their results -
the results when completed are as if you had done ONE of the
serial orderings of these (as if they had been done one after the
other, rather than at the same time)
the results are as if you did A then B then C
OR B then A then C
OR C then A then B
OR C then B then A
OR B then C then A
OR A then C then B
=====
how CAN SQL support transaction atomicity?
=====
* ...well, the two statements:
commit;
rollback;
commit; <-- intended to permanently record all changes since
the previous commit of the database
rollback; <-- "roll" back, or undo, all changes made since the previous
commit (so the database is back at the state of the
previous commit)
* so the idea is: do a commit, do the actions of the transaction,
ONLY when all the actions are successfully done
do you do a final commit (to complete that
transaction);
if anything goes wrong along the way,
a rollback should be done so it is as if that
transaction had never even been started;
=====
ONE example of a way to support DURABILITY:
transaction logs
=====
* a transaction log keeps track of the effects of all transactions
that change database since the previous complete **backup**
* before each change to the database,
record that change to the log
(in chronological order)
IF a failure occurs, ...
in a ROLLFORWARD approach,
* you'd restore the database to its PREVIOUS backup state,
then go through the transaction log, ONLY redoing, in order,
the steps from transactions that had been COMMITTED at
the time of the failure.
(you DO NOT redo steps from ROLLED BACK or NOT-YET-COMMITTED
transactions!)
(after the next complete backup, you start a new transaction log)
(BY the way, there can be all sorts of approaches here --
rollback: you instead, after a failure,
just ROLL BACK the changes from transactions that had
not yet been committed at the time of the failure.)
====
in part 2, we'll talk about just a few examples of algorithms
for CONCURRENCY CONTROL from three of the common classes of
such algorithms:
* locks
* time stamping
* optimistic methods