Please send questions to st10@humboldt.edu .
what is a database?
* definition 1 
  a collection of tables, holding information about
  different interrelated entities;

*   Software such as Oracle, Access, MS SQL Server,
    FoxPro, MySQL, PostgreSQL... are all examples
    of DATABASE MANAGEMENT SYSTEMS, DBMS's

    *   a DBMS serves as the INTERFACE between a
        person and a database, or between
        an application and a database

        ...managing the database structure,
           controlling access to the database,
           providing various tools to the users
               to help them in managing their
               data, and
           providing a convenient ABSTRACTION for
               the data;
*   database design is not the design of the DBMS
    software you use to create a database --

    it is the design of the database structure
    that will be used to store and manage the
    data...

*   in here, we are primarily concerned with
    OPERATIONAL databases, used for day-to-day
    activities within a scenario;

*   even a *good* DBMS will work poorly
    with a poorly-designed database;

*   It is a classic approach to compare/contrast
    a FILE-PROCESSING SYSTEM -- with one's data all
    in separate files -- with a DATABASE-PROCESSING
    SYSTEM -- where one's data is in tables managed by
    a DBMS;

    file: two definitions:
    *   one: as a stream of characters (stream-based
        file)

    *   another -- the old, mainframe view --
        as a collection of related records,
        where each record is a collection of logically
            connected fields, and
        each field is a character or group of characters
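
A minimal sketch of the record-based view of a file, in Python. The
fixed-width layout, the field names, and the sample data are all
hypothetical, made up for illustration. Notice that the layout lives
in the program, not in the file itself:

```python
# Hypothetical fixed-width layout: (field name, start column, end column).
# Each field is a group of characters at a fixed position in the record.
LAYOUT = [("emp_id", 0, 4), ("last_name", 4, 14), ("dept", 14, 18)]


def parse_record(line):
    """Split one fixed-width record into a dict of named fields."""
    return {name: line[start:end].strip() for name, start, end in LAYOUT}


# A "file" in the record-based sense: one record per line.
file_contents = (
    "0001Smith     ACCT\n"
    "0002Jones     MIS \n"
)

records = [parse_record(line) for line in file_contents.splitlines()]
```

If the file's layout changes (say, last_name grows to 20 characters),
every program that carries its own copy of LAYOUT has to change too.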

*   in the 1960's -- companies were producing more and more
    record-based files, in larger and larger collections...
    ...and bumped right into some of the limitations
    of file-processing systems; [Kroenke]

    *   separated and isolated data
    *   unnecessary data duplication
    *   application program dependency
    *   difficulty in representing data in the users'
        perspective

    *   it's not that files are "bad" -- it is just
        that, as the quantity of data grows, it gets
        more unwieldy to deal with the data in
        different files;

*   Database technology was developed largely to overcome
    the limitations of file-processing systems;

    a key idea: the "lower"-level file details are
    HIDDEN from applications by the DBMS;
    ...applications see what the DBMS provides,
       and don't care how the DBMS actually stores the data;

    POTENTIAL advantages of the database processing
    approach:
    *   integrated data
    *   less unnecessary data duplication
    *   increased application program independence
    *   easier representation of the users' perspective
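
A small sketch of the "hidden details" idea above, using Python's
built-in sqlite3 module with SQLite standing in as the DBMS. The table
and column names are hypothetical. The application only says WHAT it
wants; it never sees file offsets or record layouts:

```python
import sqlite3

# An in-memory SQLite database stands in for "a database managed by a DBMS".
conn = sqlite3.connect(":memory:")

# The application describes the structure once, to the DBMS.
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, last_name TEXT, dept TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Smith", "ACCT"), (2, "Jones", "MIS")],
)

# The application asks for rows by name and condition; how and where the
# DBMS stores those rows is not the application's concern.
rows = conn.execute(
    "SELECT last_name FROM employee WHERE dept = ? ORDER BY last_name",
    ("MIS",),
).fetchall()
```

Compare this with a file-processing program, which would have to open
the file, know each field's position, and scan the records itself.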

*   of course, DBMS's have disadvantages/costs, too...
    *   there is some complexity
    *   size (DBMS's themselves can be big, the
        related METADATA that goes with the data in a
	database may be considerable, etc.)

        ...will it be more than the savings in
	less-duplicated data? Hard to say...

    *   cost of the DBMS
    *   cost of conversion
    *   performance (because a DBMS is a general
        tool, it might not be as fast as a customized
        application going right to the data it needs in
        a file);
    *   might have to be more concerned about a
        "single point of failure";

SECOND definition of a database:
a database is a self-describing collection of integrated
records;

*   self-describing;
    means that the database contains METADATA, data
    about the data, in addition to the data itself;

    ...the database contains the information about the
    structure of the data;

    *   often this metadata is in the form of
        a DATA DICTIONARY;

    *   why? because this description can be used
        by a program to determine what the database
	contains;
        it can help promote application program
	independence;
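
One way to see "self-describing" in practice, sketched with Python's
sqlite3 module. SQLite is just one DBMS (other DBMS's expose their
metadata differently), and the employee table is hypothetical. The
program asks the database itself what it contains:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, last_name TEXT, dept TEXT)"
)

# SQLite keeps its metadata in a built-in table named sqlite_master.
table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2])
    for row in conn.execute("PRAGMA table_info(employee)")
]
```

Here the program did not need a hard-coded record layout; it got the
table and column descriptions from the database.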

*   collection of integrated records
    *   you not only have source data -- you also have
        information about how the data is related;
        ...this makes the data integrated;
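
A sketch of what "information about how the data is related" can look
like, again using Python's sqlite3 module. The dept and employee tables
are hypothetical. The relationship is declared to the DBMS as a foreign
key, so the DBMS itself can enforce it and use it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is turned on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE dept (dept_code TEXT PRIMARY KEY, dept_name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    "emp_id INTEGER PRIMARY KEY, last_name TEXT, "
    "dept_code TEXT REFERENCES dept(dept_code))"
)

conn.execute("INSERT INTO dept VALUES ('MIS', 'Management Information Systems')")
conn.execute("INSERT INTO employee VALUES (1, 'Jones', 'MIS')")

# An employee in a department that does not exist breaks the declared
# relationship, so the DBMS refuses the row.
try:
    conn.execute("INSERT INTO employee VALUES (2, 'Smith', 'NOPE')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The same relationship lets related rows be combined on request.
joined = conn.execute(
    "SELECT e.last_name, d.dept_name "
    "FROM employee e JOIN dept d ON e.dept_code = d.dept_code"
).fetchall()
```

With two separate files, nothing would stop a program from writing the
'NOPE' record; the relationship would exist only in programmers' heads.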

*   really, it is a model of a model...
    *   a database is a model, not of reality,
        but of the USER's model of their world;

*   early databases were optimized for dealing with
    regular transactions at the organizational
    level -- they were FAST at those regular transactions

    ...but... they weren't very flexible;

*   then: 1970 - E. F. Codd -- developed the
    relational database model
    *   based on relational algebra;

    *   you consider data to be stored in the
        form of relations (more on this later!)

    *   all the relational operations are available
        on your data -- VERY flexible!

        ...encourage more creative use of one's data!

    *   there IS overhead here -- and at first, no
        one thought relational DBMS's could ever be
        practical -- but thanks to Moore's Law,
        they did become practical;

        most new databases now use either the relational
        or an object-relational model;
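
To give a feel for the relational operations mentioned above, here is a
toy sketch in plain Python of three of them (select, project, and
natural join), with a relation represented as a list of dicts. This is
an illustration under simplifying assumptions, not how a real DBMS
implements them, and the employee and dept data are hypothetical:

```python
def select(relation, predicate):
    """Keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attrs):
    """Keep only the named attributes, dropping duplicate rows."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result


def natural_join(left, right):
    """Combine rows that agree on every attribute name they share."""
    result = []
    for l_row in left:
        for r_row in right:
            common = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in common):
                result.append({**l_row, **r_row})
    return result


employee = [
    {"emp_id": 1, "last_name": "Jones", "dept_code": "MIS"},
    {"emp_id": 2, "last_name": "Smith", "dept_code": "ACCT"},
    {"emp_id": 3, "last_name": "Lee", "dept_code": "MIS"},
]
dept = [
    {"dept_code": "MIS", "dept_name": "Info Systems"},
    {"dept_code": "ACCT", "dept_name": "Accounting"},
]

# Operations compose freely: join, then select, then project.
mis_names = project(
    select(natural_join(employee, dept), lambda r: r["dept_code"] == "MIS"),
    ["last_name"],
)
depts_used = project(employee, ["dept_code"])
```

Because each operation takes relations in and gives a relation back,
the operations can be chained in whatever order a question requires,
which is where the flexibility comes from.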