Please send questions to
st10@humboldt.edu .
what is a database?
* definition 1
a collection of tables, holding information about
different interrelated entities;
* Software such as Oracle, Access, MS SQL Server,
FoxPro, MySQL, PostgreSQL... are all examples
of DATABASE MANAGEMENT SYSTEMS, DBMS's
* a DBMS serves as the INTERFACE between a
person and a database, or between
an application and a database
...managing the database structure,
controlling access to the database,
providing various tools to the users
to help them in managing their
data;
providing a convenient ABSTRACTION for
the data;
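
A minimal sketch of the "DBMS as interface" idea, assuming SQLite (the DBMS bundled with Python's sqlite3 module) stands in for the DBMS; the student table and its columns are hypothetical names made up for this example:

```python
import sqlite3

# SQLite plays the role of the DBMS; the table and column names
# below are hypothetical, invented only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student (stu_id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob")],
)

# The application says WHAT data it wants; the DBMS decides HOW
# to locate it in whatever storage structures it manages.
names = [row[0] for row in conn.execute(
    "SELECT name FROM student ORDER BY stu_id")]
print(names)
conn.close()
```

Note that the application never opens a data file or parses a record itself; every request goes through the DBMS.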
* database design is not the design of the DBMS
software you use to create a database --
it is the design of the database structure
that will be used to store and manage the
data...
* in here, we are primarily concerned with
OPERATIONAL databases, used for day-to-day
activities within a scenario;
* even a *good* DBMS will work poorly
with a poorly-designed database;
* It is a classic approach to compare/contrast
a FILE-PROCESSING SYSTEM -- with one's data all
in separate files -- versus a DATABASE-PROCESSING
SYSTEM -- where your data is in tables managed by
a DBMS
file: two definitions:
* one: as a stream of characters (stream-based
file)
* another -- the old, mainframe view --
as a collection of related records,
where each record is a collection of logically
connected fields, and
each field is a character or group of characters
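
The record-based view can be sketched as follows, assuming a hypothetical fixed-width layout (the field names and widths here are invented for illustration, not taken from any real system):

```python
# Hypothetical layout: each record is 25 characters long and is made
# of three fields (id: 5 chars, name: 15 chars, dept: 5 chars).
FIELDS = [("id", 5), ("name", 15), ("dept", 5)]
RECORD_LEN = sum(width for _, width in FIELDS)

def parse_record(record):
    """Split one fixed-width record into a dict of named fields."""
    result = {}
    pos = 0
    for name, width in FIELDS:
        result[name] = record[pos:pos + width].strip()
        pos += width
    return result

def parse_file(data):
    """Treat a string as a collection of back-to-back records."""
    return [parse_record(data[i:i + RECORD_LEN])
            for i in range(0, len(data), RECORD_LEN)]

# Build sample file contents by padding each field to its width.
data = "".join(
    rec_id.ljust(5) + name.ljust(15) + dept.ljust(5)
    for rec_id, name, dept in [("00001", "Ann Smith", "MATH"),
                               ("00002", "Bob Jones", "CS")]
)
records = parse_file(data)
print(records)
```

Notice that the program has to know the exact field widths; that knowledge living inside the application is where the application program dependency comes from.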
* in the 1960's -- companies were producing more and more,
and larger and larger, collections of record-based files...
...bumped right into some of the limitations
of file-processing systems; [Kroenke]
* separated and isolated data
* unnecessary data duplication
* application program dependency
* difficulty in representing data in the users'
perspective
* it's not that files are "bad" -- it is just that,
as the quantity of data grows, it gets
more unwieldy to deal with the data in
different files;
* Database technology was developed largely to overcome
the limitations of file-processing systems;
a key idea: the "lower"-level file details are
HIDDEN from applications by the DBMS;
...applications see what the DBMS provides,
and don't care how the DBMS actually stores the data;
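
One way to see this hiding at work, sketched with SQLite as the stand-in DBMS (the part table, its columns, and the part_descriptions function are all hypothetical): the application function below keeps working, unchanged, after the database structure changes underneath it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for this sketch.
conn.execute("CREATE TABLE part (part_num INTEGER PRIMARY KEY, descr TEXT)")
conn.execute("INSERT INTO part VALUES (10, 'widget')")

def part_descriptions(connection):
    """Application code: names the columns it wants and nothing
    about how or where the DBMS stores them."""
    return [row[0] for row in connection.execute(
        "SELECT descr FROM part ORDER BY part_num")]

before = part_descriptions(conn)

# The database structure changes (a new column is added)...
conn.execute("ALTER TABLE part ADD COLUMN price REAL")
conn.execute("INSERT INTO part VALUES (20, 'gadget', 4.5)")

# ...and the application function above runs without modification.
after = part_descriptions(conn)
print(before, after)
conn.close()
```

With a fixed-layout file, adding a field would typically change every record's length and force each program that reads the file to be updated.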
POTENTIAL advantages of the database processing
approach:
* integrated data
* less unnecessary data duplication
* increased application program independence
* easier representation of the users' perspective
* of course, DBMS's have disadvantages/costs, too...
* there is some complexity
* size (DBMS's themselves can be big, the
related METADATA that goes with the data in a
database may be considerable, etc.)
...will it be more than the savings in
less-duplicated data? Hard to say...
* cost of the DBMS
* cost of conversion
* performance (because a DBMS is a general
tool, it might not be as fast as a customized
application going right to the data it needs in
a file);
* might have to be more concerned about a
"single point of failure";
SECOND definition of a database:
a database is a self-describing collection of integrated
records;
* self-describing;
means that the database contains METADATA, data
about the data, in addition to the data itself;
...the database contains the information about the
structure of the data;
* often this metadata is in the form of
a DATA DICTIONARY;
* why? because this description can be used
by a program to determine what the database
contains;
it can help promote application program
independence;
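
A small sketch of "self-describing", assuming SQLite as the DBMS: its built-in sqlite_master catalog and PRAGMA table_info play the role of the data dictionary here. The course table is a hypothetical example; other DBMS's expose their metadata in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for this sketch.
conn.execute(
    "CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT, units INTEGER)")

# sqlite_master is SQLite's own catalog: data ABOUT the data.
table_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(course)")]

print(table_names)
print(columns)
conn.close()
```

A program can discover the tables and columns at run time this way, instead of having the structure hard-coded into it.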
* collection of integrated records
* you not only have source data -- you also have
information about how the data is related;
...this makes the data integrated;
* really, it is a model of a model...
* a database is a model, not of reality,
but of the USER's model of their world;
* early databases were optimized for dealing with
regular transactions at the organizational
level -- they were FAST at those regular transactions
...but... they weren't very flexible;
* then: 1970 - E. F. Codd -- developed the
relational database model
* based on relational algebra;
* you consider data to be stored in the
form of relations (more on this later!)
* all the relational operations are available
on your data -- VERY flexible!
...this encourages more creative use of one's data!
* there IS overhead here -- and at first, no
one thought relational DBMS's could ever be
practical -- but thanks to Moore's Law,
they did become practical;
most new databases now use either the relational
or an object-relational model;
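
To give a rough feel for the relational operations mentioned above, here is a plain-Python sketch of three of them (selection, projection, natural join). Relations are modeled as lists of dicts; the sample data and function names are hypothetical, and a real relational DBMS implements these operations far more efficiently.

```python
# Hypothetical sample relations, each a list of rows (dicts).
students = [
    {"stu_id": 1, "name": "Ann", "major": "CS"},
    {"stu_id": 2, "name": "Bob", "major": "MATH"},
]
enrollments = [
    {"stu_id": 1, "course": "CS 101"},
    {"stu_id": 2, "course": "MATH 101"},
    {"stu_id": 1, "course": "MATH 101"},
]

def select(relation, predicate):
    """Selection: keep only the rows satisfying the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attrs):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen = []
    for row in relation:
        slim = {a: row[a] for a in attrs}
        if slim not in seen:
            seen.append(slim)
    return seen

def natural_join(left, right):
    """Natural join: combine rows that agree on all shared attributes."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined

# Who is enrolled in MATH 101? Join, then select, then project.
in_math = select(natural_join(students, enrollments),
                 lambda row: row["course"] == "MATH 101")
names_in_math = project(in_math, ["name"])
courses = project(enrollments, ["course"])
print(names_in_math, courses)
```

Because each operation takes relations in and gives a relation back, the operations can be combined freely, which is where the flexibility comes from.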