Introduction
- Data is recorded facts and numbers.
- The easiest way to think of data is as a table.
- My class rosters
- Your scores in a class.
- List of citations issued on a given night
- Data at one time in our history was fairly limited.
- But the modern information age has made data rapidly available in large quantities.
- Information
- Knowledge derived from data.
- Data presented in a meaningful context
- Data processed by summing, ordering, averaging, grouping, comparing or other similar operations.
- A Database is an organized collection of data. (Wikipedia)
- Simple databases, such as spreadsheets, log books, stacks of forms have existed for a long time.
- In arelational database data is organized in tables, with relationships between tables serving as a fundamental tool to organize the data.
- A object oriented database treats data as objects
- NoSQL or not only SQL databases are a newer invention driven by massively distributed databases.
- These are document oriented databases, think XML documents.
- Big data
- A Database Management System (DBMS) is a software application that manages the storage and interaction with a database or multiple databases.
- We will use MySQL
- Others include PostgreSQL, Microsoft SQL Server, Oracle, SAP, IBM DB2, ...
- We will, for the most part, focus on relational databases.
- Tables
- Data is stored in tables, consisting of rows and columns.
- A table should contain information about a single "thing".
- For example, a conference might have a table of
- Students in a university.
- Classes at a university.
- Grades for students in classes at a university.
-
- Each column in a table represents an attribute
- Think field in a record.
- First name, last name, ...
- Each row in a table represents an instance
- Or one individual thing.
- A collection of all of the attributes.
- This use of rows and columns are by convention
- They could be transposed.
- Kroenky says that 99.999% of the dbms systems in the world
- Naming Conventions
- For the book and for this class.
- TABLES are all named with upper case letters, with the underscore separating words ( STUDENT, STUDENT_RECORD)
- ColumnNames will appear as CamelCase. ClassName, Grade, FirstName ...
- Relationships
- Relationships link tables forming compound instances.
-
- The first two tables are ok, but the third is meaningless.
- Relations between tables provide that meaning
-
- Keys
- Tables are required to have keys which uniquely identify each instance.
- For example,
- In the CLASS table, ClassNumber is the key
- Each class has a unique number.
- If I know that I am in Class 10, I know that it is Chem 101, section 1, fall 2012
- Keys are used to identify an instance in a table.
- A table may have multiple keys, but one is selected to be the primary key
- Other keys are called alternate keys
- Sometimes a key can be a natural attribute of an instance
- Consider, for example, university names in a conference database.
-
- The name, or address could be the primary key.
- The other would be an alternate key.
- Other times, an artificial key must be used.
- This is called a surrogate key.
- It is generated by the DBMS (usually).
- The DBMS ensures that these keys are unique.
- At times, a key may be constructed from two or more attributes
- This is called a composite key
- The GRADE table contains a composite key, (StudentNumber, ClassNumber)
- If the key is constructed from an attribute stored in another table, it is a foreign key
- In the GRADE table, (StudentNumber, ClassNumber) constitute a foreign key.
- Foreign Keys are the mechanism for constructing relationships.
- We will spend quite a bit of time discussing these concepts in more detail later.