Homework 5

Design and implement a database from existing data.

This homework is worth 75 points.

Goals: When you finish this homework you should be able to:

Assignment

For this assignment you will use data in the file /home/CS/dbennett/DB_HW5/HW5Dataset. This file is located on the system and has read permission for all users. You are encouraged to reference the copy in my directory and not make new copies of this data.

The file contains movie ratings from Movie Lens, which I have modified. Each line in the file contains a user's rating for a movie. The format of the line is as follows:

The fields in this file are separated by a vertical bar (|), which is not part of the data.

The following are the first four lines in the file:

Caleb|Cross|49|M|Writer|Kolya |24-Jan-1997|Comedy|3
Ester|Barr|39|F|Executive|L.A. Confidential |01-Jan-1997|Crime,Film-Noir,Mystery,Thriller|3
Blake|Leach|25|M|Writer|Heavyweights |01-Jan-1994|Children's,Comedy|1
Stephan|Villarreal|28|M|Technician|Legends of the Fall |01-Jan-1994|Drama,Romance,War,Western|2

The Genre field is problematic for our level of knowledge. You may solve this problem several ways, including a program to prepossess the data file, a collection of queries to deal with this data, or other commands. Acceptable, but not preferred, is to allow this multi-value dependency to exist in your final database. You must, however, deal with this data in your design.

Using the techniques discussed in chapters 1-4, create a database in BCNF to represent this data. These files must be in your personal database in MySQL on the CSCS313Lin machine. Document each stage of this process, including:

Your client wishes to provide information about the rating of movies. To demonstrate this, create queries to answer the following questions (using your normalized database, not the raw data file)

  1. How many individuals provided ratings?
  2. Were there any two raters with the same name?
  3. What is the top rated movie?
  4. What was the top rated war movie among men?
  5. What was the top rated comedy among women who were less than 30?
  6. What was the move that Doctors enjoyed the most?
  7. List the top 5 rated movies
  8. What occupation provides the fewest number of ratings?
  9. What user provided the most/fewest ratings?

Report

You must provide the following documentation in the form of a written report: The following should be placed in an appendix of your report: I have access to your MySQL database, so I will examine your final set of tables when I grade this project.

Submission

Your answers must be typed. Print out the report and turn it in by class time on the due date. Keep the tables associated with this database on the MySQL server for the class until you receive feedback for this assignment.