Project

The course description specifies a semester-length project. This project has two objectives:

To meet the first objective, you will need to produce a number of reports, a presentation, and working spreadsheets that demonstrate mastery of the software discussed.

The second objective will be demonstrated by the content of these documents.

General Description

For this project you must select and analyze a dataset. The dataset should contain at least 100 records, with enough fields to give you something to analyze. It is also strongly preferable that your dataset contain some numeric data. This is not a requirement, but it will make your work in the class much simpler.

For the most part you should avoid summary data, since most of the analysis has already been done. However, a summary dataset of sufficient size may be acceptable. If you are unsure whether a dataset is suitable, please discuss it with your instructor.
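If you are comfortable with a little Python, the size and numeric-data guidelines above can be checked before you commit to a dataset. The following is only a rough sketch, not part of the course requirements: it assumes the dataset is a CSV file with a header row, and the names `summarize`, `is_number`, and `MIN_RECORDS` are illustrative. It cannot judge whether a dataset is summary data, so it does not replace a conversation with your instructor.

```python
import csv
import io

MIN_RECORDS = 100  # minimum size taken from the project description


def is_number(text):
    """Return True if the text can be read as a plain number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize(csv_file):
    """Count records and list fields whose non-empty values are all numeric."""
    reader = csv.DictReader(csv_file)
    fields = reader.fieldnames or []
    seen = {name: 0 for name in fields}     # non-empty values per field
    numeric = {name: 0 for name in fields}  # numeric values per field
    records = 0
    for row in reader:
        records += 1
        for name in fields:
            value = (row[name] or "").strip()
            if value:
                seen[name] += 1
                if is_number(value):
                    numeric[name] += 1
    numeric_fields = [n for n in fields if seen[n] > 0 and seen[n] == numeric[n]]
    return {
        "records": records,
        "fields": len(fields),
        "numeric_fields": numeric_fields,
        "large_enough": records >= MIN_RECORDS,
    }


# Tiny made-up sample, only to show the call. A real file would be opened with
# open("my_dataset.csv", newline=""), where the file name is hypothetical.
sample = io.StringIO("name,height_cm\nAda,170\nGrace,165\n")
print(summarize(sample))
```

The made-up sample has only two records, so it is reported as too small. Values written with thousands separators or currency symbols (for example `1,000` or `$5`) are treated as text by this sketch, so a field may be numeric in practice even if it is not listed.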

This project is reasonably open-ended. You are free to select a dataset from any area that is of interest to you. You could select a serious dataset such as:

However, the subject does not have to be serious. Some less serious examples might include:

There are many interesting datasets available online. We will spend class time discussing available data and how to locate it. However, it would be wise to spend some time thinking about potential subjects, and to begin exploring available datasets early in the semester.

Again, you need to be somewhat selective when you choose a dataset. The Superhero dataset might be fun, but it will be difficult to process at the skill level of this class. I would advise holding off on analyzing any mostly-text dataset until you have more experience.

For the project you will be expected to download and possibly clean a dataset. You will then be required to produce a well-documented explanation of the dataset. Finally, you will attempt to answer a question by analyzing the data, which you will document in a final report and present to the class.

You should try to locate an acceptable dataset to analyze by the end of the fifth week (about September 22). If you are struggling to locate an acceptable dataset, or wish to discuss a potential dataset, please see your instructor. After locating the dataset, the real work on the project can begin.

The project will consist of the following components: