Methods

The purpose of this document is to demonstrate that you can or have:

In addition, the goals of this assignment are to:

Update 11/18

Please review the following notes

I expect you to provide the basic analysis of any fields you use in your work.

Documenting the actions you have taken allows others to understand how you achieved your results, which in turn gives them more confidence in those results.

In addition, the document will allow you to return to your work after some time has passed and recall what you did. This matters because we tend to forget the details of our work over time. If you need to update the results, or perform the analysis again for any reason, this document is invaluable.

Finally, this document will give your instructor a solid basis to judge your work.

This is a technical paper intended for a reader who has a working knowledge of Excel. You should use technical terms and provide detailed discussion of the work. This is perhaps the most important document you will produce this semester for this course. The contents may vary somewhat due to the nature of the data, but the document should provide sufficient detail that an experienced reader can reproduce your results.

This document should include the following sections.

  1. An Introduction which introduces the project and states the main goals.
  2. A description of the original data. This includes the source and format of the dataset. This portion can mirror the description in the Proposal Paper. You should also include a description of any file transformations required to load the data into Excel. For example, if you imported a CSV file, a screenshot of the import dialog box (with delimiters selected) would be a good inclusion. Be sure to include a table with example records from the dataset.
  3. A data dictionary. This should include the field name and description for each field in a record from the dataset. Note the type of each data field and describe the values that are legal for the field. If any specialized terms are used to describe the data, please provide a definition and a reference to the source of the definition.
  4. A detailed description of how the data was cleaned. This should include which fields contained data that you considered to be in error and how you adjusted that data. You should provide a justification for any actions you have performed. This includes splitting fields and dealing with missing values or extreme outliers. This section should include screenshots of functions used or detailed descriptions of steps taken to clean the data.
  5. Summary of data used
    1. Summary statistics for original data. This should include discussions of all original data fields used in the final analysis. For numeric data, include the five-number summary and a graph (if such a graph makes sense). For non-numeric data, consider a frequency distribution or some other graphical representation.
    2. Summary statistics for derived data. If you have employed derived data in your analysis, how did you derive that data? What computations were used to produce it? Provide summary information as well as a graphical depiction for any derived data.
  6. Analysis Methods. Describe the analysis you performed on the data. If you used a pivot table, which fields did you use? If you filtered the data, what were the filter settings? For each different type of analysis performed, you should include screenshots of the actions you took, a description of what was done, and possibly screenshots of the results.
  7. A description of any noteworthy failures. This section is useful to remind yourself of what not to try again or provide a warning to someone who is trying to reproduce your work.
  8. A section containing conclusions. This section provides an answer to your question or a summary of your discoveries.
  9. A description of future work. This is a list of items you did not have time to perform, did not have the skills to accomplish, or were left unfinished.
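
As an illustration of the summaries requested in section 5, the sketch below computes a five-number summary (minimum, first quartile, median, third quartile, maximum) for a numeric field and a frequency distribution for a non-numeric field. It is written in Python rather than Excel only so that the computation is explicit. The sample values and the function names `five_number_summary` and `frequency_distribution` are hypothetical and are not part of the assignment. The quartiles use the "inclusive" interpolation method, which should correspond to Excel's QUARTILE.INC, but you should verify that against your own workbook.

```python
from collections import Counter
from statistics import median, quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of at least two numbers."""
    ordered = sorted(values)
    # n=4 yields the three quartile cut points; "inclusive" treats the
    # data as the whole population and interpolates between data points.
    q1, _, q3 = quantiles(ordered, n=4, method="inclusive")
    return (ordered[0], q1, median(ordered), q3, ordered[-1])


def frequency_distribution(values):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common()


# Hypothetical sample data, for illustration only.
ages = [23, 35, 41, 29, 52, 35, 47]
colors = ["red", "blue", "red", "green", "red", "blue"]

print(five_number_summary(ages))
print(frequency_distribution(colors))
```

Whichever tool you use, state which quartile method was applied, since different methods can give slightly different values for Q1 and Q3 on small datasets.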

The formatting of this document is important. It should be double-spaced and include citations, a bibliography, and a table of contents. It should be broken into sections, and the majority of formatting should be performed at the document level. Data should be presented in tables that include column headings. All graphs should be labeled.

| Item | Weight | Full | Partial |
|---|---|---|---|
| Contents | | | |
| Introduction | 5% | Comprehensive introduction and description of dataset is provided. | The project is introduced and some introduction to the data is provided. |
| Data Dictionary | 10% | The dictionary is complete and usable. | Some information is missing. |
| Data Cleaning | 10% | All computations employed in cleaning the data are described and well documented. Actions taken to clean/exclude data are justified. | Computations are described partially. Actions are only partially justified. |
| Summary Statistics | 15% | Summary statistics are provided for all data used in the investigation. Graphs are used when appropriate. | Some statistics are provided, some graphs are used. |
| Description of Computations | 10% | All computations are described and well documented. | Computations are described partially. |
| Reproducible | 10% | The document provides sufficient detail to reproduce the work done. | Some of the details required to reproduce the work are missing. |
| Summary | 10% | The document provides a summary of results. | Some discussion of results. |
| Future Work/Problems | 5% | Some future work is indicated and problems encountered are reported. | |
| Other | | | |
| General Formatting | 20% | The document is well formatted, contains no errors in grammar or spelling, and is presented in a professional manner. | The document contains minor flaws in formatting, spelling or grammar. |

This document should be a living document. You should start working on it as soon as you locate an appropriate data source. For example, the beginning sections can be modified versions of the proposal paper. As you progress through the project, you should keep a record of work done. Using document level formatting will yield a more consistent look. Some minor variations in style are acceptable but the final document should be done in a professional manner.

This document should be submitted to the D2L folder Project Methods.