Homework 1: Counting Avocados
The goals of this homework are:
- Use Xif and Xifs for X∈ {count, sum, average}
Kaggle has a Avocado Prices dataset. While the context is somewhat insulting to Millenials, the dataset seems good. Let's explore it.
- Here is a local copy of the csv file.
Please do this work using count*, sum*, average*. You could do this with a pivot table, but the goal is to learn these functions.
Table in this case does not mean a formal excel table, just a range with labels and computations.
- Right now save as an excel worksheet. If you just save, you will store your results as a csv file and lose your work.
- (5 points) About and Comments
- Build an About worksheet.
- The kaggle page might be helpful
- This page might also be helpful.
- Build a Comments worksheet.
- this is a place to answer questions and make observations.
- Please indicate the question you are answering.
- Provide answers in full sentences written in a professional manner.
- (5 points) Name the columns in the raw data table.
- (5 points) Total Volume
- Build a table showing the total volume avocados by region
- This should be sorted lowest to highest by total volume.
-
- Do you notice any problems? Document these an a Comments worksheet.
- How will you address this problem?
- (10 points) Volume Report
- Produce a table of total volume by region by year.
- Apply any corrections from the previous step.
-
- Do you notice any problems? Document this in the Comments worksheet.
- How will you address this problem?
- (10 points) Verify your observation from above.
- Count the total number of reports by region.
- Count the total number of reports by region by year
-
- Comment on your findings.
- How will you address this problem?
- (10 points) Basic Price Analysis
- You may ignore any repeated data from the previous exercise (IE Don't exclude any regions).
- Add a new worksheet.
- Count, counta, countblank the average price column
-
- Build the five number summary of the average price
- quartile.inc for those of you who did not have 101
- find the range : max-min
- find the average: average
-
- (10 points) Build a histogram of prices.
- You may ignore any repeated data from the previous exercise (IE Don't exclude any regions).
- There should be 10 bins
- Lowest to highest price.
- Computed in excel, not by hand
-
- (10 points) Repeat this table but
- Add a Region cell and compute the histogram based on that region.
-
- Changing the value in the blue cell will change the values of the computation.
- (10 points) Build a table using average, averageif, averageifs.
Submission
Upload the final worksheet to the Assignment Submission section of Dropbox by the due date.