Sorting
      -  I would like to use the Chicago Movies in the Park dataset for today.
      
           -  Download a copy.
           
 -  Notice the about worksheet and the RawData worksheet.
      
 
       -  We have multiple purposes today.
      
          -  We would  like to begin to understand this dataset.
          
 -  We would like to look for errors and inconsistencies in the dataset.
          
 -  We would like to understand the filtering and sorting capabilities of excel.
      
 
       -  As we look thought the data is is most likely that we will find problems.
      
            -  We will most likely have to make assumptions about the data to try and fix those problems.
            
 -  What happens if we guess wrong?
            
 -  Since it would be nice to be able to go back,  we should make a copy of the raw data.
            
 -  Since we might need to reproduce our work, we should document what we do.
            
                -  I assume you have a "methods document" started for the project,
                
 -  So as you work through the exercise note the things you do to the data.
            
 
       
       -  Making a copy of the RawData
      
          -  Click on the Raw Data worksheet.
          
 -  Select Move or Copy and make sure that Create a copy is checked.
          
 -  Rename this to be Sorting
      
 
       -  If you want, you can start a new workbook.
      
 
           -  right click on raw data, select Move of Copy, but this time  select (new book) in the To book: area.
           
 -  Create a copy of the about sheet in the New Book worksheet.
      
 
       -  Deleting worksheets can not be undone
      
           -  most actions in excel can be undone
           
 -  But when you delete a worksheet, this can not.
           
 -  Not a major problem, but you need to be aware of this.
           
 -  So duplicate a sheet then delete it.
           
 -  I think it is best to practice everything just so you know. 
      
 
       -  We will discuss tabs more later.
      
 -  I would like to understand what if the last five fields are related.
      
 -  Arranging our data.
      
          -  For many of the functions we will be looking at, Excel likes tables.
          
 -  Most wants a record to be in a row
          
               -  A record is a set of related data.
               
 -  A single data point.
               
 -  In this dataset, what is stored in a row or record?
          
 
           -  Records are made up of fields or the components of a record.
          
               -  Fields are the columns of data.
               
 -  The data in a field should all be of the same type.
               
 -  A bad dataset may have multiple different kinds of information in a field.
          
 
           -  For excel, it is best if our data 
          
              -  Has column headings.
              
                  -  If these are missing, add them or you will be sorry later.
              
 
               -  Has no blank rows
              
 -  Has not blank columns.
              
                   -  Remove any of these.
                   
 -  This is called contiguous
              
 
               -  Has no data right next to it (on the side or top)
              
                   -  As you do computations, be aware of this.
                   
 -  This is called isolated
              
 
           
       
       -  On page 305 he gives a chart of quick movement keys.
      
  
           -  Because we will be dealing with large data sets, just scrolling is not the best.
           
 -  CTRL Arrow moves that direction until
           
                -  The last in a set of data
                
 -  The edge of the worksheet.
           
 
            -  For this reason it is best not to have blank cells in a workbook.
           
 -  CTRL home, end move to the extremes of a dataset.
           
 -  Adding shift to these allows you to select multiple cells.
      
 
       -  Replacing blanks
      
           -  There are blanks in this dataset.
           
 -  It would be easier if there weren't.
           
 -  What should we replace them with?
           
               -  Excel doesn't really care.
               
 -  But other software you will use does.
               
 -  NA is a good choice.
               
                     -  Unless that is a valid field.
               
 
            
            -  Let's replace all of the blanks in the dataset
           
               -  Select Ctrl-Home to move to the top left of the dataset
               
 -  Select Shift-Ctrl-End to select the entire dataset.
               
 -  On the Home tab in the Editing command group select the Find & Select dropdown.
               
 -  Choose Replace
               
                   -  Just leave the Find what: cell empty
                   
 -  Put NA in the Replace With field.
                   
 -  Select Replace All
               
 
                -  Note that 171 fields were changed to be NA.
               
                   -  What should we do about this?
               
 
                -  While we are here, look at other replace options.
               
 -  Let's search for those na's
               
                   -  Perhaps click on Match Entire cell contests
               
 
                -  Find all, notice we have some in Columns K, L, M that are NAs.
           
 
       
       -  This is a little difficult to deal with.
      
           -  I don't remember what those columns are.
           
 -  Go to the View tab in the window commandgroup
           
               -  Notice that we can Freeze Pains.
               
 -  Freezing the top row might be helpful here.
               
 -  This doesn't play well with some other options, but it is ok here.
           
 
       
       -  Currently, I don't think looking at the phone number will help 
      
           -  This is really probably the same as looking at the park name.
           
 -  So right click on column D and select Hide
           
 -  Notice that the column is shown as collapsed at the top.
           
 -  You can 
           
               -  Double click on this to unhide it.
               
 -  Or you can select C-E, right click and select Unhide
               
 -  Or you can just click near it and select Unhide
           
 
            -  I suspect that the last five fields are geographic, so let's hide columns E,F and G.
           
 -  I suspect that I and J are just other ways to represent the park name as well, so hide those.
      
 
       -  Let's see if there is really a relationship between the last five fields and the park name.
      
          -  Click anywhere in column C
          
 -  On the Datatab in the Sort & Filter area click on the Sort A to Z or  Sort Z to A tab.
          
              -  Notice this sorts the entire table. (Because it is contiguous and isolated).
          
 
           -  Try sorting by Wards
          
              -  Scroll to the bottom.
              
 
                  -  Notice that there are 7 parks here that have NA's
                  
 -  What can we do about that?
              
 
               -  Let's try that Filter button.
              
                  -  Drop down the new bar on Park
                  
 -  Select  Chicago History Museum
                  
 -  Note the other two entries are the same.
                  
                      -  Should we copy the values down to the NAs?
                      
 -  If so what else should we do?
                  
 
               
               -  Can you fix Osterman Beach and Humboldt Park?
              
          
 
       
       -  Let's Sort by day of the week.
      
          -  Use A to Z
          
 -  Not What we really wanted
          
              -  Try clicking on the Sort box.
              
 -  Select Custom List and sort by week day.
          
 
 
           -  Create a custom list, sort Mon, Tue, Wed ... Sun
          
 -  Sort on the Date field.
      
 
       -  Sort on the wards.
      
           -  Note the parks are now out of order.
           
 -  Sort by ward then by park.
      
 
       -  Notice in the sort
      
          -  My data has headers
          
 -  We can sort left to right but that is not for our data.
          
 -  We can do case sensitive.
          
               -  Excel normally ignores case.
          
 
       
       -  Try filtering on