Homework 2, Statistics.

Short Description:

Write a program which will perform elementary statistical analysis on a file of numbers.

This assignment is worth 100 points.

Goals

When you finish this homework, you should:

Formal Description

Write a program which, given a data file, will compute basic statistics on the data in the file. The program will also support some data query functions.

The program must compute the following measures (for valid data):

The program must also support the following interactive actions

Input

The program should begin by asking the user for the name of an input file.

The input file contains at most 10,000 valid integers. A valid number consists of an optional leading '-' followed only by the digits 0 through 9.

The file may contain any number of invalid numbers. Invalid numbers contain digits as well as other characters (for example a letter, a '.', or a '-' that is not in the first position).

Entries in the file are separated by white space.

Consider the following data file:

1 3.2         4           a5 6c
7 
3 
      -2 8-
This file contains the following valid data: 1, 4, 7, 3, -2

This file contains the following invalid data: 3.2, a5, 6c, 8-

Output

Your program should begin by printing the required statistics, clearly labeled, in the order given. After this, your program should present the user with a menu of other possible queries. After a query is selected, the program should prompt for any additional input required, process the query, display the results, and display the menu again.

A histogram can be displayed using the following technique:

Consider the following:
   data = 2 3 5 5 6 8 8 8 8  10 
   Number of bins = 3
   Different possible values = max - min + 1 = 10-2+1 = 9
   Bins are 2 to 4, 5 to 7 and 8 to 10
       Bin 1 contains 2 numbers (2,3)
       Bin 2 contains 3 numbers (5,5,6)
       Bin 3 contains 5 numbers (8,8,8,8,10)
   Normalized (to a chart 20 lines tall)
       Bin 1: 2/10 = .2 x 20 = 4
       Bin 2: 3/10 = .3 x 20 = 6
       Bin 3: 5/10 = .5 x 20 = 10
   The y axis labels
       Since there are 10 numbers total, each * in the
       histogram represents 1/2 a number.
       Line 20: 10 numbers
       Line 15: 7.5 numbers
       Line 10: 5 numbers
       Line 5: 2.5 numbers
       Line 1: .5 number

    Output

    10 |   
       |
       |
       |
       |
   7.5 |
       |
       |
       |
       |
     5 |     *
       |     *
       |     *
       |     *
       |   * *
   2.5 |   * *
       | * * *
       | * * *
       | * * *
    .5 | * * *
       +-------
Bin #    1 2 3 

	 Bin 1 : 2 to 4
	 Bin 2 : 5 to 7
	 Bin 3 : 8 to 10
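The binning and normalization steps can be sketched in C++ as below. The names binHeights, renderBars and kRows are hypothetical. The sketch assumes the data is non-empty and the bin count is at least 1. When the value range does not divide evenly among the bins, it rounds the bin width up, and it rounds each bar to the nearest row. Both rounding rules are assumptions, since the example only shows an even split. The y axis labels and bin legend are left out for brevity.

```cpp
#include <algorithm>
#include <string>
#include <vector>

const int kRows = 20;  // chart height in lines, as in the example

// Hypothetical helper: number of '*' rows for each bin.
// Assumes data is non-empty and bins >= 1.
std::vector<int> binHeights(const std::vector<int>& data, int bins) {
    int lo = *std::min_element(data.begin(), data.end());
    int hi = *std::max_element(data.begin(), data.end());
    int range = hi - lo + 1;                // 10 - 2 + 1 = 9 in the example
    int width = (range + bins - 1) / bins;  // values per bin, rounded up (assumption)
    std::vector<int> counts(bins, 0);
    for (int x : data) ++counts[(x - lo) / width];

    int n = static_cast<int>(data.size());
    std::vector<int> heights(bins);
    for (int b = 0; b < bins; ++b) {
        // count / n * kRows, rounded to the nearest whole row
        heights[b] = (2 * counts[b] * kRows + n) / (2 * n);
    }
    return heights;
}

// Hypothetical helper: draws the bars from the top row down, then the x axis.
std::string renderBars(const std::vector<int>& heights) {
    std::string out;
    for (int row = kRows; row >= 1; --row) {
        out += "|";
        for (int h : heights) out += (h >= row) ? " *" : "  ";
        out += "\n";
    }
    out += "+" + std::string(2 * heights.size() + 1, '-') + "\n";
    return out;
}
```

For the example data and 3 bins, binHeights yields the normalized heights 4, 6 and 10 computed above.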

When a dataset is displayed, it should be broken into a number of lines, each line no more than 80 characters wide. Numbers should not be split in the middle.
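One way to meet the 80 character limit is to build each line greedily and start a new line whenever the next number would not fit. The sketch below assumes numbers are separated by single spaces; wrapDataset and its maxWidth parameter are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: packs numbers into lines of at most maxWidth
// characters, separated by single spaces, never splitting a number.
std::vector<std::string> wrapDataset(const std::vector<int>& data,
                                     std::size_t maxWidth = 80) {
    std::vector<std::string> lines;
    std::string current;
    for (int x : data) {
        std::string text = std::to_string(x);
        // Start a new line if a space plus this number would exceed the limit.
        if (!current.empty() && current.size() + 1 + text.size() > maxWidth) {
            lines.push_back(current);
            current.clear();
        }
        if (!current.empty()) current += ' ';
        current += text;
    }
    if (!current.empty()) lines.push_back(current);
    return lines;
}
```

Each returned string can then be printed on its own line.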

Discussion

Your program should be modular in design. You should employ simple routines which are well documented.

You should begin working on the parts of this program you can accomplish now, such as reading in data and printing out data.

Your sorting and searching routines should be contained in their own files. You should provide .h files to accompany these files.
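A possible file layout is sketched below, shown in one listing with comments marking where each file would begin. The file names, the function names sortAscending and binarySearch, and the choice of insertion sort and binary search are all assumptions made for illustration; the assignment does not prescribe particular algorithms.

```cpp
// ---- sort.h (hypothetical file name) ----
#ifndef SORT_H
#define SORT_H

#include <vector>

// Sorts values into ascending order, in place.
void sortAscending(std::vector<int>& values);

#endif  // SORT_H

// ---- sort.cpp (hypothetical); would begin with #include "sort.h" ----
#include <cstddef>

void sortAscending(std::vector<int>& values) {
    // Insertion sort, chosen here only for brevity.
    for (std::size_t i = 1; i < values.size(); ++i) {
        int key = values[i];
        std::size_t j = i;
        while (j > 0 && values[j - 1] > key) {
            values[j] = values[j - 1];
            --j;
        }
        values[j] = key;
    }
}

// ---- search.h (hypothetical file name) ----
#ifndef SEARCH_H
#define SEARCH_H

#include <vector>

// Returns the index of target in a sorted vector, or -1 if it is absent.
int binarySearch(const std::vector<int>& sorted, int target);

#endif  // SEARCH_H

// ---- search.cpp (hypothetical); would begin with #include "search.h" ----
int binarySearch(const std::vector<int>& sorted, int target) {
    int lo = 0;
    int hi = static_cast<int>(sorted.size()) - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (sorted[mid] == target) return mid;
        if (sorted[mid] < target) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}
```

The include guards let each header be included from several source files without redeclaration errors, and the main program only needs the two .h files to call these routines.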

Required Files

Source code, a Makefile which builds the entire project, and a README file which provides project documentation.

Submission

Email a tar file containing all required files to dbennett@edinboro.edu by October 6 at class time.