A classification problem.
Objectives
We would like to:
- Solve a simple problem involving strings.
Notes
- Given a file, identify the unique words and count the number of occurrences of each word in the file.
- A word is defined as a collection of contiguous characters, with the non-alphabetic characters removed and all letters converted to lower case.
- Step 1, read in all the natural words in a file.
- Step 2, clean the words, i.e. remove the non-alphabetic characters and convert to lower case.
- Step 3, save each word in an array if it is new, or increase its count if it is not.
- Step 4, put the array in order.
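
The four steps above can be sketched in Python as follows. This is only one possible reading of the notes, not a reference solution. It assumes that "natural words" are whitespace-separated tokens, that "alphabetic" means the ASCII letters A-Z and a-z, and that "in order" means alphabetical by word. It also uses a dictionary (collections.Counter) in place of the array mentioned in Step 3. The function names are hypothetical.

```python
import re
from collections import Counter


def clean_word(raw):
    """Step 2: drop non-alphabetic characters and convert to lower case.

    Assumption: only ASCII letters count as alphabetic.
    """
    return re.sub(r"[^A-Za-z]", "", raw).lower()


def count_words(text):
    """Steps 1 to 3: split into tokens, clean each one, count occurrences.

    Assumption: natural words are separated by whitespace.
    """
    counts = Counter()
    for raw in text.split():
        word = clean_word(raw)
        if word:  # a token such as "42" cleans to nothing, so skip it
            counts[word] += 1
    return counts


def sorted_counts(counts):
    """Step 4: order the (word, count) pairs.

    Assumption: "in order" means alphabetical by word.
    """
    return sorted(counts.items())


def count_words_in_file(path):
    """Run all four steps on the file at the given path."""
    with open(path, encoding="utf-8") as f:
        return sorted_counts(count_words(f.read()))


sample = "The cat, the CAT and the hat! 42"
print(sorted_counts(count_words(sample)))
# prints [('and', 1), ('cat', 2), ('hat', 1), ('the', 3)]
```

If the intended order is by frequency instead, Counter.most_common() would return the pairs from most to least frequent.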