Introduction
- We will learn several ways to evaluate the performance of an algorithm or implementation.
- Each has strengths and weaknesses.
- But each has a place.
- For now, I would like to determine the performance of the string find function in the standard library.
- I suspect that this is linear (whatever that means)
- Because sequential search is linear. (But this is a bad assumption)
- But it is probably KMP or better.
- So what should I investigate?
- Linear means that as the data size grows, the time grows at the same rate: if the text doubles in length, the search time should roughly double.
- So if I just do a series of string finds and time them I should be in good shape.
- My plan:
- Generate a random text of a given length.
- Generate a random word of a given length.
- Search for the word in the text, finding all occurrences.
- I need to see how things run, so I might want to
- Generate different size words and texts.
- Run the experiment a number of times.
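The plan above can be sketched roughly as follows. This is only a minimal sketch, not the code we will develop in the next sections: the helper names random_string and count_occurrences, the lowercase default alphabet, and the choice of std::mt19937 as the generator are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>

// Hypothetical helper: build a string of `length` characters drawn
// uniformly at random from `alphabet` (assumed lowercase letters by default).
std::string random_string(std::size_t length, std::mt19937& rng,
                          const std::string& alphabet =
                              "abcdefghijklmnopqrstuvwxyz")
{
    assert(!alphabet.empty());
    std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);

    std::string s;
    s.reserve(length);
    for (std::size_t i = 0; i < length; ++i) {
        s += alphabet[pick(rng)];
    }
    return s;
}

// Hypothetical helper: count all occurrences of `word` in `text`
// (overlapping ones included) by calling std::string::find repeatedly.
std::size_t count_occurrences(const std::string& text, const std::string& word)
{
    if (word.empty()) {
        return 0;  // an empty word would "match" at every position
    }
    std::size_t count = 0;
    std::size_t pos = text.find(word);
    while (pos != std::string::npos) {
        ++count;
        pos = text.find(word, pos + 1);  // resume one past the last match
    }
    return count;
}
```

To vary the experiment, call random_string with different lengths for the text and the word, and repeat each combination several times. A smaller alphabet (for example "ab") makes matches far more likely, which may or may not be what you want to measure.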
- I think I need the following tools
- Learn how to time things inside a program.
- I really don't want to consider the data generation times.
- This means system-level utilities like the time command in Linux will not be helpful.
- Furthermore, I would like to be somewhat portable.
- Learn how to generate random data.
- The modern (C++11 or later) C++ standard library has new facilities to assist with these tasks.
- And we should look at them.
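As a first taste of timing inside a program, here is a minimal sketch using std::chrono::steady_clock from the standard library. The struct TimedSearch and the function time_find_all are hypothetical names invented for this example. The key idea is that the clock is read only around the search loop, so whatever time was spent generating the data is not counted.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch.
struct TimedSearch {
    std::size_t matches;               // number of occurrences found
    std::chrono::nanoseconds elapsed;  // time spent in the search loop only
};

// Time a find-all search. The caller generates `text` and `word`
// beforehand, so data generation is outside the timed region.
TimedSearch time_find_all(const std::string& text, const std::string& word)
{
    assert(!word.empty());  // an empty word would match at every position

    // steady_clock is monotonic, which makes it suitable for intervals.
    using Clock = std::chrono::steady_clock;

    const Clock::time_point start = Clock::now();
    std::size_t matches = 0;
    for (std::size_t pos = text.find(word); pos != std::string::npos;
         pos = text.find(word, pos + 1)) {
        ++matches;
    }
    const Clock::time_point stop = Clock::now();

    return TimedSearch{
        matches,
        std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

A call such as time_find_all(text, word).elapsed.count() gives the elapsed time in nanoseconds. A single measurement of a short search is likely to be noisy, so in practice you would repeat the call several times for each text and word size. Because this uses only the standard library, it should behave the same way on any platform with a C++11 compiler.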
- The material for the next sections
- Is from Gregoire, fifth edition.
- Is only approached from a surface level.
- I just want a tool set; I don't want to study C++, so we will not go in depth.
- We don't have time.
- Both of these topics are very deep,
- Mathematically
- And perhaps in other ways.
- You should use these ideas, if not this code, to do your experiments (homework 1 and beyond).
- So as we discuss them, if you don't understand something, ASK
- And if you don't get the code, ASK
- And if you don't understand the ideas, ASK