Threading Models/Multicore Programming

Objectives

We would like to:

Understand how threads relate to parallel programming
Understand the models used to implement threads

Notes

Concurrency: allowing all the tasks on a system to make progress
- Interrupt-based scheduling, as we have discussed
- Only requires one CPU/core/...
Parallelism: allowing multiple processes (or threads) to run at the same time
- Requires multiple cores/CPUs/...
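A small simulation may make the distinction concrete. This is only a sketch: the `simulate` function, the task names, and the one-unit time slice are made up for illustration and are not from the book. With one core the tasks take turns (concurrency); with three cores they run in the same tick (parallelism).

```python
from collections import deque

def simulate(work, cores):
    """Round-robin simulation: each tick, up to `cores` tasks run one unit.

    `work` maps a (hypothetical) task name to its units of work.
    Returns (ticks_elapsed, trace); trace lists the tasks that ran each tick.
    """
    remaining = dict(work)
    ready = deque(remaining)              # ready queue, in arrival order
    trace = []
    while ready:
        # Dispatch up to `cores` tasks from the front of the ready queue.
        running = [ready.popleft() for _ in range(min(cores, len(ready)))]
        trace.append(running)
        for name in running:
            remaining[name] -= 1
            if remaining[name] > 0:
                ready.append(name)        # time slice over: back of the queue
    return len(trace), trace

work = {"A": 2, "B": 2, "C": 2}
one_core_ticks, one_core_trace = simulate(work, cores=1)
three_core_ticks, three_core_trace = simulate(work, cores=3)
print(one_core_ticks, one_core_trace)
print(three_core_ticks, three_core_trace)
```

On one core every task makes progress before any task finishes, but the total time is the sum of all the work. On three cores the same work finishes in the time of the longest task.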
The book identifies five challenges with parallel programming
- Identifying independent tasks.
- Balance: Keeping all processors busy
- Division of data: can you get the data to the right place at the right time, and share the data you need?
- Data dependency: can you keep the data synchronized, with the correct process modifying/reading it at the correct time?
- Testing and debugging: Parallelism introduces race conditions, deadlock, and non-determinism. More in the next chapter (I think)
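The data dependency and debugging challenges can be seen with a shared counter. This is a minimal sketch; `run_counter` and its parameters are hypothetical names. Each `total += 1` is a read-modify-write, so without the lock two threads could interleave and lose an update. Whether you actually observe that depends on the interpreter and on timing, which is exactly what makes such bugs hard to test for.

```python
import threading

def run_counter(n_threads, n_increments):
    """Increment a shared counter from several threads, guarded by a lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(n_increments):
            with lock:           # only one thread at a time may update total
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()                 # wait for every worker before reading total
    return total

result = run_counter(4, 10_000)
print(result)
```

With the lock in place the result is always `n_threads * n_increments`, no matter how the threads are scheduled.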
Types of parallelism
- Data parallelism: SPMD (single program, multiple data)
  - Split the data across workers.
  - Each worker does the same thing.
  - This, IMHO, is the easiest, and what I know how to do.
- Task parallelism: MPMD (multiple programs, multiple data)
  - Break the problem into tasks.
  - Each worker does something different.
  - Coordinate the workers.
  - This can be hard or easy, depending on the task.
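The two styles can be sketched side by side with a thread pool. The `chunk` helper and the sample data are assumptions made for illustration. Note that in standard CPython the global interpreter lock limits how much CPU-bound work threads really overlap, so this shows the structure of each style rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 101))

def chunk(seq, n):
    """Split seq into n nearly equal contiguous pieces (hypothetical helper)."""
    size, extra = divmod(len(seq), n)
    pieces, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        pieces.append(seq[start:end])
        start = end
    return pieces

with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: every worker runs the same function on its own slice.
    partial_sums = list(pool.map(sum, chunk(data, 4)))
    data_parallel_total = sum(partial_sums)

    # Task parallelism: each worker runs a different function on the data.
    futures = {
        name: pool.submit(fn, data)
        for name, fn in [("min", min), ("max", max), ("total", sum)]
    }
    task_results = {name: f.result() for name, f in futures.items()}

print(partial_sums, data_parallel_total)
print(task_results)
```

In the data-parallel half the only coordination is combining the partial sums. In the task-parallel half each result has to be collected and named separately, which hints at why coordination gets harder as the tasks become more different.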
Multithreading models
- Look at the pictures on pp. 166-167
- User threads - managed by a thread library inside the process, without kernel support
- Kernel threads - supported and managed directly by the kernel
- Many to one
  - The process handles threads, the kernel knows nothing about them.
  - The thread library must handle creation, scheduling, destruction, ...
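  - Does not take advantage of multiple cores/CPUs: only one thread can enter the kernel at a time, and one blocking system call blocks the whole process.
  - Used by the original threading packages.
  - Probably used by very few applications any longer.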
- One to one
  - Each thread in a process is supported by a thread in the kernel.
  - Creating too many threads may overload the kernel.
  - Probably the most common model (Windows, Linux, Unix, ...).
- Many to many
  - Many user threads are multiplexed onto a smaller or equal number of kernel threads.
  - The kernel is less likely to become overloaded.
  - But: "Although the many-to-many model appears to be the most flexible ... in practice it is difficult to implement"
  - And as core counts increase, limiting the number of kernel threads becomes less important.
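The difference between many to one and one to one can be sketched in a few lines. This is an illustration under assumptions, not how any real thread library is written: generators stand in for user threads, `run_user_threads` is a made-up scheduler, and the second half assumes a CPython build where each `threading.Thread` is backed by its own kernel thread (the usual case on mainstream platforms). `threading.get_native_id()` needs Python 3.8 or later.

```python
import threading
from collections import deque

# Many-to-one sketch: user-level "threads" are generators, all multiplexed
# onto the single kernel thread that runs the scheduler loop.
def user_thread(name, steps, log):
    for step in range(steps):
        log.append((name, step, threading.get_native_id()))
        yield  # hand control back to the user-space scheduler

def run_user_threads(user_threads):
    """Round-robin scheduler that lives entirely in user space."""
    ready = deque(user_threads)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # run until the next yield
            ready.append(current)  # still alive: back of the ready queue
        except StopIteration:
            pass                   # finished: the library discards it

log = []
run_user_threads([user_thread("u1", 2, log), user_thread("u2", 2, log)])
many_to_one_ids = {native_id for _, _, native_id in log}

# One-to-one sketch: each threading.Thread is expected to be backed by its
# own kernel thread (an assumption about the platform, see the lead-in).
one_to_one_ids = []
barrier = threading.Barrier(3)

def record_native_id():
    one_to_one_ids.append(threading.get_native_id())
    barrier.wait()  # keep all three threads alive together so ids are not reused

workers = [threading.Thread(target=record_native_id) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(many_to_one_ids), len(set(one_to_one_ids)))
```

The kernel sees only one thread in the first half, so the user threads can never run on two cores at once. In the second half the kernel schedules each thread itself. A many-to-many library would combine the two ideas: a user-space scheduler handing user threads to a small set of kernel threads.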