Homework 4, The Collatz Problem
Goals
When you finish this homework you should:
- Have written a simple parallel program using fork and wait.
- Have used pipes (at least) for communication.
Assignment
One of my colleagues has published a paper at PACISE about the Collatz Conjecture. While we might not solve this problem, we can take a crack at it. For example, there is a Project Euler problem we will try to solve. Project Euler problems are numerical in nature and, if not solved carefully, can require quite a bit of processing power. The spirit of the site is to solve problems in a clever manner on a single processor in a very short amount of time. That is nice, but we have parallel machines and are looking for a task to put them to, so ...
Please solve problem 14 in parallel. The problem statement is as follows:
Longest Collatz sequence
Problem 14
The following iterative sequence is defined for the set of positive integers:
n -> n/2 (n is even)
n -> 3n + 1 (n is odd)
Using the rule above and starting with 13, we generate the following sequence:
13 -> 40 -> 20 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.
Which starting number, under one million, produces the longest chain?
NOTE: Once the chain starts the terms are allowed to go above one million.
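As a worked illustration of the rule in the problem statement, here is a minimal sketch that counts the terms of a chain directly. The name chain_length is hypothetical, and the 64-bit unsigned type is an assumption chosen because terms may grow well past the starting value.

```c
#include <assert.h>

/* Number of terms in the Collatz chain that starts at n and ends at 1.
 * chain_length is a hypothetical name; n must be at least 1. */
int chain_length(unsigned long long n)
{
    assert(n >= 1);
    int terms = 1;                  /* the starting number counts as a term */
    while (n != 1) {
        if (n % 2 == 0)
            n = n / 2;              /* even rule */
        else
            n = 3 * n + 1;          /* odd rule; may exceed one million */
        terms++;
    }
    return terms;
}
```

Calling chain_length(13) walks 13 -> 40 -> 20 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1 and returns 10, matching the count given in the problem statement.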
My single-processor solution ran in about 0.23 seconds and involved a recursive function Collatz, which took a long (the starting number) and returned an int (the length of the sequence that starting number produced). I employed a technique called memoization, where I kept an array of answers so I didn't have to recalculate the length of a sequence I had already found. I then looped over all values between 1 and one million, computing the longest sequence produced.
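The memoized recursion described above might look roughly like the sketch below. It is not the instructor's code: the name collatz, the CACHE_SIZE value, and the convention that a zero entry means "not computed yet" are all assumptions. It also assumes long is 64 bits, since some chains that start below one million pass through terms too large for 32 bits.

```c
#include <assert.h>

#define CACHE_SIZE 100000           /* assumed cache size, not a required value */

static int cache[CACHE_SIZE];       /* cache[i] == 0 means "not computed yet" */

/* Length, in terms, of the Collatz chain starting at n (n >= 1).
 * Assumes long is 64 bits wide. */
int collatz(long n)
{
    assert(n >= 1);
    if (n == 1)
        return 1;                   /* the chain "1" has one term */
    if (n < CACHE_SIZE && cache[n] != 0)
        return cache[n];            /* answer already known */

    long next = (n % 2 == 0) ? n / 2 : 3 * n + 1;
    int length = 1 + collatz(next);

    if (n < CACHE_SIZE)
        cache[n] = length;          /* remember only the small starting values */
    return length;
}
```

Values at or above CACHE_SIZE are simply recomputed each time, which keeps the array small while still shortcutting most chains once they drop into the cached range.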
I will be happy to discuss this problem with anyone who doesn't understand the mathematics or programming techniques discussed. While the math isn't that important right now, the techniques are useful.
Your program should take a single command line argument, -p n, where n is the number of processes to employ in solving the problem. It should then fork n processes, divide the search space into n nearly equal portions, and have each process compute the longest Collatz sequence for the portion assigned to it.
You should have the child processes report their results back to the parent process, which should collate the results and print the final answer.
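The fork, pipe, and wait pattern described here can be sketched as follows. This is one arrangement among several: it uses a single shared pipe, and the struct result record, the run_children function, and the placeholder do_work (which just reports its own index instead of searching) are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct result {                 /* hypothetical record sent from child to parent */
    long start;
    int length;
};

/* Placeholder for the real per-child search: reports the child's own index. */
static struct result do_work(int child_index)
{
    struct result r = { child_index, child_index + 1 };
    return r;
}

/* Fork nprocs children, read one result from each, and return the best one. */
struct result run_children(int nprocs)
{
    assert(nprocs >= 1);
    struct result best = { 0, 0 };
    int fd[2];
    if (pipe(fd) == -1)
        exit(EXIT_FAILURE);

    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid == -1)
            exit(EXIT_FAILURE);
        if (pid == 0) {                       /* child: compute, report, exit */
            close(fd[0]);
            struct result r = do_work(i);
            if (write(fd[1], &r, sizeof r) != (ssize_t)sizeof r)
                _exit(EXIT_FAILURE);
            _exit(EXIT_SUCCESS);
        }
    }

    close(fd[1]);                             /* parent only reads */
    struct result r;
    while (read(fd[0], &r, sizeof r) == (ssize_t)sizeof r)
        if (r.length > best.length)
            best = r;                         /* collate: keep the longest chain */
    close(fd[0]);

    while (wait(NULL) > 0)                    /* reap every child */
        ;
    return best;
}
```

Because each record is much smaller than PIPE_BUF, every write reaches the pipe as one unit, so the parent can read whole records from the shared pipe. The parent closes its write end so that read returns 0 once the last child has exited. One pipe per child would work just as well.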
I believe the following table is correct:
n | Length of the chain |
---|---|
13 | 10 |
1000 | 112 |
2000 | 113 |
3000 | 49 |
4000 | 114 |
5000 | 29 |
6000 | 50 |
7000 | 32 |
8000 | 115 |
9000 | 48 |
10000 | 30 |
Discussion
- I would spawn all processes from the original parent; that way you will know when all processes have exited.
- I would solve this problem sequentially first, at least for a small number to make sure it works then scale up to the final solution.
- I would have process 1 check the lengths of the values 1, 1+p, 1+2p, ..., process 2 check 2, 2+p, 2+2p, ... and so on.
- I didn't store all 1,000,000 lengths in the memoization array, just the first 100,000 or so.
- I did most computations in a long, but you could use a long double.
- Your code must compile with no errors or warnings on cslab103.
- If you wish, you should be able to log into euler.cs.edinboro.edu and test your solution on a 40-processor computer.
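The interleaved assignment suggested in the discussion (process 1 checks 1, 1+p, 1+2p, ..., process 2 checks 2, 2+p, 2+2p, ...) can be sketched as a single loop. The names search_stride, chain_terms, and struct best are hypothetical, the chain length helper is deliberately unmemoized to keep the example short, and long is assumed to be 64 bits.

```c
#include <assert.h>

struct best {                   /* hypothetical record: best start and its length */
    long start;
    int length;
};

/* Plain, unmemoized chain length in terms, for n >= 1. */
static int chain_terms(long n)
{
    int terms = 1;
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        terms++;
    }
    return terms;
}

/* Worker k of p (with 1 <= k <= p) checks k, k+p, k+2p, ... below limit. */
struct best search_stride(long k, long p, long limit)
{
    assert(k >= 1 && k <= p);
    struct best b = { 0, 0 };
    for (long n = k; n < limit; n += p) {
        int len = chain_terms(n);
        if (len > b.length) {   /* on ties the smaller start is kept */
            b.start = n;
            b.length = len;
        }
    }
    return b;
}
```

With p = 2 and limit = 14, worker 1 checks the odd numbers and finds 9 (20 terms), while worker 2 checks the even numbers and finds 12 (10 terms). The parent would keep the larger of the two. Interleaving gives every worker a mix of small and large starting values, which tends to balance the work better than handing each process one contiguous block.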
Required Files
- A Makefile which builds your project (if you use Python or Bash this is not required)
- The source code to solve this problem
Submission
Submit a tar/zip file containing your solution to the D2L folder Homework 4 by the due date.