## Finding primes in parallel

Posted by Jonas Elfström Thu, 14 Jan 2010 21:55:00 GMT

Justin Etheredge has been blogging about his challenge to find prime numbers with LINQ. He later used `AsParallel()` (coming in .NET 4) to speed things up and then followed that up with a post about using The Sieve Of Eratosthenes.

As you can see in the comments on those posts, I tried to speed up the Sieve of Eratosthenes by using `Parallel.For` in the inner loop. I also tried `AsParallel()` in the LINQ expression, but it made no real difference in either case. At most it got 5% faster. I'm not sure, but because SoE is very memory intensive it could be that we hit a scaling issue, perhaps memory bandwidth exhaustion. This is mere speculation.
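For reference, here is a minimal sketch of what putting `Parallel.For` in the inner loop of a Sieve of Eratosthenes can look like. It is not Justin's code and not the exact code from those comments; the class name `EratosthenesSketch` and its `FindPrimes` method are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: Sieve of Eratosthenes with a parallel inner loop.
static class EratosthenesSketch
{
    public static List<int> FindPrimes(int max)
    {
        var primes = new List<int>();
        if (max < 2) return primes;

        // composite[n] == true means n has been crossed out.
        // Assumes max is well below int.MaxValue so max + 1 does not overflow.
        var composite = new bool[max + 1];

        for (int p = 2; (long)p * p <= max; p++)
        {
            if (composite[p]) continue;

            int prime = p;
            int first = p * p;                      // smaller multiples are already crossed out
            int count = (max - first) / prime + 1;  // number of multiples in [first, max]

            // Every iteration writes the same value (true), so it does not
            // matter if two threads happen to touch the same element.
            Parallel.For(0, count, i =>
            {
                composite[first + i * prime] = true;
            });
        }

        for (int n = 2; n <= max; n++)
            if (!composite[n]) primes.Add(n);

        return primes;
    }
}
```

Each iteration of the parallel loop is a single array write, so it seems plausible that scheduling overhead and memory traffic eat up most of what the extra core could give.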

I then searched for other algorithms and found The Sieve of Atkin. It uses less memory than SoE so I thought I'd give it a try.

I set the limit to 20,000,000 and then benchmarked it. It clocked in at 2.48s, so actually worse than the 2.2s that SoE took. Not good! Then I added `Parallel.For` in the loop that did most of the work and lo and behold, it scaled! I have two cores in my machine (T7200@2.0GHz) and the average runtime went down to 1.26s. That's almost linear and surprisingly good! If you happen to have a quad core (or more) and feel like trying it out, then please contact me. It would be interesting to see if it scales further.

```
static List<int> FindPrimesBySieveOfAtkins(int max)
{
    // var isPrime = new BitArray((int)max+1, false);
    // Can't use BitArray because of threading issues.
    var isPrime = new bool[max + 1];
    var sqrt = (int)Math.Sqrt(max);

    // Toggle every candidate n once per solution of the three quadratic forms.
    Parallel.For(1, sqrt, x =>
    {
        var xx = x * x;
        for (int y = 1; y <= sqrt; y++)
        {
            var yy = y * y;
            var n = 4 * xx + yy;
            if (n <= max && (n % 12 == 1 || n % 12 == 5))
                isPrime[n] ^= true;

            n = 3 * xx + yy;
            if (n <= max && n % 12 == 7)
                isPrime[n] ^= true;

            n = 3 * xx - yy;
            if (x > y && n <= max && n % 12 == 11)
                isPrime[n] ^= true;
        }
    });

    var primes = new List<int>() { 2, 3 };

    // Collect primes up to sqrt and cross out multiples of their squares.
    for (int n = 5; n <= sqrt; n++)
    {
        if (isPrime[n])
        {
            primes.Add(n);
            int nn = n * n;
            for (int k = nn; k <= max; k += nn)
                isPrime[k] = false;
        }
    }

    // Everything still marked above sqrt is prime.
    for (int n = sqrt + 1; n <= max; n++)
        if (isPrime[n])
            primes.Add(n);

    return primes;
}
```

This code needs C# 4.0 to compile.

Edit 2010-12-14

Dommer found out that the BitArray implementation had some serious threading issues. I had my worries about the non-thread-safe characteristics of BitArray, but I thought that `isPrime[n] ^= true;` was an atomic operation and that, since it doesn't matter in what order the bits are flipped, it would be possible to use it anyway. Not so. I changed it to a boolean array and that seems to do the trick, but of course at a much higher memory cost.
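The root of the problem is that a toggle is a read-modify-write: read the old value, XOR it, write it back. Two threads can read the same old value and one of the flips gets lost. With a `BitArray` it is worse, because 32 bits share one `int` and a flip of any of them rewrites the whole word. A minimal sketch of one way to keep the compact storage and still toggle safely is a compare-and-swap loop. The `AtomicBitSet` class below is a hypothetical helper, not something from the framework, and it assumes `Volatile.Read`, which arrived in .NET 4.5.

```csharp
using System.Threading;

// Hypothetical helper: bits packed into ints, toggled with compare-and-swap.
sealed class AtomicBitSet
{
    private readonly int[] words;

    public AtomicBitSet(int length)
    {
        // One int holds 32 bits; assumes length is well below int.MaxValue.
        words = new int[(length + 31) / 32];
    }

    public void Toggle(int index)
    {
        int slot = index >> 5;           // which int
        int mask = 1 << (index & 31);    // which bit inside that int
        int seen;
        do
        {
            seen = Volatile.Read(ref words[slot]);
            // Only write if nobody changed the word since we read it,
            // otherwise loop and try again with the fresh value.
        } while (Interlocked.CompareExchange(ref words[slot], seen ^ mask, seen) != seen);
    }

    public bool Get(int index)
    {
        return (Volatile.Read(ref words[index >> 5]) & (1 << (index & 31))) != 0;
    }
}
```

In the sieve, `isPrime[n] ^= true;` would become `isPrime.Toggle(n);`. I have not benchmarked this, and the retries on contended words could well cost more than the memory saving is worth.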

Edit 2010-01-20

Indications are that this does in fact not scale very well on a quad core. It's even worse: it seems to scale well on my old T7200 but not on a dual core E6320. I don't know why, but of course the shared state of the isPrime `BitArray` is a huge problem, and maybe differences in CPU architecture (FSB speed, caches and so on) in the E6320 are an explanation. Average execution time on the E6320 was 1290ms in a single thread and 1064ms in two.

If you want to try this in an older version of C# than 4.0 then check out this post.

```
var steps = new List<long>();
var watch = new Stopwatch();
for (int i = 0; i < 10; i++)
{
    watch.Reset();
    watch.Start();
    var primes = FindPrimesBySieveOfAtkins(20000000);
    watch.Stop();
    Console.WriteLine(watch.ElapsedMilliseconds.ToString());
    steps.Add(watch.ElapsedMilliseconds);
}
Console.WriteLine("Average: " + steps.Average().ToString());
```

Edit 2010-10-24

Tom's code from the comment below

```
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Text;
using System.Threading.Tasks;

namespace Calculate_Primes
{
    class Program
    {
        private const int _NUMBER_OF_DIGITS = 100;

        static void Main(string[] args)
        {
            BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
            BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));

            Console.WindowWidth = 150;

            //var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
        }

        static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
        {
            for (BigInteger i = fromInclusive; i <= toInclusive; i++)
                yield return i;
        }

        static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
        {
            Parallel.ForEach(Range(fromInclusive, toInclusive), body);
        }

        static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
        {
            using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }

            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
            {
                ParallelFor(floor, ceiling, i =>
                {
                    if (_isPrime(i))
                    {
                        lock (sw)
                        {
                            sw.Write(i.ToString() + System.Environment.NewLine);
                            sw.Flush();
                        }
                    }
                });
            }
        }

        static bool _isPrime(BigInteger number)
        {
            bool returnValue = true;

            Console.WriteLine("Checking {0} for primality.", number.ToString());

            if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
                returnValue = false;
            else
                for (BigInteger i = 2; i * i <= number; i++)
                {
                    if (number % i == 0)
                        returnValue = false;
                }

            if (returnValue)
                Console.WriteLine("         {0} IS prime.", number.ToString());
            else
                Console.WriteLine("         {0} IS NOT prime.", number.ToString());

            return returnValue;
        }
    }
}
```

Posted in C#, Math | 19 comments

1. Are BitArrays thread-safe, or how does that work?

Is it possible to write a version of this without the Parallel.For method? Well, I suppose it is possible of course, but would it be a big mess? :p

2. Replace the Parallel.For with for (int x = 1; x <= sqrt; x++) and remove ); from row 24 and you should be good to go.

If I understand http://bit.ly/8lZagW correctly the isPrime[n] = !isPrime[n]; is an atomic operation but I have to investigate the matter of thread safety further. Thanks!

3. But that would remove the parallelism :p I was wondering if it could be done in a nice way without using Parallel.For, but still have the parallelism. (So do whatever Parallel.For does yourself)

4. From that link you posted: “Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.” – Wouldn’t that mean that it is not thread-safe? or?

5. That’s for “long, ulong, double, and decimal”. Read/write of booleans is atomic. I’m just not sure that isPrime[n] = !isPrime[n]; is the same as Boolean test = false; test = !test; which would be atomic.

6. “I was wondering if it could be done in a nice way without using Parallel.”

7. So should maybe use the Set method instead then? Or doesn’t make much difference perhaps…

Thanks for the link. Will check it out :)

8. http://coding-time.blogspot.com/2008/03/implement-your-own-parallelfor-in-c.html - makes it possible to run FindPrimesBySieveOfAtkins unchanged in C# 2.0-3.5.
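To give an idea of what "doing what Parallel.For does yourself" involves, here is a minimal sketch built on plain threads. It is not the implementation from the linked post; `ManualParallel` is a made-up name, and it assumes the loop body does not throw (an unhandled exception on a worker thread would take the process down).

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for Parallel.For using System.Threading.Thread only.
static class ManualParallel
{
    public static void For(int fromInclusive, int toExclusive, Action<int> body)
    {
        int workers = Math.Max(1, Environment.ProcessorCount);
        var threads = new Thread[workers];

        // Shared counter; each worker grabs the next unprocessed index.
        // A long is used so the counter cannot wrap around past toExclusive.
        long next = fromInclusive - 1L;

        for (int t = 0; t < workers; t++)
        {
            threads[t] = new Thread(() =>
            {
                while (true)
                {
                    long i = Interlocked.Increment(ref next);
                    if (i >= toExclusive) break;
                    body((int)i);
                }
            });
            threads[t].Start();
        }

        // Block until every worker has run out of indexes.
        foreach (var thread in threads)
            thread.Join();
    }
}
```

Replacing `Parallel.For(1, sqrt, x => ...)` with `ManualParallel.For(1, sqrt, x => ...)` should then be enough. Handing out one index at a time keeps the cores evenly loaded even though small values of `x` do more work than large ones in the sieve.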

9. These lines: isPrime[n] = !isPrime[n];

Should be replaced with isPrime[n] ^= true;

No more atomicity.

10. Here’s a little something I came up with–except for the commented out LINQ query–using Stephen Toub’s comments on Scott Hansellman’s blog:

``````
using System;
using System.Collections.Generic;
using System.Linq;
using System.Numerics;
using System.Text;
using System.Threading.Tasks;

namespace Calculate_Primes
{
class Program
{
private const int _NUMBER_OF_DIGITS = 100;

static void Main(string[] args)
{
BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));

Console.WindowWidth = 150;

//var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);

Console.Clear();
_calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");

Console.Clear();
_calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
}

static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
{
for (BigInteger i = fromInclusive; i <= toInclusive; i++)
yield return i;
}

static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
{
Parallel.ForEach(Range(fromInclusive, toInclusive), body);
}

static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
{
using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }

using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
{
ParallelFor(floor, ceiling, i =>
{
if (_isPrime(i))
{
lock (sw)
{
sw.Write(i.ToString() + System.Environment.NewLine);
sw.Flush();
}
}
});
}
}

static bool _isPrime(BigInteger number)
{
bool returnValue = true;

Console.WriteLine("Checking {0} for primality.", number.ToString());

if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
returnValue = false;
else
for (BigInteger i = 2; i * i <= number; i++)
{
if (number % i == 0)
returnValue = false;
}

if(returnValue)
Console.WriteLine("         {0} IS prime.", number.ToString());
else
Console.WriteLine("         {0} IS NOT prime.", number.ToString());

return returnValue;
}
}
}
``````

11. @Tom

isPrime[n] ^= true

An XOR toggle, nice! I haven’t checked it but I would guess that both that and mine compile to something similar.

No more atomicity.

Now you lost me. I believe it is and that that’s a good thing.

12. @Tom Thanks for the code and sorry for the not so fancy commenting function on my blog.

I think that you are right in that brute force prime search scales over multiple CPUs. It could be a problem that it’s so terribly slow in comparison to the Sieve of Eratosthenes and the Sieve of Atkin, I just don’t know. Guess I have to read up on how those gigantic primes that have been found were found.

13. This implementation of the Sieve of Eratosthenes should take about 500 ms on your PC, finding primes up to 20*10^6.

Regards,

Peter

14. @Peter Thanks! Even though Sieve of Atkin in theory should be faster than Sieve of Eratosthenes your implementation of the latter is much faster than mine of the former. I’m not surprised because I just did a naive translation from the pseudo code on Wikipedia to C#.

Also, my implementation can’t handle searching for primes up to 1000000000. It seems to be the line `int n = 4 * xx + yy;` that is the problem. `4*xx+yy` does not fit in an `Int32` for max=1000000000.
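A small sketch of the overflow, and of the obvious remedy of doing the arithmetic in 64 bits, could look like this. `OverflowSketch` and its two methods are hypothetical names used only for this illustration; I have not rerun the sieve with the change.

```csharp
// Hypothetical illustration of why 4 * xx + yy breaks for large limits.
static class OverflowSketch
{
    // Same arithmetic as in the sieve: everything stays in 32 bits
    // and silently wraps around when the result is too large.
    public static int WrappedInt(int x, int y)
    {
        unchecked { return 4 * x * x + y * y; }
    }

    // Widening to long before multiplying keeps the true value.
    public static long Widened(int x, int y)
    {
        return 4L * x * x + (long)y * y;
    }
}
```

For max=1000000000 the loop reaches x = 31622. There `Widened(31622, 1)` is 3999803537, well above `int.MaxValue`, while `WrappedInt(31622, 1)` comes out as -295163759. Declaring `xx`, `yy` and `n` as `long` in the sieve, and casting `n` back to `int` only after the `n <= max` check, should avoid this.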

```
Peter's impl. of The Sieve Of Eratosthenes
n: 10000  primes: 1229 in 0 ms
n: 100000  primes: 9592 in 2 ms
n: 1000000  primes: 78498 in 20 ms
n: 10000000  primes: 664579 in 213 ms
n: 100000000  primes: 5761455 in 2744 ms
n: 1000000000  primes: 50847534 in 33338 ms

My impl. of The Sieve Of Atkin
n: 10000  primes: 1229 in 3 ms
n: 100000  primes: 9592 in 6 ms
n: 1000000  primes: 78498 in 64 ms
n: 10000000  primes: 664579 in 584 ms
n: 100000000  primes: 5761455 in 7612 ms
```

15. FWIW: 20M took 387 ms on my 6 core using the algo in this post. Rockin! Thanks for the C# implementation.

16. just wanted to say you can use watch.Restart() instead of watch.Reset(); watch.Start();
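With `Restart()` the timing loop gets a little shorter. Here is a minimal sketch that wraps it in a reusable helper; `BenchmarkSketch` and `AverageMilliseconds` are made-up names, and `Stopwatch.Restart()` needs .NET 4 or later.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper: run some work several times and average the timings.
static class BenchmarkSketch
{
    public static double AverageMilliseconds(Action work, int runs)
    {
        var watch = new Stopwatch();
        var samples = new long[runs];

        for (int i = 0; i < runs; i++)
        {
            watch.Restart();   // same as Reset() followed by Start()
            work();
            watch.Stop();
            samples[i] = watch.ElapsedMilliseconds;
        }

        // Throws InvalidOperationException if runs is 0.
        return samples.Average();
    }
}
```

The benchmark in the post would then be `BenchmarkSketch.AverageMilliseconds(() => FindPrimesBySieveOfAtkins(20000000), 10)`.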

17. You can fix the original posting without using boolean arrays, but you would need to create a bit array for each thread and process each one separately. After all the threads are finished, simply XOR them all together. This gives the best of both worlds.

18. @AronM That sounds ingeniously simple! I will have to try that out. Still worried about the XOR. Why wouldn’t it be affected by all other threads?
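A minimal sketch of how AronM's suggestion might look, using the `Parallel.For` overload that takes thread-local state. This is my reading of the idea, not AronM's code, and `AtkinPerThreadSketch` is a made-up name. Each worker only ever flips bits in its own private `BitArray`, so other threads cannot interfere; the only shared write is the final merge, which happens under a lock. Because XOR is commutative and associative, the order in which the private arrays are merged does not matter.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch: Sieve of Atkin with one private BitArray per worker.
static class AtkinPerThreadSketch
{
    static void Flip(BitArray bits, int i) { bits[i] = !bits[i]; }

    public static List<int> FindPrimes(int max)
    {
        var primes = new List<int>();
        if (max >= 2) primes.Add(2);
        if (max >= 3) primes.Add(3);
        if (max < 5) return primes;

        var isPrime = new BitArray(max + 1, false);
        var sqrt = (int)Math.Sqrt(max);
        var mergeLock = new object();

        Parallel.For<BitArray>(1, sqrt + 1,
            () => new BitArray(max + 1, false),        // private array for this worker
            (x, state, local) =>
            {
                long xx = (long)x * x;                 // long avoids Int32 overflow
                for (int y = 1; y <= sqrt; y++)
                {
                    long yy = (long)y * y;
                    long n = 4 * xx + yy;
                    if (n <= max && (n % 12 == 1 || n % 12 == 5)) Flip(local, (int)n);
                    n = 3 * xx + yy;
                    if (n <= max && n % 12 == 7) Flip(local, (int)n);
                    n = 3 * xx - yy;
                    if (x > y && n <= max && n % 12 == 11) Flip(local, (int)n);
                }
                return local;
            },
            local => { lock (mergeLock) { isPrime.Xor(local); } });  // merge the flips

        // Cross out multiples of squares of the primes up to sqrt.
        for (int n = 5; n <= sqrt; n++)
        {
            if (!isPrime[n]) continue;
            long nn = (long)n * n;
            for (long k = nn; k <= max; k += nn) isPrime[(int)k] = false;
        }

        for (int n = 5; n <= max; n++)
            if (isPrime[n]) primes.Add(n);
        return primes;
    }
}
```

One thing to keep in mind is that `Parallel.For` may create more workers than there are cores, and each one allocates a full-size `BitArray`, so the memory saving compared to a `bool[]` depends on how many workers the scheduler decides to start.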