Finding primes in parallel

Posted by Jonas Elfström Thu, 14 Jan 2010 21:55:00 GMT

Justin Etheredge has been blogging about his challenge to find prime numbers with LINQ. He later used AsParallel() (coming in .NET 4) to speed things up and then followed that up with a post about using The Sieve Of Eratosthenes.

As you can see in the comments of those posts, I tried to speed up the Sieve of Eratosthenes by using Parallel.For in the inner loop. I also tried AsParallel() in the LINQ expression, but neither made much difference: at most it got 5% faster. I'm not sure, but because SoE is very memory intensive it could be that we have a scaling issue and maybe also memory bandwidth exhaustion. This is mere speculation.
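
For reference, here is a minimal sketch of what putting Parallel.For in the inner loop of the Sieve of Eratosthenes can look like. It is not the code from the comments on Justin's posts; the class and method names are made up for illustration. Every write in the parallel loop stores the same value (true), so concurrent writes to the bool array should not conflict with each other.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class EratosthenesSketch
{
    // Hypothetical helper, not taken from the original posts.
    public static List<int> FindPrimes(int max)
    {
        var primes = new List<int>();
        if (max < 2) return primes;

        var isComposite = new bool[max + 1];
        var sqrt = (int)Math.Sqrt(max);

        for (int p = 2; p <= sqrt; p++)
        {
            if (isComposite[p]) continue;

            // Mark p*p, p*p + p, p*p + 2p, ... up to max in parallel.
            int start = p * p;
            int step = p;
            int count = (max - start) / step + 1;
            Parallel.For(0, count, i => { isComposite[start + i * step] = true; });
        }

        // Everything left unmarked from 2 upwards is prime.
        for (int n = 2; n <= max; n++)
            if (!isComposite[n])
                primes.Add(n);

        return primes;
    }
}
```

The inner loop does very little work per element, which fits the observation above that parallelising it gained at most 5%.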

I then searched for other algorithms and found The Sieve of Atkin. It uses less memory than SoE so I thought I'd give it a try.

I set the limit to 20,000,000 and benchmarked it. It clocked in at 2.48s, so actually worse than the 2.2s that SoE took. Not good! Then I added Parallel.For in the loop that did most of the work and lo and behold, it scaled! I have two cores in my machine (T7200@2.0GHz) and the average runtime went down to 1.26s. That's almost linear (a speedup of about 1.97x on two cores) and surprisingly good! If you happen to have a quad core (or more) and feel like trying it out, then please contact me. It would be interesting to see if it scales further.

static List<int> FindPrimesBySieveOfAtkins(int max)
{
   //  var isPrime = new BitArray((int)max+1, false); 
   //  Can't use BitArray because of threading issues.

    var isPrime = new bool[max + 1];
    var sqrt = (int)Math.Sqrt(max);

    Parallel.For(1, sqrt, x =>
    {
        var xx = x * x;
        for (int y = 1; y <= sqrt; y++)
        {
            var yy = y * y;
            var n = 4 * xx + yy;
            if (n <= max && (n % 12 == 1 || n % 12 == 5))
                isPrime[n] ^= true;

            n = 3 * xx + yy;
            if (n <= max && n % 12 == 7)
                isPrime[n] ^= true;

            n = 3 * xx - yy;
            if (x > y && n <= max && n % 12 == 11)
                isPrime[n] ^= true;
        }
    });

    var primes = new List<int>() { 2, 3 };
    for (int n = 5; n <= sqrt; n++)
    {
        if (isPrime[n])
        {
            primes.Add(n);
            int nn = n * n;
            for (int k = nn; k <= max; k += nn)
                isPrime[k] = false;
        }
    }

    for (int n = sqrt + 1; n <= max; n++)
        if (isPrime[n])
            primes.Add(n);

    return primes;
}

This code needs C# 4.0 to compile.

Edit 2010-12-14

Dommer found out that the BitArray implementation had some serious threading issues. I had my worries about BitArray not being thread safe, but I thought that isPrime[n] ^= true; was an atomic operation and that, since it doesn't matter in what order the bits are flipped, BitArray could be used anyway. Not so. I changed it to a boolean array and that seems to do the trick, but of course at a much higher memory cost.
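
One way around the shared-state problem, suggested by Aaron M in the comments, is to give each worker its own BitArray and XOR them all together at the end. What follows is a rough, untuned sketch of that idea using the Parallel.For overload with thread-local state. The class name, the method name and the mergeLock object are made up for illustration, and n is a long here to sidestep the Int32 overflow for large limits. The reason to bother: a toggle such as isPrime[n] ^= true is a read followed by a write, so two threads that hit the same index at the same moment can lose an update.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Threading.Tasks;

static class AtkinLocalSketch
{
    // Hypothetical variant of FindPrimesBySieveOfAtkins: workers never share a BitArray.
    public static List<int> FindPrimes(int max)
    {
        var primes = new List<int>();
        if (max < 2) return primes;

        var isPrime = new BitArray(max + 1, false);
        var sqrt = (int)Math.Sqrt(max);
        var mergeLock = new object();

        Parallel.For<BitArray>(1, sqrt + 1,
            () => new BitArray(max + 1, false),      // private array per worker
            (x, loopState, local) =>
            {
                long xx = (long)x * x;
                for (int y = 1; y <= sqrt; y++)
                {
                    long yy = (long)y * y;

                    long n = 4 * xx + yy;
                    if (n <= max && (n % 12 == 1 || n % 12 == 5))
                        local[(int)n] = !local[(int)n];

                    n = 3 * xx + yy;
                    if (n <= max && n % 12 == 7)
                        local[(int)n] = !local[(int)n];

                    n = 3 * xx - yy;
                    if (x > y && n <= max && n % 12 == 11)
                        local[(int)n] = !local[(int)n];
                }
                return local;
            },
            // XOR is order independent, so merging under a lock gives the same toggles.
            local => { lock (mergeLock) { isPrime.Xor(local); } });

        primes.Add(2);
        if (max >= 3) primes.Add(3);

        // Remove multiples of squares of primes, as in the original code.
        for (int n = 5; n <= sqrt; n++)
        {
            if (!isPrime[n]) continue;
            primes.Add(n);
            long nn = (long)n * n;
            for (long k = nn; k <= max; k += nn)
                isPrime[(int)k] = false;
        }

        for (int n = sqrt + 1; n <= max; n++)
            if (isPrime[n])
                primes.Add(n);

        return primes;
    }
}
```

Each worker allocates a full-size BitArray, so the memory use grows with the number of workers; whether that beats the bool array in practice is something that would have to be measured.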

Edit 2010-01-20

Indications are that this does in fact not scale very well on a quad core. It's even worse: it seems to scale well on my old T7200 but not on a dual-core E6320. I don't know why, but of course the shared state of the isPrime BitArray is a huge problem, and differences in CPU architecture (FSB speed, caches and so on) in the E6320 may be an explanation. Average execution time on the E6320 was 1290ms in a single thread and 1064ms in two.

If you want to try this in an older version of C# than 4.0 then check out this post.

A reader asked how I timed the executions. Here's how.

var steps = new List<long>();
var watch = new Stopwatch();

for (int i = 0; i < 10; i++) 
{
    watch.Reset();
    watch.Start();
    var primes = FindPrimesBySieveOfAtkins(20000000);
    watch.Stop();
    Console.WriteLine(watch.ElapsedMilliseconds.ToString());
    steps.Add(watch.ElapsedMilliseconds);
}
Console.WriteLine("Average: " + steps.Average().ToString());



Edit 2010-10-24

Tom's code from the comment below

using System; 
using System.Collections.Generic; 
using System.Linq; 
using System.Numerics; 
using System.Text; 
using System.Threading.Tasks;

namespace Calculate_Primes 
{ 
    class Program 
    { 
        private const int _NUMBER_OF_DIGITS = 100;

        static void Main(string[] args)
        {
            BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
            BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));

            Console.WindowWidth = 150;

            //var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");

            Console.Clear();
            _calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
        }

        static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
        {
            for (BigInteger i = fromInclusive; i <= toInclusive; i++)
                yield return i;
        }

        static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
        {
            Parallel.ForEach(Range(fromInclusive, toInclusive), body);
        } 

        static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
        {
            using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }

            using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
            {
                ParallelFor(floor, ceiling, i =>
                    {
                        if (_isPrime(i))
                        {
                            lock (sw)
                            {
                                sw.Write(i.ToString() + System.Environment.NewLine);
                                sw.Flush();
                            }
                        }
                    });
            }
        }

        static bool _isPrime(BigInteger number)
        {
            bool returnValue = true;

            Console.WriteLine("Checking {0} for primality.", number.ToString());

            if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
                returnValue = false;
            else
                for (BigInteger i = 2; i * i <= number; i++)
                {
                    if (number % i == 0)
                        returnValue = false;
                }

            if(returnValue)
                Console.WriteLine("         {0} IS prime.", number.ToString());
            else
                Console.WriteLine("         {0} IS NOT prime.", number.ToString());

            return returnValue;
        }
    }
}

Posted in C#, Math | 19 comments

Comments

    1. Svish Fri, 15 Jan 2010 10:03:18 GMT

      Are BitArrays thread-safe, or how does that work?

      Is it possible to write a version of this without the Parallel.For method? Well, I suppose it is possible of course, but would it be a big mess? :p

    2. Jonas Elfström Fri, 15 Jan 2010 12:33:16 GMT

      Replace the Parallel.For with for (int x = 1; x <= sqrt; x++) and remove ); from row 24 and you should be good to go.

      If I understand http://bit.ly/8lZagW correctly the isPrime[n] = !isPrime[n]; is an atomic operation but I have to investigate the matter of thread safety further. Thanks!

    3. Svish Fri, 15 Jan 2010 14:47:57 GMT

      But that would remove the parallelism :p I was wondering if it could be done in a nice way without using Parallel.For, but still have the parallelism. (So do whatever Parallel.For does yourself)

    4. Svish Fri, 15 Jan 2010 14:50:13 GMT

      From that link you posted: “Aside from the library functions designed for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or decrement.” – Wouldn’t that mean that it is not thread-safe? or?

    5. Jonas Elfström Fri, 15 Jan 2010 15:23:15 GMT

      That’s for “long, ulong, double, and decimal”. Read/write of booleans is atomic. I’m just not sure that isPrime[n] = !isPrime[n]; is the same as Boolean test = false; test = !test; which would be atomic.

    6. Jonas Elfström Fri, 15 Jan 2010 16:24:41 GMT
    7. Jonas Elfström Sat, 16 Jan 2010 15:11:21 GMT

      “I was wondering if it could be done in a nice way without using Parallel.”

      Check out: http://www.codeproject.com/KB/dotnet/PoorMansParallelForEach.aspx

    8. Svish Tue, 19 Jan 2010 17:33:16 GMT

      So should maybe use the Set method instead then? Or doesn’t make much difference perhaps…

      Thanks for the link. Will check it out :)

    9. Jonas Elfström Wed, 20 Jan 2010 08:11:22 GMT

      http://coding-time.blogspot.com/2008/03/implement-your-own-parallelfor-in-c.html - makes it possible to run FindPrimesBySieveOfAtkins unchanged in C# 2.0-3.5.

    10. Tom Fri, 22 Oct 2010 16:35:54 GMT

      These lines: isPrime[n] = !isPrime[n];

      Should be replaced with isPrime[n] ^= true;

      No more atomicity.

    11. Tom Fri, 22 Oct 2010 16:59:47 GMT

      Here’s a little something I came up with–except for the commented out LINQ query–using Stephen Toub’s comments on Scott Hanselman’s blog:

      using System;
      using System.Collections.Generic;
      using System.Linq;
      using System.Numerics;
      using System.Text;
      using System.Threading.Tasks;

      namespace Calculate_Primes
      {
      class Program
      {
          private const int _NUMBER_OF_DIGITS = 100;

          static void Main(string[] args)
          {
              BigInteger floor = BigInteger.Parse("1" + string.Empty.PadLeft(_NUMBER_OF_DIGITS - 1, '0'));
              BigInteger ceiling = BigInteger.Parse(string.Empty.PadLeft(_NUMBER_OF_DIGITS, '9'));
      
              Console.WindowWidth = 150;
      
              //var primes = Enumerable.Range(floor, ceiling).Where(n => Enumerable.Range(1, n).Where(m => (n / m) * m == n).Count() == 2);
      
              Console.Clear();
              _calculatePrimes(floor, ceiling, "C:\\100 digit primes.txt");
      
              Console.Clear();
              _calculatePrimes(floor, ceiling, "C:\\300 digit primes.txt");
          }
      
          static IEnumerable<BigInteger> Range(BigInteger fromInclusive, BigInteger toInclusive)
          {
              for (BigInteger i = fromInclusive; i <= toInclusive; i++)
                  yield return i;
          }
      
          static void ParallelFor(BigInteger fromInclusive, BigInteger toInclusive, Action<BigInteger> body)
          {
              Parallel.ForEach(Range(fromInclusive, toInclusive), body);
          } 
      
          static void _calculatePrimes(BigInteger floor, BigInteger ceiling, string resultsFileFilepath)
          {
              using (System.IO.FileStream fs = new System.IO.FileStream(resultsFileFilepath, System.IO.FileMode.Create)) { }
      
              using (System.IO.StreamWriter sw = new System.IO.StreamWriter(resultsFileFilepath))
              {
                  ParallelFor(floor, ceiling, i =>
                      {
                          if (_isPrime(i))
                          {
                              lock (sw)
                              {
                                  sw.Write(i.ToString() + System.Environment.NewLine);
                                  sw.Flush();
                              }
                          }
                      });
              }
          }
      
          static bool _isPrime(BigInteger number)
          {
              bool returnValue = true;
      
              Console.WriteLine("Checking {0} for primality.", number.ToString());
      
              if ((number < 2) || (number > 2 && number.IsEven) || (number > 2 && number.IsPowerOfTwo))
                  returnValue = false;
              else
                  for (BigInteger i = 2; i * i <= number; i++)
                  {
                      if (number % i == 0)
                          returnValue = false;
                  }
      
              if(returnValue)
                  Console.WriteLine("         {0} IS prime.", number.ToString());
              else
                  Console.WriteLine("         {0} IS NOT prime.", number.ToString());
      
              return returnValue;
          }
      }
      }

    12. Jonas Elfström Sat, 23 Oct 2010 22:10:50 GMT

      @Tom

      isPrime[n] ^= true

      An XOR toggle, nice! I haven’t checked it but I would guess that both that and mine compile to something similar.

      No more atomicity.

      Now you lost me. I believe it is and that that’s a good thing.

    13. Jonas Elfström Sat, 23 Oct 2010 22:22:29 GMT

      @Tom Thanks for the code and sorry for the not so fancy commenting function on my blog.

      I think that you are right in that brute force prime search scales over multiple CPUs. It could be a problem that it’s so terribly slow in comparison to the Sieve of Eratosthenes and the Sieve of Atkin, I just don’t know. Guess I have to read up on how those gigantic primes that have been found were found.

    14. Peter Tue, 21 Dec 2010 15:47:04 GMT

      This implementation of the Sieve of Eratosthenes should take about 500 ms on your PC finding primes up to 20*10^6

      Regards,

      Peter

    15. Jonas Elfström Wed, 22 Dec 2010 10:20:23 GMT

      @Peter Thanks! Even though Sieve of Atkin in theory should be faster than Sieve of Eratosthenes your implementation of the latter is much faster than mine of the former. I’m not surprised because I just did a naive translation from the pseudo code on Wikipedia to C#.

      Also my implementation can’t handle searching for primes up to 1000000000. It seems to be the line int n = 4 * xx + yy; that is the problem. 4*xx+yy does not fit in an Int32 for max=1000000000.

      Peter's impl. of The Sieve Of Eratosthenes
      n: 10000  primes: 1229 in 0 ms
      n: 100000  primes: 9592 in 2 ms
      n: 1000000  primes: 78498 in 20 ms
      n: 10000000  primes: 664579 in 213 ms
      n: 100000000  primes: 5761455 in 2744 ms
      n: 1000000000  primes: 50847534 in 33338 ms
      
      My impl. of The Sieve Of Atkin
      n: 10000  primes: 1229 in 3 ms
      n: 100000  primes: 9592 in 6 ms
      n: 1000000  primes: 78498 in 64 ms
      n: 10000000  primes: 664579 in 584 ms
      n: 100000000  primes: 5761455 in 7612 ms
      
    16. Dave Jellison Thu, 17 Feb 2011 00:01:58 GMT

      FWIW: 20M took 387 ms on my 6 core using the algo in this post. Rockin! Thanks for the C# implementation.

    17. Freeman Wed, 16 Nov 2011 14:00:58 GMT

      just wanted to say you can use watch.Restart() instead of watch.Reset(); watch.Start();

    18. Aaron M Sat, 10 Mar 2012 20:18:45 GMT

      You can fix the original posting without using boolean arrays, but you would need to create a bit array for each thread and do each one separately. Then, after all the threads are finished, simply XOR them all together. This gives the best of both worlds.

    19. Jonas Wed, 14 Mar 2012 00:20:19 GMT

      @Aaron M That sounds ingeniously simple! I will have to try that out. Still worried about the XOR. Why wouldn’t it be affected by all other threads?

Comments are closed