Finding primes in parallel 9

Posted by Jonas Elfström Thu, 14 Jan 2010 21:55:00 GMT

Justin Etheredge has been blogging about his challenge to find prime numbers with LINQ. He later used AsParallel() (coming in .NET 4) to speed things up and then followed that up with a post about using The Sieve Of Eratosthenes.

As you can see in the comments of those posts I tried to speed the Sieve of Eratosthenes up by using Parallel.For in the inner loop. I also tried AsParallel() in the LINQ expression but it made no difference in either case. At most it got 5% faster. I'm not sure but it could be that because SoE is very memory intense we could have a scaling issue and maybe also memory bandwidth exhaustion. This is mere speculation.

I then searched for other algorithms and found The Sieve of Atkin. It uses less memory than SoE so I thought I'd give it a try.

I set the limit to 20,000,000 and then benchmarked it. It timed in on 2.48s so actually worse than the 2.2s that SoE took. Not good! Then I added Parallel.For in the loop that did most of the work and lo and behold, it scaled! I have two cores in my machine (T7200@2.0GHz) and the average runtime went down to 1.26s. That's almost linear and surprisingly good! If you happen have a quad core (or more) and feel like trying it out then please contact me. It would be interesting to see if it scales further.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
static List<int> FindPrimesBySieveOfAtkins(int max)
{
    var isPrime = new BitArray((int)max+1, false);
    var sqrt = (int)Math.Sqrt(max);

    Parallel.For(1, sqrt, x =>
    {
        var xx = x * x;
        for (int y = 1; y <= sqrt; y++)
        {
            var yy = y * y;
            var n = 4 * xx + yy;
            if (n <= max && (n % 12 == 1 || n % 12 == 5))
                isPrime[n] = !isPrime[n];

            n = 3 * xx + yy;
            if (n <= max && n % 12 == 7)
                isPrime[n] = !isPrime[n];

            n = 3 * xx - yy;
            if (x > y && n <= max && n % 12 == 11)
                isPrime[n] = !isPrime[n];
        }
    });

    var primes = new List<int>() { 2, 3 };
    for (int n = 5; n <= sqrt; n++)
    {
        if (isPrime[n])
        {
            primes.Add(n);
            int nn = n * n;
            for (int k = nn; k <= max; k += nn)
                isPrime[k] = false;
        }
    }

    for (int n = sqrt + 1; n <= max; n++)
        if (isPrime[n])
            primes.Add(n);

    return primes;
}

This is C# 4.0 code, compiled in Visual C# 2010 Express Beta 2.

Edit 2010-01-20

Indications are that this does in fact not scale very good on a quad core. It's even worse, it seems it scales good on my old T7200 but not on a dual core E6320. I don't know why but of course the shared state of the isPrime BitArray is a huge problem and maybe it could be that differences in CPU architecture (FSB speed, caches and so on) in the E6320 is an explanation. Average execution time on the E6320 was 1290ms in a single thread and 1064ms in two.

If you want to try this in an older version of C# than 4.0 then check out this post.

A reader asked how I timed the executions. Here's how.

1
2
3
4
5
6
7
8
9
10
11
12
13
var steps = new List<long>();
var watch = new Stopwatch();

for (int i = 0; i < 10; i++) 
{
    watch.Reset();
    watch.Start();
    var primes = FindPrimesBySieveOfAtkins(20000000);
    watch.Stop();
    Console.WriteLine(watch.ElapsedMilliseconds.ToString());
    steps.Add(watch.ElapsedMilliseconds);
}
Console.WriteLine("Average: " + steps.Average().ToString());

Unit testing strains 3

Posted by Jonas Elfström Mon, 05 May 2008 19:17:00 GMT

I've felt it and I've heard it from colleagues several times. Writing unit tests can be hard work. Especially adding unit test to an existing code base is, at best, cumbersome. Also it's one of those things with delayed gratification. Sometimes it's not even you that will benefit from them being there because the biggest win can be long down the road, when changes to the system has to be made.

Tests may seem to be isolated and it's even considered a good thing to keep them that way. Even so the tests of your application has a correlation to what the system aims to do on a bigger scale. This one of the things BDD focuses on. I think that one of the biggest advantages is that you in one process writes a specification and tests that ensures that the spec. is met. Testing becomes a natural part of the development process. This way it clearly shows that BDD and TDD are design processes and that it's certainly not all about adding unit tests.

Find out more about BDD on: http://behaviour-driven.org
It must be stressed that BDD is a rephrasing of existing good practice, it is not a radically new departure. Its aim is to bring together existing, well-established techniques under a common banner and with a consistent and unambiguous terminology.

For Ruby RSpec has almost become the de facto standard for BDD. The concepts Story, Scenario, and Test feels natural and the syntax is short and easy to read.

In languages like Java or C# the tests often becomes much more cluttered and some of that clutter is the extra code that comes with static typing. I believe that dynamically typed and overall dynamic languages like Ruby or Python could find a nice little niche here. They could become DSL's for testing.

RSpec is on it's way for .NET/C# via IronRuby and for Java via JRuby but don't hold your breath because they are still in alpha and beta.

.NET / C#
Testing .NET with IronRuby...
NSpecify => RSpec… well closer anyway

Java
Java Functional Testing with JRuby and RSpec
JtestR

Ruby
Slapp - A simple chat wall Merb tutorial. With nice exampes of using RSpec.
Behavior-driven testing with RSpec

ASP.NET MVC
ASP.NET MVC Test Framework Integration Walkthrough
MVC Preview - Testing
ASP.NET MVC Session at Mix08, TDD and MvcMockHelpers