Posted by Jonas Elfström
Wed, 03 Feb 2010 22:48:00 GMT
I don't live on that same continent as Ira Glass and because of that I can't listen to the wonderful This American Life on the radio so instead I subscribe to the podcast. Every time a new episode shows up that is the first thing I listen to. There are a lot of great podcasts out there but if I had to choose only one, I think I would go for This American Life (sorry RadioLab, I love you). Recently I learned that the host and producer of the show, Ira Glass, can be found on YouTube talking about storytelling.
It's amazing how he covers how he thinks you should tell a story for a radio/TV show. I recognize how the show executes that narrative but it also doesn't take anything away from the fact the show is excellent and that there are a lot of workmanship and talent put into it.
Ira Glass on Storytelling #1
Ira Glass on Storytelling #2
Ira Glass on Storytelling #3
Ira Glass on Storytelling #4
Posted in Blogging | no comments
Posted by Jonas Elfström
Wed, 03 Feb 2010 16:32:00 GMT
I know how it works and I think I can see why but I'm still not very fond of how eager C# is to perform implicit string conversion.
Contrived example:
1
2
|
string s = -42 + '+' + "+" + -0.1 / -0.1 + "=" + (7 ^ 5) +
" is " + true + " and not " + AddressFamily.Unknown; |
s will be set to "1+1=2 is True and not Unknown"
The answer is in white text above, select the text to see it.
A more real problem is something like this
|
string str = 1 + 2 + "!=" + 1 + 2; |
str will be set to "3!=12".
Edit 2010-02-08
This wouldn't be much of a problem if all objects in .NET always returned a decent string representation of their current state/value with ToString() but that's not the case. Instead "The default implementation returns the fully qualified name of the type of the Object.".
I don't like the inconsistency. It's way too late now but I think it would have been much better if only objects that really produces a human readable output of the data in the object should implement ToString(). If you want the name of the type of the Object there should be another way.
Posted in C# | no comments
Posted by Jonas Elfström
Mon, 01 Feb 2010 23:04:00 GMT
There's no StringBuilder class in Ruby because the String class has the << for appending. The problem is that not every Ruby programmer seems to be aware of it. Recently I've seen += being used to append to strings where << would have been a much better choice.
The problem with using += is that it creates a new String instance and if you do that in a loop you can get really horrible performance.
If you are dealing with an array you don't even have to use << because Array#join is even faster and shows intent in a nice way.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
|
require 'benchmark'
array_of_random_strings=(0...262144).map{65.+(rand(25)).chr}.join.scan(/.{1,8}/m)
Benchmark.bm do |benchmark|
benchmark.report do
str=array_of_random_strings.join
end
benchmark.report do
str2=""
array_of_random_strings.each do |s|
str2<<s
end
end
benchmark.report do
str3=""
array_of_random_strings.each do |s|
str3+=s
end
end
end |
The array_of_random_strings is an array of 32768 8 characters long random strings.
| user | system | total | real |
| 0.030000 | 0.000000 | 0.030000 | ( 0.027184) |
| 0.160000 | 0.010000 | 0.170000 | ( 0.190277) |
| 106.020000 | 0.300000 | 106.320000 | (113.457793) |
The performance of += was even worse than I imagined!
Posted in Ruby | no comments
Posted by Jonas Elfström
Thu, 14 Jan 2010 21:55:00 GMT
Justin Etheredge has been blogging about his challenge to find prime numbers with LINQ. He later used AsParallel() (coming in .NET 4) to speed things up and then followed that up with a post about using The Sieve Of Eratosthenes.
As you can see in the comments of those posts I tried to speed the Sieve of Eratosthenes up by using Parallel.For in the inner loop. I also tried AsParallel() in the LINQ expression but it made no difference in either case. At most it got 5% faster. I'm not sure but it could be that because SoE is very memory intense we could have a scaling issue and maybe also memory bandwidth exhaustion. This is mere speculation.
I then searched for other algorithms and found The Sieve of Atkin. It uses less memory than SoE so I thought I'd give it a try.
I set the limit to 20,000,000 and then benchmarked it. It timed in on 2.48s so actually worse than the 2.2s that SoE took. Not good!
Then I added Parallel.For in the loop that did most of the work and lo and behold, it scaled! I have two cores in my machine (T7200@2.0GHz) and the average runtime went down to 1.26s. That's almost linear and surprisingly good! If you happen have a quad core (or more) and feel like trying it out then please contact me. It would be interesting to see if it scales further.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
|
static List<int> FindPrimesBySieveOfAtkins(int max)
{
var isPrime = new BitArray((int)max+1, false);
var sqrt = (int)Math.Sqrt(max);
Parallel.For(1, sqrt, x =>
{
var xx = x * x;
for (int y = 1; y <= sqrt; y++)
{
var yy = y * y;
var n = 4 * xx + yy;
if (n <= max && (n % 12 == 1 || n % 12 == 5))
isPrime[n] = !isPrime[n];
n = 3 * xx + yy;
if (n <= max && n % 12 == 7)
isPrime[n] = !isPrime[n];
n = 3 * xx - yy;
if (x > y && n <= max && n % 12 == 11)
isPrime[n] = !isPrime[n];
}
});
var primes = new List<int>() { 2, 3 };
for (int n = 5; n <= sqrt; n++)
{
if (isPrime[n])
{
primes.Add(n);
int nn = n * n;
for (int k = nn; k <= max; k += nn)
isPrime[k] = false;
}
}
for (int n = sqrt + 1; n <= max; n++)
if (isPrime[n])
primes.Add(n);
return primes;
} |
This is C# 4.0 code, compiled in Visual C# 2010 Express Beta 2.
Edit 2010-01-20
Indications are that this does in fact not scale very good on a quad core. It's even worse, it seems it scales good on my old T7200 but not on a dual core E6320. I don't know why but of course the shared state of the isPrime BitArray is a huge problem and maybe it could be that differences in CPU architecture (FSB speed, caches and so on) in the E6320 is an explanation. Average execution time on the E6320 was 1290ms in a single thread and 1064ms in two.
If you want to try this in an older version of C# than 4.0 then check out this post.
A reader asked how I timed the executions. Here's how.
1
2
3
4
5
6
7
8
9
10
11
12
13
|
var steps = new List<long>();
var watch = new Stopwatch();
for (int i = 0; i < 10; i++)
{
watch.Reset();
watch.Start();
var primes = FindPrimesBySieveOfAtkins(20000000);
watch.Stop();
Console.WriteLine(watch.ElapsedMilliseconds.ToString());
steps.Add(watch.ElapsedMilliseconds);
}
Console.WriteLine("Average: " + steps.Average().ToString()); |
Posted in C#, Java | 9 comments
Posted by Jonas Elfström
Mon, 02 Nov 2009 21:50:00 GMT
Say you have two objects of the same class and you want to know what differs between them. Well actually you just want to know the instance variables in object b that differs from the ones in object a.
To begin with, we need a class. I like cheese.
1
2
3
4
5
6
|
class Cheese
attr_accessor :name, :weight, :expire_date
def initialize(name, weight, expire_date)
@name, @weight, @expire_date = name, weight, expire_date
end
end |
Then we need some cheese objects.
1
2
|
stilton=Cheese.new('Stilton', 250, Date.parse("2009-11-02"))
gorgonzola=Cheese.new('Gorgonzola', 250, Date.parse("2009-11-17")) |
With only name, weight and an expiration date it would be easy to compare those but imagine that these two objects has 42 properties. It does not stop there, you are being asked to compare 24 different classes in this way. Are you cringing yet?
Object#instance_variables to the rescue! Well, that and a small hack by me. Below I add a new method called instance_variables_compare to Object. The long method name is because I wanted to follow the naming already in place. Usually I prefer to do these kind of things as a module and then include them where appropriate but in this case I find that a monkey patch will do.
1
2
3
4
5
6
7
|
class Object
def instance_variables_compare(o)
Hash[*self.instance_variables.map {|v|
self.instance_variable_get(v)!=o.instance_variable_get(v) ?
[v,o.instance_variable_get(v)] : []}.flatten]
end
end |
It returns the instance variables that differs as a hash because it's handy and because I like it that way.
1
2
3
4
5
6
7
8
9
10
|
>> stilton.instance_variables_compare(gorgonzola)
=> {"@name"=>"Gorgonzola", "@expire_date"=>#<Date: 4910305/2,0,2299161>}
>> gorgonzola.instance_variables_compare(stilton)
=> {"@name"=>"Stilton", "@expire_date"=>#<Date: 4910275/2,0,2299161>}
>> stilton.expire_date=gorgonzola.expire_date
=> #<Date: 4910305/2,0,2299161>
>> stilton.instance_variables_compare(gorgonzola)
=> {"@name"=>"Gorgonzola"}
>> stilton.instance_variables_compare(stilton)
=> {} |
If you ever think of using this code you should be aware of two things.
- This code is very untested and comes with no guarantees.
- Since instance variables spring into life the first time they are assigned to you either have to work with objects that always initialize everything or you have to change
instance_variables_compare to handle this.
Posted in Ruby | no comments