Counting the number of Google Readers

Posted by Jonas Elfström Mon, 19 Oct 2009 19:55:00 GMT

I run this blog on a 9 year old laptop hidden in a cabinet in the living room. It's not a powerful machine but it has been up to the job since I turned it into a web server 7 years ago. This could maybe be one of the last HP Omnibook 4150b still in use, at least it has to be in a very exclusive club of laptops being switched on for the past 7.5 years. Recently I've seen an increase in traffic and especially from Feedfetcher-Google. It so happens that Feedfetcher also shows the number of subscribers.

[19/Oct/2009:22:01:19 +0200] "GET /xml/rss20/feed.xml HTTP/1.1" 304 0 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 4 subscribers; feed-id=7686756599804593322)"

The above is only one out of five different feed-ids because I have both atom and rss and for a short while this blog was at another address. The fifth feed is actually myself subscribing to the comments.

I'm not using FeedBurner so I can't get my statistics from there but I still wanted to be able to see the number of Google Readers of my blog (as far as I can see I only have one other type of subscriber).

Usually I script anything more advanced than a grep in Ruby but this time I made an exception and stayed in Bash.

1
2
3
4
5
6
7
8
9
tail -1000 /www/logs/access.log |
grep Feedfetcher |
cut -d ";" -f 4 | sort -u |
while IFS= read -r line
do
   tac /www/logs/access.log | grep -m 1 $line
done |
sed 's/^.*html; \([0-9]*\) subscribers.*/\1/' |
awk '{tot=tot+$1} END {print tot}'

Most certainly this can be optimized in a number of ways. Don't be shy, just tell me!

So what's going on there? Well, first I get the last 1000 rows from my access log and right now my traffic is so low that that is way more than I really would have to. Then I get all unique feeed-ids from the rows containing Feedfetcher. I pipe those to a loop that gets the very last access for each one of them. Then I parse out the number of subscribers with a regexp in sed and count them with awk .

It turns out that I have a whopping number of 14 15 subscribers and I am one of them.

The Thrush combinator in C# 1

Posted by Jonas Elfström Tue, 06 Oct 2009 18:35:00 GMT

Last year I read Reg "Raganwald'" Braithwaite's excellent post The Thrush and he explains it as

The thrush is written Txy = yx. It reverses evaluation.

Back then I didn't even consider trying to implement it in C#. That was before I digged deeper into lambda expressions and extension methods in C# 3.0 and way before last night when I read Debasish Ghosh's post on how to implement the Thrush in Scala. After reading that my first thought was if it was possible to do the same in C#. Here's my attempt.

At first I struggled with the static typing and headed for an easy way out using Object in the extension method of Object:

1
2
3
public static object Into(this Object obj, 
                        Func<object, object> f)
{  return f.Invoke(obj); }


My goal was to translate the Ruby example

(1..100).select(&:odd?).inject(&:+).into { |x| x * x }


in Raganwald's post to C#.

Which reads "Take the numbers from 1 to 100, keep the odd ones, take the sum of those, and then answer the square of that number."

But with the Object based extension method I had to do some ugly casts.

var r = Enumerable.Range(1, 100).Where(x => Odd(x)).Sum().Into(x => (int)x * (int)x);


With som added typing I could do:

var result = Enumerable.Range(1, 100).Where(x => Odd(x)).Sum().Into(x => x * x);


That merely moved the cast to the extension method and also made it work for integers only.

1
2
public static int Into(this Object obj, Func<int, int> f)
{ return f.Invoke((int)obj); }


Then I remembered generics and method type inference which finally led to a decent Thrush combinator in C#.

1
2
public static T Into<T>(this T obj, Func<T, T> f)
{ return f(obj); }


The casts are gone and it's also, as far as I can see, as flexible as the one in Ruby.

Contrived example follows:

1
2
var test = "ball";
var ball = test.Into(s => "Are we having a " + s + " yet?");


The odd part

The Odd(x) method call in the calculation above is a plain static method.

1
2
private static bool Odd(int n)
{ return (n % 2 != 0); }


If you want an even more terse syntax you could try an ext. method on IEnumerable like this:

1
2
public static IEnumerable<int> Odd(this IEnumerable<int> en)
{ return en.Where(n => n % 2 != 0); }


Gives:

var result = Enumerable.Range(1, 100).Odd().Sum().Into(x => x * x);


In C# I don't think it's possible to pull off the Symbol#to_proc stuff that Ruby does. That's the &: in the select(&:odd?) and the inject(&:+) in the Ruby example. Raganwald has a great post on that too.

Edit

Check out Jon Skeet's nice answer on StackOverflow to my question on how to make this even more Ruby-like. I have to try out that Operator class later though.

Edit 2009-10-07

One thing I found a bit surprising is that by implementing the Into ext. method in this way it not only works for all objects based on System.Object but it also works for value types.

1
2
3
4
int n=4711;
int oddOrZero = n.Into(x => x % 2 !=0 ? x : 0); // 4711
n = 4712;
oddOrZero = n.Into(x => x % 2 != 0 ? x : 0); // 0


Edit 2009-10-12

My confusion did stem from my lack of understanding of extension methods. Ex. methods are in fact not extending System.Object or any other type, they are "nothing more than a pleasant syntax for calling a static method" in case no instance method with the same name can be found.

A case for using only three different digits in keypad codes 4

Posted by Jonas Elfström Sun, 27 Sep 2009 19:02:00 GMT

Keypads have obvious security problems and keypads accepting a stream of digits with no # or enter in between, while checking for the four digit long code, are even worse.

The important part is to not leak the digits in the code by wear or intentional markings because if they leak it's suddenly very far from 10000 combinations.

If the "lock picker" only knows that the code contains four digits there are 10000 combinations. Keypads accepting a stream of digits can then be opened in a maximum of 10003 keystrokes using the De Bruijn sequence. That is still quite a lot.

Below is a Ruby implementation of the De Bruijn sequence.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
# De Bruijn sequence
# Original implementation by Frank Ruskey (1994)
# translated to C by Joe Sawada
# translated to Ruby by Jonas Elfström (2009)

@n=4
@k=10
@a=[0]
@sequence=[]

def debruijn(t, p)
  if t>@n
    if @n%p==0
      1.upto(p) {|j| @sequence<<@a[j]}
    end
  else
    @a[t]=@a[t-p]
    debruijn(t+1,p)
    (@a[t-p]+1).upto(@k-1) {|j|
      @a[t]=j
      debruijn(t+1,t)
    }
  end
end

debruijn(1,1)
print @sequence.join


It's not uncommon to find keypads with 4 of the 10 keys worn down and if you do you can be pretty sure that the code contains those four different digits. The number of possible combinations are 4! = 4x3x2x1 = 24. I got curious to see if there's a kind of De Bruijn sequence for this that brings down the 4*24=96 keystrokes. By scribbling in a text editor I quickly realized there's not a clean sequence. Not clean in the way that a sequence following the rules can be created. Also it's probably even quite daunting to present it as mathematically dense and beautiful as the De Bruijn but that could be my less than great combinatorics speaking.

I made a quick and dirty brute force hack to try to find a shorter sequence.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
seq=[]
1.upto(4) {|a| 1.upto(4) {|b| 1.upto(4) {|c| 1.upto(4) {|d|
  seq << "%d%d%d%d" % [a,b,c,d] if !(a==b || a==c || a==d || b==c || b==d || c==d)
}}}}

s=seq[0]
seq.delete_at(0)
while (seq.length>0)
  next_code=(seq.select {|c| c[0..2]==s[-3..-1]})
  if next_code.empty?
    next_code=(seq.select {|c| c[0..1]==s[-2..-1]})
    if next_code.empty? 
      next_code=(seq.select {|c| c[0]==s[-1]})
      s+=next_code[0][1..3]
      seq.delete_at(seq.index(next_code[0]))
    else
      s+=next_code[0][2..3]
      seq.delete_at(seq.index(next_code[0]))
    end
  else
    next_code=(seq.select {|c| c[0..2]==s[-3..-1]})
    s+=next_code[0][3].chr
    seq.delete_at(seq.index(next_code[0]))
  end
end


The above code takes the first code "1234" of the 24 and then searches the rest of the array for a code beginning with "234". It finds "2341" and adds "1" to the end of s and continues to look for "341" and so on. Relatively soon there is no three digit match and then it tries two digits and eventually even that fails and then it gets the first one digit match. The resulting sequence is:

123412314231243121342132413214321

From 96 to 33 keystrokes. Not as effective as De Bruijn but still significant. Unlike De Bruijn I have absolutely no proof that this is the shortest one possible but it seems likely. Also notice that in the middle of the sequence we find "3121" and "1213". Those break the criteria of four different digits but they seem to be necessary to be able to enter the reversed mode. Try reading the sequence forward and backwards to see what I mean.

If the code only contains two digits it's gets even more trivial to try them all. There are 14 possible codes and by compressing those to one sequence you get down to 20 keystrokes.

Things get a little more interesting if only three buttons are worn. It turns out that the repeated digits can be placed in the code in six different ways.

0012,1002,1200,0102,0120,1020

That's 6x2x3=36 combinations and, maybe a little unintuitive, 12 more than if you are using four different digits. I compressed it down to 49 key strokes (16 more than with four different digits). Unlike the sequence for four different digits I can't find it with google and I know it's kind of security by obscurity but I still chose not to publish it here.

Be aware that if an attacker knows you are using a 0012-like code he gets a smaller space to search. 6x8x9x10=4320 instead of 10000. You have to weight the risk of button leaks against a code protocol leak.

Edit 2010-10-25
Uckelman noticed that the alike variable in the former version of the debruijn-script wasn't used so I removed it.

Why you should use four different digits for keypad locks 3

Posted by Jonas Elfström Wed, 23 Sep 2009 17:49:00 GMT

I made a couple of very bad mistakes in this article so I took it down. Hopefully I'm more on track in the sequel.

Breaking simple ciphers 7

Posted by Jonas Elfström Wed, 16 Sep 2009 18:19:00 GMT

The last few days I've happened to stumble over a couple of ciphers and I just couldn't help myself from trying to break them.

The Lost Symbol

Dan Brown has a new book coming out and part of the promotion is this cipher text "AOFACFSOA FSZWBEIC EIOA ZOHSFWQWOA OQQSDW". The WQW, QQ and three of the words ending with an A made me believe we could be dealing with a substitution cipher and maybe even a Caesar cipher, the most simple of them all.

My usual tool of choice is Ruby and in this case the splendid Interactive Ruby Shell.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
$ irb
>> s="AOFACFSOA FSZWBEIC EIOA ZOHSFWQWOA OQQSDW"
=> "AOFACFSOA FSZWBEIC EIOA ZOHSFWQWOA OQQSDW"
>> def caesar(text,n)
>>   alphas=('A'..'Z').to_a*2
>>   text.tr('A-Z', alphas[n..n+26].join)
>> end
>> 1.upto(25) do |n| puts "%2d. %s" % [n, caesar(s,n)] end
 1. BPGBDGTPB GTAXCFJD FJPB APITGXRXPB PRRTEX
 2. CQHCEHUQC HUBYDGKE GKQC BQJUHYSYQC QSSUFY
 3. DRIDFIVRD IVCZEHLF HLRD CRKVIZTZRD RTTVGZ
 4. ESJEGJWSE JWDAFIMG IMSE DSLWJAUASE SUUWHA
 5. FTKFHKXTF KXEBGJNH JNTF ETMXKBVBTF TVVXIB
 6. GULGILYUG LYFCHKOI KOUG FUNYLCWCUG UWWYJC
 7. HVMHJMZVH MZGDILPJ LPVH GVOZMDXDVH VXXZKD
 8. IWNIKNAWI NAHEJMQK MQWI HWPANEYEWI WYYALE
 9. JXOJLOBXJ OBIFKNRL NRXJ IXQBOFZFXJ XZZBMF
10. KYPKMPCYK PCJGLOSM OSYK JYRCPGAGYK YAACNG
11. LZQLNQDZL QDKHMPTN PTZL KZSDQHBHZL ZBBDOH
12. MARMOREAM RELINQUO QUAM LATERICIAM ACCEPI
13. NBSNPSFBN SFMJORVP RVBN MBUFSJDJBN BDDFQJ
14. OCTOQTGCO TGNKPSWQ SWCO NCVGTKEKCO CEEGRK
15. PDUPRUHDP UHOLQTXR TXDP ODWHULFLDP DFFHSL
16. QEVQSVIEQ VIPMRUYS UYEQ PEXIVMGMEQ EGGITM
17. RFWRTWJFR WJQNSVZT VZFR QFYJWNHNFR FHHJUN
18. SGXSUXKGS XKROTWAU WAGS RGZKXOIOGS GIIKVO
19. THYTVYLHT YLSPUXBV XBHT SHALYPJPHT HJJLWP
20. UIZUWZMIU ZMTQVYCW YCIU TIBMZQKQIU IKKMXQ
21. VJAVXANJV ANURWZDX ZDJV UJCNARLRJV JLLNYR
22. WKBWYBOKW BOVSXAEY AEKW VKDOBSMSKW KMMOZS
23. XLCXZCPLX CPWTYBFZ BFLX WLEPCTNTLX LNNPAT
24. YMDYADQMY DQXUZCGA CGMY XMFQDUOUMY MOOQBU
25. ZNEZBERNZ ERYVADHB DHNZ YNGREVPVNZ NPPRCV

Take a closer look at row 12.

MARMOREAM RELINQUO QUAM LATERICIAM ACCEPI

I found Rome a city of bricks and left it a city of marble. - Google tells me it's Augustus.

The code is not the most clear I've written but if you read Ruby in your sleep you can skip this part.

('A'..'Z') is a range in Ruby. Another, maybe more obvious, example of a range is (0..7).

.to_a could be read as to_array and unsurprisingly it converts a range to an array. (0..7).to_a will create [0, 1, 2, 3, 4, 5, 6, 7]

The operator * for arrays appends n copies of the array. Thus [0,1,2]*2 will create [0,1,2,0,1,2].

String#tr works the same way as the Unix command tr, it translates the characters in the string according to the from and to parameters.

At last .join converts the array to a string.

The recruiting agency

A government agency responsible for signals intelligence is hiring. Among the qualifications they are looking for is the ability to break a certain cipher. I will not publish their cipher here but instead one of my own, constructed in the same way as theirs.

"VGhpcyBpcyBleGNsdXNpdmUgZm9yIHlvdSwgb3I/IGMNR0d
LCkZPXgpTRV8K\nTENEQ1lCCkhfXgpoT1NFRElPCkNZCksKSE9eXk
9YCklYU1peRU1YS1pCT1gK\nXkJLRAp5SUJET0NPWAQ="
At first glance it looked like Base64 and the ending "=" made it even more likely.
1
2
3
4
5
$ irb
>> require 'base64'
>> cipher = "VGhpcyBpcyBleGNsdXNpdmUgZm9yIHlvdSwgb3I/IGMNR0dLCkZPXgpTRV8K\nTENEQ1lCCkhfXgpoT1NFRElPCkNZCksKSE9eXk9YCklYU1peRU1YS1pCT1gK\nXkJLRAp5SUJET0NPWAQ="
>> decoded=Base64.decode64(cipher)
=> "This is exclusive for you, or? c\rGGK\nFO^\nSE\nLCDCYB\nH^\nhOSEDIO\nCY\nK\nHO^^OX\nIXSZ^EMXKZBOX\n^BKD\nyIBDOCOX\004"

So it's Base64 but to no surprise it didn't end there. The "This is exclusive for you, or?" hinted at XOR so I tried XORing the text with 0-255.

1
2
3
4
5
6
7
8
>> code=decoded[31..decoded.length].split(//)
>> File.open('xor.txt','w') { |file|
?>   0.upto(255) {|n|
?>     file.write(n.to_s + " ")
>>     code.each {|c| file.write( (c[0]^n).chr ) }
>>     file.write("\n\n")
>>   }
>> }

A quick look in the file told me that XORing with 42 was the solution.

Now you know how to break two of the most simple cipher methods. Use the knowledge wisely. :)

Older posts: 1 ... 3 4 5 6 7 ... 12