Counting the number of Google Readers
Posted by Jonas Elfström Mon, 19 Oct 2009 19:55:00 GMT
I run this blog on a 9-year-old laptop hidden in a cabinet in the living room. It's not a powerful machine, but it has been up to the job since I turned it into a web server 7 years ago. This may be one of the last HP Omnibook 4150b still in use; at the very least it belongs to a very exclusive club of laptops that have been switched on for the past 7.5 years. Recently I've seen an increase in traffic, especially from Feedfetcher-Google. It so happens that Feedfetcher also reports the number of subscribers.
[19/Oct/2009:22:01:19 +0200] "GET /xml/rss20/feed.xml HTTP/1.1" 304 0 "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 4 subscribers; feed-id=7686756599804593322)"
The above is only one out of five different feed-ids, because I have both Atom and RSS feeds and for a short while this blog was at another address. The fifth feed is actually me subscribing to the comments.
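To see which feed-ids show up and what each one reports, something along these lines could be used. It is only a sketch: it assumes GNU grep (for `-o`) and that every Feedfetcher line has the same shape as the one above, and `list_feeds` is a made-up helper name.

```shell
# Hypothetical helper: read access-log lines on stdin and print the
# "<n> subscribers; feed-id=<id>" fragment of every Feedfetcher line.
list_feeds() {
  grep 'Feedfetcher-Google' |
    grep -o '[0-9][0-9]* subscribers; feed-id=[0-9][0-9]*'
}
```

Running `tail -n 1000 /www/logs/access.log | list_feeds | sort -u` lists each distinct combination of count and feed-id seen recently.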
I'm not using FeedBurner so I can't get my statistics from there but I still wanted to be able to see the number of Google Readers of my blog (as far as I can see I only have one other type of subscriber).
Usually I script anything more advanced than a grep in Ruby but this time I made an exception and stayed in Bash.
    tail -1000 /www/logs/access.log |
    grep Feedfetcher |
    cut -d ";" -f 4 |
    sort -u |
    while IFS= read -r line
    do
      tac /www/logs/access.log | grep -m 1 $line
    done |
    sed 's/^.*html; \([0-9]*\) subscribers.*/\1/' | awk '{tot=tot+$1} END {print tot}'
Most certainly this can be optimized in a number of ways. Don't be shy, just tell me!
So what's going on there? First I grab the last 1000 rows from my access log; right now my traffic is so low that this is far more than I really need. Then I extract all unique feed-ids from the rows containing Feedfetcher. I pipe those to a loop that finds the very last access for each one of them. Finally I parse out the number of subscribers with a regexp in sed and sum them with awk.
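One possible optimization, offered as a sketch rather than a tested replacement: since the feed-ids come from the last 1000 rows anyway, the most recent access for each of them is already inside those rows, so the loop that re-reads the whole log with tac can be replaced by a single pass that lets later lines overwrite earlier ones. The function name `count_google_readers` is made up for illustration, and the regexp assumes the same line shape as the example above.

```shell
# Hypothetical helper: read access-log lines on stdin, keep the most recent
# subscriber count per feed-id, and print the sum over all feed-ids.
count_google_readers() {
  grep 'Feedfetcher-Google' |
    # Reduce each line to "<feed-id> <subscribers>".
    sed -n 's/.*; \([0-9][0-9]*\) subscribers*; feed-id=\([0-9][0-9]*\)).*/\2 \1/p' |
    # Lines arrive oldest first, so the last assignment per feed-id wins.
    awk '{ latest[$1] = $2 }
         END { for (id in latest) total += latest[id]; print total + 0 }'
}
```

It would be called as `tail -n 1000 /www/logs/access.log | count_google_readers`, which reads the log once instead of once per feed-id.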
It turns out that I have a whopping 15 subscribers, and I am one of them.
