
    Oops I did it again. Feedburner top 100

    By maurizio | September 17, 2007

    I am dumb. Instead of running a test first, I started my crawler with the wrong logic. Remember last week, when I said I was looking for the wrong links? Well, I made a mistake again. Go and check the stats page. Instead of looking for an IMG with a SRC pointing to “feedburner.com”, I started looking for an A link with an HREF pointing to “feedburner.com”, which seemed reasonable to me.

    But I forgot one simple thing. People like to link to interesting feeds on their blogs, so a link to a FeedBurner feed doesn’t mean the feed belongs to the site that links to it. Oops.

    So the Top 100 is now even worse than before, because the links it picks up are even more legitimate than the ones behind the previous mistake. Actually, even the previous “mistake” wasn’t that bad (apart from the first 2-3 links, which stole the TechCrunch feed’s numbers), because it’s perfectly legitimate to post someone else’s chicklet, for example to create a Top 100 list :-)

    So now I’m stuck. I’ll probably have to find a more advanced technique, like looking for a LINK tag with an href pointing to feedburner.com.
    That sounds like a good idea, but I’m pretty sure not every site uses it.
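
    To make the three approaches concrete, here is a minimal sketch (not my actual crawler code) that sorts FeedBurner references by the tag they sit in. Note that feed autodiscovery LINK tags carry the URL in `href`. The class and function names, the substring match on "feedburner.com" and the `rel="alternate"` filter are all assumptions made for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

FEED_HOST = "feedburner.com"  # assumption: a plain substring match is good enough


class FeedburnerRefParser(HTMLParser):
    """Counts FeedBurner references, grouped by the kind of tag they appear in."""

    def __init__(self):
        super().__init__()
        self.refs = Counter()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            # Chicklet images: may be someone else's counter.
            kind, url = "chicklet", attrs.get("src")
        elif tag == "a":
            # Ordinary links: often just a recommendation of another feed.
            kind, url = "anchor", attrs.get("href")
        elif tag == "link":
            # Only feed autodiscovery links count, not stylesheets and the like.
            rel = (attrs.get("rel") or "").lower().split()
            if "alternate" not in rel:
                return
            kind, url = "autodiscovery", attrs.get("href")
        else:
            return
        if url and FEED_HOST in url.lower():
            self.refs[kind] += 1


def classify(html):
    """Returns a dict such as {"autodiscovery": 1, "anchor": 2}."""
    parser = FeedburnerRefParser()
    parser.feed(html)
    parser.close()
    return dict(parser.refs)
```

    A page that yields an "autodiscovery" hit is most likely advertising its own feed, while "anchor" and "chicklet" hits may well point at somebody else's. Pages without a LINK tag would still be missed, which is exactly the worry above.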

    So this week I have to start yet another huge crawl. I wonder if it’s time to store the HTML pages on my server.
    360,000 pages × 10 KB of HTML = 3.6 GB. Ouch.
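
    The back-of-the-envelope figure can be checked with a few lines. This is only a sketch: the 10 KB average is my rough guess from above, the function name is made up, and it uses decimal units (1 GB = 1,000,000 KB).

```python
PAGES = 360_000
AVG_PAGE_KB = 10  # rough per-page estimate from the post, not a measurement


def crawl_size_gb(pages, avg_kb):
    """Estimated storage in decimal gigabytes (1 GB = 1,000,000 KB)."""
    return pages * avg_kb / 1_000_000


print(crawl_size_gb(PAGES, AVG_PAGE_KB))  # 3.6
```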

    Topics: Content Creation, Programming, Ramblings
