Sporkmonger

purveyor of fabulously ambiguous eating utensils

FeedTools Schema And Other Short Stories

Posted by sporkmonger
Written September 27th, 2005

Now that FeedTools no longer automatically creates the database schema for you, I thought it might be best to put the schema files into rdoc. Of course, rdoc runs those schema files through its text formatter, and they come out more typographically correct on the other side, which isn’t really what we want. After a little experimentation, I discovered that if I prefixed the SQL with a SQL comment and then indented the SQL that followed by two spaces, rdoc would parse it in such a way that you can still copy-paste from the docs straight into whatever SQL frontend you happen to be using.

E.g.:

-- Example PostgreSQL schema
  CREATE TABLE feeds (
    id                SERIAL PRIMARY KEY NOT NULL,
    url               varchar(255) default NULL,
    title             varchar(255) default NULL,
    link              varchar(255) default NULL,
    xml_data          text default NULL,
    http_headers      text default NULL,
    last_retrieved    timestamp default NULL
  );

By the way, does anyone know of a good SQL frontend for PostgreSQL for OS X? pgAdmin3 crashes on me every 5 seconds or so, and that’s more than a little irritating. At this point, I don’t even care if it’s free/open-source (though that’s a huge bonus). I just want something that works well and doesn’t look hideous.

FeedTools also got a significant speed-up for instances in which HTTP redirection occurs and the URL doesn’t get updated (usually because it’s a permanent redirection instead of a temporary one). In other words, the cache gets updated with the new URL, but the open method continues to get called with the old URL. FeedTools used to be unaware of the updated feed in the cache and would go out and pull the feed again. Now FeedTools checks the cache before following a redirection to see whether the feed is already cached and whether it has expired. While this definitely does increase the number of cache misses during redirection, misses are pretty painless, and the potential speed-up from a hit far outweighs the potential slow-down from the extra misses. I’ll take one or two extra SQL queries over an unnecessary HTTP request any day of the year.
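To make the idea concrete, here is a rough sketch of checking the cache at each hop before following a redirect. It is not the actual FeedTools code: CacheEntry, open_feed, the Hash-based cache, the fetcher callable, and the one-hour expiry are all hypothetical stand-ins chosen for illustration.

```ruby
# Hypothetical stand-ins; none of these names are FeedTools APIs.
CacheEntry = Struct.new(:url, :body, :expires_at) do
  def expired?(now = Time.now)
    now >= expires_at
  end
end

# Walks redirects starting at start_url, consulting the cache at every hop.
# `cache` is a Hash of url => CacheEntry. `fetcher` is any callable that
# takes a url and returns [:redirect, new_url] or [:ok, body].
def open_feed(start_url, cache, fetcher, max_hops = 5)
  url = start_url
  max_hops.times do
    entry = cache[url]
    # A fresh cache hit ends the walk without another HTTP request.
    return entry.body if entry && !entry.expired?

    status, value = fetcher.call(url)
    if status == :redirect
      url = value # the next pass checks the cache for the new url first
    else
      # Assumed one-hour lifetime, purely for the sake of the example.
      cache[url] = CacheEntry.new(url, value, Time.now + 3600)
      return value
    end
  end
  raise "too many redirects starting from #{start_url}"
end
```

In this sketch, each hop costs one extra cache lookup, but a hit on the redirected URL skips the second HTTP request entirely.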

HTTP error messages should now include a list of locations that FeedTools was redirected through before hitting the error. This was inserted primarily for the purposes of debugging.
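As an illustration only, an error that carries its redirect trail might look something like the following. The class name FeedFetchError and the message format are assumptions made for this sketch, not what FeedTools actually raises.

```ruby
# Hypothetical error class; the name and message format are illustrative only.
class FeedFetchError < StandardError
  attr_reader :locations

  # reason:    a short description of the HTTP failure
  # locations: the URLs visited, in order, before the failure
  def initialize(reason, locations)
    @locations = locations
    super("#{reason} (redirected through: #{locations.join(' -> ')})")
  end
end

error = FeedFetchError.new(
  '404 Not Found',
  ['http://example.com/a', 'http://example.com/b'])
puts error.message
```

Keeping the list on the exception itself, as well as in the message, means debugging code can inspect the hops without parsing a string.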

I removed the global FeedTools.cache_only option in favor of a more granular approach. You can now say:

feed = FeedTools::Feed.open(
  'http://rss.slashdot.org/Slashdot/slashdot',
  :cache_only => true)

You may notice that I removed the attribute dictionary functionality. If you were using it, sorry about that, but I decided it was too ugly and hackish, not to mention slow. It had to go.

I split the feed_tools.rb file into a couple of pieces as well. No more 5,000-line files that are a huge pain to navigate.

FeedTools should now also automatically detect User-Agent blocking and issue a warning if it runs into that.

Update:

The :cache_only configuration option has been renamed to :disable_update_from_remote. So the code should now be:

feed = FeedTools::Feed.open(
  'http://rss.slashdot.org/Slashdot/slashdot',
  :disable_update_from_remote => true)

  1. Written November 29th, 2005 at 04:12 PM

    MacSQL3 was ok, going to try out Navicat. I’ve given up on the free frontends for PostgreSQL on OSX.

  2. Written January 7th, 2006 at 03:51 PM

    To the people searching for serialz for Navicat and MacSQL 3 and finding this post:

    Please just buy the software. They gave you a free trial to evaluate it. Be good and pony up the cash if you found it useful.

    FYI, my logging software tells me your IP address, approximately where you live (Tokyo), and what you were looking for. You aren’t really anonymous. Play by the rules, it’s much nicer that way.

  3. Todd Boland:
    Written January 26th, 2006 at 12:06 PM

    Squirrel SQL Client is awesome for any SQL based database. It’s written in Java so you need the java drivers. Give it a try. I usually HATE Java based desktop apps and IDEs but Squirrel SQL is awesome. It doesn’t have that nasty ass faux-native os x java skin.

    http://squirrel-sql.sourceforge.net/

  4. Written January 26th, 2006 at 05:41 PM

    Ehh, I don’t think I like the look of the interface from the screen shots. And besides, Navicat is pretty much the perfect solution for my needs.
