Feeding Nightmare

Don't worry, my friends: despite the title of this post, I've been eating OK. The problem wasn't about food this time; it was about RSS feeds and some of the libraries that deal with them, which I had to use for a project recently.

I needed a blog-planet-like solution for a website we are building with Joomla. It wasn't the simple read-them-all type of planet (like Planet Planet), though, because it needed admin approval and had to integrate with Joomla. Because of the perpetual polling this requires, no Joomla module could help us (Joomla is not a server), and because of the level of specialization, I had to roll our own solution.

I was warned, through this book I once read (which I strongly recommend to anyone who has anything to do with web programming), that dealing with RSS feeds is not exactly straightforward. The obvious thing was that I needed a library, so I started searching for what the RSS modules for Joomla were using. I found Simple Pie, which seemed to be a pretty straightforward, standard solution (and had a beautiful website :) not that it would help me code better, but anyway). I coded a little sample and tried it out... and nothing happened. I moved some stuff around in the code, tried again, and still nothing. I was pretty sure I was following the instructions correctly, so I couldn't imagine what was wrong... until I read the documentation in more detail and grepped the code for the $_SERVER variable, and there it was. I then uploaded the code to my personal web server, tried it from the web browser, and it magically worked, so it was clear to me that Simple Pie needs to run inside a web server. Given that we'd probably want to run the polling on a separate server (not the web server), I had to discard that library and its beautiful website.

My next try was Mark Pilgrim's Universal Feed Parser, which had the great disadvantage of being written in Python, a language I can code in but tend to avoid, especially because PyDev hasn't been a great solution for me as an Eclipse plugin (don't waste your time suggesting another IDE... it's Eclipse's way or no way; support the Dynamic Languages Toolkit project for Eclipse instead :)). Yet it's the one Planet Planet uses, so it had to be good enough for my purposes.

And it kinda was. I tried out some code samples based on intuition and the small documentation, and got a little program working. I learned some things while doing so, for example that Blogspot provides no summary information for posts the way WordPress does (i.e., WordPress rulz), so I had to write a workaround for that (i.e., Blogspot sucks). I then started to build the rest of the stuff: DB integration with the MySQL module, and MIME e-mails with the email and smtplib modules (it's hard to find helpful documentation on this). Everything was working quite fine (despite the spaghetti code I came up with) until I noticed that the sample e-mails the script was sending had my author name instead of the title of my post. I was running tests with my blog and di3go's, and all of di3go's posts looked OK. I was really puzzled, and I started checking everything from my code to the XML output of both blogs... nothing seemed wrong. I finally had the idea of checking the Universal Feed Parser's bug list, and there it was: a bug that only happened with WordPress (I'm a lucky guy, huh?). So I downloaded a nightly build of the code, and everything worked fine after that.
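
For the curious, here is a minimal sketch of how the missing-summary workaround and the notification e-mail could look. It is not the actual script: it assumes each entry behaves like a dict with optional "summary", "content", "title" and "link" keys, and the function names, the subject line and the 300-character cut-off are all made up for illustration. It only builds the message; sending it would be a job for smtplib.

```python
import re
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from html import unescape


def entry_summary(entry, max_chars=300):
    """Return the entry's summary, or a plain-text excerpt of its content."""
    summary = entry.get("summary")
    if summary:
        return summary
    # No summary (the Blogspot case described above): fall back to the
    # first content block, if there is one.
    content = entry.get("content") or [{}]
    html = content[0].get("value", "")
    # Crude tag stripping; good enough for a short excerpt.
    text = unescape(re.sub(r"<[^>]+>", " ", html))
    text = " ".join(text.split())
    if len(text) > max_chars:
        text = text[:max_chars].rstrip() + "..."
    return text


def build_approval_mail(entry, sender, recipient):
    """Build (but do not send) a MIME message describing one feed entry."""
    msg = MIMEMultipart()
    # The subject uses the post title, not the author name.
    msg["Subject"] = "New post awaiting approval: " + entry.get("title", "(untitled)")
    msg["From"] = sender
    msg["To"] = recipient
    body = entry_summary(entry) + "\n\n" + entry.get("link", "")
    msg.attach(MIMEText(body, "plain", "utf-8"))
    return msg
```

Keeping the fallback in one small function means the rest of the script never has to care which blogging platform a post came from.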

I tgz'd the code up and sent it away, thanking God it was all over. All of this took me like 4 hours, though, so I'm pretty sure I've paid my RSS feed nightmare hours for the year, and I don't want to deal with feeds again, at least in the short term. Now some final disclaimers: Simple Pie looks like a very cool solution; Universal Feed Parser is a great tool and simplifies the job a lot (that's as simple as it gets, given the differences between the RSS, Atom and mixed standards out there, and each site's implementation of them); so my props to these two projects.