udim @ 2006-11-06 12:28:00
Tiny Mix Tapes to ATOM
The ATOM feed: http://stuff.pulkes.org/tmt2atom.php
php source
So I wasted a Saturday creating a website-to-RSS PHP script for sites that don't have RSS. I went back and forth between trying to use the XML parser, writing my own HTML parser, and trying to find an already written HTML parser:
- Trying to parse HTML as XML doesn't work, even if you strip most of the tags and add a dummy enclosing tag. XML parsers (at least PHP's) are just too strict, and most HTML is buggy (unescaped &'s, for instance).
- HTML Tidy wasn't compiled into DreamHost's PHP, and when I tried building my own I found they didn't have libtidy installed, so I gave up on it.
- Writing my own parser, I couldn't shake the nagging feeling that I was reinventing the wheel. Also, it didn't take long (only a couple of hours, but I have MANY hours to spare) to reach the first hurdle: PHP is SLOW! And then I remembered that PHP's XML parser uses libexpat, which is written in C, and it all went downhill from there.
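The first bullet is easy to reproduce. Here is a minimal sketch in Python rather than the PHP the script was written in; Python's xml.etree.ElementTree is also backed by expat, so it is just as strict. The sample markup and the helper name parses_as_xml are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up snippet of typical real-world HTML: the "&" is not escaped as "&amp;".
html = "<div><a href='/reviews/1'>Rock & Roll</a></div>"


def parses_as_xml(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(parses_as_xml(html))                        # the bare "&" is rejected
print(parses_as_xml(html.replace("&", "&amp;")))  # escaping it is enough here
```

One stray ampersand anywhere in a page is enough to make the whole parse fail, which is why stripping tags and adding a wrapper element doesn't help.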
I saw the light when I gave up on making a general script that could fit any website and resorted to old-fashioned regular expressions. Much faster, and it still ended up being a general script.
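For the curious, the regex approach looks roughly like this. It is a sketch in Python under assumptions, not the actual tmt2atom source: the SITE settings, the headline pattern, the example.com URL and the function names are all hypothetical, and the pattern would have to be rewritten for each site's markup. Keeping the per-site bits in one settings dictionary is what lets the rest stay general.

```python
import re
from html import unescape
from xml.sax.saxutils import escape

# Hypothetical per-site settings; only this part changes from site to site.
SITE = {
    "title": "Example feed",
    "base_url": "http://example.com",
    "pattern": re.compile(r'<a class="headline" href="([^"]+)">(.*?)</a>', re.S),
}


def extract_entries(html, site):
    """Return (absolute link, plain-text title) pairs found by the site's regex."""
    entries = []
    for href, raw_title in site["pattern"].findall(html):
        # Drop nested tags, then decode entities such as "&amp;" to plain text.
        title = unescape(re.sub(r"<[^>]+>", "", raw_title)).strip()
        entries.append((site["base_url"] + href, title))
    return entries


def to_atom(entries, site, updated):
    """Build a bare-bones Atom document from the extracted entries."""
    out = ['<?xml version="1.0" encoding="utf-8"?>',
           '<feed xmlns="http://www.w3.org/2005/Atom">',
           "  <title>%s</title>" % escape(site["title"]),
           "  <id>%s/</id>" % escape(site["base_url"]),
           "  <author><name>%s</name></author>" % escape(site["title"]),
           "  <updated>%s</updated>" % updated]
    for link, title in entries:
        out += ["  <entry>",
                "    <title>%s</title>" % escape(title),
                '    <link href="%s"/>' % escape(link, {'"': "&quot;"}),
                "    <id>%s</id>" % escape(link),
                "    <updated>%s</updated>" % updated,
                "  </entry>"]
    out.append("</feed>")
    return "\n".join(out)


# Made-up page fragment, including the kind of unescaped "&" that breaks XML parsers.
sample = '<li><a class="headline" href="/reviews/1">Rock & <b>Roll</b></a></li>'
feed = to_atom(extract_entries(sample, SITE), SITE, "2006-11-06T12:28:00Z")
print(feed)
```

The regex doesn't care whether the page is well-formed, and everything gets escaped on the way out, so the buggy HTML never reaches an XML parser.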
Now I just have to wait for TMT to update to see if it really works.
Update: 1. It works. 2. JWZ already did this a long time ago.
keywords: atom, rss, feed, tinymixtapes, tiny mix tapes, tmt2atom