Making an RSS feed by hand

RSS feeds are cool. They allow people to follow webpages. They don’t get stuck in spam filters. They give people power over the web. I wanted to create an RSS feed for the Alife newsletter. The Alife newsletter is built by hand-made Python scripts that parse multiple markdown files, so I had to learn how to make an RSS feed by hand from those markdown files.

Part 1: The RSS format

A well-formed RSS file looks like this:


<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
  <channel>
    <title>Title of the feed</title>
    <link>Link to your webpage</link>
    <description>Description of your feed</description>
    <language>en-US</language>
    <item>
      <title>Title of one news item</title>
      <link>Link to the item</link>
      <guid>Unique id</guid>
      <pubDate>27 Nov 2013 15:17:32 GMT</pubDate>
      <description>The information to be displayed by the reader. Could be everything.</description>
    </item>
  </channel>
</rss>

The RSS 2.0 specification describes the format in detail.
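
To make the structure above concrete, here is a minimal sketch of a checker for it, using only Python's standard library. The function name check_feed and the sample feed are made up for illustration and are not part of the newsletter scripts. It only tests the three channel elements that RSS 2.0 requires (title, link, description) and the rule that each item needs at least a title or a description.

```python
import xml.etree.ElementTree as ET

# A made-up feed used only for this illustration.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Title of the feed</title>
<link>https://example.org/</link>
<description>Description of your feed</description>
<item>
<title>Title of one news item</title>
<guid>Unique id</guid>
</item>
</channel>
</rss>
"""

def check_feed(xml_bytes):
    """Return a list of problems found in an RSS 2.0 document (empty if none)."""
    root = ET.fromstring(xml_bytes)
    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        return ["not an <rss> document with a <channel>"]
    problems = []
    # RSS 2.0 requires these three channel elements.
    for tag in ("title", "link", "description"):
        if channel.find(tag) is None:
            problems.append(f"channel is missing <{tag}>")
    # Every item needs at least a title or a description.
    for i, item in enumerate(channel.findall("item")):
        if item.find("title") is None and item.find("description") is None:
            problems.append(f"item {i} has neither <title> nor <description>")
    return problems

print(check_feed(SAMPLE))  # []
```

This is not a full validator. A feed that passes can still be rejected by a picky reader, so it is worth trying the real file in one as well.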

Part 2: Generating the items

Each item should carry a timestamp (its pubDate), but we had nowhere to store one, so I added a release_date.dat file to each edition’s directory with the release date in yyyy-mm-dd format.

The pubDate also needs a time of day, here given in GMT. Checking the local date and time of a server you don’t control (in this case, whatever server GitHub uses for its continuous integration) is usually a bit of a nightmare. Luckily, the newsletter is released once a month, so the exact time is immaterial. I just put 00:00:00 GMT there.

Note that the description in the code below is left empty; we are going to fix that later.
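
As an aside, the date string can also be produced by the standard library instead of a hand-written format. The sketch below is an alternative, not what the newsletter scripts do: the helper name pub_date is made up, and it relies on email.utils.format_datetime, which always writes English day and month names regardless of the server's locale (strftime's %b follows the locale).

```python
from datetime import date, datetime, timezone
from email.utils import format_datetime

def pub_date(day):
    """Format a date as an RSS pubDate at 00:00:00 GMT."""
    midnight = datetime(day.year, day.month, day.day, tzinfo=timezone.utc)
    # usegmt=True writes "GMT" instead of "+0000"; it needs a UTC datetime.
    return format_datetime(midnight, usegmt=True)

# The text is what a release_date.dat file would contain, newline included.
release = date.fromisoformat("2013-11-27\n".strip())
print(pub_date(release))  # Wed, 27 Nov 2013 00:00:00 GMT
```

The output includes the day of the week, which the date format used by RSS allows but does not require.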


import os
from datetime import date

def getDate(edition):
    # Read the release date (yyyy-mm-dd) stored in the edition's directory.
    dateFile = os.path.join(edition, "release_date.dat")
    if not os.path.exists(dateFile):
        raise FileNotFoundError(f"No release date found for {edition}")
    with open(dateFile, "r") as f:
        dateText = f.readline()
    return date(*[int(s) for s in dateText.split("-")])

def makeTitle(edition):
    # The edition number follows an 8-character prefix in the directory name.
    number = int(edition[8:])
    # 11, 12 and 13 take "th" even though they end in 1, 2 and 3.
    if 11 <= number % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return f"The {number}{suffix} edition of the Alife Newsletter"

def makeItem(edition):
    nDate = getDate(edition)
    titleDate = nDate.strftime("%B %Y")
    # The time of day is fixed at midnight GMT (see above).
    pubDate = nDate.strftime("%d %b %Y 00:00:00 GMT")
    link = "https://alife-newsletter.github.io/Newsletter/" + edition + ".html"

    item = []
    item.append("<item>")
    item.append(" <title>" + makeTitle(edition) + ", " + titleDate + "</title>")
    item.append(" <link>" + link + "</link>")
    item.append(" <guid>" + edition + "</guid>")
    item.append(" <pubDate>" + pubDate + "</pubDate>")
    item.append(" <description>")
    item.append(" </description>")
    item.append("</item>")
    return "\n".join(item)
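
One thing the string concatenation above does not handle is text containing &, < or >, which would break the XML if it ever showed up in a title or link. A minimal sketch of a safer way to build such elements follows; text_element is a hypothetical helper, not part of the newsletter scripts, and the title is invented.

```python
from xml.sax.saxutils import escape

def text_element(tag, text, indent=" "):
    """Build one XML element, escaping &, < and > in its text."""
    return f"{indent}<{tag}>{escape(text)}</{tag}>"

print(text_element("title", "Q&A: evolution < revolution"))
#  <title>Q&amp;A: evolution &lt; revolution</title>
```

The newsletter titles are generated from edition numbers, so this is a precaution rather than a fix for a known problem.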

Part 3: Putting the items together

Now that we can generate the items, a simple script puts all the items together and adds the header and the footer of the RSS.


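rss_header = """<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Artificial Life Newsletter</title>
<link>https://alife-newsletter.github.io/Newsletter/</link>
<description>The Artificial Life Newsletter Brings you the latest alife news!</description>
"""
rss_footer = """</channel>
</rss>
"""
# edition_names is the list of edition directory names, defined elsewhere in the script.
# Write the file as UTF-8 so it matches the encoding declared in the header.
with open(os.path.join("docs", "RSS.xml"), "w", encoding="utf-8") as f:
    f.write(rss_header)
    for e in edition_names:
        # End each item with a newline so consecutive items don't share a line.
        f.write(makeItem(e) + "\n")
    f.write(rss_footer)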

Be careful! XML is very particular about what comes at the start of the file: the <?xml ... ?> declaration must be the very first thing in it. In my first version of this script, there was a blank line at the beginning of the file, and my RSS reader refused to recognize the file as valid…
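
The problem is easy to reproduce with Python's own XML parser, which makes for a cheap check before publishing. The sketch below uses a stripped-down document rather than the real feed, and is_well_formed is a made-up helper name.

```python
import xml.etree.ElementTree as ET

GOOD = b'<?xml version="1.0" encoding="utf-8"?>\n<rss version="2.0"><channel></channel></rss>\n'
BAD = b"\n" + GOOD  # the same document with a blank line in front

def is_well_formed(xml_bytes):
    """Return True if the bytes parse as XML, False otherwise."""
    try:
        ET.fromstring(xml_bytes)
        return True
    except ET.ParseError:
        return False

print(is_well_formed(GOOD))  # True
print(is_well_formed(BAD))   # False
```

Running the same check on the contents of docs/RSS.xml right after generating it would catch this kind of mistake before a reader does.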

Part 4: Adding the content from the newsletter

The above RSS is already workable. An RSS reader will be able to see when new articles are posted, and take you to the webpage of the article.

However, one of the great benefits of an RSS reader is the ability to read the article right there, without going to the original webpage! So we will add the content of the newsletter to the feed.

This is a simple matter of stripping the HTML headers and footers from each newsletter edition. The big problem is that to include the HTML in the RSS, you need to put it inside a <![CDATA[ ]]> section, like this:


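def getContents(edition):
    # Keep the lines from the first <h1 ...> up to (not including) the next </div>.
    htmlFile = os.path.join("docs", edition + ".html")
    appender = False
    contents = []
    with open(htmlFile, "r") as f:
        for line in f:
            if "<h1 " in line:
                appender = True
            if "</div>" in line:
                appender = False
            if appender:
                contents.append(line)
    # Each line already ends with a newline, so join without adding another.
    return "".join(contents)

def makeItem(edition):
    nDate = getDate(edition)
    titleDate = nDate.strftime("%B %Y")
    pubDate = nDate.strftime("%d %b %Y 00:00:00 GMT")
    link = "https://alife-newsletter.github.io/Newsletter/" + edition + ".html"

    item = []
    item.append(" <item>")
    item.append(" <title>" + makeTitle(edition) + ", " + titleDate + "</title>")
    item.append(" <link>" + link + "</link>")
    item.append(" <guid>" + edition + "</guid>")
    item.append(" <pubDate>" + pubDate + "</pubDate>")
    item.append("<description><![CDATA[")  # Open the CDATA section
    item.append(getContents(edition))      # Add the contents
    item.append("]]></description>")       # Close the CDATA section
    item.append(" </item>")
    return "\n".join(item)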
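
One caveat with CDATA: the sequence ]]> ends the section, so if the newsletter HTML ever contained it, the feed would break. A common workaround is to split the sequence across two CDATA sections. The sketch below shows the idea; wrap_cdata is a hypothetical helper and the snippet is invented, neither comes from the newsletter scripts.

```python
import xml.etree.ElementTree as ET

def wrap_cdata(html):
    """Wrap text in a CDATA section, splitting any "]]>" it contains."""
    # "]]>" would close the section early, so end the section after "]]"
    # and open a new one before ">".
    return "<![CDATA[" + html.replace("]]>", "]]]]><![CDATA[>") + "]]>"

snippet = "<p>if (a[b[0]]> 1) & more</p>"
element = ET.fromstring("<description>" + wrap_cdata(snippet) + "</description>")
print(element.text == snippet)  # True
```

The parser glues the two sections back together, so the reader sees the original HTML unchanged.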

Part 5: Making the RSS discoverable

After making the script that generates the RSS feed, I needed to let people find it from the webpage. This means creating links inside the webpage itself, as well as adding the following tag to the head area of the page:


<link rel="alternate" type="application/rss+xml" title="Alife Newsletter RSS Feed" href="https://alife-newsletter.github.io/Newsletter/RSS.xml" />

In browsers and extensions that support feed autodiscovery, this makes a little RSS icon appear at the top of the browser, indicating to visitors that the website has an RSS feed.
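
To confirm that a generated page really carries the tag, a small check with Python's built-in HTML parser is enough. This is a sketch under assumptions: the class name FeedLinkFinder and the sample page are made up, and in practice the text would come from one of the files in docs.

```python
from html.parser import HTMLParser

class FeedLinkFinder(HTMLParser):
    """Collect the href of every RSS autodiscovery <link> tag."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if (tag == "link"
                and attributes.get("rel") == "alternate"
                and attributes.get("type") == "application/rss+xml"):
            self.feeds.append(attributes.get("href"))

# A made-up page containing the tag shown above.
page = """<html><head>
<link rel="alternate" type="application/rss+xml" title="Alife Newsletter RSS Feed" href="https://alife-newsletter.github.io/Newsletter/RSS.xml" />
</head><body></body></html>"""

finder = FeedLinkFinder()
finder.feed(page)
print(finder.feeds)  # ['https://alife-newsletter.github.io/Newsletter/RSS.xml']
```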

Part 6: The End

And that’s it! Soon (as soon as I get the PR approved) you can follow the Alife Newsletter without an e-mail reader or a Twitter account, from the comfort of your home!
