Archive for Aggregators

Azeem Ahmad – Final Project Proposal (Detailed)

Project Outline –

Many internet users visit several news websites to keep up with daily events around the world. Most also have e-mail subscriptions (however dated the service may be), and it is often tedious to trawl through several websites to find the news they are looking for.

A lot of people are now taking advantage of RSS, or Really Simple Syndication – a method of linking through a ‘feed’ that can provide 15 or more links (in this case, news stories) at once. This is a good new media technology, but it has room to be developed further. Subscribing to an RSS feed is anonymous, which removes the need for e-mail news subscriptions, but placing several feed boxes on a browser’s toolbar can seriously overload the screen with unnecessary information.

For my project I aim to take the feeds of four popular news websites (BBC News, Sky News, The Guardian and The Mirror) and converge them into one feed, so that online news users only need to subscribe once.

I plan to do this by creating one big feed, essentially a mash-up of the four feeds mentioned above. I will also attempt to categorise the links inside the feed by genre so that browsing through the single feed is easier. Categorising also allows me to promote this product as an RSS 2.0 feed, rather than simply an RSS feed, which should attract more attention from interested parties.
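A rough sketch of how the mash-up itself might work in Python, using only the standard library. The two feeds here are made-up stand-ins (a real script would download the publishers' feeds), merge_feeds is a hypothetical helper name, and the sketch assumes every item carries a pubDate.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny stand-in feeds; the titles and example.com links are invented.
FEED_A = """<rss version="2.0"><channel><title>A</title>
<item><title>Older story</title><link>http://example.com/a1</link>
<pubDate>Mon, 29 Oct 2007 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>B</title>
<item><title>Newer story</title><link>http://example.com/b1</link>
<pubDate>Mon, 29 Oct 2007 11:21:57 GMT</pubDate></item>
</channel></rss>"""


def merge_feeds(feed_documents, title, link, description):
    """Return one RSS 2.0 tree holding every item, newest first."""
    items = []
    for document in feed_documents:
        items.extend(ET.fromstring(document).iter("item"))
    # Sort on the parsed publication date, most recent first.
    items.sort(key=lambda item: parsedate_to_datetime(item.findtext("pubDate")),
               reverse=True)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    channel.extend(items)
    return rss


merged = merge_feeds([FEED_A, FEED_B], "Combined news",
                     "http://example.com/", "Several feeds in one")
merged_titles = [item.findtext("title") for item in merged.iter("item")]
print(merged_titles)
```

Sorting by date rather than by source is just one choice; it keeps the single feed reading like a timeline instead of four feeds stacked on top of each other.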

Market Research –

One very successful website that does this is http://imooty.eu, a news aggregator for the whole of Europe. Imooty allows users to pick and choose which publications they want to see an RSS breakdown (of top headlines) for on their ‘my imooty’ page, a clever piece of individualisation. Such personalisation will keep readers coming back to the site more regularly, as all of the news they wish to digest is on one page. Imooty also has options to view the news for the rest of Europe.

Upon viewing the page source code for imooty, it is clear that the technical side of the website relies heavily on JavaScript and on embedding CSS into RSS code, both of which are advanced coding procedures. For my project I simply intend to create a single feed which is a mash-up of four feeds, rather than creating a whole website.

I intend to create my product by obtaining and re-writing the RSS feed codes from the four sites mentioned, including date, category and GUID (Globally Unique Identifier) tags – so that any copyright issues are avoided. A simple breakdown of how one link in one section of the whole feed will look something like this:

<?xml version="1.0"?>
<!-- "feed.xsl" is a placeholder name for the XSL stylesheet -->
<?xml-stylesheet type="text/xsl" href="feed.xsl"?>
<rss version="2.0">
  <channel>
    <!-- channel title, link and description left out of this breakdown -->
    <item>
      <title>Saudi king chides UK on terrorism</title>
      <description>Saudi Arabia's King Abdullah accuses Britain of doing too little to fight international terrorism.</description>
      <link>http://news.bbc.co.uk/go/rss/-/1/hi/uk/7066867.stm</link>
      <guid isPermaLink="false">http://news.bbc.co.uk/1/hi/uk/7066867.stm</guid>
      <pubDate>Mon, 29 Oct 2007 11:21:57 GMT</pubDate>
      <category>UK</category>
    </item>
  </channel>
</rss>

As seen from the box above, the link is the first story from the BBC News front page, taken at 12:37 on Monday 29th October 2007. The date is relevant because the feed is constantly updating itself.
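To show why the date matters, here is a small sketch of how a script could read the sample's publication date and work out how old the story was when I copied it. It uses Python's standard email.utils module, which understands this style of date. Treating my 12:37 snapshot time as GMT is an assumption.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# The pubDate string from the sample item.
published = parsedate_to_datetime("Mon, 29 Oct 2007 11:21:57 GMT")

# When the sample was copied (12:37 the same day); GMT is assumed here.
snapshot = datetime(2007, 10, 29, 12, 37, tzinfo=timezone.utc)

age = snapshot - published
print(published.isoformat())
print(age)
```

Once dates are real datetime values rather than text, comparing or sorting stories from different publishers becomes straightforward.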

Included in the sample feed is the formatting of the code, in this case XSL (Extensible Stylesheet Language), which eliminates the need to include CSS in the code. As with every RSS feed, the ‘version’ attribute and the ‘channel’ tag remain present, as do the ‘item’ and ‘description’ tags.

However, the features that make this an RSS 2.0 feed are the other tags. The GUID is a string that uniquely identifies the item. Here the BBC has set its isPermaLink attribute to ‘false’, which tells feed readers to treat the value purely as an identifier rather than as an address to open, even though it resembles the address of the page it describes.
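One practical use of the GUID in a merged feed is spotting the same story twice. A minimal sketch, with invented items and hypothetical helper names (item_key, drop_duplicates); falling back to the link when an item has no GUID is my own assumption about a sensible default.

```python
import xml.etree.ElementTree as ET

# Invented items: the first two share a guid, the third has none.
ITEMS = """<channel>
<item><title>Story</title><link>http://example.com/go/rss/story</link>
<guid isPermaLink="false">http://example.com/story</guid></item>
<item><title>Story (repeat)</title><link>http://example.com/other/story</link>
<guid isPermaLink="false">http://example.com/story</guid></item>
<item><title>No guid</title><link>http://example.com/plain</link></item>
</channel>"""


def item_key(item):
    """Use the guid when present, otherwise fall back to the link."""
    return item.findtext("guid") or item.findtext("link")


def drop_duplicates(items):
    """Keep the first item seen for each key, preserving order."""
    seen = set()
    unique = []
    for item in items:
        key = item_key(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


unique_items = drop_duplicates(ET.fromstring(ITEMS).findall("item"))
unique_titles = [item.findtext("title") for item in unique_items]
print(unique_titles)
```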

 

Also included are the publication date and the category. These two tags are often overlooked when people create RSS feeds, but they are becoming increasingly relevant: the latest versions of Microsoft Internet Explorer and Mozilla Firefox now tell users when RSS feeds are present on a page, and also allow users to browse directly through a feed by category.
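Browsing by category comes down to grouping items on their category tag. The sketch below shows one way this could be done; the items are invented, group_by_category is a hypothetical helper, and the "Uncategorised" bucket for items without a category is my own assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

CHANNEL = """<channel>
<item><title>Story one</title><category>UK</category></item>
<item><title>Story two</title><category>Sport</category></item>
<item><title>Story three</title><category>UK</category></item>
<item><title>Story four</title></item>
</channel>"""


def group_by_category(items, default="Uncategorised"):
    """Map each category name to the titles filed under it."""
    groups = defaultdict(list)
    for item in items:
        categories = [c.text for c in item.findall("category") if c.text]
        # An item may carry several categories, or none at all.
        for name in categories or [default]:
            groups[name].append(item.findtext("title"))
    return dict(groups)


by_category = group_by_category(ET.fromstring(CHANNEL).findall("item"))
for name, titles in by_category.items():
    print(name, titles)
```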

 

Conclusion –

 

To sum up, my product will be a single RSS feed that is an amalgamation of four mini RSS feeds from BBC News, The Guardian, The Mirror, and Sky News. I will also be including features that will make this a Web 2.0 technology, such as:

  • The publication dates of the feeds, so that they are self-updating
  • A GUID, so that the feed that I have created is unique
  • Category tags, so browsing through the feed is easy
  • XSL encoding, so that the need for CSS is eliminated.
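The first three features in the list above could be written into each item along these lines. This is only a sketch: build_item is a hypothetical helper, the values are taken from the BBC sample item, and the XSL stylesheet is left out.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_item(title, link, description, guid, published, category):
    """Build one <item> with date, GUID and category tags."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "link").text = link
    ET.SubElement(item, "guid", isPermaLink="false").text = guid
    # RSS dates look like "Mon, 29 Oct 2007 11:21:57 GMT";
    # usegmt=True requires a datetime whose timezone is UTC.
    ET.SubElement(item, "pubDate").text = format_datetime(published, usegmt=True)
    ET.SubElement(item, "category").text = category
    return item


item = build_item(
    title="Saudi king chides UK on terrorism",
    link="http://news.bbc.co.uk/go/rss/-/1/hi/uk/7066867.stm",
    description="Saudi Arabia's King Abdullah accuses Britain of doing "
                "too little to fight international terrorism.",
    guid="http://news.bbc.co.uk/1/hi/uk/7066867.stm",
    published=datetime(2007, 10, 29, 11, 21, 57, tzinfo=timezone.utc),
    category="UK",
)
item_xml = ET.tostring(item, encoding="unicode")
print(item_xml)
```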


Azeem Ahmad – Week 2 – What is RSS?

This week I have spent researching RSS for my final project. I decided, with some help from the lecturer, that doing something with RSS will make me more employable as a potential journalist. So without further ado, here is my understanding of what RSS is, how to use and understand it, and other topics which interested me whilst I researched it.

What is RSS?

  • Really Simple Syndication
  • Rich Site Summary

RSS solves a problem for people who regularly use the web. It allows people to easily stay informed by retrieving the latest content from the sites they are interested in, rather than having to trawl through various sites and sign up to newsletters.

RSS also solves a multitude of problems that webmasters commonly face, such as increasing traffic, and gathering and distributing news. It can also be the basis for additional content distribution services.

The number of sites offering RSS feeds is growing rapidly and includes big names like Yahoo News, the BBC, Sky Sports and the Mirror.

Syndic8 offers a directory of the most popular RSS feeds of the internet.

How to use and understand RSS

RSS defines an XML grammar (a set of HTML-like tags) for sharing news. Each RSS text file contains both static information about the site, plus dynamic information about the new stories, all surrounded by matching start and end tags.

Each story is defined by an <item> tag, which contains a headline TITLE, URL, and DESCRIPTION. Here’s an example:

...
<item>
  <title>RSS Resources</title>
  <link>http://www.webreference.com/authoring/languages/xml/rss/</link>
  <description>Defined in XML, the Rich Site Summary (RSS) format has
  quietly become a dominant format for distributing headlines on the Web.
  Our list of links gives you the tools, tips and tutorials you need to get
  started using RSS. 0323</description>
</item>
...

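Under the early RSS 0.91 specification a channel could contain at most 15 items; RSS 2.0 sets no such limit. Either way, a feed is easily parsed using Perl or other open source software. Perl is an open source scripting language whose interpreter is itself written in C. Perl programs generally run slower than equivalent compiled C programs, but text-processing scripts such as feed parsers are usually much quicker to write in Perl.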
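Python is another open source option for parsing a feed. The sketch below pulls the static site information and each story's title, link and description out of a small feed using the standard library's ElementTree. The feed text is a cut-down stand-in built around the example item, with an invented channel around it.

```python
import xml.etree.ElementTree as ET

FEED = """<rss version="0.91"><channel>
<title>Example site</title>
<link>http://example.com/</link>
<description>Static information about the site</description>
<item>
  <title>RSS Resources</title>
  <link>http://www.webreference.com/authoring/languages/xml/rss/</link>
  <description>Tools, tips and tutorials for RSS.</description>
</item>
</channel></rss>"""

channel = ET.fromstring(FEED).find("channel")

# One dictionary per story; a missing field becomes an empty string.
stories = [
    {field: item.findtext(field, default="")
     for field in ("title", "link", "description")}
    for item in channel.findall("item")
]

print(channel.findtext("title"))
for story in stories:
    print(story["title"], "->", story["link"])
```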

Syndication

Publishing an RSS feed is just the beginning. RSS, really a mini database containing headlines and descriptions of what’s new on your site, is a natural for layering on additional services. In addition to displaying the news on other sites and headline viewers, RSS data can flow into other products and services like PDAs, mobile phones, and even voice updates. Email newsletters can easily be automated with RSS. In this Web-like way RSS encourages multiple points of entry to one primary article, rather than multiple copies of the same article (which introduces its own maintenance problems). As Google shows, the sites with the most back-links win, and those with the freshest content also win. RSS, therefore, creates a win-win situation. Once you have data in a standardized format, new forms of content distribution channels are only limited by your imagination, and scripting ability.
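As an illustration of the automated newsletter idea, this sketch turns a feed into the plain-text body of an e-mail. The feed is invented and newsletter_body is a hypothetical helper; actually sending the message is left out.

```python
import xml.etree.ElementTree as ET

FEED = """<rss version="2.0"><channel><title>Example site</title>
<item><title>First headline</title><link>http://example.com/1</link></item>
<item><title>Second headline</title><link>http://example.com/2</link></item>
</channel></rss>"""


def newsletter_body(feed_xml):
    """Return a numbered plain-text digest of the feed's headlines."""
    channel = ET.fromstring(feed_xml).find("channel")
    lines = ["Latest from " + channel.findtext("title"), ""]
    for number, item in enumerate(channel.findall("item"), start=1):
        lines.append(f"{number}. {item.findtext('title')}")
        lines.append(f"   {item.findtext('link')}")
    return "\n".join(lines)


digest = newsletter_body(FEED)
print(digest)
```

Because each entry links back to the original article, the newsletter is another point of entry to the story rather than another copy of it.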

RSS Aggregators (from webreference)

There are a number of RSS news aggregators out there that automatically suck up RSS files from content providers and present the news in a variety of ways (my.netscape.com, my.userland.com, xmltree.com, moreover.com). Many make it easy to drop an RSS feed into your site. In fact, O’Reilly’s new Meerkat Open News Wire service, is an example of what can be done with RSS and some clever code. Meerkat aggregates the currently available technical RSS feeds, and filters new stories by time, topic, keywords, and even regular expression. Narrowing the new stories down to your interests is a breeze, all entirely automated.

O’Reilly Network’s President and CEO Dale Dougherty: “What interests me about RSS is the ability to begin to monitor the flow of new information on the net. We all know what sites exist; what we really want to know is how often sites generate new information. As a writer and editor, I thought Meerkat would be valuable to watch what was happening in different technical communities. What I especially like about RSS and looking at feeds from hundreds of sites is that you can see the Web work at a grassroots level. I thought that Meerkat is the kind of tool I’d want to keep track of what is going on. We realized that this wasn’t just useful to editors but to anyone who wants to be able to respond to new information. I’m not sure where Meerkat will take us, but it feels like it’s opening up a remarkable new view of the Web. We’d really like to see more and more sites become RSS-enabled. RSS can do for them what Yahoo did for them in 1994, which is drive traffic by letting others know what you are doing. The difference is now we can notify others not just of a new site, but of new stories – new activity on our site.”

The Future of RSS

Thanks to the efforts of the likes of Jonathan Eisenzopf (webreference.com), Dave Winer and Netscape, future versions of RSS will incorporate popular additional fields like news category, time stamps, and more. With thousands of sites now RSS-enabled and more on the way, RSS has become perhaps the most visible XML success story to date. RSS improves news distribution by making everyone a potential news provider. It utilises the Web’s most valuable asset – content – and makes displaying high-quality relevant news on a website easy.
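With time stamps in the feed, the kind of filtering Meerkat is described as doing in the aggregators section (by time and by regular expression) can be imitated in a few lines. This is only a sketch of the idea, not how Meerkat itself works; the items are invented and filter_items is a hypothetical helper.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

CHANNEL = """<channel>
<item><title>XML tools roundup</title>
<pubDate>Mon, 29 Oct 2007 11:00:00 GMT</pubDate></item>
<item><title>Football results</title>
<pubDate>Mon, 29 Oct 2007 11:30:00 GMT</pubDate></item>
<item><title>Old XML news</title>
<pubDate>Sun, 28 Oct 2007 08:00:00 GMT</pubDate></item>
</channel>"""


def filter_items(items, pattern, since):
    """Return titles matching the pattern and published at or after 'since'."""
    matcher = re.compile(pattern, re.IGNORECASE)
    kept = []
    for item in items:
        published = parsedate_to_datetime(item.findtext("pubDate"))
        if published >= since and matcher.search(item.findtext("title", "")):
            kept.append(item.findtext("title"))
    return kept


start_of_day = datetime(2007, 10, 29, tzinfo=timezone.utc)
recent_xml = filter_items(ET.fromstring(CHANNEL).findall("item"),
                          r"\bxml\b", start_of_day)
print(recent_xml)
```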

What this means for my project

After all of that background knowledge of RSS and its possibilities, I think for my project I’m going to ‘mash-up’ around four popular RSS feeds and combine them into one. There is an online tool that allows people to do this, www.rssmix.com, but it is very basic, and I would prefer to learn how it’s done rather than letting a machine or software do it for me. It will make me more employable as a journalist and add to my skills base so potential employers will rub their hands with glee when they see me applying for a job!

Sources

My apologies for not listing the sources earlier – my main source of information was from here. I did also browse and read a few other websites about RSS, but not in enough detail to include them in this post, or to reference them.