
Turn Twitter Into Your Personal Assistant

Combining social networking with public messaging, link posting, and subscriptions produces impressive synergies, but it also brings drawbacks: information overload, information loss, distraction, and content redundancy.


Microblogging service Twitter has become a disruptive everyday tool. It is increasingly replacing not only instant messaging clients, but also social bookmarking sites, interest tracking applications, support forums, email, and (to a certain extent) classical blogs.

A few simple conventions, together with RDF and SPARQL, can turn your Twitter feeds into rich information streams, which you can then use for a more productive microblogging experience.

The following sections explain how to:

  • Enhance microposts with machine-extractable data
  • Query the extracted data with SPARQL
  • Generate custom streams and reports to support your personal workflow
You can reproduce the examples in this article with the supplied source code that contains early components of a semantic microblogging system.



System Setup

Download the source code archive (smesher.zip) and copy its contents to your web server. Follow the setup instructions in the readme.txt file; you will need Apache, PHP, and a MySQL database. The project consists of five directories, with application-specific subdirectories in code/ and themes/:
  • cache: should be write-enabled, used for CSS and JavaScript documents
  • code:
    • arc: the core RDF toolkit
    • trice: reusable framework components
    • smr: the project controller, custom scripts, and templates
  • config: database configuration and path dispatching rules
  • logs: should be write-enabled, used for system messages
  • themes: CSS and images

Step 1: Subscribe to Your Twitter Feeds

First, you need some input data to work with, such as the most recent posts mentioning your username or interesting keywords (see Figure 1). Luckily, Twitter provides Atom feeds for all pages, and the demo system includes an Atom-to-RDF converter, so you don't have to learn how the Twitter API works. You can directly import user timelines and search results instead. Click Settings in the upper right navigation to open a simple Feeds form. For the sake of simplicity, you only have to enter your username and a set of tags that are then used internally to generate corresponding feed URLs.
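The Atom-to-RDF step can be pictured with a short sketch. It is written in Python rather than the demo's PHP, and it is not the converter that ships with the source code. The sample entry is made up, the mapping from Atom elements to predicates is an assumption based on the properties used in this article's queries, and dct:created is a hypothetical predicate chosen for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace

# Made-up sample entry; a real feed carries more elements.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:example.org,2009:post/1</id>
    <published>2009-04-01T10:00:00Z</published>
    <content type="html">Reading Berners-Lee on linked data #rdf</content>
    <author>
      <name>Alice Example</name>
      <uri>http://example.org/alice</uri>
    </author>
  </entry>
</feed>"""


def atom_to_triples(xml_text):
    """Turn each Atom entry into (subject, predicate, object) tuples."""
    triples = []
    for entry in ET.fromstring(xml_text).iter(ATOM + "entry"):
        post = entry.findtext(ATOM + "id")
        triples.append((post, "rdf:type", "sioct:MicroBlogPost"))
        # Assumed mapping from Atom elements to the article's predicates.
        fields = [
            ("dc:creator", ATOM + "author/" + ATOM + "name"),
            ("sioc:has_creator", ATOM + "author/" + ATOM + "uri"),
            ("content:encoded", ATOM + "content"),
            ("dct:created", ATOM + "published"),  # hypothetical predicate
        ]
        for predicate, path in fields:
            value = entry.findtext(path)
            if value:  # skip missing or empty elements
                triples.append((post, predicate, value))
    return triples


triples = atom_to_triples(SAMPLE_FEED)
for triple in triples:
    print(triple)
```

Plain tuples keep the sketch independent of any RDF library; the demo itself stores its triples through the arc toolkit.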

When you are done, return to the main screen by clicking on the logo in the upper left corner. Instead of cronjobs or background processes, the demo simply checks and periodically refreshes your subscriptions when you access the start page. After a few seconds (you might have to reload the page to see the changes), the first items should appear, as shown in Figure 2.
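Refreshing on page access instead of through a cronjob can be sketched as follows. This is a Python illustration of the idea, not the demo's PHP code; the five-minute interval, the function names, and the feed labels are all assumptions.

```python
from datetime import datetime, timedelta

REFRESH_INTERVAL = timedelta(minutes=5)  # assumed value, not from the demo


def feeds_due(last_checked, now, interval=REFRESH_INTERVAL):
    """Return feeds that were never checked or whose last check is too old."""
    return [
        feed
        for feed, checked in last_checked.items()
        if checked is None or now - checked >= interval
    ]


def on_page_request(last_checked, now, fetch):
    """Refresh stale feeds as a side effect of serving the start page."""
    for feed in feeds_due(last_checked, now):
        fetch(feed)
        last_checked[feed] = now


now = datetime(2009, 4, 1, 12, 0)
state = {
    "timeline": None,                          # never fetched
    "search-rdf": now - timedelta(minutes=1),  # still fresh
    "search-sparql": now - timedelta(minutes=10),
}
fetched = []
on_page_request(state, now, fetched.append)
print(fetched)
```

The trade-off of this design is that feeds only update while someone is using the application, which is acceptable for a single-user demo.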


Figure 1. Import settings: Based on the provided information, the demo application imports a selection of microfeeds.
 
Figure 2. Initial timeline: So far, the microposts can (only) be filtered by author.

Step 2: Explore the Data

The individual items carry a number of structured elements, which you can use for formatting (for example, displaying an image instead of a raw avatar URL) or basic filtering (for example, by author). Together with SPARQL's REGEX function, you can already run some interesting queries against the API at /sparql. For example, this SPARQL query returns the names and Twitter accounts of people who mentioned "Berners-Lee" in their posts:

SELECT ?author ?account WHERE {
  ?post a sioct:MicroBlogPost ;
        dc:creator ?author ;
        sioc:has_creator ?account ;
        content:encoded ?content .
  FILTER(REGEX(?content, "Berners-Lee", "i"))
}

Figure 3. SPARQL API Example: The COUNT feature is not part of the current SPARQL specification yet, but a new W3C Working Group just launched to explore aggregate functions and similar extensions.

The data itself is not too spectacular, but the exciting part is that a semantic API lets you retrieve exactly the elements you need (see Figure 3). Twitter's search feature can only return a list of posts, whereas SPARQL lets you generate a list of persons, dates, or any other available attribute. This greatly simplifies data integration and repurposing.
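To call the endpoint from a script, the query has to be URL-encoded. The Python sketch below only builds the request URL and does not send it. The host and path prefix are hypothetical, and the `query` parameter name follows the SPARQL Protocol convention; the demo's endpoint is assumed, not confirmed, to accept it and to predefine the prefixes used in the query.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# The query from this article, kept verbatim.
QUERY = """
SELECT ?author ?account WHERE {
  ?post a sioct:MicroBlogPost ;
        dc:creator ?author ;
        sioc:has_creator ?account ;
        content:encoded ?content .
  FILTER(REGEX(?content, "Berners-Lee", "i"))
}
""".strip()


def sparql_request_url(endpoint, query):
    """Build a GET URL that carries the query in a 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Hypothetical installation URL; adjust to wherever the demo is deployed.
url = sparql_request_url("http://localhost/smesher/sparql", QUERY)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

Changing the variables after SELECT (for example, to a date variable bound in the pattern) changes what the same request returns, without touching the rest of the code.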

Step 3: Increase the Granularity

While the default structures are a handy starting point, the really interesting data is still hidden in the post's body. People are addressed (leading @name) or mentioned (@name somewhere in the text), hashtags (#tag) and links (http://...) are embedded, and quoted Tweets are marked up with a leading RT.

The demo system contains a PHP class (located at code/smr/SMR_RDFExtractor.php) that auto-extracts these elements from the otherwise opaque content and turns them into RDF triples. The converter is based on simple regular expressions and you can extend them with custom patterns (more on this later).
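The kind of pattern matching involved can be sketched in a few lines. This Python version is only an illustration and is not a port of SMR_RDFExtractor.php. The predicates smr:mentionedUser and smr:link appear in this article; smr:addressedUser, smr:retweetOf, and smr:tag are hypothetical names, and the regular expressions are simplified assumptions.

```python
import re

LINK = re.compile(r"https?://[^\s<>\"]+")
MENTION = re.compile(r"(?<!\w)@(\w+)")   # @name not preceded by a word char
HASHTAG = re.compile(r"(?<!\w)#(\w+)")
RETWEET = re.compile(r"RT\s+@(\w+)")     # matched at the start of the post


def extract_triples(post_id, text):
    """Extract (subject, predicate, object) tuples from a micropost body."""
    text = text.strip()
    triples = []

    # Collect links first, then blank them out so URL fragments such as
    # "/#top" are not mistaken for hashtags.
    links = [link.rstrip(".,;:!?)") for link in LINK.findall(text)]
    body = LINK.sub(" ", text)

    retweet = RETWEET.match(body)
    if retweet:
        triples.append((post_id, "smr:retweetOf", retweet.group(1)))  # hypothetical

    for match in MENTION.finditer(body):
        if match.start() == 0:
            # A leading @name addresses the user directly.
            triples.append((post_id, "smr:addressedUser", match.group(1)))  # hypothetical
        else:
            triples.append((post_id, "smr:mentionedUser", match.group(1)))

    for tag in HASHTAG.findall(body):
        triples.append((post_id, "smr:tag", tag.lower()))  # hypothetical

    for link in links:
        triples.append((post_id, "smr:link", link))
    return triples


example = extract_triples(
    "post1", "RT @bob: Great intro to #SPARQL http://example.org/sparql-intro."
)
for triple in example:
    print(triple)
```

A custom pattern would follow the same shape: one more compiled expression and one more loop that appends triples with its own predicate.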

After the granular information is added to the RDF store, you can add corresponding filters to the main view. The facets are defined in code/smr/options/SMR_Options_DefaultBox.php. Add entries to the getTabs method, and then write a matching method for each entry that specifies the SPARQL pattern and its RDF relation:

// Each tab key is paired with a get<Key>TabHTML method below.
function getTabs() {
  return array(
    [...]
    'tags' => array('label' => 'Tags'),
    'users' => array('label' => 'Mentioned Users'),
    'links' => array('label' => 'Links'),
  );
}

// Facet for the 'users' tab: values of smr:mentionedUser.
function getUsersTabHTML() {
  $pattern = '?res smr:mentionedUser ?val . ';
  return $this->getFilterList($pattern, 'smr:mentionedUser');
}

// Facet for the 'links' tab: values of smr:link.
function getLinksTabHTML() {
  $pattern = '?res smr:link ?val . ';
  return $this->getFilterList($pattern, 'smr:link');
}

Figure 4. Filtered Stream: The advanced facets help you find out who re-tweeted any of your posts, which posts contain a certain link, or which links are popular in general.

The application can now generate clickable filter lists (see Figure 4). You can browse your subscriptions in a more fine-grained way, for example, to discover popular links or socially active users.
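Ranking the values of a facet, for example to surface popular links, amounts to counting how often each value occurs for a given predicate. The Python sketch below shows that idea on made-up triples; it is not how getFilterList is implemented, which this article does not show.

```python
from collections import Counter


def facet_counts(triples, predicate):
    """Return (value, count) pairs for one predicate, most frequent first."""
    counts = Counter(obj for _, pred, obj in triples if pred == predicate)
    return counts.most_common()


# Made-up sample data in (subject, predicate, object) form.
sample = [
    ("post1", "smr:link", "http://example.org/a"),
    ("post2", "smr:link", "http://example.org/b"),
    ("post3", "smr:link", "http://example.org/a"),
    ("post3", "smr:mentionedUser", "alice"),
]

popular_links = facet_counts(sample, "smr:link")
print(popular_links)
```

In the demo the same grouping would happen inside the RDF store, using the SPARQL pattern passed to each tab method.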

