To use the public web service, post the URL-encoded license, content, and parameters to
http://api.opencalais.com/enlighten/rest/.
If the request succeeds, the response is an RDF/XML document. You can parse the document directly or import it into an RDF store. Sesame,
a leading RDF framework, provides parsers and storage for RDF content. The following Java code, taken from
Crawler.java in the downloadable code, posts the content and imports the results.
    // Posts the text to the OpenCalais REST endpoint and returns the RDF/XML response.
    private Reader post(CharSequence text) throws IOException {
        // Build the URL-encoded form body: license, content, and parameters.
        StringBuilder sb = new StringBuilder(text.length() + 1024);
        sb.append("licenseID=").append(encode(licenseID));
        sb.append("&content=").append(encode(text));
        sb.append("&paramsXML=").append(encode(getParamsXML()));
        URLConnection connection = new URL(API_URL).openConnection();
        connection.addRequestProperty("Content-Type",
                "application/x-www-form-urlencoded");
        connection.addRequestProperty("Content-Length", valueOf(sb.length()));
        connection.setDoOutput(true);
        OutputStream out = connection.getOutputStream();
        OutputStreamWriter writer = new OutputStreamWriter(out);
        writer.write(sb.toString());
        writer.flush();
        return new InputStreamReader(connection.getInputStream());
    }

    // Creates a Sesame repository backed by a native store in the "data" directory.
    private Repository createRepository() throws RepositoryException {
        File dataDir = new File("data");
        Sail store = new NativeStore(dataDir);
        Repository repository = new SailRepository(store);
        repository.initialize();
        return repository;
    }

    // Imports the RDF/XML response into the repository.
    private void add(Reader reader)
            throws RepositoryException, IOException, RDFParseException {
        RepositoryConnection con = repository.getConnection();
        try {
            con.add(reader, "", RDFFormat.RDFXML);
        } finally {
            con.close();
        }
    }
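The listing above calls an encode helper that is not shown. As a minimal sketch of how the form body could be assembled, the following standalone class assumes that encode simply wraps java.net.URLEncoder with UTF-8. The class name CalaisFormBody and the sample values are hypothetical and are not part of the downloadable code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; not part of the downloadable code.
final class CalaisFormBody {
    private CalaisFormBody() {}

    // Assumed stand-in for the encode(...) helper used in the listing:
    // percent-encodes a value for an application/x-www-form-urlencoded body.
    static String encode(CharSequence value) {
        try {
            return URLEncoder.encode(value.toString(), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Builds the body with the same three fields the listing posts.
    static String build(String licenseID, CharSequence text, String paramsXML) {
        StringBuilder sb = new StringBuilder(text.length() + 1024);
        sb.append("licenseID=").append(encode(licenseID));
        sb.append("&content=").append(encode(text));
        sb.append("&paramsXML=").append(encode(paramsXML));
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values are placeholders, not a real license or parameter document.
        System.out.println(build("my-license", "Tom & Jerry", "<params/>"));
    }
}
```

Encoding each value separately keeps characters such as "&" and "=" inside the content from being mistaken for field separators.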
Visualizing Relationships
After you import a collection of document metadata into an RDF store, you can combine it to derive new information
assets from the extracted data. Aduna's Cluster Map technology can visualize the relationships between documents
(through named entities) and between named entities (through facts and events).
Figure 2, a Document Cluster Map, highlights a document from un.org that contains
references to the industry terms "greenhouse gas emissions," "food crisis," and "food security."
Figure 3, a Named Entity Cluster Map, shows that the named entity "George W. Bush" holds the position of President of the "United States." It also shows 107
countries and people that either have or hold the position of President. The Named Entity Cluster Map also identifies Bernard Kouchner as the foreign minister
of France and Nicolas Sarkozy as its President. Although these facts did not originate from the same document, extracting the meaning and relationships of the named entities lets you create new information assets that combine the entity information.

Figure 2. Document Cluster Map: Shows the references to the document.

Figure 3. Named Entity Cluster Map: Shows the relationships of different entities.
The download archive includes a simple web crawler and two interactive visualization tools that you can use to explore these relationships. Run the Main class with a list of URLs to import into the local RDF store, and it opens two windows: a Document Cluster Map and a Named Entity Cluster Map. In each window, the relationships appear in the side pane, and the selected relationships are drawn using Aduna's Cluster Map technology, which displays whether and how sets overlap (similar to Venn diagrams and Euler diagrams). On the command line, you can prefix each URL with '1' to indicate that embedded links should be followed once, or with '0' to include only the explicit URL.
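To make the idea of overlapping sets concrete, here is a minimal sketch of the grouping that such a diagram depicts: each document is placed in a cluster according to the exact combination of terms it mentions. This is an illustration only and does not use Aduna's library; the class name OverlapSketch and the document names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration; this is not Aduna's Cluster Map API.
final class OverlapSketch {
    private OverlapSketch() {}

    // Input: term -> documents that mention it.
    // Output: combination of terms -> documents that mention exactly that combination.
    static Map<Set<String>, Set<String>> clusters(Map<String, Set<String>> docsByTerm) {
        // Invert the index so each document maps to the terms it mentions.
        Map<String, Set<String>> termsByDoc = new TreeMap<>();
        for (Map.Entry<String, Set<String>> e : docsByTerm.entrySet()) {
            for (String doc : e.getValue()) {
                termsByDoc.computeIfAbsent(doc, k -> new TreeSet<>()).add(e.getKey());
            }
        }
        // Group documents that share the same term combination.
        Map<Set<String>, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : termsByDoc.entrySet()) {
            result.computeIfAbsent(e.getValue(), k -> new TreeSet<>()).add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> docsByTerm = new TreeMap<>();
        docsByTerm.put("food crisis", new TreeSet<>(java.util.Arrays.asList("docA", "docB")));
        docsByTerm.put("food security", new TreeSet<>(java.util.Arrays.asList("docB", "docC")));
        // docB lands in the cluster for both terms, which is the overlap region.
        System.out.println(clusters(docsByTerm));
    }
}
```

A document that mentions two terms appears once, in the cluster keyed by both terms, which corresponds to the overlapping region of two sets in a Venn diagram.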
Conclusion
OpenCalais makes it easy to extract meaningful structured information that would otherwise be out of reach of
automated processes and aggregation tools. By embedding OpenCalais within other applications, such as a Cluster Map,
you can create new information assets that expose information and link it back to relevant documents. With such tools
available, distinct silos of previously unprocessed data can be combined in new ways to create derived data, niche
content sets, related links, and headline summaries. For more information, visit OpenCalais and Aduna's Cluster Map.