Organizing Existing Data
Creating new content in RDF is great, but what about existing content and the content that will be created using the Web 2.0 methodology? The answer lies in Natural Language Processing. NLP is a field concerned with how computers process natural languages (the languages people speak) and relate them to computer languages.
There are two general types of NLP systems:
- Ones that convert information from software and data stores into human-readable form
- Ones that convert natural language data into machine-readable form
To categorize and organize the nearly 20 years' worth of data that floats aimlessly around the web today, NLP systems can go through that data, make sense of it, and categorize it. Ultimately, these systems will help convert the old data into RDF so that computers on the web can share it freely.
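As a rough illustration of the second kind of system, the sketch below turns very simple English sentences into subject-predicate-object triples, the basic shape of an RDF statement. It is a toy under stated assumptions, not how a real NLP system works: the `IS_A` pattern, the `extract_triples` function, and the sample sentences are all hypothetical, and only sentences of the form "X is a Y" are recognized.

```python
import re

# Hypothetical pattern: only matches sentences shaped like "<Name> is a/an <class>."
IS_A = re.compile(r"^(?P<subject>[A-Z]\w*) is an? (?P<cls>\w+)\.?$")

def extract_triples(sentences):
    """Return (subject, predicate, object) triples for sentences matching IS_A."""
    triples = []
    for sentence in sentences:
        match = IS_A.match(sentence.strip())
        if match:
            # "rdf:type" is written here as a plain string label, not a resolved URI.
            triples.append((match.group("subject"), "rdf:type", match.group("cls")))
    return triples

text = [
    "Socrates is a human.",
    "The weather was nice.",      # ignored: does not fit the pattern
    "Plato is a philosopher.",
]
triples = extract_triples(text)
for triple in triples:
    print(triple)
```

Real systems have to cope with ambiguity, context, and far richer grammar, which is why this step is hard to automate at web scale.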
Not Only Categorization, but Reasoning
The unified RDF format's machine-readability allows machines to "make sense" of the data. While people can look at birds and the sky and associate the two, computers have to be instructed that birds and the sky belong together. Once they are made aware of that association, however, computers can incorporate it into their existing knowledge.
This has very interesting implications. If computers "understand" things by linking their logical associations, they can also "figure out" things that are logically associated. For example, if:
- All humans are mortal, AND
- Socrates is a human, THEN the computer can draw the conclusion that
- Socrates is mortal.
This process is called inference, and it is widely used with RDF. More strictly, inference is a mathematical process of taking a set of axioms and deriving new logical consequences from them. In short, it is a way to get additional data from existing data. Many organizations use this concept and purposely structure their data in order to get new, interesting data that can benefit their businesses.
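The Socrates example can be sketched in a few lines of code. This is a minimal illustration of forward chaining over triples, assuming a single kind of rule ("every member of class A is also a member of class B"). The `infer` function, the `is_a` predicate name, and the rule format are made up for this sketch and are not part of RDF or any particular library.

```python
def infer(facts, rules):
    """Forward chaining: keep applying rules until no new facts appear.

    facts: set of (subject, predicate, object) triples
    rules: list of (premise_class, conclusion_class) pairs, read as
           "everything that is_a premise_class is_a conclusion_class"
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise_class, conclusion_class in rules:
            # Iterate over a snapshot so the set can grow inside the loop.
            for subject, predicate, obj in list(known):
                if predicate == "is_a" and obj == premise_class:
                    new_fact = (subject, "is_a", conclusion_class)
                    if new_fact not in known:
                        known.add(new_fact)
                        changed = True
    return known

facts = {("Socrates", "is_a", "human")}
rules = [("human", "mortal")]  # "All humans are mortal"

inferred = infer(facts, rules) - facts
print(inferred)
```

The only fact supplied is that Socrates is a human; the statement that Socrates is mortal is never entered by hand and appears only as a consequence of the rule.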
Barriers to Full Adoption
All these solutions to existing problems, new inference technologies, uses of ontologies–why hasn't all this goodness been fully adopted yet?
Well, the Semantic Web is quite an advanced computer science topic. This very introductory article alone touched on many new technologies, and each of these technologies has a learning curve. Additionally, the Semantic Web is only an extension of the web, so Web 2.0 systems can still function without it. Until Web 3.0 reaches critical mass, it remains a luxury that only very well funded projects can afford to implement.