An Introduction to the Yahoo! Query Language Platform

An Introduction to the Yahoo! Query Language Platform

ne of Yahoo!’s key goals is to become more open through the Yahoo! Open Strategy (Y!OS). At the heart of Y!OS is an open and extensible platform that allows developers to rapidly access Yahoo! network data and develop applications with access control using an open authentication standard.

Yahoo! makes structured data available to developers through its web services, such as Flickr and Yahoo! Local, and through other sources like RSS feeds or CSV documents. There are also numerous external web services and APIs outside of Yahoo! that provide valuable data. For example, Programmableweb.com shows more than a thousand APIs available on topics ranging from project management to the Bible.

These disparate services require developers to locate the correct URLs for accessing the APIs and the documentation for querying them. Data remains isolated and separated, requiring developers to combine and work on the data after it’s returned to them.

What Is YQL?

The Yahoo! Query Language (YQL) platform provides a mediator service that enables developers to query, filter, and combine data across the web. YQL exposes a SQL-like SELECT syntax that is both familiar to developers and expressive enough for getting the right data. Through YQL’s SHOW and DESC commands, Yahoo! attempts to make YQL self-documenting, enabling developers to discover the available data sources and structure without opening another web browser or reading a manual.

The SELECT statement is the primary verb for YQL and borrows much from a SQL-like syntax:

SELECT what FROM table WHERE filter

“Tables” in YQL refer to external data sources that often contain very large collections of structured data. You can think of each item in a table as a “row” in a more traditional relational database. All data in YQL is treated as XML, and if the underlying table source does not provide XML, it converts into an XML-like representation.

You can join any table or data source with another data source through sub-selects—if each table has a matching value—similar to a foreign key in SQL.

Consider the following YQL statement:

SELECT filename,thumbnail_url FROM search.images(50) WHERE query="hat" AND mimetype LIKE "%jpeg%" 
AND height<400 AND width<400 LIMIT 2

This returns the thumbnail URL and filename of up to two results FROM the search.image table for images of "hats" that are jpeg encoded and under a certain size:

 yahoo:lang="en-US" yahoo:updated="2008-12-02T12:50:59Z" yahoo:uri="http:// query.yahooapis.com/v1/
yql?q=select+filename%2C+thumbnail_url+from+search.images%2850%29+where+query%3D%22hat%22+and+mimetype+like+%22%25jpeg
%25%22+and+height%3C400+and+width%3C400+LIMIT+2"> http://boss.yahooapis.com/ysearch/images/v1/hat?count=50&start=0&format=xml 320 258 2008.12.01.15:20 RedHat4151.jpg http://ac4.yt-thm-a01.yimg.com/image/45b413c52b2d4a88 red-hat-tshirt-p-lg.jpg http://ac4.yt-thm-a01.yimg.com/image/239a055f9d7c0fb8

You can run this query by simply copying the following URL into your browser:

http://query.yahooapis.com/v1/public/yql? q=select%20filename%2C%20thumbnail_url%20from%20search.images(50)
%20where%20query%3D%22hat%22%20and%20mimetype%20like%20%22%25jpeg%25%22%20and%20height%3C400%20and%20width%3C400%20
LIMIT%202&format=xml

Functional Basics

Dot Syntax

The SELECT statement's "what" and local filtering clauses use a "dot" style syntax to address parts in the table item data structure. Each dot-separated part in the field path refers to the name of the element (for JSON and XML documents), an attribute, or CDATA/text portion of the element (for XML documents). The more "dots" in a path, the deeper into the data structure the final element is found. This dot syntax provides a simple and expressive mechanism to address parts of an item's structure.Consider the following XML item.

   some item title           http://foo          37.123   

The following are valid dot-paths that refer to different parts of this structure:

Item.titleItem.linkItem.link.contentItem.image.widthItem.lat

The namespace of any attributes or elements in XML is ignored.

Remote vs. Local Processing

YQL accesses a diverse range of services and data. Many queries access data sources with thousands of items in them. To manage these large remote set sizes and to work with existing web services without modification, YQL provides two distinct ways?remote and local?to filter the data being queried in the table.

Remote filtering simply means pushing as much of the filter (or query) to the remote data provider or service. The most common way that services enable their collections to be queried and filtered is through REST-style syntax and query/path parameters. Because these often do not map closely to the result structure that is returned by that query, YQL exposes them explicitly as remote "keys" in the WHERE cause, some of which must be provided. For example, the local.search table requires both "query" and "location" keys (what to search for and where):

SELECT * FROM local.search WHERE query="sushi" AND location="san francisco, ca"

Keys support only "equality" comparisons. Describing a table (DESC local.search) shows what keys or combinations of keys are possible for a given table.

Local Filtering

After data arrives at the YQL service, you can filter it through any combination of ANDs, Ors, and comparators. You can refer to particular parts of each item in the result set through the dot-style syntax (the root/item level is not needed). For example, consider the local example above. If you are interested only in restaurants with a high average review, you can locally filter the results using the Rating element like this:

SELECT * FROM local.search WHERE query="sushi" AND location="san francisco, ca" AND Rating.AverageRating>4

Paging

By default, YQL only provides 10 items from a table. If you want to get more data locally processed, or return more results, you need to instruct YQL to retrieve more items by inserting the number into parenthesis after the table name:

SELECT * FROM flickr.photos.interestingness(20)

If you want to start at an offset in the table, you can use a second parameter in the parentheses:

SELECT * FROM flickr.photos.interestingness(10,20)

This returns 20 items starting at offset 10. If you'd like YQL to process as much data as it can in 30 seconds (the default timeout for any YQL statement), you can use (0) as the table sizing parameter.

Projection

Often, you don't want all of the data in each item that's coming back from a table. For example, perhaps you only want the title of an RSS feed item. The "what" part of the SELECT statement enables you to select (or project) which parts of the table data that satisfy the WHERE clause are actually returned, including any parent parts of the structure that are necessary. Each dot-style path included in the comma separated list selects a sub-tree to return for each item in the table.

Consider a web search example:

select * FROM search.web where query="pizza"

Each item returned from this result looks like this:

            pizza chain, with online 
ordering, Pizza Hut store finder, coupons, and menu. Also in Spanish.]]=
http://lrd.yahooapis.com/_ylc= =10t5eod6j/**http%3A//www.pizzahut.com/ 2008/12/01 pizzahut.com]]= 12880 <![CDATA[<b>Pizza</b> Hut]]= http://www.pizzahut.com/

If you need only the title and URL, for example, you can project only those pieces of the result:

SELECT title, url FROM search.web where query="pizza"

This query returns:

            <![CDATA[<b>Pizza</b> Hut]]=            http://www.pizzahut.com/

Joins

One of the most powerful aspects of YQL is its ability to join data across different tables (and web services) through sub-SELECTs. For example, to get your own profile information, use this statement:

SELECT * FROM social.profile where guid=me

The social.profile table returns the profile for a given GUID, the identifier for a person within Yahoo! (and me is a convenience constant that expands the current user's profile key or identifier).

Do this to get your friends:

SELECT * FROM social.connections where owner_guid=me

The result of this table query is a list of friends that don't have much information; it's mostly a list of GUID's.

Now, you can combine both of these tables using a join to obtain all the profiles for each of your friends:

SELECT * FROM social.profile WHERE guid IN(SELECT guid FROM social.connections WHERE owner_guid=me)

In essence, the sub-SELECT needs to return the foreign-keys that are used for the join with the outer-SELECT. The project part of the sub-SELECT needs to point to a leaf node of the sub-SELECTs results, like the GUID for each social connection returned.

You can do joins whenever you get a table that provides input to another table. Consider getting an international weather forecast for any location on the planet. The weather.forecast table provides forecasts but requires a location that is either a U.S. ZIP code or an international location identifier. The weather.com site provides a great service that returns this identifier from a string. So now you can join these two services together to get the weather using a string:

SELECT * FROM weather.forecast WHERE location IN   (SELECT id FROM xml WHERE    url="http://xoap.weather.com/search/search?WHERE=prague"    AND itemPath="search.loc")		

This query also uses YQL's ability to bring in data from any place on the Internet through the XML table.

Piping Output Through Post-query Functions

After a SELECT has finished filtering and projecting the items, YQL allows you to add additional "post-query" functions to the results before they are returned. Appending the "pipe" symbol to the end of the statement performs these functions:

SELECT … |sort(field=item.date)SELECT … |unique(field=item.title)| …

Using the Service

Getting Started: The Console

 
Figure 1. YQL Console Explore what YQL can do.

The easiest way to explore what YQL can do, and to start using the service, is to use the console (see Figure 1). The right-hand side of the page shows the set of "built-in" tables that YQL exposes, primarily most of Yahoo!'s web services. You can enter the YQL commands into the main text area at the top of the page and the results display underneath. Click any of the table names or sample queries to cause examples to "run" in the console.

Using the Web Service

After using the console you've probably come up with a few queries that you'd like to execute from your application. Accomplish this by sending a GET request to one of the two YQL web service endpoints, either:

http://query.yahooapis.com/v1/yql?q=[query]

or

http://query.yahooapis.com/v1/public/yql?q=[query]

The default output is XML for both endpoints, regardless of the table source that you access. If you like, JSON can append an output parameter to the call:

http://query.yahooapis.com/v1/yql?q=[query]&output=json

In addition, the JSON response also supports JSONP to allow script includes in the browser:

http://query.yahooapis.com/v1/yql?q=[query]&output=json&callback=jsfunction

Authentication and Access Control

Each table in YQL has an authentication scope required for access. Typically, this corresponds to the type of data the table exposes. Private data, like user profile information, is limited to applications that can verify they are acting on behalf of a user. Public data, like web search results, is generally accessible to any application.

Public tables are accessible through the /v1/public/yql endpoint. You access private tables (and public tables) through the main /v1/yql endpoint, which requires that you sign the call using OAuth. The public URL has stricter rate limiting, so if you plan to use YQL heavily, we recommend you access the OAuth-protected URL.

OAuth

OAuth is an open standard that enables providers of user data to share some or all of that data with an external application with the user's explicit permission. YQL (and Yahoo!) uses this standard to enable users to share their information, like their name or social network details, with third-party developers (you!).

The first step is to get a "consumer key" and "consumer secret." This information tells Yahoo! what the application is when calls are made.

The second step, if you need it, is to get permission from the user using your application to access their private data in Yahoo!. After you've done that, you'll have an "access token" and an "access token secret."

You can find more information here. Getting a request signed using OAuth may seem a little daunting at first. Fortunately, there are libraries for most languages that take care of the details for you. One of these, the Yahoo! social SDK provides a PHP library that makes accessing YQL and other Yahoo web services easy:

See Listing 1 for an example of how to get the most of your social information from Yahoo!

Learning More

YQL enables developers to easily access, combine, and filter remote data to fit their application requirements. It provides consistent and unified semantics for accessing user-specific private data and public data in Yahoo!, as well as the rest of the web. Data is returned in a normalized form.More detailed documentation and information can be found here.To learn more about Y!OS and to see how it provides new avenues for you to reach and connect with Yahoo! users like never before, visit Yahoo!'s Open Strategy home page.

devx-admin

devx-admin

Share the Post:
Razer Discount

Unbelievable Razer Blade 17 Discount

On September 24, 2023, it was reported that Razer, a popular brand in the premium gaming laptop industry, is offering an exceptional deal on their

Innovation Ignition

New Fintech Innovation Ignites Change

The fintech sector continues to attract substantial interest, as demonstrated by a dedicated fintech stage at a recent event featuring panel discussions and informal conversations

Import Easing

Easing Import Rules for Big Tech

India has chosen to ease its proposed restrictions on imports of laptops, tablets, and other IT hardware, allowing manufacturers like Apple Inc., HP Inc., and

Anthropic Investment

Amazon’s Bold Anthropic Investment

On Monday, Amazon announced its plan to invest up to $4 billion in the AI firm Anthropic, acquiring a minority stake in the process. This

Razer Discount

Unbelievable Razer Blade 17 Discount

On September 24, 2023, it was reported that Razer, a popular brand in the premium gaming laptop industry, is offering an exceptional deal on their Razer Blade 17 model. Typically

Innovation Ignition

New Fintech Innovation Ignites Change

The fintech sector continues to attract substantial interest, as demonstrated by a dedicated fintech stage at a recent event featuring panel discussions and informal conversations with industry professionals. The gathering,

Import Easing

Easing Import Rules for Big Tech

India has chosen to ease its proposed restrictions on imports of laptops, tablets, and other IT hardware, allowing manufacturers like Apple Inc., HP Inc., and Dell Technologies Inc. more time

Semiconductor Stock Plummet

Dramatic Downturn in Semiconductor Stocks Looms

Recent events show that the S&P Semiconductors Select Industry Index seems to be experiencing a downturn, which could result in a decline in semiconductor stocks. Known as a key indicator

Anthropic Investment

Amazon’s Bold Anthropic Investment

On Monday, Amazon announced its plan to invest up to $4 billion in the AI firm Anthropic, acquiring a minority stake in the process. This decision demonstrates Amazon’s commitment to

AI Experts Get Hired

Tech Industry Rehiring Wave: AI Experts Wanted

A few months ago, Big Tech companies were downsizing their workforce, but currently, many are considering rehiring some of these employees, especially in popular fields such as artificial intelligence. The

Lagos Migration

Middle-Class Migration: Undermining Democracy?

As the middle class in Lagos, Nigeria, increasingly migrates to private communities, a PhD scholar from a leading technology institute has been investigating the impact of this development on democratic

AI Software Development

ChatGPT is Now Making Video Games

Pietro Schirano’s foray into using ChatGPT, an AI tool for programming, has opened up new vistas in game and software development. As design lead at business finance firm Brex, Schirano

Llama Codebot

Developers! Here’s Your Chatbot

Meta Platforms has recently unveiled Code Llama, a free chatbot designed to aid developers in crafting coding scripts. This large language model (LLM), developed using Meta’s Llama 2 model, serves

Tech Layoffs

Unraveling the Tech Sector’s Historic Job Losses

Throughout 2023, the tech sector has experienced a record-breaking number of job losses, impacting tens of thousands of workers across various companies, including well-established corporations and emerging startups in areas

Chinese 5G Limitation

Germany Considers Limiting Chinese 5G Tech

A recent report has put forth the possibility that Germany’s Federal Ministry of the Interior and Community may consider limiting the use of Chinese 5G technology by local network providers

Modern Warfare

The Barak Tank is Transforming Modern Warfare

The Barak tank is a groundbreaking addition to the Israeli Defense Forces’ arsenal, significantly enhancing their combat capabilities. This AI-powered military vehicle is expected to transform the way modern warfare

AI Cheating Growth

AI Plagiarism Challenges Shake Academic Integrity

As generative AI technologies like ChatGPT become increasingly prevalent among students and raise concerns about widespread cheating, prominent universities have halted their use of AI detection software, such as Turnitin’s

US Commitment

US Approves Sustainable Battery Research

The US Department of Energy has revealed a $325 million commitment in the research of innovative battery types, designed to enable solar and wind power as continuous, 24-hour energy sources.

Netanyahu Musk AI

Netanyahu and Musk Discuss AI Future

On September 22, 2023, Israeli Prime Minister Benjamin Netanyahu met with entrepreneur Elon Musk in San Francisco prior to attending the United Nations. In a live-streamed discussion, Netanyahu lauded Musk

Urban Gardening

Creating Thriving Cities Through Urban Gardening

The rising popularity of urban gardening is receiving increased recognition for its numerous advantages, as demonstrated in a recent study featured in the Environmental Research Letters journal. Carried out by

What You Need to Know About Cloud Security Strategies

What You Need to Know About Cloud Security Strategies

Today, many businesses are adopting cloud computing services. As a result, it’s important to recognize that security measures for data in the cloud are different from those in traditional on-premises

Romanian Energy Security

Eastern Europe is Achieving Energy Security

Canada and Romania have solidified their commitment to energy security and independence from Russian energy exports by signing a $3-billion export development agreement. The deal is centered on constructing two

Seamless Integration

Unlocking Seamless Smart Home Integration

The vision of an intelligently organized and interconnected smart home that conserves time, energy, and resources has long been desired by many homeowners. However, this aspiration has often been hindered

New Algorithm

MicroAlgo’s Groundbreaking Algorithm

MicroAlgo Inc. has revealed the creation of a knowledge-augmented backtracking search algorithm, developed through extensive research in evolutionary computational techniques. The algorithm is designed to boost problem-solving effectiveness, precision, and

Poland Energy Future

Westinghouse Builds Polish Power Plant

Westinghouse Electric Company and Bechtel have come together to establish a formal partnership in order to design and construct Poland’s inaugural nuclear power plant at the Lubiatowo-Kopalino site in Pomerania.

EV Labor Market

EV Industry Hurting For Skilled Labor

The United Auto Workers strike has highlighted the anticipated change towards a future dominated by electric vehicles (EVs), a shift which numerous people think will result in job losses. However,