SharePoint Applied: CAML, Your New Pet

SharePoint is a powerful platform that gives you a very easy-to-set-up place in which to put data. But you know what happens when you have a tool like SharePoint? People use it! And then, after people have put their data in, they want to retrieve it, in all sorts of weird ways. Putting in data is only half the story (and, I’d argue, the easier half). It’s fetching the data in a meaningful and targeted manner that separates the wheat from the chaff.

Business users can be amazing. They like to hit us with scenarios we could not have imagined in our wildest dreams, particularly after we thought we had all the requirements figured out. Much to my chagrin, I have heard the following: “Oh yeah, that is a new requirement!” So it’s important that developers’ tools give them the agility and flexibility to satisfy such needs and changing requirements.

At its heart, every system is basically data in and data out. Sure, the standard adages apply: good architecture; garbage in, garbage out; and so on. But SharePoint builds upon many years’ experience, and thus includes a large number of features that developers otherwise would have to worry about or code from scratch. For example, SharePoint tracks every piece of content, keeping information such as when it was submitted, who edited it last, and when they edited it. And just a few clicks away you can find even more information, such as versioning, content approval, and so on. But in addition to such tracking information, sometimes you need to fetch SharePoint data that is not so readily available.

Retrieving SharePoint Data
There are several different options for pulling data out of SharePoint. For example, you can use the object model to get the SPList object, and run a foreach over its SPListItemCollection. Of course, that isn’t the smartest way to filter for data. Filtering via iteration can be extremely expensive in resources, because every item is loaded before it can be tested.

You could search programmatically by using the FullTextSqlQuery object, as demonstrated in this article. That will provide you with quick results, but it may or may not provide you with accurate results, and this approach has an external dependency on crawl schedules and algorithms. Given the nature of search, you also have limited control over the sorting mechanisms. The sorting is controlled generally by the rank, which is dependent on an algorithm that you can influence, but not fully control.

And then you have CAML, the Collaborative Application Markup Language. SharePoint uses CAML for many purposes, one of which is extracting the data you need. CAML strikes an excellent balance between speed, dependability, and accuracy.

Within SharePoint you’ll find a number of objects that use CAML and that can help you make data queries. This article looks at these objects individually.

The Lists Web Service
SharePoint ships with a number of pre-defined web services. Web services have the innate advantage of isolating atomic pieces of functionality, giving you better reliability and flexibility, a concept otherwise known as SOA (service oriented architecture).

Listing 1 shows an easy way to filter out all rows modified by a given user ID, using the lists.asmx web service.

Web services have their advantages, but performance and XML serialization overhead aren’t among them, so it’s reasonable to expect very rich support for CAML in the object model as well. At the heart of that you’ll find the SPQuery object.

Editor’s Note: This article was first published in the May/June 2008 issue of CoDe Magazine, and is reprinted here by permission.

The SPQuery Object
The SPQuery object lets you specify a query in CAML syntax, and find the matching items using the SPList.GetItems method. For instance, you could easily use SPQuery to filter out all documents for a given user. You can see this demonstrated by the code shown in Listing 2.
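
As a rough illustration of the pattern just described, here is a minimal sketch of an SPQuery filter. It assumes the http://moss2007 site used elsewhere in this article, a hypothetical document library named "Shared Documents", and a placeholder user value; it should not be read as the article's own listing.

```csharp
using System;
using Microsoft.SharePoint;

class SPQuerySketch
{
    static void Main()
    {
        // Assumption: the site URL matches the one used elsewhere in this article.
        using (SPSite site = new SPSite("http://moss2007"))
        using (SPWeb web = site.OpenWeb())
        {
            // Hypothetical list name; substitute a list that exists on your site.
            SPList list = web.Lists["Shared Documents"];

            SPQuery query = new SPQuery();
            // "Editor" is the internal name of the built-in "Modified By" field.
            // The user value below is a placeholder, not a real account.
            query.Query =
                "<Where><Eq>" +
                "<FieldRef Name=\"Editor\" />" +
                "<Value Type=\"User\">Administrator</Value>" +
                "</Eq></Where>";

            // Only the items matching the CAML filter are returned.
            SPListItemCollection items = list.GetItems(query);
            foreach (SPListItem item in items)
            {
                Console.WriteLine(item.Name);
            }
        }
    }
}
```

Both the SPSite and the SPWeb are wrapped in using blocks so that they are disposed once the query has run.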

This lets you retrieve a subset of items matching a given criterion in your .NET code. If you wanted to do further lightweight filtering or sorting, you could simply convert the SPListItemCollection to a DataTable and filter it in memory. Even better, you could use LINQ to sift through your data as shown below:

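   // Requires System.Linq and System.Data.DataSetExtensions (.NET 3.5)
   DataTable returnedItems = items.GetDataTable();
   IEnumerable<DataRow> selectedRows =
      from r in returnedItems.AsEnumerable()
      where r.Field<string>("Title") == "My Announcement"
      select r;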

An interesting thing to note about SPList.GetItems and SPQuery is that, by default, SPList.GetItems returns results from only one folder of a given list. If you instead want to recurse through the folders in a given list and identify all items matching the given criterion, you need to add the following line to your code:

   query.ViewAttributes = "Scope=\"Recursive\"";

The SPSiteDataQuery Object
Well of course, now that you’ve figured out how to scan subfolders, the business user coyly hits you with another requirement. Could you somehow retrieve all documents, modified or created by the given user, across the entire website? Well, you know that the CAML query for such a requirement would look exactly like you have seen in Listing 1 and Listing 2.

But how can you possibly filter out all documents or list items across the entire site? Use the SPSiteDataQuery object as demonstrated in the code below.

   using (SPSite site = new SPSite("http://moss2007"))
   using (SPWeb web = site.OpenWeb())
   {
      SPSiteDataQuery query = new SPSiteDataQuery();
      query.Query = "...";
      DataTable results = web.GetSiteData(query);
   }

The code snippet above intentionally omits the CAML query and replaces it with an ellipsis (...). SPSiteDataQuery expects to see the CAML query without the <Query> element, and without any whitespace or line breaks. The query then looks a bit like this (without the line breaks).

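   <Where>
      <Or>
         <Eq>
            <FieldRef Name="Author" />
            <Value Type="User">MOSS2007\Administrator</Value>
         </Eq>
         <Eq>
            <FieldRef Name="Editor" />
            <Value Type="User">MOSS2007\Administrator</Value>
         </Eq>
      </Or>
   </Where>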
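
One way to produce a whitespace-free query string like this without hand-escaping quotes is to build it with LINQ to XML. The following is only a sketch: the CamlSketch class and its BuildUserQuery and UserEquals helpers are hypothetical names, and the Author and Editor field names assume the built-in "Created By" and "Modified By" fields.

```csharp
using System;
using System.Xml.Linq;

static class CamlSketch
{
    // Builds <Eq><FieldRef Name="..."/><Value Type="User">...</Value></Eq>.
    static XElement UserEquals(string fieldName, string user)
    {
        return new XElement("Eq",
            new XElement("FieldRef", new XAttribute("Name", fieldName)),
            new XElement("Value", new XAttribute("Type", "User"), user));
    }

    // Returns a Where clause matching items created OR modified by the user,
    // serialized on a single line with no indentation or line breaks.
    public static string BuildUserQuery(string user)
    {
        XElement where = new XElement("Where",
            new XElement("Or",
                UserEquals("Author", user),    // assumed: built-in "Created By"
                UserEquals("Editor", user)));  // assumed: built-in "Modified By"
        return where.ToString(SaveOptions.DisableFormatting);
    }

    static void Main()
    {
        Console.WriteLine(BuildUserQuery(@"MOSS2007\Administrator"));
    }
}
```

The returned string could then be assigned to the Query property of the query object. Because the value is added as element content, any characters that are special in XML are escaped automatically.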

The PortalSiteMapProvider Object
SPQuery and SPSiteDataQuery perform quite well. But sometimes, “quite well” is just not good enough. As an illustration, the out-of-the-box views in SharePoint recommend that you place fewer than 2000 items in each container (such as a view or folder). Beyond the 2000 mark, you will see significant performance degradation. For example, a list with 100K items may require up to 152,000 milliseconds to fetch data using SPListItems and For Each. This shortens significantly, to about 2700 milliseconds, using a simple Page_Load in a browser for the out-of-the-box views. It takes even less time (450 milliseconds) when using CAML and the lists.asmx web service. In comparison, even the built-in search takes about 350 milliseconds to return the results, possibly inaccurately. Amazingly, SPQuery and SPList.GetItems can reduce this time to just over 200 milliseconds, which is impressive for sifting through 100K records.

But what if you had a million items?

At some point, in certain very rare situations, you need extreme performance of the kind that only advanced techniques such as caching can deliver. The PortalSiteMapProvider provides exactly that. It reduces the time required for the preceding scenario to a mere 5-10 milliseconds for searching 100K records. Of course, because it relies on caching, the downside of this approach is much higher memory usage on the web server. In other words, at some point, you will actually hurt overall performance by using PortalSiteMapProvider.

The following code demonstrates how to use the PortalSiteMapProvider object.

   PortalSiteMapProvider ps =
      PortalSiteMapProvider.WebSiteMapProvider;
   SiteMapNodeCollection nodes =
      ps.GetCachedListItemsByQuery(
         ps.FindSiteMapNode(web.ServerRelativeUrl)
            as PortalWebSiteMapNode,
         "Announcements", query, web);

Figuring Out the CAML Search Syntax
It would be incredibly remiss of me to end this article without mentioning an indispensable tool that makes writing CAML queries a whole lot easier. Check out the U2U CAMLBuilder, available as a free download.

Using U2U CAMLBuilder is incredibly simple. Through a point-and-click interface, it lets you craft a parameterized CAML query, which you can then copy and paste into your .NET code. It really doesn’t get much easier than that.

Happy CAML riding!
