Browse DevX
Sign up for e-mail newsletters from DevX


Ink And The Database : Page 3

Unless your battery is really, really good, you'll eventually want to store your Ink. Simple file storage or XML serialization is sometimes sufficient, but usually, you'll want to move Ink into and out of a relational database. Here's how.




Building the Right Environment to Support AI, Machine Learning and Deep Learning

Improving Performance While Adding Full-Text Search
I've come full circle to the initial discussion of trade-offs between normal form, query scenarios, and performance. Neither of the two extremes (Fully normalize your point data and store nothing but BLOBs) are likely to be correct. Instead, the right approach varies from application to application.

For the final example, take one piece of data derivable from Ink (the recognized text) and associate it with an Ink store. Full-text search requires a considerable departure from the SQL standard, so this example is highly specific to SQL Server 2005.

First, add an ID field of SQL type uniqueidentifier to the Ink table and set it to be the primary key. Create another table, RecoResults, which also has a uniqueidentifierId, a foreign key to Ink table (InkId) an ntext field called Alternate, and another called RecoConfidence. (See Figure 2.)

Figure 2. Storing Ink: The figure shows the process of storing Ink-derived data for full-text search of recognized handwriting.
After creating the RecoResults table, open SQL Server Management Studio, right-click on the table, and choose "Define Full-Text Index…" to create a full-text index for the Alternate column via a simple wizard.

Back in Visual Studio in the Example1 event-handler for Save, modify the SqlCommand so that it includes a new GUID to ID the Ink. After the Ink is inserted, add a call to a new method:

AddRecognizedText(conn, (Guid)cmd.Parameters["@Id"].Value, io.Ink.Strokes);

To call AddRecognizedText(), you need an open SqlConnection, the GUID of the row in which you've stored the Ink's data, and the Strokes collection that represents the pen-down to pen-up strokes in the Ink. In the Tablet PC SDK, recognition is done by a RecognitionContext object. The first block of code in Listing 3 is the boilerplate code to get a default context, here called ctxt.

You could recognize the Strokes as a single whole, but in many applications, an Ink object contains a mix of text and drawing areas.
You could recognize the Strokes as a single whole, but in many applications an Ink object contains a mix of text and drawing areas. (For instance, in an image annotation application, you might draw dent #1 and draw an arrow to the relevant portion of the photo.) Divvying up a Strokes collection into text and non-text is done by the Divider class.

After instantiating the divider object and calling Divide() on it, you end up with a DivisionResult object. Typically, this holds some mix of Drawing, Paragraph, Line, and Segment results, which you can iterate over by calling ResultsByType() and passing the appropriate type.

Keeping in mind that you've only requested Paragraphs, iterate each DivisionResult and use the ctxt to perform recognition on the Strokes collection of the DivisionResult. If the RecognitionStatus is NoError, you know that you have some recognized text. The handwriting recognition typically returns more than one RecognitionAlternate, each of which has a RecognitionConfidence value of Strong, Intermediate, or Poor. The recognized text is available from the ToString() method of the RecognitionAlternate. In Listing 3, the innermost loop inserts each possible Recognition object, along with its Confidence value, into the RecoResults table of the database. To query this data, use the SQL Server-specific Contains() function. Here's an example.

SELECT Id, Ink, Length FROM Ink, RecoResults WHERE RecoResults.InkId = Ink.ID AND Contains(RecoResults.Alternate, 'foo') AND RecoResults.Confidence = 'Strong'

An interesting thing about this query is that you cannot use the DISTINCT qualifier, as the presence of an image column disqualifies that argument. So you can either accept duplicates in the return (but remember that a duplicate here means duplicating the byte[] blob) or use a two-stage query in a stored procedure (which is certainly what I'd choose in a production application), selecting the DISTINCT InkIds from the RecoResults table and then using those to query the Ink table.

Fill Up With Ink
Although it is not possible to add Ink directly into a database, the techniques shown in this article show how easy it is to convert Ink to and from a BLOB and work on that basis. Databinding is also easy, although as discussed in Example 2, a client-side DataSet containing the entire database's worth of Ink data is unlikely to be a good idea. Instead, look at your usage and query scenarios, and directly store the Ink-derived metadata in appropriate tables (or, if you prefer, XML columns) as you did in Example3. At that point, recovering the desired Ink is simply a matter of working back to the ink data via standard database joins. It couldn't be easier!

Larry O�Brien is a software development industry analyst and programmer. He writes the �Microsoft and .NET Watch� column for SD Times and was the founding editor of Software Development and Game Developer magazines. He is a vocal proponent of the TabletPC, and outlined this story using MindManager in pen mode on a Motion M1400 slate. Final text and coding was done on a Toshiba Portege M200 in laptop mode. See Larry's blog for more information.
Thanks for your registration, follow us on our social networks to keep up-to-date