Compressing XML—Part I, Writing WBXML (cont'd)
Generating WBXML Format from XML
Listing 1 contains a Wireless Markup Language (WML) file. WML is an XML-based markup language similar to HTML, but optimized to serve as a presentation layer on the small, monochrome screens typically associated with WAP devices. I'll show you how to convert WML to WBXML. Listing 2 shows the WBXML format resulting from the conversion process, and Table 1 (see the last page of this article) gives descriptive notes for each byte in Listing 2.

Most of the WBXML format is not human-readable; it contains bytes (octets) in raw binary form rather than as encoded text, and this article writes them in hexadecimal (hex) notation. Two hex digits represent one byte. For example, the binary byte 0100 1000 is 0x48 in hex (0x is the standard prefix for hex numbers; the digit 4 represents 0100 and the digit 8 represents 1000). As another example, the binary number 1100 1110 0111 1101 in hex is 0xCE 0x7D.
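The binary-to-hex arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `to_hex_bytes` is made up for this example and is not part of any WBXML tool.

```python
def to_hex_bytes(bits: str) -> list:
    """Convert a string of binary digits (spaces allowed) into hex byte strings."""
    digits = bits.replace(" ", "")
    if len(digits) % 8 != 0:
        raise ValueError("expected a whole number of bytes (a multiple of 8 bits)")
    # Take 8 bits at a time; each group of 8 bits becomes two hex digits.
    return [f"0x{int(digits[i:i + 8], 2):02X}" for i in range(0, len(digits), 8)]


print(to_hex_bytes("0100 1000"))            # ['0x48']
print(to_hex_bytes("1100 1110 0111 1101"))  # ['0xCE', '0x7D']
```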

The first byte of a WBXML file holds the version number of the WBXML specification used in the file. The upper four bits hold the major version minus one and the lower four bits hold the minor version, so 0x03 means version 1.3 and 0x13 means version 2.3. This example uses WBXML version 1.3, so the first byte of the WBXML file is 0x03. Version 1.3 is the latest version and is part of the WAP 2.0 release (see the June 2001 release on the WAP Forum's web site).
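As a small sketch of the version byte layout, the following Python functions pack and unpack the major and minor numbers. The function names are hypothetical; the bit layout (major version minus one in the high four bits, minor version in the low four bits) is the one implied by the 0x03 and 0x13 examples.

```python
def decode_version(byte: int) -> str:
    """Turn a WBXML version byte into a 'major.minor' string."""
    major = (byte >> 4) + 1   # high nibble stores major - 1
    minor = byte & 0x0F       # low nibble stores minor
    return f"{major}.{minor}"


def encode_version(major: int, minor: int) -> int:
    """Pack a major/minor pair into a single WBXML version byte."""
    if not (1 <= major <= 16 and 0 <= minor <= 15):
        raise ValueError("major must be 1..16 and minor 0..15")
    return ((major - 1) << 4) | minor


print(decode_version(0x03))       # 1.3
print(decode_version(0x13))       # 2.3
print(hex(encode_version(1, 3)))  # 0x3
```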

The sequence of bytes following the version number identifies the Document Type Definition (DTD) of the XML file that you want to transform. There are two ways of doing this. First, you can include a well-known numeric public ID for the DTD. If no well-known ID is available, you can include the DTD's public identifier in its string form. Listing 2 uses the first method (0x04 for WML, byte 2 of Table 1).

Byte 3 contains the character-encoding declaration. WBXML requires an Internet Assigned Numbers Authority (IANA) MIBEnum value instead of a character encoding name. For further information about IANA MIBEnum values, refer to the IANA web site (see the resources column). Table 1 shows the MIBEnum value for the UTF-8 encoding used in the example, which is 106 (decimal) or 0x6A.

A string table follows the character encoding. A string table is a reusable sequence of characters (yes, characters and not bytes) that you include once in a WBXML file and can refer to from anywhere else in the file. String tables reduce WBXML file size by using references, so that no sequence of characters has to be inserted more than once. The fourth byte specifies the total length of the string table (in the WBXML specification, a byte count that excludes the length field itself). This example doesn't use a string table; therefore byte 4 of Listing 2 contains 0x00, which means the length of the string table is zero.
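The first four bytes described so far can be assembled as follows. This is a minimal sketch under stated assumptions: the constant and function names are invented for illustration, and the code only handles values that fit in a single byte below 0x80 (the WBXML specification uses a multi-byte integer format for larger values, which this sketch does not implement).

```python
WBXML_VERSION_1_3 = 0x03  # byte 1: version 1.3
PUBLIC_ID_WML = 0x04      # byte 2: well-known public ID used in Listing 2
CHARSET_UTF8 = 106        # byte 3: IANA MIBEnum for UTF-8 (0x6A)


def build_header(string_table: bytes = b"") -> bytes:
    """Return the version, public ID, charset, and string table bytes."""
    if len(string_table) >= 0x80:
        # Longer tables need the multi-byte length encoding, omitted here.
        raise ValueError("this sketch only supports string tables under 128 bytes")
    header = bytes([WBXML_VERSION_1_3, PUBLIC_ID_WML, CHARSET_UTF8, len(string_table)])
    return header + string_table


print(build_header().hex(" "))  # 03 04 6a 00
```

With an empty string table, the result matches the first four bytes discussed above: 0x03, 0x04, 0x6A, and 0x00.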

Byte 5 represents the root element (<wml>). The example uses a standard WML DTD for which the WAP Forum has defined a WBXML encoding table. So in this case you can simply use the WML byte codes that the WAP Forum has defined in its WML specification, and you don't need to include any element names in the WBXML document. The WBXML specification allows you to use byte codes ranging from 0x05 to 0x3F to specify tags, a range called the tag code space.

The WBXML specification also defines a code space for Global Tokens, which have special meaning for WBXML parsers, so you can't use them as element codes. For example, 0x00 to 0x04 are Global Tokens.

You may need to add a numeric value to the byte code of each element, depending on which of the following three scenarios applies:

  • When an element contains content (text nodes or child elements) but no attributes, add 0x40 to the byte code.
  • When the element contains one or more attributes but no content, add 0x80 to the byte code.
  • When the element contains both attributes and content, add 0xC0 (0x40 + 0x80) to the byte code.

In the example in Listing 1, the root element <wml> has a byte code of 0x3F and contains content but no attributes. Therefore byte 5 in Table 1 is 0x7F (0x3F + 0x40).

The <card> element has byte code 0x27 and contains both attributes and content, so byte 6 is 0xE7 (0x27 + 0xC0).
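The three scenarios amount to setting two flag bits on top of the element's byte code. The following Python sketch shows one way to compute and split such a token; the names `tag_token` and `split_tag_token` are hypothetical, and the element codes 0x3F and 0x27 are the ones quoted above for <wml> and <card>.

```python
HAS_CONTENT = 0x40     # set when the element has text or child elements
HAS_ATTRIBUTES = 0x80  # set when the element has one or more attributes


def tag_token(code: int, has_attributes: bool = False, has_content: bool = False) -> int:
    """Combine an element byte code with its content/attribute flags."""
    if not 0x05 <= code <= 0x3F:
        raise ValueError("element code must lie in the tag code space 0x05..0x3F")
    token = code
    if has_content:
        token |= HAS_CONTENT
    if has_attributes:
        token |= HAS_ATTRIBUTES
    return token


def split_tag_token(token: int) -> tuple:
    """Return (element code, has_attributes, has_content) for a tag token."""
    return token & 0x3F, bool(token & HAS_ATTRIBUTES), bool(token & HAS_CONTENT)


print(hex(tag_token(0x3F, has_content=True)))                       # 0x7f
print(hex(tag_token(0x27, has_attributes=True, has_content=True)))  # 0xe7
print(split_tag_token(0xE7))                                        # (39, True, True)
```

Because the tag code space stops at 0x3F, the two flag bits never collide with the element code, so adding 0x40, 0x80, or 0xC0 gives the same result as the bitwise OR used here.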

When you include an element with attributes (such as byte 6 in Listing 2), the attribute list follows immediately after the element code. WBXML declares an attribute code space that overlaps with the tag code space; this overlap does not produce any ambiguity, because the position of a byte tells the parser whether it is reading a tag or an attribute. The global token 0x01 marks the end of the attribute list (byte number 30 in Table 1). Attribute and element code spaces overlap with each other, but not with the code space for global tokens. Therefore all the content from bytes 6 to 30 in Table 1 and Listing 2 represents the <card> element's code and its attribute list.