Efficient XML Interchange Encodings

 
 

The Efficient XML Interchange (EXI) Working Group has been chartered by the W3C to create an encoding of the XML infoset that is more compact and faster to parse. See the EXI Working Group's public page for details about the group. The working group has written a number of documents related to the EXI specification.


The EXI Primer is here

The EXI Format Specification is here

The EXI Best Practices document is here

The EXI Measurements Note is here


The EXI specification is currently at Last Call status. In order to promote interoperability between implementations and validate the clarity of the draft specification, we’ve encoded a number of XML documents in the EXI format with several different encoding options. Those interested in ensuring compatibility can compare their own encodings to those on this web site.


The XML documents were encoded using Agile Delta’s (http://www.agiledelta.com) Efficient XML product (http://www.agiledelta.com/product_efxsdk.html). Efficient XML is an implementation of the Draft 3 specification. It has not been tested by the EXI working group for conformance or interoperability.


 


Encoded Files

The working group used a test corpus that consisted of a variety of XML documents. The original XML documents and encoded files are here:


Encoded Files Directory


You can download a tar file of the entire directory here.
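Once the reference files are unpacked, one way to check your own encodings against them is a byte-for-byte comparison. The sketch below assumes two directories holding identically named .exi files; the function name `compare_exi` and the directory layout are hypothetical, not part of the published script. A reported difference is a prompt to investigate, not proof of nonconformance: compressed streams in particular may legitimately differ between DEFLATE implementations.

```bash
#!/usr/bin/env bash
# compare_exi REF_DIR MY_DIR   (hypothetical helper, not from the original script)
# Prints one line per reference .exi file: MATCH, DIFFER, or MISSING.
# Returns 0 only if every reference file has a byte-identical counterpart.
compare_exi() {
  local ref_dir="$1" my_dir="$2" status=0 ref name
  for ref in "$ref_dir"/*.exi; do
    [ -e "$ref" ] || continue            # reference directory has no .exi files
    name=$(basename "$ref")
    if [ ! -f "$my_dir/$name" ]; then
      echo "MISSING $name"; status=1
    elif cmp -s "$ref" "$my_dir/$name"; then
      echo "MATCH   $name"
    else
      echo "DIFFER  $name"; status=1
    fi
  done
  return "$status"
}
```

A typical call would be `compare_exi reference_dir my_dir`, with the exit status usable in a larger test script.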


File Encoding Notes

EXI has several different encoding options to match different situations. For example, if an XML file has a schema and that schema describes some attributes as numeric values, that information can be used by the EXI encoder and some XML data can be stored in a binary encoded format rather than string format. The existence of the schema can also be used to represent event codes more compactly.
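As a rough illustration of why a typed value can be smaller than its text form, the sketch below counts the octets needed for a non-negative integer as decimal text and in a variable-length form that carries 7 bits of the value per octet, which is how the EXI specification represents unsigned integers. The function names are hypothetical, and the count ignores everything else an encoder writes (event codes, sign bits, alignment), so treat it as an estimate of the value's own footprint, not a prediction of file sizes.

```bash
#!/usr/bin/env bash
# text_octets N: octets needed to store N as decimal ASCII text.
text_octets() {
  local digits="$1"
  echo "${#digits}"
}

# varint_octets N: octets needed for non-negative N when each octet
# carries 7 bits of the value (the eighth bit flags "more octets follow").
varint_octets() {
  local n="$1" count=1
  while [ "$n" -ge 128 ]; do
    n=$(( n >> 7 ))
    count=$(( count + 1 ))
  done
  echo "$count"
}

# Print both sizes for a few sample values.
for value in 7 300 12345 2000000000; do
  echo "$value: text=$(text_octets "$value") binary=$(varint_octets "$value")"
done
```

The gap widens as values grow, which is one reason schema-declared numeric attributes benefit from typed encoding.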

The XML files that were encoded are extracts of the EXI test corpus, plus some small example files of minimal complexity, including the "notebook" example from the EXI Primer document. Where schemas were available, they were used to assist encoding. In all cases the XML files were also encoded without a schema.

The XML files were encoded with several different options, including:

• Byte-aligned

• Schema-informed (allows for deviations from the schema, the default mode)

• Strict Schema (no deviations from the schema allowed)

• With EXI header information

• Compressed

• Lexical encoding


The schema-informed encodings were created by a bash script, an excerpt of which is below:


  efx.sh -strict -schema "$2" "$1" -o "$1"_STRICT.exi
  efx.sh -header -strict -schema "$2" "$1" -o "$1"_STRICT_HEADER.exi
  efx.sh -schema "$2" "$1" -o "$1"_DEFAULT.exi
  efx.sh -header -schema "$2" "$1" -o "$1"_DEFAULT_HEADER.exi
  efx.sh -bytealign -schema "$2" "$1" -o "$1"_DEFAULT_BYTEALIGNED.exi
  efx.sh -header -bytealign -schema "$2" "$1" -o "$1"_DEFAULT_BYTEALIGNED_HEADER.exi
  efx.sh -zip -strict -schema "$2" "$1" -o "$1"_STRICT_COMPRESSED.exi
  efx.sh -strict -zip -header -schema "$2" "$1" -o "$1"_STRICT_COMPRESSED_HEADER.exi
  efx.sh -strict -prezip -schema "$2" "$1" -o "$1"_PRECOMPRESSION.exi
  efx.sh -header -prezip -strict -schema "$2" "$1" -o "$1"_STRICT_PRECOMPRESSION_HEADER.exi
  efx.sh -strict -bytealign -schema "$2" "$1" -o "$1"_STRICT_BYTEALIGNED.exi
  efx.sh -strict -bytealign -header -schema "$2" "$1" -o "$1"_STRICT_BYTEALIGNED_HEADER.exi
  efx.sh -stringvalues -schema "$2" "$1" -o "$1"_SCHEMA_LEXICAL.exi
  efx.sh -stringvalues -header -schema "$2" "$1" -o "$1"_SCHEMA_LEXICAL_HEADER.exi



EXI can also encode XML documents without a schema. The bash script for these encodings included the following:


  efx.sh -noschema "$1" -o "$1"_NOSCHEMA.exi
  efx.sh -noschema -bytealign "$1" -o "$1"_NOSCHEMA_BYTEALIGN.exi


With the scripts above, the file "notebook.xml" with the schema "notebook.xsd" would have the following command line executed:


  efx.sh -header -strict -schema notebook.xsd notebook.xml -o notebook.xml_STRICT_HEADER.exi


This would encode the notebook.xml file using the "strict schema" option and the EXI headers option, and place the output in notebook.xml_STRICT_HEADER.exi.
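The naming pattern and the schema/no-schema split can be sketched as a small dry-run helper. This is an illustration only: `exi_name` and `plan_encodings` are hypothetical names, the helper prints a subset of the command lines instead of running them, and it relies solely on the efx.sh flags shown in the excerpts above. File names containing spaces would need extra quoting before the printed lines could be executed.

```bash
#!/usr/bin/env bash
# exi_name XML SUFFIX: output file name in the style used by the excerpts,
# i.e. the input name, an underscore, the option suffix, then ".exi".
exi_name() {
  printf '%s_%s.exi\n' "$1" "$2"
}

# plan_encodings XML [XSD]   (hypothetical helper)
# Prints, without running, a subset of the efx.sh commands for one document.
plan_encodings() {
  local xml="$1" xsd="${2:-}"
  if [ -n "$xsd" ]; then
    # Schema-informed variants, only when a schema is available.
    echo "efx.sh -schema $xsd $xml -o $(exi_name "$xml" DEFAULT)"
    echo "efx.sh -strict -schema $xsd $xml -o $(exi_name "$xml" STRICT)"
    echo "efx.sh -header -strict -schema $xsd $xml -o $(exi_name "$xml" STRICT_HEADER)"
  fi
  # Every document is also encoded without a schema.
  echo "efx.sh -noschema $xml -o $(exi_name "$xml" NOSCHEMA)"
  echo "efx.sh -noschema -bytealign $xml -o $(exi_name "$xml" NOSCHEMA_BYTEALIGN)"
}
```

Calling `plan_encodings notebook.xml notebook.xsd` lists five command lines, including the strict-with-header one shown above; omitting the schema argument lists only the two schema-less ones.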


The script used to encode the files is here.


License: W3C License for use of the example XML files.

GUI For Encoding XML Documents

Efficient XML from Agile Delta is a commercial product. Other implementations are emerging, including the open source Exificient from Siemens. LT Sheldon Snyder, a student at the Naval Postgraduate School, has put together a GUI front end for the Exificient implementation that allows users to easily encode XML documents with or without schemas. Since the Exificient implementation is still a pre-release product, some features may be missing, and not all aspects of the implementation have been extensively tested.

The Java application is available for download here. To run the application, extract the files and, using JDK 1.6, type the following in the extracted directory:

  java -jar OpenerExi.jar

The application looks like this: