Friday, January 28, 2005

XML-binary Optimized Packaging

At last, XML has a formal, standard means of dealing cleanly and efficiently with arbitrary binary data. To date the approach has been to base-64 encode the data, which:
  • Enlarges the data.
  • Spreads each group of three bytes across four bytes, marginally worsening the performance of efficient update algorithms like rsync when synchronising changes.
  • Renders in-place use (e.g. mmap()ing a video stream) impossible.
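The size cost in the first point is easy to check. A minimal sketch using Python's standard base64 module follows; the sample data is arbitrary and chosen only so that its length is a multiple of three.

```python
import base64

# 3072 bytes of arbitrary sample data; the length is a multiple of three,
# so the encoded form needs no "=" padding.
raw = bytes(range(256)) * 12
encoded = base64.b64encode(raw)

# Every 3 input bytes become 4 output characters: a one-third increase.
print(len(raw), len(encoded))  # 3072 4096

# The encoding is lossless, but the original bytes no longer appear
# verbatim anywhere in the encoded text.
assert base64.b64decode(encoded) == raw
```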
The adopted approach utilises a MIME multipart/related container to move the binary streams out of the XML document itself, and the Content-ID: header plus the cid: URL scheme to tie the pieces back together. This is substantially the same approach as a scheme that I've contemplated in the past to solve much the same problem.
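To make the container layout concrete, here is a rough sketch that assembles such a package by hand. The function name build_xop_package, the boundary string, and the example.org identifiers are all made up for illustration, and the header values reflect my reading of the XOP specification rather than a checked implementation. A real producer would also have to pick a boundary that does not occur in any payload.

```python
from typing import Mapping

XOP_NS = "http://www.w3.org/2004/08/xop/include"

def build_xop_package(xml_root: bytes,
                      attachments: Mapping[str, bytes],
                      boundary: str = "xop-boundary") -> bytes:
    """Serialise a root XML part plus raw binary parts as a multipart/related body.

    Hypothetical helper: it does not check that the boundary is absent
    from the payloads, which a real implementation must do.
    """
    delim = b"--" + boundary.encode("ascii")
    lines = [
        delim,
        b'Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"',
        b"Content-Transfer-Encoding: 8bit",
        b"Content-ID: <root@example.org>",
        b"",            # blank line separates headers from the body
        xml_root,
    ]
    for cid, data in attachments.items():
        lines += [
            delim,
            b"Content-Type: application/octet-stream",
            b"Content-Transfer-Encoding: binary",
            b"Content-ID: <" + cid.encode("ascii") + b">",
            b"",
            data,       # the bytes travel as-is, with no base-64 step
        ]
    lines.append(delim + b"--")  # closing delimiter
    return b"\r\n".join(lines) + b"\r\n"

# The XML refers to the binary part through a cid: URL instead of carrying it.
xml = (
    '<doc><photo>'
    '<xop:Include xmlns:xop="' + XOP_NS + '" href="cid:photo@example.org"/>'
    '</photo></doc>'
).encode("utf-8")

package = build_xop_package(xml, {"photo@example.org": b"\x89PNG\r\n\x1a\n"})
```

The enclosing message would then carry a header along the lines of Content-Type: multipart/related; boundary=xop-boundary; type="application/xop+xml"; start="<root@example.org>" so that a reader knows which part is the root.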

What they've done that I had not considered is to define a mapping between this efficient form and a simple XML document with the binary stream base-64-encoded in place, which means that efficient representation and canonical form (e.g. for signing) need no longer be in conflict.
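One direction of that mapping can be sketched in a few lines: take the optimised XML plus its binary parts and put each stream back in place as base-64 text. The function name inline_binary and the sample document are my own inventions, and this covers only the simple case of an element whose sole content is the include. It is not the specification's full processing model, nor a canonicalisation step.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Mapping
from urllib.parse import unquote

XOP_INCLUDE = "{http://www.w3.org/2004/08/xop/include}Include"

def inline_binary(xml_text: str, parts: Mapping[str, bytes]) -> str:
    """Replace each xop:Include element with the base-64 text of the part it names.

    Hypothetical helper: `parts` maps a Content-ID (without angle brackets)
    to the raw bytes of that MIME part.
    """
    root = ET.fromstring(xml_text)
    # Snapshot the element list first, since the tree is modified while walking it.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == XOP_INCLUDE:
                # cid: URLs may percent-encode characters of the Content-ID.
                cid = unquote(child.get("href")[len("cid:"):])
                parent.remove(child)
                parent.text = base64.b64encode(parts[cid]).decode("ascii")
    return ET.tostring(root, encoding="unicode")

optimised = (
    '<doc><photo>'
    '<xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" '
    'href="cid:photo@example.org"/>'
    '</photo></doc>'
)
plain = inline_binary(optimised, {"photo@example.org": b"hello"})
```

Here the photo element ends up holding the text aGVsbG8=, the base-64 form of the five bytes in the part. A signature computed over this plain form remains meaningful however the bytes actually travelled.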