Need to Know: XML

Microsoft has gotten into hot legal water over its use of XML in Word, with a Texas court giving it two months to sort out a patent issue before it'll be forced to stop selling one of its top products.

For some, the legal angle is less confusing than XML itself - what is it, and why is it so important to Word?

What is XML?

XML stands for Extensible Markup Language. That possibly didn't help much.

It marks out specific bits of a document, so you can specify what each and every bit is, by content type. It's code for organising content, not for formatting it.

In a Word document, you could use it to specify who wrote the words, for example, or what the title is. That may seem like a silly thing to do, but it means the document can be read in an automated way - all the authors could be pulled out of a group of documents, for example, and formatted automatically.
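To make that concrete, here is a minimal sketch in Python of pulling the authors out of a group of documents. The two documents and their tag names (document, title, author, body) are invented for illustration; they are not Word's actual file format.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents. The tag names are made up for this example
# and do not reflect any real word processor's format.
documents = [
    "<document><title>Quarterly report</title><author>A. Smith</author>"
    "<body>Sales rose this quarter.</body></document>",
    "<document><title>Meeting notes</title><author>B. Jones</author>"
    "<body>Agreed to meet again in May.</body></document>",
]


def collect_authors(xml_strings):
    """Return the text of every <author> element, in document order."""
    authors = []
    for xml_text in xml_strings:
        root = ET.fromstring(xml_text)
        # The markup says which bit is the author, so no guessing is needed.
        for author in root.iter("author"):
            authors.append(author.text)
    return authors


authors = collect_authors(documents)
print(", ".join(authors))
```

Because each bit of content is labelled by type, the program never has to guess which words are the author's name; it simply asks for the author elements.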

What is it used for?

XML is the language that makes feeds work. Think of RSS, Atom and XHTML - all are based on XML. That's what lets feed aggregators pull out the right bits of a news story - for example, the headline, byline and first few lines - in order to reformat them into something your feed reader can understand.
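As a rough sketch of what an aggregator does, the Python below reads a tiny RSS-style feed and pulls out the headline, byline and summary of each story. The feed content is invented for illustration, and a real aggregator would handle far more cases (Atom feeds, namespaces, malformed markup).

```python
import xml.etree.ElementTree as ET

# A made-up, minimal RSS 2.0 feed with a single story.
FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>Court rules on Word patent</title>
      <author>reporter@example.com (A. Reporter)</author>
      <description>A court has given the company two months to respond.</description>
    </item>
  </channel>
</rss>"""


def summarise_items(feed_xml):
    """Return one dict per <item>, holding its headline, byline and summary."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default if the child element is missing.
            "headline": item.findtext("title", default=""),
            "byline": item.findtext("author", default=""),
            "summary": item.findtext("description", default=""),
        })
    return items


for story in summarise_items(FEED):
    print(story["headline"], "-", story["byline"])
```

Once the pieces are in a plain structure like this, a feed reader is free to lay them out however it likes.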

For document editing suites, XML is used in Apple's iWork as well as, for now anyway, Microsoft Word.

In Word, Microsoft uses it for many things, but especially to connect to other business programs like SharePoint. The Office Open XML file format became the default in Word 2007, and is used as a standard for document files in other programs, too.

So what's the problem now?

XML itself isn't at issue, but the way it's manipulated is. The Canadian firm i4i, which is suing Microsoft, has a patent for changing the architecture and content of a document separately from each other, while Microsoft itself has a patent for pulling data out of an XML document using a computer.

It's a pretty specific thing, then. Because the i4i bits aren't necessary to Word's functioning, many pundits are suggesting Microsoft may just issue a patch to remove them from Word, rather than risk the software being pulled from shelves - or pay the $240 million fine.
