I recently needed to count the words in an XML-based book I was writing. Because of all the markup, this isn't as simple as it might seem: a plain word count over the raw files also counts tag and attribute names. I eventually found this article:
Tip: Computing word count in XML documents.
The technique works well, although the word count isn't entirely precise. It's close enough for my purposes, though. Here's the command line I use to get a single word count across the entire book, which is split into many separate DITA source files:
$ xsltproc --novalid http://www.example.com/stripXML.xsl *.dita | wc -w
The XSL file referenced is a copy of the one from the article; I posted it to a web server so that I can reach it from wherever I'm working. The xsltproc tool comes pre-installed on Mac OS X 10.6 Snow Leopard.
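If xsltproc or the stylesheet isn't at hand, the same idea (drop the markup, count what's left) can be roughly sketched with Python's standard library. This is not the stylesheet from the article, just an approximation of the approach; the function names and the script name used below are made up for illustration, and the count is no more precise than the one above.

```python
import sys
import xml.etree.ElementTree as ET


def count_words(root):
    """Count whitespace-separated words in the text under an element."""
    # itertext() yields the text content in document order and leaves out
    # tag names and attributes. Joining with a space keeps words in
    # adjacent elements from fusing, at the cost of splitting a word that
    # has inline markup in the middle of it.
    return len(" ".join(root.itertext()).split())


def count_words_in_files(paths):
    """Sum the word counts of several XML files (hypothetical helper)."""
    total = 0
    for path in paths:
        total += count_words(ET.parse(path).getroot())
    return total


if __name__ == "__main__":
    # Paths come from the command line, e.g. a shell glob such as *.dita
    print(count_words_in_files(sys.argv[1:]))
```

Saved as, say, wordcount.py, it would be run as `python wordcount.py *.dita`. The parser used here does not fetch external DTDs, so a file that relies on entities defined only in its DTD may fail to parse.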