All the courses I am responsible for rely heavily on content made available through the web. I know, big deal! Each course has its own web page containing absolutely all the documents related to the course: notes, exams, labs, messages, schedule, syllabus, etc. This was something I didn't even think about back when I had to organize these courses, but as time passed, it proved to be a more challenging task than I initially thought.
Since I have free access to a web site, the initial approach was simply to push some files into a directory and let students access them. Then, a bit of HTML, nothing fancy, simply to link the content into sections, each of them with its own HTML page. So far so good. Then came the need to make quick modifications to the documents (in the vicinity of 100 files when you include figures, additional material, etc). So what I needed was an agile way of taking a change from my own computer and pushing it to the web site. That was easily done with a dead simple FTP client. But then came the need to produce large numbers of HTML documents.
That was the turning point in my otherwise happy life as a content producer. An HTML file in one of my courses now has a unified prelude (university logo, a few links to other relevant web sites) and a postlude (including the CC license and some other stuff). The page structure was a template that needed to be replicated for all the documents. So, rather than resorting to sophisticated include patterns, I decided to write only the essential part of each HTML document and let style processing take care of the rest.
And this is where I found DocBook. DocBook is basically a way to annotate all the parts of an arbitrary text with their structure. In other words, if a sentence is a title, you simply surround it with <title> and </title>; a section within your document is surrounded by <section> and </section>. This procedure extends to a large number of possible elements. If you write a document with this structure, which is XML, you focus your energy on the structure of your text and leave the rest to some additional tool that takes these annotations and creates the pretty HTML.
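To make the idea of structural annotation concrete, here is a minimal sketch in Python (standard library only). The element names <section>, <title> and <para> are real DocBook elements, but the course content and the outline() helper are invented for illustration. It only shows that, once the structure is explicit, a program can read it back without guessing.

```python
import xml.etree.ElementTree as ET

# Hypothetical course note. The element names are DocBook's; the text is made up.
DOC = """\
<section>
  <title>Lab 1: Pointers</title>
  <para>Read the notes before the session.</para>
  <section>
    <title>Exercises</title>
    <para>Solve the first three problems.</para>
  </section>
</section>
"""

def outline(node, depth=0):
    """Return the section titles as a list of indented lines."""
    lines = []
    title = node.find("title")  # direct <title> child of this section, if any
    if title is not None:
        lines.append("  " * depth + title.text)
    for child in node.findall("section"):  # nested sections, one level down
        lines.extend(outline(child, depth + 1))
    return lines

if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(DOC))))
```

Nothing in the document says how a title should look. It only says which piece of text is a title, and that is the whole point.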
The magic of turning that structure into HTML is done by XSL Transformations (XSLT). Basically, every structural element in the DocBook file is transformed into the appropriate output format, which in this case is HTML, and the transformation is applied identically to all the elements of the same type. This type of processing makes it possible to change the appearance of all titles in just one simple step. Of course, there are other tools such as Word that do something similar: you write your document using the styles that appear on the right hand side of the editor (if you choose to) and the editor remembers the style of each portion of text. If you then change a style, all elements labeled with it change their appearance.
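The "one rule per element type" idea can be sketched without any XSLT at all. What follows is a Python stand-in, not the DocBook stylesheets: the RULES table, the CSS class names and the to_html() function are assumptions made up for this example, and it ignores mixed content (text that follows a child element) to stay short.

```python
import xml.etree.ElementTree as ET
from html import escape

# One rule per DocBook element type: (HTML tag, CSS class or None).
# This table is invented for illustration; the real stylesheets are far richer.
RULES = {
    "section": ("div", "section"),
    "title": ("h2", "title"),
    "para": ("p", None),
}

def to_html(node):
    """Apply the rule for this element type, then recurse into its children."""
    tag, css = RULES[node.tag]
    attrs = f' class="{css}"' if css else ""
    text = escape((node.text or "").strip())  # leading text only, tails are ignored
    children = "".join(to_html(child) for child in node)
    return f"<{tag}{attrs}>{text}{children}</{tag}>"

if __name__ == "__main__":
    doc = ET.fromstring("<section><title>Intro</title><para>Hello</para></section>")
    print(to_html(doc))
```

Changing the single "title" entry in RULES changes every title in every document processed afterwards, which is the property the real transformations give you.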
Back to my course web site. By writing XML documents containing the course material, I could process all of them and transfer them to the remote web site, and all my pages would look the same, with a fairly decent appearance. But as the course material grew larger (labs, auxiliary files, more images, etc), so did the process of producing the entire web site every time a change was needed. So, as a techie, I decided some serious workflow tool was needed and deployed Ant to capture all the steps, execute only those required to bring the web site up to date, and then transfer it to (or should I say, synchronize it with) the remote host.
Although it took me quite a while to get this workflow running, the big advantages of the approach soon became apparent. I then needed to add a few improvements to the web site, and the combination of DocBook + XSL transformations showed a lot of potential.
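The "run only what is required" part boils down to a timestamp comparison: a page needs rebuilding when it does not exist yet or is older than its source. Here is a minimal sketch of that rule in Python. It is not how Ant is implemented or configured, and the function names and file names are hypothetical.

```python
import os

def mtime_or_none(path):
    """Modification time of path, or None when the file does not exist."""
    return os.path.getmtime(path) if os.path.exists(path) else None

def stale_targets(pairs, mtime=mtime_or_none):
    """Given (source, target) pairs, return the targets that must be rebuilt."""
    stale = []
    for source, target in pairs:
        built = mtime(target)
        # Rebuild when the target is missing or older than its source.
        if built is None or built < mtime(source):
            stale.append(target)
    return stale

if __name__ == "__main__":
    # Hypothetical file names; on a real tree these would be the course sources.
    print(stale_targets([("notes.xml", "notes.html"), ("lab1.xml", "lab1.html")]))
```

The mtime parameter exists only so the rule can be tried out with made-up timestamps instead of real files.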
Common appearance
One very important gain from this approach is controlling the appearance of all your pages from a centralized location. DocBook provides a set of default transformations to translate files into HTML. If you are not happy with the outcome, you simply add your own rules on top of the defaults. That way, it is a breeze to include a common prelude/postlude in all pages. Each page now has its head and foot with all the proper links.
RSS: Really Simple Syndication
Things got more interesting when I decided to create an RSS feed for the entire course web site. Since all the required information was contained in the DocBook files, it was just a matter of creating a new transformation that, rather than producing HTML, creates the RSS feed. All my courses now have their own RSS feed in parallel (the percentage of students that use it is another issue).
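The same source files can feed a second output format. As a rough sketch of the idea (in Python rather than XSLT, with an invented build_rss() function, invented URLs, and the assumption that each document's top-level <title> is what should appear in the feed), an RSS 2.0 feed could be assembled like this:

```python
import xml.etree.ElementTree as ET

def build_rss(course_title, site_url, docs):
    """Build an RSS 2.0 feed. docs is a list of (docbook_xml_text, page_url)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires a title, a link and a description for the channel.
    ET.SubElement(channel, "title").text = course_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = f"Updates for {course_title}"
    for xml_text, url in docs:
        root = ET.fromstring(xml_text)
        item = ET.SubElement(channel, "item")
        # Reuse the DocBook title as the feed item title.
        ET.SubElement(item, "title").text = root.findtext("title", default="Untitled")
        ET.SubElement(item, "link").text = url
    return ET.tostring(rss, encoding="unicode")

if __name__ == "__main__":
    docs = [("<section><title>Lab 1</title><para>Text</para></section>",
             "http://example.org/course/lab1.html")]  # placeholder URL
    print(build_rss("My Course", "http://example.org/course/", docs))
```

No information had to be written twice: the titles come straight from the documents that already produce the HTML pages.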
The Gadget
The next feature I wanted to add was a Google Gadget. Google allows you to provide a Gadget suitable to be embedded in your personalized home page. Basically, you provide an XML file with an HTML snippet inside, and this is rendered in the middle of a box at a location in a page chosen by the user. Of course the API is much more complicated and allows you to make reactive Gadgets, but I went for the easy catch and basically have a miniature course web page with the relevant links suitable for your Google Personalized Home page.
And now Core Duo
But the real challenge came when it was time to take the web site (initially only in Spanish) and localize it to English. This is when DocBook was a real boost. The two choices I was confronted with were: duplicating every single file that needed a different version in English (that is, 99% of them), or keeping both versions next to each other in the same document and pulling only the relevant part for each web site.
The first option appeared the most convenient. Merging content in two languages within the same file seemed a nightmare to understand. But on the other hand, when a change needed to be made, having a paragraph in both languages next to each other seemed perfect. It turned out that DocBook comes fully loaded to adopt this second option with almost no effort. The trick is to label each element that needs two versions with an attribute stating the language in which it is written. So the material grew to almost twice its original size, because each paragraph written in Spanish is followed by its English translation. This greatly simplified changing files, because both versions were right next to each other. As for the problem of creating two versions of the site, the style transformations accept a language parameter and select, out of all the elements in the text, only those that are labeled with the language you provided or with no label at all.
This is exactly what I needed to keep both versions in the same document and yet maintain the capability of quickly generating the two sets of disjoint HTML pages and now publish them in two different course sites.
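The selection rule described above (keep an element if it carries the requested language or no language label at all) fits in a few lines. This is a Python sketch of the rule only, not the DocBook stylesheets themselves. The select_language() function and the sample text are made up, and the attribute is assumed to be named lang.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical bilingual note: each translated element carries a lang label,
# while language-neutral material (the code listing) carries none.
DOC = """\
<section>
  <title lang="es">Punteros</title>
  <title lang="en">Pointers</title>
  <para lang="es">Lea las notas.</para>
  <para lang="en">Read the notes.</para>
  <programlisting>int *p;</programlisting>
</section>
"""

def _prune(node, lang):
    """Remove, in place, every descendant labeled with a different language."""
    for child in list(node):  # copy the list because we remove while iterating
        label = child.get("lang")
        if label is not None and label != lang:
            node.remove(child)
        else:
            _prune(child, lang)

def select_language(root, lang):
    """Return a copy of root keeping elements labeled lang or not labeled."""
    result = copy.deepcopy(root)  # the bilingual original stays untouched
    _prune(result, lang)
    return result

if __name__ == "__main__":
    english = select_language(ET.fromstring(DOC), "en")
    print([child.text for child in english])
```

Running the same function with "es" yields the Spanish site from the very same file, and the unlabeled listing shows up in both.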
And this is the scheme I'm using these days. When it comes to writing a new document, a plain old ASCII file will do, with a bunch of XML elements in there that you come to like. I then include the new document in the chain and voilà, the right HTML page in the right language with the proper links appears on the web site.