Just a quick update: I will be giving a presentation about all the new features in Officeshots at the 2010 Document Freedom Day in Baarn in The Netherlands. I will be updating the audience about the progress made since my presentation last year at the DFD. I am not sure exactly what time I will speak, but it will be taped for those who cannot be there.
One of the great advantages of the OpenDocument format is that it is simply a zip file. You can unzip it with any archiver and take a look at the contents, which is a set of XML documents and associated data. Many people are using this feature to create some nifty toolchains. Unzip, make some changes, zip it again and you have a new ODF document. Well… almost.
The OpenDocument Format specification has one little extra restriction when it comes to zip containers: The file called “mimetype” must be at the beginning of the zip file, it must be uncompressed and it must be stored without any additional file attributes. Unfortunately many developers seem to forget this. It is the number one cause of failed documents at Officeshots.org. If the mimetype file is not correctly zipped then it is not possible to programmatically detect the mimetype of the ODF file. And if the mimetype check fails, Officeshots (and possibly other applications) will refuse the document. This problem is compounded because virtually no ODF validator checks the zip container. They only check the contents. In this article I will show you how you can properly create ODF files using zip.
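As a minimal sketch of the container rules described above, here is one way to write an ODF package with Python's standard zipfile module. This is an illustration only, not the tooling Officeshots uses; the function name write_odf_package and the members argument are made up for this example.

```python
import zipfile

ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def write_odf_package(path, members, mimetype=ODT_MIMETYPE):
    """Write an ODF zip container (hypothetical helper).

    members maps archive names (e.g. "content.xml") to bytes or str.
    """
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes first. A hand-built ZipInfo has an empty
        # extra field, and ZIP_STORED keeps the entry uncompressed.
        info = zipfile.ZipInfo("mimetype")
        info.compress_type = zipfile.ZIP_STORED
        zf.writestr(info, mimetype)

        # Everything else may be compressed as usual.
        for name, data in members.items():
            if name == "mimetype":
                continue  # already written above
            zf.writestr(name, data)
```

Calling write_odf_package("out.odt", {"content.xml": xml_bytes, "META-INF/manifest.xml": manifest_bytes}) would then put the mimetype entry at the very start of the file, where a detector can find it.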
I have just released a new feature for Officeshots: The ODF Anonymiser. The ODF Anonymiser tries to make your document completely anonymous while maintaining its overall structure. All metadata is removed or cleaned. All text in the document is replaced with gibberish text that has approximately the same word length and word distribution. All images are replaced with placeholder images. All unknown content is removed.
The result of the anonymiser is a document that has the same general structure but with made-up contents. If your original document does not work in a certain application, the anonymised version of the document should fail in the same manner. By using the anonymiser you can test your private documents without exposing the contents to our rendering clients.
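To give an idea of what replacing text with gibberish of the same shape can look like, here is a small Python sketch. It is not the actual anonymiser code: the function name scramble_text is invented for this example, and it only handles plain strings, whereas the real tool works on the XML inside the ODF package.

```python
import random
import string


def scramble_text(text, rng=None):
    """Replace letters and digits with random ones (hypothetical helper).

    Whitespace and punctuation are kept, so word lengths and the overall
    layout of the text stay the same while the content becomes unreadable.
    """
    rng = rng or random.Random()
    out = []
    for ch in text:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            letter = rng.choice(string.ascii_lowercase)
            # Preserve capitalisation so sentences keep their shape.
            out.append(letter.upper() if ch.isupper() else letter)
        else:
            out.append(ch)
    return "".join(out)
```

Because only the characters are swapped, a paragraph that wraps badly or overflows a table cell in the original should do roughly the same in the scrambled copy.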
I am happy to announce an exciting new feature for Officeshots: Integrated ODF validators.
Every ODF document that is uploaded is run through several different ODF validators. If the converted documents are also ODF documents (when you are testing ODF round trips) then those results are also passed through these ODF validators.
The results of the validators are made available on the request overview, the individual result pages and inside the galleries. Galleries now not only show all attached documents but also all results and a summary of the validator results. This way it becomes really easy to see which documents failed.
I have finished setting up the internationalisation and localisation frameworks for Officeshots. If you want, you can now help to translate Officeshots to your own language. Translating Officeshots can be done through our Pootle installation.
At the moment there are almost no languages configured yet in Pootle. The reason is that the CakePHP framework on which Officeshots runs has a different locale structure than what Pootle expects. This means I need to add every language by hand. If you want to start working on a new language, please post to the Officeshots mailinglist and I will add the language to Pootle and to Officeshots.
When working on the beta of Officeshots.org I ran into an interesting problem with file type and MIME type detection of OpenDocument files. When a user uploads an ODF file to Officeshots I want to determine the MIME type myself using the PHP Fileinfo extension. Windows users who do not have any ODF supporting applications installed will report ODF files as application/zip, which is of no use to me. In addition, a malicious user could attempt to upload an executable file and report its MIME type as that of an ODF file.
On Linux, the PHP Fileinfo extension relies on the magic file that is provided by the file package. The magic file contains a series of tests that can determine the file type and MIME type of a file by its contents. I found out that the magic file is incomplete for OpenDocument files. Below I will show you what is wrong with the magic file and how you can fix it.
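The kind of content-based test involved can be sketched in a few lines of Python. This is not the magic file patch itself, just an illustration of the idea under the assumption that the package follows the ODF container rules (mimetype entry first, stored uncompressed, no extra field). The function name sniff_odf_mimetype is made up for this example.

```python
import struct

ODF_PREFIX = "application/vnd.oasis.opendocument."


def sniff_odf_mimetype(data):
    """Return the ODF MIME type found at the start of data, or None.

    data holds the first bytes of the uploaded file (hypothetical helper).
    """
    # A zip local file header is 30 bytes and starts with "PK\x03\x04".
    if len(data) < 38 or data[:4] != b"PK\x03\x04":
        return None
    (method,) = struct.unpack_from("<H", data, 8)   # compression method
    (size,) = struct.unpack_from("<I", data, 22)    # uncompressed size
    name_len, extra_len = struct.unpack_from("<HH", data, 26)
    # The first entry must be an uncompressed "mimetype" without extra field.
    if method != 0 or name_len != 8 or extra_len != 0:
        return None
    if data[30:38] != b"mimetype":
        return None
    content = data[38:38 + size]
    if len(content) != size:
        return None  # not enough bytes were read
    try:
        mimetype = content.decode("ascii")
    except UnicodeDecodeError:
        return None
    return mimetype if mimetype.startswith(ODF_PREFIX) else None
```

A check like this looks only at the file contents, so it does not depend on the MIME type reported by the browser.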
Update 2009-06-29: I have now also created a patch against the original upstream file-5.0.3.
Officeshots.org has finally gone into Beta this week. It took a lot more work (and time) than expected but we made it nonetheless. At the moment the beta is a closed beta, available to current contributors and members of the OpenDoc society. But we hope to start with public, free availability within a month. Joining the OpenDoc society is free for FOSS projects, so if you are interested in the beta, please join them.
Read more for the full press release.
Yesterday the OpenDoc Society, the NoiV (Netherlands in Open Connection) and the NLNet Foundation announced Officeshots.org, a new webservice where you can upload ODF documents and compare their rendering and output in different office suite applications. We here at Lone Wolves are happy to announce that we are the lead architects of this new webservice.
Over the coming days I will announce a couple of things regarding Officeshots.org on this website, like how it works, where to get the code and how to contribute. The plan is to start a closed beta by the end of February and go public by the end of March, but if we want to make this deadline then we need contributors. In the upcoming days I will explain exactly what we need, but if you want to help then you can already join the officeshots.org mailinglist.
I just spotted this wonderful poem about freedom on a photograph of the ODF Olympiad 2008 Malaysia. I thought it very apt for the issues facing us these days.
Where the mind is without fear and the head is held high.
Where knowledge is free.
Where the world has not been broken up into fragments by narrow domestic walls.
Where words come out from the depth of truth.
Where tireless striving stretches its arms towards perfection.
Where the clear stream of reason has not lost its way into the dreary desert sand of dead habit.
Where the mind is led forward by thee into ever-widening thought and action.
Into that heaven of freedom, my Father, let my country awake.
Via the ODF Discuss mailinglist.
A curious FAQ put up by an unnamed ISO staffer on MS-OOXML. Question #1 expresses concerns about Fast Tracking a 6,000 page specification, a concern which a large number of NB's also expressed during the DIS process. Rather than deal honestly with this question, the ISO FAQ says:
The number of pages of a document is not a criterion cited in the JTC 1 Directives for refusal. It should be noted that it is not unusual for IT standards to run to several hundred, or even several thousand pages.
For ISO, in a public relations pitch, to blithely suggest that several thousand page Fast Tracks are "not unusual" shows an audacious disregard for the truth and a lack of respect for a public that is looking for ISO to correct its errors.
From: An Antic Disposition by Rob Weir.
Updated on 2008-03-26@17:34 I emailed a copy of this article to Patrick and he has responded. I have posted his response at the bottom of the article.
This is a response to Patrick Durusau's recent letter Who loses if OpenXML loses? (PDF). Before I discuss the various points that you make in your letter there is one thing that I would like to say: I find it shameful that you, Patrick, make these kinds of statements without a proper disclaimer that this is your personal opinion and not the position of the ODF committee (for whom you edit the ODF specifications), the V1 or any other technical body that you represent. In fact you seem quite happy that the media is running with headlines like “The ODF editor says…”, else you would have done something about it after your previous publications. To lead by example:
The opinions expressed in this letter are my own. They do not necessarily represent the viewpoint of LXer Linux News, nor the viewpoint of my employer Tribal Internet Marketing. They do represent the viewpoint of The Lone Wolves Foundation though.
Now, back to your letter.
The only one who loses if DIS 29500 fails is Microsoft, whose Office 2007 cashcow will run into trouble. Everyone else, including the OpenDocument Format, does not need an ISO stamp of approval on DIS 29500. The current Ecma 376 standard, flawed as it is, is more than enough to work with.
This letter is also posted on LXer Linux News.
Lone Wolves is happy to announce the ODF-XSLT project. The ODF-XSLT Document Generator is a library written in PHP 5 that brings the full power of XSLT to your OpenDocument files. It enables you to use ODF files as if they were plain XSLT templates. It also includes a few extra parsing options that allow you to edit the XSLT parts of these ODF files from within your favourite office suite. ODF-XSLT is developed by Tribal Internet Marketing and is released by Stichting Lone Wolves as Free Software under the GNU General Public License, version 3.
The first release of ODF-XSLT is odf-xslt-0.4 and can be downloaded from our download section, together with a nightly snapshot of the subversion trunk. You can also check out the latest version directly from our subversion repository. The manual and API documentation are available from the project website.
A long time ago in a land far away there once was a prosperous town called Hamelin. Everything was perfect in Hamelin until the year the rats came. The rats ate up the grain, bit the townsfolk in the toes and scared the young children. Something had to be done!
And so begins Rob Weir's allegory The Legend of the Rat Farmer. An allegory in which the Bürgermeister and Council of Hamelin try to find a solution to their rat problem, discover the importance of appropriate metrics and learn a thing or two about standardization in the process.
Rob gives a very good explanation of exactly what is wrong with Microsoft's latest claims that “choice [of standards] is good for the consumer”. Read it at An Antic Disposition.
Two organisations, OpenForum Europe (OFE), a leading organisation set up to advance the use of open standards, and ODF Alliance, a campaigning group promoting open document format, representing over 210 organisations in 30 countries, highlight that the new standard, Microsoft licensed Office Open XML, is being fast tracked to become a new European ISO/IEC standard. This new standard has been submitted by ECMA, the European Computer Manufacturers Association with a completely unrealistic deadline for stakeholders to engage.
One of the OFE’s and ODF Alliance’s main criticisms targeted at ECMA’s standard is its complexity. It is over 6,000 pages long, excluding supporting material, making it time consuming and ultimately more expensive for the future development of software. It also duplicates an existing comprehensive (and recently ratified) standard, Open Document Format (ODF), which causes a major issue of system complexity, development, maintenance, archiving and licensing. Furthermore, elements of ECMA’s standard contradict the recently ratified ODF standard, which, if implemented, would lead to confusion for software developers, increase cost and lead to problems sharing and archiving documents. There are also serious doubts that the standard could be implemented outside the Microsoft environment, due to license requirements that are not made explicit.
ACTION: Write to your local standards organisation setting out your concerns, recommending that an issue of this importance should be given reasonable time for proper consideration and due diligence. A 30 day Fast Track Procedure is not appropriate for a 6000 page document. Contact list on the ODF Alliance European Website.