Open Source News from FOSDEM 2009 - Day 2 - Lone Wolves - Web, game, and open source development

On the weekend of 7 and 8 February, the 9th Free and Open Source Software Developers' European Meeting (FOSDEM) took place at the Université Libre de Bruxelles (ULB) in Brussels. Your editors Sander Marechal and Hans Kwint attended to find out what is hot and new in the Linux world and what might be coming to you in the near future. This is our report of the second day, covering the talks about Thunderbird 3, Debian release management, Ext4, Syslinux, CalDAV and more. Coverage of the first day can be found in our previous article.

FOSDEM, the Free and Open Source Software Developers' European Meeting

Lightning talk: How the social networking site Hyves benefits from Puppet - Marlon de Boer
Thunderbird 3 - Ludovic Hirlimann and David Ascher
Upstart - Scott James Remnant
Release management in Debian: Can we do better? - Frans Pop
Lightning talk: Introducing FreeDroidRPG - Arthur Huillet
Syslinux and the dynamic x86 boot process - H. Peter Anvin
Lightning talk: Games done good - Steven Goodwin
Ext4 - Theodore Ts'o
Automated web translation workflow for multilingual Drupal sites - Stany van Gelder
CalDAV, the open groupware protocol - Helge Heß
Wrap up

Lightning talk: How the social networking site Hyves benefits from Puppet - Marlon de Boer

By Hans Kwint

Hyves.net is one of the Netherlands' biggest social networking sites, but not everybody may be familiar with it. To give an idea of what this means for the server requirements, Marlon gave some figures: "200 million pageviews daily and at busy moments reaching eighteen million pageviews per hour". All these requests are served by 2500 servers spread over several locations near the largest internet exchange in the world, in Amsterdam. From time to time servers have to be replaced or new servers have to be added. The Hyves administration team is able to unpack a new server, connect its cables and configure it in seven minutes. Interestingly, these servers run Gentoo Linux. Great "Get the facts" titles came to my mind, such as "Read how Hyves manages 2500 servers and puts new servers online in seven minutes using Gentoo Linux and Puppet".

Back to Puppet: Puppet is a front-end for remote system administration tasks, written in Ruby. It also uses Augeas, the topic of the last talk of the previous day. Puppet has a client-server architecture and uses SSL for authentication. It makes use of templates to enable quick configuration of new servers and it also supports types and functions. Hyves now uses PXE booting in combination with Puppet to quickly configure new servers. Among other things, Puppet handles DNS, NTP, firewall and package management. Of course, as a Gentoo user I was curious why Hyves chose Gentoo, which may not seem the most logical choice. Someone else asked exactly that question, and one of the developers answered that they use it because of the ease of writing custom packages. "I write an ebuild within five minutes," he said, "something that cannot be done that quickly with Debian packages, for example".

Thunderbird 3 - Ludovic Hirlimann and David Ascher

By Sander Marechal

There was already a Q&A session going on when I entered the Mozilla dev room for the Thunderbird 3 talk. The room was packed to the brim and some people could not get in anymore. Many points were discussed before the talk even began. The biggest one was probably also the biggest setback for people awaiting Thunderbird 3: Mozilla is dropping integration of the Lightning calendar from the 3.0 release.

The talk itself began with a short overview of the history of Thunderbird and Mozilla Messaging. Mozilla itself has been very focused on Firefox (see also the Mozilla opening keynote in the previous article) and Thunderbird had been pushed to the background a bit. The spin-off of Mozilla Messaging is meant to correct that because, like the web, messaging has come under serious threat lately.

Messaging as a whole (not just e-mail) is under threat from a variety of big, centralized and closed systems, in particular Facebook. Ludovic estimated that at the current growth rate of closed messaging networks like Facebook, soon more messages will travel over these closed networks than over e-mail (spam excluded). Just because these messages travel over HTTP does not mean that the messaging networks are open. Data ownership matters, as does decentralized innovation and user-level hackability. Currently most innovation is happening in the closed centralized networks. Mozilla Messaging is therefore not just about Thunderbird and e-mail but about messaging as a whole.

For the immediate future the focus is on Thunderbird though, and Mozilla Messaging has quite a big job ahead of it. There are thousands of open bug reports and feature requests going back years that need to be addressed. Mozilla also wants to add new features in order to grow its market the way that Firefox has, but that is more difficult than it seems. Thunderbird is pretty feature-packed as it is, so in order to add new features, old ones need to be removed or pushed into extensions to keep the Thunderbird base from becoming too bloated. Also, the internals of Thunderbird are quite complex: changes that appear to be easy often turn out to be very complex or even impossible, and (too) many of the APIs are still in C++. It has a code base of well over 500,000 lines, with only a handful of developers and 2-3 QA people to maintain it.

The plan for Thunderbird 3 is to upgrade Gecko to version 1.9.1 (the same as Firefox 3.1), upgrade the platform and make a lot of improvements to the overall user experience. The core of Thunderbird is being reworked to take modern computing into account (cheap disk space and high bandwidth) and do most of its work asynchronously. For example, when you delete a message it disappears immediately, but the actual delete from your inbox files is run at leisure in the background. The UI is also getting an overhaul in order to provide faster access to the most used functions and to simplify account setup from a seven-page wizard to a single form with three fields. When all this is done Thunderbird 3 will be released. Beta 2 is expected around the end of this week.

After that the team plans to make contributing easier with a higher-level (JavaScript) API, a new indexing database called Gloda, HTML/CSS-based views that are easier to customize and a set of extensions to experiment with even more drastic UI layout changes.

Upstart - Scott James Remnant

By Hans Kwint

Before attending this presentation, I only knew Upstart as an alternative way of booting used by Ubuntu to speed up the boot process. However, Scott Remnant, the author of Upstart who also works at Canonical, says that Upstart is much more: "It's an API for processes to communicate." It has also been used in Fedora as of late and it promises to improve the boot process by eliminating the time the computer spends doing nothing but sleep() and by avoiding race conditions.

The unique characteristic of Upstart is that it makes use of a simple grammar: While Q running, until P, while R start / stop S, else T, V also W, X and Y, Z or N. The capitals here represent boot processes which may or may not have been started. As you may have guessed, these requirements can be combined as well. The intention is to make runlevels a thing of the past. "Just tell what you want to start, or when it needs to be run", Scott explains.

For example, the grammar can be used to tell a laptop what it should do or what to stop doing while it draws power from the battery or from the mains.
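The idea of combining such conditions can be illustrated with a small Python sketch. This is not Upstart's job syntax or API; the combinators (`running`, `both`, `negate`), the `Job` class and the state names such as `on-battery` are all hypothetical and only model the kind of "X and Y", "while R" rules described above.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List

# A condition looks at the set of currently active states and says yes or no.
Condition = Callable[[FrozenSet[str]], bool]


def running(name: str) -> Condition:
    """True while the named state or service is active."""
    return lambda active: name in active


def both(left: Condition, right: Condition) -> Condition:
    """Models 'X and Y'."""
    return lambda active: left(active) and right(active)


def negate(condition: Condition) -> Condition:
    """Models 'not X'."""
    return lambda active: not condition(active)


@dataclass(frozen=True)
class Job:
    name: str
    run_while: Condition


def jobs_to_run(jobs: Iterable[Job], active: Iterable[str]) -> List[str]:
    """Return the names of all jobs whose condition currently holds."""
    state = frozenset(active)
    return sorted(job.name for job in jobs if job.run_while(state))


# Hypothetical laptop rules: index the disk only when not on battery,
# and run a power saver only while on battery.
JOBS = [
    Job("disk-indexer", both(running("filesystem"), negate(running("on-battery")))),
    Job("power-saver", running("on-battery")),
]
```

With these made-up rules, `jobs_to_run(JOBS, {"filesystem", "on-battery"})` selects only the power saver, and unplugging the battery state selects only the indexer. No runlevel is involved: each job only states the conditions under which it should run.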

Upstart also intends to make cron obsolete by supporting timed events, such as 'in 2 hours', '24 seconds after startup', 'every 10 minutes while eth0 up' and so on. These timed events allow for greater flexibility in running conditional commands, because the commands can be coupled to whether or not services are running.

Scott also explained more about the road map: Upstart 0.5.2 will be available this month, and 0.10.0 is planned for June 2009 and will feature a new job syntax. After that, 1.00 will probably come sometime in September 2009 or, if it has not progressed far enough, it will be called 0.50.

Release management in Debian: Can we do better? - Frans Pop

By Hans Kwint

Frans Pop is not a Debian release engineer but works as the release engineer of the Debian installer, even though he temporarily resigned for a few months. In that function he is an important part of the Debian release cycle because if anything in Debian is broken, users will logically blame the installer. In the past, there have been serious issues with Debian releases leading to broken features in the distribution. For example, when Etch was released the Sarge installer was broken for some time. This is a problem because certain users have policies; for example, they may have decided to continue using old-stable for a year to a year and a half. Frans was not out to blame people personally, but he did make some suggestions on how release engineers can change their way of working to prevent problems like those that have happened in the past.

It can basically be summarized as follows: there has been too little public communication. The communication between those involved in the release process took place mainly on IRC, so those who were not on IRC at the time did not know what had happened. Also, the team of release engineers lacks a thorough overview of which tasks are part of a release and how long these tasks take. For example, some of the release engineers put out new release notices without being aware that these notices have to be translated before the release. Some less widely known parts of a release are sometimes forgotten: key distribution, documentation of an upgrade path and the documentation on the Debian website. Frans pleaded for more open communication, a better understanding of how long certain tasks take and better planning of all these tasks.

Lightning talk: Introducing FreeDroidRPG - Arthur Huillet

By Sander Marechal

FreeDroidRPG is an isometric action/adventure game in the style of Diablo, starring Tux as the protagonist who fights the evil robots running MegaSys, a poorly secured operating system that nevertheless runs on the majority of robots. Arthur took the audience on a quick tour of the features of the game, such as fighting, magic, inventory and the ability to hack the OS of the droids so you can reprogram them and turn them to your side.

The game is purely single player and no multiplayer options are planned. Arthur and his team want the focus to be on the storyline and character development. The base code is pretty much done but they are looking for people who can add characters, quests and levels in order to increase the game's length, because at the moment it only takes about six hours to finish the game.

Syslinux and the dynamic x86 boot process - H. Peter Anvin

By Hans Kwint

Syslinux is a lightweight dynamic bootloader which can be used to boot various operating systems in different ways. Peter developed it because, when working late into the night, he became incensed at not being able to boot something. Recently Syslinux gained gPXE support, and it can also boot over HTTP using Apache and CGI. Peter demonstrated this by booting his virtual machine in Brussels from an Apache server in California.

Syslinux has a modular design. It consists of the user interface, diagnostic tools, policies and filesystem modules. The filesystem modules can boot binary formats. For example, recently someone asked Peter to support the Microsoft SDI format. Peter said: "I didn't know that format, but it turned out to be some ramdisk using a Windows kernel". Syslinux uses a system called "shuffle": it is fed with parameters that define which parts are located where in a binary boot file. Because of this system, adding new formats is quite easy. Supporting the Microsoft SDI format only took 139 lines of code, most of which was error checking. Syslinux also comes with policy modules. For example, these can say "Boot kernel X when on a 32 bit system, boot kernel Y when on a 64 bit system, else boot kernel Z". These modules also enable some quite sophisticated uses. One of them is probing the PCI bus when booting, mapping the devices found to kernel modules, and building an initramfs with all the needed modules. The astonishing thing is that all of this can be done on the fly!
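To make the policy idea concrete, here is a minimal Python sketch of the two decisions described above: picking a kernel by CPU width and mapping detected PCI devices to kernel modules. Syslinux policy modules are not written in Python, so this is only a model of the logic. The function names, the kernel labels and the small device table are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Optional

# Illustrative table only: PCI "vendor:device" IDs mapped to Linux module names.
# A real policy module would use a much larger, generated table.
MODULE_FOR_DEVICE: Dict[str, str] = {
    "8086:100e": "e1000",
    "10ec:8139": "8139too",
}


def choose_kernel(cpu_bits: Optional[int]) -> str:
    """Kernel X on 32 bit, kernel Y on 64 bit, otherwise kernel Z."""
    if cpu_bits == 32:
        return "kernel-x"
    if cpu_bits == 64:
        return "kernel-y"
    return "kernel-z"


def modules_for(devices: Iterable[str]) -> List[str]:
    """Map probed PCI IDs to the modules an initramfs would need.

    Unknown devices are skipped and duplicates are collapsed.
    """
    needed = {MODULE_FOR_DEVICE[dev] for dev in devices if dev in MODULE_FOR_DEVICE}
    return sorted(needed)
```

In this sketch a boot would call `choose_kernel()` with whatever the CPU probe reports and pass the probed PCI IDs to `modules_for()`. Keeping the rules in a table separate from the loader code mirrors the modular split between policies and the rest of the bootloader.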

Syslinux is a work in progress. As a result of a historical error, much of the filesystem code has been written in assembly. Work is being done to rewrite these parts, among other things to make it easy to add support for btrfs in the near future. Also, Peter intends Syslinux to have a Lua interpreter in the future, so that simple-to-write Lua scripts can be used instead of the current modules. Lua was chosen because it is small and clean. Another point of attention is EFI support. For all of these things, Syslinux needs feedback from users and distributions. Currently it also lacks newbie-friendly documentation.

Lightning talk: Games done good - Steven Goodwin

By Sander Marechal

Steven Goodwin is the main developer of the SGX Engine, a 3D graphics engine for games, and the author of "Cross-Platform Game Programming" and "The Game Developers' Open Source Handbook". He started off by describing a fundamental disconnect between the people who write libraries for use in games and the game developers who actually write the games, which causes game developers to re-use far fewer libraries than most non-game projects. This is usually because the developers of the library make far too many assumptions and game developers are not going to try too hard to adapt. They simply try out a library and if it doesn't work they throw it out and try another one. If none of them work then they write it from scratch, again.

Library developers need to find common ground with the game developers, but what common ground is there? Steven gave some nice examples to show that there is far less common ground than expected. For example, the size of an int is different across platforms. The standard library is different across platforms. Even GCC is different across platforms with many options not being available everywhere and to make matters worse, people don't even agree on the definition of an "engine" or "object".

Libraries for games need to be built in a different way; everything needs to be abstracted and no assumptions must be made. For example, it is nearly impossible to write an input toolkit for games. What is a "click" when you're running on a Wii with four controllers connected? What is a cursor on a touch screen? In that same vein, a graphics engine should not expect to find a graphics card on a machine. Just think of a World of Warcraft server.

SGX has been built in a way that is truly cross platform. Everything is abstracted and organized into loosely coupled modules ranging from memory to CRC, math, geometry, graphics, physics, sound and more. It is used by various games (although Steven was not allowed to say which ones) and should run everywhere, from your PC to your Wii, XBox, server and hand-held device.

Ext4 - Theodore Ts'o

By Sander Marechal

Theodore started his presentation with an apology. He had prepared a very nice demonstration of ext4 on his laptop but it had been stolen at the train station. Luckily he did have a backup of his presentation, just not the demo.

He gave a quick overview of the ext3 filesystem and what is so good about it: ext3 is widely used and is pretty much the de facto Linux filesystem. It also has a very diverse developer community with contributors from all the major distributions. That is a bigger point than you would at first assume, because until recently Red Hat did not officially support the XFS filesystem and did not have any of its own developers working on it. JFS is a great filesystem, but the fact that pretty much all contributors are IBM employees has likely contributed to JFS' lack of success. Big distributions want someone who knows the ins and outs on their own team before they can support something as important as a filesystem, and ext3 developers are everywhere.

The ext3 filesystem has its fair share of problems which ext4 should fix. Currently ext3 filesystems can only be up to 16 TB in size and there is a limit of 32,000 subdirectories per directory. The resolution of the timestamps on files is only one second and there are performance problems. Ext4 fixes all these issues.

Ext4 is not a new filesystem, just like ext3 was not a new filesystem. Ext3 is just ext2 with a number of new features added, such as journaling, and ext4 is simply ext3 with even more features, such as extents. Google has even contributed a patch that you can use to mount ext2 filesystems as ext4. The reason: Google is still using ext2 because it doesn't believe in journaling. When something goes wrong on one of the machines at Google it is simply easier to wipe the system and re-flash it from another node than it is to recover it. But they did want to make use of some of the new features that ext4 adds.

Ext4 isn't all good news though. The new allocator that it uses is likely to expose old bugs in software running on top of it. With ext3, applications were pretty much guaranteed that data was actually written to disk about 5 seconds after a write call. This was never official but simply resulted from the way ext3 was implemented. With the new allocator used for ext4 this can take between 30 seconds and 5 minutes, or more if you are running in laptop mode. It exposes a lot of applications that forget to call fsync() to force data to the disk but nevertheless assume that it has been written. Two of the major culprits appear to be GNOME and KDE, which each write hundreds of dotfiles to a user's home directory. After a sudden crash of the machine all these files can appear to have vanished. Users think that the filesystem is to blame but in reality it is the applications.

The situation appears to be a bit tricky to solve because the last thing you want to do is call fsync() too often because that would force your hard drive out of power saving mode. One of the possible solutions under investigation is a sort of callback system whereby an application can be notified when data has actually reached the platters of your hard drive.
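For applications that do need a settings file to survive a crash, the missing fsync() call described above could look like the following Python sketch. Writing to a temporary file and renaming it over the target is a common companion pattern that was not part of the talk; it is added here as an assumption, and the function name `save_durably` is made up for this example.

```python
import os
import tempfile


def save_durably(path: str, data: bytes) -> None:
    """Replace the file at `path` with `data`, forcing the data to disk first."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays on
    # one filesystem. Note that mkstemp creates it readable by the owner only.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # ask the kernel to push it to the disk
        # Only now swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave a half-written temporary file behind.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This trades exactly what the paragraph above warns about: every call forces a disk write, so it is a poor fit for hundreds of small dotfiles on a laptop that is trying to keep its drive spun down.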

Automated web translation workflow for multilingual Drupal sites - Stany van Gelder

By Hans Kwint

As of Drupal 6, Connexion offers an automated web translation workflow (AWTW) module for Drupal. Of course there are machine translators, but normally these are not good enough. A human translator will still be necessary. However, much of the work that a human translator does can be automated. The AWTW module is meant to do just that; this process is called Computer Aided Translation, or simply CAT. The module is mainly aimed at automating repetitive tasks, but it also has to take local differences into account, such as websites for different countries that have different contact persons. Stany presented a demo of how these things can be filled in using the XML editor Déjavu. He also explained how AWTW can save several days in the translation process. In the future this module will also be able to map internal links, making sure that links in one translation point to the same topic in the same language.

CalDAV, the open groupware protocol - Helge Heß

By Sander Marechal

CalDAV is a relatively new standard that allows users to store and retrieve calendar events from a central server. Helge noted that CalDAV is just a transport; the actual data formats used, iCal and vCard, are much older. CalDAV is built on top of several well known technologies such as HTTP (REST-style), WebDAV and WebDAV-ACL. CalDAV itself is relatively simple, but the underpinnings of WebDAV and especially WebDAV-ACL can make it quite complex. That is why a new protocol called GroupDAV has emerged in the open source world [really, this talk should have been titled GroupDAV and not CalDAV -- Sander].

GroupDAV is a subset of CalDAV, CardDAV and WebDAV. Helge recommended that anyone who tries to implement CalDAV first implement proper GroupDAV support, because the two are completely compatible and any GroupDAV client is able to talk to a CalDAV server. Full CalDAV clients simply have a couple of extra functions like REPORT which make some types of queries easier to do. One of the more interesting design goals behind GroupDAV is that it is completely compatible with Apache + mod_dav. That means you do not need a special server to store your groupware data.

Helge then went into a more technical explanation of the protocols and how you can implement them, demonstrating various things with a simple command line client that showed the HTTP requests and responses between the server and client. He finished with an overview of existing server and client implementations. DaviCal (PHP) and CalendarServer (Python) are pretty complete CalDAV server implementations. On the client side the choice is wide. Evolution, Mozilla, Funambol, Mulberry, Chandler and even MS-Outlook (using the OpenConnector) are all able to speak CalDAV.
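As a rough illustration of the kind of HTTP request such a client sends, the Python sketch below builds a plain WebDAV PROPFIND request that asks for the ETag of every item in a collection. It is not taken from Helge's demo. The function name and the collection path used in the usage note are assumptions, and the request is only constructed, not sent.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Tuple

DAV_NS = "DAV:"  # the XML namespace used by WebDAV


def build_etag_listing_request(collection_path: str) -> Tuple[str, str, Dict[str, str], bytes]:
    """Build (method, path, headers, body) for listing a collection with ETags."""
    ET.register_namespace("D", DAV_NS)
    propfind = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(propfind, f"{{{DAV_NS}}}prop")
    ET.SubElement(prop, f"{{{DAV_NS}}}getetag")
    body = ET.tostring(propfind, encoding="utf-8", xml_declaration=True)
    headers = {
        # Depth 1 means: the collection itself plus its direct children.
        "Depth": "1",
        "Content-Type": 'application/xml; charset="utf-8"',
    }
    return "PROPFIND", collection_path, headers, body
```

A client could pass the result to `http.client.HTTPConnection.request(method, path, body=body, headers=headers)` for a hypothetical collection such as `/calendars/alice/`, then compare the returned ETags with its local copies to decide which items to fetch with a normal GET.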

Wrap up

By Sander Marechal

All in all I think FOSDEM 2009 has been a great success. The talks were great, the people friendly and the atmosphere was buzzing. The LXer editors would like to congratulate the FOSDEM 2009 staff on a job well done. We will certainly be there next year for FOSDEM 2010.

This article was originally posted on LXer Linux News.
