Yoan Blanc’s weblog

Another lost swiss guy


Comet web chat (with MochiWeb)

Yoan Blanc, Thu 15 May 2008

I have already played with Comet here, but mostly from the front-end perspective. This time, I’m diving into the back end. My previous examples were built with web.py, which is designed to run as a CGI and doesn’t fit that purpose: it has to do the polling on the server side instead of the client side, which only moves the problem around without solving it. Polling is bad. The server-side code has to work with events as they flow through the system, and the Comet call has to wait for an event rather than go looking for it (polling).

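I previously mentioned Twisted, the most famous Python library built on an event-based mechanism (called Reactor, which also exists for Ruby as EventMachine), but there are many more for Python: Orbited (built on libevent), Second Life’s Eventlet (built on Greenlets, which come from Stackless), Cogen, Kamaelia, and others.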

I didn’t try most of them because I didn’t find what I was looking for: a simple, scalable, many-to-many event system.

When Facebook released their online chat, I checked how they did it. They use neither Bayeux (or Orbited) nor Python, but MochiWeb.

Screenshot of Firebug showing a Facebook chat call.

MochiWeb is the Erlang HTTP toolkit that powers MochiAds (advertisements for casual Flash games; Desktop Tower Defense, you know). I don’t know whether it is as good as Yaws is (or was) when it was compared with Apache, but it can handle many (many) connections, hot-swap code, distribute over the network/web, etc. CouchDB now uses MochiWeb as well. Looking at other famous web chat systems: Meebo seems to use lighty and eBuddy uses Apache Coyote (Tomcat), but maybe they don’t do Comet?

So, let’s start by building a MochiWeb environment:

$ escript script/new_mochiweb.erl chat ~/src

The only files you’ll have to edit are:

src/chat_web.erl handles the incoming message in the loop. So a GET will wait for a message and a POST will send one. Generally Comet works with two connections, one that gets the messages from the server and the other that sends them. Let’s dive!

'POST' ->
  case Path of
    "chat" ->
      % extract the message from the POST body
      QueryString = Req:parse_post(),
      Message = proplists:get_value("message", QueryString),
      % send the message to the room
      Room ! {self(), Message},
      % response
      Req:ok({"text/plain", "ok"});

When we receive a request POST /chat the message is extracted from the body (message=…) and sent to the Room (using the bang !). Then an ack message is sent back. Now on the GET (Comet) side:

'GET' ->
  case Path of
    "chat" ->
      % 1) subscribing
      Room ! {self(), subscribe},
      % 2) waiting
      receive
        Message ->
          % 3) everything went right
          Req:ok({"text/plain", Message})
      after 10000 ->
        % 4) oops, too long buddy.
        Room ! {self(), unsubscribe},
        Req:not_found()
      end;

That means that when the server receives the HTTP call GET /chat, the handler will:

  1. subscribe to the room (so the room can send it any message);
  2. wait for a message (receive) from the Room for 10 seconds (10000 ms);
  3. if a message arrives, send it back in the response (Req:ok);
  4. if no message arrives, unsubscribe from the room and return an error response.
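
Stripped of the MochiWeb request object, the four steps above boil down to a receive with an after clause. Here is a minimal sketch of that idea; the module name wait_sketch and the function wait_for_message/2 are made up for illustration and are not part of the chat sources.

```erlang
-module(wait_sketch).
-export([wait_for_message/2]).

%% Hypothetical helper: subscribe to Room, then block until any message
%% arrives or Timeout (in milliseconds) elapses.
wait_for_message(Room, Timeout) ->
    %% 1) subscribing
    Room ! {self(), subscribe},
    %% 2) waiting
    receive
        Message ->
            %% 3) a message arrived
            {ok, Message}
    after Timeout ->
        %% 4) nothing arrived in time: leave the room
        Room ! {self(), unsubscribe},
        timeout
    end.
```

Calling wait_sketch:wait_for_message(Room, 10000) would block for up to ten seconds and return either {ok, Message} or timeout, which the HTTP handler can then map to Req:ok or Req:not_found.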

The room is a dispatcher: it collects the subscriptions and, once a message appears, dispatches it to every subscriber.

room(Subscribers) ->
  receive
    {From, subscribe} ->
      room([From | Subscribers]);

    {From, unsubscribe} ->
      room(Subscribers -- [From]);

    {From, Message} ->
      lists:foreach(fun(Subscriber) ->
          Subscriber ! Message
        end, Subscribers),
      room([])
  end.

This is how a process typically works in Erlang: a tail-recursive loop that runs each time it receives a message. There are three kinds of messages here:

  • subscribe, which prepends the subscriber to the list of subscribers;
  • unsubscribe, which removes it;
  • and any other message, which is broadcast to every subscriber (lists:foreach and !), after which the list of subscribers is emptied (so they don’t have to unsubscribe).
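
To try the dispatcher without any HTTP involved, the loop can be wrapped in a small module. This is only a sketch: the module name room_sketch and the start/0 and demo/0 helpers are my own additions for illustration; only room/1 comes from the code above.

```erlang
-module(room_sketch).
-export([start/0, room/1, demo/0]).

%% Hypothetical helper: spawn a room with no subscribers, return its pid.
start() ->
    spawn(?MODULE, room, [[]]).

%% The dispatcher loop from the post.
room(Subscribers) ->
    receive
        {From, subscribe} ->
            room([From | Subscribers]);
        {From, unsubscribe} ->
            room(Subscribers -- [From]);
        {_From, Message} ->
            %% broadcast, then start over with an empty list
            lists:foreach(fun(Subscriber) ->
                                  Subscriber ! Message
                          end, Subscribers),
            room([])
    end.

%% Hypothetical demo: subscribe the calling process, post one message,
%% and wait up to one second for the broadcast to come back.
demo() ->
    Room = start(),
    Room ! {self(), subscribe},
    Room ! {self(), <<"hello">>},
    receive
        Message -> {received, Message}
    after 1000 ->
        timeout
    end.
```

Because Erlang keeps the order of messages sent from one process to another, the room sees the subscription before the message, so the demo should get its own message back.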

Message passing between Erlang processes may remind you of Ada’s, but with lovely duck typing inside.

Et voilà. The remaining bits are to spawn the room under a “global” name (using register and whereis) so that you are always talking to the same room, and of course the JSON encoding with mochijson2, which makes the responses to Ajax calls yummy.

$ erl
1> c(mochijson2).
{ok, mochijson2}
2> list_to_binary(
2>   mochijson2:encode({
2>     struct, [
2>       {hello, <<"World !">>}
2>     ]
2>   })
2> ).
<<"{\"hello\":\"World !\"}">>

Erlang treats strings as lists; list_to_binary or <<...>> ensures you get the string as a binary and not as a list.
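
This distinction matters for JSON because, as far as I can tell, mochijson2 encodes a binary as a JSON string while a plain list becomes a JSON array. The two representations can be compared using nothing but built-in functions; the module name string_sketch is just for illustration.

```erlang
-module(string_sketch).
-export([demo/0]).

%% Each line is a pattern match that crashes if the claim is wrong.
demo() ->
    %% A double-quoted string is a list of character codes.
    true = is_list("chat"),
    [99, 104, 97, 116] = "chat",
    %% <<"...">> builds a binary directly.
    true = is_binary(<<"chat">>),
    false = is_list(<<"chat">>),
    %% list_to_binary/1 converts the list form into the binary form.
    <<"chat">> = list_to_binary("chat"),
    ok.
```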

I’m sad that I cannot put a live demo here, so there is an animation instead. It uses both Firefox and Opera because you would get a deadlock with two tabs in the same browser. And of course you can download the whole source at once: chat.tgz (261 kB).
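
For readers who would rather not open the archive, the one piece not shown in this post is how the room gets its “global” name. The following is a minimal sketch of the register and whereis approach mentioned earlier, not necessarily what chat.tgz does; the module name room_registry, the function get_room/0 and the registered name room are assumptions of mine.

```erlang
-module(room_registry).
-export([get_room/0, room/1]).

%% Hypothetical helper: return the pid of the shared room, spawning and
%% registering it under the atom `room` on first use.
get_room() ->
    case whereis(room) of
        undefined ->
            Pid = spawn(?MODULE, room, [[]]),
            try register(room, Pid) of
                true -> Pid
            catch
                error:badarg ->
                    %% Another process registered a room in the meantime:
                    %% drop ours and use the registered one.
                    exit(Pid, kill),
                    whereis(room)
            end;
        Pid ->
            Pid
    end.

%% The dispatcher loop from the post.
room(Subscribers) ->
    receive
        {From, subscribe} ->
            room([From | Subscribers]);
        {From, unsubscribe} ->
            room(Subscribers -- [From]);
        {_From, Message} ->
            lists:foreach(fun(Subscriber) ->
                                  Subscriber ! Message
                          end, Subscribers),
            room([])
    end.
```

With something like this in place, each request handler can start with Room = room_registry:get_room() and every request ends up talking to the same process.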

Next steps: the way the room is managed is pretty minimalist. Possible improvements include history and user information (joined, left), but what interests me most is how to distribute this across more than one server instance. I think I really have to look at OTP to manage the processes more globally, and at CouchDB or ErlyDB for a storage solution. I would enjoy getting any feedback you may have.

Looking back at my experiments with Comet (and web.py): even though it worked, I knew the chosen solution was not adequate, because I was moving the client-side polling (so-called Ajax) into the server, whereas the server should work in an event-driven way and wait until there is information to give to the client before acting. Polling, whether done on the client or on the server, seems bad to me because it is a hack. “At what interval should I check for new information?” is the Cornelian dilemma to which I have no answer, other than not polling at all.

So we have to look toward event-driven systems. Twisted has already been mentioned because it is built on an event principle (Reactor, also ported to Ruby); Orbited uses libevent, Second Life has its Eventlets (using Stackless’s Greenlets), and there are others: Cogen, Kamaelia, etc. Beyond the examples they ship with, I did not seem to find a simple way to get a many-to-many, simple and scalable event system.

And recently, Facebook opened its chat system to the world. It does not (yet) rely on an existing protocol (such as XMPP, better known as Jabber) but is simply Comet. Facebook uses a server named MochiWeb, developed by MochiMedia, which serves millions of advertisements (MochiAds) in Flash format, usually displayed while games such as Desktop Tower Defense are loading (that game is great). If it performs as well against Apache as Yaws does, that is promising.

The chat built here follows the basic principle of a Comet system: two channels. The first is a control channel that sends information (here, only messages), and the second is a listening channel (which, in this case, receives those messages). Translated into HTTP, that gives POST and GET (how well the world is made; one could argue for PUT instead of POST, but for now: Keep It Stupid Simple).

  • Sending a message delivers it to everyone who is listening.
  • Listening for a message means waiting until one is sent to us.

In the example, sending goes through a room to which listeners subscribe beforehand. My advice, if it counts as advice: dive into the code, it is small and simple.

Since I am not going to ask you to install Erlang for so little, a small animation (GIF) is also available.

Going further, the eternal question. As it stands, this system offers no back end and no memory, a bit like IRC (good as it is). So it can be a starting point for using Mnesia, CouchDB or even ErlyDB (and MySQL). Then there is distribution: how will a server of this kind exist within a network (such as the web) and share events and state across it? Maybe OTP has answers to that question.

Otherwise, it goes without saying that your opinions interest me.
