I already played here with Comet, but mostly from the front-end perspective. This time, I'm diving into the back end. My previous examples were made with web.py, which is built to run as a CGI and doesn't fit that purpose: it has to do the polling on the server side instead of the client side, which only moves the problem around without solving it. Polling is bad. The server-side code has to work with events, and the Comet call has to wait for an event rather than go looking for it (polling).
I previously mentioned Twisted, the most famous Python library that uses an event-based mechanism (called the Reactor, which also exists for Ruby as EventMachine), but there are many more for Python:
- libevent (Orbited),
- Eventlet (from Second Life; it uses Greenlet, inspired by Stackless' tasklets),
- Cogen,
- Kamaelia, …
I didn't try most of them because I didn't find what I was looking for: a simple, scalable, many-to-many event system.
When Facebook released their online chat, I checked how they did it. They use neither Bayeux (or Orbited) nor Python, but MochiWeb.
Screenshot of Firebug showing a Facebook chat call.
MochiWeb is the Erlang HTTP toolkit that powers MochiAds (advertisements for casual Flash games; Desktop Tower Defense, you know). I don't know if it's as good as Yaws is (or was) when it was compared with Apache, but it can handle many (many) connections, hot-swap code, distribute over the network/web, etc. CouchDB is now using MochiWeb as well. Looking at other famous web chat systems: Meebo seems to use lighty and eBuddy Apache Coyote (Tomcat), but maybe they don't do Comet?
So, let's start by building a MochiWeb environment (start there):
$ escript script/new_mochiweb.erl chat ~/src
The only file you'll have to edit is src/chat_web.erl, which handles the incoming requests in its loop. A GET will wait for a message and a POST will send one. Generally, Comet works with two connections: one that gets the messages from the server and another that sends them. Let's dive!
    'POST' ->
        case Path of
            "chat" ->
                % extract the message from the POST body
                QueryString = Req:parse_post(),
                Message = proplists:get_value("message", QueryString),
                % send the message to the room
                Room ! {self(), Message},
                % response
                Req:ok({"text/plain", "ok"});
When we receive a POST /chat request, the message is extracted from the body (message=…) and sent to the Room (using the bang, !). Then an ack is sent back. Now on the GET (Comet) side:
    'GET' ->
        case Path of
            "chat" ->
                % 1) subscribing
                Room ! {self(), subscribe},
                % 2) waiting
                receive
                    Message ->
                        % 3) everything went right
                        Req:ok({"text/plain", Message})
                after 10000 ->
                    % 4) oops, too long buddy.
                    Room ! {self(), unsubscribe},
                    Req:not_found()
                end;
That means that when I receive the HTTP call GET /chat, I'll:
- subscribe to the room (so it can send me back any message);
- wait for a message (receive) from the Room for 10 seconds (10000 ms);
- if a message arrived, send it back (Req:ok);
- if no message arrived, unsubscribe from the room and return an error.
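The subscribe, wait and time-out steps can also be tried outside of MochiWeb. Here is a minimal sketch, assuming a hypothetical comet_wait module and wait_for_message/2 helper (neither is part of the original sources), that returns a value instead of writing an HTTP response:

```erlang
-module(comet_wait).
-export([wait_for_message/2]).

%% Hypothetical helper, not part of the original chat sources.
%% Subscribes the calling process to Room, then blocks until any
%% message arrives or Timeout (in milliseconds) elapses.
wait_for_message(Room, Timeout) ->
    Room ! {self(), subscribe},
    receive
        Message ->
            {ok, Message}
    after Timeout ->
        %% Nothing came in time: tell the room to forget about us.
        Room ! {self(), unsubscribe},
        timeout
    end.
```

In the web handler, {ok, Message} would map to Req:ok and timeout to the error response. Note that the receive accepts any message that lands in the caller's mailbox, which is fine for a short-lived request process but would need a tagged message in anything more serious.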
The room is a dispatcher that collects the subscriptions and, once a message appears, dispatches it to each subscriber.
    room(Subscribers) ->
        receive
            {From, subscribe} ->
                room([From | Subscribers]);
            {From, unsubscribe} ->
                room(Subscribers -- [From]);
            {_From, Message} ->
                lists:foreach(fun(Subscriber) -> Subscriber ! Message end, Subscribers),
                room([])
        end.
This is how a process works in Erlang: a tail-recursive loop that runs every time it receives a message. There are three kinds of messages here:
- subscribe, which prepends the subscriber to the list of subscribers;
- unsubscribe, which removes it;
- and a message, which is broadcast to every subscriber (lists:foreach and !) before the list of subscribers is emptied (so they don't have to unsubscribe).
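To see the three kinds of messages at work, here is a small self-contained sketch. The room_demo module, start/0 and demo/0 are names made up for this illustration; only the room loop comes from the post. The calling process subscribes itself, publishes once, then publishes again to show that the subscription was dropped after the first broadcast:

```erlang
-module(room_demo).
-export([start/0, room/1, demo/0]).

%% Spawn a room with no subscribers (hypothetical helper).
start() ->
    spawn(?MODULE, room, [[]]).

%% The dispatcher loop described above.
room(Subscribers) ->
    receive
        {From, subscribe} ->
            room([From | Subscribers]);
        {From, unsubscribe} ->
            room(Subscribers -- [From]);
        {_From, Message} ->
            lists:foreach(fun(Subscriber) -> Subscriber ! Message end,
                          Subscribers),
            room([])
    end.

%% Subscribe, publish, then publish again without re-subscribing.
demo() ->
    Room = start(),
    Room ! {self(), subscribe},
    Room ! {self(), <<"hello">>},
    First = receive M1 -> M1 after 1000 -> timeout end,
    %% The subscriber list was emptied, so this one reaches nobody.
    Room ! {self(), <<"again">>},
    Second = receive M2 -> M2 after 100 -> nothing end,
    {First, Second}.
```

Because all three messages are sent by the same process, the room receives them in order, so the first receive gets the broadcast and the second one times out. The room process is left running at the end, which is acceptable for a throwaway demo only.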
Erlang's messaging between processes may remind you of Ada's, but with lovely duck typing inside.
Et voilà. The remaining bits are to spawn the room under a "global" name (using register and whereis) so that you are always talking to the same room, and of course the JSON encoding using mochijson2 that makes Ajax call responses yummy.
    $ erl
    1> c(mochijson2).
    {ok, mochijson2}
    2> list_to_binary(
    2>   mochijson2:encode({
    2>     struct, [
    2>       {hello, <<"World !">>}
    2>     ]
    2>   })
    2> ).
    <<"{\"hello\":\"World !\"}">>
Erlang treats strings as lists; list_to_binary or <<...>> ensures you have the string as a binary and not as a list.
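The difference between the two representations is easy to check. A tiny sketch (the string_demo module and check/0 are made-up names for the illustration):

```erlang
-module(string_demo).
-export([check/0]).

%% Each match below crashes with badmatch if the claim is wrong.
check() ->
    %% A double-quoted string is just a list of character codes.
    true = ("abc" =:= [97, 98, 99]),
    true = is_list("abc"),
    %% list_to_binary/1 and the <<"...">> syntax both yield a binary.
    true = (list_to_binary("abc") =:= <<"abc">>),
    true = is_binary(<<"abc">>),
    ok.
```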
I'm sad that I cannot put a live test here, so there is an animation using both Firefox and Opera, because you'll get a deadlock with two tabs in the same browser. And you can of course download the whole source at once: chat.tgz (261 kB).
Next steps: the way the room is managed is pretty minimalist. Improvements could include history and user information (joined, left), but what really interests me the most is how to distribute it over more than one server instance. I think I really have to look at OTP to manage the processes more globally, and at CouchDB or ErlyDB for a storage solution. I would enjoy getting any feedback you may have.