It seems that we have to think about scalability, and that what we have been told isn't quite true. Fast is the new small. People spend a lot of time making things faster and creating best practices in that field (such as the well-known Yahoo! Exceptional Performance team). A product has to have a lot of shiny features, to which you add a lot more every two months, and it should keep the velocity of its early days. One path is to apply the simple rule: divide and conquer. Split every feature into smaller ones; this way you'll be able to make it grow with Amdahl's law in mind.
A simple example: you just built a micro-blogging tool with your favorite framework and created a simple REST API using a specific page in the same application, let's say: /post/<user>/?message=<message>
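Just to fix ideas, here is a minimal, hedged sketch of such a page using only the Python standard library (the handler and the store_message() helper are made up for illustration, they are not the actual application):

# Minimal sketch of the /post/<user>/?message=<message> page.
# store_message() is a hypothetical stand-in for the real model code.
from wsgiref.simple_server import make_server
from urllib.parse import parse_qs

def store_message(user, message):
    print("storing %r for %s" % (message, user))  # placeholder for the shared model

def app(environ, start_response):
    path = environ.get('PATH_INFO', '')
    if path.startswith('/post/'):
        user = path.split('/')[2]
        message = parse_qs(environ.get('QUERY_STRING', '')).get('message', [''])[0]
        store_message(user, message)
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'ok']
    start_response('404 Not Found', [('Content-Type', 'text/plain')])
    return [b'not found']

if __name__ == '__main__':
    make_server('', 8000, app).serve_forever()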
It works, and you can share pieces of code between the front-end and the updater: the model, some validators and so on. Now imagine that your service is becoming very popular, so you add servers, put them behind a load balancer and use memcached (or Sharedance) to handle the users' sessions. But your API is becoming more popular than the front-end, 10 times more, like Twitter (lucky you). You cannot scale only the API part of your application because it's tied to the web one; you're screwed. And does a REST API really need something as big and powerful as Apache?
A single application should be split into smaller ones early; you can still run them all on the same server at the beginning. It makes maintenance much simpler, and it makes the dependencies between the pieces much clearer.
Then you can reduce the workload by postponing some tasks, or at least by not doing them during the same request. Simple things like resizing images (e.g. local.ch's binary pool, see the video presentation), sending notifications (email, Jabber/XMPP) or even writing into the database. Anything the user doesn't need to see immediately can be done later, and often should be. One way of doing this is to send a message over the wire and let a worker pick it up as soon as it has some free time (producer/consumer). There are many ways to do so; I recently discovered lwqueue (not something new; written in Perl, with wrappers for other languages), but you can also go with Amazon's SQS, its Erlang clone ESQS, …
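The transport doesn't matter much for the idea itself; here is a minimal in-process sketch of the producer/consumer pattern using only the standard library (a real setup would replace the Queue with one of the queue servers above, and send_notification() is a made-up example task):

# Minimal producer/consumer sketch: the request only enqueues work,
# a worker does it whenever it has free time.
import queue
import threading

jobs = queue.Queue()

def worker():
    while True:
        task, args = jobs.get()   # blocks until a job is available
        try:
            task(*args)           # e.g. resize an image, send a notification
        finally:
            jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def send_notification(user, text):
    print("notifying %s: %s" % (user, text))

# The web request returns immediately after this line.
jobs.put((send_notification, ("john", "Alice is now following you")))
jobs.join()                       # only for the demo: wait until the worker is done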
Your application is now in different pieces (e.g. front-end and API) and uses messaging to perform the big (or not-that-urgent) tasks; what remains? The data, because it takes time to write and read it, especially when things become huge. So I made a little test with SQLAlchemy on MySQL for a micro-shouting tool where the data is denormalized; it tries to be a first step towards what's called database sharding.
User-centric information is stored in a common database, but user-related information, in this case messages and relations (between users), is stored in a different database. From a user's point of view, only his own database is needed; everything he reads is there. A lot of data gets copied around, so it's a trade-off between speed and space: it uses much more space, but you can grow very big and still get reasonable read times because each read stays inside one small space. With this architecture, if someone has 30'000 followers (like Robert Scoble), the message is copied 30'000 times, but at read time you don't have a SQL query touching 30'000 users. I don't know how Twitter works, but identi.ca does sub-selects; a traditional (normalized) approach won't work once it gets (too) big, and your database will become the bottleneck.
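A hedged sketch of that routing idea (the database URLs and the user-to-shard rule are assumptions for illustration; my actual test ran on MySQL):

# Sketch of routing a user to "his" database (shard).
from sqlalchemy import create_engine

common = create_engine("sqlite:///common.db")   # user-centric data (accounts, profiles)
shards = {
    0: create_engine("sqlite:///db1.db"),       # messages + relations for some users
    1: create_engine("sqlite:///db2.db"),       # ... and for the others
}

def shard_for(user_id):
    """Every user has one home database holding everything he needs to read."""
    return shards[user_id % len(shards)]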
You can see in this little piece of code (output), apart from the fact that I am still a n00b with SQLAlchemy, that the write operations are complex because they have to be done in several places. E.g. with two users John(1) and Alice(2), who live in different databases db1 and db2, a subscription goes this way:
# John subscribes to Alice
>>> subscribe(john, alice)
# will run
INSERT INTO db1.relations (from_id, to_id) VALUES (1, 2);
INSERT INTO db2.relations (from_id, to_id) VALUES (1, 2);
The second statement (against db2) can be done later, because only John(1) cares about the result right away; Alice shouldn't notice the delay, if there is any.
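As a rough illustration only (the engines, the defer() helper and the table creation are assumptions, not the code from my test), a subscribe() along those lines could look like this:

# subscribe() writes to the follower's shard right away and defers the
# write to the followee's shard; defer() stands in for the message queue.
from sqlalchemy import create_engine, text

db1 = create_engine("sqlite:///db1.db")   # John's shard
db2 = create_engine("sqlite:///db2.db")   # Alice's shard

def defer(fn, *args):
    fn(*args)   # placeholder: in reality, push the job onto the queue instead

def insert_relation(engine, from_id, to_id):
    with engine.begin() as conn:
        conn.execute(text("CREATE TABLE IF NOT EXISTS relations (from_id INT, to_id INT)"))
        conn.execute(text("INSERT INTO relations (from_id, to_id) VALUES (:f, :t)"),
                     {"f": from_id, "t": to_id})

def subscribe(follower_id, follower_db, followee_id, followee_db):
    insert_relation(follower_db, follower_id, followee_id)          # John sees it now
    defer(insert_relation, followee_db, follower_id, followee_id)   # Alice's copy, later

subscribe(1, db1, 2, db2)   # John(1) subscribes to Alice(2)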
# Alice sends a message
>>> post(alice, "Hello World!")
# the SQL statements
INSERT INTO db2.messages (user_id, from_id, from_name, message) VALUES (2, 2, "Alice", "Hello World!");
# (can be done later)
# select the followers
SELECT from_id FROM db2.relations WHERE (to_id=2);
# only one follower, John(1)
INSERT INTO db1.messages (user_id, from_id, from_name, message) VALUES (1, 2, "Alice", "Hello World!");
Every single follower gets a carbon copy of the message. Posting a message looks quite complex, but reading John's messages is now trivial:
SELECT from_name, message FROM db1.messages WHERE user_id=1;
No joins, no sub-selects, no hassle at read time. It's optimized for reading. Writing takes a bit more time, even if most of the work can be postponed instead of being done when the user submits the data. The downside is that you spend a lot of time making sure the data stays coherent, but most of that isn't urgent.
It's only a test, and I'm not a DBA or a back-end guy, so it's a very simple essay with bugs, glitches and design issues; take it as is. This small experiment helped me understand how you can, for example, design an application for Google App Engine, where JOINs don't exist. And as you will see, SQLAlchemy's great features aren't much help here, because I don't know how to join models across schemas or to have multiple databases in the same schema.
I invite you to watch, if you didn't already, the video about how YouTube scaled so well and so fast.