Posted on 04-09-2007
Filed Under (Programming, Nimboo) by Federico Feroldi on 04-09-2007

One of the most important characteristics of a modern web application is scalability. With the advent of the web 2.0 era, a web application can grow very quickly to millions of users, and almost all of the now-famous websites that experienced this rapid growth faced the same problem: how to design a web application that can serve anywhere from hundreds to hundreds of millions of user requests.

Some people would argue that one language is faster than another, or that one framework scales better than another, but the real answer is that there’s no silver bullet for scalability.
What you have to do instead is follow some simple rules while you write the application from the ground up (even if some people say that performance is not an issue until it becomes one).

Monolithic applications cannot scale

This is by far the most common issue in scaling. If you build a web application as one big block of code, a huge, monolithic, tightly coupled mixture of code and HTML, you will lose the ability to optimize each part separately and to distinguish what must run fast from what can run more slowly.

Like a modern manufacturing system, your application must be built from separate, loosely coupled modules. By keeping each module small and independent you can optimize the usage of limited resources like CPU and memory and get the best performance.

By using standard remote communication mechanisms (like SOAP or REST) or enterprise message buses (like ActiveMQ or XMPP servers) you can easily and transparently interconnect these modules while maintaining the ability to optimize and scale each module independently.
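As a rough illustration of this kind of decoupling, here is a minimal sketch in Python. It assumes an in-process `queue.Queue` as a stand-in for a real message bus such as ActiveMQ, and the names `web_module`, `worker_module` and the message fields are hypothetical. The point is only that the front end publishes a message and returns at once, while a separate consumer does the work at its own pace.

```python
import queue
import threading


def web_module(bus, job_id):
    """Front-end side: publish a message and return immediately."""
    bus.put({"type": "job", "job_id": job_id})
    return "202 Accepted"


def worker_module(bus, handled):
    """Back-end side: consume messages until a None sentinel arrives."""
    while True:
        message = bus.get()
        if message is None:
            break
        # The slow work would happen here, away from the request path.
        handled.append(message["job_id"])


def run_demo():
    bus = queue.Queue()  # stand-in for a real message bus (assumption)
    handled = []
    worker = threading.Thread(target=worker_module, args=(bus, handled))
    worker.start()

    # Three "requests" are answered without waiting for the worker.
    responses = [web_module(bus, job_id) for job_id in (1, 2, 3)]

    bus.put(None)  # tell the worker to stop
    worker.join()
    return responses, handled


if __name__ == "__main__":
    print(run_demo())
```

With a real broker the two functions would live in separate processes, possibly on separate machines, so each side could be scaled without touching the other.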

Some common features that can be separated from the main application are:

Administration panel
Many web applications need an administration panel to configure the application and manage users and data. Usually all this data is stored in a database that can be accessed and modified by a separate administration application.
Mail delivery
Many web applications handle mail delivery synchronously. This means that the HTTPD process has to wait for the SMTP server or the mail delivery program to complete. By keeping a shared mail queue (for example, a table in a MySQL database) the web request only has to enqueue the message, which takes very little time, and a separate process can manage or delay the actual delivery.
Batch data processing and statistics generation
This is a common source of load on backend databases in websites that show download/view counters for media items such as images or videos. A common approach is to update the “views” and “downloads” fields in the database on each user request. This puts the database under a huge load that can bring it to its knees during traffic peaks.
A better approach is to move the view/download counter updates to an asynchronous process that updates the database once in a while. This process can analyze the web server log to find how many times an item was viewed or downloaded. Another option involves the use of asynchronous message queues or shared memory caches such as memcached.
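The log-based approach to counters could be sketched roughly as follows, in Python. Everything specific here is an assumption made for illustration: the `/media/<id>/view` and `/media/<id>/download` URL scheme, the `media` table layout, and the use of an in-memory SQLite database in place of the real backend. The idea is that hits are tallied in memory and written to the database with one update per item and counter, instead of one update per request.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical URL scheme; only successful (200) requests are counted.
LOG_PATTERN = re.compile(r'"GET /media/(\d+)/(view|download) HTTP/[\d.]+" 200 ')


def count_hits(log_lines):
    """Tally successful view/download requests per media item."""
    hits = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match:
            hits[(int(match.group(1)), match.group(2))] += 1
    return hits


def apply_hits(conn, hits):
    """Write all accumulated counts in a single transaction."""
    with conn:  # commits on success, rolls back on error
        for (item_id, kind), n in hits.items():
            # The column name comes from a fixed pair, never from the log.
            column = "views" if kind == "view" else "downloads"
            conn.execute(
                f"UPDATE media SET {column} = {column} + ? WHERE id = ?",
                (n, item_id),
            )


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media (id INTEGER PRIMARY KEY, "
        "views INTEGER DEFAULT 0, downloads INTEGER DEFAULT 0)"
    )
    conn.executemany("INSERT INTO media (id) VALUES (?)", [(1,), (2,)])
    log_lines = [
        '10.0.0.1 - - [04/Sep/2007:10:00:00 +0200] "GET /media/1/view HTTP/1.1" 200 512',
        '10.0.0.2 - - [04/Sep/2007:10:00:01 +0200] "GET /media/1/view HTTP/1.1" 200 512',
        '10.0.0.3 - - [04/Sep/2007:10:00:02 +0200] "GET /media/2/download HTTP/1.1" 200 2048',
        '10.0.0.4 - - [04/Sep/2007:10:00:03 +0200] "GET /media/2/view HTTP/1.1" 404 0',
    ]
    apply_hits(conn, count_hits(log_lines))
    return conn.execute(
        "SELECT id, views, downloads FROM media ORDER BY id"
    ).fetchall()


if __name__ == "__main__":
    print(demo())
```

Run from a scheduled job against the log lines written since the previous run, a script like this keeps counter writes off the request path entirely, at the cost of counters that lag slightly behind real traffic.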

Comments

Carlettos on 5 September, 2007 at 4:31 pm #

The last two items are very interesting from a Rails development perspective. A simple cron job pointing to an “appropriate” task written for the application can save a lot of time (script/console?…). And IMHO Rails offers a nice environment to accomplish that.


Federico Feroldi on 5 September, 2007 at 4:39 pm #

Carlettos, yeah, Rails is very good at running parts of an application from the command line. But doing this still goes against the rule of application modularity, since you’re putting batch processing code into the main application and thus making it heavier.
Instead, by relying on ActiveRecord alone you can build very lightweight scripts that do just what you want without bringing up the whole application. :)

