Brief Comparison of Servers and Frameworks

Thursday February 12th, 2009

Pages: 1 2 3 4

This article has been deprecated because the methodology behind its data acquisition is fundamentally flawed (at the time, I didn’t have multiple machines to test on; as a consequence, the benchmark client ran on the same machine as the servers it was testing). You should not cite this article as authoritative, but it may give you a foundation for further research.

I’ve been working with a couple of Python-based frameworks for web applications recently and have elected to try my hand at writing my own application server in Stackless Python due to its simplified concurrency model (threading in Python also appears to have performance-related issues compared to languages that support it natively, like Java). My efforts haven’t gotten much further than creating a basic skeleton so far, and it certainly seems as though I’ve learned much more about Python’s internals than I ever wanted to know! Performance has been a fairly significant concern up front, and I believe it’s necessary to examine how your own software performs early in the development process rather than waiting until the design has solidified. Once you’ve been bitten by a fundamental flaw or serious bottleneck on which your entire design relies, it’s very difficult to alter the application’s behavior without significant refactoring. While I’ve heard it said that it’s sometimes better to write the application first and worry about performance later, I have never had such luck. It seems easier to write the application correctly the first time, in terms of developer time spent both now and later on maintenance.

However, there is one benefit I didn’t predict from this exercise: This experience has also afforded me an opportunity to compare different frameworks and servers for performance and behavior under load.

The benchmarks in this post are unscientific and were not performed under controlled conditions. The data were obtained from my development machine, which operates as a general purpose file server, test server, and software development host, so it can see variable loads that interfere with network performance and CPU availability. It’s also an older machine and is certainly not representative of modern multi-core servers. Furthermore, these benchmarks are based on static files served up from the file system by the framework or the server (this isn’t always the case; more on that later). No processing is performed beyond what is strictly needed to serve a file, and it’s important to keep in mind that the parts of certain application servers, like Tomcat, that make them so attractive have been intentionally ignored. Again: This test only examines each piece of software’s capability to serve static, not dynamic, content.

Hardware Specs

My development machine (Sagittarius) is an older Pentium 4, clocked at 2.4GHz with hyperthreading enabled, with 512MiB of 400MHz DDR on a D865-based motherboard and a couple of mixed hard drives (Seagate and Maxtor), running ext3 and reiserfs under Gentoo. Software tested includes Apache 2.2.8 (2.2.10 is current as of today), Apache Tomcat 6.0.16 (6.0.18-r2 is current), mod_jk 1.2.25 (1.2.26-r1 is current), PHP 5.2.6-rc4 (5.2.8-r2 is current), Python 2.5.2-r7 for the base Python tests (Gentoo lags a bit behind Python development since updates to the language tend to break Portage; 2.6.1 and 3.0 are current as of this writing), Pylons 0.9.6.2 (0.9.7-rc6 is current), and Stackless Python 2.5.2 – 3.1b3 for the Stackless tests (though Stackless builds are currently available for Python 2.6.1).

The Benchmarks

I conducted these benchmarks with ab, the Apache benchmark utility. There were two rounds of benchmarks: The first issued 2000 requests with a concurrency of 20, and the second issued 500 requests sequentially (a concurrency of 1). The options for each were as follows:

ab -c 20 -n 2000 $url
ab -c 1 -n 500 $url
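
For anyone wanting to repeat these rounds from a script, the following is a minimal sketch (not the tooling used for this article) that builds the same ab command lines and pulls the mean "Requests per second" figure out of ab's report. It assumes a modern Python 3 and that ab is on the PATH; the function names are my own invention for illustration.

```python
import re
import subprocess

# Matches the summary line ab prints, e.g.
# "Requests per second:    1234.56 [#/sec] (mean)"
_RPS_LINE = re.compile(r"^Requests per second:\s+([0-9.]+)", re.MULTILINE)


def build_ab_command(url, requests, concurrency):
    """Return the argv list for one ab run, using the -c and -n flags shown above."""
    return ["ab", "-c", str(concurrency), "-n", str(requests), url]


def parse_requests_per_second(ab_output):
    """Extract the mean requests-per-second figure from ab's text report."""
    match = _RPS_LINE.search(ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found in ab output")
    return float(match.group(1))


def run_round(url, requests, concurrency):
    """Run ab once (requires ab to be installed) and return requests per second."""
    result = subprocess.run(
        build_ab_command(url, requests, concurrency),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_requests_per_second(result.stdout)
```

With a hypothetical URL, the two rounds would be run_round("http://localhost:8080/index.html", 2000, 20) and run_round("http://localhost:8080/index.html", 500, 1).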

Ten services and frameworks in all were tested:

Stackless with a tasklet accept() server behind a WSGI proxy (my test framework)
Stackless + Python’s BaseHTTP server
Base Python using a select() server behind a WSGI proxy (reference implementation for my test framework)
Base Python + Python’s BaseHTTP server
Base Python + Pylons
PHP Passthrough (using fopen(); reading a file and printing it to the server)
Base Apache
Base Apache (.htaccess disabled)
Tomcat 6 + mod_jk using AJP 1.3
Base Tomcat 6
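
To give a rough idea of what "serving a static file through WSGI" amounts to in the Python entries above, here is a minimal sketch of a WSGI application that returns one file's contents. It is not my test framework and not the code that was benchmarked; it assumes a modern Python 3, and make_static_app is a hypothetical name chosen for illustration.

```python
def make_static_app(path, content_type="text/html"):
    """Build a WSGI app that answers every request with the file at `path`."""

    def app(environ, start_response):
        try:
            # Read the whole file per request; no caching, no extra processing.
            with open(path, "rb") as handle:
                body = handle.read()
        except OSError:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found\n"]
        start_response(
            "200 OK",
            [
                ("Content-Type", content_type),
                ("Content-Length", str(len(body))),
            ],
        )
        return [body]

    return app
```

To try it, the standard library's wsgiref.simple_server.make_server("127.0.0.1", 8000, make_static_app("index.html")).serve_forever() will host it on a hypothetical port 8000, though that reference server is single-threaded and not meant for load.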

Each test was performed six times to reduce the impact of fstat or other caching and to ensure each service would be represented fairly. On my particular setup, I was unable to unload all Apache modules not directly related to the file system (such as mod_jk, mod_suexec, and so forth) due to the sheer volume of test domains it hosts, and I certainly wasn’t about to disable each for a single test! Thus, it is important to consider that this benchmark is not necessarily representative of a bare Apache install but rather of one that contains a few supporting modules for PHP, Python integration, and so forth. However, I did include a separate test with .htaccess support disabled, as it is fairly well known that enabling support for .htaccess incurs a significant performance penalty. Interestingly, Apache Tomcat was the clear winner even though it is intended as a servlet container, NOT a simple webserver. These results aren’t particularly surprising.
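
Since each service was run several times, one simple way to compare services is by the mean and spread of their runs. The sketch below is only an illustration of that idea under my own assumptions (modern Python 3, one requests-per-second figure per run); it is not how the graphs in this article were produced, and summarize_runs is a hypothetical helper.

```python
import statistics


def summarize_runs(requests_per_second):
    """Return (mean, sample standard deviation) for repeated runs of one service."""
    if len(requests_per_second) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return (
        statistics.mean(requests_per_second),
        statistics.stdev(requests_per_second),
    )
```

A large standard deviation relative to the mean would suggest that background load on the machine, rather than the server under test, dominated a given set of runs.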


***

2 Responses to “Brief Comparison of Servers and Frameworks”

Grimblast writes:

O_O

Wow. Glad you did that rather than me. Would have took me forever to screw around with this.

Benjamin writes:

You’d be surprised! I’ve been working on collecting data for this benchmark off and on for a while. I do need to change the graphs, though. With as many lines as there are, I’m afraid it’s getting cluttered. Displaying a comparison of mean values as bars might be easier on the eyes!

The results are still very interesting. It’s just a shame that some of the standard library stuff works so poorly. Though, it isn’t surprising; much of that is intended as example code.
