About two months ago, Rasmus Lerdorf gave a talk at FrOSCon entitled “Simple is Hard”. If you don’t know about it yet, I recommend reading the slides before continuing with this article.
Rasmus showed how hard it is to stay simple when writing an application or a library, and explained that the main victims of complexity are scalability, performance and security. He used a good number of frameworks to prove his point, showing the performance overhead for a “Hello world!” application along with the graph of all the files included to serve such a simple request. Of course your application will be more than just a “Hello world!”, but this shows just how much the various frameworks load to do such a simple task.
I found the results to be frightening. So I looked at how simple our framework is compared to the others.
After firing up siege, I got a very bad result: only around 120 trans/sec when PHP alone did more than 4000 trans/sec. It wasn’t surprising, however, considering no performance optimization had ever been done in the project. With the project reaching beta soon, it’s time to change this and multiply the speed.
You can take a look at the inclued file from that day:
Firing up the XDebug profiler, we could observe this call graph:
It’s easy to see what takes time… The weeAutoload::addPath method takes 76% of the execution time, and the configuration file loading 12%. Both can be cached and are only subject to change in a development environment or when upgrading, so we can automatically cache them when DEBUG mode is off and ignore the cache otherwise.
The cache method used is pretty simple. The cache files have no expiration time; they’re just deleted when upgrading an installation to a new version. If something goes wrong, defining the global constant NO_CACHE makes the framework ignore the cache files. And in development, the cache is ignored as long as you have DEBUG enabled, which you should have all the time when developing.
The first time the script runs, the data (an array in both cases) is written to the cache file as valid PHP using var_export. Subsequent runs simply require this file to load the cache. To prevent security problems, the file mode is set to 0600, so that nobody other than the webserver’s user can access it.
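The caching scheme just described can be sketched roughly as follows. This is only an illustration under assumptions, not the framework’s actual code: the function name `loadCachedArray`, its parameters and the builder callback are made up for the example, and it is written for a recent PHP version.

```php
<?php
// Hypothetical helper illustrating the approach; not Web:Extend's actual code.
function loadCachedArray(string $cacheFile, callable $build, bool $useCache = true): array
{
    // Fast path: the cache file is plain PHP that returns the array.
    if ($useCache && is_file($cacheFile)) {
        return require $cacheFile;
    }

    // Slow path: compute the data (scan the autoload paths, parse the configuration).
    $data = $build();

    if ($useCache) {
        // var_export(..., true) returns the array as a parsable PHP literal.
        $code = '<?php return ' . var_export($data, true) . ";\n";
        file_put_contents($cacheFile, $code, LOCK_EX);
        // Restrict access to the file's owner (the web server's user).
        chmod($cacheFile, 0600);
    }

    return $data;
}
```

In the setup described above, `$useCache` would be false whenever DEBUG is enabled or NO_CACHE is defined, so the slow path always runs in development.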
Using siege today I get these results:
    essen@karen (0) % siege -c 5 "http://localhost/wee/trunk/" -b -t30s > /tmp/wee-today
    ** SIEGE 2.66
    ** Preparing 5 concurrent users for battle.
    The server is now under siege...
    Lifting the server siege... done.
    Transactions: 40995 hits
    Availability: 100.00 %
    Elapsed time: 30.16 secs
    Data transferred: 3.44 MB
    Response time: 0.00 secs
    Transaction rate: 1359.25 trans/sec
    Throughput: 0.11 MB/sec
    Concurrency: 4.97
    Successful transactions: 40995
    Failed transactions: 0
    Longest transaction: 0.04
    Shortest transaction: 0.00
Not bad at all, considering all we did really was to cache the autoload and configuration data automatically.
The inclued graph looks like this on the first load:
And like this when the cache files have been generated:
As you can probably guess, we won’t be able to improve much more on this part of Web:Extend.
I compiled a table with results from a few other frameworks, using the same method Rasmus used in his presentation. There are probably a few frameworks missing, because they’re too complex, because they require me to set up a database, or because I missed them. If you would like one added to this table, feel free to give me its name and instructions on how to write a hello world page with it.
For those who really think a “Hello world!” proves nothing, I also added an example of Web:Extend with a few queries for comparison. The queries include an INSERT statement recording the hits in a table, along with an UPDATE statement and a SELECT statement, both working on a 100K-row table. The UPDATE sets one column of one row to NOW() while the SELECT randomly selects 25 rows, which are then displayed in the template. None of these are prepared statements, so this example could probably be made faster.
| Framework | without APC (trans/sec) | with APC, apc.stat=1 (trans/sec) | with APC, apc.stat=0 (trans/sec) | % of plain PHP (with APC, apc.stat=0) |
|---|---|---|---|---|
| PHP | 3965.53 | 4019.18 | 4018.94 | 100% |
| Akelos | 20.28 | 52.40 | 52.47 | 1.3% |
| CakePHP | 49.26 | 183.26 | 181.70 | 4.5% |
| CodeIgniter | 224.36 | 903.99 | 940.07 | 23.4% |
| Kohana | 198.60 | 596.21 | 609.00 | 15.2% |
| Web:Extend | 454.13 | 1359.25 | 1344.81 | 33.5% |
| Web:Extend w/ queries | 223.97 | 426.77 | 418.61 | N/A |
| Zend Framework | 53.03 | 182.11 | 234.53 | 5.8% |
Tests were run on Ubuntu, with Ubuntu’s default PHP configuration file.
A few notes about ZF: since it’s the only framework requiring the setting of include_path, it was tested separately with its library path configured specifically for these tests. The include_path started with the path to the library. Other tests didn’t have any modification of the configuration.
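For reference, the efficiency column in the table is each framework’s transaction rate with apc.stat=0 divided by plain PHP’s rate in the same column. The short sketch below redoes that arithmetic with the table’s own numbers; the function name `efficiency` is only illustrative.

```php
<?php
// Transaction rates (trans/sec) with APC and apc.stat=0, copied from the table.
$baseline = 4018.94; // plain PHP
$rates = [
    'Akelos'         => 52.47,
    'CakePHP'        => 181.70,
    'CodeIgniter'    => 940.07,
    'Kohana'         => 609.00,
    'Web:Extend'     => 1344.81,
    'Zend Framework' => 234.53,
];

// Share of plain PHP's throughput that a framework retains, in percent.
function efficiency(float $rate, float $baseline): float
{
    return round($rate / $baseline * 100, 1);
}

foreach ($rates as $name => $rate) {
    printf("%-15s %5.1f%%\n", $name, efficiency($rate, $baseline));
}
```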
I was kinda surprised to see Kohana being so slow compared to CodeIgniter. In case you didn’t know, Kohana is based on CodeIgniter. Maybe there’s a reason? If you know something that could make Kohana slower than CodeIgniter, tell me and I can run the test one more time.
As you can see, there’s a huge difference in performance today between Web:Extend and the other frameworks. Performance-wise, if you’re running PHP5, Web:Extend is your best bet. If you’re still running PHP4, CodeIgniter will get the job done.
There are still a few things we want to improve in our framework in the next few weeks.
There are currently three application modules: output, database and sessions. We already do a lazy start of the output module, but not of the two others. This means that the application will always connect to the database, even when no queries are performed, and that it will always start the session, even when you don’t need it. Unless of course you didn’t activate these modules in the configuration file at all.
With lazy-loading of the modules, we’ll be able to connect to the database only when weeApp()->db is called. Or start the session only when weeApp()->session is called. The modules won’t be initialized at all if they’re not required for the current page.
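One possible way to get this behaviour is PHP’s `__get` magic method, which only runs when a property that isn’t declared is accessed. The sketch below is an assumption about how it could look, not the framework’s implementation: the class `LazyApp` and its `register` and `isStarted` methods are invented for the example.

```php
<?php
// Hypothetical stand-in for the application object; not Web:Extend's real class.
class LazyApp
{
    private $factories = []; // module name => callable that creates the module
    private $modules = [];   // module name => module instance, once started

    public function register(string $name, callable $factory): void
    {
        $this->factories[$name] = $factory;
    }

    // Called on $app->db, $app->session, etc., because no such property is declared.
    public function __get($name)
    {
        if (!array_key_exists($name, $this->modules)) {
            if (!isset($this->factories[$name])) {
                throw new OutOfBoundsException("Module '$name' is not configured");
            }
            // First access: connect to the database, start the session, and so on.
            $this->modules[$name] = ($this->factories[$name])();
        }
        return $this->modules[$name];
    }

    public function isStarted(string $name): bool
    {
        return array_key_exists($name, $this->modules);
    }
}
```

A page that never touches `$app->db` never runs the database factory, so no connection is opened for it.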
Currently the framework generates the templates after the request handling has ended. Sometimes you might have some heavy operations to do, like logging, saving a search result, or doing some maintenance on the database that the user doesn’t care about. We’ll make it possible to send the output to the browser as soon as possible and perform these other tasks without making the user wait.
Jayson Minard wrote about this recently at devzone: Do Not Make Users Wait for Things They Do Not Care About.
It might also be interesting to send parts of the page early, like the header of the HTML page.
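As a rough sketch of the “send first, work later” idea: queue the slow tasks, send the page, then run the queue. The class `DeferredTasks` and its methods are hypothetical. `fastcgi_finish_request()` is a real PHP function, but it only exists under PHP-FPM, so the sketch falls back to `flush()` elsewhere, which pushes the output out without closing the connection.

```php
<?php
// Hypothetical helper; the names are made up for this example.
class DeferredTasks
{
    private $tasks = [];

    // Queue work the user does not need to wait for (logging, maintenance).
    public function defer(callable $task): void
    {
        $this->tasks[] = $task;
    }

    // Send the page first, then run the queued tasks in order.
    public function sendThenRun(string $body): void
    {
        echo $body;

        if (function_exists('fastcgi_finish_request')) {
            // PHP-FPM only: finishes the response so the client stops waiting.
            fastcgi_finish_request();
        } else {
            // Other SAPIs: flush what we can; the connection stays open.
            flush();
        }

        foreach ($this->tasks as $task) {
            $task();
        }
        $this->tasks = [];
    }
}
```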
It might be possible to send cached pages directly from the index.php file, which would be really fast but would not be a good choice for all types of applications. We haven’t decided on this yet. The pros are that the file would be sent very fast, without loading the framework: it would be almost as fast as fetching an HTML file directly. The cons are that it won’t be usable for anything other than simple pages that are common to all users. This would be a good choice for brochure websites or any page that doesn’t require user input. We will probably implement it, since our current cache is already “restricted” to common pages. And for other pages the best solution is to use APC or memcache anyway.
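A minimal sketch of what such a shortcut in index.php could look like, assuming cached pages are stored as HTML files named after a hash of the request URI. The function name, the cache directory layout and the `bootstrap.php` file are all assumptions made for the example.

```php
<?php
// Hypothetical front-controller helper; not part of Web:Extend.
function serveCachedPage(string $cacheDir, string $uri): bool
{
    // One file per URI; hashing avoids unsafe characters in the file name.
    $file = $cacheDir . '/' . sha1($uri) . '.html';

    if (!is_file($file)) {
        return false; // Cache miss: the caller loads the framework as usual.
    }

    // Cache hit: stream the stored HTML without loading anything else.
    readfile($file);
    return true;
}

// Possible use at the very top of index.php (paths are placeholders):
// if (!serveCachedPage(__DIR__ . '/cache/pages', $_SERVER['REQUEST_URI'])) {
//     require __DIR__ . '/bootstrap.php';
// }
```

Because the key is only the URI, this is safe only for pages that look the same for every visitor, which matches the restriction discussed above.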
We also want to allow the developer to cache the results of a query without having to worry about the caching mechanism used (be it APC, memcache or anything else). The framework would choose the best one available on the system and provide a standard interface directly in the model classes. For example, instead of $this->query you could use $this->queryCache to cache the query automatically, with an optional timeout value passed as a parameter. You could also use $this->deleteCache to delete the cache corresponding to the given parameters. It could be all the cache for the current model, or only part of it. We would of course provide an option to let you choose the caching mechanism in case the framework doesn’t pick the one you want.
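To make the idea more concrete, here is one way such an interface could be laid out. Only the names `queryCache` and `deleteCache` come from the text above; `CacheBackend`, `ArrayBackend`, `CachedModel` and all signatures are assumptions. The in-memory backend stands in for APC or memcache, and `deleteCache` here only handles the “all the cache for the current model” case.

```php
<?php
// Hypothetical backend interface; an APC or memcache backend would implement it.
interface CacheBackend
{
    public function get(string $key);
    public function set(string $key, $value, int $ttl): void;
    public function deletePrefix(string $prefix): void;
}

// In-memory stand-in, good enough to demonstrate the behaviour.
class ArrayBackend implements CacheBackend
{
    private $items = [];

    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        [$value, $expires] = $this->items[$key];
        if ($expires !== 0 && $expires <= time()) {
            unset($this->items[$key]);
            return null;
        }
        return $value;
    }

    public function set(string $key, $value, int $ttl): void
    {
        // A TTL of 0 means the entry never expires.
        $this->items[$key] = [$value, $ttl > 0 ? time() + $ttl : 0];
    }

    public function deletePrefix(string $prefix): void
    {
        foreach (array_keys($this->items) as $key) {
            if (strpos($key, $prefix) === 0) {
                unset($this->items[$key]);
            }
        }
    }
}

abstract class CachedModel
{
    private $cache;

    public function __construct(CacheBackend $cache)
    {
        $this->cache = $cache;
    }

    // Runs the query against the database; left to the concrete model.
    abstract protected function query(string $sql): array;

    // Same as query(), but stores and reuses the result for $ttl seconds.
    public function queryCache(string $sql, int $ttl = 0): array
    {
        $key = static::class . ':' . sha1($sql);
        $rows = $this->cache->get($key);
        if ($rows === null) {
            $rows = $this->query($sql);
            $this->cache->set($key, $rows, $ttl);
        }
        return $rows;
    }

    // Drops every cached result belonging to this model class.
    public function deleteCache(): void
    {
        $this->cache->deletePrefix(static::class . ':');
    }
}
```

Prefixing every key with the model’s class name is what lets `deleteCache` clear one model’s entries without touching the others.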
We saw a few days ago how we could analyze all our SELECT queries in one script. This is functionality our framework offers today, but it is not enough. We’re still missing a data generator that can handle relational data generation (if you know a good one we could use, please leave a comment!); a built-in profiler to test queries and functions; and a tool to create reports based on the data generated by the query analysis and the overall benchmark. As you already saw, the analysis of the SELECT queries returns a PHP array. This wasn’t out of laziness: we intend to give you a nice report of these analyses, complete with emphasis on the problematic values (at least for MySQL and PgSQL, anyway).
To quote Wikipedia: “Simplicity is the property, condition, or quality of being simple or un-combined. It often denotes beauty, purity or clarity. Simple things are usually easier to explain and understand than complicated ones. Simplicity can mean freedom from hardship, effort or confusion.”
Code simplicity can be achieved by following a few rules:
And if you ever need guidance in the path of simplicity, ask this good man.
Thanks for reading.
Disclaimer: If you feel like exploring the framework then have fun, but remember that the documentation is still lacking and that some parts of the API may change before we reach beta. We advise you to wait for the beta release before trying to use it, unless you feel adventurous or just plain curious.
November 6th, 2008 by Loïc Hoguin · Tags: Optimization, Simplicity, Web:Extend
November 9th, 2008 at 2:11 pm
That’s because frameworks don’t scale.
DUH!
November 9th, 2008 at 9:33 pm
Just because most don’t scale doesn’t mean all frameworks can’t scale. The problem with most frameworks is that they load too much by default, or that they have too many internal dependencies. If you get rid of these problems, you’ll find that frameworks perform almost as well as a simple custom MVC library and a database abstraction layer. Except frameworks will have other functionality, like easy-to-use model caching, which will actually make them faster in a real-world application. More on that at a later time. :-)
December 19th, 2008 at 8:56 am
A good framework scales and is easily extensible from its tiny core. Software that does not use any framework cannot scale to provide more functionality, and can offer good performance only at a simple task.