Speeding up Web Applications: Part I
It is often interesting to hear the different responses and solutions to the problem of a sluggish website. Responses range from adding new hardware (RAM, I/O, CPU) and application servers, to jumping on the latest lightweight web server (Nginx, lighttpd), to caching solutions (memcached, boost), all the way to “outsourcing” the serving of images to other services like Amazon Cloud or full-text search to Apache Solr. While it is entirely possible that these can contribute to speeding things up, more often than not they result in insignificant gains and a bigger system administration headache. Inexperienced system administrators and consultants stare at the output of `nload`, `tail -f` their log files, and boast of the enormous number of hits they get from bots, users and “attackers”, when quite often these observations contribute little to any meaningful discovery phase.
Where do people get these ideas? We live in an age where hardware is abundant and cheap. The hardware age has trained us to stop being scientific in our discovery methods, and it has spoiled developers, who can now get away with writing inefficient code. Big numbers dazzle the inexperienced, who boast about the size of their server farms, how many hits they get, and their use of enterprise tools to keep things in check. Management sets hard deadlines and has far less respect for the art of optimization than for getting things done quickly. After all, why be scientific if I can throw another server at it or install a prepackaged solution that will move things along a bit? At the same time, the internet is filled with experts, trends, and advice that is often misapplied, misinterpreted, and adopted without good reason. People are always anxious to jump on the next bandwagon and use the latest buzzwords in the hope that what solved someone else’s problem will solve theirs.
Unfortunately, there is a lot to system optimization, and not all of it can be described in full here. In my experience these are the common culprits:
- Poor development practices
  - Database queries are inefficient.
  - Poor choice of algorithms & data structures.
  - Bloated frameworks.
  - Hit or miss caching: caching the wrong things but not caching the right things.
- Bad configuration
  - Apache / MySQL / PHP – left to defaults.
  - Wrong hardware choices/configuration for particular applications.
  - Using swap unnecessarily.
  - Using the same interface for public and local network traffic.
  - Network services doing stupid things (DNS lookups for local hosts, etc.)
  - The dreaded black box – commercial server appliances.
  - Overly complicated network topology.
- Lack of common sense
  - Website is full of large images/media.
  - Storing images in a database.
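As a small illustration of the "poor choice of algorithms & data structures" item, here is a minimal Python sketch that measures rather than guesses. The function names and the sample data are made up for this example and do not come from any particular application. Both functions return the same count, but the first scans a list on every lookup while the second builds a set once and uses hash lookups.

```python
import timeit


def count_known_list(requests, known_ids):
    # known_ids is a list, so every `in` check is a linear scan.
    return sum(1 for r in requests if r in known_ids)


def count_known_set(requests, known_ids):
    # Build a set once; each `in` check is then a hash lookup.
    known = set(known_ids)
    return sum(1 for r in requests if r in known)


if __name__ == "__main__":
    # Hypothetical data: 2000 known ids, 2000 incoming request ids.
    known_ids = list(range(2000))
    requests = list(range(0, 4000, 2))

    for fn in (count_known_list, count_known_set):
        elapsed = timeit.timeit(lambda: fn(requests, known_ids), number=3)
        print(f"{fn.__name__}: {elapsed:.4f}s for 3 runs")
```

The absolute timings depend on the machine, so none are quoted here. The point is the habit: time the suspect code path before reaching for more hardware.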
In future posts we’ll look into the methods of discovering these bottlenecks and how to go about addressing them.