I like to think of software systems as having three axes: performance, capacity, and scalability. While related, the axes are independent of one another. This is evident from systems that perform well but are unable to scale.
First, let's define the terms involved using a fictitious system that accepts requests to process images (resize, correct exposure, etc.).
Performance is something we are all acutely aware of if we've ever bitched at Microsoft Outlook's search functionality. Performance is how fast we can do something, typically measured in units of time. In our image processing system a baseline image of some fixed size and some fixed color depth may take 500 milliseconds (ms) to process under some fixed conditions (e.g., system has gone through warmup, VM has optimized the code, lazy caches have been loaded, etc.). That's our performance.
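As a rough sketch of how one might arrive at a number like that, the snippet below times a function only after a few warmup runs and reports the median. The `process_image` function is a hypothetical stand-in (it just burns some CPU), and the warmup and run counts are arbitrary assumptions, not recommendations.

```python
import statistics
import time


def measure_ms(fn, warmup=5, runs=20):
    """Median wall-clock time of fn() in milliseconds, ignoring warmup runs."""
    # Warmup: give caches, lazy initialization, etc. a chance to settle.
    for _ in range(warmup):
        fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000)
    return statistics.median(timings)


def process_image():
    # Hypothetical stand-in for real image processing work.
    sum(i * i for i in range(100_000))


baseline_ms = measure_ms(process_image)
print(f"baseline: {baseline_ms:.1f} ms")
```

The actual figure depends entirely on the machine and the workload; the point is only that the conditions are held fixed while measuring.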
Capacity is how much our system can handle subject to some service level agreement (SLA), which may be an internal SLA we provide to other teams in the company, not just an external-facing SLA to our customers. Say our system has an SLA that mandates 95 percent of all images be processed in under 1500 ms (tp95). Given this SLA our system can process 100 images per second (100 requests per second). More requests than this and our SLA will not be met. The result could be fatal errors (maybe we run out of memory and crash) or gradual degradation, typically the more desirable behavior. At 150 requests/second we can only meet an SLA of 2000 ms tp95; at 200 requests/second we can only meet an SLA of 2300 ms tp95. And so forth.
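To make the tp95 check concrete, here is a minimal sketch that computes a nearest-rank 95th percentile over a batch of latencies and compares it to the 1500 ms limit. The latency samples are made up for illustration, and nearest-rank is just one of several common percentile definitions.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_sla(latencies_ms, limit_ms=1500, pct=95):
    """True if the pct-th percentile latency is within the SLA limit."""
    return percentile(latencies_ms, pct) <= limit_ms


# Hypothetical samples: a few slow outliers are tolerated by a tp95 SLA...
within_capacity = [500] * 90 + [1200] * 6 + [2500] * 4
# ...but not once more than 5 percent of requests are slow.
over_capacity = [500] * 90 + [2500] * 10

print(percentile(within_capacity, 95), meets_sla(within_capacity))
print(percentile(over_capacity, 95), meets_sla(over_capacity))
```

Capacity is then the highest request rate at which a check like `meets_sla` still passes.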
Performance and capacity treat our system as fixed. Our system could consist of a single server or several servers--it's whatever we choose. We can vary our inputs or change our system configuration/state to get new performance and capacity numbers, but fundamentally the system is fixed.
Scalability, on the other hand, treats our system as malleable. Scalability is how our system responds when we add resources--servers, storage, etc. Some literature discusses scalability in terms of how the system behaves under various loads, but I prefer to view that as performance and capacity under different configurations/inputs. That keeps a clear separation: scalability involves the addition of resources to the system.
Scalability is a broader term than performance and capacity. Here are three examples of scalability.
Operational scalability. Say our system, currently consisting of 10 nodes, requires 2 IT guys to manage. Tomorrow, if we add 10 more nodes will it require 2 more IT guys to manage? If it does then we have horrible operational scalability. A system that can operationally scale should require the same number of people to manage 10 servers or 20 servers.
Capacity scalability. Given the same SLA as above, if we add more storage or more servers to the system, will our capacity increase, and if so, how? The design of some systems makes it impossible to increase capacity with additional resources. Our image processing system can scale in capacity fairly easily if image processing is self-contained. Add a new server and we can now guarantee an SLA of 120 req/sec @ 1500 ms tp95 compared to the previous 100 req/sec. On the other hand, if the first thing our system did was store the image in a single, monolithic, already maxed out database, we can add all the servers to the system we want and capacity will remain the same until we re-architect/re-design our system to get rid of that single database.
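A toy model of the two cases, assuming capacity grows linearly with server count until a shared component caps it. The figures of 5 servers at 20 req/sec each are assumptions chosen only to match the 100 and 120 req/sec numbers above; real systems rarely scale this cleanly.

```python
def capacity_rps(servers, per_server_rps, shared_limit_rps=None):
    """Requests/second sustainable within the SLA under a simple linear model."""
    scaled = servers * per_server_rps
    if shared_limit_rps is None:
        # Self-contained processing: every added server adds capacity.
        return scaled
    # A shared bottleneck (e.g. one maxed-out database) caps the total.
    return min(scaled, shared_limit_rps)


# Assumed: 5 servers at 20 req/sec each give the original 100 req/sec.
print(capacity_rps(5, 20))                        # self-contained, 5 servers
print(capacity_rps(6, 20))                        # self-contained, 6 servers
print(capacity_rps(6, 20, shared_limit_rps=100))  # capped by the database
print(capacity_rps(50, 20, shared_limit_rps=100)) # still capped
```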
Performance scalability. Will additional resources improve our system's performance? If our image processing algorithm is serial, no: an image will still require 500 ms to process. If our algorithm split each image into parts and farmed them out to multiple servers to be processed in parallel, adding servers would reduce our processing time.
Performance, capacity and scalability are related but not in any direct way. Think of all the different ways a system can be characterized:
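The difference can be sketched with a simple Amdahl's-law style model: only the part of the work that can be split benefits from more servers. The 50 ms / 450 ms split for the parallel algorithm is an assumed figure for illustration, and the model ignores coordination and network overhead.

```python
def processing_time_ms(serial_ms, parallel_ms, servers):
    """Time to process one image when only parallel_ms can be spread across servers."""
    if servers < 1:
        raise ValueError("need at least one server")
    return serial_ms + parallel_ms / servers


# Serial algorithm: all 500 ms is serial, so extra servers change nothing.
print(processing_time_ms(500, 0, 1), processing_time_ms(500, 0, 10))

# Assumed parallel algorithm: 50 ms to split/merge, 450 ms of divisible work.
for servers in (1, 3, 9):
    print(servers, processing_time_ms(50, 450, servers))
```

Even in the parallel case the serial portion sets a floor: no number of servers gets the image processed in under 50 ms in this model.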
- great performance but unable to scale
- decent performance and can scale
- bad performance and can't scale
- great performance but low capacity
- etc.
Often these dimensions are at odds with one another. Our image processor might be very fast, able to process images in 100 ms, but require excessive memory, reducing our capacity. Great performing systems might not be very scalable without a lot of work.