This is the first of a three-part article by Alex Podelko on performance testing. For the last seventeen years, Alex Podelko has worked as a performance engineer and architect for several companies. Currently he is a Consulting Member of Technical Staff at Oracle, responsible for performance testing and optimization of Enterprise Performance Management and Business Intelligence (a.k.a. Hyperion) products.
Alex periodically talks and writes about performance-related topics, advocating tearing down silo walls between different groups of performance professionals. His collection of performance-related links and documents (including his recent papers and presentations) can be found at www.alexanderpodelko.com. He blogs at http://alexanderpodelko.com/blog and can be found on Twitter as @apodelko. Alex currently serves as a director for the Computer Measurement Group (CMG, http://cmg.org), an organization of performance and capacity planning professionals.
Load testing is an important part of the performance engineering process. It remains the main way to ensure appropriate performance and reliability in production. Still, it is important to see the bigger picture beyond stereotypical last-moment load testing. There are different ways to create load; a single approach may not work in all situations. Many tools allow you to use different ways of recording/playback and programming. This paper discusses the pros and cons of each approach, when it can be used, and which tool features are needed to support it.
Load Testing As Part Of The Performance Engineering Process : Recent trends of agile development, continuous deployment, DevOps, and web operations somewhat call the importance of load testing into question. Some (though not many) openly say that they don't need load testing; others still pay lip service to it but just never get to it. In the more traditional corporate world we still see performance testing groups, and important systems usually get load tested before deployment.
Let’s first define load testing, as terminology here is rather vague. The term is used here for everything that requires applying multi-user synthetic load. Many different terms are used for this kind of multi-user testing, such as performance, concurrency, stress, scalability, endurance, longevity, soak, stability, reliability, etc. There are different (and sometimes conflicting) definitions of these terms. Mostly these terms describe testing from somewhat different points of view, so they are not mutually exclusive.
While each kind of performance testing may have different goals and test designs, in most cases they use the same approach: applying a multi-user synthetic workload to the system. The term ‘load testing’ is used further in this paper because, in the author’s opinion, it better contrasts multi-user testing with other performance engineering methods, including single-user performance testing. Everything mentioned here applies to performance, stress, concurrency, scalability, and other kinds of testing, as long as the system is tested by applying multi-user load.
If we define load testing in this way, it becomes evident that it is much wider than the stereotypical waterfall-like, last-moment, record-and-replay load testing that we often see in large corporations. Unfortunately, load testing has often become associated with that stereotype, which makes it difficult to see the larger picture of load testing as an important and integral part of the performance engineering process.
Load testing is a way to mitigate load- and performance-related risks. There are other approaches and techniques that also alleviate some performance risks:
- Single-user performance engineering practices (profiling, tracking and optimizing single-user performance, Web Performance Optimization).
- Software Performance Engineering (SPE, including modeling and performance patterns).
- Instrumentation/Application Performance Management, providing insight into what is going on inside the production system.
- [Auto] scalable architecture.
- Continuous integration/deployment, allowing changes to be deployed and rolled back quickly.
Still, they don’t replace load testing, but rather complement it. They definitely decrease performance risks compared with a situation in which nothing is done about performance until the last moment before rolling the system out into production, but they still leave the risks of crashes and performance degradation under multi-user load. There is always a risk of crashing or of performance issues under heavy load, and the best way to mitigate it is actually to test for it. Even stellar performance in production and a highly scalable architecture don’t guarantee that the system won’t crash under a slightly higher load. So if the cost of failure is high, you should do load testing (exactly what, when, and how are other large topics; only one aspect of them, load generation, is discussed in this paper). To be exact, even proper load testing isn’t a full guarantee (the real-life workload will always differ somewhat from what you have tested), but it drastically decreases the risks.
Another important value of load testing is making sure that changes don’t degrade multi-user performance. Unfortunately, better single-user performance doesn’t guarantee better multi-user performance. In many cases it improves multi-user performance too, but definitely not always. And the more complex the system, the more chances there are of exotic multi-user performance issues that no one even thought of. Load testing is the way to ensure that you don’t have such issues.
In particular, when you do performance optimization, you need a reproducible way to evaluate the impact of changes on multi-user performance. The impact on multi-user performance probably won’t be proportional to what you see with single-user performance (even if the two are somewhat correlated). Without multi-user testing the actual effect is difficult to quantify. Load testing is also important for issues that happen only in specific cases and are difficult to troubleshoot and verify in production; it can significantly simplify the process.
Load Testing Process Overview : Load testing is emerging as an engineering discipline of its own, based on “classic” testing on one side and system performance analysis on the other. A typical load testing process is shown in the figure below:
The Load Testing Process
We explicitly define two different steps here: ‘define load’ and ‘create test assets’. The ‘define load’ step (sometimes referred to as workload characterization or workload modeling) is a logical description of the load we want to apply (like “that group of users login, navigate to a random item in the catalog, add it to the shopping cart, pay, and logout, with an average 10-second think time between actions”). The ‘create test assets’ step is the implementation of this workload: the conversion of the logical description into something that will physically create that load during the ‘run tests’ step. While for manual testing that can be just a description given to each tester, in load testing it is usually something else – a program or a script.
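As a concrete illustration of the ‘create test assets’ step, the logical workload description above could be implemented roughly as follows. This is a minimal sketch, not the output of any particular tool; the catalog, the action names, and the `user_session` function are hypothetical stand-ins for the real protocol-level or UI-level calls a load testing tool would issue.

```python
import random
import time

# Hypothetical catalog the virtual user browses; a real test asset would
# hit the system under test instead of this in-memory list.
CATALOG = ["item-%d" % i for i in range(100)]

def user_session(think_time=0.0, rng=None):
    """One synthetic user: login, view a random catalog item, add it to
    the cart, pay, and logout, pausing `think_time` seconds between
    actions (10 seconds on average in the workload definition above)."""
    rng = rng or random.Random()
    steps = []

    def act(name):
        steps.append(name)           # record the action for verification
        time.sleep(think_time)       # models user think time between actions

    act("login")
    item = rng.choice(CATALOG)
    act("view:" + item)
    act("add_to_cart:" + item)
    act("pay")
    act("logout")
    return steps
```

Running many such sessions concurrently, each with realistic think times, is what turns this script into the multi-user synthetic load applied during the ‘run tests’ step.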
Quite often load testing goes hand-in-hand with tuning, diagnostics, and capacity planning. These are represented by the back loop in the figure above: if we don’t meet our goal, we need to optimize the system to improve performance. Usually the load testing process implies tuning and modification of the system to achieve the goals.
Load testing is not a one-time procedure. It spans the whole system development life cycle. It may start with technology or prototype scalability evaluation, continue through component/unit performance testing into system performance testing, and follow up in production (to troubleshoot performance issues and to test upgrades and load increases).
Before we can move forward from ‘define load’ to ‘create test assets’, we need to decide how we are going to generate that load. Load generation can be a simple technical step when you know how to do it for your system (compared with other, non-trivial steps such as collecting requirements, defining load, or analyzing results). Unfortunately, for a new system it is quite often a very challenging task, up to being impossible in the given time frame. It is important to understand all possible options; a single approach may not work in all situations. The main choices are to generate workload manually (really an option only if you are testing a few users), to use a load testing tool (software or hardware), or to create a program to do it. Many tools allow using different ways of recording/playback and programming. This paper considers different approaches to assist in choosing the best approach and tool for specific contexts. It is based on experience with business applications, so handling load in other kinds of environments may differ.
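To make the third option, creating a program to generate load, more tangible, here is a minimal sketch of a programmatic multi-user load generator. Everything in it is hypothetical: `fake_request` stands in for a real call to the system under test (for example, an HTTP request), and a real load generator would add pacing, think times, ramp-up, and richer statistics.

```python
import concurrent.futures
import statistics
import time

def fake_request():
    """Hypothetical stand-in for a call to the system under test;
    returns the measured response time in seconds."""
    start = time.perf_counter()
    time.sleep(0.001)  # simulated service time
    return time.perf_counter() - start

def run_load(users=10, requests_per_user=5, request_fn=fake_request):
    """Apply synthetic multi-user load with one thread per virtual user
    and collect per-request response times."""
    def one_user(_):
        return [request_fn() for _ in range(requests_per_user)]

    # One thread per virtual user issues its requests concurrently.
    with concurrent.futures.ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))

    latencies = sorted(t for times in per_user for t in times)
    return {
        "requests": len(latencies),
        "median_s": statistics.median(latencies),
        "p95_s": latencies[int(0.95 * (len(latencies) - 1))],
    }
```

Even a sketch this small shows why the programmatic approach can be attractive: the workload, the concurrency level, and the metrics collected are all under your direct control, without depending on a tool’s recording and playback capabilities.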