Performance Testing with k6

There’s an old adage in the software industry – premature optimization is the root of all evil. A corollary to this should be – no optimization is just as bad as premature optimization.

It is often true that effort spent optimizing a system before its features are properly implemented, correct, and stable is effort wasted, but even a fully-implemented system can offer a terrible user experience if performance is never considered. Unfortunately, when delivery timelines get squeezed, performance will more often than not take a back seat to yet more system features – but even the best feature in the world is useless if users struggle to interact with it due to poor performance. Finding a balance between features and performance is key.

What is load testing?

Load testing is a form of performance testing that measures how a system behaves under a given volume of concurrent use. While there are many performance testing tools available that cater to a variety of needs, in this post we’re going to focus on performance testing of HTTP services using one tool in particular: k6.

Why k6 over other load testing tools?

If you work in software engineering in 2020, chances are you or someone you know is writing a modern web application using TypeScript – or at the very least, ES6. That web application is also likely communicating with back-end services over HTTP, which implement the bulk of the application’s business logic and data management.

k6 allows you to transfer those JavaScript skills over to writing performance test scripts for your application’s HTTP back-end, cutting down the time and investment required to start seeing meaningful performance insights in your application.

While k6 supports JavaScript, it does not run on Node.js – it uses its own runtime engine written in Go. This is done for two main reasons: to reduce the surface area of its standard library to focus only on APIs relevant to performance testing, and to bring script execution performance closer to bare metal than performance testing tools based on more generic runtime engines can achieve (a claim based on k6’s own benchmarks, so expect some bias – and k6 is not the best performer out there!).

Overall, k6 strikes a good balance between scripting flexibility and execution performance. k6 also offers a comprehensive ecosystem beyond the test runner. This includes extensive documentation on its features, as well as tools that can convert user journeys recorded from web browsers (HAR files), or existing JMeter test plans (a popular performance testing tool for Java-based backends), into k6 scripts. k6 also has a commercial offering of a cloud-based performance script execution environment, including pre-configured analysis dashboards, allowing you to focus on writing your test scripts and implementing performance improvements within your application.

How to get started with k6

The quickest way to get up and running with meaningful performance testing and analysis is by using a docker-compose setup provided by the k6 team. This gives you the ability to run k6 test scripts locally and feed the recorded metrics into a local Grafana instance for analysis.

Assuming you already have docker installed, your first example performance test can be run with the following commands:

git clone --depth 1 'https://github.com/loadimpact/k6'
cd k6
docker-compose up -d
docker-compose run -v $PWD/samples:/scripts k6 run --no-usage-report -w /scripts/es6sample.js

Analyzing k6 output

The output from the k6 run command shows a series of ticked or crossed verification steps, similar to traditional functional testing tools. These checks allow k6 scripts to assert observed performance metrics against expected thresholds, as well as to validate response values much like regular functional tests. When testing more complex workflows in stateful applications, checking responses often includes extracting values that must be provided as parameters to later endpoints.

The last section of k6’s command line output is a summary table of performance gauges, mostly related to various stages of HTTP data negotiation and transmission. Of particular interest is the http_req_duration metric, which represents the time spent sending a request, waiting for a response from the back-end, and eventually receiving the response data. This metric excludes client-side time spent looking up DNS entries, performing TLS handshakes, and so on, so it more closely represents the actual time taken by the application’s HTTP backend to perform its work.
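Metrics such as http_req_duration can also be asserted automatically via threshold expressions in a script’s options export, failing the test run when they are breached. A minimal sketch (the 500ms and 1.5s limits below are illustrative values, not recommendations):

```javascript
// Sketch: fail the run if request durations degrade beyond these limits.
export const options = {
    thresholds: {
        // 95% of requests must complete below 500ms,
        // and 99% below 1.5 seconds.
        http_req_duration: ['p(95)<500', 'p(99)<1500']
    }
};
```

When a threshold fails, k6 marks the corresponding metric with a cross in the summary output and exits with a non-zero status, which is handy for build-pipeline integration.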

The remaining HTTP timings are still relevant for overall performance analysis, as they could represent what a user of the application would experience for a given request, but these values are more specifically related to the environment in which k6 is run, so may not be entirely representative of a real-world user experience.

Structure of a k6 performance script

So what is k6 actually testing here? For the above example, open up the ./samples/es6sample.js file. You’ll be most interested in the default function export – this is what k6 runs as a performance test iteration for a virtual user (VU). There are also a few other top-level sections of a k6 script used for more granular customization.

Configuring execution

The options export at the top of the script allows you to pre-configure certain k6 runtime parameters. You can omit this variable when writing your first performance test script, as you’ll most likely be executing your script from the command line and can customize from there. Embedding options within the script becomes useful when looking to automate performance testing, such as when integrating into your build pipeline alongside functional testing.

By default, when the options variable is not present, k6 run [perfscript.js] will use a single virtual user to execute a single iteration of your test script. This type of execution is useful when initially writing and debugging a performance test script. However, it won’t give you much in the way of performance insight under any meaningful user load (although it is useful for performance smoke testing new releases of your application).

You can ramp up the number of virtual users and total iterations using the -u and -i command-line arguments, respectively. If you want to limit the total script duration by time rather than the number of iterations, you can also use the -d argument.
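These command-line arguments have equivalents in the options export. As a sketch (the VU count and duration here are arbitrary), a script configured to behave like k6 run -u 10 -d 1m perfscript.js might carry:

```javascript
// Roughly equivalent to passing -u 10 -d 1m on the command line.
// An `iterations` property could be used instead of `duration`,
// mirroring the -i argument.
export const options = {
    vus: 10,        // number of concurrent virtual users
    duration: '1m'  // total test duration
};
```

Command-line arguments take precedence, so you can still override these embedded defaults for one-off experiments.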

Testing performance

As mentioned, the focus of the script is the default function. This function should include one or more HTTP requests used to interact with relevant aspects of your application’s service layer. Each HTTP request will have its own performance metrics recorded that indicate how long the request took to complete – part of this timing will indicate how long the back-end services took to do their work.

Applications will typically require some form of authentication before other endpoints can be accessed, such as a POST request to a /login endpoint. k6 implicitly handles cookie management across multiple HTTP requests for a single virtual user, meaning there’s not much more to do beyond invoking your authentication endpoint, assuming your application relies on typical authorization or session cookies.

When testing multiple endpoints as part of a cohesive user journey through parts of your application, you will likely want to start grouping areas of your script to provide a structure closer to the user features, rather than a series of isolated HTTP requests. This grouping by user features is where the group() function comes into play (and it may also get nested) – although k6 also supports tagging for an added dimension to how your test can be structured.

There are a few different approaches that can get taken when it comes to writing the bulk of your script. In essence, you can start from a blank slate and manually add requests for endpoints you are interested in testing, or you can start from an auto-generated script based on a HAR recording of a typical user journey within your application’s user interface. Starting from a HAR can be useful for more complex journeys, as it will more accurately reflect everything a user needs to do when using your application – including any human processing delays between steps.

When ramping up the load generated by your performance script, there may be aspects of your application that you do not want to include in every test iteration, such as invariant static data that your web application may already be caching. You can use the additional setup and teardown phases of k6’s execution lifecycle to handle this.
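A minimal sketch of those lifecycle hooks (the seeded data below is hypothetical) could look like:

```javascript
// setup() runs once before any VU iterations; its return value is
// passed to every iteration and to teardown().
export function setup() {
    // e.g. create or fetch invariant reference data once, up front
    return { productIds: ['a', 'b', 'c'] };
}

export default function (data) {
    // each iteration picks from the pre-seeded data rather than
    // re-creating it every time
    const productId = data.productIds[Math.floor(Math.random() * data.productIds.length)];
    // ...issue requests against `/product/${productId}` here...
}

export function teardown(data) {
    // runs once after all iterations complete; clean up seeded data here
}
```

Time spent in setup() and teardown() is recorded under separate metrics, so it won’t pollute the per-iteration request timings.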

Using dynamic data

Starting a k6 script from a HAR will require modification, as the initial journey will be hard-coded to only those parameter values used as part of the recording. This hard coding won’t provide broad coverage of application performance against a variety of different data sets.

Using dynamic data in your tests will likely require parameterized HTTP requests, including data extracted from previous responses. To help generate dynamic data, you can use k6’s module support to import a library such as faker.js.

If this dynamic data appears in a POST request body to a static endpoint URL, there’s not too much of a problem when it comes to analyzing the performance of that single endpoint, but this requires more effort if the endpoint name itself needs to get parameterized. By default, k6 differentiates between every unique endpoint name when tagging the metrics it records. This differentiation helps if you aim to look at performance differences between GET /product/a versus GET /product/b, but it is not that helpful if you’re looking for overall performance analysis of GET /product/{productId} with a variety of parameters. To get around this, you can override the name tag for a given request – all requests using the same name alias will then have their metrics grouped.

Bringing things together: example k6 script

The following script highlights many of the dynamic data concepts described above:

import { group, sleep, check } from 'k6';
import http from 'k6/http';
import faker from 'https://cdnjs.cloudflare.com/ajax/libs/Faker/3.1.0/faker.min.js';
 
const BASE_URL = `https://myapp.example.com`;
 
const COMMON_REQUEST_HEADERS = {
   dnt: '1',
   'user-agent': 'Mozilla/5.0',
   'content-type': 'application/json',
   accept: '*/*',
   origin: BASE_URL,
   referer: BASE_URL
};
 
function simulateUserInteractionDelay() {
   sleep(1 + Math.random() * 3); // pause between 1 and 4 seconds
}
 
export default function() {
   group('myapp performance test', function() {
       group('authenticate', function() {
           let response = http.post(
               `${BASE_URL}/login`,
                JSON.stringify({ username: 'whoami', password: 'verysecure' }),
               {
                   tags: { name: '/login' },
                   headers: COMMON_REQUEST_HEADERS
               }
           );
           check(response, {
               'can login': (res) => res.status === 201
           });
       });
 
       simulateUserInteractionDelay();
 
       let productId;
       group('add product', function() {
           let productName = faker.commerce.productName();
 
        let response = http.post(`${BASE_URL}/product`, JSON.stringify({ productName }), {
               tags: { name: '/product' },
               headers: COMMON_REQUEST_HEADERS
           });
           check(response, {
               'can add product': (res) => res.status === 201,
               'can obtain product ID': (res) => {
                   let productResponse = JSON.parse(res.body);
                   productId = productResponse && productResponse.id;
                   return productId !== undefined;
               }
           });
       });
 
       simulateUserInteractionDelay();
 
       group('fetch product', function() {
           let response = http.get(`${BASE_URL}/product/${productId}`, {
               tags: { name: '/product/{productId}' },
               headers: COMMON_REQUEST_HEADERS
           });
           check(response, {
               'can get product': (res) => res.status === 200
           });
       });
   });
}


Get more detailed performance analysis using k6

k6’s command line output is very much a summary of the total script’s execution. It doesn’t help much when your scripts are testing more complex user journeys involving several unrelated backend endpoints. More granular analysis of the complete set of metrics, tagged per endpoint name, is better performed in Grafana.

The docker-compose setup makes Grafana available on http://localhost:3000/. However, this is a stock install with no dashboards preconfigured. You can create your own dashboard customized to whatever analysis needs you have, but when just getting started, it’s often quicker to use a pre-canned dashboard and further customize from there. To do so, you’ll need to import a dashboard via http://localhost:3000/dashboard/import – a good one to start with is dashboard ID 2587; ID 4411 is also worth exploring.

In terms of performing the analysis and making accurate sense of your application’s performance – that is, unfortunately, a much larger subject than can be covered here! Brendan Gregg provides a wealth of information to help guide you in using sound methodologies and avoiding the most common metric interpretation pitfalls, and has also authored books for a more thorough review of the subject.

Conclusion

k6 offers a comprehensive performance testing ecosystem that can add significant value to your application’s responsiveness, whether you’re looking for ad-hoc manual performance investigation for particularly slow areas, or whether you would like to integrate performance testing as part of your overall automated testing suite. And while it is not the only tool for the job, it is well suited to modern full-stack application development.

Performance testing is not as binary as functional testing, where a feature either works or it doesn’t. Compromises often need to be reached after weighing how many users will interact with a poorly-performing feature against the investment required to improve its performance. ‘100% perfect performance’ is unachievable; you could spend an infinite amount of time optimizing your application, to the detriment of regular feature development. Useful performance measurements must therefore be gathered, allowing you to focus on the top few bottlenecks that, if fixed, would yield the most significant perceived improvement for end users. There will also be a cut-off point in the list of bottlenecks beyond which further improvements would only provide diminishing, imperceptible returns.

Whether you choose to use k6 or not, you must properly analyze and consider the performance of your application. Performance is an all-too-often forgotten piece of the puzzle that is delivering the best possible experience for your users. Make sure you keep the fully-assembled puzzle picture in view!

If you need help analyzing or improving the performance of your web application, contact us to learn more about how we can help!

Learn more about how SitePen can be your partner.

SitePen is a strategic consultancy committed to achieving technical objectives, solving critical business problems and helping our customers build web applications the right way, the first time.