Rui Lopes Rodrigues | Test Automation Specialist Leader | everis Brazil

Test strategy in a microservices architecture

The purpose of this article is to explore, from the perspective of quality strategy, the options and best practices available when our architectural and development approach divides applications into small independent components: a microservices strategy. This strategy poses a series of new challenges but also new opportunities. It is a new universe of needs and possibilities that cannot be handled successfully without a new view of quality.

This new view is not necessary only because of the characteristics of the technical approach; microservices are generally adopted in a business context that is less tolerant of failures and more eager for change, and that context is the mainspring driving most of the new quality needs.

First, there is a constant demand for higher application performance, and customers are increasingly intolerant of waiting. This is a great sign of the success of new technologies and approaches; at the same time, it raises the bar and increases the responsibility of high-quality testing.

Since it is common for a request to travel through a set of services that, in parallel or in series, consume both processing time, which adds up in the response time of our request, and distributed resources, the following is crucial:

1) Understanding the functionality's needs. We have to know the response time that meets the customer's needs, and the work to reach this number (or these numbers, when we talk about more than one functionality) can and should start at inception, so that we advance towards it from the first concrete actions. The same goes for the expected number of users and their distribution over time, as well as for the security challenges and other non-functional characteristics that will determine the new application's success from the business's perspective.

2) Understanding how the response is composed across the multiple services that are accessed, and how much time and how many resources each of them consumes, is essential to locate bottlenecks accurately.

3) Automating the performance (and load, security, stress, etc.) tests at the necessary granularity so that we can evaluate each service, executing them automatically at the proper frequency, and managing and monitoring the results are all crucial to achieving the expected outcome. We cannot forget that services are independent and can be updated independently; these updates may involve changes that impact the overall result, so monitoring each service separately is essential (a minimal sketch of such a per-service check follows this list).
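To make the idea concrete, the sketch below measures each service against its own response-time budget, so a regression in one deployment is located immediately instead of hiding in the aggregate number. It is a minimal illustration in Python with pytest and requests; the service URLs and budget values are hypothetical placeholders, not part of any real project.

```python
# per_service_latency_test.py - minimal sketch of a per-service response-time check.
# The service URLs and time budgets below are hypothetical placeholders.
import time

import pytest
import requests

# Response-time budget (in seconds) for each independently deployable service.
SERVICE_BUDGETS = {
    "http://catalog.internal/api/items": 0.150,
    "http://pricing.internal/api/quote": 0.100,
    "http://inventory.internal/api/stock": 0.120,
}

@pytest.mark.parametrize("url,budget", SERVICE_BUDGETS.items())
def test_service_meets_response_time_budget(url, budget):
    """Measure each service on its own, at its own granularity."""
    start = time.monotonic()
    response = requests.get(url, timeout=5)
    elapsed = time.monotonic() - start

    assert response.status_code == 200
    assert elapsed <= budget, f"{url} took {elapsed:.3f}s, budget is {budget:.3f}s"
```

Run at the proper frequency (for example, on every deployment of any single service), this kind of check also feeds the monitoring of results that item 3 calls for.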

Another crucial characteristic of this new approach is its distributed nature, usually combined with a set of tiny modules exposing interfaces over REST and HTTP. This increases the relevance of the network that carries the communication between these services (and of their connectivity settings), as well as of the orchestration of these calls. To monitor these points efficiently, we must:

1) Develop smoke tests for the services so that any fluctuation in the network or in service availability is quickly and precisely located. These tests must verify that access to the services works and that each one is alive, without spending time on more detailed verifications. Their purpose is to tell us whether we can move on to the other tests and to raise a quick alarm when basic failures happen (see the first sketch after this list).

2) Develop contract tests for each service, exploring the input and output limits of each one and validating the service's processing. These tests should be developed as soon as the contract is available. In the common situations where we need to exercise a service that is not ready yet (whether to validate tests, because of dependencies on other services, or for parallel development), a service virtualization tool is the answer. The best-known commercial tools in this field are Broadcom Service Virtualization and Micro Focus Service Virtualization, but there are also open-source options that deliver good results, such as WireMock and Hoverfly (the second sketch after this list illustrates the idea).

3) Develop integration tests of the services to ensure that the communication and orchestration of contributions produced in a distributed way collaborate correctly towards the final result. In these tests, it is essential to consider that each service's response time may vary; whenever possible, we should add random variations within the plausible range for each service, since these timing variations may cause synchronization problems that lead to incorrect answers (the third sketch after this list shows one way to do this).
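First, the smoke test of item 1. It can be as simple as hitting each service's health endpoint and checking liveness only, failing fast with the names of the unreachable services. The sketch below assumes each service exposes a /health route; the URLs are hypothetical.

```python
# smoke_test.py - minimal liveness check; the service URLs are hypothetical.
import requests

SERVICES = {
    "catalog": "http://catalog.internal/health",
    "pricing": "http://pricing.internal/health",
    "inventory": "http://inventory.internal/health",
}

def test_all_services_are_up():
    """Check only that each service is reachable and alive,
    without any deeper functional verification."""
    down = []
    for name, url in SERVICES.items():
        try:
            if requests.get(url, timeout=2).status_code != 200:
                down.append(name)
        except requests.RequestException:
            down.append(name)
    assert not down, f"Services failing the smoke test: {down}"
```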
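Second, a contract test against a virtualized service, as in item 2. The sketch below registers a stub in a locally running WireMock standalone instance through its admin API and then validates the consumer-side contract against it, so the test can exist before the real service does. The /api/quote endpoint and its payload fields are hypothetical.

```python
# contract_test.py - contract check against a virtualized service.
# Assumes a WireMock standalone instance is running on localhost:8080;
# the /api/quote endpoint and its payload fields are hypothetical.
import requests

WIREMOCK = "http://localhost:8080"

def test_quote_contract_against_virtualized_service():
    # Register a stub so the contract can be exercised before the
    # real service is ready.
    stub = {
        "request": {"method": "GET", "urlPath": "/api/quote"},
        "response": {
            "status": 200,
            "jsonBody": {"price": 10.5, "currency": "BRL"},
            "headers": {"Content-Type": "application/json"},
        },
    }
    requests.post(f"{WIREMOCK}/__admin/mappings", json=stub, timeout=2).raise_for_status()

    # Validate the contract: status code, and required fields with their types.
    response = requests.get(f"{WIREMOCK}/api/quote", timeout=2)
    assert response.status_code == 200
    body = response.json()
    assert isinstance(body["price"], float)
    assert isinstance(body["currency"], str)
```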
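Third, the timing variations of item 3. One way to exercise them is to inject a random delay, within each dependency's plausible range, into its virtualized stub and then verify that the orchestrated result is still correct. The sketch below uses WireMock's fixedDelayMilliseconds stub option; the orchestrator endpoint and payloads are hypothetical.

```python
# integration_timing_test.py - integration check under response-time jitter.
# Assumes dependency stubs live in WireMock (localhost:8080) and a hypothetical
# orchestrator aggregates them at http://orchestrator.internal/api/offer.
import random

import requests

WIREMOCK = "http://localhost:8080"

def stub_with_jitter(path, body, max_delay_ms):
    """Stub a dependency with a random delay inside its plausible range,
    using WireMock's fixedDelayMilliseconds response option."""
    stub = {
        "request": {"method": "GET", "urlPath": path},
        "response": {
            "status": 200,
            "jsonBody": body,
            "fixedDelayMilliseconds": random.randint(0, max_delay_ms),
        },
    }
    requests.post(f"{WIREMOCK}/__admin/mappings", json=stub, timeout=2).raise_for_status()

def test_orchestration_is_correct_under_timing_variations():
    # Each run draws new delays, so synchronization problems surface over time.
    stub_with_jitter("/api/quote", {"price": 10.5}, max_delay_ms=300)
    stub_with_jitter("/api/stock", {"available": 3}, max_delay_ms=500)

    response = requests.get("http://orchestrator.internal/api/offer", timeout=5)
    assert response.status_code == 200
    offer = response.json()
    assert offer["price"] == 10.5 and offer["available"] == 3
```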

Unit testing continues to be crucial when we talk about ensuring the quality of services and microservices. These tests are our main quality view from the technical and code points of view, and that view is crucial to building resilient and secure applications. In this approach of small independent blocks, however, if each microservice's responsibilities are granular enough, the service tests and the unit tests may overlap. Because of this overlap, we have started hearing about a test structure that focuses more on the services and their integration than on unit tests. This new way of distributing tests is called "honeycomb testing," in contrast with the test pyramid. In the pyramid, the distribution of tests (in quantities) follows the volumes of the pyramid's layers. In the honeycomb (hexagonal) approach, the test classification is more generic, considering only implementation, integration, or integrated tests, and the focus is on having the largest number of tests validating integration. The diagrams below exemplify both approaches:

[Diagrams: Test Pyramid and Honeycomb Testing. Both span the same three layers, differing in the relative volume of tests in each:
Integrated tests (user interface, end-to-end, system, exploratory tests, etc.)
Integration tests (services, components, integration)
Implementation detail tests (units)]

Both approaches are valid, but in different scenarios. Honeycomb testing brings more benefits in the situation described above, where services have very granular responsibilities and, therefore, the responsibilities of unit tests and service tests are quite close. Note that even in this context, unit tests are not ruled out; their responsibility for validating the implementation is actually reinforced.

In my opinion, the effective use of the pyramid model in this small-services scenario leads to a test distribution that resembles the one suggested by the honeycomb model, so the models are not in conflict. For me, the honeycomb approach speaks a more dev-based language, whereas the pyramid approach speaks a more test-based language. They are different ways of talking about the same things, and using the appropriate language for the right audience helps people understand and effectively apply the best practices.

In both approaches, the user interface tests are not ruled out either. They are indeed the most difficult tests in the automated universe to develop, maintain, and execute, but they are the only ones that can evaluate the software from the user's point of view. For this reason, they are irreplaceable in their context of operation. Only a few are necessary (in contrast with the tests further down the pyramid or hexagon), but we should have them. Static analysis of the code to validate best practices, security analysis, and the evolution of unit tests with the support of mutation testing are also crucial and should be carried out, if not in every build, then at least at a frequency that makes sense for the nature of the application in production and the moment of development.

And manual tests, never again? Not exactly. They are still valuable in situations where automation is not possible, and for exploratory testing. Maybe one day AI support will help us on this exploratory testing front, but that is not yet the case; by the way, that is a subject for another article.
