The “Test Pyramid” is a metaphor that tells us to group software
tests into buckets of different granularity. It also gives an idea
of how many tests we should have in each of these groups. Although
the concept of the Test Pyramid has been around for a while, teams
still struggle to put it into practice properly. This article
revisits the original concept of the Test Pyramid and shows how
you can put this into practice. It shows which kinds of tests you
should be looking for in the different levels of the pyramid and
gives practical examples on how these can be implemented.


Production-ready software requires testing before it goes into production. As
the discipline of software development matured, software testing approaches have
matured too. Instead of having myriads of manual software testers, development
teams have moved towards automating the biggest portion of their testing
efforts. Automating their tests allows teams to know whether their
software is broken in a matter of seconds and minutes instead of days and weeks.

The drastically shortened feedback loop fueled by automated tests goes hand
in hand with agile development practices, continuous delivery and DevOps
culture. Having an effective software testing approach allows teams to move
fast and with confidence.

This article explores what a well-rounded test portfolio should look
like to be responsive, reliable and maintainable – regardless of whether
you’re building a microservices architecture, mobile apps or IoT ecosystems.
We’ll also get into the details of building effective and readable
automated tests.

The Importance of (Test) Automation

Software has become an essential part of the world we live in. It has
outgrown its early sole purpose of making businesses more efficient. Today
companies try to find ways to become first-class digital companies. As users,
every one of us interacts with an ever-increasing amount of software every
day. The wheels of innovation are turning faster.

If you want to keep pace you’ll have to look into ways to deliver your
software faster without sacrificing its quality. Continuous delivery, a
practice where you automatically ensure that your software can be released
into production any time, can help you with that. With continuous delivery
you use a build pipeline to automatically test your software and deploy
it to your testing and production environments.

Building, testing and deploying an ever-increasing amount of software
manually soon becomes impossible — unless you want to spend all your time
with manual, repetitive work instead of delivering working software.
Automating everything — from build to tests, deployment and infrastructure —
is your only way forward.


Figure 1: Use build pipelines to automatically and
reliably get your software into production

Traditionally software testing was overly manual work done by deploying your
application to a test environment and then performing some black-box style
testing, e.g. by clicking through your user interface to see if anything’s
broken. Often these tests would be specified by test scripts to ensure the
testers would do consistent checking.

It’s obvious that testing all changes manually is time-consuming, repetitive
and tedious. Repetitive is boring, boring leads to mistakes and makes you look
for a different job by the end of the week.

Luckily there’s a remedy for repetitive tasks: automation.

Automating your repetitive tests can be a big game changer in your life as a software
developer. Automate these tests and you no longer have to mindlessly follow click
protocols in order to check if your software still works correctly. Automate
your tests and you can change your codebase without batting an eye. If you’ve
ever tried doing a large-scale refactoring without a proper test suite I bet you
know what a terrifying experience this can be. How would you know if you
accidentally broke stuff along the way? Well, you click through all your manual
test cases, that’s how. But let’s be honest: do you really enjoy that? How about
making even large-scale changes and knowing whether you broke stuff within
seconds while taking a nice sip of coffee? Sounds more enjoyable if you ask me.

The Test Pyramid

If you want to get serious about automated tests for your software there
is one key concept you should know about: the test pyramid. Mike
Cohn came up with this concept in his book Succeeding with Agile.
It’s a great visual metaphor telling you to think about different layers
of testing. It also tells you how much testing to do on each layer.


Figure 2: The Test Pyramid

Mike Cohn’s original test pyramid consists of three layers that your
test suite should consist of (bottom to top):

  1. Unit Tests
  2. Service Tests
  3. User Interface Tests

Unfortunately the concept of the test pyramid falls a little short if
you take a closer look. Some argue that either the naming or some
conceptual aspects of Mike Cohn’s test pyramid are not ideal, and I have to
agree. From a modern point of view the test pyramid seems overly simplistic
and can therefore be misleading.

Still, due to its simplicity the essence of the test pyramid serves as
a good rule of thumb when it comes to establishing your own test suite.
Your best bet is to remember two things from Cohn’s original test pyramid:

  1. Write tests with different granularity
  2. The more high-level you get the fewer tests you should have

Stick to the pyramid shape to come up with a healthy, fast and
maintainable test suite: Write lots of small and fast unit
tests. Write some more coarse-grained tests and very few
high-level tests that test your application from end to end. Watch out that
you don’t end up with a test ice-cream cone that will be a nightmare to
maintain and takes way too long to run.

Don’t become too attached to the names of the individual layers in Cohn’s
test pyramid. In fact they can be quite misleading: service test is a
term that is hard to grasp (Cohn himself talks about the observation that
a lot of developers completely ignore this layer). In the days of
single page application frameworks like React, Angular, Ember.js and others
it becomes apparent that UI tests don’t have to be on the highest
level of your pyramid – you’re perfectly able to unit test your UI in all
of these frameworks.

Given the shortcomings of the original names it’s totally okay to come
up with other names for your test layers, as long as you keep it consistent
within your codebase and your team’s discussions.

The Sample Application

I’ve written a simple microservice including a test
suite with tests for the different layers of the test pyramid.

The sample application shows traits of a typical microservice. It
provides a REST interface, talks to a database and fetches information from
a third-party REST service. It’s implemented in Spring Boot
and should be understandable even
if you’ve never worked with Spring Boot before.

Make sure to check out the code on GitHub. The
readme contains instructions you need to run the application and its
automated tests on your machine.


The application’s functionality is simple. It
provides a REST interface with three endpoints:

  • GET /hello: Returns “Hello World”. Always.
  • GET /hello/{lastname}: Looks up the person with the provided last name.
    If the person is known, returns “Hello {Firstname} {Lastname}”.
  • GET /weather: Returns the current weather conditions for Hamburg, Germany.

High-level Structure

At a high level the system has the
following structure:


Figure 3: the high level structure of our microservice system

Our microservice provides a REST interface that can be called via HTTP.
For some endpoints the service will fetch information from a database. In
other cases the service will call an external weather API
via HTTP to fetch and display current weather conditions.

Internal Architecture

Internally, the Spring Service has a Spring-typical architecture:


Figure 4: the internal structure of our microservice

  • Controller classes provide REST endpoints and deal with HTTP
    requests and responses
  • Repository classes interface with the database and take care of
    writing and reading to/from persistent storage
  • Client classes talk to other APIs; in our case they fetch JSON
    via HTTPS from the weather API
  • Domain classes capture our domain model including
    the domain logic (which, to be fair, is quite trivial in our case).

Experienced Spring developers might notice that a frequently used layer
is missing here: Inspired by Domain-Driven Design,
a lot of developers build a service layer consisting of
service classes. I decided not to include a service layer in this
application. One reason is that our application is simple enough that a
service layer would have been an unnecessary level of indirection. The
other one is that I think people overdo it with service layers. I often
encounter codebases where the entire business logic is captured within
service classes. The domain model becomes merely a layer for data, not for
behaviour (an Anemic Domain Model). For every non-trivial application this
wastes a lot of potential to keep your code well-structured and testable
and does not fully utilize the power of object orientation.
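
To make the contrast a bit more tangible, here is a minimal plain-Java sketch. All class names (AnemicAccount, AccountService, Account) are hypothetical and not part of the sample application; the withdrawal rule is only an assumed stand-in for "business logic".

```java
// Hypothetical example; none of these classes exist in the sample application.

// Anemic variant: the domain class is only a container for data ...
class AnemicAccount {
    private long balanceInCents;

    long getBalanceInCents() { return balanceInCents; }
    void setBalanceInCents(long balanceInCents) { this.balanceInCents = balanceInCents; }
}

// ... and the business rule lives in a separate service class.
class AccountService {
    void withdraw(AnemicAccount account, long amountInCents) {
        if (amountInCents > account.getBalanceInCents()) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        account.setBalanceInCents(account.getBalanceInCents() - amountInCents);
    }
}

// Rich variant: the domain class owns its data and the rule that protects it.
class Account {
    private long balanceInCents;

    Account(long balanceInCents) { this.balanceInCents = balanceInCents; }

    void withdraw(long amountInCents) {
        if (amountInCents > balanceInCents) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        balanceInCents -= amountInCents;
    }

    long getBalanceInCents() { return balanceInCents; }
}

class DomainModelDemo {
    public static void main(String[] args) {
        Account account = new Account(1_000);
        account.withdraw(300);
        System.out.println(account.getBalanceInCents()); // prints 700
    }
}
```

In the rich variant the rule can be unit tested through the domain class alone, without setting up a service around it.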

Our repositories are straightforward and provide simple CRUD
functionality. To keep the
code simple I used Spring Data.
Spring Data gives us a simple and generic CRUD repository implementation
that we can use instead of rolling our own. It also takes care of spinning
up an in-memory database for our tests instead of using a PostgreSQL
database as it would in production.

Take a look at the codebase and make yourself familiar with the
internal structure. It will be useful for our next step: Testing the application.

Unit tests

The foundation of your test suite will be made up of unit tests. Your unit
tests make sure that a certain unit (your subject under test) of your
codebase works as intended. Unit tests have the narrowest scope of all the
tests in your test suite. The number of unit tests in your test suite will
largely outnumber any other type of test.


Figure 5: A unit test typically replaces external
collaborators with test doubles

What’s a Unit?

If you ask three different people what “unit” means in the context of
unit tests, you’ll probably receive four different, slightly nuanced
answers. To a certain extent it’s a matter of your own definition and it’s
okay to have no canonical answer.

If you’re working in a functional language a unit will most likely be a
single function. Your unit tests will call a function with different
parameters and ensure that it returns the expected values. In an
object-oriented language a unit can range from a single method to an entire
class.
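
As a small illustration of the "single function" case, the following sketch treats one pure static method as the unit and calls it with different parameters. The Greetings class and its greet method are made up for this example, and the checks use plain Java instead of a test framework to keep the sketch self-contained.

```java
// Hypothetical unit: a pure function without any collaborators.
class Greetings {
    static String greet(String firstName, String lastName) {
        if (firstName == null || firstName.isEmpty()) {
            return "Hello " + lastName + "!";
        }
        return "Hello " + firstName + " " + lastName + "!";
    }
}

class GreetingsTest {
    // Minimal stand-in for an assertion library.
    static void check(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Same function, different parameters, expected return values.
        check("Hello Peter Pan!", Greetings.greet("Peter", "Pan"));
        check("Hello Pan!", Greetings.greet("", "Pan"));
        check("Hello Pan!", Greetings.greet(null, "Pan"));
        System.out.println("all checks passed");
    }
}
```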

Sociable and Solitary

Some argue that all collaborators (e.g. other classes that are called by
your class under test) of your subject under test should be substituted with
mocks or stubs to come up with perfect isolation and to avoid
side-effects and a complicated test setup. Others argue that only
collaborators that are slow or have bigger side effects (e.g. classes that
access databases or make network calls) should be stubbed or mocked.

Occasionally people label these two sorts of tests as solitary unit tests
for tests that stub all collaborators and sociable unit tests for tests that
allow talking to real collaborators (Jay Fields’ Working Effectively with
Unit Tests coined these terms). If you have some spare time you can go down
the rabbit hole and read more about the pros and cons of the different
schools of thought.

At the end of the day it’s not important to decide if you go for solitary
or sociable unit tests. Writing automated tests is what’s important.
Personally, I find myself using both approaches all the time. If it becomes
awkward to use real collaborators I will use mocks and stubs generously. If
I feel like involving the real collaborator gives me more confidence in a
test I’ll only stub the outermost parts of my service.
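
The difference between the two styles can be sketched in a few lines of plain Java. The names below (TaxPolicy, FlatTaxPolicy, PriceCalculator) are hypothetical and only serve the illustration; the 20 % rate is an arbitrary assumption.

```java
// Hypothetical classes, used only to illustrate the two styles.
interface TaxPolicy {
    long taxFor(long netInCents);
}

// A cheap, deterministic collaborator: safe to use for real in a test.
class FlatTaxPolicy implements TaxPolicy {
    public long taxFor(long netInCents) {
        return netInCents * 20 / 100; // flat 20 %
    }
}

// The subject under test.
class PriceCalculator {
    private final TaxPolicy taxPolicy;

    PriceCalculator(TaxPolicy taxPolicy) {
        this.taxPolicy = taxPolicy;
    }

    long grossPrice(long netInCents) {
        return netInCents + taxPolicy.taxFor(netInCents);
    }
}

class PriceCalculatorTests {
    // Solitary: the collaborator is replaced with a stub returning a canned value.
    static long solitary() {
        TaxPolicy stub = net -> 5L;
        return new PriceCalculator(stub).grossPrice(100);
    }

    // Sociable: the real collaborator takes part in the test.
    static long sociable() {
        return new PriceCalculator(new FlatTaxPolicy()).grossPrice(100);
    }

    public static void main(String[] args) {
        if (solitary() != 105) throw new AssertionError("solitary");
        if (sociable() != 120) throw new AssertionError("sociable");
        System.out.println("both styles pass");
    }
}
```

The solitary test only fails when PriceCalculator itself is wrong, while the sociable test would also notice a bug in FlatTaxPolicy.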

Mocking and Stubbing

Mocks and Stubs are two different kinds of
Test Doubles (there are more than these
two). A lot of people use the terms Mock and Stub interchangeably. I
think it’s good to be precise and keep their specific properties in mind.
You can use test doubles to replace objects you’d use in production with
an implementation that helps you with testing.

In plain words it means that you replace a real thing (e.g. a class,
module or function) with a fake version of that thing. The fake version
looks and acts like the real thing (answers to the same method calls) but
answers with canned responses that you define yourself at the beginning of
your unit test.

Using test doubles is not specific to unit testing. More elaborate
test doubles can be used to simulate entire parts of your system in a
controlled way. However, in unit testing you’re most likely to encounter
a lot of mocks and stubs (depending on whether you’re the sociable or
solitary kind of developer), simply because lots of modern languages and
libraries make it easy and comfortable to set up mocks and stubs.

Regardless of your technology choice, there’s a good chance that either
your language’s standard library or some popular third-party library will
provide you with elegant ways to set up mocks. And even writing your own
mocks from scratch is only a matter of writing a fake class/module/function
with the same signature as the real one and setting up the fake in your test.
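
A hand-written version could look roughly like the sketch below. All names (CustomerDirectory, Mailer, WelcomeService) are invented for this example. The stub answers with a canned response; the recording double remembers calls so that the test can verify the interaction afterwards, which is the job a mock from a mocking library would usually do (strictly speaking such a recording double is often called a spy).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collaborators of the class under test.
interface CustomerDirectory {
    String emailOf(String customerId);
}

interface Mailer {
    void send(String to, String body);
}

// Subject under test.
class WelcomeService {
    private final CustomerDirectory directory;
    private final Mailer mailer;

    WelcomeService(CustomerDirectory directory, Mailer mailer) {
        this.directory = directory;
        this.mailer = mailer;
    }

    void welcome(String customerId) {
        mailer.send(directory.emailOf(customerId), "Welcome!");
    }
}

// Stub: answers with a canned response defined for the test.
class StubCustomerDirectory implements CustomerDirectory {
    public String emailOf(String customerId) {
        return "peter@example.com";
    }
}

// Recording double: remembers calls so the test can verify the interaction.
class RecordingMailer implements Mailer {
    final List<String> recipients = new ArrayList<>();

    public void send(String to, String body) {
        recipients.add(to);
    }
}

class WelcomeServiceTest {
    public static void main(String[] args) {
        RecordingMailer mailer = new RecordingMailer();
        WelcomeService subject = new WelcomeService(new StubCustomerDirectory(), mailer);

        subject.welcome("42");

        if (mailer.recipients.size() != 1
                || !mailer.recipients.get(0).equals("peter@example.com")) {
            throw new AssertionError("unexpected recipients: " + mailer.recipients);
        }
        System.out.println("mail was sent to the stubbed address");
    }
}
```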

Your unit tests will run very fast. On a decent machine you can expect to
run thousands of unit tests within a few minutes. Test small pieces of your
codebase in isolation and avoid hitting databases, the filesystem or firing
HTTP queries (by using mocks and stubs for these parts) to keep your tests fast.

Once you get the hang of writing unit tests you will become more and more
fluent in writing them. Stub out external collaborators, set up some input
data, call your subject under test and check that the returned value is
what you expected. Look into Test-Driven Development
and let your unit tests guide your development; if applied
correctly it can help you get into a great flow and come up with a good
and maintainable design while automatically producing a comprehensive and
fully automated test suite. Still, it’s no silver bullet. Go ahead, give
it a real chance and see if it feels right for you.

What to Test?

The good thing about unit tests is that you can write them for all your
production code classes, regardless of their functionality or which layer in
your internal structure they belong to. You can unit test controllers just
like you can unit test repositories, domain classes or file readers. Simply
stick to the one test class per production class rule of thumb and
you’re off to a good start.

A unit test class should at least test the public interface of the
class. Private methods can’t be tested anyway since you simply can’t call
them from a different test class. Protected or package-private methods are
accessible from a test class (given the package structure of your test class
is the same as with the production class) but testing these methods could
already go too far.

There’s a fine line when it comes to writing unit tests: They should
ensure that all your non-trivial code paths are tested (including happy path
and edge cases). At the same time they shouldn’t be tied to your
implementation too closely.

Why’s that?

Tests that are too close to the production code quickly become annoying.
As soon as you refactor your production code (quick recap: refactoring means
changing the internal structure of your code without changing the externally
visible behavior) your unit tests will break.

This way you lose one big benefit of unit tests: acting as a safety net
for code changes. Instead you become fed up with those stupid tests failing
every time you refactor, causing more work than being helpful; and whose idea
was this stupid testing stuff anyways?

What do you do instead? Don’t reflect your internal code structure within
your unit tests. Test for observable behavior instead. Think about

if I enter values x and y,
will the result be z?

instead of

if I enter x and y, will the
method call class A first, then call class B and then return the result of
class A plus the result of class B?

Private methods should generally be considered an implementation detail.
That’s why you shouldn’t even have the urge to test them.
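
The two questions above can be turned into two checks on the same code. In this sketch PartA, PartB and Summer are hypothetical stand-ins for "class A", "class B" and the method under test; the checks are written in plain Java to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical collaborators that each contribute a part of the result.
interface PartA { int compute(int x); }
interface PartB { int compute(int y); }

class Summer {
    private final PartA a;
    private final PartB b;

    Summer(PartA a, PartB b) {
        this.a = a;
        this.b = b;
    }

    int sum(int x, int y) {
        return a.compute(x) + b.compute(y);
    }
}

class SummerTests {
    // Behaviour-focused: only the observable result matters.
    static boolean resultIsCorrect() {
        Summer subject = new Summer(x -> x * 2, y -> y + 1);
        return subject.sum(3, 4) == 11;
    }

    // Implementation-focused: also pins the order of the internal calls.
    // It breaks if sum() is refactored to ask B before A,
    // although the externally visible result stays the same.
    static boolean callsHappenInExpectedOrder() {
        List<String> calls = new ArrayList<>();
        Summer subject = new Summer(
                x -> { calls.add("A"); return x * 2; },
                y -> { calls.add("B"); return y + 1; });
        subject.sum(3, 4);
        return calls.equals(Arrays.asList("A", "B"));
    }

    public static void main(String[] args) {
        System.out.println("behaviour check: " + resultIsCorrect());
        System.out.println("call-order check: " + callsHappenInExpectedOrder());
    }
}
```

Both checks pass today, but only the first one survives a harmless refactoring of sum().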

I often hear opponents of unit testing (or
TDD) arguing that writing unit tests becomes pointless
work where you have to test all your methods in order to come up with a high
test coverage. They often cite scenarios where an overly eager team lead
forced them to write unit tests for getters and setters and all other sorts
of trivial code in order to come up with 100% test coverage.

There’s so much wrong with that.

Yes, you should test the public interface. More importantly, however,
you don’t test trivial code. Don’t worry, Kent Beck said it’s ok. You won’t
gain anything from testing simple getters or setters or other trivial
implementations (e.g. without any conditional logic). Save the time, that’s
one more meeting you can attend, hooray!

Test Structure

A good structure for all your tests (this is not limited to unit tests)
is this one:

  1. Set up the test data
  2. Call your method under test
  3. Assert that the expected results are returned

There’s a nice mnemonic to remember this structure:
“Arrange, Act, Assert”.
Another one that you can use takes inspiration from BDD.
It’s the “given”, “when”, “then”
triad, where given reflects the setup, when the method call
and then the assertion part.

This pattern can be applied to other, more high-level tests as well. In
every case it ensures that your tests remain easy and consistent to read.
On top of that tests written with this structure in mind tend to be shorter
and more expressive.
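
As a quick illustration of this structure, here is a sketch of a single test with the three phases marked by comments. ShoppingCart is a hypothetical class made up for this example, and the assertion is written in plain Java rather than with a test framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class under test.
class ShoppingCart {
    private final List<Long> pricesInCents = new ArrayList<>();

    void add(long priceInCents) {
        pricesInCents.add(priceInCents);
    }

    long total() {
        long sum = 0;
        for (long price : pricesInCents) {
            sum += price;
        }
        return sum;
    }
}

class ShoppingCartTest {
    static void shouldSumUpAllItems() {
        // given (arrange): set up the test data
        ShoppingCart cart = new ShoppingCart();
        cart.add(250);
        cart.add(199);

        // when (act): call the method under test
        long total = cart.total();

        // then (assert): check that the expected result is returned
        if (total != 449) {
            throw new AssertionError("expected 449 but was " + total);
        }
    }

    public static void main(String[] args) {
        shouldSumUpAllItems();
        System.out.println("ok");
    }
}
```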

Implementing a Unit Test

Now that we know what to test and how to structure our unit tests we can
finally see a real example.

Let’s take a simplified version of the ExampleController class:

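@RestController
public class ExampleController {

    private final PersonRepository personRepo;

    @Autowired
    public ExampleController(final PersonRepository personRepo) {
        this.personRepo = personRepo;
    }

    @GetMapping("/hello/{lastName}")
    public String hello(@PathVariable final String lastName) {
        Optional<Person> foundPerson = personRepo.findByLastName(lastName);

        // Greet a known person by full name, otherwise ask who that is.
        // (The getter names on Person are assumed here.)
        return foundPerson
                .map(person -> String.format("Hello %s %s!",
                        person.getFirstName(),
                        person.getLastName()))
                .orElse(String.format("Who is this '%s' you're talking about?",
                        lastName));
    }
}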

A unit test for the hello(lastname) method could look like this:

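// given(...) and anyString() are static imports from Mockito (BDDMockito,
// ArgumentMatchers); assertThat(...) and is(...) come from Hamcrest.
public class ExampleControllerTest {

    private ExampleController subject;

    @Mock
    private PersonRepository personRepo;

    @Before
    public void setUp() throws Exception {
        initMocks(this); // MockitoAnnotations.initMocks creates the @Mock stub
        subject = new ExampleController(personRepo);
    }

    @Test
    public void shouldReturnFullNameOfAPerson() throws Exception {
        // arrange: the stubbed repository returns a canned Person for "Pan"
        Person peter = new Person("Peter", "Pan");
        given(personRepo.findByLastName("Pan")).willReturn(Optional.of(peter));

        // act
        String greeting = subject.hello("Pan");

        // assert
        assertThat(greeting, is("Hello Peter Pan!"));
    }

    @Test
    public void shouldTellIfPersonIsUnknown() throws Exception {
        // arrange: the stubbed repository finds nobody
        given(personRepo.findByLastName(anyString())).willReturn(Optional.empty());

        String greeting = subject.hello("Pan");

        assertThat(greeting, is("Who is this 'Pan' you're talking about?"));
    }
}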

We’re writing the unit tests using JUnit, the de-facto standard testing framework for
Java. We use Mockito to replace the
real PersonRepository class with a stub for our test. This stub
allows us to define canned responses the stubbed method should return in
this test. Stubbing makes our test simpler and more predictable, and allows us to
easily set up test data.

Following the arrange, act, assert structure, we write two unit tests
– a positive case and a case where the searched person cannot be found. The
first, positive test case creates a new person object and tells the mocked
repository to return this object when it’s called with “Pan” as the value
for the lastName parameter. The test then goes on to call the method that
should be tested. Finally it asserts that the response is equal to the
expected response.

The second test works similarly but tests the scenario where the tested
method does not find a person for the given parameter.

Integration Tests

All non-trivial applications will integrate with some other parts
(databases, filesystems, network calls to other applications). When writing
unit tests these are usually the parts you leave out in order to come up
with better isolation and faster tests. Still, your application will interact
with other parts and this needs to be tested.
Integration Tests are there
to help. They test the integration of your application with all the parts
that live outside of your application.

For your automated tests this means you don’t just need to run your own
application but also the component you’re integrating with. If you’re
testing the integration with a database you need to run a database when
running your tests. For testing that you can read files from a disk you need
to save a file to your disk and load it in your integration test.
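
For the file example, such a test could be sketched as follows. NameFileReader is a hypothetical production class invented for this illustration; the test writes a real temporary file, lets the production code read it back and cleans up afterwards.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical production class: reads one name per line from a file.
class NameFileReader {
    List<String> readNames(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }
}

class NameFileReaderIntegrationTest {
    static List<String> saveAndLoad() throws IOException {
        // given: a real file on the real filesystem
        Path file = Files.createTempFile("names", ".txt");
        try {
            Files.write(file, "Peter\nWendy\n".getBytes(StandardCharsets.UTF_8));

            // when: the production code reads it back
            return new NameFileReader().readNames(file);
        } finally {
            // clean up so the test leaves no traces on disk
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        // then: the content arrives as expected
        List<String> names = saveAndLoad();
        if (!names.equals(Arrays.asList("Peter", "Wendy"))) {
            throw new AssertionError("unexpected names: " + names);
        }
        System.out.println("read " + names);
    }
}
```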

I mentioned before that “unit tests” is a vague term; this is even more
true for “integration tests”. For some people integration testing means
to test through the entire stack of your application connected to other
applications within your system. I like to treat integration
testing more narrowly and test one integration point at a time by
replacing separate services and databases with test doubles. Together with
contract testing and running contract tests against test doubles as well
as the real implementations you can come up with integration tests that
are faster, more independent and usually easier to reason about.

Narrow integration tests live at the boundary of your service. Conceptually
they’re always about triggering an action that leads to integrating with the
outside part (filesystem, database, separate service). A database integration
test would look like this:


Figure 6:
A database integration test integrates your code with a real database

  1. start a database
  2. connect your application to the database
  3. trigger a function within your code that writes data to the database
  4. check that the expected data has been written to the database by reading
    the data from the database

Another example, testing that your service integrates with a
separate service via a REST API could look like this:


Figure 7:
This kind of integration test checks that your application can
communicate with a separate service correctly

  1. start your application
  2. start an instance of the separate service (or a test double with
    the same interface)
  3. trigger a function within your code that reads from the separate
    service’s API
  4. check that your application can parse the response correctly
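
The steps above can be sketched with nothing but the JDK. In this example the "separate service" is a test double built with the JDK's built-in com.sun.net.httpserver.HttpServer, and SummaryClient is a hypothetical client invented for the illustration (it is not the WeatherClient of the sample application). The /summary path and the plain-text response are assumptions made to keep the sketch short.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical client: fetches a one-line weather summary from a base URL.
class SummaryClient {
    private final String baseUrl;

    SummaryClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    String fetchSummary() throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(baseUrl + "/summary").toURL().openConnection();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } finally {
            connection.disconnect();
        }
    }
}

class SummaryClientIntegrationTest {
    static String fetchFromFakeServer() throws IOException {
        // start a test double with the same interface on a free local port
        HttpServer fake = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        fake.createContext("/summary", exchange -> {
            byte[] body = "Rain".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        fake.start();
        try {
            // trigger the code that reads from the separate service's API
            String baseUrl = "http://127.0.0.1:" + fake.getAddress().getPort();
            return new SummaryClient(baseUrl).fetchSummary();
        } finally {
            fake.stop(0);
        }
    }

    public static void main(String[] args) throws IOException {
        // check that the response was parsed correctly
        String summary = fetchFromFakeServer();
        if (!"Rain".equals(summary)) {
            throw new AssertionError("unexpected summary: " + summary);
        }
        System.out.println("client parsed: " + summary);
    }
}
```

Passing the base URL into the client's constructor is what makes it possible to point the client at the fake server in the test.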

Your integration tests – like unit tests – can be fairly whitebox. Some
frameworks allow you to start your application while still being able to mock
some other parts of your application so that you can check that the correct
interactions have happened.

Write integration tests for all pieces of code where you either serialize
or deserialize data. This happens more often than you might think. Think about:

  • Calls to your services’ REST API
  • Reading from and writing to databases
  • Calling other applications’ APIs
  • Reading from and writing to queues
  • Writing to the filesystem

Writing integration tests around these boundaries ensures that writing data
to and reading data from these external collaborators works fine.

When writing narrow integration tests you should aim to run your
external dependencies locally: spin up a local MySQL database, test against
a local ext4 filesystem. If you’re integrating with a separate service
either run an instance of that service locally or build and run a fake
version that mimics the behaviour of the real service.

If there’s no way to run a third-party service locally you should opt for
running a dedicated test instance and point at this test instance when
running your integration tests. Avoid integrating with the real production
system in your automated tests. Blasting thousands of test requests
against a production system is a surefire way to get people angry because
you’re cluttering their logs (in the best case) or even DoS’ing their
service (in the worst case). Integrating with a service over the network is
a typical characteristic of a broad integration test and makes your tests
slower and usually harder to write.

With regards to the test pyramid, integration tests are on a higher level
than your unit tests. Integrating slow parts like filesystems and databases
tends to be much slower than running unit tests with these parts stubbed out.
They can also be harder to write than small and isolated unit tests; after all,
you have to take care of spinning up an external part as part of your tests.
Still, they have the advantage of giving you the confidence that your
application can correctly work with all the external parts it needs to talk to.
Unit tests can’t help you with that.

Database Integration

The PersonRepository is the only repository class in the codebase. It
relies on Spring Data and has no actual implementation. It just extends
the CrudRepository interface and provides a single method header. The rest
is Spring magic.

public interface PersonRepository extends CrudRepository<Person, String> {
    Optional<Person> findByLastName(String lastName);
}

With the CrudRepository interface Spring Data offers a fully functional
CRUD repository with findOne, findAll, save and delete methods (save
covers both creating and updating an entity). Our custom method definition
(findByLastName()) extends this basic functionality and gives us a way to
fetch Persons by their last name. Spring Data analyses the return type of
the method and its method name and checks the method name against a naming
convention to figure out what it should do.

Although Spring Data does the heavy lifting of implementing database
repositories I still wrote a database integration test. You might argue that
this is testing the framework and something that I should avoid as it’s
not our code that we’re testing. Still, I believe having at least one
integration test here is crucial. First it tests that our custom
findByLastName method actually behaves as expected. Secondly it proves
that our repository used Spring’s wiring correctly and can connect to the
database.

To make it easier for you to run the tests on your machine (without
having to install a PostgreSQL database) our test connects to an in-memory
H2 database.

I’ve defined H2 as a test dependency in the build.gradle file. The properties file in the test directory doesn’t define any
spring.datasource properties. This tells Spring Data to use an in-memory
database. As it finds H2 on the classpath it simply uses H2 when running
our tests.

When running the real application with the int profile (e.g. by setting
SPRING_PROFILES_ACTIVE=int as environment variable) it connects to a
PostgreSQL database as defined in the profile-specific properties file.

I know, that’s an awful lot of Spring specifics to know and understand.
To get there, you’ll have to sift through a lot of documentation.
The resulting code is easy on the eye but hard to understand if you don’t
know the fine details of Spring.

On top of that going with an in-memory database is risky business. After
all, our integration tests run against a different type of database than
they would in production. Go ahead and decide for yourself if you prefer
Spring magic and simple code over an explicit yet more verbose
implementation.

Enough explanation already, here’s a simple integration test that saves a
Person to the database and finds it by its last name:

@RunWith(SpringRunner.class)
@DataJpaTest
public class PersonRepositoryIntegrationTest {
    @Autowired
    private PersonRepository subject;

    @After
    public void tearDown() throws Exception {
        subject.deleteAll(); // leave a clean database for the next test
    }

    @Test
    public void shouldSaveAndFetchPerson() throws Exception {
        Person peter = new Person("Peter", "Pan");
        subject.save(peter);

        Optional<Person> maybePeter = subject.findByLastName("Pan");

        assertThat(maybePeter, is(Optional.of(peter)));
    }
}

You can see that our integration test follows the same arrange, act,
assert structure as the unit tests. Told you that this was a universal
pattern!

Integration With Separate Services

Our microservice talks to darksky,
a weather REST API. Of course we want to ensure that our service sends
requests and parses the responses correctly.

We want to avoid hitting the real darksky servers when running
automated tests. Quota limits of our free plan are only part of the reason.
The real reason is decoupling. Our tests should run independently of
whatever the lovely people at darksky are doing, even when your machine
can’t access the darksky servers or the darksky servers are down
for maintenance.

We can avoid hitting the real darksky servers by running our own,
fake darksky server while running our integration tests. This might
sound like a huge task. Thanks to tools like
Wiremock it’s easy peasy. Watch this:

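@RunWith(SpringRunner.class)
@SpringBootTest
public class WeatherClientIntegrationTest {

    @Autowired
    private WeatherClient subject;

    @Rule
    public WireMockRule wireMockRule = new WireMockRule(8089);

    @Test
    public void shouldCallWeatherService() throws Exception {
        // The request path and JSON body are illustrative placeholders (assumptions)
        wireMockRule.stubFor(get(urlPathMatching("/.*"))
                .willReturn(aResponse()
                        .withBody("{\"currently\": {\"summary\": \"Rain\"}}")
                        .withHeader(CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
                        .withStatus(200)));

        Optional<WeatherResponse> weatherResponse = subject.fetchWeather();

        Optional<WeatherResponse> expectedResponse = Optional.of(new WeatherResponse("Rain"));
        assertThat(weatherResponse, is(expectedResponse));
    }
}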

To use Wiremock we instantiate a WireMockRule on a fixed
port (8089). Using the DSL we can set up the Wiremock server,
define the endpoints it should listen on and set canned responses it should
respond with.

Next we call the method we want to test, the one that calls the
third-party service, and check if the result is parsed correctly.

It’s important to understand how the test knows that it should call the
fake Wiremock server instead of the real darksky API. The secret is
in our properties file contained in
src/test/resources. This is the properties file Spring loads
when running tests. In this file we override configuration like API keys and
URLs with values that are suitable for our testing purposes, e.g. calling
the fake Wiremock server instead of the real one:

weather.url = http://localhost:8089

Note that the port defined here has to be the same we define when
instantiating the WireMockRule in our test. Replacing the real weather
API’s URL with a fake one in our tests is made possible by injecting the URL
in our WeatherClient class’ constructor:

public WeatherClient(final RestTemplate restTemplate,
                     @Value("${weather.url}") final String weatherServiceUrl,
                     @Value("${weather.api_key}") final String weatherServiceApiKey) {
    this.restTemplate = restTemplate;
    this.weatherServiceUrl = weatherServiceUrl;
    this.weatherServiceApiKey = weatherServiceApiKey;
}

This way we tell our WeatherClient to read the
weatherServiceUrl parameter’s value from the weather.url
property we define in our application properties.

Writing narrow integration tests for a separate service is quite easy
with tools like Wiremock. Unfortunately there’s a downside to this
approach: How can we ensure that the fake server we set up behaves
like the real server? With the current implementation, the separate service
could change its API and our tests would still pass. Right now we’re merely
testing that our WeatherClient can parse the responses that
the fake server sends. That’s a start but it’s very brittle. Using
end-to-end tests and running the tests
against a test instance of the real service instead of using a fake
service would solve this problem but would make us reliant on the
availability of the test service. Fortunately, there’s a better solution to
this dilemma: Running contract tests against the fake and the real server
ensures that the fake we use in our integration tests is a faithful test
double. Let’s see how this works next.


Thanks to Clare Sudbery, Chris Ford, Martha Rohte, Andrew Jones-Weiss,
David Swallow, Aiko Klostermann, Bastian Stein, Sebastian Roidl and
Birgitta Böckeler for providing feedback and suggestions to early drafts
of this article. Thanks to Martin Fowler for his advice and insights.
