Testing

Everything having to do with testing: unit testing, integration testing, test coverage

The article “The Testing Introduction I Wish I Had” is a great introduction to testing. It is written by Max Antonucci and, according to dev.to, is a 14-minute read.

A good second article for those working on front-end development is this one: Static vs Unit vs Integration vs E2E Testing for Frontend Apps.

This article takes the perspective that it is better to learn the testing pyramid from the top, instead of the bottom: https://github.com/NoriSte/ui-testing-best-practices/blob/master/sections/beginners/top-to-bottom-approach.md

Three core test types

Antonucci’s article divides the topic of testing into six areas. The first three of these are core areas of testing that most treatments of the topic mention:

Test type | Explanation
Unit Tests | “the simplest test[s] for the smallest possible pieces of your program.”
Integration Tests | “check how well separate units [of the code] integrate together”
Acceptance Tests | “shift away from what pieces of code should do to what users should do. These tests are based around common user tasks like logging in, submitting a form, navigating content, and having their privacy invaded by tracking scripts. This usually makes acceptance tests the highest-level tests for any application, and often the most important.”

Here’s a bit more on each of these:

Unit tests

Writing good unit tests follows somewhat different conventions from writing production code.

The gist is that unit tests should be simple and modular. Each test should focus on some very specific aspect of the function or object you are testing and should not be dependent on any other parts of the application working.

There is also the idea of self-documenting unit tests. This is where best test-writing practices differ from best development practices. Each test should have a name that thoroughly identifies its purpose, even if it is a really long name.

This way, if the test fails for any reason, it will be easy for anyone to identify the exact details of what is going wrong just from seeing the error message.

Test code is also a lot more repetitive than production code. You generally avoid helper functions, since you want a reader to see everything that is being set up and used in the body of the test itself. Sometimes you can pull out setup that every single test requires into one setup function that runs before each test, but this should be written in a way that is very clear to any reader. In general, choose readability over efficiency in test code, even if that means more repetition or copy-pasting.
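The conventions above can be sketched with Python's built-in unittest module. This is only an illustration: the ShoppingCart class and its methods are hypothetical and do not come from the article. Note the long, descriptive test names and the small setUp method that runs before every test.

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test, invented for this illustration."""

    def __init__(self):
        self.items = []

    def add(self, name, price):
        if price < 0:
            raise ValueError("price must not be negative")
        self.items.append((name, price))

    def total(self):
        return sum(price for _, price in self.items)


class ShoppingCartTest(unittest.TestCase):
    def setUp(self):
        # Runs before every test; kept tiny so a reader sees the whole setup.
        self.cart = ShoppingCart()

    def test_total_of_empty_cart_is_zero(self):
        self.assertEqual(self.cart.total(), 0)

    def test_total_is_sum_of_prices_after_adding_two_items(self):
        # The setup is repeated in the test body on purpose, for readability.
        self.cart.add("pen", 2)
        self.cart.add("notebook", 5)
        self.assertEqual(self.cart.total(), 7)

    def test_add_raises_value_error_when_price_is_negative(self):
        with self.assertRaises(ValueError):
            self.cart.add("pen", -1)
```

Saved in a file such as test_cart.py (a hypothetical name), these tests can be run with python -m unittest. If one fails, its name alone says which behavior broke.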

Unit tests in JavaScript:

Integration Tests

Integration tests examine how multiple components work with each other. Once you have a thorough set of unit tests, you can be confident that each individual part of your project works well on its own. However, you rarely just have those components working alone, so you need to check that they behave correctly when interacting with one another.

This article discusses the main approaches to integration testing: http://softwaretestingfundamentals.com/integration-testing/

In general, integration testing combines components gradually and verifies at each step that they work together, which makes it easier to trace an error to a specific interaction.
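As a minimal sketch of the idea, assume two small units that would each have their own unit tests: parse_scores and average. Both are hypothetical names invented here. The integration test below checks only that the output of one is handled correctly by the other.

```python
import unittest


def parse_scores(text):
    """Hypothetical unit 1: turn "90, 80" into [90, 80]."""
    return [int(part) for part in text.split(",") if part.strip()]


def average(scores):
    """Hypothetical unit 2: arithmetic mean, 0.0 for an empty list."""
    return sum(scores) / len(scores) if scores else 0.0


class ParseAndAverageIntegrationTest(unittest.TestCase):
    def test_average_of_parsed_comma_separated_scores(self):
        # Exercises the hand-off: parsed integers feed directly into average.
        self.assertEqual(average(parse_scores("90, 80, 70")), 80.0)

    def test_empty_input_flows_through_both_units_as_zero(self):
        # An edge case that only matters when the two units are combined.
        self.assertEqual(average(parse_scores("")), 0.0)
```

In a larger project the same pattern applies at each step: add one more component to the combination, then test the new interaction.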

For integration tests, you may need more than a simple test framework.

For front-end testing of Web Apps, for example, or any kind of app built with React:

Acceptance Tests (aka end-to-end tests)

Acceptance testing is also sometimes called “end-to-end” testing when it is automated. In Antonucci’s article, acceptance testing refers to automated tests of acceptance criteria: for example, for web apps, these tests might be automated using a “headless browser”.

These are related to the acceptance tests that a human might carry out by checking the acceptance criteria on a user story or issue. In both cases, the tests might be specified using the “GIVEN/WHEN/THEN” style of writing acceptance tests. Ideally, these tests should be both:

Three other test types

Max Antonucci’s article The Testing Introduction I Wish I Had also mentions three additional important areas of testing:

Test type | Explanation
Visual Regression Testing | “for unexpected (or expected) visual changes in the app”. Compares before and after screenshots of the app as it runs, pixel-by-pixel
Accessibility Testing | Tests for accessibility of apps for users with different abilities (e.g. low-vision, color-vision-issues, blind users that interact with screen reading software).
Code Quality Tests | Using linters to look through a code base for issues such as code duplication, security risks, style conventions (e.g. indenting), overly complex control structures, etc.

Code Quality Tests

The last of these three, “code quality tests”, deserves a special mention, especially pretty-printers and linters.

There are a variety of programs to do this kind of analysis for the languages typically used in CS48.

For Python:

For JavaScript:

Other Test Types

The term smoke test or sanity test is sometimes used to refer to a test that is run during staging (i.e. putting a new version of an application into production). It is a “fast test that is done by a script or human that ensures that the application under test works to minimal expectation. For example, a human run smoke test involves logging into the app and doing some usual activities such as conduct a search or exercise a standard feature.”

The idea is that if the team has done a reasonable job of testing the code base, then it is sufficient to do a quick test to make sure that all of the pieces (the various processes, servers, databases, APIs, etc.) are up and talking to each other. If a major component is failing, it will show up quickly during a properly designed smoke test.
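A scripted smoke test can be very small. The sketch below assumes each component exposes some quick check that returns True when it is up. The function names and the two checks are hypothetical placeholders; real checks would ping a server, open a database connection, and so on.

```python
def run_smoke_test(checks):
    """Run each named check; return the names of the checks that failed."""
    failures = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a crashing check counts as a failure
        if not ok:
            failures.append(name)
    return failures


# Hypothetical placeholder checks that always succeed.
def web_server_responds():
    return True


def database_accepts_connections():
    return True


checks = {
    "web server": web_server_responds,
    "database": database_accepts_connections,
}

failures = run_smoke_test(checks)
print("smoke test passed" if not failures else f"smoke test FAILED: {failures}")
```

The script does not try to be thorough. It only reports quickly which major piece, if any, is not responding.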

Test Driven Development (TDD)

See also: /topics/TDD/

Test Coverage

It’s helpful to be able to measure how much of our code is covered by tests. This metric is known as “test coverage”.

Line Versus Branch Coverage

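Two common metrics are line coverage (the fraction of lines that the tests execute) and branch coverage (the fraction of possible branches that the tests take).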

It might not be immediately obvious why those are not the same.

The answer is that not every if statement has an else.

  color="blue";
  if (x<10) {
      color="red";
  }
  foo(color);

Suppose we have a test that covers the case where x<10 evaluates to true. Then for this code, we have 100% line coverage, but we do not necessarily have 100% branch coverage unless we also have a test that covers the case where x<10 evaluates to false. Without that second test, the branch that reaches foo(color) while color still has the value “blue” is untested.
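The same situation can be sketched in Python. This is a transliteration made for illustration: the function name pick_color is an assumption, and the function returns color rather than passing it to foo.

```python
def pick_color(x):
    color = "blue"
    if x < 10:
        color = "red"
    return color


# This call alone executes every line of pick_color: 100% line coverage.
assert pick_color(5) == "red"

# Only this second call takes the path where the if body is skipped and
# color keeps its initial value. Without it, branch coverage is incomplete.
assert pick_color(20) == "blue"
```

A bug that only appears when color stays “blue” would slip past the first check even though every line has been executed.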

Tools for measuring test coverage

Python

In Python, there is a module called coverage that can be installed with pip.

JavaScript

In JavaScript, when using the Node.js/npm ecosystem, there is a module called istanbul that can be used to measure code coverage.

Java

In Java, JaCoCo (http://www.jacoco.org/jacoco/index.html) is one tool for measuring test coverage.

The documentation for JaCoCo can be difficult to follow.

Here is some help:

Related topics: