Around IT in 256 seconds

Property-based testing with Spock

September 18, 2014 | 4 Minute Read

Property-based testing is an alternative approach to testing that complements example-based testing. The latter is what we've been doing all our lives: exercising production code against "examples" - inputs we think are representative. Picking these examples is an art of its own: "ordinary" inputs, edge cases, malformed inputs, etc. But why limit ourselves to just a few examples? Why not test hundreds, millions... ALL inputs? There are at least two difficulties with that approach:

  1. Scale. A pure function taking just one int input would require over 4 billion tests (2^32). This means a few hundred gigabytes of test source code and several months of execution time. Square it if the function takes two ints. For String it practically goes to infinity.

  2. Assume we have these tests, executed on a quantum computer or something. How do you know the expected result for each particular input? You either enter it by hand (good luck) or generate the expected output. By generate I mean write a program that produces the expected value for every input. But aren't we already testing such a program in the first place? Are we supposed to write a better, error-free version of the code under test just to test it? This is also known as the ugly mirror antipattern.
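The ugly mirror from point 2 can be sketched in a few lines of Groovy. The `priceWithTax` function and its 20% rate are hypothetical, made up only for illustration; the point is that the test derives its expected value from the same formula as the production code.

```groovy
// Hypothetical code under test: adds 20% tax using integer arithmetic.
int priceWithTax(int net) {
    net + net.intdiv(5)
}

// Ugly mirror: the "expected" value repeats the production formula,
// so a mistake in that formula would be copied into the test and go unnoticed.
(0..1000).each { int net ->
    assert priceWithTax(net) == net + net.intdiv(5)
}

// An example-based check pins the result down independently of the formula.
assert priceWithTax(100) == 120
```

The loop covers a thousand inputs and still proves very little, because it can only ever agree with itself.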

So you understand that testing every single input, although ideal, is just a thought experiment, impossible to implement. That being said, property-based testing tries to get as close as possible to this testing nirvana. Issue #1 is solved by slamming the code under test with hundreds or thousands of random inputs. Not all of them, not even a meaningful fraction. But a good, random representation.

Issue #2 is surprisingly harder. Property-based testing can generate random arguments, but it can't figure out what the expected outcome for a random input should be. Thus we need a different mechanism, the one that gives the whole philosophy its name. We have to come up with properties (invariants, behaviours) that the code under test exhibits no matter what the input is. This sounds very theoretical, but there are many such properties in various scenarios:

  1. Absolute value of any number should never be negative
  2. Encoding and then decoding any string should yield the original string back, for every symmetric encoding
  3. Optimized version of some old algorithm should produce the same result as the old one for any input
  4. Total money in a bank should remain the same after an arbitrary number of intra-bank transactions in any order

As you can see, there are many properties we can think of that do not mention specific example inputs. This is not exhaustive and strict testing. It's more like sampling and making sure the samples are "sane". There are many, many libraries supporting property-based testing for virtually every language. In this article we will explore Spock, and ScalaCheck later.
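To make the idea concrete, here is a minimal hand-rolled sketch of property 2 (the encoding round trip) in plain Groovy, without any testing library. The `forAllStrings` helper, the fixed seed, and the choice of Base64 as the symmetric encoding are all assumptions made for illustration, not something the libraries discussed here provide.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper: checks `property` against `count` random strings
// and returns the inputs that violated it. The seed makes runs repeatable.
List<String> forAllStrings(int count, long seed, Closure<Boolean> property) {
    Random random = new Random(seed)
    List<String> failures = []
    for (int i = 0; i < count; i++) {
        int length = random.nextInt(20)            // 0..19, so empty strings occur too
        StringBuilder sb = new StringBuilder()
        for (int j = 0; j < length; j++) {
            sb.append((char) (32 + random.nextInt(95)))  // printable ASCII
        }
        String input = sb.toString()
        if (!property(input)) {
            failures << input
        }
    }
    failures
}

// Property: decoding an encoded string yields the original string.
def roundTrips = { String s ->
    String encoded = Base64.getEncoder().encodeToString(s.getBytes(StandardCharsets.UTF_8))
    new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8) == s
}

assert forAllStrings(1000, 42L, roundTrips).isEmpty()
```

Note that the property never states what the encoded form of any particular string looks like. It only states a relation that must hold for every input, which is what sidesteps the expected-value problem.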

Spock + custom data generators

Spock does not support property-based testing out of the box. However, with help from data-driven testing and third-party data generators we can go quite far. Data tables in Spock can be generalized into so-called data pipes:

def 'absolute value of #value should not be negative'() {
    expect:
    value.abs() >= 0

    where:
    value << randomInts(100)
}

// Returns `count` random ints drawn from the whole int range
private static List<Integer> randomInts(int count) {
    final Random random = new Random()
    (1..count).collect { random.nextInt() }
}

The code above generates 100 random integers and makes sure .abs() is non-negative for all of them. You might think this test is quite dumb, but surprisingly it actually discovers one bug! But first let's kill some boilerplate code. Generating random inputs, especially more complex ones, is cumbersome and boring. I found two libraries that can help us. The first is spock-genesis:

import spock.genesis.Gen

def 'absolute value of #value should not be negative'() {
    expect:
    value.abs() >= 0

    where:
    value << Gen.int.take(100)
}

Looks great, but if you want to generate e.g. lists of random integers, net.java.quickcheck has a nicer API and is not Groovy-specific:

import static net.java.quickcheck.generator.CombinedGeneratorsIterables.someLists
import static net.java.quickcheck.generator.PrimitiveGenerators.integers

def 'sum of non-negative numbers from #list should not be negative'() {
    expect:
    list.findAll { it >= 0 }.sum() >= 0

    where:
    list << someLists(integers(), 100)
}

This test is interesting. It makes sure the sum of non-negative numbers is never negative by generating 100 lists of random ints. Sounds reasonable. However, multiple tests fail. First of all, due to integer overflow, two positive ints sometimes add up to a negative one. Duh! The other type of failure that was discovered is actually frightening. While [1,2,3].sum() is 6, obviously, [].sum() is... null (WAT?)
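Both failures can be reproduced in isolation with plain Groovy. The `safeSum` helper below is a hypothetical fix, not something from the original test: it assumes that seeding the sum with a long zero is acceptable, which handles the empty list and widens the arithmetic at the same time.

```groovy
// Overflow: two Integers are added as 32-bit ints, so the sum wraps around.
assert [Integer.MAX_VALUE, 1].sum() == Integer.MIN_VALUE

// Empty list: sum() without an initial value has nothing to start from.
assert [].sum() == null

// Hypothetical fix: start from 0L, so the result is a long and never null.
long safeSum(List<Integer> list) {
    list.findAll { it >= 0 }.sum(0L) as long
}

assert safeSum([]) == 0L
assert safeSum([Integer.MAX_VALUE, 1]) == 2147483648L
assert safeSum([-5, 3, 4]) == 7L
```

A long can still overflow for very long lists of large values, so this narrows the problem rather than removing it.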

As you can see, even the silliest and most basic property-based tests can be useful in finding unusual corner cases in your data. But wait, I said testing the absolute value of an int discovered one bug. Actually it didn't, because of poor (too "random") data generators that don't return known edge values in the first place. We will fix that in the next article.
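The text does not name the bug, but one well-known edge value fits the description: Integer.MIN_VALUE has no positive counterpart in 32-bit two's complement, so its absolute value is still negative. A uniformly random int hits it roughly once in 4 billion draws. The sketch below assumes this is the case in question; the `intsWithEdgeCases` generator is hypothetical and simply puts known edge values in front of seeded random ones.

```groovy
// The edge case: abs() of the smallest int is still the smallest int.
assert Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE
assert Integer.MIN_VALUE.abs() < 0

// Hypothetical generator: known edge values first, then `count` seeded random ints.
List<Integer> intsWithEdgeCases(int count, long seed) {
    Random random = new Random(seed)
    List<Integer> edges = [Integer.MIN_VALUE, -1, 0, 1, Integer.MAX_VALUE]
    edges + (0..<count).collect { random.nextInt() }
}

// The property "abs() is never negative" now has a chance to fail.
def offenders = intsWithEdgeCases(100, 42L).findAll { it.abs() < 0 }
assert offenders.contains(Integer.MIN_VALUE)
```

In a Spock data pipe this would read `value << intsWithEdgeCases(100, 42L)`, with the fixed seed making any failure repeatable.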

Tags: Spock, groovy, testing
