Property-based testing with Spock
- Scale. A pure function taking just one int input would require 4 billion tests. This means a few hundred gigabytes of test source code and several months of execution time. Square it if a function takes two ints. For String it practically goes to infinity.
- Assume we have these tests, executed on a quantum computer or something. How do you know the expected result for each particular input? You either enter it by hand (good luck) or generate the expected output. By generate I mean write a program that produces the expected value for every input. But aren't we already testing such a program in the first place? Are we supposed to write a better, error-free version of the code under test just to test it? This is also known as the ugly mirror antipattern.
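To put rough numbers on the scale argument, here is a small back-of-the-envelope sketch in plain Groovy. The 50 bytes of source per test and 2 ms of execution per test are assumptions picked purely for illustration, not measurements.

```groovy
// Number of distinct values of a 32-bit int: 2^32, i.e. roughly 4 billion.
long oneIntInputs = 1L << 32
assert oneIntInputs == 4294967296L

// A function of two ints squares the input space: 2^64 combinations.
BigInteger twoIntInputs = BigInteger.valueOf(oneIntInputs).pow(2)
assert twoIntInputs == 18446744073709551616G

// ASSUMPTIONS (illustrative only): 50 bytes of source and 2 ms of runtime per test.
long assumedBytesPerTest = 50L
long assumedMillisPerTest = 2L

long sourceGiB = (oneIntInputs * assumedBytesPerTest).intdiv(1L << 30)
long executionDays = (oneIntInputs * assumedMillisPerTest).intdiv(24L * 60 * 60 * 1000)

assert sourceGiB == 200      // hundreds of gigabytes of test source
assert executionDays == 99   // a bit over three months of execution
```

Under these assumed per-test costs the estimate lands in the same ballpark as the figures quoted above for a single int argument.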
Issue #2 is surprisingly harder. Property-based testing can generate random arguments, but it can't figure out what the expected outcome for that random input should be. Thus we need a different mechanism, one that gives the whole philosophy its name. We have to come up with properties (invariants, behaviours) that the code under test exhibits no matter what the input is. This sounds very theoretical, but there are many such properties in various scenarios:
- Absolute value of any number should never be negative
- Encoding and decoding any string should yield the same String back for every symmetric encoding
- Optimized version of some old algorithm should produce the same result as the old one for any input
- Total money in a bank should remain the same after arbitrary number of intra-bank transactions in any order
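Two of the properties listed above can be sketched in a plain Groovy script, without any testing framework. This is only an illustration: randomString, slowSum and fastSum are hypothetical helpers made up for this example, and Base64 from the JDK stands in for "any symmetric encoding".

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper (not from the article): a random string of printable ASCII characters.
String randomString(Random random) {
    int length = random.nextInt(20)
    (0..<length).collect { (32 + random.nextInt(95)) as char }.join('')
}

// Hypothetical "old" implementation: sum of 1..n computed with a loop.
long slowSum(int n) {
    long total = 0
    for (int i = 1; i <= n; i++) {
        total += i
    }
    total
}

// Hypothetical "optimized" implementation: closed-form formula n * (n + 1) / 2.
long fastSum(int n) {
    long product = n * (n + 1L)
    product.intdiv(2)
}

def random = new Random(42)  // fixed seed so that a failure can be reproduced

100.times {
    // Property: decoding an encoded string yields the original string.
    String original = randomString(random)
    String encoded = Base64.getEncoder().encodeToString(original.getBytes(StandardCharsets.UTF_8))
    String decoded = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8)
    assert decoded == original

    // Property: the optimized version agrees with the old one for any input.
    int n = random.nextInt(10000)
    assert fastSum(n) == slowSum(n)
}
```

Note that neither assertion needs to know the expected value for a particular input. Each one only states a relation that must hold for every input.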
Spock + custom data generators
Spock does not support property-based testing out of the box. However, with help from data-driven testing and third-party data generators we can go quite far. Data tables in Spock can be generalized into so-called data pipes:

def 'absolute value of #value should not be negative'() {
    expect:
        value.abs() >= 0
    where:
        value << randomInts(100)
}

private static List<Integer> randomInts(int count) {
    final Random random = new Random()
    (1..count).collect { random.nextInt() }
}

The code above generates 100 random integers and makes sure that .abs() is non-negative for all of them. You might think this test is quite dumb, but surprisingly it actually discovers one bug! But first let's kill some boilerplate code. Generating random inputs, especially more complex ones, is cumbersome and boring. I found two libraries that can help us. spock-genesis:

import spock.genesis.Gen

def 'absolute value of #value should not be negative'() {
    expect:
        value.abs() >= 0
    where:
        value << Gen.int.take(100)
}

Looks great, but if you want to generate e.g. lists of random integers, net.java.quickcheck has a nicer API and is not Groovy-specific:

import static net.java.quickcheck.generator.CombinedGeneratorsIterables.someLists
import static net.java.quickcheck.generator.PrimitiveGenerators.integers

def 'sum of non-negative numbers from #list should not be negative'() {
    expect:
        list.findAll { it >= 0 }.sum() >= 0
    where:
        list << someLists(integers(), 100)
}

This test is interesting. It makes sure that the sum of non-negative numbers is never negative, by generating 100 lists of random ints. Sounds reasonable. However, multiple tests are failing. First of all, due to integer overflow, sometimes two positive ints add up to a negative one. Duh! Another type of failure that was discovered is actually frightening. While [1,2,3].sum() is 6, obviously, [].sum() is... null (WAT?)

As you can see, even the silliest and most basic property-based tests can be useful in finding unusual corner cases in your data. But wait, I said testing the absolute value of int discovered one bug. Actually it didn't, because of poor (too "random") data generators that do not return known edge values in the first place. We will fix that in the next article.
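The two failures found by the sum test can be reproduced in a plain Groovy script, without Spock or any generator library. The safeSum helper at the end is a hypothetical workaround added for illustration only; it is one possible fix, not something taken from the article.

```groovy
// Failure 1: two non-negative ints can overflow into a negative sum.
def overflowing = [Integer.MAX_VALUE, 1]
assert overflowing.every { it >= 0 }
assert overflowing.sum() == Integer.MIN_VALUE   // int arithmetic wraps around

// Failure 2: summing an empty list yields null rather than 0.
assert [].sum() == null

// Hypothetical workaround (an assumption, not the article's fix):
// accumulate into a long and start from an explicit initial value.
long safeSum(List<Integer> numbers) {
    numbers.findAll { it >= 0 }.inject(0L) { acc, n -> acc + n }
}

assert safeSum([]) == 0L
assert safeSum([Integer.MAX_VALUE, 1]) == 2147483648L
assert safeSum([-5, 2, 3]) == 5L
```

Widening to long only pushes the overflow boundary further out, so whether that is acceptable depends on how large the real inputs can get.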