property based testing - generative data & executable domain rules
As presented in FunctionalConf 2014
Property based Testing
generative data & executable domain rules
Debasish Ghosh (@debasishg)
Friday, 10 October 14
Agenda
Why xUnit based testing is not enough
What is a property?
Properties for free?
What to verify in property based testing
ScalaCheck
Domain Model Testing
xUnit based testing
• Convenient
• Widely used
• Rich tool support
xUnit based testing
• Often grows out of bounds (verbosity), being at a lower level of abstraction
• Difficult to manage data/logic isolation
• Always scared that I may have missed some edge cases & boundary conditions
// polymorphic list append
def append[A](xs: List[A], ys: List[A]): List[A] = {
  //..
}
How do you show the correctness of the above implementation?
(1) Theorem Proving
Length & Append
length [] = 0                                  (length.1)
length (z : zs) = 1 + length zs                (length.2)
[] ++ zs = zs                                  (++.1)
(w : ws) ++ zs = w : (ws ++ zs)                (++.2)
Induction
length ([] ++ ys) = length [] + length ys      (base)
length (xs ++ ys) = length xs + length ys      (hypothesis)
To Prove
length ((x : xs) ++ ys) = length (x : xs) + length ys
Base Case
length ([] ++ ys) = length [] + length ys      (base)
Left side:  length ([] ++ ys) = length ys                          (by ++.1)
Right side: length [] + length ys = 0 + length ys = length ys      (by length.1)
Both sides reduce to length ys, so the base case holds.
Induction
length ((x : xs) ++ ys) = length (x : xs) + length ys
Left side:  length ((x : xs) ++ ys)
              = length (x : (xs ++ ys))        (by ++.2)
              = 1 + length (xs ++ ys)          (by length.2)
              = 1 + length xs + length ys      (by hypothesis)
Right side: length (x : xs) + length ys
              = 1 + length xs + length ys      (by length.2)
(2) Use your favorite unit testing library
// ScalaTest based assertions
List(1,2,3).length + List(4,5).length should equal (append(List(1,2,3), List(4,5)).length)
List().length + List(4,5).length should equal (append(List(), List(4,5)).length)
List(1,2,3).length + List().length should equal (append(List(1,2,3), List()).length)
But ..
• Hard coded data sets
• May not be exhaustive
• Coverage depends upon the knowledge of the test creator
(3) Use dependent typing - an example of append in Idris, a dependently typed language
-- Vectors: lists that have size as part of the type
data Vect : Nat -> Type -> Type where
  Nil  : Vect Z a
  (::) : a -> Vect k a -> Vect (S k) a

-- the app function is correct by construction
app : Vect n a -> Vect m a -> Vect (n + m) a
app Nil ys = ys
app (x :: xs) ys = x :: app xs ys
Future ?
• Types depend on values
• Powerful constraints encoded within the type signature
• Correct by construction - correct before the program runs!
Future ?
• Miles Sabin has been working on shapeless (https://github.com/milessabin/shapeless)
• Flavors of dependent typing in Scala
• Sized containers, polymorphic function values, heterogeneous lists & a host of other goodness built on top of Scala's type system
today ..
• Until the ecosystem matures and we all start programming in dependently typed languages ..
• We have better options than using only xUnit based testing ..
Mature?
What exactly are we trying to verify ?
property("List append adds up the 2 sizes") =
  forAll((l1: List[Int], l2: List[Int]) =>
    l1.length + l2.length == append(l1, l2).length
  )
invariant of our function encoded as a generic property
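A minimal, self-contained sketch of the property above as a ScalaCheck specification. The body of append is elided in the talk, so the foldRight implementation below is only an assumed stand-in, and the object name AppendSpec is hypothetical.

```scala
import org.scalacheck.Properties
import org.scalacheck.Prop.forAll

object AppendSpec extends Properties("append") {
  // Assumed stand-in implementation; the talk leaves the body elided.
  def append[A](xs: List[A], ys: List[A]): List[A] =
    xs.foldRight(ys)(_ :: _)

  // The invariant from the slide: lengths add up under append.
  property("List append adds up the 2 sizes") =
    forAll { (l1: List[Int], l2: List[Int]) =>
      l1.length + l2.length == append(l1, l2).length
    }
}
```

Running AppendSpec.check() in a REPL or through a build tool's test task exercises the property with randomly generated lists.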
What is a property?
• Constraints and invariants that must be honored within the bounded context of the model
• Sometimes called “laws” or the “algebra”
• Ensures well-formed-ness of abstractions
A Monoid
An algebraic structure having
• an identity element
• a binary associative operation
trait Monoid[A] {
  def zero: A
  def op(l: A, r: => A): A
}
object MonoidLaws {
  def associative[A: Equal: Monoid](a1: A, a2: A, a3: A): Boolean = //..
  def rightIdentity[A: Equal: Monoid](a: A) = //..
  def leftIdentity[A: Equal: Monoid](a: A) = //..
}
Monoid Laws
• an identity element: satisfies op(x, zero) == x and op(zero, x) == x
• a binary associative operation: satisfies op(op(x, y), z) == op(x, op(y, z))
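One possible way to make the monoid laws executable is sketched below. It is not the MonoidLaws object from the slides (whose bodies are elided and which relies on an Equal type class); as a simplifying assumption it compares values with plain ==, and the class name MonoidLawsSpec and the Int addition instance are hypothetical.

```scala
import org.scalacheck.{Arbitrary, Properties}
import org.scalacheck.Prop.forAll

trait Monoid[A] {
  def zero: A
  def op(l: A, r: => A): A
}

object Monoid {
  // Hypothetical instance used only for illustration: Int under addition.
  implicit val intAddition: Monoid[Int] = new Monoid[Int] {
    def zero: Int = 0
    def op(l: Int, r: => Int): Int = l + r
  }
}

// Assumption: equality is plain ==, not the Equal type class used on the slide.
class MonoidLawsSpec[A](name: String)(implicit m: Monoid[A], arb: Arbitrary[A])
    extends Properties(name) {

  property("associative") = forAll { (x: A, y: A, z: A) =>
    m.op(m.op(x, y), z) == m.op(x, m.op(y, z))
  }

  property("left identity") = forAll { (x: A) => m.op(m.zero, x) == x }

  property("right identity") = forAll { (x: A) => m.op(x, m.zero) == x }
}

object IntAdditionLaws extends MonoidLawsSpec[Int]("Monoid[Int]")
```

Any new Monoid instance can be checked against the same three properties by adding another one-line object like IntAdditionLaws.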
• Every monoid that you define must honor all the laws of the abstraction
• The question is how do we verify that all laws are satisfied
it’s the LAW
But before that, another important question is: what properties do we need to verify ..
def f[A](a: A): A
• The function f takes as input a value and returns a value of the same type
• But we don’t know the exact type of A
• Hence we cannot do any type specific operation within f
• All we know is that f is a polymorphic function parameterized on type A, that returns the same type as the input
• A little thought makes us realize that the only possible implementation of f is that of an identity function
def f[A](a: A): A = a
• This is the only possible implementation of f (unless we decide to launch a missile and do all sorts of evil stuff like throwing exceptions or typecasing)
• If the type-checker checks it ok, we are done. The compiler has proved the theorem and we don’t need to verify the property ourselves
• So we have proved a theorem out of the types - this technique is called parametricity and Phil Wadler calls these free theorems
Fast and Loose Reasoning is Morally Correct (2006) by Nils Anders Danielsson, John Hughes, Jeremy Gibbons, Patrik Jansson (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.59.8232)
Free theorems are ensured by the type-checker in a statically typed language that supports parametric polymorphism and we don’t need to write tests for verifying any of those properties
Theorems for Free! by Phil Wadler (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.38.9875)
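As an illustration, one of the free theorems from Wadler's paper says that any function r of type List[A] => List[A], for all A, commutes with map: r(xs.map(f)) == r(xs).map(f). Parametricity guarantees this, so no test is needed; the sketch below only checks it with ScalaCheck to make the statement concrete. The functions rev and everyOther and the mapping f are hypothetical examples.

```scala
import org.scalacheck.Prop
import org.scalacheck.Prop.forAll

object FreeTheoremSketch {
  // Two hypothetical parametric functions of type List[A] => List[A].
  def rev[A](xs: List[A]): List[A] = xs.reverse
  def everyOther[A](xs: List[A]): List[A] = xs.grouped(2).map(_.head).toList

  // An arbitrary element-level function to map with.
  val f: Int => String = n => s"#$n"

  // The free theorem: rearranging then mapping equals mapping then rearranging.
  val revCommutes: Prop =
    forAll { (xs: List[Int]) => rev(xs.map(f)) == rev(xs).map(f) }

  val everyOtherCommutes: Prop =
    forAll { (xs: List[Int]) => everyOther(xs.map(f)) == everyOther(xs).map(f) }
}
```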
Parametricity tests a lot of properties in your model.
“Parametricity constantly tests more conditions than your unit test suite ever will”
- Edward Kmett on #scala
What properties do we need to verify ?
Summarizing ..
• if your programming language has a decent static type system and
• support for parametric polymorphism and
• you play by the rules of parametricity
• ....
You get a lot of properties verified for FREE
ScalaCheck ..
• Property based testing library for Scala (and also for Java)
• Inspired by QuickCheck (Erlang & Haskell)
• Works on property specifications
• Does automatic data generation
ScalaCheck
• You specify the property to be tested
• ScalaCheck verifies that the property holds by generating random data
• No burden on programmer to maintain data, no fear of missed edge cases
“Property-based testing encourages a high level approach to testing in the form of abstract invariants functions should satisfy universally, with the actual test data generated for the programmer by the testing library. In this way code can be hammered with thousands of tests that would be infeasible to write by hand, often uncovering subtle corner cases that wouldn't be found otherwise.”
Real World Haskell by Bryan O’Sullivan, Don Stewart & John Goerzen
Verifying properties
scala> import org.scalacheck._
import org.scalacheck._
scala> import Prop.forAll
import Prop.forAll
scala> forAll((l1: List[Int], l2: List[Int]) =>
     |   l1.length + l2.length == append(l1, l2).length
     | )
res5: org.scalacheck.Prop = Prop
scala> res5.check
+ OK, passed 100 tests.
forAll: universal quantifier
function passed to forAll: specification of the property to be verified
check: generates 100 test cases randomly and verifies the property
Verifying properties
scala> val propSqrt = forAll { (n: Int) => scala.math.sqrt(n*n) == n }
propSqrt: org.scalacheck.Prop = Prop
scala> propSqrt.check
! Falsified after 0 passed tests.
> ARG_0: -2147483648
scala> propSqrt.check
! Falsified after 0 passed tests.
> ARG_0: -1
not only says that the property fails, but also points to the failing data set
A word about minimization of test cases. Whenever a property fails, ScalaCheck tries to shrink the failing input to a smaller test case that still fails, and reports that one. This is a huge feature that helps in debugging failures
Custom generators
scala> val propSqrt = forAll { (n: Int) =>
     |   scala.math.sqrt(n*n) == n
     | }
propSqrt: org.scalacheck.Prop = Prop
(uses default data generator)
scala> val smallInteger = Gen.choose(1, 100)
smallInteger: org.scalacheck.Gen[Int] = ..
(custom generator)
scala> val propSmallInteger = forAll(smallInteger)(n =>
     |   n >= 0 && n <= 100
     | )
propSmallInteger: org.scalacheck.Prop = Prop
(forAll uses the custom generator)
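As a sketch of how a custom generator changes the outcome, the sqrt property that is falsified for negative inputs can be restated over a restricted range. The upper bound 46340 is an assumption chosen here so that n * n still fits in an Int; the names SqrtSketch, nonNegative and propSqrtNonNegative are hypothetical.

```scala
import org.scalacheck.{Gen, Prop}
import org.scalacheck.Prop.forAll

object SqrtSketch {
  // Assumption: 0..46340 keeps n * n within Int range (46340 * 46340 = 2147395600).
  val nonNegative: Gen[Int] = Gen.choose(0, 46340)

  // Same property as propSqrt, but quantified over the custom generator.
  val propSqrtNonNegative: Prop =
    forAll(nonNegative) { n =>
      scala.math.sqrt((n * n).toDouble) == n.toDouble
    }
}
```

The property itself is unchanged; only the set of inputs it is quantified over has been narrowed to where it is meant to hold.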
Custom generators
// generate values in a range
Gen.choose(10, 20)
// conditional generator
Gen.choose(0,200) suchThat (_ % 2 == 0)
// generate specific values
Gen.oneOf('A', 'E', 'I', 'O', 'U', 'Y')
// default distribution is uniform, but you can change it
val vowel = Gen.frequency(
  (3, 'A'), (4, 'E'), (2, 'I'), (3, 'O'), (1, 'U'), (1, 'Y'))
// generate containers
Gen.containerOf[List,Int](Gen.oneOf(1, 3, 5))
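A short sketch of these generators feeding properties. All names are hypothetical. Two deliberate deviations from the slide, both assumptions of this sketch: the even numbers are built with map rather than suchThat so that no generated values are discarded, and the frequency weights are paired with explicit Gen.const generators.

```scala
import org.scalacheck.{Gen, Prop}
import org.scalacheck.Prop.forAll

object GeneratorSketch {
  // Even numbers in 0..200, constructed rather than filtered.
  val evens: Gen[Int] = Gen.choose(0, 100).map(_ * 2)

  // Weighted choice: 'E' is drawn more often than 'U'.
  val vowel: Gen[Char] = Gen.frequency(
    3 -> Gen.const('A'), 4 -> Gen.const('E'), 2 -> Gen.const('I'),
    3 -> Gen.const('O'), 1 -> Gen.const('U'), 1 -> Gen.const('Y'))

  // Lists whose elements are drawn from 1, 3 and 5.
  val oddLists: Gen[List[Int]] = Gen.containerOf[List, Int](Gen.oneOf(1, 3, 5))

  val propEvens: Prop = forAll(evens)(n => n % 2 == 0 && n >= 0 && n <= 200)
  val propVowel: Prop = forAll(vowel)(c => "AEIOUY".contains(c))
  val propOddLists: Prop = forAll(oddLists)(xs => xs.forall(_ % 2 == 1))
}
```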
Define your own generator
(model)
case class Account(no: String, holder: String, openingDate: Date, closeDate: Option[Date])
(random data generator)
val genAccount = for {
  no <- Gen.oneOf("1", "2", "3")
  nm <- Gen.oneOf("john", "david", "mary")
  od <- arbitrary[Date]
  cd <- arbitrary[Option[Date]]
} yield Account(no, nm, od, cd)
Define your own generator
(model)
sealed abstract class Tree
case object Leaf extends Tree
case class Node(left: Tree, right: Tree, v: Int) extends Tree
(random data generator)
// Gen.const was called Gen.value in older ScalaCheck releases
val genLeaf = const(Leaf)
val genNode = for {
  v <- arbitrary[Int]
  left <- genTree
  right <- genTree
} yield Node(left, right, v)
def genTree: Gen[Tree] = oneOf(genLeaf, genNode)
Domain Model Testing
Arbitrary wraps a generator that ScalaCheck looks up implicitly by type to generate random data. You need to provide an implicit Arbitrary instance for each of your own data types so that ScalaCheck can generate values of that type
implicit val arbAccount: Arbitrary[Account] = Arbitrary {
  for {
    no <- Gen.oneOf("1", "2", "3")
    nm <- Gen.oneOf("john", "david", "mary")
    od <- arbitrary[Date]
  } yield checkingAccount(no, nm, od)
}
implicit val arbCcy: Arbitrary[Currency] = Arbitrary {
  Gen.oneOf(USD, SGD, AUD, INR)
}
implicit val arbMoney: Arbitrary[Money] = Arbitrary {
  for {
    a <- Gen.oneOf(1 to 10)
    c <- arbitrary[Currency]
  } yield Money(a, c)
}
implicit val arbPosition: Arbitrary[Position] = Arbitrary {
  for {
    a <- arbitrary[Account]
    m <- arbitrary[Money]
    d <- arbitrary[Date]
  } yield Position(a, m, d)
}
Domain Model Testing
Verify your business rules and document them too
property("Equal debit & credit retains the same position") =
  forAll((a: Account, c: Currency, d: Date, i: BigDecimal) => {
    // before: the initial position, after: the position once credit and debit are applied
    val Success((before, after)) = for {
      p <- position(a, c, d)
      r <- credit(p, Money(i, c), d)
      q <- debit(r, Money(i, c), d)
    } yield (p, q)
    before == after
  })
property("Can close account with close date > opening date") =
  forAll((a: Account) =>
    close(a, new Date(a.openingDate.getTime + 10000)).isSuccess)
property("Cannot close account with close date < opening date") =
  forAll((a: Account) =>
    !close(a, new Date(a.openingDate.getTime - 10000)).isSuccess)
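The two account-closing rules can be sketched end to end as follows. The talk does not show the domain code, so everything here is an assumption made for illustration: the simplified Account, the AccountService.close function returning a Try, and the bounded opening dates (chosen so that adding or subtracting 10000 ms cannot overflow, which unbounded arbitrary dates could).

```scala
import java.util.Date
import scala.util.{Failure, Success, Try}
import org.scalacheck.{Arbitrary, Gen, Properties}
import org.scalacheck.Prop.forAll

// Hypothetical, simplified model; not the talk's actual domain code.
final case class Account(no: String, holder: String,
                         openingDate: Date, closeDate: Option[Date] = None)

object AccountService {
  // Assumed rule: an account cannot be closed before it was opened.
  def close(a: Account, closeDate: Date): Try[Account] =
    if (closeDate.before(a.openingDate))
      Failure(new IllegalArgumentException("close date before opening date"))
    else
      Success(a.copy(closeDate = Some(closeDate)))
}

object AccountSpec extends Properties("Account") {
  import AccountService.close

  // Assumption: opening dates are kept within a bounded range of epoch millis.
  implicit val arbAccount: Arbitrary[Account] = Arbitrary {
    for {
      no <- Gen.oneOf("1", "2", "3")
      nm <- Gen.oneOf("john", "david", "mary")
      od <- Gen.choose(100000L, 1000000000000L).map(new Date(_))
    } yield Account(no, nm, od)
  }

  property("Can close account with close date > opening date") =
    forAll { (a: Account) =>
      close(a, new Date(a.openingDate.getTime + 10000)).isSuccess
    }

  property("Cannot close account with close date < opening date") =
    forAll { (a: Account) =>
      close(a, new Date(a.openingDate.getTime - 10000)).isFailure
    }
}
```

The property names double as documentation of the business rules, which is the point the slide makes.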
Some other features
• Sized generators - restrict the set of generated data
• Conditional generators - use combinators to specify filters
• Classify generated test data to see statistical distribution
• Test case minimization for failed tests
• Stateful testing - not only queries, but commands as well in the CQRS sense of the term
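A small sketch of two of these features, under the assumption that Gen.resize and Prop.classify are the relevant entry points: resize caps the size parameter handed to a sized generator such as Gen.listOf, and classify labels each generated case so that the distribution shows up in the test report. The names FeatureSketch, shortLists and propReverse are hypothetical.

```scala
import org.scalacheck.{Gen, Prop}
import org.scalacheck.Prop.{classify, forAll}

object FeatureSketch {
  // Sized generator: lists of digits with at most 5 elements.
  val shortLists: Gen[List[Int]] = Gen.resize(5, Gen.listOf(Gen.choose(0, 9)))

  // Classify each test case, then check that reversing twice is the identity.
  val propReverse: Prop =
    forAll(shortLists) { xs =>
      classify(xs.isEmpty, "empty", "non-empty") {
        xs.length <= 5 && xs.reverse.reverse == xs
      }
    }
}
```

Calling FeatureSketch.propReverse.check() prints the share of empty and non-empty lists alongside the pass/fail result.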
Essence of property based testing
• Identify constraints & invariants of your abstraction that must hold within your domain model
• Encode them as “properties” in your code (not as dumb documents)
• Execute properties to verify the correctness of your abstraction
• In testing domain models, properties help you think at a higher level of abstraction
• Basically you encode domain rules as properties and verify them using data generators that use the domain model itself
• Executable domain rules
Property based testing & Functional programs
• Functional programs are easier to test & debug (no global state)
• Makes a good case for property based testing (immutable data, pure functions)
• Usually concise & modular - easier to identify properties
Thank You!