TDD: optimizing for inputs vs for outputs
Gautier DI FOLCO November 22, 2022 [Code practice] #haskell #code kata #coding dojo

There is a famous code kata used during the Global Day of Coderetreat (GDCR) called Conway's Game of Life.
It has four rules:
- Any live cell with fewer than two live neighbours dies, as if by underpopulation
- Any live cell with two or three live neighbours lives on to the next generation
- Any live cell with more than three live neighbours dies, as if by overpopulation
- Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction
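As a point of reference, the four rules can be transcribed almost word for word into a single function. This is only a minimal sketch under stated assumptions: the name `step` and the parameter `liveNeighbours` are hypothetical, and the neighbour count is assumed to be a plain `Int` between 0 and 8 computed elsewhere.

```haskell
-- A cell is either alive or dead.
data Cell = Alive | Dead
  deriving (Eq, Show)

-- Hypothetical helper: the next state of one cell, given its number
-- of live neighbours (assumed to be counted by the caller).
step :: Cell -> Int -> Cell
step Alive liveNeighbours
  | liveNeighbours < 2 = Dead    -- rule 1: underpopulation
  | liveNeighbours <= 3 = Alive  -- rule 2: lives on with two or three
  | otherwise = Dead             -- rule 3: overpopulation
step Dead liveNeighbours
  | liveNeighbours == 3 = Alive  -- rule 4: reproduction
  | otherwise = Dead             -- no rule applies: the cell stays dead
```

Loaded in GHCi, `step Dead 3` evaluates to `Alive` and `step Alive 4` to `Dead`.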
I have a love-hate relationship with this code kata.
On the positive side:
- It was one of the first code katas I seriously worked on (especially during my first GDCR in 2012)
- It is a simple and yet challenging kata
- It shaped my TDD style (inside-out)
On the negative side:
- I have overdone it (more than 120 times)
- When I do it, I tend to be impatient and prescriptive
I'll use it to illustrate my point: there are two ways to optimize a piece of code in TDD, for outputs or for inputs.
Let's start with optimizing for outputs:
I always start by implementing rule 4, focusing on the cell (not the grid):
spec = do
  describe "Game of life" $ do
    it "Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction" $
      nextGen Dead 3 `shouldBe` Alive

data Cell
  = Alive
  | Dead
  deriving stock (Eq, Show)

newtype Neighbours
  = Neighbours { getNeighbours :: Int }
  deriving newtype (Eq, Ord, Show, Num)

nextGen _ _ = Alive
Then I implement rule 3:
-- ...
    it "Any live cell with more than three live neighbours dies, as if by overpopulation" $
      nextGen Alive 4 `shouldBe` Dead
-- ...
nextGen _ =
  \case
    3 -> Alive
    _ -> Dead
Then rule 2:
-- ...
    it "Any live cell with two live neighbours lives on to the next generation" $
      nextGen Alive 2 `shouldBe` Alive
    it "Any live cell with three live neighbours lives on to the next generation" $
      nextGen Alive 3 `shouldBe` Alive
-- ...
nextGen _ =
  \case
    2 -> Alive
    3 -> Alive
    _ -> Dead
At this point, my implementation is incorrect: a dead cell with exactly two live neighbours becomes alive, although a dead cell with two or fewer live neighbours should stay dead.
Moreover, in pure TDD style, I cannot write a failing test for rule 1, because the current code already makes it pass, so I add a few "business" tests:
-- ...
    it "Any Dead cell with fewer than three (2) live neighbours stays dead on to the next generation" $
      nextGen Dead 2 `shouldBe` Dead
    it "Any live cell with fewer than two (1) live neighbours dies, as if by underpopulation" $
      nextGen Alive 1 `shouldBe` Dead
    it "Any live cell with fewer than two (0) live neighbours dies, as if by underpopulation" $
      nextGen Alive 0 `shouldBe` Dead
-- ...
nextGen x =
  \case
    2 -> x
    3 -> Alive
    _ -> Dead
That's what I call optimizing for outputs: the implementation does not matter as long as it returns the correct values, so the production code can omit the business rules (which live only in the tests).
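To make this concrete, here is a self-contained sketch of that final `nextGen`, with two simplifications that are assumptions of the sketch: the neighbour count is a plain `Int` rather than the `Neighbours` newtype, and `case ... of` replaces `\case` so that no language extension is needed. The helper `allOutputs` is hypothetical and only enumerates every result for 0 to 8 live neighbours. None of the four rule names appears in the function itself.

```haskell
data Cell
  = Alive
  | Dead
  deriving (Eq, Show)

-- Same branches as the final version above, on a plain Int
-- (a simplification made for this sketch).
nextGen :: Cell -> Int -> Cell
nextGen x n =
  case n of
    2 -> x      -- two live neighbours: the cell keeps its state
    3 -> Alive  -- three live neighbours: the cell is alive
    _ -> Dead   -- any other count: the cell is dead

-- Hypothetical helper: every (cell, neighbours, result) triple.
allOutputs :: [(Cell, Int, Cell)]
allOutputs =
  [ (cell, n, nextGen cell n)
  | cell <- [Alive, Dead]
  , n <- [0 .. 8]
  ]
```

Reading `allOutputs` is the only way to check the function against the rules: the outputs are right, but which rule produced each of them is visible only in the test descriptions.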
On the other hand, there is what I call optimizing for inputs:
Let's start with rule 4:
-- ...
  describe "Reproduction (three live neighbours)" $ do
    it "Any dead cell with exactly three live neighbours becomes a live cell" $
      reproduction.next Dead `shouldBe` Alive
    it "Any live cell with three live neighbours lives on to the next generation" $
      reproduction.next Alive `shouldBe` Alive
-- ...
newtype Neighbourhood
  = Neighbourhood { next :: Cell -> Cell }

reproduction = Neighbourhood $ const Alive
Then rule 3:
-- ...
  describe "Overpopulation (more than three live neighbours)" $ do
    it "Any live cell with more than three live neighbours dies" $
      overpopulation.next Alive `shouldBe` Dead
-- ...
overpopulation = Neighbourhood $ const Dead
Then rule 2:
-- ...
  describe "Survive (two live neighbours)" $ do
    it "Any Dead cell with fewer than three live neighbours stays dead on to the next generation" $
      survive.next Dead `shouldBe` Dead
    it "Any live cell with two live neighbours lives on to the next generation" $
      survive.next Alive `shouldBe` Alive
-- ...
survive = Neighbourhood id
And finally rule 1:
-- ...
  describe "Underpopulation (zero or one live neighbours)" $ do
    it "Any Dead cell with fewer than three live neighbours stays dead on to the next generation" $
      underpopulation.next Dead `shouldBe` Dead
    it "Any live cell with fewer than two live neighbours dies" $
      underpopulation.next Alive `shouldBe` Dead
-- ...
underpopulation = Neighbourhood $ const Dead
Of course, it takes effort to talk with the business experts and structure a shared understanding of the problem, but at least each rule is clearly defined in the production code, and not only in the tests.
Moreover, each rule is independent.
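The snippets above leave open how a live-neighbour count selects one of the four values. A minimal sketch, assuming a hypothetical `neighbourhoodFor` function (not in the original code) that makes that single decision with guards and a plain `Int` count. Here `next` is applied as an ordinary field accessor instead of the record-dot syntax, so the sketch needs no language extension.

```haskell
data Cell = Alive | Dead
  deriving (Eq, Show)

-- One value per rule: each wraps how a cell evolves under that rule.
newtype Neighbourhood = Neighbourhood { next :: Cell -> Cell }

underpopulation, survive, reproduction, overpopulation :: Neighbourhood
underpopulation = Neighbourhood (const Dead)
survive = Neighbourhood id
reproduction = Neighbourhood (const Alive)
overpopulation = Neighbourhood (const Dead)

-- Hypothetical: classify a live-neighbour count into its rule.
neighbourhoodFor :: Int -> Neighbourhood
neighbourhoodFor n
  | n < 2 = underpopulation
  | n == 2 = survive
  | n == 3 = reproduction
  | otherwise = overpopulation

-- Next state of a cell: pick the rule from the count, then apply it.
nextGen :: Cell -> Int -> Cell
nextGen cell n = next (neighbourhoodFor n) cell
```

In this sketch the only conditional logic left is the classification of the count; the behaviour of each rule stays in its own named value.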
As a bonus, that's the only approach I know that yields an implementation without conditions or pattern matching, but that's the subject of another log.