Interactive Development with Clojure.spec

clojure.spec provides seamless integration with clojure.test.check's generators. Write a spec, get a functioning generator, and you can use that generator in a REPL as you're developing, or in a generative test.

To explore this, we'll use clojure.spec to specify a scoring function for Codebreaker, a game based on an old game called Bulls and Cows, a predecessor to the board game Mastermind. You might recognize this exercise if you've read The RSpec Book, though this version will be a bit different.

Although I'll explain some things as I go along, I'm going to assume that you're already familiar with the clojure.spec Rationale and Overview and the spec Guide.

If you like, you can follow along by evaluating the forms (in order of appearance - some redef vars that are def'd earlier in the namespace) in codebreaker.clj.

Problem

We want a function that accepts a secret code and a guess, and returns a score for that guess. Codes are made of 4 to 6 colored pegs, selected from six colors: [r]ed, [y]ellow, [g]reen, [c]yan, [b]lack, and [w]hite. The score is based on the number of pegs in the guess that match the secret code. A peg in the guess that matches the color of the peg in the same position in the secret code is considered an exact match, and a peg that matches a peg in a different position in the secret code is considered a loose match.

For example, if the secret code is [:r :y :g :c] and the guess is [:c :y :g :b], the score would be {:codebreaker/exact-matches 2 :codebreaker/loose-matches 1} because :y and :g appear in the same positions and :c appears in a different position.

We want to invoke this fn with two codes and get back a map like the one above, e.g.

(score [:r :y :g :c] [:c :y :g :r])
;; {:codebreaker/exact-matches 2
;;  :codebreaker/loose-matches 2}

Properties and property-based testing

In property-based testing, we make assertions about properties of a function and provide test data generators. The testing tool then generates test data, applies the function to it, and invokes the assertions. Properties are more general than examples in example based tests. For example, rather than writing a test that expresses the example above and asserts that the resulting map looks exactly like the one above, we'd write expressions that express more general properties like:

The return value should be a map with the two keys :codebreaker/exact-matches and :codebreaker/loose-matches
the values should be natural (i.e. non-negative) integers
the sum of the values should be >= 0
the sum of the values should be <= the number of pegs in the secret code
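
Read as plain Clojure, these four properties amount to a single predicate over a secret and a candidate score. The sketch below is only an illustration: plausible-score? is a hypothetical name that does not appear in codebreaker.clj, it assumes the map holds no keys other than the two listed, and it uses nothing but clojure.core (nat-int? requires Clojure 1.9 or later).

```clojure
;; Hypothetical helper, for illustration only: the four properties as one predicate.
(defn plausible-score? [secret score]
  (and (map? score)
       ;; the two expected keys (assumption: no other keys are present)
       (= #{:codebreaker/exact-matches :codebreaker/loose-matches}
          (set (keys score)))
       ;; the values are natural (non-negative) integers
       (every? nat-int? (vals score))
       ;; the sum lies between 0 and the number of pegs in the secret
       (<= 0 (apply + (vals score)) (count secret))))

(plausible-score? [:r :y :g :c]
                  {:codebreaker/exact-matches 2 :codebreaker/loose-matches 2})
;; => true

(plausible-score? [:r :y :g :c]
                  {:codebreaker/exact-matches 4 :codebreaker/loose-matches 3})
;; => false
```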

We'll express all of these properties using clojure.spec, and we're also going to describe the arguments to the function using the same tooling. This is one way in which clojure.spec departs from other property-based testing tools.

First, in English:

there are two arguments
the arguments should both be codes
- a code is a sequence of 4 to 6 colored pegs
- the available colors are red, yellow, green, cyan, black, and white, represented by :r, :y, :g, :c, :b, and :w
- a code may contain duplicates
the two codes should be of equal length

:args

We have a few more questions to answer, but that's enough to get started, which we'll do with a function spec. We'll start with just the spec for the arguments, and a couple of supporting definitions.

(ns codebreaker
  (:require [clojure.spec :as s]
            [clojure.spec.test :as stest]))

(def peg? #{:y :g :r :c :w :b})
(s/def ::code (s/coll-of peg? :min-count 4 :max-count 6))
(s/fdef score
        :args (s/cat :secret ::code :guess ::code))

Now we can exercise the :args spec:

(s/exercise (:args (s/get-spec `score)))
;; ([([:y :w :g :y :c] [:c :g :y :y :y :c]) {:secret [:y :w :g :y :c], :guess [:c :g :y :y :y :c]}]
;;  [([:c :w :g :r :g] [:r :b :c :r :w :g]) {:secret [:c :w :g :r :g], :guess [:r :b :c :r :w :g]}]
;;  ...
;;  [([:r :c :w :w :y :r] [:y :r :y :y :c]) {:secret [:r :c :w :w :y :r], :guess [:y :r :y :y :c]}]
;;  [([:c :g :b :g :w :b] [:r :y :w :r :b]) {:secret [:c :g :b :g :w :b], :guess [:r :y :w :r :b]}])

s/exercise returns a collection of tuples of a value generated by the generator associated with the spec, and the same value conformed by the spec. This can give us a lot of confidence that the spec expresses what we think it does and that the generator produces the values we expect. The generator we get for free from the :args spec is sufficient, so we don't need to explicitly define one.

The output reveals that we didn't yet specify one of the properties described earlier: the two codes should be of equal length. So let's specify that:

(s/fdef score
        :args (s/and (s/cat :secret ::code :guess ::code)
                     (fn [{:keys [secret guess]}]
                       (= (count secret) (count guess)))))

(s/exercise (:args (s/get-spec `score)))
;; ([([:w :w :y :b :c] [:c :b :y :w :c]) {:secret [:w :w :y :b :c], :guess [:c :b :y :w :c]}]
;;  [([:y :g :b :r :b] [:b :y :r :b :r]) {:secret [:y :g :b :r :b], :guess [:b :y :r :b :r]}]
;;  ...
;;  [([:y :w :g :w :g] [:y :c :c :y :y]) {:secret [:y :w :g :w :g], :guess [:y :c :c :y :y]}]
;;  [([:b :w :c :r :c :w] [:b :g :r :y :y :g]) {:secret [:b :w :c :r :c :w], :guess [:b :g :r :y :y :g]}])

:ret

Now the :args spec represents the properties we laid out above, so let's move on to the :ret spec.

(s/def ::exact-matches nat-int?)
(s/def ::loose-matches nat-int?)

(s/fdef score
        :args (s/and (s/cat :secret ::code :guess ::code)
                     (fn [{:keys [secret guess]}]
                       (= (count secret) (count guess))))
        :ret (s/keys :req [::exact-matches ::loose-matches]))

(s/exercise (:ret (s/get-spec `score)))
;; ([#:codebreaker{:exact-matches 0, :loose-matches 1}
;;   #:codebreaker{:exact-matches 0, :loose-matches 1}]
;;  [#:codebreaker{:exact-matches 1, :loose-matches 1}
;;   #:codebreaker{:exact-matches 1, :loose-matches 1}]
;;  [#:codebreaker{:exact-matches 0, :loose-matches 1}
;;   #:codebreaker{:exact-matches 0, :loose-matches 1}]
;;  ...
;;  [#:codebreaker{:exact-matches 2, :loose-matches 6}
;;   #:codebreaker{:exact-matches 2, :loose-matches 6}]
;;  [#:codebreaker{:exact-matches 8, :loose-matches 0}
;;   #:codebreaker{:exact-matches 8, :loose-matches 0}])

Again, we see tuples of a generated value and the same value conformed by the spec. And here we see that the map keys are correct but the map values may exceed the number of pegs in the code, violating one of the properties we laid out earlier: the sum of the values in the returned map should be between 0 and the count of either of the codes. The values generated by the :ret spec are always >= 0 because they are spec'd with nat-int?, and their sum is therefore always >= 0, but we can't specify that the sum is <= the number of pegs without knowing the number of pegs, and that information is in the :args spec, which is not exposed to the :ret spec.

:fn

For relationships between :args and :ret values, we use a :fn spec:

(s/fdef score
        :args (s/and (s/cat :secret ::code :guess ::code)
                     (fn [{:keys [secret guess]}]
                       (= (count secret) (count guess))))
        :ret (s/keys :req [::exact-matches ::loose-matches])
        :fn (fn [{{secret :secret} :args ret :ret}]
              (<= 0 (apply + (vals ret)) (count secret))))

Here we're choosing to explicitly specify that the sum of the values is >= 0 even though it's already specified implicitly by the nat-int? predicates we used to specify the values in the returned map. This is not necessary, but it does a better job of expressing the properties we described earlier.
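
The nested destructuring is easier to follow when the predicate is applied by hand to a map of the shape a :fn spec receives, with the conformed arguments under :args and the return value under :ret. In this sketch sum-within-peg-count? is a hypothetical name for the same predicate, and the maps are written by hand rather than generated.

```clojure
;; Hypothetical name; the body is the :fn predicate, with the secret pulled
;; out of the conformed :args map and the return value bound to ret.
(def sum-within-peg-count?
  (fn [{{secret :secret} :args ret :ret}]
    (<= 0 (apply + (vals ret)) (count secret))))

;; 2 + 2 = 4, which does not exceed the 4 pegs in the secret.
(sum-within-peg-count?
 {:args {:secret [:r :y :g :c] :guess [:c :y :g :r]}
  :ret  {:codebreaker/exact-matches 2 :codebreaker/loose-matches 2}})
;; => true

;; 4 + 3 = 7, which exceeds the 4 pegs in the secret.
(sum-within-peg-count?
 {:args {:secret [:r :y :g :c] :guess [:c :y :g :r]}
  :ret  {:codebreaker/exact-matches 4 :codebreaker/loose-matches 3}})
;; => false
```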

So now we have specs for the :args, the :ret value, and the relationship between them (in the :fn spec). So we're done, right? Well, not quite. We still don't have a function!

Wire it up

We need a function to tie it all together. Here's a skeletal implementation:

(defn score [secret guess]
  {::exact-matches 0
   ::loose-matches 0})

(s/exercise-fn `score)
;; ([([:g :w :w :c :y :g] [:b :c :g :w :c :w]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 0}]
;;  [([:y :r :w :c :b] [:b :b :c :r :c]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 0}]
;;  ...
;;  [([:r :w :r :y :r :r] [:g :w :g :y :r :r]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 0}]
;;  [([:y :c :r :c] [:y :r :g :c]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 0}])

This is incomplete, but we can see that the generated args and the function's return value match the specs, including the :fn spec. So now let's use clojure.spec's test.check wrapper to actually test the function:

(stest/check `score)
;; ...
;; :clojure.spec.test.check/ret {:result true, :num-tests 1000, :seed 1471029622166},

This happens to pass because the 0 values conform to the ::exact-matches and ::loose-matches specs, and their sum satisfies the :fn spec. We can validate that the test is actually testing what we think it is by providing hard-coded values that would not conform to the spec, e.g.

(defn score [secret guess]
  {::exact-matches 4
   ::loose-matches 3})
(s/exercise-fn `score)
;; ([([:y :w :b :g :g] [:c :c :r :r :y]) {:codebreaker/exact-matches 4, :codebreaker/loose-matches 3}]
;;  [([:y :y :b :g :c] [:r :b :w :y :g]) {:codebreaker/exact-matches 4, :codebreaker/loose-matches 3}]
;;  ...
;;  [([:r :w :y :g] [:y :r :y :g]) {:codebreaker/exact-matches 4, :codebreaker/loose-matches 3}])
(stest/check `score)
;; :clojure.spec.test.check/ret {:result #error {
;;  :cause "Specification-based check failed"
;;  :data {:clojure.spec/problems
;;         [{:path [:fn]
;;           :pred (fn [{{secret :secret} :args, ret :ret}]
;;                   (<= 0 (apply + (vals ret)) (count secret)))
;;           :val {:args {:secret [:w :b :w :w :r :g]
;;                        :guess  [:c :c :b :c :r :c]}
;;           :ret #:codebreaker{:exact-matches 4, :loose-matches 3}}
;;           :via []
;;           :in []}]
;;  :clojure.spec.test/args ([:w :b :w :w :r :g] [:c :c :b :c :r :c])
;;  :clojure.spec.test/val {:args {:secret [:w :b :w :w :r :g]
;;                                 :guess  [:c :c :b :c :r :c]}
;;                          :ret #:codebreaker{:exact-matches 4, :loose-matches 3}}
;;  :clojure.spec/failure :check-failed}

Now we know that everything's wired up correctly, and we can start to flesh out the solution.

The approach we'll take is to calculate all of the matches, ignoring position, and then the exact matches, and then subtract the exact matches from all matches to calculate the number of loose matches. For example, given the secret [:r :g :y :c] and the guess [:g :w :b :c], there are 2 matches altogether, :g and :c, so the count of all matches would be 2. One of those, :c, is an exact match, so exact matches would be 1, leaving 1 loose match.

exact-matches

The exact match calculation is easy to imagine: we need to compare each peg in the guess to the peg in the same position in the secret:

(defn score [secret guess]
  {::exact-matches (count (filter true? (map = secret guess)))
   ::loose-matches 0})
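
For the secret and guess from the example above, the intermediate values look like this (a sketch using only clojure.core, with the two codes written inline):

```clojure
;; Compare the pegs position by position.
(map = [:r :g :y :c] [:g :w :b :c])
;; => (false false false true)

;; Keep only the positions that matched, then count them.
(count (filter true? (map = [:r :g :y :c] [:g :w :b :c])))
;; => 1
```

Note that map stops at the end of the shorter collection, so codes of unequal length would be compared without complaint. The equal-length constraint in the :args spec is what rules that case out.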

Let's exercise that and see what we get:

(s/exercise-fn `score)
;; ([([:r :c :r :r :r :y] [:r :y :r :c :b :y]) {:codebreaker/exact-matches 3, :codebreaker/loose-matches 0}]
;;  [([:w :c :c :b] [:c :b :b :y]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 0}]
;;  ...
;;  [([:w :b :b :w :c :c] [:g :g :b :r :c :g]) {:codebreaker/exact-matches 2, :codebreaker/loose-matches 0}]
;;  [([:r :y :w :r] [:c :b :g :r]) {:codebreaker/exact-matches 1, :codebreaker/loose-matches 0}])

And here we can see the incredible value we get from generators! In this one sample set we got at least one case each of 0, 1, 2, and 3 exact matches. This is not guaranteed, of course. We got lucky! And the next time we run it we'll get different combinations that may or may not be as lucky. But this is far easier than imagining different scenarios, and the result is an arguably more effective evaluation of the function we've written.

You can easily scan these examples visually and validate that they're all producing the correct result for :codebreaker/exact-matches. We can run stest/check again and see that we're still passing:

(stest/check `score)
;;  ...
;;  :clojure.spec.test.check/ret {:result true, :num-tests 1000, :seed 1471031057455}

One thing that we don't have, however, is a specification for the exact-matches function. In fact, we don't even have an exact matches function, so let's extract it:

(defn exact-matches [secret guess]
  (count (filter true? (map = secret guess))))

(defn score [secret guess]
  {::exact-matches (exact-matches secret guess)
   ::loose-matches 0})

Now we can add a spec for it. It has the same args as score, so let's extract the :args spec to something we can share:

(s/def ::secret-and-guess (s/and (s/cat :secret ::code :guess ::code)
                                 (fn [{:keys [secret guess]}]
                                   (= (count secret) (count guess)))))
(s/fdef score
        :args ::secret-and-guess
        :ret (s/keys :req [::exact-matches ::loose-matches])
        :fn (fn [{{secret :secret} :args ret :ret}]
              (<= 0 (apply + (vals ret)) (count secret))))

And now we can use that ::secret-and-guess spec for our exact-matches spec:

(s/fdef exact-matches
        :args ::secret-and-guess
        :ret nat-int?
        :fn (fn [{{secret :secret} :args ret :ret}]
              (<= 0 ret (count secret))))

This is quite similar to the score spec, but the :ret is a single nat-int? between 0 and the count of pegs in the secret. So let's exercise this:

(s/exercise-fn `exact-matches)
;; ([([:c :w :g :r :c] [:w :b :r :y :b]) 0]
;;  [([:r :w :w :b :c] [:y :r :w :b :r]) 2]
;;  ...
;;  [([:c :c :g :y :y :y] [:g :c :g :y :b :c]) 3]
;;  [([:w :r :c :c :g] [:b :w :c :w :y]) 1])

Again, we can scan the results to validate them visually, and then run stest/check to validate the results against the :fn spec.

(stest/check `exact-matches)
;; :clojure.spec.test.check/ret {:result true, :num-tests 1000, :seed 1471041377249}

And now we can start tying things together by instrumenting exact-matches and exercising and testing score. stest/instrument wraps a fn in a fn that checks args for conformance to the :args spec before delegating to the original fn:

(stest/instrument `exact-matches)
(s/exercise-fn `score)
;; ([([:r :c :w :b :y :g] [:g :r :w :c :r :g]) {:codebreaker/exact-matches 2, :codebreaker/loose-matches 0}]
;;  ...
;;  [([:y :r :w :b :w :r] [:y :r :y :g :c :w]) {:codebreaker/exact-matches 2, :codebreaker/loose-matches 0}])
(stest/check `score)
;; :clojure.spec.test.check/ret {:result true, :num-tests 1000, :seed 1471041098588}

Everything still passes because score is invoking exact-matches correctly, but it would report incorrect calls to exact-matches if we had a bug in score:

(defn score [secret guess]
  {::exact-matches (exact-matches secret (take 3 guess))
   ::loose-matches 0})
(s/exercise-fn `score)
;; 1. Unhandled clojure.lang.ExceptionInfo
;;    Call to #'codebreaker/exact-matches did not conform to spec:
;; ...
;;    {:clojure.spec/problems
;;     [{:path [:args :guess],
;;       :pred (clojure.core/<= 4 (clojure.core/count %) 6),
;;       :val (:w :g :y),
;;       :via [:codebreaker/code :codebreaker/code],
;;       :in [1]}],
;;     :clojure.spec/args ([:r :g :y :c :w] (:w :g :y)),
;;     :clojure.spec/failure :instrument,
;;     :clojure.spec.test/caller
;;     {:file "form-init3693047879084126402.clj",
;;      :line 122,
;;      :var-scope codebreaker/score}}

This is a really great way to uncover problems below the surface. It is similar to a mock object in that incorrect calls to exact-matches throw errors, but different in that nothing happens when there are no calls to exact-matches. Still, we're in an interactive session here, and we can see that score is calling exact-matches. Gray box testing FTW!

Also note that we have property based tests for both functions, with no specific examples codified. More on this later!

If you're following along in a REPL, don't forget to fix the bug by restoring the score fn:

(defn score [secret guess]
  {::exact-matches (exact-matches secret guess)
   ::loose-matches 0})

all-matches

Thinking of a function to calculate all of the matches, it would have the same properties as the exact-matches function: it takes a pair of codes and returns a nat-int? between 0 and the count of pegs in either of the codes. We already have a spec for that, so let's generalize its name, and then we can use it for exact-matches and all-matches:

(s/fdef match-count
        :args ::secret-and-guess
        :ret nat-int?
        :fn (fn [{{secret :secret} :args ret :ret}]
              (<= 0 ret (count secret))))

(s/exercise-fn `exact-matches 10 (s/get-spec `match-count))
(stest/check-fn exact-matches (s/get-spec `match-count))

These calls to s/exercise-fn and stest/check-fn produce similar results to those above. Now we can instrument exact-matches with the match-count spec and exercise and test score as we did earlier as well:

(stest/instrument `exact-matches {:spec {`exact-matches (s/get-spec `match-count)}})

(s/exercise-fn `score)
(stest/check `score)

So now comes the hard part: the all-matches calculation. We need to allow for duplicates, so we can count the number of appearances of e.g. :r in the secret and the guess and take the lower of the two numbers. Then we can do the same for all the colors and add up the resulting counts. For example, with a secret [:r :r :y :b] and a guess [:r :g :c :y], we can see that :r appears twice in the secret and once in the guess, so the score for :r would be 1. Yellow appears once in each, so its score is 1. Neither :g nor :c appears in the secret, so we don't count those and the total is 2. Make sense? Here's one way to express that:

(defn all-matches [secret guess]
  (apply + (vals (merge-with min
                             (select-keys (frequencies secret) guess)
                             (select-keys (frequencies guess) secret)))))
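
To make the steps concrete, here is a sketch of the intermediate values for the secret [:r :r :y :b] and the guess [:r :g :c :y] from the example above. It evaluates the same clojure.core calls one at a time, with the codes written inline.

```clojure
;; Count how often each color appears in the secret.
(frequencies [:r :r :y :b])
;; => {:r 2, :y 1, :b 1}

;; Keep only the colors that also appear in the other code.
(select-keys (frequencies [:r :r :y :b]) [:r :g :c :y])
;; => {:r 2, :y 1}
(select-keys (frequencies [:r :g :c :y]) [:r :r :y :b])
;; => {:r 1, :y 1}

;; For each shared color take the lower of the two counts, then add them up.
(merge-with min {:r 2, :y 1} {:r 1, :y 1})
;; => {:r 1, :y 1}
(apply + (vals {:r 1, :y 1}))
;; => 2
```

When the two codes share no colors, both select-keys calls return an empty map, vals returns nil, and (apply + nil) is 0, so the result is still a natural integer.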

(s/exercise-fn `all-matches 10 (s/get-spec `match-count))
;; ([([:b :y :g :b :c] [:y :b :y :c :c]) 3]
;;  [([:r :r :c :r] [:w :w :c :b]) 1]
;;  [([:g :y :g :w :c] [:g :b :w :y :g]) 4]
;;  [([:r :g :c :r :r] [:w :g :c :r :g]) 3]
;;  [([:g :w :b :w :b :c] [:b :r :c :c :c :b]) 3]
;;  [([:y :w :b :b :c :y] [:w :c :r :y :w :c]) 3]
;;  [([:r :w :w :w :y :c] [:y :w :w :w :g :y]) 4]
;;  [([:b :w :c :r :w :b] [:c :r :r :c :r :g]) 2]
;;  [([:c :y :b :g] [:y :g :g :b]) 3]
;;  [([:b :r :w :g :g :c] [:g :r :w :c :g :w]) 5])

All together now

Scanning the output of s/exercise-fn, we can see we got it right. So now let's put that to work in score:

(defn score [secret guess]
  (let [exact (exact-matches secret guess)
        all   (all-matches secret guess)]
    {::exact-matches exact
     ::loose-matches (- all exact)}))

(stest/instrument [`exact-matches `all-matches]
                  {:spec {`exact-matches (s/get-spec `match-count)
                          `all-matches   (s/get-spec `match-count)}})

(s/exercise-fn `score)
;; ([([:g :b :w :b :w] [:b :b :w :y :w]) {:codebreaker/exact-matches 3, :codebreaker/loose-matches 1}]
;;  [([:c :w :r :r :r] [:g :w :b :g :b]) {:codebreaker/exact-matches 1, :codebreaker/loose-matches 0}]
;;  [([:g :b :y :r :b] [:c :w :w :g :y]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 2}]
;;  [([:c :w :b :b :y :y] [:w :g :c :r :w :g]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 2}]
;;  [([:g :b :c :w :b :w] [:g :y :c :y :y :r]) {:codebreaker/exact-matches 2, :codebreaker/loose-matches 0}]
;;  [([:g :b :g :w] [:y :c :c :g]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 1}]
;;  [([:y :r :w :r :g] [:y :r :g :y :w]) {:codebreaker/exact-matches 2, :codebreaker/loose-matches 2}]
;;  [([:r :y :w :g :r :r] [:y :b :y :y :y :r]) {:codebreaker/exact-matches 1, :codebreaker/loose-matches 1}]
;;  [([:r :y :y :r] [:g :y :c :r]) {:codebreaker/exact-matches 2, :codebreaker/loose-matches 0}]
;;  [([:c :c :b :y :b] [:w :g :r :b :c]) {:codebreaker/exact-matches 0, :codebreaker/loose-matches 2}])
(stest/check `score)
;; :clojure.spec.test.check/ret {:result true, :num-tests 1000, :seed 1471043045446}

We can see that we're calculating the exact and loose matches correctly, and the tests all pass. So now we can memorialize these generative tests in a repeatable automated test.

Testing, testing, 1, 2, 3 ...

We can get a summary of the test results:

(stest/summarize-results (stest/check 'codebreaker/score))
;; {:total 1, :check-passed 1}

If there were failures the result would look like this instead:

;; {:total 1, :check-failed 1}

Then we can build predicates around that like:

#(= (:total %) (:check-passed %))
;; or
#(not (contains? % :check-failed))

And then we can hook those into assertions in clojure.test or whatever tool you prefer for repeatable tests.
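
As a minimal sketch of that hook using clojure.test: all-checks-passed? is a hypothetical name for the first predicate above, and literal summary maps stand in for the result of stest/summarize-results so that the example runs without the codebreaker namespace loaded.

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Hypothetical helper name; the body is the first predicate shown above.
(defn all-checks-passed? [summary]
  (= (:total summary) (:check-passed summary)))

(deftest summary-predicate-test
  ;; In a real suite the argument would be
  ;; (stest/summarize-results (stest/check `codebreaker/score)).
  ;; Literal summaries are used here so the sketch is self-contained.
  (is (all-checks-passed? {:total 1, :check-passed 1}))
  (is (not (all-checks-passed? {:total 1, :check-failed 1}))))

(run-tests)
```

The two predicates are not interchangeable for a summary such as {:total 0}: the first returns false because there is no :check-passed count, while the second returns true because nothing failed. The first is the stricter choice if a test should fail when nothing was checked.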

Note that there are no example based tests here: everything is being generated by generators built for us by clojure.spec. If this makes you uncomfortable, then add a couple of example based tests! But you certainly don't need an exhaustive set of them. The handful of functions and specs are all small, expressive, and easy for any experienced Clojure developer to read and understand, and the generative, property-based tests that we get for score, exact-matches, and all-matches provide a lot of confidence that they are all working correctly.

Experience report

I've been using TDD in some form or another for many years, and when clojure.spec appeared I was curious to see how it would fit into or change my approach. I won't go as far as to say that this post represents "my approach" as that is still evolving in light of the presence of spec, but there are clear similarities to and differences from TDD in this example.

Like TDD, there is a tight feedback loop: write a spec and exercise it right away, all before any implementation code. The spec itself is not a test, but it is a reusable source for generated sample data that we can use in an interactive REPL session and in a repeatable test.

Like TDD, I did some refactoring as I discovered opportunities to improve the code. Sometimes I used visual inspection of the result of exercising specs and sometimes I used test.check to cast a wider net.

Unlike TDD, I didn't go through a consistent cycle of watching a test fail and then making it pass, and then refactoring (red/green/refactor). You can use these tools for that cycle, but generative tests are, by design, more coarse than example based tests, so it might be more of a challenge to keep that very granular cycle consistent. Perhaps a subject for another post (perhaps written by you!).

Unlike example based tests in TDD, generation allowed me to quickly spot-check dozens of correct invocations without having to hand write them or wire them to assertions.

Unlike TDD, generation can sometimes find categories of inputs that we've failed to consider. That didn't happen in this example, but if you've ever forgotten to account for nil or empty string in tests for a string processing function, then you know what I'm talking about.

In summary, I think that clojure.spec provides a powerful set of tools for interactive development that should appeal to anybody developing at the REPL, regardless of your particular process.