Rifle-Oriented Programming with Clojure

Any comparison of hot JVM languages is likely to note that “Clojure is not object-oriented.” This is true, but it may lead you to the wrong conclusions. It’s a little like saying that a rifle is not arrow-oriented. In this article, you will see some of the ways that Clojure addresses the key concerns of OO: encapsulation, polymorphism, and inheritance.

This is a whirlwind tour, and we won't have time to cover the full details of all the Clojure code you will see. When we are done, I hope you will decide to explore for yourself. You can download and start using Clojure by following the instructions on the getting started page.

Just Enough Clojure Syntax

Clojure has vectors, which are accessed by integer indexes:

  [1 2 3 4]
  -> [1 2 3 4]

  (get [:a :b :c :d :e] 2)
  -> :c

In the preceding example, the initial `[1 2 3 4]` is input that you enter at the Read-Eval-Print Loop (REPL). The `->` indicates the response from the REPL.

Clojure has maps, which are key/value collections:

  {:fname "Stu", :lname "Halloway"}
  -> {:fname "Stu", :lname "Halloway"}

Sets are collections of unique values, and their literal form is prefixed with a hash (`#`). Here is the set of English vowels, using backslash to introduce a character literal:

  #{\a \e \i \o \u}
  -> #{\a \e \i \o \u}

Lists are singly linked and are enclosed in parentheses. Lists are special: not only are they data, they also act as the syntax for invoking functions. The list below invokes the plus (`+`) function:

  (+ 1 2 3 4 5)
  -> 15

Collections such as vectors and maps themselves act as functions. They take one argument, which is the index or key to look up:

  ([:a :b :c :d :e] 2)
  -> :c

  ({:name "Stu" :ext 101} :name)
  -> "Stu"
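
Keywords and sets can be called in the same way, which matters later in this article because forms like `(:first-name p)` rely on it. Here is a minimal sketch; the names `person` and `vowels` are only for illustration:

```clojure
;; Illustrative names; not part of the original article.
(def person {:fname "Stu" :lname "Halloway"})
(def vowels #{\a \e \i \o \u})

;; A keyword is a function that looks itself up in a map.
(:fname person)
;; -> "Stu"

;; A set is a function that returns the element if present, else nil.
(vowels \e)
;; -> \e

(vowels \z)
;; -> nil
```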

Enough syntax, let's get started.

Encapsulation

Encapsulation is the hiding of implementation details so that clients of your code do not accidentally become dependent on them. In object-oriented languages, this is usually done at the class level. A class has public methods, private implementation details, and various other scopes in between.

Clojure accomplishes the purposes of encapsulation in three ways: closures, namespaces, and immutability.

Closures

A closure closes over (remembers) the environment at the time it was created. For example, the function `make-counter` below closes over the initial value passed via `init-val`:

  (defn make-counter [init-val] 
    (let [c (atom init-val)] #(swap! c inc)))

Let’s break this down:

- `defn` defines a new function, named `make-counter`, that takes a single argument `init-val`.
- The `let` binds the name `c` to a new `atom`.
- The `atom` creates a threadsafe, deadlock-proof mutable reference to a value.
- The octothorpe (`#`) prefix introduces an anonymous function.
- The call to `swap!` updates the value referenced by `c` by calling `inc` on it.
- The value of the `let` is the value of its last expression. This `let` returns a function that increments a counter, which is then the return value of `make-counter`.

The atom `c` is private to the function returned by `make-counter`. The only public thing you can do is increment it by one:

  (def c (make-counter 0))
  -> #'user/c

  (c)
  -> 1

  (c)
  -> 2

  (c)
  -> 3
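
Each call to `make-counter` creates a fresh atom, so two counters never share state. A small sketch of that, with `a` and `b` as illustrative names:

```clojure
(defn make-counter [init-val]
  (let [c (atom init-val)] #(swap! c inc)))

;; Two counters, each closing over its own private atom.
(def a (make-counter 0))
(def b (make-counter 100))

(a) ;; -> 1
(a) ;; -> 2
(b) ;; -> 101 (unaffected by the calls to a)
```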

The counter example returned a single function, but nothing stops you from returning multiple functions. These multiple functions can then share private state. The new version of `make-counter` below returns two functions: one to increment the counter, and one to reset it.

  (defn make-counter [init-val] 
    (let [c (atom init-val)] 
      {:next #(swap! c inc)
       :reset #(reset! c init-val)}))

This new `make-counter` returns a map whose `:next` value increments the counter, and whose `:reset` value resets it:

  (def c (make-counter 10))
  -> #'user/c

  ((c :next))
  -> 11

  ((c :next))
  -> 12

  ((c :reset))
  -> 10

Why the double parentheses above? There are two function calls: the inner call looks up the appropriate function, and the outer one calls it.

Closing over data is far more general than the simplistic model offered by private, protected, public, friend, et al. in OO languages. By combining multiple lets and multiple return values from a function, you can create arbitrary encapsulation strategies.
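
As one possible sketch of such a strategy, the hypothetical `make-account` below keeps two private atoms and exposes a read-only view of each, plus a single validated way to change them. The function name and the map keys are assumptions made for this example:

```clojure
;; Hypothetical example: make-account and its keys are illustrative names.
(defn make-account [opening-balance]
  (let [balance (atom opening-balance) ; private, shared by the functions below
        history (atom [])]             ; private log of accepted deposits
    {:balance #(deref balance)
     :history #(deref history)
     :deposit (fn [amount]
                ;; Reject non-positive amounts; otherwise record and apply.
                (when (pos? amount)
                  (swap! history conj amount)
                  (swap! balance + amount)))}))

(def acct (make-account 100))

((acct :deposit) 50)  ;; -> 150
((acct :deposit) -10) ;; -> nil (rejected, state unchanged)
((acct :balance))     ;; -> 150
((acct :history))     ;; -> [50]
```

Callers can read the balance and the history, but the only way to change either is through `:deposit`.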

Similar encapsulation possibilities are available in any language that supports closures. Douglas Crockford describes a similar idiom in JavaScript.

Namespaces

A Clojure namespace groups a set of related data and functions. Inside a namespace, a Clojure var can refer to a function or to data, and can be public or private.

For example, Chris Houser’s error-kit library implements a condition/restart system for Clojure.

  (with-handler
    (vec (map int-half [2 4 5 8]))
    (handle *number-error* [n]
      (continue-with 0)))

In the code above, `with-handler`, `handle`, and `continue-with` are public vars of the `clojure.contrib.error-kit` namespace. The `int-half` is a demo function that blows up on odd inputs. When a `*number-error*` occurs, the handler causes execution to continue with the value 0. (Note how this is more flexible than try/catch exception handling, which cannot recover back into the middle of some operation.)

Internally, error-kit keeps track of the available handlers and continue forms using these private vars:

  (defvar- *handler-stack* () 
    "Stack of bound handler symbols")
  (defvar- *continues* {} 
    "Map of currently available continue forms")

The trailing minus sign in `defvar-` marks the vars as private. These vars are implementation details, and are invisible to code outside the `clojure.contrib.error-kit` namespace.
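
`defvar-` comes from the contrib libraries. In core Clojure, a similar effect is available with `defn-` and with `^:private` metadata on a `def`. Here is a minimal sketch, using the hypothetical namespaces `example.shapes` and `example.client`:

```clojure
(ns example.shapes)   ; hypothetical namespace

(def ^:private unit "cm")          ; private data
(defn- square [x] (* x x))         ; private function
(defn area [side]                  ; public function built on both
  (str (square side) " " unit "^2"))

(ns example.client)   ; hypothetical caller namespace

(example.shapes/area 3)
;; -> "9 cm^2"

;; Only the public var shows up in the namespace's public mappings.
(keys (ns-publics 'example.shapes))
;; -> (area)
```

Referring to `example.shapes/square` directly from `example.client` is rejected by the compiler, because the var is not public.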

Immutability

In OO languages, another purpose of encapsulation is to prevent object A from modifying or corrupting the private data used by object B.

In Clojure, this problem does not exist. Clojure's data structures are immutable. They cannot possibly be corrupted, or changed in any way, period. You can write query functions that return “private” state, without any fear of data corruption.
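
A short illustration, with made-up account data: functions such as `assoc` and `conj` return new values and leave their arguments untouched.

```clojure
(def account {:id 1 :balance 100})   ; illustrative data

;; assoc returns a new map; it does not modify its argument.
(def drained (assoc account :balance 0))

drained ;; -> {:id 1, :balance 0}
account ;; -> {:id 1, :balance 100}

;; The same holds for vectors and conj.
(def log [:opened])
(conj log :closed) ;; -> [:opened :closed]
log                ;; -> [:opened]
```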

Polymorphism

For our purposes here, polymorphism is the ability to choose a different method implementation based on the type of the object the method is called on. So for example:

  Flyer a = new Airplane();
  Flyer b = new Bird();
  a.fly();
  b.fly();

`a.fly()` and `b.fly()` do different things because they are called on different concrete types.

Clojure provides a generalization of polymorphism called multimethods. A multimethod definition begins with `defmulti`, and then has a name, plus a dispatch function that is used to select the actual implementation. To mimic polymorphism, simply dispatch on the `class` of the argument:

  (defmulti fly class)

Individual methods of a multimethod begin with `defmethod`, then the multimethod name, then the dispatch value that must match the result of the dispatch function. Finally, you get the argument list in a vector, followed by the implementation of the method. For example:

  (defmethod fly Bird [b] (flap-wings b))
  (defmethod fly Airplane [a] (turn-propeller a))
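
`Bird` and `Airplane` are not defined in this article, so here is a runnable sketch of the same idea that uses classes every JVM already has. The `describe` multimethod is a hypothetical name chosen for this example:

```clojure
;; Hypothetical multimethod that dispatches on the class of its argument.
(defmulti describe class)

(defmethod describe String [s] (str "a string of " (count s) " characters"))
;; Matches any subclass of Number, since dispatch respects class inheritance.
(defmethod describe Number [n] (str "the number " n))
;; Fallback for every other dispatch value.
(defmethod describe :default [x] "something else")

(describe "fly") ;; -> "a string of 3 characters"
(describe 42)    ;; -> "the number 42"
(describe :bird) ;; -> "something else"
```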

Unlike polymorphism, multimethods do not limit you to dispatching on class. You can dispatch based on any arbitrary function of the method arguments. So for example, a bank account might have a `:type` entry that is used to determine the interest rate:

  (defmulti interest :type)
  (defmethod interest :checking [a] 0)
  (defmethod interest :savings [a] 0.05M)
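
Calling the multimethod looks like calling any other function. The account maps below are made up for illustration:

```clojure
(defmulti interest :type)
(defmethod interest :checking [a] 0)
(defmethod interest :savings [a] 0.05M)

;; The dispatch function :type is applied to the argument,
;; and its result selects the method.
(interest {:type :checking :balance 100}) ;; -> 0
(interest {:type :savings :balance 100})  ;; -> 0.05M
```

An account whose `:type` has no matching method (and no `:default` method) causes an `IllegalArgumentException`.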

The `:type` attribute is a convention, but nothing prevents you from dispatching on a different attribute, or even dispatching on more than one at the same time! For example, the `service-charge` multimethod below dispatches on two different facets of the same object: the object’s `account-level` (`::Basic` or `::Premium`) and its `:tag` (`::Checking` or `::Savings`).

  (defmulti service-charge 
    (fn [acct] [(account-level acct) (:tag acct)]))
  (defmethod service-charge [::Basic ::Checking]   [_] 25)
  (defmethod service-charge [::Basic ::Savings]    [_] 10)
  (defmethod service-charge [::Premium ::Checking] [_] 0)
  (defmethod service-charge [::Premium ::Savings]  [_] 0)

The `_` is a legal name, and is used idiomatically to indicate that an argument will be ignored. (There is no need to even look at the argument, since all the work has been done in choosing which method to dispatch to!) This example also demonstrates two other concepts:

- The double-colon prefix resolves a keyword in a namespace. This prevents name collisions among keywords, just as object-oriented languages use namespaces to prevent name collisions between type names.
- `account-level` is a function (not shown here), not a simple key lookup. It returns `::Premium` or `::Basic` based on the account type and the current balance. Thus an account can dynamically change its account level as its balance changes.
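
To make the example runnable, here is one way `account-level` might look. The thresholds (1000 for savings, 5000 for checking) and the `:balance` key are assumptions invented for this sketch, not the rule used by the original code:

```clojure
;; Hypothetical rule: an account is premium once its balance reaches a
;; threshold that depends on the account's :tag.
(defn account-level [acct]
  (let [threshold (if (= ::Savings (:tag acct)) 1000 5000)]
    (if (>= (:balance acct) threshold) ::Premium ::Basic)))

(defmulti service-charge
  (fn [acct] [(account-level acct) (:tag acct)]))
(defmethod service-charge [::Basic ::Checking]   [_] 25)
(defmethod service-charge [::Basic ::Savings]    [_] 10)
(defmethod service-charge [::Premium ::Checking] [_] 0)
(defmethod service-charge [::Premium ::Savings]  [_] 0)

(def acct {:tag ::Checking :balance 100}) ; illustrative account

(service-charge acct)                       ;; -> 25
(service-charge (assoc acct :balance 6000)) ;; -> 0
```

The same account gets a different method once its balance changes, without any change to its "type".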

As you can see, multimethods are far more general than polymorphism. Instead of being limited to type-based dispatch, multimethods can dispatch on any arbitrary function of an argument list. This allows programming models that more closely resemble reality: after all, what real-world entities are limited to a single type hierarchy, and forbidden to change types over time?

Inheritance

In OO languages, inheritance allows you to create a derived type that reuses the behavior of a base type. For example:

  class Person {
    String fullName() { /* impl details */ }
  }
  class Employee extends Person {
    AddressBookItem companyDirectoryEntry() { /* impl details */ }
  }

This kind of reuse is so natural in Clojure that it doesn’t even have a name. For example, here is a function that returns the full name of a person, based on first and last names:

  (defn full-name [p]
    (str (:first-name p) " " (:last-name p)))

Employees are like people, but have other properties and behaviors, such as a telephone extension. The `company-directory-entry` returns a vector of an employee's full name and telephone extension, like this:

  (defn company-directory-entry [p]
    [(full-name p) (:extension p)])

Notice that `company-directory-entry` “reuses” the person-ness of its argument `p` by calling `full-name` on it. There is no special inheritance ceremony required to set this up; you just call functions when you need them.

You can pass either a person or an employee to `full-name`. For `company-directory-entry`, though, you must have an employee. Or, more accurately, you must have something that resembles an employee, to the extent of having a `:first-name`, `:last-name`, and `:extension`. This is an example of duck typing: if it walks like a duck and quacks like a duck, we assume it is a duck, without asking it to present its `IDuck` papers.
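
Here is a brief sketch of both functions in use, with made-up `person` and `employee` maps:

```clojure
(defn full-name [p]
  (str (:first-name p) " " (:last-name p)))

(defn company-directory-entry [p]
  [(full-name p) (:extension p)])

;; Illustrative data: an employee is just a person map with one more key.
(def person   {:first-name "Stu" :last-name "Halloway"})
(def employee (assoc person :extension 101))

(full-name person)                 ;; -> "Stu Halloway"
(full-name employee)               ;; -> "Stu Halloway"
(company-directory-entry employee) ;; -> ["Stu Halloway" 101]

;; A plain person has no :extension, so the lookup yields nil.
(company-directory-entry person)   ;; -> ["Stu Halloway" nil]
```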

Many Functions, Few Types

The example above demonstrates another negative consequence of idiomatic OO style: the over-specification of data types. The return value of `companyDirectoryEntry` is given its own unique type, `AddressBookItem`. Each new data type like `AddressBookItem` requires its own life-support system: constructors, accessors, `equals`, `hashCode`, and so on.

In Clojure, an address book item would simply be a vector or a map. No new types, and no life support system required. Moreover, an address book item can be manipulated with any of the large arsenal of functions in Clojure's sequence library.
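
For instance, assuming a directory of `[full-name extension]` vectors (the entries below are invented), ordinary sequence functions do the work that a custom type would otherwise need methods for:

```clojure
;; Made-up directory entries, each a [full-name extension] vector.
(def directory [["Stu Halloway" 101] ["Ann Example" 100] ["Bob Sample" 205]])

;; Sort by extension.
(sort-by second directory)
;; -> (["Ann Example" 100] ["Stu Halloway" 101] ["Bob Sample" 205])

;; Names of everyone whose extension is above 100.
(map first (filter #(> (second %) 100) directory))
;; -> ("Stu Halloway" "Bob Sample")

;; Turn the entries into a name -> extension lookup table.
((into {} directory) "Bob Sample")
;; -> 205
```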

To see the problem with overspecifying types, consider this method from Apache Commons Lang:

  // From Apache Commons Lang, http://commons.apache.org/lang/
  public static int indexOfAny(String str, char[] searchChars) {
      if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
          return -1;
      }
      for (int i = 0; i < str.length(); i++) {
          char ch = str.charAt(i);
          for (int j = 0; j < searchChars.length; j++) {
              if (searchChars[j] == ch) {
                  return i;
              }
          }
      }
      return -1;
  }

The purpose of `indexOfAny` is to find the index of the first occurrence of one of the `searchChars` that appears in `str`. Note the unnecessary specificity of types: it works only with strings and character arrays.

Here's the Clojure version, using the sequence library's `map`, `iterate`, and `for` forms:

  (defn indexed [coll] (map vector (iterate inc 0) coll))
  (defn index-filter [pred coll]
    (when pred 
      (for [[idx elt] (indexed coll) :when (pred elt)] idx)))
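
To see what `indexed` contributes, it helps to call it on its own. It pairs each element with its position, and the `[idx elt]` destructuring in `index-filter` then takes each pair apart:

```clojure
(defn indexed [coll] (map vector (iterate inc 0) coll))

;; Each element is paired with its zero-based index.
(indexed "abc")
;; -> ([0 \a] [1 \b] [2 \c])

(indexed [:x :y])
;; -> ([0 :x] [1 :y])
```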

Here is an example calling `index-filter`:

  (index-filter #{\a \e \i \o \u} "Lts f cnsnts nd n vwel")
  -> (20)

The expression above finds the index of the first vowel in the string "Lts f cnsnts nd n vwel", that is, 20. But `index-filter` is more general than the Commons version in several ways:

1. `index-filter` returns all the matches, not just one.

  (index-filter #{\a \e \i \o \u} "The quick brown fox")
  -> (2 5 6 12 17)

2. `index-filter` works with any sequence, not just a string of characters. For example, the call below works against a range of integers:

  (index-filter #{2 3 5 7} (range 6))
  -> (2 3 5)

3. `index-filter` works with any predicate, not just a test against a character array. In the example below, the predicate is an anonymous function that tests for strings longer than three characters:

  (index-filter #(> (.length %) 3) ["The" "quick" "brown" "fox"])
  -> (1 2)

That is a lot of extra power, especially given that the function is shorter, easier to write, and easier to read (given some Clojure experience, of course) than the Commons version.
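
If you want the single-result behavior of `indexOfAny` back, one option is to take the first match. The `index-of-any` name below is an assumption for this sketch, and unlike the Commons method it returns `nil` rather than -1 when nothing matches:

```clojure
(defn indexed [coll] (map vector (iterate inc 0) coll))
(defn index-filter [pred coll]
  (when pred
    (for [[idx elt] (indexed coll) :when (pred elt)] idx)))

;; Hypothetical helper: index of the first element satisfying pred, or nil.
(defn index-of-any [pred coll]
  (first (index-filter pred coll)))

(index-of-any #{\z \a} "zzabyycdxx") ;; -> 0
(index-of-any #{\b \y} "zzabyycdxx") ;; -> 3
(index-of-any #{\q} "zzabyycdxx")    ;; -> nil

;; index-filter is lazy, so it also works on an infinite sequence
;; as long as you only ask for a finite number of matches.
(take 3 (index-filter even? (iterate inc 1)))
;; -> (1 3 5)
```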

Conclusion

Clojure solves the same problems that OO solves, but it solves them in different ways. Instead of encapsulation, polymorphism, and inheritance, you have closures, namespaces, pure functions, immutable data, and multimethods. Idiomatic OO gives you a bloated type system with duplicated code hidden away behind encapsulation boundaries and little hope for thread safety. Clojure offers a radical alternative: a lean type system, a rich function library, and language-level concurrency support that is usable by mere mortals.

There is a lot more to Clojure than we have covered here: lazy and infinite sequences, destructuring, macros, software transactional memory, agents, seamless Java interop, and more. But those are topics for another day.

[This article was originally published in the May 2009 issue of NFJS, the Magazine. I will be speaking about Clojure at several upcoming NFJS events; come join the fun.]