This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.
This article covers Chapter 11, Collections.
PCL describes a group of basic collection functions: count
, find
, position
, remove
, and substitute
. Clojure supports count
for a variety of list-like types:
These types, and any others than implement a basic first/rest protocol, are called sequences in Clojure. A sequence is logically a list, but may be implemented using other data structures.
In addition to generic sequence functions, some sequences have specific functions unique to their underlying data structure. Clojure defines find
for maps to return the matching key/value pair:
Or, you could just place the map itself in function position, and get back the matching value for a key:
The Clojure core does not define find
for other collection types. But the implementation is a one-liner using some
. For example, to ask if a collection contains the number 2:
Clojure-contrib wraps the some
idiom into a function named includes?
.
The rest of the "basic" functions have similar stories: The Clojure core tends to support them directly where they are efficient (constant time) operations. Where they would take longer (e.g. linear time), the operations can be written as one-liners atop higher-order functions.
CL includes higher order versions of the basic functions described above. These higher-order versions take an additional parameter, which is a function that acts as a filter. Here are some examples.
First, a collection of days
for the examples to work against:
Now I can find the weekdays that start with "S":
Or simply count the days that start with "S":
In an immutable world, remove
is the opposite of find
. I can get a collection with all "S" days removed by reversing the previous filter with complement
:
To replace all "S" days with "Weekend!" I can use map
:
Sorting is easy:
Sorting by criteria is also easy:
The concat
function concatenates sequences.
Note that the resulting sequence is lazy. So, concat
can return without walking each input sequence. In other words, the (take 5 ...)
below does not have to wait (forever!) for all the powers of 2 to be generated:
What if one of the sequences passed to concat
blows up instead of returning a sequence?
Here concat
fails because its second argument is not a sequence. As it happens, I have an even lazier option than concat
. The lazy-cat
function does not even look at each argument until it is forced to do so:
Lazy sequences have many uses, but take some getting used to. One mistake to avoid is trying to inspect a lazy infinite sequence from the REPL. The REPL tries to print the entire sequence, which will take forever (literally). Hence the (take 2 ...)
wrappers above.
It is often interesting to take subsequences from the beginning, middle, or end of a collection. Clojure supports this in a general way with take
and drop
. You have already seen take
, which returns the first part of a collection:
For the end of a collection, I can use drop
:
For the middle of a collection, I can use take
and drop
together:
The take-nth
function takes only every nth item of a collection. To demonstrate take-nth
, I will begin by defining a lazy collection of the natural-numbers:
The call to iterate
produces a collection that starts with 1
and generates subsequent members by calling inc
. You can verify that these are the natural numbers by taking a few of them.
Now I can write an intuitive definition for the even and odd numbers in terms of the natural numbers:
Clojure provides a number of functions that test boolean predicates, including every?
, not-any?
, and not-every?
, and some
. Here are a few examples, using the days
collection defined above.
Does every day start with "S"?
Is there some day that starts with "M"?
map
take a function and one or more sequences. It returns a new sequence which is the result of applying the function to the item(s) in each sequence. So, to take the product of numbers from two sequences:
If I want to control the type of collection returned, I can use into
:
reduce
walks down a collection, applying function f
of two arguments to the first two arguments, then applying f
to the result of the first call and the next element. This is very useful for operations that process a sequence and return a single value. For example, I can sum a sequence:
Or find the max value of a sequence:
Maps (hash tables in CL) can be iterated just like any other sequence type, bearing in mind that the function you pass in should expect a key/value pair. Given the following map of names to scores:
I can find all the people who scored above 15:
Notice how the destructuring bind ([[k,v]]
) makes it easy to bind k
and v
separately, without introducing a temporary variable pair
that I don't really need.
Lisp excels at processing lists. Clojure offers similar capabilities, but generalized to sequences, which can be lists, vectors, maps, sets, or other list-like collections.
Clojure's support for lazy collections allows a different style for collection processing that I will continue to explore in later articles in this series.