Lists

Originally posted on dev.to/gillchristian.

Lists

The explicit teaching of thinking is no trivial task, but who said that the teaching of programming is? In our terminology, the more explicitly thinking is taught, the more of a scientist the programmer will become. Edsger Dijkstra

More catch up ™️, this time with a previous week's session, Delft Haskell Study Group #6. As always, I quote the book as a way of recap 📖

About thinking 🤔

I'm always adding the quotes the book adds at the cover of each chapter but never mention anything about them. I really liked this one in particular. Programming is all about thinking. The more we learn to think the better programmers we'll be.

Lists

The list datatype in Haskell is defined like this:

data [] a = [] | a : [a]

There are a few things to point out there. [] Is both the type constructor as well as the data constructor for the empty list. The other constructor (:), the infix operator called cons, takes a value of type a and a list of type [a] and evaluates to [a]. The data type as a whole is a sum (|) but the second constructor (:) is a product. What a : [a] does is adding a to the front of the list.

Pattern matching 🎉

As I already mentioned pattern matching is one of my favorite features of Haskell. And we can do quite some interesting things by pattern matching on lists.

head' :: [a] -> Maybe a
head' []       = Nothing
head' (x : _ ) = Just x

tail' :: [a] -> Maybe [a]
tail' []       = Nothing
tail' (_ : xs) = Just xs

I wrote these versions that return the value wrapped in a maybe because [a] doesn't guarantee that the list will have values. If we try the Prelude version of these functions on empty lists we get and error.

λ> head []
*** Exception: Prelude.head: empty list
λ> tail []
*** Exception: Prelude.tail: empty list

One thing to mention about our tail'. In case of calling it with a list that has one value, it will return Just []. I think it's fine, since it reflects what the tail of a list with one element is (a : []). But if we'd want to only return lists with elements we could extend it like so.

tail' :: [a] -> Maybe [a]
tail' []       = Nothing
tail' (_ : []) = Nothing
tail' (_ : xs) = Just xs

Sweet sweet syntactic sugar 🍭

To write a list of four elements using the constructor we'd have to do use the cons operator 4 times.

(1 : 2 : 3 : 4 : [])

Luckily Haskell provides some syntactic sugar to make it easier, which resembles the one in many other languages.

[1, 2, 3, 4]

Ranges

Another nice thing to work with lists that Haskell provides is the ranges syntax.

λ> [1..10]
[1,2,3,4,5,6,7,8,9,10]

λ> [2, 4..10]
[2,4,6,8,10]

λ> [1..]
[1,2,3,4,5,6,7,8,9,10,11,12 ...] -- continues infinitely


λ> [1, 5..]
[1,5,9,13,17,21,25,29,33,37,41,45 ...] -- continues infinitely

Which are equivalent to.

λ> enumFromTo 1 10
[1,2,3,4,5,6,7,8,9,10]

λ> enumFromThenTo 2 4 10
[2,4,6,8,10]

λ> enumFrom 1
[1,2,3,4,5,6,7,8,9,10,11,12 ...] -- continues infinitely


λ> enumFromThen 1 5
[1,5,9,13,17,21,25,29,33,37,41,45 ...] -- continues infinitely

All of these functions require the type being ranged, in other words an Enum type class.

enumFromTo :: Enum a => a -> a -> [a]
enumFromThenTo :: Enum a => a -> a -> a -> [a]

enumFrom :: Enum a => a -> [a]
enumFromThen :: Enum a => a -> a -> [a]

List comprehensions

List comprehensions are a means of generating a new list from a list or lists. They come directly from the concept of set comprehensions in mathematics, including similar syntax. They must have at least one list, called the generator, that gives the input for the comprehension, that is, provides the set of items from which the new list will be constructed. They may have conditions to determine which elements are drawn from the list and/or functions applied to those elements.

Coming from a non-mathematical background (JS, PHP, C++, Go, .etc) this was one of the really new things for me.

[ x^2 | x <- [1..10]]

Give the definition we can guess that we are building a list of all the square numbers from the list of [1..10]. Or at least I'd guess that because I already read the chapter 😂

λ> [ x^2 | x <- [1..10]]
[1,4,9,16,25,36,49,64,81,100]

We can add predicates to ignore certain items. The ones the predicate returns False for, naturally. For example, we can decide to keep only the numbers divisible by 3.

λ> [x^2 | x <- [1..10], rem x 3 == 0]
[9,36,81]

List comprehensions can have multiple generators.

λ> [x^y | x <- [1..5], y <- [2, 3]]
[1,1,4,8,9,27,16,64,25,125]

λ> [(x, y) | x <- [1..5], y <- [2, 3]]
[(1,2),(1,3),(2,2),(2,3),(3,2),(3,3),(4,2),(4,3),(5,2),(5,3)]

Notice that the rightmost generator is exhausted first, then the second one and so on.

Spines and nonstrict evaluation

Lists are a recursive series of cons cells a : [a] terminated by the empty list []. Now, it'd be great to find a way to visualize that.

In Haskell we say that some data structures (lists, sequences and trees) have a spine, the connective structure that ties the collection of values together. In lists it's the recursive cons (:) operators.

1 : 2 : 3 : []

or

1 : (2 : (3 : []))

There's a problem with this 1 : 2 : 3 : [] representation, it makes it look like 1 exists before the cons cell but actually it contains the two values. Looking more like the following.

  :
 / \
1   :
   / \
  2   :
     / \
    3  []

Because of this and the way nonstrict evaluation works, you can evaluate cons cells independently of what they contain. It is possible to evaluate only the spine of the list without evaluating individual values. It is also possible to evaluate only part of the spine of a list and not the rest of it.

Evaluation of a list goes down the spine, while construction goes up. But nothing is evaluated until it is needed to be.

As we said, the spine is the recursive series of cons operators (:). And in here the _ would be the non evaluated values.

 : <------|
/ \       |
_  : <----| This is the "spine"
  / \     |
 _   : <--|
    / \
   _  []

Using :sprint in GHCi we can see what is evaluated and what isn't.

λ> ls = [1..5] :: [Int]
λ> :sprint ls
ls = _

So, ls hasn't been evaluated yet (as _ indicates).

λ> take 1 ls
[1]
λ> :sprint ls
ls = 1 : _

Now we evaluated the first cons cell and it's first value. Let's take 2 values.

λ> take 2 ls
[1, 2]
λ> :sprint ls
ls = 1 : 2 : _

We can continue like that until the end. We can also evaluate the whole spine with length.

λ> length ls
5
λ> :sprint ls
ls = [1,2,3,4,5]

Wait but I said the spine only 🤔 This is some quirk of GHCi. But we can prove it's only the spine by checking a list of bottom ⊥ (i.e. undefined).

λ> undefined
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
  error, called at libraries/base/GHC/Err.hs:78:14 in base:GHC.Err
  undefined, called at <interactive>:1:1 in interactive:Ghci1

As we can see it throws an exception when evaluated. But ...

λ> ls = [undefined, undefined]
λ> :sprint ls
ls = _
λ> length ls
2
λ> :sprint ls
ls = [_,_]

As we can see Haskell only evaluates what it needs and nothing else. The book extends on this a bit more and explains the concepts of normal form and weak head normal form. But I'll get back to those in more depth in the chapter of nonstrictness.

Transforming lists of values

We can use map or fmap to transform a list of values, by applying a function to each element and then returning the new list.

map  ::              (a -> b) -> [a] -> [b]
fmap :: Functor f => (a -> b) -> f a -> f b

fmap is the generalized version of map that works for Functors. List is one of them. More on Functors in a following chapter.

λ> fmap (+1) [1, 2, 3, 4]
[2,3,4,5]

This is how it's defined in base.

map :: (a -> b) -> [a] -> [b]
map _ []     = []
map f (a:as) = f a : map f as

By adding some parens to the type definition we can make it more explicit what map really allows us to do. Basically lifting our function a -> b to work in lists.

(a -> b) -> ([a] -> [b])

( a  ->  b )
([a] -> [b])

Filtering lists of values

map is not the only function we can use on lists. We can also filter certain values out of a list when they don't fulfill a predicate.

λ> even 10
True
λ> even 9
False
λ> filter even [1..10]
[2,4,6,8,10]

Here's how filter is defined.

filter :: (a -> Bool) -> [a] -> [a]
filter _ [] = []
filter pred (x:xs)
  | pred x    = x : filter pred xs
  | otherwise = filter pred xs

Zipping lists

Zipping lists together is a means of combining values from multiple lists into a single list.

λ> :t zip
zip :: [a] -> [b] -> [(a, b)]
λ> zip "abc" "cde"
[('a','c'),('b','d'),('c','e')]
λ> zip "abc" "cdefghi"
[('a','c'),('b','d'),('c','e')]
λ> zip "abcdefghi" "jkl"
[('a','j'),('b','k'),('c','l')]

Note that zip returns the length of the shortests list.

We can also unzip lists.

λ> :t unzip
unzip :: [(a, b)] -> ([a], [b])
λ> unzip [(1,'a'), (2, 'b'), (3, 'c')]
([1,2,3],"abc")

There's also zipWith to combine two lists.

λ> :t zipWith
zipWith :: (a -> b -> c) -> [a] -> [b] -> [c]
λ> zipWith (+) [1,2,3] [1,2,3]
[2,4,6]
λ> zipWith (+) [1,2,3] [1,2,3]
[2,4,6]
λ> zipWith (,) [1,2,3] [1,2,3]
[(1,1),(2,2),(3,3)]

Conclusion

With lists we start to see the declarative patterns of functional programming. And how the same functions we can use for plain values can be used for lists as well with higher order functions like map, filter and zip.

Thanks to Haskell's laziness (nonstrict evaluation) it will only evaluate what it needs and no more.

λ> ls = [1,2,3,4,5,6,7,undefined]
λ> filter even ls
[2,4,6*** Exception: Prelude.undefined
λ> take 2 $ zip "abcdefg" $ map (*2) $ filter even ls
[('a',4),('b',8)]

Although it's clear in this example, it has many more implications that I'm yet to understand or even discover and I look forward to learn about that.