When I work with Scala collections, I always keep in mind that there are two types of operations I can perform: transformations and actions (some people call the latter aggregations). A transformation turns a collection into another collection. An action returns a value.

After this short introduction I want to focus on the Scala collection functions that I consider the most useful in everyday work. Some of these functions transform collections, and the rest return a particular value. At the end of the article I will show how these functions can be combined to solve a concrete problem.

#1 Minimum and Maximum values

I want to start with an action function.

Finding the minimum or maximum value in a sequence is a very common task. Of course, you may say that this kind of operation is helpful only for interview questions and algorithms. But let’s be honest, who doesn’t remember these lines of code in Java?

int[] arr = {11, 2, 5, 1, 6, 3, 9};

int to = arr.length - 1;
int max = arr[0];

for (int i = 0; i < to; i++) {
    if (max < arr[i+1])
        max = arr[i+1];
}

System.out.println(max);

Question: how do you find the maximum / minimum in a list?
Scala offers a pretty elegant solution:

val numbers = Seq(11, 2, 5, 1, 6, 3, 9)

numbers.max //11
numbers.min //1

But we usually work with more complex data. Let’s look at a more advanced example, where we have a sequence of books represented by a case class.

case class Book(title: String, pages: Int)

val books = Seq(
  Book("Future of Scala developers", 85),
  Book("Parallel algorithms", 240),
  Book("Object Oriented Programming", 130),
  Book("Mobile Development", 495)
)

//Book(Mobile Development,495)
books.maxBy(book => book.pages)

//Book(Future of Scala developers,85)
books.minBy(book => book.pages)

As you can see, the minBy and maxBy functions solve the problem for non-trivial data. The only thing you need to do is choose the property by which the minimum or maximum is determined.
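One thing worth knowing is that max, min, maxBy and minBy throw an UnsupportedOperationException on an empty collection. Below is a minimal sketch of the Option-returning variants, assuming Scala 2.13 or newer (where maxByOption and minOption are available). The names noBooks, thickest, noneFound and smallest are only illustrative.

```scala
case class Book(title: String, pages: Int)

val books = Seq(
  Book("Parallel algorithms", 240),
  Book("Mobile Development", 495)
)
val noBooks = Seq.empty[Book]

// The *Option variants wrap the result instead of throwing on empty input.
val thickest: Option[Book] = books.maxByOption(_.pages)    // Some(Book(Mobile Development,495))
val noneFound: Option[Book] = noBooks.maxByOption(_.pages) // None
val smallest: Option[Int] = Seq.empty[Int].minOption       // None
```

This keeps the "no result" case visible in the type, so the caller has to decide what to do with it.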

#2 Filtering

Have you ever filtered a collection? For example, you want to get the items with a price above $10, or you need to select the employees younger than 24. All these operations rely on filtering.

Let’s start with a popular example: filtering a list of numbers to keep only the even elements.

val numbers = Seq(1,2,3,4,5,6,7,8,9,10)

numbers.filter(n => n % 2 == 0)

What about a more complex scenario? I want to choose the books that have at least 120 pages.

val books = Seq(
  Book("Future of Scala developers", 85),
  Book("Parallel algorithms", 240),
  Book("Object Oriented Programming", 130),
  Book("Mobile Development", 495)
)

books.filter(book => book.pages >= 120)

Filtering is not much harder to apply than the min and max functions, even though filter is a transformation rather than an action.

There is also a counterpart of the filter function called filterNot. You can probably guess what it does from its name: it keeps the elements that do not satisfy the predicate. If in doubt, replace filter with filterNot in the first example.
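To make the relationship between filter and filterNot concrete, here is a small sketch that applies the same predicate both ways. The helper isEven is introduced only for this illustration.

```scala
val numbers = Seq(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

// A reusable predicate: true for even numbers.
val isEven = (n: Int) => n % 2 == 0

val evens = numbers.filter(isEven)    // List(2, 4, 6, 8, 10)
val odds  = numbers.filterNot(isEven) // List(1, 3, 5, 7, 9)
```

Every element ends up in exactly one of the two results.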

#3 Flatten O_o

I bet there is a good chance that you haven’t heard about this function before! That is easy to explain, because its application is quite specific. For me, it’s hard to describe this function without an example.

val abcd = Seq('a', 'b', 'c', 'd')
val efgh = Seq('e', 'f', 'g', 'h')
val ijkl = Seq('i', 'j', 'k', 'l')
val mnop = Seq('m', 'n', 'o', 'p')
val qrst = Seq('q', 'r', 's', 't')
val uvwx = Seq('u', 'v', 'w', 'x')
val yz   = Seq('y', 'z')

val alphabet = Seq(abcd, efgh, ijkl, mnop, qrst, uvwx, yz)

// List(a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z)
alphabet.flatten

When is flatten helpful? Well, if you have a collection of collections and you want to operate on all of the elements from those collections, don’t hesitate to use flatten.
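As a slightly more practical sketch, imagine the books from the earlier sections grouped by shelf. The shelves value is a hypothetical grouping made up for this example. After flattening, the usual functions work on all books at once.

```scala
case class Book(title: String, pages: Int)

// Hypothetical grouping: each inner Seq is one shelf, and one shelf is empty.
val shelves = Seq(
  Seq(Book("Future of Scala developers", 85), Book("Parallel algorithms", 240)),
  Seq.empty[Book],
  Seq(Book("Mobile Development", 495))
)

// flatten removes one level of nesting; the empty shelf contributes nothing.
val allBooks = shelves.flatten
val thickest = allBooks.maxBy(_.pages) // Book(Mobile Development,495)
```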

#4 Euler diagram functions

Don’t panic! I’m talking about well-known operations: difference, intersection, and union. I hope you agree that these functions are best explained with Euler diagrams 🙂

val num1 = Seq(1, 2, 3, 4, 5, 6)
val num2 = Seq(4, 5, 6, 7, 8, 9)

//List(1, 2, 3)
num1.diff(num2)

//List(4, 5, 6)
num1.intersect(num2)

//List(1, 2, 3, 4, 5, 6, 4, 5, 6, 7, 8, 9)
num1.union(num2)

The examples are self-explanatory. But what about the union function? It keeps duplicates. What if we want to get rid of them? For this purpose, use the distinct function:

//List(1, 2, 3, 4, 5, 6, 7, 8, 9)
num1.union(num2).distinct

Here is an illustration of the functions above:

[Image: Scala collection Euler functions]
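If order and duplicates don’t matter, the same operations can be sketched with Set, which never holds duplicates. Two assumptions are worth stating: as far as I know, union on Seq is deprecated since Scala 2.13 in favor of ++ (concat), so the sketch uses ++ for sequences, and the names combined, set1, set2 and multisetDiff are illustrative only.

```scala
val num1 = Seq(1, 2, 3, 4, 5, 6)
val num2 = Seq(4, 5, 6, 7, 8, 9)

// Sequence flavor: concatenate, then drop duplicates (order is preserved).
val combined = (num1 ++ num2).distinct // List(1, 2, 3, 4, 5, 6, 7, 8, 9)

// Set flavor: no duplicates by construction, no ordering guarantee.
val set1 = num1.toSet
val set2 = num2.toSet
val unionSet    = set1.union(set2)     // the numbers 1 to 9
val commonSet   = set1.intersect(set2) // Set(4, 5, 6)
val onlyInFirst = set1.diff(set2)      // Set(1, 2, 3)

// On Seq, diff works like a multiset difference:
// each element of the argument removes a single occurrence.
val multisetDiff = Seq(1, 1, 2).diff(Seq(1)) // List(1, 2)
```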

#5 map list elements

map is probably the most widely used function in Scala collections. Its power is easier to show than to describe:

val numbers = Seq(1,2,3,4,5,6)

//List(2, 4, 6, 8, 10, 12)
numbers.map(n => n * 2)

val chars = Seq('a', 'b', 'c', 'd')

//List(A, B, C, D)
chars.map(ch => ch.toUpper)

The logic of the map function is as follows: it goes through each element of the collection, applies a function to it, and builds a new collection from the results. You could even leave the elements as they are by passing a function that returns its argument unchanged, but in that case map is useless, because you get an equal collection back.
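map is just as handy with case classes. The sketch below reuses the Book class from the earlier sections to extract one field and to build a new value per element. The names titles and summaries are illustrative.

```scala
case class Book(title: String, pages: Int)

val books = Seq(
  Book("Future of Scala developers", 85),
  Book("Parallel algorithms", 240)
)

// Extract a single property from every element.
val titles = books.map(_.title)
// List(Future of Scala developers, Parallel algorithms)

// Build a completely different value from every element.
val summaries = books.map(b => s"${b.title} (${b.pages} pages)")
// List(Future of Scala developers (85 pages), Parallel algorithms (240 pages))
```

Note that the element type of the result (String) differs from the element type of the input (Book), while the number of elements stays the same.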

#6 flatMap

I still remember how difficult it was for me to understand where and how to apply the flatMap function. In general, this is caused by the wide variety of situations where flatMap can be helpful. The first thing I recommend to every beginner is to look at the name of the function more attentively. You should notice that flatMap consists of two functions that we’ve already considered above: map and flatten.

Let’s assume that we want to see how uppercase and lowercase characters look in the alphabet.

val abcd = Seq('a', 'b', 'c', 'd')

//List(A, a, B, b, C, c, D, d)
abcd.flatMap(ch => List(ch.toUpper, ch))
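To see the "map, then flatten" reading of the name in action, here is a short sketch that performs the two steps separately and compares the result with flatMap. The intermediate names mapped, viaFlatten and viaFlatMap are illustrative.

```scala
val abcd = Seq('a', 'b', 'c', 'd')

// Step 1: map produces a collection of collections.
val mapped = abcd.map(ch => List(ch.toUpper, ch))
// List(List(A, a), List(B, b), List(C, c), List(D, d))

// Step 2: flatten removes the extra level of nesting.
val viaFlatten = mapped.flatten

// flatMap does both steps in one call.
val viaFlatMap = abcd.flatMap(ch => List(ch.toUpper, ch))
// List(A, a, B, b, C, c, D, d)
```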

Since in this article I talk only about collection functions, I’ll omit examples with Futures and Options.

#7 Check entire collection for a condition

There is a well-known scenario where you need to ensure that all elements in a collection meet some requirement. If at least one of the elements doesn’t satisfy the condition, you need to do something.

val numbers = Seq(3, 7, 2, 9, 6, 5, 1, 4, 2)

//true
numbers.forall(n => n < 10)

//false
numbers.forall(n => n > 5)

The forall function was created for this sort of task.
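A close relative of forall is exists, which asks whether at least one element satisfies the condition. The sketch below shows both on the same data, plus their behavior on an empty collection, which is easy to forget. The value names are illustrative.

```scala
val numbers = Seq(3, 7, 2, 9, 6, 5, 1, 4, 2)

val allAboveFive = numbers.forall(n => n > 5) // false: 3 already fails
val anyAboveFive = numbers.exists(n => n > 5) // true: 7 qualifies

// "All elements satisfy p" is the same as "no element violates p".
val equivalent =
  numbers.forall(n => n < 10) == !numbers.exists(n => !(n < 10)) // true

// On an empty collection forall is vacuously true and exists is false.
val emptyForall = Seq.empty[Int].forall(n => n > 5) // true
val emptyExists = Seq.empty[Int].exists(n => n > 5) // false
```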

#8 Partitioning of collection

What if you want to split a collection into two new collections by some rule? This can be done with the partition function. So let’s put all even numbers in one collection and all odd numbers in another:

val numbers = Seq(3, 7, 2, 9, 6, 5, 1, 4, 2)

//(List(2, 6, 4, 2), List(3, 7, 9, 5, 1))
numbers.partition(n => n % 2 == 0)
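Since partition returns a pair, the result can be destructured into two named values right away. Here is a sketch with the books from the earlier sections, where longReads and shortReads are names chosen for this illustration.

```scala
case class Book(title: String, pages: Int)

val books = Seq(
  Book("Future of Scala developers", 85),
  Book("Parallel algorithms", 240),
  Book("Object Oriented Programming", 130),
  Book("Mobile Development", 495)
)

// The first collection holds elements that satisfy the predicate,
// the second one holds the rest. Relative order is preserved.
val (longReads, shortReads) = books.partition(_.pages >= 120)
```

The result is the same as calling filter and filterNot with one predicate, but the collection is traversed only once.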

#9 Fold?

Another popular operation is fold. In Scala you will usually deal with foldLeft and foldRight. In general they do the same job, but foldLeft walks the collection from the first element to the last and foldRight from the last to the first 😀

val numbers = Seq(1, 2, 3, 4, 5)

//15
numbers.foldLeft(0)((res, n) => res + n)

The code sample above needs some explanation. In the first pair of parentheses we put the start value. In the second pair we define the operation to be performed for each element of the numbers sequence. On the first step res = 0 and n = 1; after that, res holds the running result while n takes each following element in turn.

Just to clarify, I want to provide another example of foldLeft. Let’s count the number of characters in a sequence of words:

val words = Seq("apple", "dog", "table")

//13
words.foldLeft(0)((resultLength, word) => resultLength + word.length)
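For addition the direction of a fold makes no difference, so here is a sketch with an operation where it does: building a string that records the order of evaluation. Note that the accumulator is the first argument of the function in foldLeft and the second one in foldRight. The names letters, left and right are illustrative.

```scala
val letters = Seq("a", "b", "c")

// foldLeft starts from the first element; the accumulator comes first.
val left = letters.foldLeft("0")((acc, s) => s"($acc+$s)")
// (((0+a)+b)+c)

// foldRight starts from the last element; the accumulator comes second.
val right = letters.foldRight("0")((s, acc) => s"($s+$acc)")
// (a+(b+(c+0)))
```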

#10 Your favorite function

After such a long list, it would be cool to see your favorite function from Scala collections. Write about it in the comments and provide an example of its usage.

Combining functions to solve a problem

As I promised at the beginning of this post, here is an example of function composition. Recently I was taking a Codility test, and it included this task:

Given a string S, you have to find the longest substring which contains uppercase & lowercase characters, but not numbers.

Example: dP4knqw1QAp
Answer: QAp

So how can we solve this task with the help of Scala collection functions?

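def theLongest(s: String): String = {
  s.split("[0-9]")                       // cut the string at every digit
    .filter(_.exists(ch => ch.isUpper))  // keep parts with an uppercase letter
    .filter(_.exists(ch => ch.isLower))  // keep parts with a lowercase letter
    .maxBy(_.length)                     // pick the longest remaining part
}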

This function solves the problem. If the input string doesn’t contain any suitable substring, maxBy is called on an empty collection and an UnsupportedOperationException is thrown.
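If throwing is not acceptable, one possible variation returns an Option instead. This is only a sketch, assuming Scala 2.13 or newer for maxByOption. The name theLongestOption is made up for this example and is not part of the original solution.

```scala
// Returns None when no part of the string has both letter cases.
def theLongestOption(s: String): Option[String] =
  s.split("[0-9]")
    .toSeq
    .filter(part => part.exists(_.isUpper) && part.exists(_.isLower))
    .maxByOption(_.length)

val found    = theLongestOption("dP4knqw1QAp") // Some(QAp)
val notFound = theLongestOption("1234")        // None
```

The two filter calls are merged into one predicate here, which is purely a matter of taste.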

Summary

Scala has an incredibly powerful collection API. You can do a lot with it. Furthermore, the same things can be done in different ways, as the section on Euler diagram functions shows. The API is rich, and learning it takes time and practice.

About The Author

Mathematician, programmer, wrestler, last action hero... Java / Scala architect, trainer, entrepreneur, author of this blog

  • chaotic3quilibrium

    You missed including “exists” in the section with “forAll”. In my experience, exists is more commonly used than forAll.

    • You know, “exists” checks only presence of a particular single element in a collection. While “forAll” checks that all elements in a collection correspond to a particular rule 🙂

      • chaotic3quilibrium

        Technically, “exists” is the inversion of “forAll”. IOW, whatever Boolean condition must hold for all elements in forAll, the same Boolean condition inverted must only hold for one element in exists.

  • Eugeny Kostarev

    I find the collect operation useful. It transforms a collection into another collection by applying a partial function.
    Let’s create a collection of Int from a collection of different types:
    val l = List(0, 1, "2", "3", Some(4), Some("5"), 6.5)
    //List(0, 1, 2, 3, 5, 6)
    l.collect {
    case i: Int => i
    case i: Double => i.toInt
    case s: String => s.toInt
    case Some(s: String) => s.toInt
    }

    • acjohnson55

      `collect` is extremely useful, although I would recommend avoiding creating a `List[Any]` in the first place.

      `collectFirst` is also a great one, acting as a compact `find` + `map`.

      It’s too bad they couldn’t call it `match`, because that would have been much more obvious name. I assume this is because of the `match` keyword.

    • Note that `Some(4)` is filtered out of the collection. Really, `collect` can be seen as a compact `filter` + `map`: any item for which the partial function is not defined is filtered out. You can also use a plain `map` with a default case to transform the collection:

      //List(0, 1, 2, 3, 0, 5, 6)
      l.map {
      case i: Int => i
      case i: Double => i.toInt
      case s: String => s.toInt
      case Some(s: String) => s.toInt
      case _ => 0
      }

  • Stefan Wagner

    Your task of function compositions enforces upper- and lowercase characters, but only checks for uppercase. This doesn’t fail, because the data does not include an all uppercase part, which is longer than the one you find, which includes both types. This should be corrected the one or other way.

    scala> val s= "dP4knqw1QAp9AAAAAAA"
    s: String = dP4knqw1QAp9AAAAAAA

    scala> theLongest (s)
    res2: String = AAAAAAA

    scala> def theLongest (s: String): String = {
    | s.split ("[0-9]")
    | .filter (_.exists (ch => ch.isUpper))
    | .filter (_.exists (ch => ch.isLower))
    | .maxBy (_.length)
    | }
    theLongest: (s: String)String

    scala> theLongest (s)
    res3: String = QAp

    • Wow! Stefan, thanks that you corrected me.
      Will fix the code 🙂

  • Zafar

    Would be helpful also to see the algorithms these nice functions use – time complexity.

  • You can see `map` as a “one to one” transformation: the original and the result have the same number of items. `flatMap`, on the other hand, can be seen as a “one to zero or more” transformation.

    Also, `Option` can be seen as a collection with zero or one item.
