Scala algorithm: Merge Sort: stack-safe, tail-recursive, in pure immutable Scala, N-way

Published November 12th, 2020

Algorithm goal

Merge Sort is a standard sorting algorithm. It works by grouping items into pairs and merging each pair into ascending order by repeatedly selecting the smallest remaining item. It then repeats this process on the merged groups until one whole sorted array remains.

Our goal is to sort a list like this:

3	2	1	4

into this:

1	2	3	4

Merge sort algorithm illustration

The data transformation in Merge Sort looks like this:

Picking of items from two halves

Resulting List	Left Half	Right Half
Original list (split in half):
(empty)	3 2	4 1
Applying a merge+sort function to each of the halves:
(empty)	2 3	1 4
Then, in the merge function, we begin to extract the smallest elements (possible because the two halves are sorted):
(empty)	2 3	1 4
1	2 3	4
1 2	3	4
1 2 3	(empty)	4
1 2 3 4	(empty)	(empty)

And now we have solved one level of merging.

In the non-stack-safe version, we achieve this via recursion, where we effectively say: 'the sorted list is the merge of the sorted left half and the sorted right half of the original input'.

This version is stack-safe (and thus a bit more complicated); to find the standard recursive version, see here: MergeSort.
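
For comparison, the recursive idea quoted above can be sketched as follows. This is only a minimal illustration under assumptions, not the site's MergeSort implementation; the names `recursiveMergeSort` and `mergeSorted` are hypothetical. The inner merge is deliberately not tail-recursive, so its call depth grows with the size of the input.

```scala
// Hypothetical names; a sketch of the plain recursive formulation.
def recursiveMergeSort(input: List[Int]): List[Int] = {
  // Merge two already-sorted lists. NOT tail-recursive: the `::` happens
  // after the recursive call returns, so the call depth grows with the input.
  def mergeSorted(left: List[Int], right: List[Int]): List[Int] =
    (left, right) match {
      case (Nil, _) => right
      case (_, Nil) => left
      case (l :: ls, r :: rs) =>
        if (l <= r) l :: mergeSorted(ls, right)
        else r :: mergeSorted(left, rs)
    }

  if (input.length <= 1) input
  else {
    // "The sorted list is the merge of the two sorted halves."
    val (leftHalf, rightHalf) = input.splitAt(input.length / 2)
    mergeSorted(recursiveMergeSort(leftHalf), recursiveMergeSort(rightHalf))
  }
}

assert(recursiveMergeSort(List(3, 2, 4, 1)) == List(1, 2, 3, 4))
```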

Test cases in Scala

assert(mergeSort(Vector.empty) == Vector.empty)
assert(mergeSort(Vector(1)) == Vector(1))
assert(mergeSort(Vector(1, 2)) == Vector(1, 2))
assert(mergeSort(Vector(2, 1)) == Vector(1, 2))
assert(mergeSort(Vector(2, 1, 3)) == Vector(1, 2, 3))
assert(mergeSort(Vector(2, 1, 4, 3)) == Vector(1, 2, 3, 4))
assert(mergeSort(Vector(2, 4, 5, 1, 3)) == Vector(1, 2, 3, 4, 5))
assert(
  {
    val randomArray = scala.util.Random
      .nextBytes(10 + Math.abs(scala.util.Random.nextInt(1000)))
      .map(_.toInt)
      .toVector
    mergeSort(randomArray) == randomArray.sorted
  },
  "Random array of any length is sorted"
)

Algorithm in Scala

29 lines of Scala (compatible versions 2.13 & 3.0), showing how concise Scala can be!


Explanation

This solution takes a bottom-up approach to avoid having to use non-tail recursion (which is not stack-safe).

We first group the input items into pairs and merge each pair, and then repeat the process on the merged groups, as per the problem definition.

Unlike many other solutions, we do not split the input recursively but read from it sequentially, which makes the approach quite intuitive. Another benefit is that the complexity is easy to establish because the number of iterations is known up front: each pass merges all $n$ items once, and about $\log_2 n$ passes are needed because the sorted groups double in size on every pass, giving $O(n \log n)$.

We use a utility method 'iterate' to apply a function to an initial value n times. LazyList provides the 'last' method, which lets us obtain the result of the final iteration concisely; we could just as easily implement this with tail recursion, which is a little more verbose.
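
To make the bottom-up idea concrete, here is a minimal sketch under stated assumptions. It is not the site's 29-line implementation: the names `bottomUpMergeSort`, `merge`, `mergePass` and the exact shape of `iterate` are hypothetical, and the input is wrapped into single-element runs rather than read in place. It does follow the description above by applying one merging pass a fixed number of times and taking the `last` element of a LazyList.

```scala
// Hypothetical sketch of a bottom-up, stack-safe merge sort.
def bottomUpMergeSort(input: Vector[Int]): Vector[Int] = {
  // Merge two sorted runs; the accumulator keeps the recursive call in tail position.
  @scala.annotation.tailrec
  def merge(left: Vector[Int], right: Vector[Int], merged: Vector[Int]): Vector[Int] =
    if (left.isEmpty) merged ++ right
    else if (right.isEmpty) merged ++ left
    else if (left.head <= right.head) merge(left.tail, right, merged :+ left.head)
    else merge(left, right.tail, merged :+ right.head)

  // One pass: merge neighbouring runs pairwise, which halves the number of runs.
  def mergePass(runs: Vector[Vector[Int]]): Vector[Vector[Int]] =
    runs
      .grouped(2)
      .map {
        case Vector(left, right) => merge(left, right, Vector.empty)
        case leftover            => leftover.flatten // a single run without a partner
      }
      .toVector

  // Apply `f` to `initial` exactly `times` times, using LazyList's `last`.
  def iterate[A](initial: A, times: Int)(f: A => A): A =
    LazyList.iterate(initial)(f).take(times + 1).last

  // Runs double in size on every pass, so ceil(log2(n)) passes are enough.
  val passes = Iterator.iterate(1L)(_ * 2).takeWhile(_ < input.length).size

  iterate(input.map(Vector(_)), passes)(mergePass).flatten
}

assert(bottomUpMergeSort(Vector(2, 4, 5, 1, 3)) == Vector(1, 2, 3, 4, 5))
```

Because the number of passes is computed before any merging starts, no call ever nests deeper than a constant number of frames, whatever the input size.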

The merge function
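
As a hedged illustration of what such a merge step can look like (the name `mergeSorted` and its signature are assumptions, not the site's code), the sketch below merges two already-sorted vectors with a tail-recursive helper. It relies on an `Ordering` so that it works for any orderable type, and on pattern matching to pick off the head of each side.

```scala
// Hypothetical sketch: merge two sorted vectors into one sorted vector.
def mergeSorted[A: Ordering](left: Vector[A], right: Vector[A]): Vector[A] = {
  @scala.annotation.tailrec
  def go(l: Vector[A], r: Vector[A], merged: Vector[A]): Vector[A] =
    (l, r) match {
      case (lHead +: lTail, rHead +: rTail) =>
        // Taking from the left on ties keeps equal items in their original order.
        if (Ordering[A].lteq(lHead, rHead)) go(lTail, r, merged :+ lHead)
        else go(l, rTail, merged :+ rHead)
      case _ =>
        // At least one side is empty, so the rest is already sorted.
        merged ++ l ++ r
    }

  go(left, right, Vector.empty)
}

// The two sorted halves from the illustration: 2 3 and 1 4.
assert(mergeSorted(Vector(2, 3), Vector(1, 4)) == Vector(1, 2, 3, 4))
```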


Scala concepts & Hints

Def Inside Def
A great aspect of Scala is being able to declare functions inside functions, making it possible to reduce repetition.
```
def exampleDef(input: String): String = {
  def surroundInputWith(char: Char): String = s"$char$input$char"
  surroundInputWith('-')
}

assert(exampleDef("test") == "-test-")
```
It is also frequently used in combination with Tail Recursion.

Drop, Take, dropRight, takeRight

Scala's `drop` and `take` methods remove or select the first `n` items of a collection; `dropRight` and `takeRight` do the same from the end.

assert(List(1, 2, 3).drop(2) == List(3))

assert(List(1, 2, 3).take(2) == List(1, 2))

assert(List(1, 2, 3).dropRight(2) == List(1))

assert(List(1, 2, 3).takeRight(2) == List(2, 3))

assert((1 to 5).take(2) == (1 to 2))

Lazy List
The 'LazyList' type (known as 'Stream' before Scala 2.13) describes a potentially infinite list whose elements are evaluated only when necessary ('lazily').
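
A small example in the style of the other concepts; the value name `powersOfTwo` is only illustrative. `LazyList.iterate` builds an infinite list by repeatedly applying a function, and only the elements that are actually requested get computed.

```scala
// An infinite, lazily evaluated list: 1, 2, 4, 8, ...
val powersOfTwo: LazyList[Int] = LazyList.iterate(1)(_ * 2)

// Only the first five elements are ever evaluated here.
assert(powersOfTwo.take(5).toList == List(1, 2, 4, 8, 16))

// Applying a function 3 times to an initial value: take 4 elements and pick the last.
assert(LazyList.iterate(1)(_ * 2).take(4).last == 8)
```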

Ordering

In Scala, 'Ordering' is a 'type class' whose instances define how values of a specific type are compared and ordered.

assert(List(3, 2, 1).sorted == List(1, 2, 3))

assert(List(3, 2, 1).sorted(Ordering[Int].reverse) == List(3, 2, 1))

assert(Ordering[Int].lt(1, 2))

assert(!Ordering[Int].lt(2, 1))

Pattern Matching
Pattern matching in Scala lets you quickly identify what you are looking for in your data, and also extract it.
```
assert("Hello World".collect {
  case character if Character.isUpperCase(character) => character.toLower
} == "hw")
```
Stack Safety
A function is stack-safe when it cannot crash by exceeding the limit on the depth of nested (recursive) calls.
In Scala Algorithms, we try to write the algorithms in a stack-safe way where possible, so that they do not crash on large inputs. However, stack-safe implementations are often more complex and, in some cases, overly complex for the task at hand.
The function below works for a small range such as the one in `thisWillSucceed`, but crashes with java.lang.StackOverflowError once the recursion is deep enough to exhaust the stack, as `thisWillFail` is meant to show; the exact depth depends on the stack size of the runtime. However, there is a way to fix it (see Tail Recursion).
```
def sum(from: Int, until: Int): Int =
  if (from == until) until else from + sum(from + 1, until)

def thisWillSucceed: Int = sum(1, 5)

def thisWillFail: Int = sum(1, 300)
```
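
One possible fix for the recursive `sum` above, sketched under assumptions: the name `stackSafeSum` and the accumulator-based helper `go` are hypothetical, not taken from the site. Carrying the running total as a parameter puts the recursive call in tail position, which `@tailrec` asks the compiler to verify and compile into a loop.

```scala
// Hypothetical stack-safe variant of `sum`: same result, constant stack depth.
def stackSafeSum(from: Int, until: Int): Int = {
  @scala.annotation.tailrec
  def go(current: Int, accumulated: Int): Int =
    if (current == until) accumulated + until
    else go(current + 1, accumulated + current)

  go(from, accumulated = 0)
}

assert(stackSafeSum(1, 5) == 15)
// A much deeper range is handled without growing the stack.
assert(stackSafeSum(1, 60000) == 1800030000)
```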

Tail Recursion

In Scala, tail recursion lets you rewrite a mutable construct, such as a while-loop, as an immutable algorithm.

def fibonacci(n: Int): Int = {
  @scala.annotation.tailrec
  def go(i: Int, previous: Int, beforePrevious: Int): Int =
    if (i >= n) previous else go(i + 1, previous + beforePrevious, previous)

  go(i = 1, previous = 1, beforePrevious = 0)
}

assert(fibonacci(8) == 21)
