Scala algorithm: Merge Sort: in pure immutable Scala

Algorithm goal

Merge Sort is a standard sorting algorithm. It works by grouping items into pairs, and then merging each pair of groups by repeatedly selecting the smallest remaining item, so that the result is in ascending order. It repeats this process until one whole sorted list remains.

Our goal is to achieve a sorting like this:

Input:  3 2 1 4
Output: 1 2 3 4

Merge sort algorithm illustration

The data transformation in Merge Sort looks like this:

Original list (split in half):

  Left half: 3 2    Right half: 4 1

Applying a merge+sort function to each of the halves:

  Left half: 2 3    Right half: 1 4

Then, in the merge function, we begin to extract the smallest elements (as the two halves are sorted). Picking of items from two halves:

  Resulting list | Left half | Right half
  (empty)        | 2 3       | 1 4
  1              | 2 3       | 4
  1 2            | 3         | 4
  1 2 3          | (empty)   | 4
  1 2 3 4        | (empty)   | (empty)
And now we have solved one level of merging.

In the non-stack-safe version, we achieve this via recursion, where we really say 'our sorted version is the merge of sorting of the two halves of our original input'.

This version is not stack-safe; for stack-safe, see: MergeSortStackSafe
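
The recursive idea above can be sketched as follows. This is a minimal illustration and not the site's own implementation: the names `mergeTwo` and `mergeSortSketch` are hypothetical, and the use of `take`/`drop` for splitting is an assumption. The outer recursion is not tail-recursive, which is why this style is described as not stack-safe.

```scala
// Hypothetical helper: merge two already-sorted lists by repeatedly
// taking the smaller of the two heads.
def mergeTwo[T](left: List[T], right: List[T])(implicit ord: Ordering[T]): List[T] = {
  @scala.annotation.tailrec
  def go(l: List[T], r: List[T], acc: List[T]): List[T] =
    (l, r) match {
      case (Nil, _) => acc.reverse ::: r // left exhausted: append the rest of right
      case (_, Nil) => acc.reverse ::: l // right exhausted: append the rest of left
      case (lh :: lt, rh :: rt) =>
        if (ord.lteq(lh, rh)) go(lt, r, lh :: acc)
        else go(l, rt, rh :: acc)
    }
  go(left, right, Nil)
}

// Hypothetical sketch: the sorted list is the merge of the two sorted halves.
def mergeSortSketch[T](list: List[T])(implicit ord: Ordering[T]): List[T] =
  if (list.length <= 1) list
  else {
    val half = list.length / 2
    mergeTwo(mergeSortSketch(list.take(half)), mergeSortSketch(list.drop(half)))
  }

assert(mergeTwo(List(2, 3), List(1, 4)) == List(1, 2, 3, 4))
assert(mergeSortSketch(List(3, 2, 4, 1)) == List(1, 2, 3, 4))
```

The sketch takes an `Ordering` so that it works for any type with an ordering, not only `Int`; an empty list needs an explicit element type, for example `List.empty[Int]`.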

Test cases in Scala

assert(mergeSort(List.empty) == List.empty)
assert(mergeSort(List(1)) == List(1))
assert(mergeSort(List(1, 2)) == List(1, 2))
assert(mergeSort(List(2, 1)) == List(1, 2))
assert(mergeSort(List(2, 1, 3)) == List(1, 2, 3))
assert(mergeSort(List(2, 1, 4, 3)) == List(1, 2, 3, 4))
assert(mergeSort(List(2, 4, 5, 1, 3)) == List(1, 2, 3, 4, 5))
assert(
  {
    val randomArray = scala.util.Random
      .nextBytes(10 + Math.abs(scala.util.Random.nextInt(1000)))
      .map(_.toInt)
      .toList
    mergeSort(randomArray) == randomArray.sorted
  },
  "Random array of any length is sorted"
)

Algorithm in Scala

22 lines of Scala (version 2.13), showing how concise Scala can be!

Explanation

We follow the algorithm: split the items by half, and then perform a merging operation. In this divide-and-conquer algorithm, the more complex part is the merging function:

The merge function

The merge function is more unusual in that its implementation is simpler when the number of inputs is dynamic, rather than fixed.
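
One possible reading of "a dynamic number of inputs" is a merge that accepts any number of sorted lists rather than exactly two. The sketch below follows that reading and is an assumption, not the subscriber-only implementation: `mergeAll` is a hypothetical name, and producing the result as a `LazyList` via `LazyList.unfold` is a design choice made here for illustration.

```scala
// Hypothetical helper: lazily merge any number of already-sorted lists.
// The unfold state is the collection of lists that still have items left.
def mergeAll[T](sortedLists: List[List[T]])(implicit ord: Ordering[T]): LazyList[T] =
  LazyList.unfold[T, List[List[T]]](sortedLists.filter(_.nonEmpty)) { remaining =>
    if (remaining.isEmpty) None
    else {
      // Find the list whose head is currently the smallest.
      val (minList, minIndex) = remaining.zipWithIndex.minBy { case (l, _) => l.head }
      // Emit that head and keep the tail, dropping lists that became empty.
      val updated = remaining.updated(minIndex, minList.tail).filter(_.nonEmpty)
      Some((minList.head, updated))
    }
  }

assert(mergeAll(List(List(2, 3), List(1, 4))).toList == List(1, 2, 3, 4))
assert(mergeAll(List(List(1, 5), List(2, 6), List(3, 4))).toList == List(1, 2, 3, 4, 5, 6))
```

With this shape, two halves are just the case of a two-element outer list, and no special handling is needed when one input runs out before the other.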

Scala concepts & Hints

  1. Def Inside Def

    A great aspect of Scala is being able to declare functions inside functions, making it possible to reduce repetition.

    def exampleDef(input: String): String = {
      def surroundInputWith(char: Char): String = s"$char$input$char"
      surroundInputWith('-')
    }
    
    assert(exampleDef("test") == "-test-")
    

    It is also frequently used in combination with Tail Recursion.

  2. Drop, Take, dropRight, takeRight

    Scala's `drop` and `take` methods typically remove or select `n` items from a collection.

    assert(List(1, 2, 3).drop(2) == List(3))
    
    assert(List(1, 2, 3).take(2) == List(1, 2))
    
    assert(List(1, 2, 3).dropRight(2) == List(1))
    
    assert(List(1, 2, 3).takeRight(2) == List(2, 3))
    
    assert((1 to 5).take(2) == (1 to 2))
    
  3. Lazy List

    The 'LazyList' type (previously known as 'Stream' in Scala) is used to describe a potentially infinite list that evaluates only when necessary ('lazily').

  4. Ordering

    In Scala, the 'Ordering' type is a 'type class' that contains methods to determine an ordering of specific types.

    assert(List(3, 2, 1).sorted == List(1, 2, 3))
    
    assert(List(3, 2, 1).sorted(Ordering[Int].reverse) == List(3, 2, 1))
    
    assert(Ordering[Int].lt(1, 2))
    
    assert(!Ordering[Int].lt(2, 1))
    
  5. Pattern Matching

    Pattern matching in Scala lets you quickly identify what you are looking for in data, and also extract it.

    assert("Hello World".collect {
      case character if Character.isUpperCase(character) => character.toLower
    } == "hw")
    
  6. Stack Safety

    Stack safety is present where a function cannot crash by exceeding the limit on the depth of recursive calls.

    In Scala Algorithms, we try to write the algorithms in a stack-safe way, where possible, so that when you use the algorithms, they will not crash on large inputs. However, stack-safe implementations are often more complex, and in some cases, overly complex, for the task at hand.

    The function below will work for a small range such as 1 to 5, but for a sufficiently large range such as 1 to 1000000 it will crash with java.lang.StackOverflowError on a default JVM stack size. However, there is a way to fix it: see Tail Recursion.

    def sum(from: Int, until: Int): Int =
      if (from == until) until else from + sum(from + 1, until)
    
    def thisWillSucceed: Int = sum(1, 5)
    
    def thisWillFail: Int = sum(1, 1000000)
    
  7. Tail Recursion

    In Scala, tail recursion enables you to rewrite a mutable structure such as a while-loop, into an immutable algorithm.

    def fibonacci(n: Int): Int = {
      @scala.annotation.tailrec
      def go(i: Int, previous: Int, beforePrevious: Int): Int =
        if (i >= n) previous else go(i + 1, previous + beforePrevious, previous)
    
      go(i = 1, previous = 1, beforePrevious = 0)
    }
    
    assert(fibonacci(8) == 21)
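
Combining the Stack Safety and Tail Recursion hints, the recursive `sum` from the Stack Safety hint can be rewritten so that it runs in constant stack space. This is a minimal sketch; the name `sumStackSafe` and the accumulator parameter are introduced here for illustration only.

```scala
// Hypothetical stack-safe variant of `sum`: the running total is passed
// along as an accumulator, so the recursive call is in tail position.
def sumStackSafe(from: Int, until: Int): Int = {
  @scala.annotation.tailrec
  def go(current: Int, accumulated: Int): Int =
    if (current == until) accumulated + until
    else go(current + 1, accumulated + current)

  go(from, 0)
}

assert(sumStackSafe(1, 5) == 15)
assert(sumStackSafe(1, 50000) == 1250025000)
```

The `@tailrec` annotation makes the compiler reject the function if the recursive call is not in tail position, so the stack-safety property is checked at compile time.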
    
