Compute maximum sum of subarray (Kadane's algorithm) in purely functional immutable Scala

Problem

Find the maximum sum of a sub-sequence of an array. This problem is known as 'MaxSliceSum' on Codility.

If we have an array like '[1, 2, -3, 4, -5]', the maximum sum of a subarray is 4, that of '[4]'.

Solution

This solution is available for purchase!

Upon purchase, you will be able to Register an account to access solutions on multiple devices.

We use Stripe for secure payment processing.


Alternatively, get unlimited solutions for $3.99 per month!

'Unlimited Scala Algorithms' gives you access to all solutions!

Test cases

assert(
  computeForArrayClearKadane(Array[Int](-2, 1, -3, 4, -1, 2, 1, -5, 4)) == 6
)
assert(computeForArrayClearKadane(Array(1, 2, -3, 4, -5)) == 4)

Scala Concepts

scanLeft and scanRight

Scala's `scan` functions enable you to do folds like foldLeft and foldRight, while collecting the intermediate results

This is incredibly useful when you need to collect a sequence of states, such as in MaximumSubArraySum.

Example

assert(List(1, 2, 3, 4, 5).scanLeft(0)(_ + _) == List(0, 1, 3, 6, 10, 15))

assert(List(1, 2, 3, 4, 5).scanRight(0)(_ + _) == List(15, 14, 12, 9, 5, 0))

assert(
  Iterator.from(1).scanLeft(0)(_ + _).take(5).toList == List(0, 1, 3, 6, 10)
)
View

The .view syntax creates a structure that mirrors another structure, until "forced" by an eager operation like .toList, .foreach, .forall, .count.

In the example below, we can see the view in action:

var counted = 0

val resultingList = List(1, 2, 3, 4).view
  .map { num =>
    counted = counted + 1
    num + 1
  }
  .take(2)
  .toList

assert(resultingList == List(2, 3))

assert(counted == 2)

If we add a side-effect inside a map (don't do this normally!), We note that items 3 and 4 are never touched/evaluated, meaning we perform a "lazy" computation.

This is very similar to an Iterator, except views can be Indexed, and also reversed, which is a tremendously useful fact when dealing with arrays, for example when you want to zip two arrays together, such as in CheckArrayIsAPalindrome

On views, you can perform almost any typical collection operation, such as `maxBy`, `count`, `flatMap` and so forth.

And you can get views from almost any data type. Benefits other than lazy computation include potentially fusing of operations by the Java compiler, because instead of creating a new list for every stage, you evaluate new items one-by-one, meaning that if there are any optimisations to be made per one-item basis, you may get a performance boost.

Explanation

The mathematics

We need to break the problem down into parts: if we have a sequence \([a,b,c]\), by brute-force, we would need to look through \([a]\), \([b]\), \([c]\), \([a,b]\), \([b,c]\), and \([a,b,c]\). However, if we had computed \([a,b]\), computing \([a,b,c]\) from scratch would be redundant (as we have to do many summations more than once) and here we see a possibility of optimisation.

How do we split it into parts? If we define the result \(M_e\) to be the maximum sum of subarray ending at position \(e\), then \(M_{e+1}\) is either that, plus the value \(V_{e+1}\) at position \(e+1\), or it is the new value \(V_{e+1}\).

This leads to a formula \(M_{e+1} = \max\{ M_e, M_e + V_{e+1}\} \), and now that we have the list of maximum subarrays ending at position \(e\), we can find the maximum value from those items.

The code

Once you understand the mathematics, the solution becomes quite straightforward.

We `scanLeft` over the Array (see below for explanation). We could then plug in the problem directly into `scanLeft`, and simply get the `max` value of that sequence. However, there are 2 things that we can do to make it more idiomatic Scala and also faster:

  1. Use a state machine: This allows to extract the core formula of the solution and explicitly define the relation between one state of the problem to the next. It also permits us to include this algorithm into Streaming solutions, because it is completely independent of the 'Array'. On top, this also gives us extra type-safety, so that we do not mix up the wrong numbers.
  2. Use a 'view':This gives us increased performance because after the `scanLeft` otherwise we would receive an Array, which means extra allocations. The view on the other hand, makes it lazy, so that computation only happens when we say `maxBy`.

More advanced flavour

There is a more advanced flavour of this problem, which also retrieves back the Array that resulted in this maximum sum. Please see it here:

SubArrayWithMaximumSum