Count binary gap size of a number in immutable/pure functional Scala using tail recursion

Problem

An algorithm to find the maximum length of a gap between a pair of 1s in the binary representation of an integer. This problem is also known as:

  • On Codility: Find longest sequence of zeros in binary representation of an integer.

Solution

@scala.annotation.tailrec
def removeRightZeroes(number: Int): Int = {
  if (number == 0) number
  else if (isOdd(number)) number
  else removeRightZeroes(number >> 1)
}

def isEven(n: Int): Boolean = !isOdd(n)

def isOdd(n: Int): Boolean = (n & 1) == 1

def maximumBinaryGap(n: Int): Option[Int] = {
  if (n <= 0) None
  else {
    val numberStartingWith1 = removeRightZeroes(n)
    @scala.annotation.tailrec
    def iterate(
        currentNumber: Int,
        currentCount: Int,
        gapCounts: List[Int]
    ): Option[Int] =
      if (currentNumber == 1)
        Some((currentCount :: gapCounts).max).filter(_ > 0)
      else if (isEven(currentNumber))
        iterate(
          currentNumber >> 1,
          currentCount = currentCount + 1,
          gapCounts = gapCounts
        )
      else
        iterate(
          currentNumber = currentNumber >> 1,
          currentCount = 0,
          gapCounts = currentCount :: gapCounts
        )
    iterate(
      currentNumber = numberStartingWith1,
      currentCount = 0,
      gapCounts = Nil
    )
  }
}

Test cases

assert(
  removeRightZeroes(12) == 3,
  "12, with zeroes on the right removed, is 3, because it's 8 + 4 = 0b1100, becoming 0b11 which is 3"
)
assert(
  maximumBinaryGap(6217).contains(4),
  "6217 has a gap of 4, because it's represented as 0b1100001001001"
)
assert(
  maximumBinaryGap(16).isEmpty,
  "16 has no gaps at all because it's represented as 0b10000"
)
assert(
  maximumBinaryGap(1).isEmpty,
  "1 has no gap either because it's represented as 0b0000001"
)

Scala Concepts

Def Inside Def

A great aspect of Scala is being able to declare functions inside functions, making it possible to reduce repetition.

def exampleDef(input: String): String = {
  def surroundInputWith(char: Char): String = s"$char$input$char"
  surroundInputWith('-')
}

Option Type

The 'Option' type describes a computation that either has a result or does not. In Scala, you can 'chain' Option processing and combine it with lists and other data structures. For example, you can turn a pattern match into a function that returns an Option, and vice versa!

Some examples:

final case class Page(title: String, mainCategory: Option[String])

val pages = List(
  Page(title = "Tail Recursion", mainCategory = None /** No category **/ ),
  Page(title = "Option", mainCategory = Some("standard library")),
  Page(title = "zip", mainCategory = Some("standard library"))
)

val categories: Set[String] = pages.flatMap(_.mainCategory).toSet

assert(categories == Set("standard library"))

def pageCategory(title: String): Option[String] = {
  for {
    page <- pages.find(_.title == title)
    category <- page.mainCategory
  } yield category
}

def pageCategory2(title: String): Option[String] =
  pages.find(_.title == title).flatMap(_.mainCategory)

def pageCategory3(title: String): Option[String] =
  pages.collectFirst {
    case Page(`title`, mainCategory) => mainCategory
  }.flatten

assert(pageCategory("zip").contains("standard library"))

assert(pageCategory("zip") == Some("standard library"))

assert(pageCategory2("zip") == pageCategory("zip"))

assert(pageCategory3("zip") == pageCategory("zip"))

assert(List[String]("X").headOption == Some("X"))

assert(List[String]().headOption.isEmpty)

val startWithT: String = Option("Some test") match {
  case Some(value) if value.startsWith("T") => value
  case Some(value)                          => s"T${value}"
  case None                                 => "T"
}

assert(startWithT == "TSome test")
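
The round trip between a pattern match and an Option-returning function, mentioned at the start of this section, can be sketched with the standard library's PartialFunction#lift and Function.unlift. The names OptionLifting, categoryOf, categoryOfOption and categoryOfAgain are hypothetical and exist only for this illustration.

```scala
object OptionLifting {
  // A pattern match expressed as a partial function: defined only for some titles.
  val categoryOf: PartialFunction[String, String] = {
    case "Option" => "standard library"
    case "zip"    => "standard library"
  }

  // lift turns PartialFunction[A, B] into A => Option[B].
  val categoryOfOption: String => Option[String] = categoryOf.lift

  // Function.unlift goes the other way: A => Option[B] back to PartialFunction[A, B].
  val categoryOfAgain: PartialFunction[String, String] =
    Function.unlift(categoryOfOption)
}
```

With these definitions, categoryOfOption returns None for a title that no case matches, and categoryOfAgain is only defined at the titles for which the lifted function returns a Some.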

Tail Recursion

In Scala, tail recursion lets you rewrite a mutable construct, such as a while-loop, as an immutable algorithm.

A tail-recursive function always has its recursive call in the "final" position, i.e. each branch either returns a result (exiting the function) or returns a call to the function itself.
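
To make the "final position" rule concrete, here is a minimal sketch contrasting a recursive call that is not in tail position with one that is. The object TailPosition and both factorial functions are hypothetical names used only for this comparison.

```scala
object TailPosition {
  // NOT tail-recursive: after the recursive call returns we still multiply by n,
  // so the call is not the last step. Annotating this with @tailrec would not compile.
  def factorialNaive(n: Int): BigInt =
    if (n <= 1) BigInt(1)
    else factorialNaive(n - 1) * n

  // Tail-recursive: the pending multiplication is carried in an accumulator,
  // so each branch either returns a result or returns a call to `go`.
  def factorial(n: Int): BigInt = {
    @scala.annotation.tailrec
    def go(remaining: Int, accumulated: BigInt): BigInt =
      if (remaining <= 1) accumulated
      else go(remaining - 1, accumulated * remaining)
    go(n, BigInt(1))
  }
}
```

Both functions compute the same values; only the second has the shape that the compiler can turn into a loop.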

The immutable form gets compiled down to the mutable form. In canonical form, the following:

def evaluateGeneralImmutable[State, Result](initialParams: State)(
    iterate: State => State,
    terminate: State => Boolean,
    extractResult: State => Result
): Result = {
  @scala.annotation.tailrec
  def go(currentParams: State): Result =
    if (terminate(currentParams)) extractResult(currentParams)
    else go(currentParams = iterate(currentParams))

  go(initialParams)
}

becomes (after a stage of compilation):

def evaluateGeneralMutable[State, Result](initialParams: State)(
    iterate: State => State,
    terminate: State => Boolean,
    extractResult: State => Result
): Result = {
  var currentParams: State = initialParams
  while (!terminate(currentParams)) {
    currentParams = iterate(currentParams)
  }
  extractResult(currentParams)
}

This transformation can also be performed the other way round, giving you a purely immutable solution.

What are the benefits of tail recursion?

Tail recursion in Scala relies on a principle known as tail-call optimisation: the compiler rewrites a self-recursive call in tail position into a loop. This lets you write iterative algorithms (that would otherwise be complicated while-loops) in immutable form.
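
One practical consequence is stack safety. The sketch below, with the hypothetical names StackSafety and sumUpTo, recurses a million times without growing the call stack, because the tail call is compiled into a loop. The @tailrec annotation does not perform the optimisation itself; it makes the compiler reject the method if the call is not in tail position.

```scala
object StackSafety {
  // Sums 1 + 2 + ... + n using an accumulator instead of a mutable variable.
  def sumUpTo(n: Int): Long = {
    @scala.annotation.tailrec
    def go(current: Int, total: Long): Long =
      if (current > n) total
      else go(current + 1, total + current)
    go(current = 1, total = 0L)
  }
}
```

Calling StackSafety.sumUpTo(1000000) completes normally, whereas a version that added to the result after the recursive call would need one stack frame per step.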

What are the benefits of immutability?

It becomes easier to reason about your code, and you always know that you can re-run a function as many times as you wish without causing unexpected side effects.

But really, can anything be written in this shape?

Anything that is iterative in nature can, so long as it can be represented in the canonical form.
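
As an illustration of fitting an iterative algorithm into the canonical form, here is a sketch of Euclid's greatest common divisor. The evaluate function repeats the shape of evaluateGeneralImmutable above so that the snippet is self-contained; CanonicalGcd, evaluate and gcd are hypothetical names.

```scala
object CanonicalGcd {
  // Same shape as the canonical immutable form: iterate until terminate holds.
  def evaluate[State, Result](initial: State)(
      iterate: State => State,
      terminate: State => Boolean,
      extractResult: State => Result
  ): Result = {
    @scala.annotation.tailrec
    def go(current: State): Result =
      if (terminate(current)) extractResult(current)
      else go(iterate(current))
    go(initial)
  }

  // Euclid's algorithm: the state is the pair (a, b); stop once b reaches 0.
  def gcd(a: Int, b: Int): Int =
    evaluate[(Int, Int), Int]((a, b))(
      iterate = { case (x, y) => (y, x % y) },
      terminate = { case (_, y) => y == 0 },
      extractResult = { case (x, _) => x }
    )
}
```

The whole algorithm is described by three small functions over an immutable state, which is the sense in which it is "represented in the canonical form".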

Let's look at two versions of List#drop(n), one mutable and one immutable (see also Drop, Take, dropRight, takeRight):

def dropMutable[T](list: List[T], n: Int): List[T] = {
  var remaining = n
  var returnList = list
  while (remaining > 0 && returnList.nonEmpty) {
    remaining = remaining - 1
    returnList = returnList.tail
  }
  returnList
}

def dropImmutable[T](list: List[T], n: Int): List[T] = {
  @scala.annotation.tailrec
  def go(remaining: Int, returnList: List[T]): List[T] = {
    // `<= 0` so that a negative n drops nothing, matching dropMutable
    if (remaining <= 0) returnList
    else
      returnList match {
        case _ :: rest => go(remaining - 1, rest)
        case Nil       => Nil
      }
  }
  go(remaining = n, list)
}

assert(dropMutable(List(1, 2, 3), 2) == List(3))

assert(dropImmutable(List(1, 2, 3), 2) == List(3))

The key thing to notice really is that you move all the `var`s to arguments of the `go` function.
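
The same move works when the loop also builds up a result. As a sketch, here is List#take(n) written both ways; TakeExample, takeMutable and takeImmutable are hypothetical names, and each `var` of the first version reappears as a parameter of `go` in the second.

```scala
object TakeExample {
  // Mutable version: three vars track the countdown, the unread items
  // and the items collected so far (in reverse order).
  def takeMutable[T](list: List[T], n: Int): List[T] = {
    var remaining = n
    var rest = list
    var collected = List.empty[T]
    while (remaining > 0 && rest.nonEmpty) {
      collected = rest.head :: collected
      rest = rest.tail
      remaining = remaining - 1
    }
    collected.reverse
  }

  // Immutable version: each var has become an argument of `go`.
  def takeImmutable[T](list: List[T], n: Int): List[T] = {
    @scala.annotation.tailrec
    def go(remaining: Int, rest: List[T], collected: List[T]): List[T] =
      rest match {
        case head :: tail if remaining > 0 =>
          go(remaining - 1, tail, head :: collected)
        case _ => collected.reverse
      }
    go(n, list, Nil)
  }
}
```

The loop condition of the mutable version turns into the pattern guard, and the loop body turns into the arguments of the recursive call.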

Very detailed advanced example:

Let's try to implement List#foldLeft (foldLeft and foldRight):

Here is an example of how to turn a function from mutable to immutable, step by step. Warning: a lot of boilerplate code.

def foldLeftMutable[T, S](list: List[T])(initial: S)(f: (S, T) => S): S = {
  var currentResult: S = initial
  var remaining: List[T] = list
  while (remaining.nonEmpty) {
    currentResult = f(currentResult, remaining.head)
    remaining = remaining.tail
  }
  currentResult
}

final case class CurrentState[S, T](
    currentResult: S,
    remainingItems: List[T]
)

def foldLeftMutableSimplified[T, S](
    list: List[T]
)(initial: S)(f: (S, T) => S): S = {
  var currentResult: CurrentState[S, T] =
    CurrentState(currentResult = initial, remainingItems = list)
  while (currentResult.remainingItems.nonEmpty) {
    currentResult = CurrentState(
      currentResult =
        f(currentResult.currentResult, currentResult.remainingItems.head),
      remainingItems = currentResult.remainingItems.tail
    )
  }
  currentResult.currentResult
}

def foldLeftCanonicalMutable[T, S](
    list: List[T]
)(initial: S)(f: (S, T) => S): S = {
  evaluateGeneralMutable(
    CurrentState(currentResult = initial, remainingItems = list)
  )(
    currentResult =>
      CurrentState(
        currentResult =
          f(currentResult.currentResult, currentResult.remainingItems.head),
        remainingItems = currentResult.remainingItems.tail
      ),
    _.remainingItems.isEmpty,
    _.currentResult
  )
}

def foldLeft[T, S](list: List[T])(initial: S)(f: (S, T) => S): S =
  evaluateGeneralImmutable[CurrentState[S, T], S](
    CurrentState(initial, list)
  )(
    iterate = currentState =>
      CurrentState(
        currentResult =
          f(currentState.currentResult, currentState.remainingItems.head),
        remainingItems = currentState.remainingItems.tail
      ),
    terminate = _.remainingItems.isEmpty,
    extractResult = _.currentResult
  )

def foldLeftInlined[T, S](list: List[T])(initial: S)(f: (S, T) => S): S = {
  type State = CurrentState[S, T]
  type Result = S

  val initialParams: State = CurrentState(initial, list)
  val iterate: State => State = currentState =>
    CurrentState(
      currentResult =
        f(currentState.currentResult, currentState.remainingItems.head),
      remainingItems = currentState.remainingItems.tail
    )
  val terminate: State => Boolean = _.remainingItems.isEmpty
  val extractResult: State => Result = _.currentResult
  @scala.annotation.tailrec
  def go(currentParams: State): Result =
    if (terminate(currentParams)) extractResult(currentParams)
    else go(currentParams = iterate(currentParams))

  go(initialParams)
}

def foldLeftInlinedFurther[T, S](
    list: List[T]
)(initial: S)(f: (S, T) => S): S = {
  type State = CurrentState[S, T]
  type Result = S

  @scala.annotation.tailrec
  def go(currentParams: State): Result =
    if (currentParams.remainingItems.isEmpty) currentParams.currentResult
    else
      go(currentParams = {
        val currentState = currentParams
        CurrentState(
          currentResult =
            f(currentState.currentResult, currentState.remainingItems.head),
          remainingItems = currentState.remainingItems.tail
        )
      })

  go(CurrentState(initial, list))
}

def foldLeftInlinedState[T, S](
    list: List[T]
)(initial: S)(f: (S, T) => S): S = {

  @scala.annotation.tailrec
  def go(currentResult: S, remainingItems: List[T]): S =
    if (remainingItems.isEmpty) currentResult
    else
      go(
        currentResult = f(currentResult, remainingItems.head),
        remainingItems = remainingItems.tail
      )

  go(initial, list)
}

def foldLeftCompact[T, S](list: List[T])(initial: S)(f: (S, T) => S): S = {

  @scala.annotation.tailrec
  def go(currentResult: S, remainingItems: List[T]): S =
    remainingItems match {
      case head :: tail =>
        go(currentResult = f(currentResult, head), remainingItems = tail)
      case Nil => currentResult
    }

  go(initial, list)
}

Explanation

Tail recursion allows us to perform iteration without having to mutate variables. While Scala permits mutation, immutability allows for more possibilities, such as being able to adapt an algorithm to run in a streamed way.
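
As one possible reading of "streamed", the sketch below assumes the bits arrive one at a time from an Iterator[Boolean] (true for 1, false for 0) and folds them into an immutable state. StreamedGap, GapState and maxGapOfBits are hypothetical names and are not part of the solution above.

```scala
object StreamedGap {
  // State carried from bit to bit: whether a 1 has been seen yet, the length
  // of the current run of zeroes, and the longest closed gap so far.
  final case class GapState(seenOne: Boolean, currentRun: Int, longestGap: Int)

  // Consumes the bits lazily, so the full sequence never has to be held at once.
  def maxGapOfBits(bits: Iterator[Boolean]): Option[Int] = {
    val initial = GapState(seenOne = false, currentRun = 0, longestGap = 0)
    val finalState = bits.foldLeft(initial) {
      case (state, true) =>
        // A 1 closes the current run of zeroes and starts a new one.
        GapState(
          seenOne = true,
          currentRun = 0,
          longestGap = state.longestGap.max(state.currentRun)
        )
      case (state, false) if state.seenOne =>
        state.copy(currentRun = state.currentRun + 1)
      case (state, false) =>
        state // zeroes before the first 1 are ignored
    }
    // A run of zeroes that is never closed by a 1 is not counted as a gap.
    Some(finalState.longestGap).filter(_ > 0)
  }
}
```

Because a gap is symmetric, the result is the same whether the bits are supplied from the most significant or the least significant end.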

We have to consider 5 kinds of input number in this algorithm:

  1. Pure zeroes: 0b000000, which has no gaps
  2. Numbers like 0b0000100, which also have no gaps
  3. Numbers like 0b0001100, which have no gaps
  4. Numbers like 0b001001 and 0b00100100, which have a gap
  5. Numbers like 0b01000101, which have 2 gaps, resulting in a max gap of 3
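
The five cases above can be written down as expectations. The sketch below checks them with a deliberately different, string-based approach, so that it is self-contained; FiveCases, gapViaString and fromBinary are hypothetical helpers, not the bit-shifting solution. Binary values are parsed from strings because binary literals are not available in every Scala version.

```scala
object FiveCases {
  // Runs of zeroes that have a 1 immediately before and immediately after them.
  private val enclosedZeroes = "(?<=1)0+(?=1)".r

  // Cross-check only: finds the longest enclosed run in the binary string.
  def gapViaString(n: Int): Option[Int] =
    if (n <= 0) None
    else
      enclosedZeroes
        .findAllIn(Integer.toBinaryString(n))
        .map(_.length)
        .reduceOption(_ max _)

  def fromBinary(bits: String): Int = Integer.parseInt(bits, 2)

  val expectations: List[(String, Option[Int])] = List(
    "000000"   -> None,    // case 1: pure zeroes
    "0000100"  -> None,    // case 2: a single 1
    "0001100"  -> None,    // case 3: adjacent 1s only
    "001001"   -> Some(2), // case 4: one gap
    "00100100" -> Some(2), // case 4: one gap, plus trailing zeroes
    "01000101" -> Some(3)  // case 5: gaps of 3 and 1
  )
}
```

Each pair in expectations can be checked against gapViaString, or against any other implementation of the same function.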

Bit-wise operations come to our help: for example, 0b110 & 0b010 == 0b010, and bit-shifting gives 0b110 >> 1 == 0b011. Once familiar with these, we can walk through the binary digits of a number by repeatedly checking its last digit and then shifting it right by one.
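
A minimal sketch of that iteration follows, collecting the binary digits from the least significant end. BitIteration and bitsFromRight are hypothetical names, and the function assumes a non-negative input, since `>>` keeps the sign bit and would never reach 0 for a negative number.

```scala
object BitIteration {
  // `remaining & 1` reads the last binary digit; `remaining >> 1` drops it.
  def bitsFromRight(n: Int): List[Int] = {
    require(n >= 0, "only non-negative numbers are supported in this sketch")
    @scala.annotation.tailrec
    def go(remaining: Int, collected: List[Int]): List[Int] =
      if (remaining == 0) collected.reverse
      else go(remaining >> 1, (remaining & 1) :: collected)
    go(n, Nil)
  }
}
```

For 6, which is 110 in binary, the digits come out as List(0, 1, 1): the rightmost digit first.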

Initially, we ignore any trailing sequence of zeroes (the ones on the right); when we reach a 1, we begin counting; if the next digit is a 1, we push the current count onto a stack (in Scala, a List) and reset the counter; if the next digit is a 0, we increment the counter; and we repeat until the number has been shifted down to a single 1, at which point the largest count is the answer.