# Run-length encoding (RLE) Encoder

## Algorithm goal

Run-length encoding is one of the most basic compression methods. It is especially useful where there are long runs of a particular character or group of characters, for example in column-based databases, which may contain many repeating values.

Run-length encoding effectively turns a sequence of characters into a sequence of count-character pairs. For example, WWWAAW becomes 3W2A1W.

This algorithm performs the Encoding part; see RunLengthEncodingDecoder for the Decoding part.

## Explanation

There is a Tail Recursion approach that can be taken; while more obvious to implement, it has two disadvantages:

• It is not possible to make it work in a streaming fashion (because who wants to run out of memory?).
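For comparison, a tail-recursive encoder might look roughly like the sketch below. It is only an illustration: `encodeRunLengthRecursive` and its `go` helper are hypothetical names, and the output is built in a `StringBuilder` for brevity. Note how it needs the whole input as a `List` up front and returns nothing until the recursion has finished.

```scala
import scala.annotation.tailrec

// Hypothetical helper for illustration only; not the solution presented on this page.
def encodeRunLengthRecursive(input: String): String = {
  @tailrec
  def go(remaining: List[Char], current: Option[(Char, Int)], acc: StringBuilder): String =
    (remaining, current) match {
      // Nothing left and no run in progress: we are done
      case (Nil, None) => acc.toString
      // Nothing left: flush the final run
      case (Nil, Some((char, count))) => acc.append(count).append(char).toString
      // Same character as the current run: extend the run
      case (head :: tail, Some((char, count))) if head == char =>
        go(tail, Some((char, count + 1)), acc)
      // Different character: flush the current run and start a new one
      case (head :: tail, Some((char, count))) =>
        go(tail, Some((head, 1)), acc.append(count).append(char))
      // First character of the input: start the first run
      case (head :: tail, None) =>
        go(tail, Some((head, 1)), acc)
    }

  go(input.toList, None, new StringBuilder)
}

assert(encodeRunLengthRecursive("WWWAAW") == "3W2A1W")
```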

We present an immutable, collection-based approach that relies only on plain, lazy transformations, with no index accesses to worry about. It is also very efficient through the use of View and, because of that, is very easy to adapt to streamed approaches. (this is © from www.scala-algorithms.com)

There are 2 stages to consider: computing the cumulative count of each character (WWWAAW -> (1, W), (2, W), (3, W), (1, A), (2, A), (1, W)), and then detecting transitions between characters (we only emit a record once a run has reached its maximum count, which we know has happened when a different character, or the end of the input, is reached). When dealing with transitions, it is typically worth considering the Sliding / Sliding Window concept.
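The two stages can be sketched as follows. This is a minimal sketch under stated assumptions, not the 29-line solution this page refers to: the name `encodeRunLengthSketch` is hypothetical, it works on an `Iterator` rather than a View so that `sliding` yields plain `Seq` windows, and it appends a `None` marker so that the final run also has a "next" element to compare against.

```scala
// Hypothetical name; an illustrative sketch of the two stages described above.
def encodeRunLengthSketch(input: String): String = {
  // Stage 1: running count per character,
  // e.g. WWWAAW -> (1,W), (2,W), (3,W), (1,A), (2,A), (1,W)
  val counted: Iterator[(Int, Char)] =
    input.iterator
      .scanLeft(Option.empty[(Int, Char)]) {
        case (Some((count, previous)), char) if previous == char => Some((count + 1, char))
        case (_, char) => Some((1, char))
      }
      .collect { case Some(pair) => pair } // drop the initial empty state

  // Stage 2: look at each element together with its successor.
  // A run is complete when the successor starts a new run (count 1)
  // or when the end-of-input marker (None) follows.
  (counted.map(Option(_)) ++ Iterator(None))
    .sliding(2)
    .collect {
      case Seq(Some((count, char)), None) => s"$count$char"
      case Seq(Some((count, char)), Some((1, _))) => s"$count$char"
    }
    .mkString
}

assert(encodeRunLengthSketch("WWWAAW") == "3W2A1W")
```

Both stages are lazy here, so each character is only examined when the final `mkString` pulls it through; only the resulting String is held in full.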

## Scala Concepts & Hints

### Collect

'collect' lets you use Pattern Matching to filter and map items in a single step.

    assert("Hello World".collect {
      case character if Character.isUpperCase(character) => character.toLower
    } == "hw")


### Option Type

The 'Option' type is used to describe a computation that either has a result or does not. In Scala, you can 'chain' Option processing and combine it with lists and other data structures. For example, you can also turn a pattern-match into a function that returns an Option, and vice-versa!

    assert(Option(1).flatMap(x => Option(x + 2)) == Option(3))

    assert(Option(1).flatMap(x => None) == None)
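To illustrate the conversion between a pattern-match and an Option-returning function, here is a small sketch using the standard library's `PartialFunction.lift` and `Function.unlift`. The names `halveEven`, `halveEvenOption` and `backToPartial` are made up for this example.

```scala
// A partial function defined by pattern matching: only even numbers are handled
val halveEven: PartialFunction[Int, Int] = {
  case n if n % 2 == 0 => n / 2
}

// lift: pattern-match -> function returning an Option
val halveEvenOption: Int => Option[Int] = halveEven.lift
assert(halveEvenOption(4) == Some(2))
assert(halveEvenOption(3) == None)

// unlift: function returning an Option -> partial function usable with collect
val backToPartial: PartialFunction[Int, Int] = Function.unlift(halveEvenOption)
assert(List(1, 2, 3, 4).collect(backToPartial) == List(1, 2))
```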


### Pattern Matching

Pattern matching in Scala lets you quickly identify what you are looking for in your data, and also extract it.

    assert("Hello World".collect {
      case character if Character.isUpperCase(character) => character.toLower
    } == "hw")


### scanLeft and scanRight

Scala's scan functions enable you to do folds like foldLeft and foldRight, while collecting the intermediate results.

    assert(List(1, 2, 3, 4, 5).scanLeft(0)(_ + _) == List(0, 1, 3, 6, 10, 15))
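For completeness, a short sketch of the right-hand counterpart and of how a scan relates to a fold. The value names are chosen only for this example.

```scala
val numbers = List(1, 2, 3, 4, 5)

// scanRight accumulates from the right, so the running totals appear in reverse
val suffixSums = numbers.scanRight(0)(_ + _)
assert(suffixSums == List(15, 14, 12, 9, 5, 0))

// The last element of a scanLeft is exactly what foldLeft would return
val prefixSums = numbers.scanLeft(0)(_ + _)
assert(prefixSums.last == numbers.foldLeft(0)(_ + _))
```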


### Sliding / Sliding Window

Get fixed-length sliding sub-sequences (sliding windows) from another sequence.
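A small sketch of `sliding` in action, including how windows of size 2 expose the transitions between neighbouring elements. The value names are only for illustration.

```scala
// Every window of 2 neighbouring elements
val windows = List(1, 2, 3, 4).sliding(2).toList
assert(windows == List(List(1, 2), List(2, 3), List(3, 4)))

// Windows of size 2 make it easy to spot where one run ends and the next begins
val transitions = "WWWAAW".toList.sliding(2).collect {
  case List(previous, next) if previous != next => (previous, next)
}.toList
assert(transitions == List(('W', 'A'), ('A', 'W')))

// When the input is shorter than the window, a single partial window is returned
assert(List(1).sliding(2).toList == List(List(1)))
```

The last assertion matters in practice: code that pattern-matches on full windows needs to decide what to do with a shorter one.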

### Stack Safety

Stack safety is present where a function cannot crash by exceeding the limit on the number of nested recursive calls.

In Scala Algorithms, we try to write the algorithms in a stack-safe way, where possible, so that when you use the algorithms, they will not crash on large inputs. However, stack-safe implementations are often more complex, and in some cases, overly complex, for the task at hand.

The function below will work for a small range such as 1 to 5, but will not work for a sufficiently large one (it crashes with java.lang.StackOverflowError; the exact threshold depends on the JVM's stack size) - however there is a way to fix it :-)

    def sum(from: Int, until: Int): Int =
      if (from == until) until else from + sum(from + 1, until)

    def thisWillSucceed: Int = sum(1, 5)

    // One nested call per number: a range this large is expected to overflow a default-sized JVM stack
    def thisWillFail: Int = sum(1, 1000000)
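One common way to fix a function like `sum` is to rewrite it so that the recursive call is the last thing it does, which lets the Scala compiler turn the recursion into a loop. The sketch below assumes that approach; `sumStackSafe` and `go` are hypothetical names, and `@tailrec` simply asks the compiler to reject the code if the call is not in tail position.

```scala
import scala.annotation.tailrec

// Hypothetical stack-safe variant of sum: same inclusive range, but the
// running total is carried in an accumulator instead of on the call stack.
def sumStackSafe(from: Int, until: Int): Int = {
  @tailrec
  def go(current: Int, accumulator: Int): Int =
    if (current == until) accumulator + until
    else go(current + 1, accumulator + current)

  go(from, 0)
}

assert(sumStackSafe(1, 5) == 15)
// 60000 iterations run in constant stack space
assert(sumStackSafe(1, 60000) == 1800030000)
```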


### View

The .view syntax creates a lazy structure that mirrors another structure; its transformations are only computed when "forced" by an eager operation such as .toList, .foreach, .forall or .count.
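The laziness can be observed with a small experiment. This is only a sketch: the `evaluated` counter is a deliberately mutable variable added to count how often the mapping function runs.

```scala
// Counts how many times the mapping function is actually invoked
var evaluated = 0

val doubledView = List(1, 2, 3, 4, 5).view.map { x =>
  evaluated += 1
  x * 2
}

// Nothing has been computed yet: the view only describes the transformation
assert(evaluated == 0)

// Forcing with toList computes only what is needed for the first two elements
val firstTwo = doubledView.take(2).toList
assert(firstTwo == List(2, 4))
assert(evaluated == 2)
```

Note that a view does not cache its results, so forcing it again repeats the work.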

## Algorithm in Scala

29 lines of Scala (version 2.13), showing how concise Scala can be!

## Test cases in Scala

    assert(encodeRunLength("") == "")
    assert(encodeRunLength("W") == "1W")
    assert(encodeRunLength("WWW") == "3W")
    assert(encodeRunLength("WWWAA") == "3W2A")
    assert(encodeRunLength("WWWAAW") == "3W2A1W")
    assert(encodeRunLength("WWWABBWW") == "3W1A2B2W")

    // Stub: replace ??? with an implementation to make the assertions pass
    def encodeRunLength(inputString: String): String = ???