HackerRank has a foul-mouthed yet interesting brain f**k challenge. In a nutshell, the challenge requires a virtual machine interpreter that outputs “Hello World!” by running the following program.
+++++ +++++ initialize counter (cell #0) to 10
[ use loop to set the next four cells to 70/100/30/10
> +++++ ++ add 7 to cell #1
> +++++ +++++ add 10 to cell #2
> +++ add 3 to cell #3
> + add 1 to cell #4
<<<< - decrement counter (cell #0)
]
> ++ . print 'H'
> + . print 'e'
+++++ ++ . print 'l'
. print 'l'
+++ . print 'o'
> ++ . print ' '
<< +++++ +++++ +++++ . print 'W'
> . print 'o'
+++ . print 'r'
----- - . print 'l'
----- --- . print 'd'
> + . print '!'
The program’s characters act as operators:
<
and>
step a data pointer respectively backwards and forwards;[
and]
conditionally jump based on the zero or non-zero value of the current data pointer’s storage cell;+
and-
increment and decrement the byte at the current data address with 8-bit wrap-around behaviour. This implies a byte-oriented data store..
and,
output and input ASCII characters, respectively. They can only be ASCII because the data storage locations only hold 8-bit values; and finally,- anything else acts as a no-operation.
The virtual machine therefore comprises two sequentially-accessed stores, one for instructions and one for 8-bit data bytes. The data store is ‘infinite’ meaning that you can never not increment the data pointer and find a new mutable byte, initially zero. The program itself acts as a ‘tape’ albeit a read-only tape for instruction codes that you can fast-forward or rewind on a jump operation.
The concept reminisces the classic Turing machine (TURING 1996). In this case the model fits a two-tape machine since the instruction store acts as a tape as well as the data tape. Tape elements differ however. The instruction tape comprises ‘code points’ of typically ASCII characters. Though no restriction against Unicodes exists. Every non-interpreted character acts as a no-operation and hence useful for comments, including non-ASCII characters such as Chinese Unicode glyphs—for instance.
Functional Journey
For no particular reason other than personal preference, the implementation will target functional languages Elixir and Prolog simultaneously. Besides, this is the kind of thing you do on a damp Queen’s Platinum Jubilee Bank Holiday.
See the GitHub repository here for ‘TL; DR’ results. The Prolog predicates live on GitHub as well but in the form of a Gist. To cut a long story short, the novel approach presented utilises a two-element list tuple to emulate a two-tape Turing machine. Simulating a Turing tape using a list pair proves successful and useful. List pairs solve the problem, maintain good linear performance and present no disadvantages.
Tape Storage
Lists can model what Alan Turing thought of as a ‘tape.’ An abstract tape is a storage entity that has a current position at which you can read or write. The abstract Turing machine can advance or reverse the tape so that its current position goes to the next or previous place for subsequent read or write operations.
Let the tape store be a pair of lists with four aggregate operations: (1) fetch from head, (2) store to head, (3) advance head by one cell, (4) rewind head by one cell. One list contains the current head datum followed by all its subsequent cells possibly extending to infinity. The other list comprises the data already seen by the tape reader but in reverse order; the seen-list’s head is the tape head’s previous cell.
The Elixir BrainF__k
module, elided for the four basic ‘tape’ predicates:
@type tape :: {list(), list()}
@spec fetch(tape()) :: any
def fetch({_, [h | _]}), do: h
@spec store(tape(), any) :: tape()
def store({t0, [_ | t]}, h), do: {t0, [h | t]}
@spec forward(tape()) :: tape()
def forward({t0, [h | t]}), do: {[h | t0], t}
@spec reverse(tape()) :: tape()
def reverse({[h | t0], t}), do: {t0, [h | t]}
Important to realise that the first list in the tape pair stores its contents in reverse order. The entire store in proper order computes as the first list reversed prepended to the second list.
Jumping around requires that the tape can forward or reverse until some matching value appears at the head. The naive implementation searches for the very next list element matching some \(h\), see below.
@spec forward(tape(), any) :: tape()
def forward(t0, h) do
t = forward(t0)
case fetch(t) do
^h -> t
_ -> forward(t, h)
end
end
Notice the Elixir “pin” operator (^
) to exclude rebinding; the expression ^h -> t
wants to pattern match against \(h\)’s existing binding rather than rebind \(h\).
A bigger problem exists with this naive interpretation of the requirements. Angled brackets can nest. Finding a ]
means finding at the same level; or skip the next ]
if searching sees another [
first in-between. Skip the next two nested brackets if two complementary brackets intervene, and so on.
@spec forward(tape(), any, any, number) :: tape()
def forward(t0, h, nest, skip \\ 0) do
t = forward(t0)
case fetch(t) do
^h ->
case skip do
0 -> t
_ -> forward(t, h, nest, skip - 1)
end
^nest ->
forward(t, h, nest, skip + 1)
_ ->
forward(t, h, nest, skip)
end
end
An equivalent tail-recursive reverse(t0, h, nest, skip)
exists.
Virtual Machine
Now that the ‘tape’ abstraction exists along with its four primary operations, the virtual machine for brain f**k can apply the operators to an incoming program tape and a default zero-initialised data tape.
Another important unstated requirement: no-operations do not count as operations. Sounds obvious when you say it. HackerRank checks for this in their test cases. The maximum number of operations means the maximum allowed number of operations before aborting.
Operators <
and >
decrement and increment the data pointer, respectively, as follows.
defp f__k_(?<, code, data, ops), do: {forward(code), reverse(data), ops + 1}
defp f__k_(?>, code, data, ops), do: {forward(code), forward_inf(data, 0), ops + 1}
Operators [
and ]
follow; they perform condition branching.
defp f__k_(?[, code, data, ops) do
{case fetch(data) do
0 -> forward(code, ?], ?[)
_ -> forward(code)
end, data, ops + 1}
end
defp f__k_(?], code, data, ops) do
{case fetch(data) do
0 -> forward(code)
_ -> reverse(code, ?[, ?])
end, data, ops + 1}
end
Operators +
and -
increment or decrement the byte at the data store’s head.
defp f__k_(?+, code, data, ops) do
{forward(code),
store(
data,
case fetch(data) do
255 -> 0
xx -> xx + 1
end
), ops + 1}
end
defp f__k_(?-, code, data, ops) do
{forward(code),
store(
data,
case fetch(data) do
0 -> 255
xx -> xx - 1
end
), ops + 1}
end
Input and output occurs on ,
and .
, or comma and full-stop.
defp f__k_(?., code, data, ops) do
:ok =
[fetch(data)]
|> to_string()
|> IO.write()
{forward(code), data, ops + 1}
end
defp f__k_(?,, code, data, ops) do
[xx] =
case IO.read(1) do
:eof -> [0]
s -> to_charlist(s)
end
{forward(code), store(data, xx), ops + 1}
end
And finally the default no-operation.
defp f__k_(_, code, data, ops), do: {forward(code), data, ops}
The last piece of the puzzle, predicate f__k/1
accepts a list of program codes, or code-point characters representing the program, and runs up to \(10^5\) operations until complete.
@max_ops 100_000
def f__k(code), do: f__k({[], code}, {[], [0]}, 0)
def f__k({code, []}, data, ops), do: {:ok, {code, []}, data, ops}
def f__k(code, data, ops) when ops == @max_ops, do: {:error, code, data, ops}
def f__k(code, data, ops) do
{code, data} = f__k_(fetch(code), code, data)
f__k(code, data, ops + 1)
end
Lessons Learnt
The solution has two layers. Lower-level logic deals with the concept of a tape, and an infinite tape for simulating the data store. The instruction store is not infinite because it terminates when the tape ‘runs out.’ The upper layer handles the operation decoding and counting. Adding layers of abstraction help to decompose complexity.
Elixir and its sister Erlang have a disparity between arguments and return values. Prolog unifies with its arguments, bound or unbound. This results in semi-deterministic behaviour or non-deterministic with back-tracking. Prolog predicates have no “return values” as such. Either the predicate succeeds, fails or throws. The Erlang-ian cousins take a different approach where return values take the place of unbound arguments. Though they do not permit multiple return values, and hence require a tuple wrapper. Consequently you often see tuple returns in the form {:ok, ...}
to indicate success.
Hacker Postscript
Input and output differs for the actual HackerRank challenge. The program and its input both come from standard input.
Note that you can redirect standard input and output using ExUnit
’s CaptureIO
module; you specify a string for use as standard input and a function. The capture_io/2
function runs the function and returns the standard output captured as a string, or an empty string if no output. See the sample below. The nested IO.read(:line)
reads the “hello” string; the nested IO.puts()
writes to a string, and the outer IO.puts/1
writes the captured string to actual standard output. HackerRank’s solution runner makes this module available, conveniently.
ExUnit.CaptureIO.capture_io("hello", fn ->
IO.read(:line)
|> IO.puts()
end)
|> IO.puts()