Also available as PDF. Code at GitHub.
Roman Numerals
Friends, Romans, countrymen, lend me your ears—Marc Antony from William Shake-speare’s (likely Edward de Vere’s) Julius Caesar.
Apparently, men think about the Roman Empire1 regularly, a significant proportion of us at least once a week. I confess to be among them. The United Kingdom is full of Roman ruins; it is hard to escape their legacy, especially in the north.
Carpe Diem
With the Romans in mind, how easily can software logic convert between Roman numerals and Arabic numbers? Strictly speaking, what we traditionally call Arabic numbers should be called Hindu-Arabic numbers since the Arabic mathematicians took the decimal system from the Hindu mathematicians. Call it “decimal” for short.
Roman to decimal and back. Forwards and backwards matters. The solution must give Roman for decimal and decimal for Roman. Complete symmetry is required.
Veni, Vidi, Vici
Fundamentally, the Roman “number” represents a numerical sum using 9, 5, 4 and 1 as the principal factors. , and . Prefix and with to represent and , respectively. The same pattern recurs for , , and .
In definite-clause grammar terms (Triska 2024) this formula becomes the following logical phrase.
roman_numeral(1000) --> "M".
roman_numeral(900) --> "CM".
roman_numeral(500) --> "D".
roman_numeral(400) --> "CD".
roman_numeral(100) --> "C".
roman_numeral(90) --> "XC".
roman_numeral(50) --> "L".
roman_numeral(40) --> "XL".
roman_numeral(10) --> "X".
roman_numeral(9) --> "IX".
roman_numeral(5) --> "V".
roman_numeral(4) --> "IV".
roman_numeral(1) --> "I".
Here, I ignore the upper-lower-case issue. Roman numerals have upper case only, although frequently the numerals can have either case, albeit consistently all upper or all lower.
These are the factors. Composite numerals comprise a sequence of these sub-phrases. They need to add up to some numerical value. Enter constraint-logic programming for finite domains or (Triska 2025) for symbolically representing logical relations between numbers.
roman_numerals(Number) -->
{ Number #> 0,
Number #= Number0 + Number1
},
roman_numeral(Number0),
roman_numerals(Number1).
roman_numerals(0) --> [].
This is a recursive phrase. For every , there is a sum
where is a Roman numeral, and
is the sum of subsequent Roman numerals. Finally, no Roman
numeral []
corresponds to . The initial clause applies arithmetic
constraints. The terms do
not need to bind to integers initially. Hence, the constraints appear
first at the head of the predicate.
“Quod erat demonstrandum!”
How does it work?
Prolog searches the problem space using the given constraints. The sum of numbers must always sum to a positive result. The sum can never be zero or below. Romans did not grasp zero or negatives, apparently. Perhaps they did grasp the abstractions but found in them little practical value; Romans were pragmatic people after all.
The backtracking logic has an interesting side effect. It finds all the possible Roman representations of a number. Using the simple Roman combinatorial logic, one number has multiple alternative representation in Roman numerals.
Take for example.
?- phrase(roman_numerals(5), A), string_codes(B, A).
A = [86],
B = "V" ;
A = `IVI`,
B = "IVI" ;
A = `IIV`,
B = "IIV" ;
A = `IIIII`,
B = "IIIII".
There are four alternative representations:
Clause order matters for the roman_numeral//1
predicate; big numbers
must come first so that the solution finds larger factors before smaller
ones. Romans being Roman, the “correct” representation is the shortest
possible representation, or put another way: the form with the largest
possible factors. Hence big factors take priority over smaller ones.
It helps, therefore, to wrap the grammar using a cut (!
). The
following terminates the search for a solution when it finds the first
one; the first solution being the sum with the largest factors.
roman_number(Roman, Number) :-
phrase(roman_numerals(Number), Roman),
!.
What’s Roman for ?
?- roman_number(A, 9999).
A = `MMMMMMMMMCMXCIX`.
?- roman_number(`MMMMMMMMMCMXCIX`, B).
B = 9999.
The Roman term, in the previous query, unifies with a list of character codes hence the backticks.
Strengths and Weaknesses
The logical implementation has a somewhat surprising outcome: Roman numerals have alternative renderings.
The representation for 0 is logically blank. That makes total sense, come to think about it. Subtly, the unification does not fail. It succeeds with nothing instead. This implies that the Romans could represent zero by writing nothing.
?- phrase(roman_numerals(0), A).
A = [].
Prolog makes the implementation pretty simple, but there are some subtleties: clause order has semantic significance; green cutting likewise.
References
Napoleon and Hitler also↩︎