4 min read

Octet Bit Logic

‘Bit strings’ amount to lists of colon-separated value-width terms. Erlang allows such expressions explicitly using << and >> operators; see Erlang Bit Syntax. Examples in general list form, less Erlang specific, might be [0:4, 15:4] meaning 0 for the first most-significant four bits and 15 for the second least-significant four bits within the same octet. The shifted value of the bits comes first before the colon followed by its integer bit width. The list of terms specify an octet by sub-spans of bits, or bit fields.

Take an example as a hypothetical Prolog query. Below, A unifies with ten because binary 1010 corresponds to decimal 10 after shifting; B unifies with 5 because binary 0101 equals decimal 5.

?- octet_bits(2'10100101, [A:4, B:4]).
A = 10,
B = 5.

Octets are bytes by another name but the name ‘octet’ adds an extra important connotation: that of ordering. Extracting ordered bit values from some octet stream is a typical computing requirement for many applications or application sub-components such as communication libraries.

Shift Accumulator

Perhaps the simplest approach tail-recursively walks through the bit-string specification term by term using an accumulated Shift down-counter to pull out and realign the values.

octet_bits(Octet, Fields) :-
    octet_bits(Fields, 8, Octet).

octet_bits([Value:Width|T], Shift, Octet) :-
    Shift_ is Shift - Width,
    Value is (Octet >> Shift_) /\ ((1 << Width) - 1),
    octet_bits(T, Shift_, Octet).
octet_bits([], 0, _Octet).

\(2^w-1\) for width \(w\) defines the bit mask; left shifting \(1\) by \(w\) effectively computes \(2^w\).

Limitations apply. The Octet cannot pass to the predicate as a variable; it must be an integer and therefore a bound ground term. The next section extends the implementation to accomodate scenarios where the octet is an unknown integer.

Variadic Octet

For an unknown integer, the octet_bits/2 predicate needs to reverse the preceding process, running backwards from known values to an unknown octet; bound bit-field values, unbound octet in Prolog terminology.

octet_bits([Value:Width|T], Shift, Octet0, Octet) :-
    Shift_ is Shift - Width,
    Octet_ is Octet0 \/ ((Value /\ ((1 << Width) - 1)) << Shift_),
    octet_bits(T, Shift_, Octet_, Octet).
octet_bits([], 0, Octet, Octet).

What should happen if a given value exceeds the bit-width limits? Fail or mask? In this implementation, it masks.

To and Fro

Combining the shift accumulator with the variadic octet predicate gives the following mode (?, ?) predicate. It can run from octet to fields, i.e. Prolog predicate (+, -) mode, or the other way around (-, +) where plus stands for bound non-variable and minus stands for unbound variable.

octet_bits(Octet, Fields) :-
    var(Octet),
    !,
    octet_bits(Fields, 8, 0, Octet).
octet_bits(Octet, Fields) :-
    octet_bits(Fields, 8, Octet).

octet_bits([], 0, _Octet).
octet_bits([Value:Width|T], Shift, Octet) :-
    Shift_ is Shift - Width,
    Value is (Octet >> Shift_) /\ ((1 << Width) - 1),
    octet_bits(T, Shift_, Octet).

octet_bits([], 0, Octet, Octet).
octet_bits([Value:Width|T], Shift, Octet0, Octet) :-
    Shift_ is Shift - Width,
    Octet_ is Octet0 \/ ((Value /\ ((1 << Width) - 1)) << Shift_),
    octet_bits(T, Shift_, Octet_, Octet).

Predicate octet_bits/2 unifies integral eight-bit Octet with a list of Value:Width terms where the Width integers sum to eight and the Value terms unify with the shifted bit values encoded within the eight-bit byte.

See its deployment to the Canny bag o’ Tudor pack with tests, albeit with some re-factoring. The refactor work comes after realising that the 3- and 4-arity sub-predicates above have distinct utility by themselves: pulling values out of arbitrary integers; or pushing arbitrary values to integer variables. The Shift term can also be any other integer value, not just \(8\); it can handle \(16\) for sixteen-bit words, \(32\) for 32-bit double words and so forth.

Future Pure

Another approach may exist, a layered way of decomposing the shift-OR operations. We could pre-assemble the ground values by shifting them all first. Shift first, OR-wise merge second. Call this the ‘pure’ functional approach. The first step applies shifting. The second step merges by OR’ing the pre-shifted sub-octets.

Future work may explore this alternative.