Semantically ordered arguments should be lexically ordered too

In his C++Now talk “Preconditions, Postconditions, Invariants: How They Help Write Robust Programs,” Andrzej Krzemieński used as an example a function inRange with signature

bool inRange(int val, int lo, int hi)
    [[pre: lo <= hi]]
{
    return (lo <= val && val <= hi);
}

Part of why we want to specify a precondition here is that it is astoundingly easy for the human programmer to mix up the order of their arguments:

void assertInConfiguredRange(int existingValue) {
    int lower = config.get("LOWER");
    int upper = config.get("UPPER");
    assert(inRange(lower, upper, existingValue)); // oops
}

Besides adding a runtime-checkable precondition, Andrzej also suggested the present-day solution of introducing a strong type:

struct IntRange { int lo; int hi; };
bool inRange(int val, IntRange r);

assert(inRange({lower, upper}, existingValue)); // compile-time error

assert(inRange(existingValue, {lower, upper})); // OK

But if you’re stuck with three integers (maybe for performance, or C compatibility, or something), then I claim (completely tangential to the point of Andrzej’s talk) that the original inRange puts those three integer arguments in the wrong order!

Lexical order should match expected semantic order

The reason we write

return lo <= val && val <= hi;

instead of, say,

return val <= hi && val >= lo;

is that the former puts the numbers in “number-line order”: we expect that val’s value is (semantically, numerically) between lo’s and hi’s, so we place val’s identifier (physically, lexically) between lo’s and hi’s. Languages such as Python and Raku (but maybe only those two?) explicitly support this idiom via “chained comparisons”:

return lo <= val <= hi

The same idiom can be used for function parameter ordering. In fact we see this idiom consistently throughout the entire Stepanov-era STL:

std::rotate(first, new_first, last)
std::partial_sort(first, output_last, input_last)
std::nth_element(first, pos, last)
std::inplace_merge(first1, last1_also_first2, last2)

There’s no physical reason Stepanov couldn’t have written rotate(first, last, new_first) — the computer wouldn’t care! But humans find it easier to reason about things when their physical lexical order matches their expected semantic order.

We also see this idiom in Howard Hinnant’s combinations.h library:

for_each_permutation(first, mid, last, callback)

We also see it in Walter E. Brown’s preferred min and max semantics. Notice that min below is std::min, but max below is not the standard std::max:

template<class T> auto& min(const T& a, const T& b) { return b < a ? b : a; }
template<class T> auto& max(const T& a, const T& b) { return b < a ? a : b; }

Except for std::clamp

C++17 std::clamp muffed it.

std::clamp(value, lo, hi)

This is exactly the antipattern that led to Andrzej’s original example. The API is easy to misuse, and so the implementor has to resort to safety nets like preconditions and assertions. (Not that you should avoid writing safety nets, even in hard-to-misuse code! But writing hard-to-misuse APIs turns our safety nets from “must-haves” into “nice-to-haves.”)

René Rivera pointed out to me that OpenGL’s clamp function makes the same API mistake, and so C++’s std::clamp was probably designed to copy that familiar (if unfortunate) API.

I noticed something interesting in the OpenGL API docs:

The returned value is computed as min(max(x, minVal), maxVal).

Now, my preferred mnemonic for clamp-style functions is

int myClamp(int lo, int mid, int hi) {
    return min(max(lo, mid), hi);
}

That is, you write min and max in order; and then you just start listing out the arguments in order: lo then mid (then we run out of parameter slots for min and have to close the parentheses) then hi (and close those parentheses).

It was remarkable to me that OpenGL uses the same mnemonic! Since max is commutative, it doesn’t matter if you take your arguments in the order lo, mid, hi or mid, lo, hi.

int myClamp(int lo, int mid, int hi) {
    return min(max(lo, mid), hi);
}

int glClamp(int mid, int lo, int hi) {
    return min(max(mid, lo), hi);
}

But if you wanted to take “range first, value second,” then you’d have to find a different mnemonic.

int brokenClamp(int lo, int hi, int mid) {
    return min(max(lo, hi), mid); // oops!
}

Combinatorics

For ease of reading, let’s rename our parameters from lo, mid, hi to a, b, c. We have six possible parameter orderings, each with a more or less “mnemonic” implementation that uses the arguments in that specific order:

return min(max(a, b), c);  // my preference
return max(a, min(c, b));
return min(max(b, a), c);  // OpenGL, C++17
return max(min(b, c), a);
return min(c, max(a, b));
return max(min(c, b), a);

When b appears lexically in the middle — as it should! — then we have two implementation options; but I hope you’ll agree that the left-hand option below is “more mnemonic” than the right-hand option.

return min(max(a, b), c); <===> return max(a, min(b, c));
return max(min(c, b), a); <===> return min(c, max(b, a));

What’s the point of all this combinatorial stuff? None, really.

But what are the takeaways overall?

Do as std::rotate does.

The lexical ordering of a parameter list should match the expected or logical ordering of the parameters’ values.

int oneTrueClamp(int lo, int mid, int hi) {
    return min(max(lo, mid), hi);
}

bool oneTrueInRange(int lo, int mid, int hi) {
    return lo <= mid && mid <= hi;
}
Posted 2021-05-05