Covariance and contravariance in C++

Today I’d like to try to explain covariance and contravariance, and the many places we see that notion pop up in C++.

Covariance and contravariance, by example

Consider the following veterinary analogy. We have the notion of an animal. All animals can make noise. Cats are animals. Horses are animals that you can ride.

#include <cstdio>  // for puts

struct Animal {
    virtual void make_noise() = 0;
    virtual ~Animal() = default;
};

struct Cat : Animal {
    void make_noise() override { puts("meow"); }
};

struct Horse : Animal {
    void make_noise() override { puts("neigh"); }
    virtual void ride() { puts("okay sure"); }
};

Crucially, it is part of the static type system that you can’t ride cats. It’s not just that you try and it throws an exception, or something; cats literally don’t have a ride method. Okay?

Now we add a producer of animals; let’s say, an animal breeder. (I was going to say “a mama animal,” but then we’d have to decide whether to express that a MamaCat was both a MamaAnimal and a Cat, and that way lies madness.)

struct AnimalBreeder {
    virtual Animal *produce() = 0;
};

struct CatBreeder : AnimalBreeder {
    Animal *produce() override { return new Cat; }
};

struct HorseBreeder : AnimalBreeder {
    Animal *produce() override { return new Horse; }
};

Notice that a cat breeder IS-AN animal breeder. If you say to me, “Quick, I need the phone number of someone who can produce me an animal!” and I give you the number of a cat breeder, you will be happy. (That’s logic!)

Because a cat IS-AN animal, any breeder that produces cats by definition IS-A breeder that produces animals. We say that the relationship between Animal and Cat is covariant with the relationship between AnimalBreeder and CatBreeder.
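Here is a minimal sketch of what that IS-A relationship buys the customer, assuming C++17. The helper `order_one` and the string-returning `noise()` method are hypothetical names of mine (the real example prints with `puts`); they exist only to make the result easy to check.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Animal {
    // Returns the noise instead of printing it (an assumption of this sketch).
    virtual std::string noise() const = 0;
    virtual ~Animal() = default;
};

struct Cat : Animal {
    std::string noise() const override { return "meow"; }
};

struct AnimalBreeder {
    virtual Animal *produce() = 0;
    virtual ~AnimalBreeder() = default;
};

struct CatBreeder : AnimalBreeder {
    Animal *produce() override { return new Cat; }
};

// Hypothetical customer: knows only that it wants "someone who can
// produce me an animal", and is happy with whatever breeder it is given.
std::string order_one(AnimalBreeder& breeder) {
    std::unique_ptr<Animal> pet(breeder.produce());
    assert(pet != nullptr);
    return pet->noise();
}
```

Handing `order_one` a `CatBreeder` works without any cast, because the conversion from `CatBreeder&` to `AnimalBreeder&` is implicit.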


Notice that a CatBreeder only ever produces Cats, and a HorseBreeder only ever produces Horses. Conveniently, C++ permits us to encode that extra snippet of information into our source code.

struct AnimalBreeder {
    virtual Animal *produce() = 0;
};

struct CatBreeder : AnimalBreeder {
    Cat *produce() override { return new Cat; }
};

struct HorseBreeder : AnimalBreeder {
    Horse *produce() override { return new Horse; }
};

This is a feature known as “covariant return types.”
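As a rough illustration (assuming C++17 for the `_v` traits), we can ask the compiler what static type each interface promises. The helper `base_call_yields_cat` is a hypothetical name of mine, not part of the example above.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Animal { virtual ~Animal() = default; };
struct Cat : Animal {};

struct AnimalBreeder {
    virtual Animal *produce() = 0;
    virtual ~AnimalBreeder() = default;
};

struct CatBreeder : AnimalBreeder {
    Cat *produce() override { return new Cat; }  // covariant return type
};

// Through the derived interface, the static type is Cat*; no cast needed.
static_assert(std::is_same_v<
    decltype(std::declval<CatBreeder&>().produce()), Cat*>);

// Through the base interface, callers still see only Animal*.
static_assert(std::is_same_v<
    decltype(std::declval<AnimalBreeder&>().produce()), Animal*>);

// Hypothetical helper: calling through the base interface still
// dispatches to CatBreeder::produce, so the animal really is a Cat.
bool base_call_yields_cat() {
    CatBreeder breeder;
    AnimalBreeder& as_base = breeder;
    Animal *a = as_base.produce();
    assert(a != nullptr);
    bool is_cat = dynamic_cast<Cat*>(a) != nullptr;
    delete a;
    return is_cat;
}
```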


Now consider what happens if you already have a Cat, and you come to me and say, “Quick, I need the phone number of someone who can treat my sick cat!” I’d try to find you an animal doctor.

struct DoctorDolittle {
    virtual void treat(Animal *);
};

struct DoctorHackenbush {
    virtual void treat(Horse *);
};

struct DoctorKeat {
    virtual void treat(Cat *);
};

(Footnote: My lovely wife, the namesake of DoctorKeat, would also have a treat(Dog *) method at least. But for the purposes of this example, we’ll stick with cats.)

According to the static type system, Dr. Keat treats only cats, and Dr. Hackenbush treats only horses. Whereas, of course, Dr. Dolittle can treat anything!

What does this mean for our class hierarchy? Well, consider Horse. A horse can do anything a generic animal can do, plus more (namely, be ridden), so we made Horse inherit from Animal. DoctorDolittle can do anything DoctorHackenbush can do, plus more, so logically DoctorDolittle should inherit from DoctorHackenbush.

Taken literally, this would lead to a bit of a mess, because DoctorDolittle would have to multiply inherit from DoctorHackenbush and DoctorKeat and a bunch more animal-doctor classes that we don’t even know about yet. Which is crazy. So let’s back off and just consider Dolittle and Hackenbush in isolation.

struct HorseDoctor {
    virtual void treat(Horse *);
};

struct AnimalDoctor : HorseDoctor {
    void treat(Animal *) override;  // hmm...
};

This hierarchy is logically correct. If you come to me and say, “I need the number of someone to treat my sick horse!” then you will be happy with any HorseDoctor; but if you need a horse doctor to treat your sick cat, you’ll need a more capable horse doctor. If it helps, consider that an AnimalDoctor like Dr. Dolittle is also a CatAndHorseDoctor, which sounds more like a proper subclass of HorseDoctor.

We say that the relationship between Animal and Horse is contravariant with the relationship between AnimalDoctor and HorseDoctor: an AnimalDoctor IS-A HorseDoctor precisely because a Horse IS-AN Animal.

Perhaps inconveniently, C++ does not permit us to write the function marked hmm... above. C++’s classical OOP system supports “covariant return types,” but it does not support “contravariant parameter types.”
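To see what the compiler does instead, here is a sketch in which the `override` keyword is dropped so that the code compiles. The string return values and the two helper functions are hypothetical additions of mine, there only to show which function ends up being called.

```cpp
#include <cassert>
#include <string>

struct Animal { virtual ~Animal() = default; };
struct Horse : Animal {};

struct HorseDoctor {
    virtual std::string treat(Horse *h) {
        assert(h != nullptr);
        return "HorseDoctor::treat(Horse*)";
    }
    virtual ~HorseDoctor() = default;
};

struct AnimalDoctor : HorseDoctor {
    // Writing `override` here would be a compile error, because the
    // parameter type differs. Without it, this declares a new, unrelated
    // virtual function that merely hides HorseDoctor::treat.
    virtual std::string treat(Animal *a) {
        assert(a != nullptr);
        return "AnimalDoctor::treat(Animal*)";
    }
};

// Called through the base interface: the base function is NOT overridden.
std::string treat_via_base(HorseDoctor& doc, Horse& h) { return doc.treat(&h); }

// Called through the derived interface: name lookup finds only the new function.
std::string treat_directly(AnimalDoctor& doc, Horse& h) { return doc.treat(&h); }
```

So the same doctor treats the same horse in two different ways depending on who made the phone call, which is exactly what a contravariant override would have prevented.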

This concludes our explanation of covariance and contravariance in the classical OOP system. Now let’s look at other places the notion shows up in C++.

In std::function

(Thanks to Michał Dominiak for giving this example.)

The C++ core language may not support contravariant parameter types, but the standard library does, in the form of std::function and its many implicit conversions. Example:

#include <functional>

struct Animal {};
struct Horse : Animal {};

using AnimalDoctor = std::function<void(Animal*)>;
using HorseDoctor = std::function<void(Horse*)>;

auto hackenbush = [](Horse *) {};
auto dolittle = [](Animal *) {};

HorseDoctor a = hackenbush;  // OK
HorseDoctor b = dolittle;  // OK
AnimalDoctor c = hackenbush;  // ERROR
AnimalDoctor d = dolittle;  // OK

We see that dolittle is an acceptable AnimalDoctor and thus also a HorseDoctor; whereas hackenbush is nothing more than a HorseDoctor.
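The same four facts can be spelled out as compile-time checks. This is a sketch assuming C++17 and a standard library whose `std::function` converting constructor is constrained (as the standard has required since C++14); `treat_one_horse` is a hypothetical helper of mine.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

struct Animal {};
struct Horse : Animal {};

using AnimalDoctor = std::function<void(Animal*)>;
using HorseDoctor = std::function<void(Horse*)>;

auto hackenbush = [](Horse *) {};
auto dolittle = [](Animal *) {};

static_assert( std::is_constructible_v<HorseDoctor, decltype(hackenbush)>);
static_assert( std::is_constructible_v<HorseDoctor, decltype(dolittle)>);
static_assert(!std::is_constructible_v<AnimalDoctor, decltype(hackenbush)>);
static_assert( std::is_constructible_v<AnimalDoctor, decltype(dolittle)>);

// The std::function types themselves convert in the contravariant direction.
static_assert( std::is_convertible_v<AnimalDoctor, HorseDoctor>);
static_assert(!std::is_convertible_v<HorseDoctor, AnimalDoctor>);

// Hypothetical runtime check: a callable taking Animal* can be stored
// in a HorseDoctor and called with a Horse*.
int treat_one_horse() {
    int treated = 0;
    HorseDoctor doc = [&treated](Animal *) { ++treated; };
    assert(doc);  // the std::function holds a target
    Horse h;
    doc(&h);
    return treated;
}
```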

Similarly, std::function supports covariant return types. Example:

using AnimalBreeder = std::function<Animal*(void)>;
using HorseBreeder = std::function<Horse*(void)>;

auto ben_ishak = []() -> Horse * { return new Horse; };
auto moreau = []() -> Animal * { return new Animal; /* some Animal */ };

HorseBreeder e = ben_ishak;  // OK
HorseBreeder f = moreau;  // ERROR
AnimalBreeder g = ben_ishak;  // OK
AnimalBreeder h = moreau;  // OK

If you say to me, “Quick, I need the phone number of someone who can produce me an animal!” and I give you the number of Dr. Moreau, you will be happy. (That’s logic!) But if you need specifically a horse, I should send you to a more capable breeder, who is not only able to produce you an animal but also to give you compile-time assurance that it is actually a horse. I’d better not send you to Moreau. With Moreau, you never know what you’re going to get.
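Again we can have the compiler confirm who is allowed to answer which phone call. This is a sketch assuming C++17; the globals `the_horse` and `the_animal` and the helper `animal_customer_gets_the_horse` are hypothetical, used in place of `new` so that nothing leaks.

```cpp
#include <cassert>
#include <functional>
#include <type_traits>

struct Animal {};
struct Horse : Animal {};

using AnimalBreeder = std::function<Animal*()>;
using HorseBreeder = std::function<Horse*()>;

Horse the_horse;    // stand-ins for freshly bred animals (assumption)
Animal the_animal;

auto ben_ishak = []() -> Horse * { return &the_horse; };
auto moreau = []() -> Animal * { return &the_animal; };

static_assert( std::is_constructible_v<HorseBreeder, decltype(ben_ishak)>);
static_assert(!std::is_constructible_v<HorseBreeder, decltype(moreau)>);
static_assert( std::is_constructible_v<AnimalBreeder, decltype(ben_ishak)>);
static_assert( std::is_constructible_v<AnimalBreeder, decltype(moreau)>);

// A customer who only asked for an Animal can be sent to Ben Ishak,
// and receives the very horse he produced.
bool animal_customer_gets_the_horse() {
    AnimalBreeder breeder = ben_ishak;
    assert(breeder);  // the std::function holds a target
    Animal *a = breeder();
    return a == &the_horse;
}
```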

In const-correctness

Covariance and contravariance are fully supported in both C and C++ for implicit conversions involving the const qualifier. Here’s covariance.

(Remember that in C, when we want to produce something, we use an out-parameter — we “pass by pointer.”)

using Animal = int const *;
using Horse = int *;

const int tiger = 42;
int stallion = 42;

void moreau(Animal *r) { *r = &tiger; }
void ben_ishak(Horse *r) { *r = &stallion; }

This snippet is logically correct! A horse is a more capable animal. In our classical OOP example, a horse was an animal you could ride(). In this example, a horse is an animal whose target you can ++. (That is, a mutable int is a more capable const int. Rust got it right!)

Now you come to me and say, “Quick, I need the phone number of someone who can produce me a horse! It’s a gift for my daughter.”

Horse giftbox;  // to be filled in by the producer
ben_ishak(&giftbox);  // OK
moreau(&giftbox);  // ERROR

I can send your empty gift box over to Ben Ishak, whom I trust to fill it with a horse. I can’t send your box to Moreau; he’ll probably fill it with some other kind of animal. With Moreau, you never know what you’re going to get.

(Incidentally, it is super important that the compiler does enforce this rule for us. If moreau(&giftbox) were permitted to compile, then on Christmas morning your daughter would probably try to ++*giftbox and end up riding that tiger by mistake.)
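The rule being enforced can be stated with type traits. This is a sketch assuming C++17; `open_giftbox` is a hypothetical helper of mine.

```cpp
#include <cassert>
#include <type_traits>

using Animal = int const *;
using Horse = int *;

const int tiger = 42;
int stallion = 42;

void moreau(Animal *r) { *r = &tiger; }
void ben_ishak(Horse *r) { *r = &stallion; }

// The address of a Horse box (int**) does not implicitly convert to
// the address of an Animal box (int const**)...
static_assert(!std::is_convertible_v<Horse*, Animal*>);

// ...so Moreau cannot be handed a Horse giftbox, while Ben Ishak can.
static_assert(!std::is_invocable_v<decltype(&moreau), Horse*>);
static_assert( std::is_invocable_v<decltype(&ben_ishak), Horse*>);

// Christmas morning: whatever Ben Ishak put in the box is safe to ++.
int open_giftbox() {
    Horse giftbox = nullptr;
    ben_ishak(&giftbox);
    assert(giftbox != nullptr);
    ++*giftbox;
    return *giftbox;
}
```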

That’s covariance: the relationship between int const * and int * is covariant with the relationship between void(int const **) and void(int **). Now for contravariance:

using Animal = int const *;
using Horse = int *;

void dolittle(Animal);
void hackenbush(Horse);

If you come to me asking for someone to treat your cat (an Animal that is not a Horse), I can send you to Dolittle but not to Hackenbush:

Animal your_cat = nullptr;  // initialized, so passing it by value is well-defined
dolittle(your_cat);    // OK
hackenbush(your_cat);  // ERROR

That’s contravariance: the relationship between int const * and int * is contravariant with the relationship between void(int const *) and void(int *).
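Here is the same relationship as a set of compile-time checks, assuming C++17. In this sketch I have given the two doctors `int` return values and bodies, which the declarations above do not have, purely so there is something to observe at runtime; `treat_both` is likewise a hypothetical helper.

```cpp
#include <cassert>
#include <type_traits>

using Animal = int const *;
using Horse = int *;

// Dolittle only needs to read the patient.
int dolittle(Animal a) {
    assert(a != nullptr);
    return *a;
}

// Hackenbush needs a mutable patient.
int hackenbush(Horse h) {
    assert(h != nullptr);
    return ++*h;
}

static_assert( std::is_invocable_v<decltype(&dolittle), Animal>);
static_assert( std::is_invocable_v<decltype(&dolittle), Horse>);
static_assert(!std::is_invocable_v<decltype(&hackenbush), Animal>);
static_assert( std::is_invocable_v<decltype(&hackenbush), Horse>);

int treat_both() {
    const int cat = 7;
    int horse = 1;
    int total = dolittle(&cat);    // reads 7
    total += dolittle(&horse);     // reads 1
    total += hackenbush(&horse);   // horse becomes 2
    return total;
}
```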

In non-type template parameters (C++17)

In C++17, non-type template parameters (NTTPs) demonstrate contravariance:

#define Animal auto
#define Horse int
#define Cat int*

template<Animal> struct dolittle {};
template<Horse> struct hackenbush {};

template<template<Cat> class>
struct you {};

Now you come and say to me, “Quick, I need someone who can treat my sick cat!” I can send you to Dr. Dolittle, but not to Dr. Hackenbush:

template struct you<dolittle>;    // OK
template struct you<hackenbush>;  // ERROR (MSVC accepts it anyway)

(GCC, Clang, and ICC agree on this point. MSVC believes it’d be fine to send you to Hackenbush.)

For the covariant case, we can use our “gift box” metaphor:

static int tiger = 0;

template<template<Horse> class Giftbox>
struct ben_ishak { Giftbox<1> stallion; };

template<template<Animal> class Giftbox>
struct moreau { Giftbox<&tiger> surprise; };

template<Horse> struct your_giftbox {};
template struct ben_ishak<your_giftbox>;
template struct moreau<your_giftbox>;

In this case, not a single compiler detects the problem with moreau<your_giftbox>: everyone freely instantiates moreau and lets Moreau place a &tiger inside the Giftbox, which results in a hard error.

In non-type template parameters with alias templates

By the way, we can replace dolittle, hackenbush, and your_giftbox with alias templates; it doesn’t change any compiler’s behavior. (MSVC still incorrectly accepts you<hackenbush>; everyone still incorrectly accepts moreau<your_giftbox>.)

template<Animal> using dolittle = int;
template<Horse> using hackenbush = int;
template<Horse> using your_giftbox = int;

In this case, every compiler does give a hard error at the point of instantiation of Giftbox<&tiger>, which is refreshing — but nobody detects that your_giftbox should never have been given to moreau in the first place.

In constrained template type parameters (C++2a)

The C++2a Working Draft permits template type parameters to be constrained, and even gives the programmer a shorthand syntax which (unfortunately? fortunately?) makes the following example look almost identical to the NTTP example above.

template<class T> concept Animal = true;
template<class T> concept Horse = Animal<T> && sizeof(T)==4;
template<class T> concept Cat = Animal<T> && sizeof(T)==2;

template<Animal> struct dolittle {};
template<Horse> struct hackenbush {};

template<template<Cat> class>
struct you {};

Now you come and say to me, “Quick, I need someone who can treat my sick cat!” I can send you to Dr. Dolittle, but not to Dr. Hackenbush:

template struct you<dolittle>;    // OK
template struct you<hackenbush>;  // ERROR

(GCC and Clang agree on this point.)

For the covariant case, we can use our “gift box” metaphor:

template<template<Horse> class Giftbox>
struct ben_ishak { Giftbox<char[4]> stallion; };

template<template<Animal> class Giftbox>
struct moreau { Giftbox<char[42]> tiger; };

template<Horse> struct your_giftbox {};
template struct ben_ishak<your_giftbox>;  // OK
template struct moreau<your_giftbox>;     // ERROR

GCC and Clang agree that ben_ishak<your_giftbox> is acceptable and moreau<your_giftbox> is unacceptable.

In constrained template type parameters with alias templates

By the way, we can replace dolittle, hackenbush, and your_giftbox with alias templates; it doesn’t change any compiler’s behavior.

template<Animal> using dolittle = int;
template<Horse> using hackenbush = int;
template<Horse> using your_giftbox = int;

In variadic template parameters

A variadic parameter list (that could have any number of parameters) is like an Animal; a parameter list constrained to take only two parameters is like a Horse. (A horse IS-AN animal, but not all animals are horses.)

#define Animal class...
#define Horse class, class
#define Cat class, class, class

template<Animal> struct dolittle {};
template<Horse> struct hackenbush {};

template<template<Cat> class>
struct you {};

Now you come and say to me, “Quick, I need someone who can treat my sick cat!” I should be able to send you to Dr. Dolittle, but not to Dr. Hackenbush:

template struct you<dolittle>;    // should be OK (Clang rejects it)
template struct you<hackenbush>;  // ERROR

(Here, Clang falters by incorrectly rejecting you<dolittle>. GCC, ICC and MSVC accept it. Everybody correctly rejects you<hackenbush>.)

For the covariant case, we can use our “gift box” metaphor:

template<template<Horse> class Giftbox>
struct ben_ishak { Giftbox<int, int> stallion; };

template<template<Animal> class Giftbox>
struct moreau { Giftbox<int, int, int> tiger; };

template<Horse> struct your_giftbox {};
template struct ben_ishak<your_giftbox>;
template struct moreau<your_giftbox>;

In this case, not a single compiler detects the problem with moreau<your_giftbox>: everyone freely gives your_giftbox to moreau and lets Moreau try to place a tiger inside it, which results in a hard error.

In variadic template parameters with alias templates

You might think this shouldn’t be different (and you’d be right that it shouldn’t), but in fact we do see a change in MSVC’s behavior…

template<Animal> using dolittle = int;
template<Horse> using hackenbush = int;
template<Horse> using your_giftbox = int;

With hackenbush expressed as an alias template instead of a class template, MSVC incorrectly accepts you<hackenbush>! Clang continues to incorrectly reject you<dolittle>. And nobody notices Moreau taking the giftbox until it’s too late.

More?

What are some more places that covariance and contravariance show up in C++? Think of other ways to smuggle “producers” (like moreau and ben_ishak) and “consumers” (like hackenbush and dolittle) into the language… and if you think of a good one, tell me about it!

The “implementation variance” on some of these variadic-template examples is the subject of CWG1430.


For a sequel to this post, see “P1616R0 and health insurance” (2019-07-03).

Posted 2019-01-20