Covariance and contravariance have had a tweak in C# 4.0. But before we can start using the benefits in our code, we need to know exactly what they are! The subject can be a little bit tricky when you first encounter it, so in this post I'm going to introduce it in a way that I hope is easy to follow.
Almost all of the information in this post comes from Eric Lippert's excellent series of posts, and I recommend you go and take a look at his blog right now (links to the whole series are at the bottom of this post.)
Tigers and a giraffe (credit: andrewmalone)
If you 'get it' straight away, great - stick with that, he's the expert. If you'd like a slightly gentler introduction, avoiding (I hope!) common 'gotchas' for learning the subject, read my post (this post) first.
What I have done is to explain the same concepts using the same examples, but in an order and with an emphasis which I think make it much easier to understand the basics. This is particularly good for you if you are a programmer coming at it from the cold, i.e. you haven't encountered covariance and contravariance before, or you have but you don't understand them yet.
When you're done here I'd suggest you go back and read Eric's posts in order - they should be much easier for you to read by then. Eric's posts will flesh out all the interesting details, and continue on to discuss more advanced topics.
Inheritance and assignability
We're not going to begin talking about covariance and contravariance straight away. First, we're going to make a distinction between inheritance and assignability.
As Eric points out, for any two types T and U, exactly one of the following statements is true:
- T is bigger than U
- T is smaller than U
- T is equal to U
- T is not related to U
Now, one of the things that seems to have caused some confusion on Eric's blog (see the comments) is usage of the phrase "is smaller than". It is used frequently, and is key, so I want to make its definition crystal clear now before we move on. Eric says:
"Suppose you have a variable, that is, a storage location. Storage locations in C# all have a type associated with them. At runtime you can store an object which is an instance of an equal or smaller type in that storage location."
In simple scenarios, this is something so familiar to programmers that it's barely worth mentioning. We all know that, looking at the list above, only in the middle two scenarios is T assignable to U. The smaller than relation:
    class U {}
    class T : U {}
    U myInstance = new T();
This is the first thing that comes to mind, right? An inheritance hierarchy.
But Eric didn't mention inheritance hierarchies. Sure, an inheritance hierarchy is one way to make a T which is assignable to a U, but what about this one:
U[] myArray = new T[10];
... or, the same statement using classes from the animals hierarchy:
Animal[] animals = new Giraffe[10];
The type Giraffe inherits from Animal, but the type Giraffe[] doesn't inherit from Animal[]. The array types are assignable, but not linked by inheritance, and this tells us something about what 'is smaller than' means:
T is smaller than U
can be read as
T is assignable to U
You can visualise it this way:
    T is smaller than U
    T < U
    -->   //Direction of assignability
As we have seen, in some cases this direction of assignability may be because of an inheritance relationship, but in others it is simply because the runtime and the language (the CLR and C#, or equally Java) happen to support that particular assignment operation.
There is still an inheritance hierarchy involved, i.e. this wouldn't work:
Tiger[] tigers = new Giraffe[10]; //illegal
But the key thing is that there is a difference between inheritance and assignability: they are not the same thing.
I'll say it one more time (for good luck!): The phrase "is smaller than" refers to assignability, not inheritance. The direction of assignability always flows from the smaller type to the larger type. We'll come back to this in a moment.
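To see the difference between inheritance and assignability for yourself, here is a minimal sketch using reflection. The Animal and Giraffe classes are empty stand-ins that I have assumed for illustration, and IsSmallerOrEqual is a hypothetical helper name of my own, not something from Eric's posts.

```csharp
using System;

// Hypothetical stand-ins for the animal hierarchy used in this post.
class Animal { }
class Giraffe : Animal { }

static class AssignabilityDemo
{
    // True when a value of type 'from' can be stored in a location of type 'to'.
    public static bool IsSmallerOrEqual(Type from, Type to)
    {
        return to.IsAssignableFrom(from);
    }

    static void Main()
    {
        Animal[] animals = new Giraffe[10]; // legal: array assignability

        // Giraffe[] is assignable to Animal[]...
        Console.WriteLine(IsSmallerOrEqual(typeof(Giraffe[]), typeof(Animal[])));

        // ...but it does not inherit from Animal[]; its base type is System.Array.
        Console.WriteLine(typeof(Giraffe[]).BaseType);

        // The variable is typed Animal[], yet the object is still a Giraffe[].
        Console.WriteLine(animals.GetType());
    }
}
```

The point to take away is that the first check succeeds even though Animal[] appears nowhere in the inheritance chain of Giraffe[].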
Covariance and Contravariance
Eric's second post discusses the array assignment operation (the one I used in the Animal/Giraffe example above), and the problems with it. It's definitely worth reading, but park it for now, because things really come alive in post number three.
Eric's example uses delegate methods, and I'll use a simplified version of it here, just to get us started.
It is clear why this is a legal operation:
    static Giraffe MakeGiraffe()
    {
        return new Giraffe();
    }

    //Inside some method:
    Func<Animal> func = MakeGiraffe;
    // <-- Direction of assignment
Notice that in the assignment operation, Animal is on the left and Giraffe is on the right. That is, the declared type is based on Animal and the assigned type is based on Giraffe.
Now let's look at another example:
    static void AcceptAnimal(Animal animal)
    {
        //operate on animal
    }

    //Inside some method:
    Action<Giraffe> action = AcceptAnimal;
    // <-- Direction of assignment
Notice that Giraffe is on the left and Animal is on the right. That is, the declared type is based on Giraffe and the assigned type is based on Animal.
The Func<out T> assignment operation supports covariance. The Action<in T> assignment operation supports contravariance.
What does that mean?
Have a quick look at this summary:
(remember to read < as 'is smaller than' and 'is assignable to')
    //Direction of assignability -->

    Giraffe < Animal
    Giraffe MakeGiraffe() < Func<Animal>           //covariance
    AcceptAnimal(Animal animal) < Action<Giraffe>  //contravariance
Now read Eric's definition of covariance and contravariance, from the first post in his series:
(the "operation" which manipulates types being, in our case, each of the two assignment operations)
Consider an "operation" which manipulates types. If the results of the operation applied to any T and U always results in two types T' and U' with the same relationship as T and U, then the operation is said to be "covariant". If the operation reverses bigness and smallness on its results but keeps equality and unrelatedness the same then the operation is said to be "contravariant".
Hopefully it should start to become clear. In the Func<Animal> line of the summary above, the direction of assignability with respect to the original types was preserved, while in the Action<Giraffe> line it was reversed!
The Func<Animal> line represents a covariant operation, and the Action<Giraffe> line represents a contravariant operation.
The main heuristic
Let's put it back to C# code so that we can see it with the right-to-left assignability we are used to (now the smaller types are on the right):
    Animal animal = new Giraffe();          //basic type assignment
    Func<Animal> func = MakeGiraffe;        //covariant
    Action<Giraffe> action = AcceptAnimal;  //contravariant
    // <-- Direction of assignability
Notice how in the covariant operation, Animal and Giraffe are on the same sides as in the basic type assignment operation. And notice how in the contravariant operation, they are on opposite sides - the operation "reverses bigness and smallness".
In both cases, the opposites are illegal. As Eric puts it in post number five:
"Stuff going 'in' may be contravariant,
stuff going 'out' may be covariant"
... but not vice-versa:
    Func<Giraffe> func = MakeAnimal;        //contravariant (illegal)
    Action<Animal> action = AcceptGiraffe;  //covariant (illegal)
    // <-- Direction of assignability
And by the way, if there's one heuristic you remember as a result of reading this post, it's probably best to make it the one above!
I'll repeat it later in this article.
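If you want to try the method group examples yourself, here is a small self-contained sketch. The class bodies are empty stand-ins I have assumed for illustration, and MakeAnimal and AcceptGiraffe are filled in with the obvious hypothetical bodies since the post only names them.

```csharp
using System;

// Hypothetical stand-ins for the animal hierarchy used in this post.
class Animal { }
class Giraffe : Animal { }

static class MethodGroupVariance
{
    public static Giraffe MakeGiraffe() { return new Giraffe(); }
    public static Animal MakeAnimal() { return new Animal(); }
    public static void AcceptAnimal(Animal animal) { /* operate on animal */ }
    public static void AcceptGiraffe(Giraffe giraffe) { /* operate on giraffe */ }

    static void Main()
    {
        Func<Animal> func = MakeGiraffe;        // covariant: stuff going out
        Action<Giraffe> action = AcceptAnimal;  // contravariant: stuff going in

        Animal made = func();          // really a Giraffe, typed as Animal
        action(new Giraffe());         // AcceptAnimal is happy with any Animal
        Console.WriteLine(made is Giraffe);

        // Uncommenting either line below should produce a compile error:
        // Func<Giraffe> bad1 = MakeAnimal;       // might hand back a non-Giraffe
        // Action<Animal> bad2 = AcceptGiraffe;   // might be handed a non-Giraffe
    }
}
```

The two commented-out lines are the "vice-versa" cases: each would let a plain Animal turn up where the code has been promised a Giraffe.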
Hang on, methods aren't types!
A quick aside - at this stage you might be asking why I'm referring to methods as though they were types. The straight answer is, I'm copying Eric. His caveat:
"A note to nitpickers out there: yes, I said earlier that variance was a property of operations on types, and here I have an operation on method groups, which are typeless expressions in C#. I’m writing a blog, not a dissertation; deal with it!"
Can't argue with that.
What's new in C# 4.0?
Well, 'new' is the wrong word since the stable release of C# 4.0 was two years ago! But all of the types of variance we've looked at so far in this post have been supported since C#2 or before.
We as developers didn't really have to think about those types of variance to use them, because they weren't exposed syntactically. In other words, we didn't have to write anything different to make them happen; it's just a matter of what is and isn't supported by the C# compiler and the CLR.
In post numbers four and six, Eric discusses types of variance which went on to become part of the specification for C# 4.0, and it's those types of variance that I'll discuss now.
Real delegate variance
The first one is easy, and it's discussed in post number four. It's simply about taking the operations which were already legal in terms of method groups and making the same operations legal in terms of typed expressions.
Take our covariant example from earlier:
    static Giraffe MakeGiraffe()
    {
        return new Giraffe();
    }

    //Inside some method:
    Func<Animal> func = MakeGiraffe;
    // <-- Direction of assignment
Well, in C#3 this essentially equivalent operation was illegal, whereas in C#4 it is legal:
Func<Animal> func = new Func<Giraffe>(() => new Giraffe()); // <-- Direction of assignment
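In fact, because of lambda syntax and inferred typing, you would normally just write the line below. (Strictly speaking this short form converts the lambda straight to Func<Animal>, so it compiles in C#3 as well; it is the explicit Func<Giraffe> version above that needs C#4.)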
Func<Animal> func = () => new Giraffe(); // <-- Direction of assignment
You can now do with typed expressions what you could already do with method groups. Simple.
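Here is a minimal sketch of delegate variance on typed expressions, assuming C# 4 or later. AsAnimalMaker and AsGiraffeHandler are hypothetical helper names of my own; all they do is return their argument under a different delegate type.

```csharp
using System;

// Hypothetical stand-ins for the animal hierarchy used in this post.
class Animal { }
class Giraffe : Animal { }

static class DelegateVariance
{
    // Covariance: a Func<Giraffe> can be used wherever a Func<Animal> is expected.
    public static Func<Animal> AsAnimalMaker(Func<Giraffe> maker)
    {
        return maker;
    }

    // Contravariance: an Action<Animal> can be used wherever an Action<Giraffe> is expected.
    public static Action<Giraffe> AsGiraffeHandler(Action<Animal> handler)
    {
        return handler;
    }

    static void Main()
    {
        Func<Giraffe> makeGiraffe = () => new Giraffe();
        Func<Animal> makeAnimal = AsAnimalMaker(makeGiraffe);

        Action<Giraffe> handleGiraffe =
            AsGiraffeHandler(a => Console.WriteLine(a.GetType().Name));
        handleGiraffe(new Giraffe());

        // No new delegate is created: it is the same object seen through a wider type.
        Console.WriteLine(object.ReferenceEquals(makeGiraffe, makeAnimal));
    }
}
```

Neither helper compiles under C#3, because there the delegate types on each side of the return are simply unrelated.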
But here's where it makes sense to quickly explain something I breezed over earlier.
Covariance and Contravariance, at once
Take a look again at the heuristic:
"Stuff going 'in' may be contravariant,
stuff going 'out' may be covariant"
So what happens when you are dealing with a type which has both an 'in' and an 'out'?
The short answer is: it can be covariant, contravariant, both, or neither. But it's easier than that makes it sound!
Take a look at this example. It's a Func that accepts a Mammal and returns a Mammal:
Func<Mammal, Mammal> func;
Now here are some assignment operations:
- This is a covariant operation:
    Func<Mammal, Giraffe> toAssign = //somehow initialise;
    Func<Mammal, Mammal> func = toAssign;
- This is a contravariant operation:
    Func<Animal, Mammal> toAssign = //somehow initialise;
    Func<Mammal, Mammal> func = toAssign;
- This is both:
    Func<Animal, Giraffe> toAssign = //somehow initialise;
    Func<Mammal, Mammal> func = toAssign;
... and, well, I'm sure I don't need to spell out the neither!
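A runnable version of the "both" case might look like the sketch below. I am assuming a three-level hierarchy (Giraffe : Mammal : Animal), which the examples above imply but never spell out, and Adopt is a hypothetical method that exists only to give the delegate something to point at.

```csharp
using System;

// Assumed hierarchy: Giraffe is a Mammal, Mammal is an Animal.
class Animal { }
class Mammal : Animal { }
class Giraffe : Mammal { }

static class BothAtOnce
{
    // Hypothetical method: takes the widest input, returns the narrowest output.
    public static Giraffe Adopt(Animal anything)
    {
        return new Giraffe();
    }

    static void Main()
    {
        Func<Animal, Giraffe> toAssign = Adopt;

        // Input: Animal is larger than Mammal  (contravariant, allowed for 'in').
        // Output: Giraffe is smaller than Mammal (covariant, allowed for 'out').
        Func<Mammal, Mammal> func = toAssign;

        Mammal result = func(new Mammal());
        Console.WriteLine(result.GetType().Name);

        // Getting both directions wrong should not compile:
        // Func<Giraffe, Animal> wrong = null;
        // Func<Mammal, Mammal> bad = wrong;
    }
}
```

Each type parameter is checked on its own, which is why one assignment can be covariant in the result and contravariant in the argument at the same time.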
Interface variance
The other new feature, as discussed in post number six, is the extension of variance to interfaces. There's not much to add here - it's just the same thing, but using interfaces. Eric gives a really nice example of the practical benefit here, and I'm going to repeat it almost verbatim.
Take a look at this code block. This is another example of something which is illegal in C#3, and legal in C#4:
    void FeedAnimals(IEnumerable<Animal> animals)
    {
        foreach (Animal animal in animals)
            if (animal.Hungry)
                Feed(animal);
    }

    //...

    IEnumerable<Giraffe> adultGiraffes = from g in giraffes where g.Age > 5 select g;
    FeedAnimals(adultGiraffes);
Just as earlier on, when we call FeedAnimals(IEnumerable<Animal> animals) we are assigning a 'smaller' type to a 'larger' type:
    //Direction of assignability -->

    Giraffe < Animal
    Giraffe MakeGiraffe() < Func<Animal>           //covariance
    IEnumerable<Giraffe> < IEnumerable<Animal>     //covariance
Of course, anywhere else that you reference that assigned-to variable (IEnumerable<Animal>), what comes out will be typed as Animal. All pretty uncontroversial.
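For anyone who wants to run the FeedAnimals example, here is a self-contained sketch, assuming C# 4 or later. The Hungry and Age members and the body of Feed are my own guesses at the simplest thing that would make Eric's snippet compile; they are not taken from his post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical members, assumed so that the example compiles.
class Animal
{
    public bool Hungry = true;
    public int Age;
}
class Giraffe : Animal { }

static class Zoo
{
    static void Feed(Animal animal)
    {
        animal.Hungry = false;
    }

    public static void FeedAnimals(IEnumerable<Animal> animals)
    {
        foreach (Animal animal in animals)
            if (animal.Hungry)
                Feed(animal);
    }

    static void Main()
    {
        var giraffes = new List<Giraffe>
        {
            new Giraffe { Age = 3 },
            new Giraffe { Age = 7 },
            new Giraffe { Age = 9 }
        };

        IEnumerable<Giraffe> adultGiraffes =
            from g in giraffes where g.Age > 5 select g;

        // IEnumerable<Giraffe> passed where IEnumerable<Animal> is expected.
        FeedAnimals(adultGiraffes);

        // Only the young giraffe is still hungry.
        Console.WriteLine(giraffes.Count(g => g.Hungry));
    }
}
```

Nothing is copied or cast on the way in: FeedAnimals walks the original sequence of giraffes, just through a wider static type.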
In and out
But finally, let's look at the in and out keywords, and how they fit in when designing your own interfaces (or using the upgraded C#4 ones.) Recall one more time the heuristic:
"Stuff going 'in' may be contravariant,
stuff going 'out' may be covariant"
In C# 4.0, IEnumerable<T> has become IEnumerable<out T>. The out marks the IEnumerable as supporting covariance on the T. This means that, as in the example above, you can assign based on something smaller than T.
But it also means that the interface cannot accept the type T as an input! It will only allow the interface to send T out, in whatever fashion you like - but it will never accept a T in. If you try it, the compiler won't allow it. Hence, the name: out.
Reading through this code block should make it clear why:
    //the compiler won't allow this, but go with it to see why:
    interface ICustom<out T>
    {
        T GetFirst();      //ok
        void Insert(T t);  //compiler complains
    }

    //..

    ICustom<Giraffe> giraffes = //somehow init;
    ICustom<Animal> animals = giraffes;
    Animal animal = animals.GetFirst();  //ok
    animals.Insert(new Tiger());         //problem -
                                         //backing store is Giraffe
Think of it this way - an IEnumerable<Animal> variable can have an IEnumerable<Giraffe> assigned to it and it will churn out Giraffes typed as Animals all day long. Because of how it's declared, users of the IEnumerable<Animal> variable expect to be dealing with Animals.
But a Tiger is also an animal. What would happen if there were a method on the interface that allowed a user to put an Animal in?
The user could put a Tiger in instead, and the backing store - IEnumerable<Giraffe> - wouldn't be able to cope.
The same in reverse
Now here's a similarly invalid code block, this time using the in keyword:
    //the compiler won't allow this, but go with it to see why:
    interface ICustom<in T>
    {
        T GetFirst();      //compiler complains
        void Insert(T t);  //ok
    }

    //..

    ICustom<Animal> animals = //somehow init;
    ICustom<Giraffe> giraffes = animals;
    giraffes.Insert(new Giraffe());         //ok
    Giraffe giraffe = giraffes.GetFirst();  //problem
                                            //backing store is Animal
So when a type parameter is marked as out, it's out only. And when a type parameter is marked as in, it's in only too! A single type parameter can't be both in and out.
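One way to keep the compiler happy is to split the two directions into separate interfaces. The sketch below is my own illustration rather than anything from Eric's posts: IProducer, IConsumer and Store are hypothetical names, and Store is an ordinary (invariant) class that happens to implement both.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the animal hierarchy used in this post.
class Animal { }
class Giraffe : Animal { }
class Tiger : Animal { }

interface IProducer<out T> { T GetFirst(); }      // T only ever comes out
interface IConsumer<in T> { void Insert(T t); }   // T only ever goes in

// An ordinary class may do both, because the class itself is not variant.
class Store<T> : IProducer<T>, IConsumer<T>
{
    private readonly List<T> items = new List<T>();
    public int Count { get { return items.Count; } }
    public void Insert(T t) { items.Add(t); }
    public T GetFirst() { return items[0]; }
}

static class InAndOut
{
    static void Main()
    {
        var giraffeStore = new Store<Giraffe>();
        giraffeStore.Insert(new Giraffe());

        IProducer<Animal> animals = giraffeStore;      // covariant view: read only
        Animal first = animals.GetFirst();

        var animalStore = new Store<Animal>();
        IConsumer<Giraffe> giraffeSink = animalStore;  // contravariant view: write only
        giraffeSink.Insert(new Giraffe());

        Console.WriteLine(first is Giraffe);
        Console.WriteLine(animalStore.Count);

        // This should not compile; it is the line that would let a Tiger in:
        // IConsumer<Animal> bad = giraffeStore;
    }
}
```

Seen through IProducer<Animal> there is no Insert to call, so the Tiger problem from the example above can't arise.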
How to read it
So when you read an out type in an interface, read it this way:
    interface ICustom<out T>
    {
        //You can initialise me using <=T
        //And I will use it as a backing store
        //But I will only send T's out-wards
        //Because T's coming in could be too wide
        // for my <=T backing store
    }
And for an in type in an interface:
    interface ICustom<in T>
    {
        //You can initialise me using >=T
        //And I will use it as a backing store
        //But I will only accept T's in-wards
        //Because my >=T backing store is too wide
        // to produce T's to send out
    }
If it helps, try reading those again - but this time, with the out T interfaces read T as Animal and <=T as Giraffe.
And with the in T interfaces read T as Giraffe and >=T as Animal.
Or, more concisely
Here's out again more concisely:
    interface ICustom<out T>
    {
        //covariant
        //assign a smaller T, i'll only send it out
    }
And for in:
    interface ICustom<in T>
    {
        //contravariant
        //assign a larger T, i'll only take it in
    }
I hope that helps!
The payoff
As Eric points out, the only way to make the above example of FeedAnimals work in C#3 is to use a "silly and expensive casting operation":
    FeedAnimals(adultGiraffes.Cast<Animal>());
    //or
    FeedAnimals(from g in adultGiraffes select (Animal)g);
He goes on:
"This explicit typing should not be necessary. Unlike arrays (which are read-write) it is perfectly typesafe to treat a read-only list of giraffes as a list of animals"
And the example which Eric suggests hypothetically in that post, Matt Hidinger later demonstrates for us using C#4!
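Eric's remark about arrays being read-write is worth seeing in action. The sketch below is my own illustration, assuming C# 4 or later; TryStoreTiger is a hypothetical helper that attempts the write an Animal[] appears to allow.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the animal hierarchy used in this post.
class Animal { }
class Giraffe : Animal { }
class Tiger : Animal { }

static class ArraysVersusSequences
{
    // Returns false when the runtime refuses to store a Tiger in the array.
    public static bool TryStoreTiger(Animal[] animals)
    {
        try
        {
            animals[0] = new Tiger(); // compiles fine, checked at run time
            return true;
        }
        catch (ArrayTypeMismatchException)
        {
            return false;
        }
    }

    static void Main()
    {
        Giraffe[] giraffes = { new Giraffe(), new Giraffe() };

        Console.WriteLine(TryStoreTiger(giraffes));       // a Giraffe[] in disguise
        Console.WriteLine(TryStoreTiger(new Animal[1]));  // a genuine Animal[]

        // The read-only view has no setter, so there is nothing to go wrong.
        IEnumerable<Animal> readOnlyView = new List<Giraffe>(giraffes);
        foreach (Animal animal in readOnlyView)
            Console.WriteLine(animal.GetType().Name);
    }
}
```

With arrays the mistake only shows up at run time as an exception, whereas with IEnumerable<out T> the unsafe operation simply isn't there to call.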
The full series
That's about as much as I want to write on the subject!
Below are links to the full series. Bear in mind that these explanatory posts were written prior to the release of C# 4.0. But they are still an excellent programmer's introduction, with much more info than I have covered in this post: