I can correct this by adding the else
clause, but how come if there is no outcome handled by default, the function tries to return a Unit
?
In Scala, unlike more "imperative" languages, (almost) everything is an expression (there are very few statements), and every expression evaluates to a value (which also means that every method returns a value).
This means that, for example, the conditional expression if (condition) consequence else differentConsequence
is an expression that evaluates to a value.
For example, in this piece of code:
val foo = if (someRandomCondition) 42 else "Hello"
the then
part of the expression will evaluate to 42
, the else
part of the expression will evaluate to "Hello"
, which means the if
expression as a whole will evaluate to either 42
or "Hello"
.
So, what is the type of foo
going to be? Well, in the then
case, the value is of type Int
and in the else
case, the value is of type String
. But, this depends on the runtime value of someRandomCondition
, which is unknown at compile time. So, the only choice we have as the type for the whole if
expression is the lowest common ancestor (technically, the weak least upper bound) of Int
and String
, which is Any
.
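As a minimal sketch of this (the object name and the random placeholder condition are assumptions made purely for illustration), we can write the type out explicitly and see that the only way back to the precise type is a runtime check:

```scala
object IfExpressionType {
  // Placeholder condition; its value is only known at runtime.
  val someRandomCondition: Boolean = scala.util.Random.nextBoolean()

  // Both branches are expressions; Any is a common supertype of Int and String.
  val foo: Any = if (someRandomCondition) 42 else "Hello"

  // To recover the precise type, we have to inspect the value at runtime.
  val description: String = foo match {
    case i: Int    => s"an Int: $i"
    case s: String => s"a String: $s"
    case other     => s"something else: $other"
  }
}
```

Whichever branch is taken, the static type of foo stays Any, so the compiler will not let you call Int or String methods on it without such a check.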
In a language with union types, we could give it a more precise type, namely the union type Int | String
. (Scala 3 has union types, so we could give the expression this exact type, although Scala 3 will not infer union types.) In Scala 3, we could even annotate it with the even more precise type 42 | "Hello"
, which is actually the type that TypeScript is going to infer for the equivalent conditional expression:
const foo = someRandomCondition ? 42 : "Hello"
Now, let's move forward towards the code in the question:
val bar = if (someRandomCondition) 42
What is the type of bar
going to be? We said above that it is the lowest common ancestor of the types of the then
and else
branch, but what is the type of the else
branch? What does the else
branch evaluate to?
Remember, we said that every expression evaluates to a value, so the else
branch must evaluate to some value. It can't just evaluate to "nothing".
This is solved by a so-called unit value of a unit type. They are called the "unit" value and type because the type is designed in such a way that it can only possibly be inhabited by a single value. The unit type has no members, no properties, no fields, no semantics, no nothing. As such, it is impossible to distinguish two values of the unit type from one another, or put another way: there can only be one value of the unit type, because every other value of the unit type must be identical to it.
In many programming languages, the unit value and type use the same notation as a tuple value and type, and are simply identified with the empty tuple ()
. An empty tuple and a unit value are the same thing: they have no content, no meaning. In Haskell, for example, both the type and the value are written ()
.
Scala also has a unit value, and it is also written ()
. The unit type, however, is scala.Unit
.
So, the unit value, which is a useless value, is used to signify a meaningless return value.
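A small sketch of what that looks like in practice (the names here are made up for illustration):

```scala
object UnitDemo {
  // The one and only value of type Unit.
  val a: Unit = ()

  // println is called only for its side effect, so its result is the unit value.
  val b: Unit = println("side effect")

  // Putting both into a Set leaves a single element, because there is only one unit value.
  val distinctValues: Set[Unit] = Set(a, b)
}
```

The Set ends up with exactly one element, which is just another way of saying that a and b cannot be told apart.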
A related, but different concept in some imperative languages is the void
type (or in some languages, it is more a "pseudo-type").
Note that "returns nothing" is different from "doesn't return", which will become important in the second part of this answer.
So the first half of the puzzle is: the Scala Language Specification says that
if (condition) expression
is equivalent to
if (condition) expression else ()
Which means that in the (implicit) else
case, the return type is Unit
, which is not compatible with List[(Int, Int)]
, and therefore, you get a type error.
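Here is a rough sketch of that situation. The function pairs is a hypothetical stand-in, since I am only assuming what the code in the question looks like:

```scala
object MissingElse {
  // This version would not compile, because the missing else branch is filled in with ():
  //
  //   def pairs(n: Int): List[(Int, Int)] =
  //     if (n > 0) List((n, n))
  //
  // Written out with the else branch made explicit, and with a result type wide
  // enough to hold both a list and the unit value, the same expression compiles:
  def pairsOrUnit(n: Int): Any =
    if (n > 0) List((n, n)) else ()

  // Supplying a value of the right type in the else branch restores the intended type.
  def pairs(n: Int): List[(Int, Int)] =
    if (n > 0) List((n, n)) else Nil
}
```

Returning Nil is only one possible fix; whether an empty list is a sensible result for the "other" case depends on what the function is supposed to mean.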
But why does throwing an exception fix this?
This brings us to the second special type: Nothing
. Nothing
is a so-called bottom type, which means that it is a subtype of every type. Nothing
does not have any value. So, what then, would a return type of Nothing
signify?
It signifies an expression that doesn't return. And I repeat what I said above: this is different from returning nothing.
A method that has only a side-effect returns nothing, but it does return. Its return type is Unit
and its return value is ()
. It doesn't have a meaningful return value.
A method that has an infinite loop or throws an exception doesn't return at all. Its return type is Nothing
and it doesn't have a return value.
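To make the difference concrete, here is a minimal sketch with two illustrative methods (log and fail are names I made up, not library functions):

```scala
object NothingDemo {
  // Has a side effect, then returns normally: result type Unit, result value ().
  def log(message: String): Unit = println(message)

  // Never returns normally: result type Nothing, and there is no result value.
  def fail(message: String): Nothing = throw new IllegalStateException(message)

  // Calling fail never produces a value; the caller only ever sees the exception.
  val outcome: scala.util.Try[Nothing] = scala.util.Try[Nothing](fail("boom"))
}
```

A Try[Nothing] can never be a Success, because there is no value of type Nothing that it could hold.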
And that is why throwing an exception in the else
clause fixes the problem: this means that the type of the else
clause is Nothing
, and since Nothing
is a subtype of every type, it doesn't even matter what the type of the then
clause is: the lowest common supertype of the type of the then
clause and Nothing
will always be the type of the then
clause. (Think about it: the lowest common ancestor of a father and any of his children, grandchildren, great-grandchildren, etc. will always be the father himself. The lowest common ancestor of T
and any subtype of T
will always be T
. Since Nothing
is a subtype of all types, the lowest common ancestor of T
and Nothing
will always be T
because Nothing
is always a subtype of T
, no matter what T
is.)
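Applied to a hypothetical stand-in for the function in the question (again, the name and body are assumptions), this looks roughly like:

```scala
object ThrowInElse {
  // then branch: List[(Int, Int)]; else branch: Nothing;
  // so the whole if expression has type List[(Int, Int)].
  def pairs(n: Int): List[(Int, Int)] =
    if (n > 0) List((n, n))
    else throw new IllegalArgumentException(s"n must be positive, got $n")
}
```

The declared result type is satisfied without the else branch ever having to produce a list, because it never produces anything at all.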