What special rules does the Scala compiler have for the Unit type within the type system?


Unit gets special handling from the compiler when generating bytecode because it is analogous to void on the JVM. But conceptually, as a type within the Scala type system, it also seems to get special treatment in the language itself (examples below).

So my question is whether there really is special treatment for the Unit type and, if so, what mechanisms are used.


Example 1:

For "normal" Scala types like Seq, if a method is declared to return Seq, then you must return a Seq (or a more specific type that extends Seq):

def foo1: Seq[Int] = List(1, 2, 3)
def foo2: Seq[Int] = Vector(1, 2, 3)
def foo3: Seq[Int] = "foo" // Fails

The first two examples compile because List[Int] and Vector[Int] are subtypes of Seq[Int]. The third one fails because String isn't.

But if I change the third example to return Unit, it compiles and runs without issue even though String isn't a subtype of Unit:

def foo3(): Unit = "foo" // Compiles (with a warning)

I don't know of any other type in Scala for which this exception is allowed. So does the compiler have special rules for the Unit type at the type system level, or is there a more general mechanism at work, e.g. an implicit conversion?


Example 2:

I'm also not clear on how Unit interacts with situations where variance rules would normally be applied.

For example, we sometimes hit this bug with Future[Unit] where we accidentally use map instead of flatMap and create a Future[Future]:

def save(customer: Customer): Future[Unit] = ... // Save to database

def foo: Future[Unit] = save(customer1).map(_ => save(customer2))

The map is creating a Future[Future[Unit]] and the compiler requires a Future[Unit]. Yet this compiles!

At first I thought this was because Future[+T] is covariant, but actually Future[Unit] isn't a subtype of Unit so it doesn't seem to be that.

If the type gets changed to Boolean for example, the compiler detects the bug:

def save(customer: Customer): Future[Boolean] = ...

def foo: Future[Boolean] = save(customer1).map(_ => save(customer2)) // Compiler fails this

And for every other non-Unit type it won't compile (except Any because Future[Any] happens to be a subtype of Any by coincidence).

So does the compiler have special rules in this case? Or is there a more general process happening?

There are 3 answers

Michael Zajac (best answer):

I'm going to answer the title question for more coverage. Unit gets special treatment in a few places, more than what's going on in those code examples. In part, this is because Unit is a figment of the compiler that reduces to void on the JVM.


Value Discarding

This is the most surprising case for people. Any time the expected type of some value is Unit, the compiler appends the unit value () to the end of the expression that produces the value, according to the SLS - 6.26.1:

If e has some value type and the expected type is Unit, e is converted to the expected type by embedding it in the term { e; () }.

Thus,

def foo3(): Unit = "foo"

becomes:

def foo3(): Unit = { "foo" ; () }

Likewise,

def foo: Future[Unit] = save(customer1).map(_ => save(customer2))

becomes:

def foo: Future[Unit] = save(customer1).map(_ => { save(customer2); () })

The benefit of this is that the last statement of a method does not need to have the type Unit if you don't want it to. This benefit is small, however: if the last statement of a method that returns Unit isn't a Unit, that usually indicates an error, which is why there is a warning flag for it (-Ywarn-value-discard).
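To make the consequence of value discarding in the Future case concrete, here is a minimal sketch. The names ValueDiscardDemo and failingSave are hypothetical stand-ins for the question's save method, and Future.unit assumes Scala 2.12 or later. It shows that with map the inner Future is thrown away, so its failure never reaches the caller, whereas flatMap propagates it.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ValueDiscardDemo {
  // Hypothetical stand-in for save(): always fails instead of touching a database.
  def failingSave(): Future[Unit] =
    Future.failed(new RuntimeException("save failed"))

  // Compiles because the lambda's expected result type is Unit:
  // the inner Future is evaluated and then discarded, so its failure is lost.
  def withMap: Future[Unit] = Future.unit.map(_ => failingSave())

  // flatMap sequences the inner Future, so its failure propagates.
  def withFlatMap: Future[Unit] = Future.unit.flatMap(_ => failingSave())

  def main(args: Array[String]): Unit = {
    val mapped = Await.ready(withMap, 1.second).value.get
    val flatMapped = Await.ready(withFlatMap, 1.second).value.get
    println(mapped.isSuccess)     // true: the failure was silently dropped
    println(flatMapped.isFailure) // true: the failure is visible
  }
}
```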

In general, I find it better to return a more specific type, if possible, rather than returning Unit. For example, when saving to a database, you may be able to return the saved value (perhaps with a new ID, or something).
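As a rough illustration of that advice, the sketch below uses a hypothetical Customer model and an in-memory save that returns the saved value with an assumed, hard-coded ID. Because the result type is no longer Unit, the map/flatMap mix-up from the question stops compiling, as the commented-out definition indicates.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SpecificReturnDemo {
  // Hypothetical model: the id is only known after saving.
  final case class Customer(name: String, id: Option[Long] = None)

  // Hypothetical in-memory stand-in for a database write; the id is hard-coded.
  def save(customer: Customer): Future[Customer] =
    Future.successful(customer.copy(id = Some(1L)))

  // With a non-Unit result there is nothing to discard, so this does not compile:
  // def broken: Future[Customer] =
  //   save(Customer("a")).map(_ => save(Customer("b")))

  // flatMap is the only version the compiler accepts here.
  def chained: Future[Customer] =
    save(Customer("a")).flatMap(_ => save(Customer("b")))

  def main(args: Array[String]): Unit =
    println(Await.result(chained, 1.second))
}
```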


Value Class

Unit is a value class created by the Scala compiler, with only one instance (if it needs to be instantiated as a class at all). This means that it compiles down to void on the JVM, unless you treat it as a class (e.g. ().toString). It has its very own section in the specification, SLS - 12.2.3.
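A small sketch of what "treating it as a class" looks like in practice. The object name UnitBoxingDemo is made up for illustration; scala.runtime.BoxedUnit is the same runtime class that appears in the compiler output shown under Casting below.

```scala
object UnitBoxingDemo {
  def main(args: Array[String]): Unit = {
    val u: Unit = ()

    // Widening to Any forces the boxed representation of the unit value.
    val boxed: Any = u

    println(u.toString)             // ()
    println(boxed.getClass.getName) // scala.runtime.BoxedUnit

    // There is only one instance: the boxed value is the BoxedUnit singleton.
    println(boxed.asInstanceOf[AnyRef] eq scala.runtime.BoxedUnit.UNIT) // true
  }
}
```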


Empty Block Type

From the SLS - 6.11, the default type of an empty block is assumed to be Unit. For example:

scala> val x = { }
x: Unit = ()

Equals

When comparing a Unit to another Unit (which must be the same object, since there is only one), the compiler will emit a special warning to inform you something is likely wrong in your program.

scala> ().==(())
<console>:12: warning: comparing values of types Unit and Unit using `==' will always yield true
       ().==(())
            ^
res2: Boolean = true

Casting

You can cast anything to a Unit, as the compiler will optimize it away (though it's unclear to me if value discarding takes over after type inference).

object Test {
  val a = "a".asInstanceOf[Unit]
  val b = a
}

becomes:

object Test extends Object {
  def <init>(): Test.type = {
    Test.super.<init>();
    ()
  };
  private[this] val a: scala.runtime.BoxedUnit = scala.runtime.BoxedUnit.UNIT;
  <stable> <accessor> def a(): Unit = ();
  private[this] val b: scala.runtime.BoxedUnit = scala.runtime.BoxedUnit.UNIT;
  <stable> <accessor> def b(): Unit = ()
}

rethab:

As written in the Scala Language Specification, section 6.26.1:

Value Discarding

If e has some value type and the expected type is Unit, e is converted to the expected type by embedding it in the term { e; () }.

Eduardo Pareja Tobes:

rethab's answer already gave you the link to the spec; just let me add that

  • you can disable this (make the warning an error) through the -Xfatal-warnings compiler flag
  • you'll get better messages with the -Ywarn-value-discard flag; for foo3 the compiler warning will be the more informative discarded non-Unit value
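In an sbt build, the two flags might be enabled roughly as follows. This is a sketch that assumes sbt and a compiler version that accepts these spellings; newer compilers may name the flags differently, so check scalac -help for your version.

```scala
// build.sbt (fragment)
scalacOptions ++= Seq(
  "-Ywarn-value-discard", // warn when a non-Unit value is discarded
  "-Xfatal-warnings"      // turn every warning into a compile error
)
```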

Note that this "any to Unit" conversion is compiler magic, so neither -Yno-predef nor -Yno-imports will disable it; you do need the flags above. I consider it a mistake that this is part of the language specification: if for some reason you want this dubious behavior, you can just add something like

implicit def any2Unit(a: Any): Unit = ()

while opting-out of it requires an unsupported (by definition, as it breaks the specification) compiler flag.

I also recommend wartremover, which gives you this check and much more.