Note: the Vietnamese version of this article is available at the link below.

https://duongnt.com/kotlin-null-safety-vie

Kotlin null safety deep dive

One of Kotlin’s improvements over Java is null safety. Kotlin distinguishes between references that can hold null (nullable types) and references that cannot (non-nullable types). But at the end of the day, Kotlin code is still compiled into bytecode and runs on the JVM.

As an example, today we will find out how Kotlin distinguishes between the String and String? types when the JVM only knows of Ljava/lang/String. The same principle can be applied to any other type.

Null safety with local variables

We will start with the simplest case: null safety with local variables. Below is the code that declares a String? and a String.

val canBeNull: String? = null
val nonNull: String = null // compiler error

Here is the corresponding bytecode.

LOCALVARIABLE nonNull Ljava/lang/String; L2 L3 1
LOCALVARIABLE canBeNull Ljava/lang/String; L1 L3 0

As we can see, at the bytecode level, there is no difference between a nullable type and a non-nullable type. Both are compiled into Ljava/lang/String. The Kotlin compiler, on the other hand, detects the attempt to assign null to the nonNull variable and reports an error.
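To make the point concrete, here is a minimal sketch that prints the runtime class of a value held in a String? variable and in a String variable. The helper name runtimeClassName is hypothetical and only serves this illustration.

```kotlin
// Returns the JVM class name of a value, or "null" when there is no object.
// (Hypothetical helper, used only for this illustration.)
fun runtimeClassName(value: Any?): String = value?.javaClass?.name ?: "null"

fun main() {
    val canBeNull: String? = "text"
    val nonNull: String = "text"

    // Both variables hold a plain java.lang.String at runtime.
    println(runtimeClassName(canBeNull)) // java.lang.String
    println(runtimeClassName(nonNull))   // java.lang.String
    println(runtimeClassName(null))      // null
}
```

The nullability marker lives only in the Kotlin type system; the object on the heap is the same in both cases.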

But what happens when we call a predefined function? Does the compiler need to walk up the call chain and check the source code of each and every function? What if we only have the bytecode and don’t have the source code?

Check the return type of a function

Take these two functions as examples.

fun returnNullable(): String? = "dummy data"

fun returnNonNullable(): String = "dummy data"

Below is their bytecode.

public final static returnNullable()Ljava/lang/String;
@Lorg/jetbrains/annotations/Nullable;() // invisible
 L0
  LINENUMBER 127 L0
  LDC "dummy data"
  ARETURN
 L1
  MAXSTACK = 1
  MAXLOCALS = 0

public final static returnNonNullable()Ljava/lang/String;
@Lorg/jetbrains/annotations/NotNull;() // invisible
 L0
  LINENUMBER 129 L0
  LDC "dummy data"
  ARETURN
 L1
  MAXSTACK = 1
  MAXLOCALS = 0

The key here is the two annotations @Lorg/jetbrains/annotations/Nullable;() and @Lorg/jetbrains/annotations/NotNull;(). Although both return types are Ljava/lang/String, the returnNullable function is marked with @Nullable, while returnNonNullable is marked with @NotNull.

By themselves, those annotations don’t have any effect on how the JVM executes the bytecode. But they record whether a return type is nullable or not. Thanks to them, the compiler can report an error when there is a violation, as in this snippet.

val nonNull: String = returnNullable() // type mismatch
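For comparison, here is a small sketch of assignments that the compiler does accept. It reuses the two functions above; lengthOrZero and valueOrFallback are hypothetical helpers added only for this example.

```kotlin
fun returnNullable(): String? = "dummy data"

fun returnNonNullable(): String = "dummy data"

// Safe call plus elvis operator: the result is a non-nullable Int.
fun lengthOrZero(): Int = returnNullable()?.length ?: 0

// Elvis operator supplies a default, so the result is a non-nullable String.
fun valueOrFallback(): String = returnNullable() ?: "fallback"

fun main() {
    // No conversion needed: the declared return type is already String.
    val direct: String = returnNonNullable()
    println(direct.length)     // 10
    println(lengthOrZero())    // 10
    println(valueOrFallback()) // dummy data
}
```

In each case the caller has to handle the null branch explicitly before the value can be stored in a non-nullable type.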

Check parameter types of a function

In this next example, we add two parameters to the returnNullable function.

fun returnNullable(canBeNull: String?, notNull: String): String? = "dummy data"

The bytecode also looks a bit different.

public final static returnNullable(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
@Lorg/jetbrains/annotations/Nullable;() // invisible
  // annotable parameter count: 2 (visible)
  // annotable parameter count: 2 (invisible)
  @Lorg/jetbrains/annotations/Nullable;() // invisible, parameter 0
  @Lorg/jetbrains/annotations/NotNull;() // invisible, parameter 1
 L0
  ALOAD 1
  LDC "notNull"
  INVOKESTATIC kotlin/jvm/internal/Intrinsics.checkNotNullParameter (Ljava/lang/Object;Ljava/lang/String;)V
 L1
  LINENUMBER 127 L1
  LDC "dummy data"
  ARETURN
 L2
  LOCALVARIABLE canBeNull Ljava/lang/String; L0 L2 0
  LOCALVARIABLE notNull Ljava/lang/String; L0 L2 1
  MAXSTACK = 2
  MAXLOCALS = 2

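Again, we can see that both parameter types are Ljava/lang/String. But we now have two more annotations, one for each parameter. They are marked @Nullable and @NotNull respectively, which matches the actual nullability. The compiler also inserts a call to Intrinsics.checkNotNullParameter for the notNull parameter (ALOAD 1), so a null passed by a caller that bypasses the Kotlin compiler is rejected at runtime.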
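The runtime effect of the checkNotNullParameter call in the bytecode above can be observed by bypassing the compiler with Java reflection. This is only a sketch: the Params object and the callWithNullSecondArg helper are hypothetical, and it assumes parameter assertions have not been disabled with a compiler flag.

```kotlin
import java.lang.reflect.InvocationTargetException

// Hypothetical holder object so the method can be looked up by reflection.
object Params {
    fun returnNullable(canBeNull: String?, notNull: String): String? = "dummy data"
}

// Calls returnNullable with null for the non-nullable parameter and
// returns the exception raised by the generated check, if any.
fun callWithNullSecondArg(): Throwable? {
    val method = Params::class.java.getMethod(
        "returnNullable", String::class.java, String::class.java
    )
    return try {
        method.invoke(Params, "x", null)
        null
    } catch (e: InvocationTargetException) {
        e.cause // reflection wraps the real exception
    }
}

fun main() {
    // Typically a NullPointerException on recent Kotlin versions.
    println(callWithNullSecondArg())
}
```

Passing null for the first parameter raises nothing, because no check is generated for a parameter declared as String?.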

Nullability of generic type parameter

Generic types do not have null safety annotations

The trickiest case is handling nullability in generic types. Let’s take a look at the simple class below.

class DummyClass {
    fun genericType(): Pair<String, String?> {
        throw NotImplementedError()
    }
}

And here is the bytecode of the genericType function.

public final genericType()Lkotlin/Pair;
@Lorg/jetbrains/annotations/NotNull;() // invisible
 L0
  LINENUMBER 3 L0
  NEW kotlin/NotImplementedError
  DUP
  ACONST_NULL
  ICONST_1
  ACONST_NULL
  INVOKESPECIAL kotlin/NotImplementedError.<init> (Ljava/lang/String;ILkotlin/jvm/internal/DefaultConstructorMarker;)V
  CHECKCAST java/lang/Throwable
  ATHROW
 L1
  LOCALVARIABLE this LDummyClass; L0 L1 0
  MAXSTACK = 5
  MAXLOCALS = 1

At first glance, there is one @NotNull annotation. That, however, is the annotation for the Pair<out A, out B> type. It does not mark the type of the generic parameters in Pair.

Yet the compiler can still detect type mismatch errors, like in this assignment: val (a: String, b: String) = DummyClass().genericType(). This means there must be another way to indicate if a generic type parameter is nullable or not.
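The mismatch described above can be sketched with a standalone function that has the same return type. The names genericPair and describe are hypothetical and exist only for this illustration.

```kotlin
// Same return type as genericType, but with a body that actually returns a value.
fun genericPair(): Pair<String, String?> = Pair("first", null)

fun describe(): String {
    // Accepted: the declared component types match Pair<String, String?>.
    val (a: String, b: String?) = genericPair()

    // Rejected at compile time, because the second component is String?:
    // val (x: String, y: String) = genericPair()

    return "${a.length}:${b?.length ?: -1}"
}

fun main() {
    println(describe()) // 5:-1
}
```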

Metadata to the rescue

Kotlin bytecode uses metadata to provide additional information about the generic type parameters in question. Here is the metadata for DummyClass.

  @Lkotlin/Metadata;(mv={1, 7, 0}, k=1, d1={"\u0000\u0016\n\u0002\u0018\u0002\n\u0002\u0010\u0000\n\u0002\u0008\u0002\n\u0002\u0018\u0002\n\u0002\u0010\u000e\n\u0000\u0018\u00002\u00020\u0001B\u0005\u00a2\u0006\u0002\u0010\u0002J\u0014\u0010\u0003\u001a\u0010\u0012\u0004\u0012\u00020\u0005\u0012\u0006\u0012\u0004\u0018\u00010\u00050\u0004\u00a8\u0006\u0006"}, d2={"LDummyClass;", "", "()V", "genericType", "Lkotlin/Pair;", "", "Demo"})

This is not very human-readable. But we can parse it with the kotlinx-metadata-jvm package.

dependencies {
    implementation("org.jetbrains.kotlinx:kotlinx-metadata-jvm:0.6.0")
}

Read nullable flag from the metadata

Below is the code to check the nullability of each generic type parameter in the return type of the genericType function.

import kotlinx.metadata.Flag
import kotlinx.metadata.jvm.KotlinClassMetadata

val metadataAnnotation = Metadata(
    kind = 1, // value of the k field in the metadata
    metadataVersion = intArrayOf(1, 7, 0), // value of the mv field
    // map to d1
    data1 = arrayOf(<value of the d1 field>),
    // map to d2
    data2 = arrayOf(<value of the d2 field>),
)

val metadata = KotlinClassMetadata.read(metadataAnnotation)
// Because DummyClass is a class, we use KotlinClassMetadata.Class here
val classMetadata = metadata as KotlinClassMetadata.Class
val kclass = classMetadata.toKmClass()

// DummyClass only has one function, so we retrieve the first element in kclass.functions,
// then we retrieve its return type arguments
val returnTypeArguments = kclass.functions.first().returnType.arguments
// Because the return type has two generic type parameters, we retrieve the first and second elements from returnTypeArguments array,
// then we extract their types
val firstArgType = returnTypeArguments[0].type
val secondArgType = returnTypeArguments[1].type

// For each type, we check its flag to see if it is nullable
val isFirstArgNullable = Flag.Type.IS_NULLABLE(firstArgType?.flags!!)
val isSecondArgNullable = Flag.Type.IS_NULLABLE(secondArgType?.flags!!)

// And we print the result to console
println("isFirstArgNullable: $isFirstArgNullable, isSecondArgNullable: $isSecondArgNullable")

This is the output of the snippet above.

isFirstArgNullable: false, isSecondArgNullable: true

As we can see, the metadata stores the nullability of each generic type parameter. And the compiler can use this information to throw mismatch errors at compile time.
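The same information can also be read at runtime through Kotlin reflection, which decodes the @Metadata annotation for us. The sketch below assumes the kotlin-reflect library is on the classpath; returnTypeArgumentNullability is a hypothetical helper name.

```kotlin
import kotlin.reflect.KType

class DummyClass {
    fun genericType(): Pair<String, String?> {
        throw NotImplementedError()
    }
}

// Assumption: kotlin-reflect is available at runtime.
// Returns, for each type argument of genericType's return type,
// whether that argument is marked nullable.
fun returnTypeArgumentNullability(): List<Boolean> {
    val returnType: KType = DummyClass::genericType.returnType
    // A star projection has no type; it is reported as nullable here,
    // which is an arbitrary choice for this sketch.
    return returnType.arguments.map { it.type?.isMarkedNullable ?: true }
}

fun main() {
    println(returnTypeArgumentNullability()) // [false, true]
}
```

Note that genericType is never invoked here; only its signature is inspected.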

How about calling Java code from Kotlin

Another interesting case is when we call Java code from Kotlin. If a Java function returns String, should we map it to String or String? in Kotlin? After all, the bytecode in this case does not have any nullability annotations. And we also don’t have the necessary metadata.

The answer is that Kotlin treats this type as a platform type, written String! in compiler messages. We can declare the receiving variable as either String or String? and the compiler will trust this decision. Let’s take a look at the example below.

// Java code
public class Container {
    public static String FromJava() {
        return "dummy";
    }
}

// Kotlin code
val canBeNull: String? = Container.FromJava()
val nonNull: String = Container.FromJava()

The Kotlin code produces the following bytecode.

// val canBeNull: String? = Container.FromJava()
LINENUMBER 88 L2
INVOKESTATIC Container.FromJava ()Ljava/lang/String;
ASTORE 1

// val nonNull: String = Container.FromJava()
LINENUMBER 89 L3
INVOKESTATIC Container.FromJava ()Ljava/lang/String;
DUP
LDC "Container.FromJava()"
INVOKESTATIC kotlin/jvm/internal/Intrinsics.checkNotNullExpressionValue (Ljava/lang/Object;Ljava/lang/String;)V
ASTORE 2

As we can see, when we assign the return value of FromJava to a non-nullable String, the compiler adds an additional call to the checkNotNullExpressionValue function. This helps verify that the return value is indeed not null. In other words, while the second line can compile, it might fail at runtime.
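To see that failure without writing a Java class, here is a sketch that uses System.getProperty from the JDK, which the Kotlin compiler sees as returning a platform type. The property name and both helper functions are hypothetical.

```kotlin
// Treats the platform type as nullable: a missing property is just null.
fun readAsNullable(key: String): String? = System.getProperty(key)

// Treats the platform type as non-nullable: the compiler inserts a
// runtime check, which fails when the property is missing.
fun readAsNonNull(key: String): String {
    val value: String = System.getProperty(key)
    return value
}

fun main() {
    val key = "hypothetical.missing.property" // assumed not to be set

    println(readAsNullable(key) ?: "<absent>") // <absent>

    // Typically a NullPointerException on recent Kotlin versions.
    val failure = runCatching { readAsNonNull(key) }.exceptionOrNull()
    println(failure)
}
```

When a Java API gives no guarantee about null, declaring the Kotlin side as nullable moves the failure from runtime back to compile time.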

Conclusion

Kotlin uses a combination of annotations and metadata to implement null safety. Whether we use String or String?, the type that the JVM sees is still Ljava/lang/String. Thanks to this, null safety does not break the compatibility between Kotlin and Java.

A software developer from Vietnam, currently living in Japan.
