Does print() invalidate the thread-local cache?


In Java, Kotlin, or any other JVM language, each thread has its own "local memory", also known as a "cache". When a thread wants to write a variable, it first updates the value in its own cache and then syncs the modification to main memory, which is shared between threads.

The following demo illustrates a case where one thread does not notice a change made by another:

class VolatileExample3 {

    var flag = false

    var a = 0

    fun write() {
        a = 1
        Thread.sleep(10)
        flag = true
    }

    companion object {
        @JvmStatic
        fun test() {
            val example = VolatileExample3()
            val th = Thread {
                example.write()
            }
            th.start()
            while (!example.flag) {}
            println("a = ${example.a}")
        }
    }
}

The thread th calls write(), which first updates the value of a, then sleeps for 10 ms to simulate an I/O operation, and finally sets flag to true to indicate that the write is done.

The main thread keeps polling flag and leaves the while loop once flag is true, which should be the moment write() has finished.

We want the output to be "a = 1", which is the correct value after the write.

However, because of the "local memory / cache", the main thread keeps spinning in the while loop as if it were stuck.

Main thread won't read Main Memory

Just before the body of write() runs, the main thread reads flag and gets false. It won't fetch the value from main memory again until we explicitly tell it to.

The answer to the problem is obvious: add @Volatile to flag, so that a read of it invalidates the thread's local memory before the value is actually loaded.

But I found another solution to this problem:
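For reference, a minimal sketch of the @Volatile fix described above. The names VolatileFlagExample and runVolatileDemo are made up for this illustration; kotlin.concurrent.thread is the standard-library helper that creates and starts a thread.

```kotlin
import kotlin.concurrent.thread

class VolatileFlagExample {
    // @Volatile makes this a JVM volatile field: a write to it
    // happens-before every later read that observes the written value.
    @Volatile
    var flag = false

    var a = 0

    fun write() {
        a = 1
        Thread.sleep(10)
        flag = true // volatile write: also publishes the earlier write to a
    }
}

// Hypothetical helper: runs the writer, spins on the flag, returns a.
fun runVolatileDemo(): Int {
    val example = VolatileFlagExample()
    thread { example.write() }
    while (!example.flag) { /* spin until the volatile read sees true */ }
    return example.a
}

fun main() {
    println("a = ${runVolatileDemo()}") // prints "a = 1"
}
```

Only flag needs the annotation here: the plain write to a comes before the volatile write in program order, so it is visible to any thread that has read flag as true.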

while (!example.flag) {
    print("")
}

It works without @Volatile! I wondered whether the flush operation inside print() invalidates the main thread's local memory / cache. If so, what is the mechanism?

I also found that putting th.join() before the while loop works. Are these two methods essentially the same? That is, is the termination of a thread announced to all other threads, leading to the invalidation of their local memory?


There are 3 answers

2
Solomon Slow On

There is no "cache" in the Java programming language. Caches are part of the underlying computer hardware architecture, and they work differently on different computers.

Your job, as a Java programmer, is to make sure that your code obeys the rules that are published in the Java Language Specification (JLS). The job of the developers who provide the Java compiler (javac), the Java virtual machine (JVM), and the Java run-time environment (JRE) for your particular computer is to ensure that if your code obeys the rules in the JLS, then their code will keep the promises made by the JLS.

Last time I looked, the rules and promises regarding access to variables from multiple threads were mostly in Chapter 17, and mostly expressed in terms of "synchronization" and "happens before" relationships.

I found another solution of this problem which is:...print()

That may work on your computer, with the operating system version and the JDK version that you happen to be using today. In fact, it works for a lot of people on many different computers, but it is not guaranteed to work.

Experimentation is not the way to find out what "works" and what does not. I once worked for a large corporation that released a major software package that broke the rules. It passed several weeks' worth of aggressive testing in our offices. It even "worked" on all of our customers' computers when we first released it. But then one of our customers upgraded their OS, and the software stopped working for them.

Also I found that putting th.join() before the while loop also works.

That is guaranteed to work. The JLS explicitly guarantees that whatever thread A does to shared variables must become visible to any other thread after the other thread has called A.join(). In the language of the JLS, everything that thread A does "happens before" an A.join() call returns in the other thread.
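A minimal sketch of the join() guarantee, with flag deliberately left non-volatile. JoinExample and runJoinDemo are names invented for this illustration.

```kotlin
class JoinExample {
    var flag = false // deliberately NOT volatile
    var a = 0

    fun write() {
        a = 1
        Thread.sleep(10)
        flag = true
    }
}

// Hypothetical helper: returns what the calling thread sees after join().
fun runJoinDemo(): Pair<Boolean, Int> {
    val example = JoinExample()
    val th = Thread { example.write() }
    th.start()
    th.join() // everything th did happens-before join() returns
    return Pair(example.flag, example.a)
}

fun main() {
    val (flag, a) = runJoinDemo()
    println("flag = $flag, a = $a") // prints "flag = true, a = 1"
}
```

Note that once join() has returned, the writer thread has finished, so a while loop polling flag after it has nothing left to wait for.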

3
Stephen C On

The short answer to this is that according to the Java Memory Model (JMM) which Kotlin on JVM depends on, your program is incorrect.

There is no happens before relationship between writing to flag in the child thread and subsequently reading it in the main thread. The behavior of your application is unspecified: there are executions that are not well-formed.

From the language perspective, that means that your "main" thread may or may not see flag change state. The actual behavior could depend on your hardware platform (including CPU type, how many cores you have, how much RAM you have), the specific JRE version and build, or indeed ... anything.

Trying to understand how and why code that is incorrect changes behavior when you tweak it is kind of pointless. You shouldn't write code like that anyway ... even if you think you know what is happening "under the hood". The unspecified "under the hood" stuff could change, be different on different machines, and so on.


So what is actually going on?

Well, it is hard to be sure, but I suspect that the print call is doing some synchronization of its own, and that has the serendipitous effect of causing memory caches to be flushed [1].

But you can't rely on it. The classes involved are not specified to have this effect. It is a lucky implementation detail.
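By contrast, synchronization that the program performs explicitly is covered by the memory model, provided both threads lock the same monitor. The sketch below is one way to do that; LockedFlagExample, readIfDone, and runLockedDemo are names invented for this illustration and are not part of the original code.

```kotlin
import kotlin.concurrent.thread

class LockedFlagExample {
    private val lock = Any()
    private var flag = false
    private var a = 0

    fun write() {
        Thread.sleep(10)
        synchronized(lock) {
            a = 1
            flag = true
        } // releasing the lock happens-before any later acquire of it
    }

    // Returns a once the flag is set, or null if the write is not done yet.
    fun readIfDone(): Int? = synchronized(lock) { if (flag) a else null }
}

// Hypothetical helper: polls under the lock until the writer is done.
fun runLockedDemo(): Int {
    val example = LockedFlagExample()
    thread { example.write() }
    while (true) {
        val result = example.readIfDone()
        if (result != null) return result
    }
}

fun main() {
    println("a = ${runLockedDemo()}") // prints "a = 1"
}
```

The difference from the print("") hack is that here the writer and the reader both use the same lock, which is what creates the happens-before edge; a lock taken only on the reading side gives no such guarantee.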

Note: since flag is not declared as volatile, it would be a legitimate [2] optimization for the JIT compiler to detect that there is no happens before relationship and then emit code that reads flag from RAM and puts it into a register once. If it did that, the print(""); hack would have no effect.

[1] Assuming that memory caches are involved. There are other possible mechanisms in play too; e.g. use of registers.
[2] Legitimate in the sense that it doesn't violate the JMM visibility guarantees. They only apply to well-formed executions!
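To make the register optimization from the note concrete, here is a hand-written, single-threaded equivalent of a loop whose read of flag has been hoisted. This is only an illustration of the effect, not actual JIT output; PlainFlag, spinHoisted, maxSpins, and onSpin are hypothetical names, and the maxSpins bound exists only so the example terminates.

```kotlin
class PlainFlag {
    var flag = false // not volatile
}

// Reads the flag once and reuses the local copy, as a JIT compiler
// would be allowed to do for a non-volatile field.
fun spinHoisted(holder: PlainFlag, maxSpins: Int, onSpin: () -> Unit): Boolean {
    val cached = holder.flag // the single read, kept in a local ("register")
    var spins = 0
    while (!cached && spins < maxSpins) {
        onSpin() // whatever the loop body does, cached is never refreshed
        spins++
    }
    return cached
}

fun main() {
    val holder = PlainFlag()
    // The loop body itself flips the field, yet the loop never notices.
    val sawTrue = spinHoisted(holder, maxSpins = 1000) { holder.flag = true }
    println("loop saw flag = $sawTrue, field is now ${holder.flag}")
    // prints "loop saw flag = false, field is now true"
}
```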

0
Basil Bourque On

The two answers by Stephen C and by Solomon Slow are both correct.

AtomicBoolean instead of volatile

My preferred solution to your issue is the use of the Atomic… classes. Using these classes clearly reminds the reader that we are protecting access across threads.

To hold a Boolean value, use AtomicBoolean.

By changing your Boolean variable to an AtomicBoolean, you’ll not need volatile.

I don’t know Kotlin syntax but I’ll make an attempt at revising your code. Here we achieve thread-safety without any need for volatile.

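import java.util.concurrent.atomic.AtomicBoolean

class VolatileExample3 {

    // The reference never changes, so val is enough; the AtomicBoolean
    // itself provides the cross-thread visibility.
    val flag = AtomicBoolean(false)

    var a = 0

    fun write() {
        a = 1
        Thread.sleep(10)
        flag.set(true) // publishes the earlier write to a as well
    }

    companion object {
        @JvmStatic
        fun test() {
            val example = VolatileExample3()
            val th = Thread {
                example.write()
            }
            th.start()
            while (!example.flag.get()) {}
            println("a = ${example.a}")
        }
    }
}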

Of course, your code has another concurrency issue. There is a gap in time between execution of the while and the println. During that time the flag could be flipped. But that seems to be outside the scope of your Question.