Notice

This is my personal blog and my posts here have nothing to do with my employers or any other association I may have. It is my personal blog for my personal experience, ideas and notes.

Saturday, August 13, 2022

Virtual Thread (JEP-425)

 

Java Thread 

First, let's talk about the normal Java thread. Before we proceed, note that the normal Java thread now has a new name: Platform Thread. From here on, we will refer to the normal Java thread as a platform thread.


A platform thread is a wrapper around an Operating System (OS) thread in Java. When we create a platform thread and run it using the start() method, we are actually making a native call to the OS, asking it to assign an OS thread to the platform thread for execution. Because the thread comes from the OS, platform threads are limited in number, and a request for one may not be served immediately, as the OS has other tasks to manage as well. The term 'expensive' is generally ascribed to anything that is limited in number and not immediately available, so a platform thread is undoubtedly expensive. Keep in mind, though, that because it executes directly on an OS thread, a platform thread is at least as fast as a virtual thread at executing code. Since platform threads are expensive, we cannot have as many as we want, and this becomes a bottleneck whenever we need a huge number of them. Technically speaking, the OS imposes no fixed restriction on the number of threads, and with proper hardware infrastructure and OS configuration we can have a huge number of platform threads; but if we spawn them blindly, they will eat up computing resources on overhead such as context switching.
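
As a minimal sketch of the paragraph above, the snippet below starts one platform thread and checks that the task really ran on a platform (non-virtual) thread. It assumes JDK 21 or later, where Thread.ofPlatform() is a final API; the class name and the thread name are made up for illustration.

```java
class PlatformThreadDemo {

    // Starts one platform thread, waits for it, and reports whether
    // the task observed itself running on a virtual thread.
    static boolean ranOnVirtualThread() {
        boolean[] wasVirtual = new boolean[1];
        Thread platform = Thread.ofPlatform()
                .name("demo-platform-thread") // hypothetical name, illustration only
                .start(() -> wasVirtual[0] = Thread.currentThread().isVirtual());
        try {
            platform.join(); // wait until the OS-backed thread has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return wasVirtual[0];
    }

    public static void main(String[] args) {
        System.out.println("ran on a virtual thread? " + ranOnVirtualThread());
    }
}
```

The older form, new Thread(runnable).start(), also creates a platform thread; the builder is used here only because it makes the kind of thread explicit.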





So why do we require a 'virtual thread'? 

As we already know, a Java platform thread is expensive and runs on an OS thread. This means that whenever a platform thread needs to wait, it blocks the precious OS thread. Consequently, we cannot utilise the OS thread at an optimum level: it is wasted while the platform thread waits for some task.

So, we require a thread that is good at waiting and at the same time can scale easily without eating up computing resources on overhead such as context switching. For CPU-intensive work, however, a virtual thread will not give us any significant advantage over a platform thread.

A virtual thread is good at waiting, so its ideal usage is when the application is executing IO operations such as database calls or other third-party service calls. During such an IO operation the virtual thread waits for the operation to complete without blocking the OS thread; once the IO operation is done, it continues with the rest of the task.
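
To make the "good at waiting" point concrete, here is a small sketch that starts many virtual threads which each wait on simulated IO. Thread.sleep stands in for the database or service call, which is an assumption made purely for illustration. The names are hypothetical and JDK 21 or later is assumed.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

class WaitingDemo {

    // Starts `tasks` virtual threads that each wait on simulated IO,
    // waits for all of them, and returns the elapsed wall-clock time in ms.
    static long runBlockingTasks(int tasks, Duration simulatedIo) {
        long start = System.nanoTime();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            threads.add(Thread.startVirtualThread(() -> pause(simulatedIo)));
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Stand-in for a blocking IO call (assumption: sleep models the wait).
    private static void pause(Duration d) {
        try {
            Thread.sleep(d);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long ms = runBlockingTasks(500, Duration.ofMillis(100));
        System.out.println("500 waiting tasks finished in " + ms + " ms");
    }
}
```

Run one after another, 500 waits of 100 ms would take about 50 seconds. Because a waiting virtual thread does not hold on to an OS thread, the waits overlap and the total should be far smaller; the exact figure depends on the machine.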

By now we probably have an adequate idea of why we require a virtual thread.

Now what is a 'virtual thread'? 

A virtual thread is a thread that runs on a platform thread, and it is of type java.lang.Thread.
It is made of two components: a continuation and a scheduler. Java already had an excellent scheduler in the form of ForkJoinPool, so only the continuation had to be added to the JVM.

 

What is a 'continuation' now? 

The continuation used here is, more precisely, a delimited continuation, and it is closely related to what is sometimes called a coroutine.

A continuation is an abstract representation of the control state of a computer program.
A routine is a reusable piece of code that is usually called multiple times during execution.

Key properties of coroutine 

  • Can be suspended and resumed at any time.
  • It is a data structure that represents the process state and call stack trace.
  • Can yield / give control to other coroutines.
  • It must have an isDone() method, which tells us whether execution has finished; a yield() method, which suspends the current continuation up to the given scope; and a run() method, which mounts and runs the continuation body or, if it was suspended, continues it from the last suspend point.
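
The properties above can be illustrated with a toy model. The JDK's own continuation class lives in an internal package (jdk.internal.vm) and cannot be used without extra command-line flags, so the sketch below is a hand-written state machine that only mimics the run / yield / isDone behaviour. ToyContinuation and its members are hypothetical names, not JDK API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy class: NOT the JDK's internal Continuation.
class ToyContinuation {
    private final List<String> log = new ArrayList<>();
    private int resumePoint = 0;  // saved position: where the next run() continues
    private boolean done = false;

    boolean isDone() { return done; }

    List<String> log() { return log; }

    // Runs the body from the last suspend point up to the next one (or the end).
    void run() {
        if (done) {
            throw new IllegalStateException("continuation already finished");
        }
        switch (resumePoint) {
            case 0 -> {
                log.add("step 1: before yield");
                resumePoint = 1; // "yield": remember where to continue, then return
            }
            case 1 -> {
                log.add("step 2: after yield");
                done = true;     // body finished, nothing left to resume
            }
            default -> throw new IllegalStateException("unknown resume point");
        }
    }

    public static void main(String[] args) {
        ToyContinuation c = new ToyContinuation();
        int iteration = 0;
        while (!c.isDone()) {
            iteration++;
            c.run();
            System.out.println("iteration " + iteration + " -> " + c.log());
        }
    }
}
```

The saved resumePoint lives in an ordinary heap object between calls, which is the same idea as a suspended continuation being parked on the heap. A real continuation saves the actual stack frames instead of a hand-maintained counter.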

We will understand the continuation / coroutine using an example. 

Output of the above code


In the first iteration, you can see that execution runs up to line # 10 [Continuation.yield(scope);]; the code below that line is then parked on the heap for future execution.
In the second iteration, you can see that execution resumes after line # 10 [Continuation.yield(scope);]: the parked code is taken back from the heap and executed.

How to create a virtual thread?
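
As a minimal sketch, assuming JDK 21 or later (on JDK 19 and 20 the same calls need --enable-preview), a virtual thread can be created through the Thread.ofVirtual() builder or the Thread.startVirtualThread shortcut. The class and thread names below are made up for illustration.

```java
class CreateVirtualThreads {

    // Creates virtual threads in two ways, waits for them,
    // and returns how many of them report isVirtual().
    static int startAndCount() {
        // 1) Builder: create unstarted, then call start() explicitly.
        Thread viaBuilder = Thread.ofVirtual()
                .name("demo-virtual-1") // hypothetical name, illustration only
                .unstarted(() -> System.out.println("hello from " + Thread.currentThread()));
        viaBuilder.start();

        // 2) Shortcut: create and start in one call.
        Thread viaShortcut = Thread.startVirtualThread(
                () -> System.out.println("hello from " + Thread.currentThread()));

        int virtualCount = 0;
        for (Thread t : new Thread[] {viaBuilder, viaShortcut}) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (t.isVirtual()) {
                virtualCount++;
            }
        }
        return virtualCount;
    }

    public static void main(String[] args) {
        System.out.println("virtual threads created: " + startAndCount());
    }
}
```

Both objects are plain java.lang.Thread instances, so existing code that accepts a Thread keeps working.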


Terminologies we will be using in virtual thread

  • The JDK assigning a virtual thread to a platform thread is called MOUNTING.
  • The JDK unassigning a virtual thread from its platform thread is called UNMOUNTING.
  • The platform thread that is running the virtual thread is called its CARRIER THREAD.
  • When, for some reason, a blocked virtual thread cannot be unmounted and therefore blocks its platform thread as well, this is called PINNING of the Operating System (OS) thread.

How does a 'virtual thread' work? 

  1. JDK creates a ForkJoinPool executor. 

  2. JDK then creates a continuation object. 

  3. Finally, invoking the start() method on the virtual thread schedules its execution.

    • Assign the virtual thread to a carrier thread. 
    • Inherit the ExtentLocal bindings for the given carrier thread.
    • Submit the task to the scheduler.

Example to understand virtual thread scheduler


Output
Try to understand the code first. In this code we simply create two virtual threads [code line # 9 to 29] and start them in line # 31. To observe the scheduler, we make each virtual thread sleep for 1 sec in line # 18.

Scheduler 

VirtualThread[#21]/runnable@ForkJoinPool-1-worker-1 ==> The virtual thread started the task on this carrier thread

After coming back from sleep


VirtualThread[#21]/runnable@ForkJoinPool-1-worker-2 ==> The same virtual thread (#21) completed the task, but on a different carrier thread
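
The behaviour described above can be observed with a sketch like the following, which records the carrier name before and after a sleep. It relies on the layout of a virtual thread's toString() (the part after '@'), which is an implementation detail and therefore an assumption. The names are hypothetical and JDK 21 or later is assumed.

```java
import java.time.Duration;

class CarrierObserver {

    // Extracts the carrier name from a virtual thread's toString(), e.g.
    // "VirtualThread[#21]/runnable@ForkJoinPool-1-worker-1" -> "ForkJoinPool-1-worker-1".
    // Assumption: the text after the last '@' is the carrier thread name.
    static String carrierOf(Thread t) {
        String s = t.toString();
        int at = s.lastIndexOf('@');
        return at >= 0 ? s.substring(at + 1) : "unknown";
    }

    // Runs one virtual thread and records its carrier before and after sleeping.
    static String[] carriersAroundSleep(Duration pause) {
        String[] carriers = new String[2];
        Thread vt = Thread.startVirtualThread(() -> {
            carriers[0] = carrierOf(Thread.currentThread());
            try {
                Thread.sleep(pause); // the virtual thread unmounts while it sleeps
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            carriers[1] = carrierOf(Thread.currentThread());
        });
        try {
            vt.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return carriers;
    }

    public static void main(String[] args) {
        String[] c = carriersAroundSleep(Duration.ofSeconds(1));
        System.out.println("before sleep: " + c[0]);
        System.out.println("after sleep:  " + c[1]);
    }
}
```

The two names may or may not differ on a given run: after the sleep, the scheduler is free to mount the virtual thread on any available carrier, including the one it used before.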


Let's put scaling to the test for virtual thread 


NOTE: Only line # 8 was changed to generate platform threads.


Statistic of Platform Thread


Statistic of Virtual Thread

Let's analyse the statistics captured for the same code with the two types of threads.
  • With virtual threads, heap memory usage is higher than with platform threads, because a virtual thread that gets blocked puts its continuation object on the heap.
    • [virtual thread heap memory usage > platform thread heap memory usage]
  • Only one extra class gets loaded for virtual threads.
    • [# of classes loaded with virtual threads > # of classes loaded with platform threads]
  • More than three thousand platform threads were created and processed, whereas only twenty-nine threads were needed in the virtual thread case. Because virtual threads run on the ForkJoinPool scheduler, they do not require so many platform threads to complete the execution.
    • [# of platform threads > # of threads managing the virtual threads]
  • Platform threads consumed CPU more consistently than virtual threads.
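
A comparison along these lines could be sketched as follows. This is not the code behind the statistics above. It runs the same sleeping task once with one platform thread per task and once with one virtual thread per task, and counts how many distinct OS-backed threads did the work. The names are hypothetical, the carrier name is parsed from toString() (an implementation detail), and JDK 21 or later is assumed.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ScalingSketch {

    // Runs `tasks` sleeping tasks, one thread per task, and returns the number
    // of distinct OS-backed threads (platform threads or carriers) that ran them.
    static int distinctOsThreads(int tasks, boolean virtual) {
        Set<String> osThreads = ConcurrentHashMap.newKeySet();
        ExecutorService executor = virtual
                ? Executors.newVirtualThreadPerTaskExecutor()
                : Executors.newThreadPerTaskExecutor(Thread.ofPlatform().factory());
        try (executor) {
            for (int i = 0; i < tasks; i++) {
                executor.execute(() -> {
                    osThreads.add(osThreadName());
                    pause();
                    osThreads.add(osThreadName());
                });
            }
        } // close() waits for all submitted tasks to finish
        return osThreads.size();
    }

    // Platform thread: its own id. Virtual thread: the carrier name parsed from
    // toString() (assumption: the text after the last '@' is the carrier name).
    static String osThreadName() {
        Thread t = Thread.currentThread();
        if (!t.isVirtual()) {
            return "platform#" + t.threadId();
        }
        String s = t.toString();
        int at = s.lastIndexOf('@');
        return at >= 0 ? s.substring(at + 1) : "unknown";
    }

    private static void pause() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("platform threads used: " + distinctOsThreads(1_000, false));
        System.out.println("carrier threads used:  " + distinctOsThreads(1_000, true));
    }
}
```

With platform threads the count equals the number of tasks. With virtual threads it is expected to stay close to the number of available processors, although the exact figure depends on the machine.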

Pinning

We learned about pinning in the terminology section above: the virtual thread gets blocked while it is still mounted, so its carrier platform thread, and the Operating System (OS) thread under it, is blocked as well.

What are the reasons for pinning?

Reason for pinning

A virtual thread can block its platform thread during a native call, while a monitor is held (wait, notify, notifyAll), and while executing a critical section (a synchronized block).

It is recommended to use the java.util.concurrent.locks.* package API instead of a synchronized block.

Let's understand this with an example. In this example, the virtual thread with counter value zero goes through a synchronized block, and the virtual thread with counter value three goes through a ReentrantLock.



Output 
Pinning example

From the above output we can derive that, in the case of the synchronized block, the virtual thread ran on the same carrier thread before and after the critical section (methodVirtualThread[#21]/runnable@ForkJoinPool-1-worker-1). The continuation did not come into play here, because the virtual thread pinned its carrier thread.

In the case of the ReentrantLock, the virtual thread was running on one carrier before the lock (methodVirtualThread[#24]/runnable@ForkJoinPool-1-worker-2), and after the unlock the same virtual thread completed the rest of the task on a different carrier (methodVirtualThread[#24]/runnable@ForkJoinPool-1-worker-1). The continuation worked here, because this virtual thread did not block its carrier thread.
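
The comparison can be sketched as below: one virtual thread sleeps inside a synchronized block, another sleeps while holding a ReentrantLock, and each records its carrier before and after. This is an illustration under assumptions, not the original example. The names are hypothetical, the carrier name is parsed from toString(), and JDK 21 or later is assumed. From JDK 24 onward (JEP 491), synchronized no longer pins, so the two cases may behave alike there.

```java
import java.util.concurrent.locks.ReentrantLock;

class PinningSketch {
    private static final Object MONITOR = new Object();
    private static final ReentrantLock LOCK = new ReentrantLock();

    // Assumption: the text after the last '@' in toString() is the carrier name.
    static String carrier() {
        String s = Thread.currentThread().toString();
        int at = s.lastIndexOf('@');
        return at >= 0 ? s.substring(at + 1) : "unknown";
    }

    // Sleeps inside a critical section on a virtual thread and returns
    // the carrier name seen before and after that section.
    static String[] sleepInCriticalSection(boolean useSynchronized) {
        String[] carriers = new String[2];
        Thread vt = Thread.startVirtualThread(() -> {
            carriers[0] = carrier();
            if (useSynchronized) {
                synchronized (MONITOR) { // may pin the carrier while blocked
                    pause();
                }
            } else {
                LOCK.lock();             // lets the virtual thread unmount while blocked
                try {
                    pause();
                } finally {
                    LOCK.unlock();
                }
            }
            carriers[1] = carrier();
        });
        try {
            vt.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return carriers;
    }

    private static void pause() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        String[] sync = sleepInCriticalSection(true);
        String[] lock = sleepInCriticalSection(false);
        System.out.println("synchronized:  " + sync[0] + " -> " + sync[1]);
        System.out.println("ReentrantLock: " + lock[0] + " -> " + lock[1]);
    }
}
```

With a single virtual thread the carrier often stays the same in both cases, simply because no other work competes for it; the difference shows more clearly when many virtual threads run at once.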

Some best practices to follow for virtual threads

  • Don't pool virtual thread objects.
  • Revisit synchronized code for better scalability.
  • Don't use ThreadLocal; use extent-local variables instead.
Github link to download the code.
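
The first practice above, not pooling virtual threads, can be sketched with an executor that creates a fresh virtual thread for every task instead of reusing a fixed set. The class name and the squaring task are made up for illustration, and JDK 21 or later is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerTaskExecutorSketch {

    // Submits one task per input; the executor starts a new virtual thread
    // for each task rather than drawing threads from a pool.
    static List<Integer> squareAll(List<Integer> inputs) {
        List<Future<Integer>> futures = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int n : inputs) {
                futures.add(executor.submit(() -> n * n));
            }
        } // close() waits until every task has completed

        List<Integer> results = new ArrayList<>();
        for (Future<Integer> f : futures) {
            try {
                results.add(f.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(squareAll(List.of(1, 2, 3)));
    }
}
```

Virtual threads are cheap to create, so there is nothing to gain from reusing them; a new one per task keeps the code simple.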