Inside Java Virtual Threads: Architecture, Scheduling, and Performance

A deep technical dive into Project Loom, carrier threads, continuations, and how virtual threads reshape Java concurrency

Virtual threads (Project Loom) represent a fundamental shift in how Java handles concurrent workloads. Rather than a simple API change, they introduce a new execution model built on continuations and cooperative scheduling. This article explores the internals: what virtual threads are, how they interact with the JVM scheduler, their performance implications, and where they fit alongside reactive and async patterns.


1. Platform Threads vs. Virtual Threads

Platform Threads (1:1 Model)

Traditionally, Java threads map directly to OS threads (1:1 model). Each java.lang.Thread:

  • Allocates 1–2 MB of stack memory (OS-dependent)
  • Has its own kernel context and registers
  • Carries full scheduling responsibility to the OS kernel
  • Limits practical concurrency to thousands (before memory/scheduler exhaustion)

For I/O-bound workloads (network requests, database queries), most platform threads spend time blocked, wasting OS resources.

Virtual Threads (M:N Model)

Virtual threads are lightweight, user-space threads managed by the JVM. Key properties:

  • Minimal memory footprint (initial stack on the order of a kilobyte, growing on demand, vs. 1+ MB reserved up front)
  • Scheduled on a limited pool of carrier threads (OS threads)
  • Automatically suspended when blocking on I/O, locks, or sleep
  • Can scale to millions of concurrent instances on modest hardware

Virtual threads don’t replace platform threads; they’re a higher-level abstraction sitting on top of them.
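
The distinction is visible directly in the API. A minimal sketch using the thread builders available since Java 21 (`Thread.ofPlatform()` / `Thread.ofVirtual()`):

```java
// Minimal sketch (Java 21+): creating a platform thread vs. a virtual thread
public class ThreadKinds {
    public static void main(String[] args) throws InterruptedException {
        Thread platform = Thread.ofPlatform()
                .name("os-thread")
                .start(() -> System.out.println("platform, isVirtual="
                        + Thread.currentThread().isVirtual()));   // false: backed 1:1 by an OS thread

        Thread virtual = Thread.ofVirtual()
                .name("vthread")
                .start(() -> System.out.println("virtual, isVirtual="
                        + Thread.currentThread().isVirtual()));   // true: scheduled by the JVM

        platform.join();
        virtual.join();
    }
}
```

Both produce a `java.lang.Thread`, so the rest of the API (join, interrupt, names) is identical; only the execution model underneath differs.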


2. Carrier Threads and the JVM Scheduler

Carrier Thread Architecture

A carrier thread is a platform thread that executes virtual thread code. The JVM maintains:

  • Default carrier pool: a dedicated ForkJoinPool, sized by default to the number of available processors (tunable via the jdk.virtualThreadScheduler.parallelism system property)
  • Virtual thread scheduler: decides which virtual thread runs on which carrier thread

Virtual Thread 1 ─┐
Virtual Thread 2 ─┤── Carrier Thread A
Virtual Thread 3 ─┘

Virtual Thread 4 ─┐
Virtual Thread 5 ─┤── Carrier Thread B
Virtual Thread 6 ─┘

Scheduling Model

The JVM scheduler:

  1. Mounts a virtual thread onto a carrier thread (runs it)
  2. Suspends it when it hits a blocking point (via continuations)
  3. Unmounts it and schedules another virtual thread on that carrier
  4. Later, resumes the virtual thread when the blocking operation completes

This is non-preemptive, cooperative scheduling: a virtual thread yields control voluntarily, not forced by the OS.
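
The mount/unmount cycle can be observed directly: a virtual thread's toString() includes the carrier it is currently mounted on, and that carrier may change across a suspension point. A sketch (the toString format is a JDK implementation detail, not a stable API):

```java
// Sketch (Java 21+): a virtual thread may resume on a different carrier after blocking
public class CarrierDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread.ofVirtual().start(() -> {
            System.out.println("before: " + Thread.currentThread()); // shows the current carrier
            try {
                Thread.sleep(100);        // suspension point: the thread unmounts here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("after:  " + Thread.currentThread()); // carrier worker may differ
        }).join();
    }
}
```

On a busy system the two lines often name different ForkJoinPool workers, making the unmount/remount visible.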


3. Continuations: The Mechanical Heart

Virtual threads rely on continuations — a mechanism to pause and resume execution mid-method without unwinding the stack.

What is a Continuation?

A continuation captures the entire execution state:

  • Local variables and method parameters
  • Call stack frames
  • Program counter (instruction pointer)

When a blocking operation occurs (e.g., Socket.read()), the JVM:

  1. Saves the continuation state
  2. Suspends the virtual thread
  3. Releases the carrier thread to run another virtual thread
  4. Later, when I/O completes, resumes the saved continuation

Example: Under the Hood

var executor = Executors.newVirtualThreadPerTaskExecutor();
executor.submit(() -> {
    System.out.println("Start");                  // virtual thread mounted
    int data = socket.getInputStream().read();    // BLOCKING CALL: thread unmounts
    System.out.println(data);                     // resumed later, possibly on another carrier
    return null;                                  // Callable form lets the IOException propagate
});

Timeline:

  1. Virtual thread starts; mounted on carrier thread A
  2. The read call blocks waiting for network data
  3. JVM captures continuation and unmounts virtual thread
  4. Carrier thread A is freed; another virtual thread mounts
  5. Network data arrives; OS wakes the JVM’s I/O handler
  6. JVM resumes the first virtual thread’s continuation on any available carrier thread
  7. Execution continues after the read returns (transparently to the application)

4. Pinning: The Hidden Gotcha

What is Pinning?

Pinning occurs when a virtual thread cannot be suspended and unmounted from its carrier thread, effectively blocking the carrier. This ruins the scalability benefit.

When Does Pinning Happen?

  1. Synchronized blocks or methods
    synchronized(lock) {
        socket.read(); // Virtual thread PINNED; cannot unmount
    }
    

    The JVM cannot unmount a virtual thread while holding a monitor lock (Java’s synchronized).

  2. Calling native code via JNI that blocks
    nativeBlockingCall(); // Pinned while native code runs
    
  3. Object.wait() and other monitor-based waiting
    lock.wait(); // Same monitor limitation as synchronized
    

(Note: despite a common misconception, plain ThreadLocal access does not pin; it is merely expensive to carry per-thread state across millions of virtual threads.)

Why is Pinning Bad?

If many virtual threads pin simultaneously, the carrier threads are exhausted, and queued virtual threads stall.

Virtual Thread 1 (pinned) ─┐
Virtual Thread 2 (pinned) ─┤── Only 2 carriers total
Virtual Thread 3 (blocked) ─ waiting for a carrier!
Virtual Thread 4 (blocked) ─ waiting for a carrier!

Avoiding Pinning

  • Replace synchronized with ReentrantLock (does not pin)
  • Use StampedLock or ReadWriteLock for fine-grained control
  • Keep native code execution short or avoid blocking in JNI
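
Swapping synchronized for ReentrantLock is usually mechanical. A sketch, assuming a shared counter guarded during a blocking operation (in Java 21–23 the -Djdk.tracePinnedThreads=full flag can report any pinning that remains):

```java
import java.util.concurrent.Executors;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a lock that lets the carrier go. While a virtual thread waits in
// lock.lock() or blocks inside the critical section, it unmounts instead of
// pinning its carrier the way a synchronized block would.
public class LockDemo {
    private static final ReentrantLock lock = new ReentrantLock();
    private static int counter = 0;

    static void update() throws InterruptedException {
        lock.lock();                 // suspends (does not pin) if contended
        try {
            Thread.sleep(1);         // stand-in for blocking I/O under the lock
            counter++;
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 100; i++) {
                executor.submit(() -> { update(); return null; });
            }
        } // close() waits for all submitted tasks
        System.out.println("counter=" + counter); // prints counter=100
    }
}
```

The structure is the same as a synchronized block; only the lock/unlock calls are explicit, with unlock() in a finally to preserve exception safety.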

5. Blocking Calls, I/O, and Monitor Interaction

Blocking Operations That Suspend (Don’t Pin)

Virtual threads are suspended (unmounted) on:

  • Socket I/O: Socket.read(), write()
  • File I/O: FileInputStream.read(), FileOutputStream.write() (may not unmount on all platforms; the scheduler compensates by temporarily growing the carrier pool)
  • Thread.sleep()
  • Lock.lock() (ReentrantLock, not synchronized)
  • Coordination primitives: CountDownLatch.await(), Semaphore.acquire()
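
Because these primitives suspend rather than pin, they are the idiomatic way to throttle virtual threads: don't pool them, bound them with a Semaphore. A sketch, assuming a scarce resource that should see at most 2 callers at once:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: Semaphore.acquire() suspends a virtual thread and frees its carrier,
// so many threads can cheaply wait on a small number of permits.
public class ThrottleDemo {
    private static final Semaphore permits = new Semaphore(2);
    private static final AtomicInteger inFlight = new AtomicInteger();
    private static final AtomicInteger maxObserved = new AtomicInteger();

    static void limitedCall() throws InterruptedException {
        permits.acquire();                        // suspends if no permit is available
        try {
            int now = inFlight.incrementAndGet();
            maxObserved.accumulateAndGet(now, Math::max);
            Thread.sleep(5);                      // stand-in for a rate-limited API call
            inFlight.decrementAndGet();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 50; i++) {
                executor.submit(() -> { limitedCall(); return null; });
            }
        } // close() waits for every task
        System.out.println("bounded = " + (maxObserved.get() <= 2)); // prints bounded = true
    }
}
```

Fifty virtual threads are created, but never more than two are inside the guarded section; the other forty-eight are suspended, not occupying carriers.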

Interaction with java.lang.Thread.currentThread()

var vt = Thread.ofVirtual().start(() -> {
    System.out.println(Thread.currentThread().isVirtual()); // true
});

Existing APIs that call currentThread() work unchanged; the virtual thread identity is preserved across suspensions.

Interaction with Exception Handling

Stack traces and exception handling remain unchanged from the application’s perspective:

try {
    socket.read(); // Virtual thread suspended here
} catch (IOException e) {
    e.printStackTrace(); // Shows correct virtual thread stack
}
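
This can be checked end to end: an exception thrown after a suspension point still carries the full logical stack. A sketch:

```java
// Sketch (Java 21+): stack traces survive suspension points
public class StackTraceDemo {
    static void doWork() throws Exception {
        Thread.sleep(10);                          // suspension point: the thread unmounts here
        throw new IllegalStateException("boom");   // thrown after remounting
    }

    public static void main(String[] args) throws Exception {
        Thread.ofVirtual().start(() -> {
            try {
                doWork();
            } catch (Exception e) {
                // The trace still shows doWork and the lambda frame,
                // as if the thread had never been unmounted
                for (var frame : e.getStackTrace()) {
                    System.out.println(frame);
                }
            }
        }).join();
    }
}
```

The printed frames include doWork even though the continuation was captured, unmounted, and resumed in between, which is exactly the debuggability advantage over callback-based code.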

6. Performance Characteristics and Limits

Memory and CPU Overhead

  Metric                 Platform Thread        Virtual Thread
  ---------------------  ---------------------  -------------------------------
  Memory per thread      ~1–2 MB (reserved)     ~1 KB initial (grows on demand)
  Max scalable threads   ~10K–50K               1M+
  Creation overhead      High (~1 µs)           Very low (~100 ns)
  Context-switch cost    High (kernel)          Low (JVM)
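
The scalability row is easy to verify with an experiment in the style of the Loom JEPs: spawn a crowd of sleeping virtual threads and let the executor wait for them all (a sketch; the count is kept at 10,000 so it runs quickly, but 1,000,000 works on ordinary hardware):

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.IntStream;

// Sketch (Java 21+): 10,000 concurrent sleeping virtual threads on a handful
// of carriers. With 1 MB platform-thread stacks this would reserve ~10 GB.
public class ScaleDemo {
    public static void main(String[] args) {
        AtomicInteger completed = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            IntStream.range(0, 10_000).forEach(i ->
                executor.submit(() -> {
                    Thread.sleep(Duration.ofMillis(100)); // all 10,000 block at once
                    completed.incrementAndGet();
                    return i;
                }));
        } // close() waits for every task to finish
        System.out.println("completed = " + completed.get()); // prints completed = 10000
    }
}
```

The whole run takes on the order of a second: every thread sleeps concurrently, because sleeping unmounts it rather than holding a carrier.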

Throughput Example: Echo Server

Platform Threads (thread-per-request):

while (true) {
    Socket client = server.accept();
    new Thread(() -> handleClient(client)).start(); // New thread per request
}

Scales to ~1,000 concurrent connections before resource exhaustion.

Virtual Threads (virtual thread per request):

var executor = Executors.newVirtualThreadPerTaskExecutor();
while (true) {
    Socket client = server.accept();
    executor.submit(() -> handleClient(client)); // Virtual thread per request
}

Scales to 100,000+ concurrent connections on the same hardware.

Latency and Tail Latency

Virtual threads reduce tail latency for I/O-bound applications:

  • No queue-head-of-line blocking (scheduler moves to next unblocked thread)
  • Lower context-switch overhead
  • Better cache locality (virtual threads on same carrier share CPU cache)

7. Comparison with Async/Reactive Models

Reactive Approach (Vert.x, Project Reactor, RxJava)

httpClient.get("/api")
    .thenCompose(resp -> resp.bodyAsStream())
    .thenAccept(body -> process(body))
    .exceptionally(e -> {
        log.error("Failed", e);
        return null;
    });

Pros:

  • No OS thread allocation per request
  • Efficient for high-concurrency scenarios
  • Forced non-blocking discipline

Cons:

  • Complex, callback-heavy code
  • Hard to debug (stack traces fragmented)
  • Difficult error handling
  • Requires async-aware libraries

Virtual Threads Approach

try {
    var resp = httpClient.get("/api"); // ordinary blocking call; the virtual thread suspends, not the carrier
    process(resp.bodyAsStream());
} catch (IOException e) {
    log.error("Failed", e);
}

Pros:

  • Sequential, imperative code (easier to reason about)
  • Standard exception handling
  • Works with blocking libraries (no rewrites needed)
  • Better debuggability

Cons:

  • Still requires pinning-aware design
  • Overhead vs. raw reactive (though small)
  • Not suitable for CPU-bound workloads

When to Use Each

  Scenario                                      Platform     Virtual         Reactive
  --------------------------------------------  -----------  --------------  ---------------
  Small I/O-heavy service (~100 req/s)          ✅            ✅               Maybe overkill
  High-concurrency I/O (1M+ open connections)                ✅               ✅
  CPU-bound or batch processing                 ✅ (pool)
  Complex logic with multiple async stages                   ✅ (with care)   ✅
  Legacy code migration                                      ✅

8. Current Limitations and Future Evolution

Current Limitations (Java 21–23)

  1. Pinning with synchronized
    • Virtual threads pin when holding monitors
    • Workaround: use ReentrantLock (though future improvements may help)
  2. Debugging and tooling
    • IDE and profiler support improving but not yet complete
    • Large numbers of virtual threads can overwhelm traditional debuggers
  3. Native code integration
    • JNI code that blocks or uses thread-locals can cause pinning
    • Need careful design for C++ interop
  4. Virtual thread aware libraries
    • Not all libraries optimize for virtual threads yet
    • Thread pool sizes may not adapt automatically
  5. Kernel support variance
    • Different OS I/O models (epoll, kqueue, io_uring) have different efficiency

Future Evolution

Planned improvements:

  • Loom enhancements: better monitor lock handling, reducing pinning
  • Scoped values: thread-local replacement without pinning risk
  • Structured concurrency: APIs to manage task hierarchies (already in preview)
  • Foreign Function & Memory API: safer JNI replacement
  • Thread-local optimizations: avoid pinning in more scenarios

9. Practical Example: A High-Concurrency HTTP Server

import com.sun.net.httpserver.*;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.concurrent.Executors;

public class HighConcurrencyServer {
    public static void main(String[] args) throws IOException {
        HttpServer server = HttpServer.create(
            new InetSocketAddress(8080), 
            128
        );

        // Virtual thread per request executor
        var executor = Executors.newVirtualThreadPerTaskExecutor();
        server.setExecutor(executor);

        server.createContext("/api/data", exchange -> {
            try {
                // Simulate I/O (database query, API call)
                Thread.sleep(100);
                
                String response = "{ \"status\": \"ok\" }";
                byte[] body = response.getBytes();
                exchange.getResponseHeaders().set("Content-Type", "application/json");
                exchange.sendResponseHeaders(200, body.length); // byte length, not char count
                exchange.getResponseBody().write(body);
                exchange.close();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                exchange.sendResponseHeaders(500, -1); // -1: no response body
                exchange.close();
            }
        });

        server.start();
        System.out.println("Server running on http://localhost:8080");
        System.out.println("Handling requests with virtual threads...");
    }
}

Why this scales:

  • Thousands of concurrent requests → thousands of virtual threads
  • Each request’s I/O suspension unmounts the virtual thread
  • Small pool of carrier threads handles all virtual threads
  • No memory explosion, clean code

10. Conclusion

Virtual threads represent Java’s answer to the scalability challenges of the blocking model without sacrificing code readability. By leveraging continuations and cooperative scheduling, they enable millions of lightweight, concurrent tasks on modest hardware.

Key takeaways:

  1. Virtual threads are suspended (not blocked) on I/O, freeing carrier threads
  2. Pinning is the primary gotcha; use ReentrantLock instead of synchronized
  3. They excel for I/O-bound, high-concurrency workloads
  4. Existing blocking libraries work unchanged; no callback rewrite needed
  5. They’re complementary to, not replacements for, reactive approaches

Virtual threads ensure that Java remains relevant and competitive in the era of high-concurrency, distributed systems. Java will not only survive — it will evolve.

