Java Concurrency: From Banking Race Conditions to Designing Systems for 10,000 TPS


Hello! If you are preparing for a Senior Java Developer or Tech Lead position, few topics are as “hot” or come up in interviews as often as Java Concurrency. It’s not just about whether you can write synchronized or create a Thread; it’s the story of how you design a financial system that doesn’t lose money, how you speed up an API that’s as slow as a snail, and how you prevent an entire microservices cluster from collapsing because of a single stuck thread.

In this post, I’ll walk you through a real-world banking case study, dissect the internal mechanisms, provide “production-ready” solutions, and finally, offer an interview framework to help you navigate technical rounds with confidence. Let’s dive in!

1. When Customers Lose Money: The “Duplicate Transaction” Problem

Imagine you are a backend engineer at X Bank. On the morning of March 15th, the system raises an alert: the same 50 million VND transfer has been processed twice for one customer. The most severe kind of error in finance has occurred.

The technical team immediately traces the logs and discovers a classic scenario:

10:23:45.001 - Thread-A: BEGIN transfer accountId=12345, amount=50,000,000
10:23:45.002 - Thread-B: BEGIN transfer accountId=12345, amount=50,000,000 ← Mobile app auto-retries due to lag
10:23:45.010 - Thread-A: SELECT balance = 200,000,000
10:23:45.011 - Thread-B: SELECT balance = 200,000,000 ← Both read the balance before anyone writes
10:23:45.050 - Thread-A: UPDATE balance = 150,000,000 → COMMIT
10:23:45.051 - Thread-B: UPDATE balance = 150,000,000 → COMMIT ← Overwrites Thread-A's write: two transfers, one debit. 50 MILLION LOST!

Before diving into the solution, ask yourself three key questions:

Is this a Race Condition or a Deadlock? What is the core difference?
If you only use synchronized in Java, will the problem be solved?
If the system runs on 3 parallel instances (horizontal scaling), is synchronized still effective?

(Hint: the answers are in sections 2.2 and 3.2, but try to answer them yourself before scrolling down.)

2. Mental Model: What’s Really Happening Under the Hood?

To handle concurrency professionally, you need a proper mental model of how the JVM and OS interact.

2.1. Threads Are Not “Units of Work”

A thread is the unit the operating system scheduler works with, not a unit of business work. The problems arise from how memory is shared within a Java process:

JVM Process
├── Heap (Shared Memory — THE ROOT OF ALL EVIL)
│   ├── Object instances (e.g., Account object)
│   └── Static fields
└── Per-Thread (Private Memory — SAFE BY DEFAULT)
    ├── Stack (Local variables, method calls)
    ├── Program Counter
    └── Native Stack

Real-world analogy: The Heap is like the bank’s shared vault. The Stack is like the private wallet in each employee’s pocket. A Race Condition occurs when two employees open the vault at the same time, both see 10 billion inside, and both withdraw 5 billion without telling each other.

2.2. Race Condition vs. Deadlock: The Invisible Thief and the Open Robber

These two concepts are often confused. Let’s distinguish them:

Criteria	Race Condition (Invisible Thief)	Deadlock (Open Robber)
Definition	Multiple threads access shared data; the result depends on luck (CPU timing).	Threads hold resources and wait for each other forever.
Consequence	Data Corruption (Wrong balance, double charge). Very hard to detect.	System Freeze (App hangs, timeouts).
Detection	Intermittent bugs, hard to reproduce. Requires code review or static analysis.	Thread Dumps show `BLOCKED` or `WAITING` states.
Banking Example	2 threads read `balance = 100`, each subtracts `20` and writes `80`. The correct result is `60`, so one debit of `20` is lost.	Thread A holds lock on Account X, waits for Y. Thread B holds Y, waits for X.
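
The deadlock in the last row goes away once every thread takes its locks in the same global order. Here is a minimal in-memory sketch, assuming a hypothetical `OrderedAccount` class (the class and method names are illustrative and not taken from any library):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical in-memory account, used only for this illustration.
class OrderedAccount {
    final long id;
    private long balance;

    OrderedAccount(long id, long balance) {
        this.id = id;
        this.balance = balance;
    }

    long balance() {
        synchronized (this) {
            return balance;
        }
    }

    // Always lock the account with the smaller id first, whatever the direction.
    static void transfer(OrderedAccount from, OrderedAccount to, long amount) {
        OrderedAccount first = from.id < to.id ? from : to;
        OrderedAccount second = from.id < to.id ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Runs X->Y and Y->X transfers concurrently; true if both workers finished in time.
    static boolean runOpposingTransfers(OrderedAccount x, OrderedAccount y, int rounds) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Runnable xToY = () -> { for (int i = 0; i < rounds; i++) transfer(x, y, 1); };
        Runnable yToX = () -> { for (int i = 0; i < rounds; i++) transfer(y, x, 1); };
        pool.execute(xToY);
        pool.execute(yToX);
        pool.shutdown();
        try {
            return pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

If `transfer` locked `from` first and `to` second instead, the two workers could each grab one lock and wait for the other, which is exactly the Thread A / Thread B situation in the table.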

Conditions for a Race Condition to occur:

Shared Mutable Data: Changeable data shared by many.
Multiple Threads: Two or more threads.
At Least One Writer: At least one thread performing a write operation.

Interview Tip: To eliminate a Race Condition, you only need to break one of these three conditions.
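
To make this concrete, here is a small sketch in which all three conditions are present. The `LostUpdateDemo` class is hypothetical, and the `CyclicBarrier` is there only to force the unlucky interleaving from the log in section 1, so the bug shows up on every run rather than once in a while:

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

class LostUpdateDemo {
    private long balance = 200; // shared mutable data

    // Unsafe read-modify-write. The barrier makes both threads finish
    // reading before either one writes.
    private void unsafeDebit(long amount, CyclicBarrier bothHaveRead) {
        long seen = balance;          // read
        awaitQuietly(bothHaveRead);
        balance = seen - amount;      // write based on a stale read
    }

    // Serialized read-modify-write: only one thread at a time.
    private synchronized void safeDebit(long amount) {
        balance -= amount;
    }

    static long runUnsafe() {
        LostUpdateDemo account = new LostUpdateDemo();
        CyclicBarrier barrier = new CyclicBarrier(2);
        runOnTwoThreads(() -> account.unsafeDebit(50, barrier));
        return account.balance;
    }

    static long runSafe() {
        LostUpdateDemo account = new LostUpdateDemo();
        runOnTwoThreads(() -> account.safeDebit(50));
        return account.balance;
    }

    private static void runOnTwoThreads(Runnable task) {
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void awaitQuietly(CyclicBarrier barrier) {
        try {
            barrier.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

`LostUpdateDemo.runUnsafe()` returns 150 (two debits of 50, only one recorded), while `LostUpdateDemo.runSafe()` returns 100.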

2.3. Thread Pool: Don’t Just Use `Executors` and Leave It

ThreadPoolExecutor is the heart of any high-performance Java application. Its mechanism is not simple:

[ New Task ]
     │
     ▼
[ Fewer than corePoolSize threads? ] ── yes ──► [ Start a new core thread for the task ]
     │ no
     ▼
[ Queue has space? ] ── yes ──► [ Wait in Queue ]
     │ no
     ▼
[ Fewer than maxPoolSize threads? ] ── yes ──► [ Create thread up to maxPoolSize ]
     │ no
     ▼
[ Rejected Handler ]

The 4 Golden Parameters of ThreadPoolExecutor:

corePoolSize: “Permanent” soldiers always ready for duty.
maxPoolSize: Maximum soldiers that can be mobilized during overload (maximumPoolSize in the JDK constructor). Extra soldiers are called up only when the queue is full.
keepAliveTime: How long “temporary” soldiers (above core) stay idle before being dismissed.
workQueue: The type of queue used. An unbounded queue is where OOM traps are born.
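
The order in which these parameters kick in is easy to observe with a deliberately tiny pool. The sketch below is a hypothetical demo class (not production configuration) that submits four blocking tasks to a pool with core=1, max=2 and a queue of 1, and logs what happened to each:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolFlowDemo {
    // Submits four blocking tasks to a tiny pool (core=1, max=2, queue=1)
    // and records how the pool handled each one.
    static List<String> run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until the demo ends
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<String> log = new ArrayList<>();
        try {
            for (int task = 1; task <= 4; task++) {
                try {
                    pool.execute(blocker);
                    log.add("task" + task + ": threads=" + pool.getPoolSize()
                            + " queued=" + pool.getQueue().size());
                } catch (RejectedExecutionException e) {
                    // Default handler (AbortPolicy) throws when pool and queue are full
                    log.add("task" + task + ": rejected");
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return log;
    }
}
```

`PoolFlowDemo.run()` returns `task1: threads=1 queued=0`, `task2: threads=1 queued=1`, `task3: threads=2 queued=1`, `task4: rejected`. Note that the second thread appears only after the queue is full, which is why a huge queue effectively disables maxPoolSize.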

2.4. CompletableFuture: From Synchronous to Reactive

CompletableFuture is the “secret weapon” for asynchronous programming in Java. It turns sequential API calls (total time is the sum of all calls) into parallel calls (total time is roughly that of the slowest call).

Remember this vital difference:

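thenApply(): Runs on whichever thread completes the previous stage, or on the calling thread if that stage has already finished.
thenApplyAsync(): Always submits the callback to an executor (ForkJoinPool.commonPool() by default, or the one you pass in). This adds a thread hand-off cost but keeps the callback off the calling thread.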
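
You can check which thread runs a callback with a few lines of plain JDK code. This is a small sketch; the `StageThreadDemo` class and the `callback-pool-1` thread name are made up for the illustration:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class StageThreadDemo {
    // Name of the thread that runs a thenApply callback attached
    // to a future that is already complete.
    static String syncCallbackThread() {
        return CompletableFuture.completedFuture("done")
                .thenApply(value -> Thread.currentThread().getName())
                .join();
    }

    // Name of the thread that runs a thenApplyAsync callback when
    // an explicit executor is supplied.
    static String asyncCallbackThread() {
        ExecutorService pool = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, "callback-pool-1");
            thread.setDaemon(true);
            return thread;
        });
        try {
            return CompletableFuture.completedFuture("done")
                    .thenApplyAsync(value -> Thread.currentThread().getName(), pool)
                    .join();
        } finally {
            pool.shutdown();
        }
    }
}
```

`syncCallbackThread()` returns the name of the thread that called it, because the stage was already complete. `asyncCallbackThread()` returns `callback-pool-1`.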

3. Production-Grade Implementation: Coding Like a Lead Engineer

Here are real-world code snippets you should bring into your projects.

3.1. Standard Thread Pool Configuration for Banking Services

Anti-pattern: ExecutorService pool = Executors.newFixedThreadPool(10); Why is it wrong? Its unbounded LinkedBlockingQueue will swallow memory during system overload, leading to OutOfMemoryError and app crashes.

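import java.util.concurrent.Executor;
import java.util.concurrent.ThreadPoolExecutor;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class ThreadPoolConfig {

    @Bean("transferExecutor")
    public Executor transferExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();

        // IO-bound task: Rule of thumb = N_CPU * 2 as a starting point, then tune under load
        executor.setCorePoolSize(10);
        // Threads above core are created only once the queue below is full
        executor.setMaxPoolSize(50);

        // BOUNDED queue to detect bottlenecks early
        executor.setQueueCapacity(200);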
        
        executor.setKeepAliveSeconds(60);
        executor.setThreadNamePrefix("transfer-"); // Crucial for debugging logs
        
        // Backpressure: When queue is full, the HTTP thread itself runs the task -> slows down input
        executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
        
        // Wait for pending transactions to finish before shutting down
        executor.setWaitForTasksToCompleteOnShutdown(true);
        executor.setAwaitTerminationSeconds(30);
        
        executor.initialize();
        return executor;
    }
}
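
The backpressure effect of CallerRunsPolicy can be reproduced with the plain JDK ThreadPoolExecutor. The sketch below uses a hypothetical `BackpressureDemo` class with a deliberately saturated pool (1 thread, queue of 1) and reports which thread ends up running the overflow task:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BackpressureDemo {
    // Saturates the pool, then submits one more task and returns the
    // name of the thread that actually executed it.
    static String overflowTaskThread() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),
                runnable -> new Thread(runnable, "transfer-1"),
                new ThreadPoolExecutor.CallerRunsPolicy());
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        String[] ranOn = new String[1];
        try {
            pool.execute(blocker); // occupies the only worker thread
            pool.execute(blocker); // fills the queue
            // Pool and queue are full: CallerRunsPolicy runs this task
            // synchronously on the submitting thread.
            pool.execute(() -> ranOn[0] = Thread.currentThread().getName());
            return ranOn[0];
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }
}
```

The returned name is the caller's own thread, not `transfer-1`. In a web application that caller is the HTTP thread, so it is busy with the transfer and cannot accept new requests until it finishes, which is the slowdown on input described in the config comment.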

3.2. Preventing “Duplicate Transactions” with Database Locking

Synchronized cannot save you in a multi-instance (microservices) environment. The correct solution is to use Pessimistic Locking at the Database level or a Distributed Lock (Redis). Here is the Spring Data JPA implementation:

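import java.util.Optional;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Isolation;
import org.springframework.transaction.annotation.Transactional;

@Service
@Slf4j
@RequiredArgsConstructor
public class TransferService {

    private final TransactionRepository transactionRepository;
    private final AccountRepository accountRepository;

    @Transactional(isolation = Isolation.READ_COMMITTED)
    public TransferResult transfer(TransferRequest request) {

        // 1. Idempotency check FIRST (fast path) to reduce lock pressure
        Optional<Transaction> existing = transactionRepository
                .findByIdempotencyKey(request.getIdempotencyKey());
        if (existing.isPresent()) {
            return TransferResult.fromExisting(existing.get());
        }

        // 2. Sort IDs to lock in a fixed order -> Prevents Deadlock
        Long fromId = request.getFromAccountId();
        Long toId = request.getToAccountId();
        Long firstId = Math.min(fromId, toId);
        Long secondId = Math.max(fromId, toId);

        // 3. SELECT ... FOR UPDATE (Pessimistic Lock), smaller ID first
        Account first = accountRepository.findByIdWithLock(firstId)
                .orElseThrow(() -> new AccountNotFoundException(firstId));
        Account second = accountRepository.findByIdWithLock(secondId)
                .orElseThrow(() -> new AccountNotFoundException(secondId));

        // Lock order is not transfer direction: map the rows back to their roles
        Account fromAccount = fromId.equals(firstId) ? first : second;
        Account toAccount = fromId.equals(firstId) ? second : first;

        // 4. Re-check idempotency while holding the lock: a retry that passed
        //    step 1 while the original was still in flight is caught here.
        //    A unique constraint on the key column remains the final safety net.
        existing = transactionRepository
                .findByIdempotencyKey(request.getIdempotencyKey());
        if (existing.isPresent()) {
            return TransferResult.fromExisting(existing.get());
        }

        // 5. Validate business logic
        if (fromAccount.getBalance().compareTo(request.getAmount()) < 0) {
            throw new InsufficientBalanceException();
        }

        // 6. Execute write
        fromAccount.debit(request.getAmount());
        toAccount.credit(request.getAmount());

        accountRepository.save(fromAccount);
        accountRepository.save(toAccount);

        // 7. Save transaction log
        Transaction tx = Transaction.builder()
                .idempotencyKey(request.getIdempotencyKey())
                .status(TransactionStatus.COMPLETED)
                .build();
        transactionRepository.save(tx);

        return TransferResult.success(tx);
    }
}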

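// Repository Interface
import java.util.Optional;

import jakarta.persistence.LockModeType; // javax.persistence.LockModeType on Spring Boot 2.x
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Lock;
import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.query.Param;

public interface AccountRepository extends JpaRepository<Account, Long> {
    // Translated by the JPA provider into SELECT ... FOR UPDATE
    @Lock(LockModeType.PESSIMISTIC_WRITE)
    @Query("SELECT a FROM Account a WHERE a.id = :id")
    Optional<Account> findByIdWithLock(@Param("id") Long id);
}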
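
The idempotency idea itself is independent of JPA. As a rough in-memory stand-in for a transactions table keyed by idempotency key, the sketch below uses a hypothetical `IdempotentLedger` class: `ConcurrentHashMap.computeIfAbsent` applies its function at most once per key, so a retry with the same key gets the stored result instead of a second debit.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory ledger; a real system would persist both the
// balance and the processed keys in the same database transaction.
class IdempotentLedger {
    private long balance;
    // idempotency key -> balance recorded when that key was first processed
    private final ConcurrentMap<String, Long> processed = new ConcurrentHashMap<>();

    IdempotentLedger(long openingBalance) {
        this.balance = openingBalance;
    }

    // Applies the debit at most once per key, even under concurrent retries.
    // Returns the balance recorded for that key.
    long debit(String idempotencyKey, long amount) {
        return processed.computeIfAbsent(idempotencyKey, key -> applyDebit(amount));
    }

    private synchronized long applyDebit(long amount) {
        if (balance < amount) {
            throw new IllegalStateException("insufficient balance");
        }
        balance -= amount;
        return balance;
    }

    synchronized long balance() {
        return balance;
    }
}
```

With an opening balance of 200, calling `debit("tx-1", 50)` twice returns 150 both times and leaves the balance at 150. A different key, `debit("tx-2", 50)`, brings it to 100.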

3.3. API Acceleration Pattern: Parallel Enrichment

Instead of calling CustomerService (300ms) -> TransactionService (200ms) -> CreditService (150ms) = 650ms, run them in parallel. Response time drops to about 300ms, the latency of the slowest call.

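import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Service;

@Service
@Slf4j
@RequiredArgsConstructor
public class CustomerDashboardService {

    private final CustomerService customerService;
    private final TransactionService transactionService;
    private final Executor dashboardExecutor;

    public CustomerDashboard getDashboard(Long customerId) {
        // Start the calls IN PARALLEL (a CreditService call would follow the same pattern)
        CompletableFuture<CustomerInfo> customerFuture = CompletableFuture
                .supplyAsync(() -> customerService.getCustomer(customerId), dashboardExecutor)
                .orTimeout(2, TimeUnit.SECONDS) // Don't wait forever (Java 9+)
                .exceptionally(ex -> {
                    log.error("Customer service down", ex);
                    return CustomerInfo.empty(); // Graceful degradation
                });

        CompletableFuture<List<Transaction>> txFuture = CompletableFuture
                .supplyAsync(() -> transactionService.getRecentTransactions(customerId), dashboardExecutor)
                .orTimeout(2, TimeUnit.SECONDS) // Don't wait forever
                .exceptionally(ex -> Collections.emptyList());

        // Wait for all to complete; failures were already mapped to fallbacks above
        CompletableFuture.allOf(customerFuture, txFuture).join();

        return CustomerDashboard.builder()
                .customer(customerFuture.join())
                .transactions(txFuture.join())
                .build();
    }
}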
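
The same pattern can be tried without Spring. The sketch below is a plain-JDK approximation with three hypothetical service calls (one healthy, one failing, one too slow), each wrapped with its own timeout and fallback. It assumes Java 9+ for `orTimeout`, and all names are invented for the example:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class ParallelEnrichmentDemo {
    // Hypothetical stand-ins for three remote calls.
    static String fetchCustomer() {
        return "Alice";
    }

    static String fetchTransactions() {
        throw new IllegalStateException("service down");
    }

    static String fetchCreditScore() {
        try {
            Thread.sleep(5_000); // far slower than its timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "750";
    }

    // Wraps one call with its own timeout and its own fallback value.
    static <T> CompletableFuture<T> guarded(Supplier<T> call, long timeoutMs,
                                            T fallback, Executor pool) {
        return CompletableFuture.supplyAsync(call, pool)
                .orTimeout(timeoutMs, TimeUnit.MILLISECONDS)
                .exceptionally(ex -> fallback);
    }

    static String buildDashboard() {
        ExecutorService pool = Executors.newFixedThreadPool(3, runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            return thread;
        });
        try {
            CompletableFuture<String> customer =
                    guarded(ParallelEnrichmentDemo::fetchCustomer, 2_000, "unknown", pool);
            CompletableFuture<String> transactions =
                    guarded(ParallelEnrichmentDemo::fetchTransactions, 2_000, "no-transactions", pool);
            CompletableFuture<String> credit =
                    guarded(ParallelEnrichmentDemo::fetchCreditScore, 200, "no-score", pool);

            CompletableFuture.allOf(customer, transactions, credit).join();
            return customer.join() + "|" + transactions.join() + "|" + credit.join();
        } finally {
            pool.shutdownNow(); // orTimeout does not cancel the slow task by itself
        }
    }
}
```

`buildDashboard()` returns `Alice|no-transactions|no-score`: the failing and the slow call degrade to their fallbacks without failing the whole request.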

4. Trade-offs and Anti-patterns: Pitfalls on the Road

Scenario	Optimal Solution	Trade-off Reason
Single JVM, low conflict	`synchronized` or `ReentrantLock`	Simple, very low overhead.
High Read, Low Write	`ReadWriteLock`	Many readers proceed in parallel; a writer gets exclusive access and blocks readers only while it writes.
Simple Counter	`AtomicInteger` / `AtomicLong`	Uses CPU CAS (Compare-And-Swap), faster than locking (lock-free).
Multi-JVM (Microservices)	Redis Distributed Lock / DB Lock	`synchronized` is only effective within 1 JVM.
Complex Workflow, Async	`CompletableFuture`	Excellent pipeline and error management.

Top 3 Deadly Anti-patterns

Calling .get() or .join() inside an HTTP Request Thread:

    // ❌ WRONG: Blocks Servlet Container thread -> Kills Throughput
    @GetMapping("/data")
    public Data getData() {
        return future.get(); 
    }
    
    // ✅ RIGHT: Return CompletableFuture for Spring MVC to handle Async
    @GetMapping("/data")
    public CompletableFuture<Data> getData() {
        return service.getDataAsync();
    }

Shared Mutable State in a Stateless Bean:

    // ❌ WRONG: Spring Bean is a Singleton -> instance variables are shared state
    @Service
    public class Calculator {
        private BigDecimal result; // Race Condition!
    }
    
    // ✅ RIGHT: Always use local variables
    @Service
    public class Calculator {
        public BigDecimal calc() {
            BigDecimal result = BigDecimal.ZERO; // Local variable, confined to the calling thread -> Thread-safe
            return result;
        }
    }

Forgetting .remove() for ThreadLocal: When using a Thread Pool, threads are reused. If you don’t remove() data in a ThreadLocal (e.g., UserContext), Request B might accidentally read sensitive data from Request A. Always use try-finally to clean up.
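
The leak is easy to reproduce with a single-thread pool, since both "requests" are then guaranteed to land on the same thread. This is a minimal sketch with a hypothetical `ThreadLocalLeakDemo` class. The `USER` holder stands in for something like UserContext:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeakDemo {
    private static final ThreadLocal<String> USER = new ThreadLocal<>();

    // Request A stores its user in the ThreadLocal while it runs.
    private static void handleRequestA(boolean cleanUp) {
        USER.set("alice");
        try {
            String current = USER.get(); // business logic would read the context here
        } finally {
            if (cleanUp) {
                USER.remove(); // the step that is easy to forget
            }
        }
    }

    // Runs request A, then request B (which only reads the context) on the
    // same pooled thread, and returns what request B saw.
    static String whatRequestBSees(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable requestA = () -> handleRequestA(cleanUp);
        Callable<String> requestB = USER::get;
        try {
            pool.submit(requestA).get();
            return pool.submit(requestB).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

`whatRequestBSees(false)` returns `alice`, so request B reads request A's user. `whatRequestBSees(true)` returns `null`.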

5. Interview Framework

When asked about Concurrency in a Senior interview, don’t just give definitions. Use the Tier 1 - Tier 2 - Tier 3 structure to demonstrate depth.

Tier 1: Surface Level (What a Junior knows)

Q: What is the difference between synchronized and ReentrantLock?
A: synchronized is managed by the JVM and easy to use, but it has no timeout and a thread waiting for the monitor cannot be interrupted. ReentrantLock is more flexible, offering tryLock(timeout) and lockInterruptibly(). In financial systems, I prefer ReentrantLock so that I can set timeouts and a thread never waits forever on a deadlocked lock.
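
If the interviewer asks for code, a sketch along these lines is usually enough. `TryLockDemo` and its method names are hypothetical; the lock calls are standard `java.util.concurrent.locks` API:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class TryLockDemo {
    // Tries to take the lock within the timeout and gives up instead of
    // waiting forever. Returns true only if the critical section ran.
    static boolean runWithLock(ReentrantLock lock, long timeoutMs, Runnable criticalSection) {
        try {
            if (!lock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                return false; // could not get the lock in time: fail fast
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        try {
            criticalSection.run();
            return true;
        } finally {
            lock.unlock(); // always release, even if the critical section throws
        }
    }

    // Lets another thread hold the lock, then attempts it from the caller.
    static boolean attemptWhileHeldElsewhere() {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        try {
            held.await(); // make sure the other thread owns the lock first
            return runWithLock(lock, 50, () -> { });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown();
        }
    }
}
```

`attemptWhileHeldElsewhere()` returns `false` after roughly 50ms instead of hanging, whereas a `synchronized` block in the same situation would simply wait.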

Tier 2: Deep Dive (Mid vs. Senior distinction)

Q: Why is volatile not enough to protect count++?
A: volatile only solves the visibility problem (every read sees the latest write). But count++ is three separate steps (read, modify, write) that are not atomic as a whole. If Threads A and B both read 5, both increment to 6 and write back, so the final result is 6 and one increment is lost. The fix is AtomicInteger, which uses CAS to make the update atomic without a lock.
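
A short sketch helps back this answer up. The hypothetical `CounterDemo` below increments a volatile int and an AtomicInteger from several threads, and spells the CAS retry loop out by hand (in effect what `incrementAndGet()` does for you):

```java
import java.util.concurrent.atomic.AtomicInteger;

class CounterDemo {
    private static volatile int volatileCount;            // visible, but ++ is not atomic
    private static final AtomicInteger atomicCount = new AtomicInteger();

    // Explicit CAS loop: retry until no other thread changed the value
    // between our read and our write.
    static int casIncrement(AtomicInteger counter) {
        while (true) {
            int current = counter.get();
            if (counter.compareAndSet(current, current + 1)) {
                return current + 1;
            }
        }
    }

    // Increments both counters from several threads.
    // Returns {volatile result, atomic result}.
    static int[] run(int threads, int incrementsPerThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    volatileCount++;            // read-modify-write race
                    casIncrement(atomicCount);  // atomic
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return new int[] { volatileCount, atomicCount.get() };
    }
}
```

With `run(4, 10_000)` the atomic counter always ends at 40000. The volatile counter can end anywhere up to 40000, and how many increments it loses varies from run to run.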

Tier 3: Architecture (Senior/Lead)

Q: Design a system to handle 10,000 concurrent payment transactions, requiring no money loss, no duplicates, and P99 latency < 2s.
Approach:

Clarify: Multi-region? External Gateway timeouts?
Bottlenecks: Usually not CPU, but Database Connection Pool and External API Rate Limits.
Design:
- Ingress: Nginx -> API Gateway (Spring WebFlux or Servlet Async).
- Queue: Kafka/RabbitMQ to decouple request reception and processing.
- Processing: Worker Pool processing, checking Idempotency Key (Redis/Database).
- Locking: Use Pessimistic Locking in DB for hot, high-contention accounts, Optimistic Locking (versioning) for accounts with low contention.
- Circuit Breaker: Use Resilience4j for external gateways.
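
For the optimistic side of the locking choice, the idea is "read a version, compute, write only if the version is unchanged, otherwise retry". In a database this is typically an `UPDATE ... WHERE id = ? AND version = ?`. The sketch below is an in-memory analogy only, using a hypothetical `OptimisticAccount` built on `AtomicReference.compareAndSet`:

```java
import java.util.concurrent.atomic.AtomicReference;

class OptimisticAccount {
    // Immutable snapshot: a balance plus the version it belongs to.
    static final class State {
        final long balance;
        final long version;

        State(long balance, long version) {
            this.balance = balance;
            this.version = version;
        }
    }

    private final AtomicReference<State> state;

    OptimisticAccount(long openingBalance) {
        this.state = new AtomicReference<>(new State(openingBalance, 0));
    }

    // No lock is held while computing. The write succeeds only if nobody
    // else wrote since we read; otherwise we re-read and try again.
    boolean debit(long amount) {
        while (true) {
            State current = state.get();
            if (current.balance < amount) {
                return false; // insufficient balance
            }
            State next = new State(current.balance - amount, current.version + 1);
            if (state.compareAndSet(current, next)) {
                return true;
            }
        }
    }

    long balance() {
        return state.get().balance;
    }

    long version() {
        return state.get().version;
    }

    // Runs many small debits from several threads against one account.
    static OptimisticAccount runConcurrentDebits(long opening, int threads, int debitsPerThread) {
        OptimisticAccount account = new OptimisticAccount(opening);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < debitsPerThread; i++) {
                    account.debit(1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return account;
    }
}
```

`runConcurrentDebits(1000, 4, 100)` ends with a balance of 600 and a version of 400. Under heavy contention the retry loop spins more and more, which is the reason to switch hot accounts to pessimistic locking.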

6. Proficiency Checklist

Have you truly mastered Java Concurrency? Try explaining these 5 things without looking at documentation:

1. Distinguish Race Condition and Deadlock: Give real-world examples and fixes.
2. Thread Pool Mechanism: Draw the Core -> Queue -> Max -> Reject flow and explain why Executors.newFixedThreadPool() (unbounded queue) and Executors.newCachedThreadPool() (unbounded thread count) can both cause OOM.
3. volatile vs synchronized vs Atomic: Explain the difference between Visibility and Atomicity.
4. CompletableFuture Pipeline: Write code to call 3 services in parallel with individual timeouts, where one service failing doesn’t fail the whole request.
5. Banking Concurrency Solution: Explain why synchronized fails in the Cloud and how to coordinate Idempotency Key + Pessimistic DB Lock.

Final Thoughts

Java Concurrency isn’t scary if you understand that the core problem is Shared Mutable State. Always ask yourself: “If 1,000 people hit this button at once, what happens?” Good luck writing thread-safe code and acing those tough interviews!