Episode - Race Conditions in SQL and ACID Properties — The Complete Deep Dive

Episode - Race Conditions in SQL and ACID Properties — The Complete Deep Dive

Hey everyone! Welcome back to Namaste JavaScript. Today we go extremely deep into two of the most important and most asked topics in SQL interviews — Race Conditions and ACID Properties!

These topics are asked in every senior developer interview. Most people give surface-level answers. By the end of this blog, you will give answers that make interviewers nod their heads!

What we will cover:

  • What is a Race Condition in SQL?
  • The 4 Concurrency Problems — Lost Update, Dirty Read, Non-Repeatable Read, Phantom Read
  • Locks in SQL — Shared Lock vs Exclusive Lock
  • Optimistic vs Pessimistic Locking
  • SELECT FOR UPDATE and SELECT FOR SHARE
  • Deadlock — What it is, How it Happens, How to Prevent
  • ACID — Deep Dive into All 4 Properties
  • Isolation Levels — The Bridge Between Race Conditions and ACID
  • Real-World Scenarios
  • Interview Questions

What is a Race Condition in SQL?

A race condition happens when two or more transactions access the same data at the same time, and the final result depends on the order in which they execute — leading to incorrect or unexpected data.

Simple Real-World Analogy:
===========================

Imagine a bank account with balance = ₹1000.

Two people withdraw ₹800 at the SAME TIME from two ATMs:

ATM 1 (Transaction 1):          ATM 2 (Transaction 2):
  READ balance → 1000              READ balance → 1000
  Check: 1000 >= 800? YES          Check: 1000 >= 800? YES
  WRITE balance = 1000 - 800       WRITE balance = 1000 - 800
  balance = 200                    balance = 200

Final balance in DB: ₹200

But ₹1600 was withdrawn from a ₹1000 account!
The bank LOST ₹600!

This is a RACE CONDITION!
Both transactions read the OLD value before either could write!
Why do Race Conditions happen?
================================

Databases handle MANY users simultaneously (concurrent access).
Without proper controls:
  Transaction A reads data
  Transaction B reads SAME data
  Transaction A writes updated data
  Transaction B writes updated data (based on STALE read!)
  → Transaction A's update is LOST!

This is why we need:
  ✅ Transactions
  ✅ Locking
  ✅ Isolation Levels
  ✅ ACID properties

The 4 Concurrency Problems — What Interviews ALWAYS Ask

Problem 1: Lost Update

Two transactions read and write the same row.
One transaction's update OVERWRITES the other's.

accounts table:
  acc_id | balance
    1    | 1000

Timeline:
──────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
──────────────────────────────────────────────────────────
 1   | BEGIN                           |
 2   |                                 | BEGIN
 3   | SELECT balance FROM accounts    |
 4   | → reads 1000                    |
 5   |                                 | SELECT balance FROM accounts
 6   |                                 | → reads 1000
 7   | UPDATE accounts                 |
     | SET balance = 1000 - 200 = 800  |
 8   | COMMIT                          |
 9   |                                 | UPDATE accounts
     |                                 | SET balance = 1000 - 500 = 500
10   |                                 | COMMIT

Final balance: 500  ❌ WRONG!
Expected:      1000 - 200 - 500 = 300

T1's update (deducting 200) was LOST because T2 read the old value!
T2 overwrote T1's work.

This is called a LOST UPDATE.

Problem 2: Dirty Read

Transaction 1 reads data that Transaction 2 has MODIFIED
but NOT YET COMMITTED. If T2 rolls back → T1 read invalid data!

Timeline:
──────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
──────────────────────────────────────────────────────────
 1   |                                 | BEGIN
 2   |                                 | UPDATE accounts
     |                                 | SET balance = 5000  ← not committed!
 3   | BEGIN                           |
 4   | SELECT balance FROM accounts    |
 5   | → reads 5000  ← DIRTY READ!     |
 6   | Makes decisions based on 5000   |
 7   |                                 | ROLLBACK  ← T2 undoes change!
 8   | COMMIT                          |
     | (based on data that never existed!)

T1 read balance = 5000 (which was never committed).
T2 rolled back → balance is still original value.
T1 made decisions based on PHANTOM data!

This is called a DIRTY READ.
Only possible at READ UNCOMMITTED isolation level.

Problem 3: Non-Repeatable Read

Same query gives DIFFERENT results within the same transaction
because another transaction MODIFIED the data between the two reads.

Timeline:
──────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
──────────────────────────────────────────────────────────
 1   | BEGIN                           |
 2   | SELECT balance FROM accounts    |
 3   | → reads 1000                    |
 4   | (doing some processing...)      | BEGIN
 5   |                                 | UPDATE accounts SET balance = 2000
 6   |                                 | COMMIT
 7   | SELECT balance FROM accounts    |
 8   | → reads 2000  ← DIFFERENT!      |
 9   | COMMIT                          |

T1 reads balance TWICE in the same transaction.
Gets 1000 first, then 2000!
The ROW was UPDATED between reads.

This is called a NON-REPEATABLE READ.
Prevented by REPEATABLE READ isolation level and above.

Problem 4: Phantom Read

Same query returns DIFFERENT ROWS within the same transaction
because another transaction INSERTED or DELETED rows between reads.

Timeline:
──────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
──────────────────────────────────────────────────────────
 1   | BEGIN                           |
 2   | SELECT COUNT(*) FROM employees  |
 3   | WHERE salary > 50000            |
 4   | → returns 4                     |
 5   |                                 | BEGIN
 6   |                                 | INSERT INTO employees
     |                                 | VALUES (7, 'John', 10, 70000, 1)
 7   |                                 | COMMIT
 8   | SELECT COUNT(*) FROM employees  |
 9   | WHERE salary > 50000            |
10   | → returns 5  ← PHANTOM ROW!    |
11   | COMMIT                          |

T1 counts 4 employees, then 5 in the SAME transaction!
A NEW ROW "appeared" (phantom) — inserted by T2.

This is called a PHANTOM READ.
Only prevented by SERIALIZABLE isolation level.
Summary of All 4 Problems:
============================

Problem              What Happens                          Caused By
──────────────────────────────────────────────────────────────────────
Lost Update          T2 overwrites T1's uncommitted write  Concurrent writes
Dirty Read           T1 reads T2's uncommitted change      READ UNCOMMITTED
Non-Repeatable Read  Same row read twice, gives diff value  UPDATE by T2 between reads
Phantom Read         Same query returns diff row count     INSERT/DELETE by T2 between reads

Locks in SQL — How Databases Protect Data

Locks are the PRIMARY mechanism to prevent race conditions.
A lock prevents other transactions from accessing the same data
until the current transaction is done.

TWO MAIN LOCK TYPES:
======================

1. SHARED LOCK (S-Lock / Read Lock)
   - Multiple transactions CAN hold shared locks on the same row
   - A transaction with S-Lock CAN READ the row
   - A transaction with S-Lock CANNOT WRITE the row
   - Another transaction CAN also get S-Lock (reads are safe to share)
   - Another transaction CANNOT get X-Lock while S-Lock exists

2. EXCLUSIVE LOCK (X-Lock / Write Lock)
   - Only ONE transaction can hold an X-Lock on a row at a time
   - A transaction with X-Lock CAN READ and WRITE the row
   - No other transaction can get ANY lock (S or X) on that row
   - All other requests wait until X-Lock is released

Lock Compatibility Matrix:
============================

               Existing Lock
               NONE    S-Lock   X-Lock
Requested  ┌─────────────────────────────
S-Lock     │  ✅ Yes   ✅ Yes   ❌ WAIT
X-Lock     │  ✅ Yes   ❌ WAIT  ❌ WAIT

Multiple readers = fine (S+S compatible)
Writer needs exclusive access (X blocks everyone)
More Lock Types:
=================

INTENT LOCKS (database-managed, not user-facing):
  IS (Intent Shared)  → planning to read rows in this table
  IX (Intent Exclusive) → planning to write rows in this table
  SIX (Shared + Intent Exclusive) → reading table, writing some rows
  These are set on TABLE level so DB doesn't scan all row locks!

ROW-LEVEL LOCK → locks a single row
TABLE-LEVEL LOCK → locks the entire table (faster to acquire, blocks everyone!)
PAGE-LEVEL LOCK → locks a page of data (between row and table)

-- Check current locks in MySQL:
SELECT * FROM information_schema.INNODB_LOCKS;
SELECT * FROM performance_schema.data_locks;

-- Check blocked transactions:
SELECT * FROM information_schema.INNODB_LOCK_WAITS;

Optimistic vs Pessimistic Locking

Two strategies to handle concurrency — both valid, each with tradeoffs!

Pessimistic Locking

Assumption: Conflicts WILL happen. Lock data upfront.
Strategy: Acquire lock BEFORE reading, hold until done.

"I'm going to modify this row — nobody else touch it!"

-- Implementation with SELECT FOR UPDATE
BEGIN;

SELECT balance FROM accounts
WHERE acc_id = 1
FOR UPDATE;   ← Acquires X-Lock on this row!

-- Other transactions trying to read/write acc_id=1 will WAIT here!

UPDATE accounts SET balance = balance - 200 WHERE acc_id = 1;
COMMIT;       ← Lock released!

Timeline with Pessimistic Locking:
────────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
────────────────────────────────────────────────────────────
 1   | BEGIN                           |
 2   | SELECT balance ... FOR UPDATE   |  ← T1 acquires X-Lock
 3   | → reads 1000 (locked!)          |
 4   |                                 | BEGIN
 5   |                                 | SELECT balance ... FOR UPDATE
 6   |                                 | → BLOCKED! Waits for T1...
 7   | UPDATE balance = 800            |
 8   | COMMIT (lock released)          |
 9   |                                 | → Unblocked! reads 800
10   |                                 | UPDATE balance = 800 - 500 = 300
11   |                                 | COMMIT

Final balance: 300 ✅ CORRECT! (1000 - 200 - 500 = 300)

WHEN TO USE Pessimistic Locking:
- High conflict scenarios (many users writing the same row)
- Financial systems (bank accounts, inventory)
- When incorrect data is very costly
- Short transactions (lock time is brief)

Optimistic Locking

Assumption: Conflicts are RARE. Don't lock upfront.
Strategy: Read freely. Before writing, CHECK if data changed.
          If changed → RETRY. If not → write safely.

"I'll read and work, then check at the end if anyone changed it."

Implementation: Add a 'version' column (or timestamp) to the table.

-- Table structure
ALTER TABLE accounts ADD COLUMN version INT DEFAULT 0;

-- Step 1: Read with version number
SELECT balance, version FROM accounts WHERE acc_id = 1;
→ balance = 1000, version = 5

-- Step 2: Do your calculations...

-- Step 3: Write with version check
UPDATE accounts
SET balance = 800, version = version + 1
WHERE acc_id = 1
AND version = 5;   ← ONLY update if version hasn't changed!

-- Step 4: Check rows affected
-- If affected_rows = 1 → success! No conflict.
-- If affected_rows = 0 → someone else changed it! RETRY!

Timeline with Optimistic Locking:
────────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
────────────────────────────────────────────────────────────
 1   | SELECT bal=1000, ver=5          | SELECT bal=1000, ver=5
 2   | (processing...)                 | (processing...)
 3   | UPDATE SET bal=800, ver=6       |
 4   | WHERE ver=5 → SUCCESS ✅         |
 5   |                                 | UPDATE SET bal=500, ver=6
 6   |                                 | WHERE ver=5 → 0 rows! ❌
 7   |                                 | RETRY: re-read bal=800, ver=6
 8   |                                 | UPDATE SET bal=300, ver=7
 9   |                                 | WHERE ver=6 → SUCCESS ✅

Final balance: 300 ✅ CORRECT!

WHEN TO USE Optimistic Locking:
- Low conflict scenarios (users rarely write the same row)
- Read-heavy workloads
- Long transactions (don't want to hold lock for long time)
- Distributed systems / microservices
- Web applications where user edits a form over several seconds
Pessimistic vs Optimistic — Summary:
=======================================

Feature           Pessimistic Locking        Optimistic Locking
─────────────────────────────────────────────────────────────────
Lock acquired     At READ time               Never (check at write)
Conflicts         Prevented upfront          Detected at write time
Performance       Lower (waiting)            Higher (no blocking)
Best for          High conflict              Low conflict
Risk              Deadlock possible          Retry storms possible
Implementation    SELECT FOR UPDATE          Version column + check
Database support  Native (all DB)            Application-level (usually)

SELECT FOR UPDATE and SELECT FOR SHARE

SELECT FOR UPDATE:
===================
Acquires an EXCLUSIVE lock on selected rows.
Other transactions CANNOT read (with lock) or write these rows.
Used when you plan to UPDATE after reading.

BEGIN;
SELECT * FROM inventory
WHERE product_id = 101
FOR UPDATE;               ← X-Lock on this row

-- Other transactions will WAIT here if they try to SELECT FOR UPDATE
-- or UPDATE the same row

UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 101;

COMMIT;

-- Variants:
SELECT ... FOR UPDATE NOWAIT;         ← fail immediately if can't lock (don't wait)
SELECT ... FOR UPDATE SKIP LOCKED;    ← skip locked rows, process others
SELECT ... FOR UPDATE OF table_name;  ← lock only specific table in JOIN

-- SKIP LOCKED is great for job queues!
BEGIN;
SELECT id, task FROM job_queue
WHERE status = 'pending'
LIMIT 1
FOR UPDATE SKIP LOCKED;   ← grab any unlocked job, skip locked ones
-- Multiple workers can process different jobs in parallel!
UPDATE job_queue SET status = 'processing' WHERE id = ?;
COMMIT;
SELECT FOR SHARE (SELECT LOCK IN SHARE MODE in MySQL):
=======================================================
Acquires a SHARED lock on selected rows.
Other transactions CAN read (with or without lock).
Other transactions CANNOT write these rows.
Used when you are reading data that another transaction shouldn't change
while you're doing your calculation.

BEGIN;
SELECT balance FROM accounts
WHERE acc_id = 1
FOR SHARE;               ← S-Lock on this row

-- Other transactions CAN also SELECT FOR SHARE (reads fine)
-- Other transactions CANNOT UPDATE or SELECT FOR UPDATE (write blocked)

-- Example: calculate interest before deciding whether to charge
-- You don't want balance to change while you calculate!

COMMIT;

FOR UPDATE vs FOR SHARE:
──────────────────────────────────────────────────────────
                  FOR UPDATE (X-Lock)    FOR SHARE (S-Lock)
Other can READ    ❌ WAIT               ✅ Yes (without lock)
Other can WRITE   ❌ WAIT               ❌ WAIT
Use when          You WILL update       You just need stable read
Blocks            Everything            Only writes

Deadlock — When Two Transactions Block Each Other Forever

A DEADLOCK occurs when:
  Transaction 1 holds Lock A, waiting for Lock B
  Transaction 2 holds Lock B, waiting for Lock A
  Neither can proceed → Both wait FOREVER!

Classic Deadlock Scenario:
============================

 T1 holds lock on Row A                T2 holds lock on Row B
 T1 wants lock on Row B         ←→     T2 wants lock on Row A

Timeline:
───────────────────────────────────────────────────────────────
Time | Transaction 1 (T1)              | Transaction 2 (T2)
───────────────────────────────────────────────────────────────
 1   | BEGIN                           | BEGIN
 2   | UPDATE accounts SET...          |
     | WHERE acc_id = 1  ← LOCKS Row 1 |
 3   |                                 | UPDATE accounts SET...
     |                                 | WHERE acc_id = 2  ← LOCKS Row 2
 4   | UPDATE accounts SET...          |
     | WHERE acc_id = 2   ← WAITS...   |
 5   |                                 | UPDATE accounts SET...
     |                                 | WHERE acc_id = 1   ← WAITS...
 6   | ← DEADLOCK! Both waiting forever! →

T1 waits for T2 to release Row 2
T2 waits for T1 to release Row 1
Nobody releases → DEADLOCK!

Database detects this (usually within 1-5 seconds) and:
→ Picks ONE transaction as the "deadlock victim"
→ ROLLS BACK the victim transaction
→ The other transaction can proceed

MySQL error: ERROR 1213 (40001): Deadlock found when trying to get lock;
             try restarting transaction
How to DETECT Deadlocks:
=========================

-- MySQL: View last deadlock details
SHOW ENGINE INNODB STATUS;
-- Look for LATEST DETECTED DEADLOCK section

-- PostgreSQL: Enable deadlock logging
SET deadlock_timeout = '1s';
-- Check pg_stat_activity for waiting queries
SELECT pid, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE wait_event_type = 'Lock';
How to PREVENT Deadlocks:
==========================

1. ACCESS RESOURCES IN THE SAME ORDER (Most Important Fix!)
   If T1 always locks Row 1 then Row 2,
   and T2 also always locks Row 1 then Row 2,
   deadlock CANNOT happen!

   -- ❌ Inconsistent order → DEADLOCK possible
   T1: Lock Row A → Lock Row B
   T2: Lock Row B → Lock Row A  ← opposite order!

   -- ✅ Consistent order → NO DEADLOCK
   T1: Lock Row A → Lock Row B
   T2: Lock Row A → Lock Row B  ← same order!

   -- In code: always sort IDs before locking
   const ids = [userId1, userId2].sort((a, b) => a - b);
   -- Lock ids[0] first, then ids[1]

2. KEEP TRANSACTIONS SHORT
   The longer a transaction holds locks, the higher chance of deadlock.
   Don't do heavy computation inside a transaction!

   -- ❌ BAD: heavy work inside transaction
   BEGIN;
   SELECT ... FOR UPDATE;
   doExpensiveCalculation();  ← holds lock during this!
   UPDATE ...;
   COMMIT;

   -- ✅ GOOD: compute first, then transact
   const result = doExpensiveCalculation();
   BEGIN;
   SELECT ... FOR UPDATE;
   UPDATE ... SET value = result;
   COMMIT;

3. USE SELECT FOR UPDATE NOWAIT (fail fast instead of waiting)
   If lock unavailable → fail immediately → retry later
   Rather than waiting and risking deadlock:

   BEGIN;
   SELECT * FROM accounts WHERE acc_id = 1 FOR UPDATE NOWAIT;
   -- If locked: immediately throws error → application retries
   COMMIT;

4. USE LOWER ISOLATION LEVEL WHERE POSSIBLE
   READ COMMITTED reduces lock holding time vs SERIALIZABLE
   Less locking → less chance of deadlock

5. DEADLOCK RETRY LOGIC IN APPLICATION
   Always implement retry on deadlock error:

   async function transferWithRetry(fromId, toId, amount, retries = 3) {
       for (let i = 0; i < retries; i++) {
           try {
               await db.transaction(async (trx) => {
                   // Lock in SORTED order to prevent deadlock!
                   const [first, second] = [fromId, toId].sort((a,b) => a-b);
                   await trx.raw('SELECT * FROM accounts WHERE id = ? FOR UPDATE', [first]);
                   await trx.raw('SELECT * FROM accounts WHERE id = ? FOR UPDATE', [second]);
                   await trx('accounts').where('id', fromId).decrement('balance', amount);
                   await trx('accounts').where('id', toId).increment('balance', amount);
               });
               return;  // Success!
           } catch (err) {
               if (err.code === 'ER_LOCK_DEADLOCK' && i < retries - 1) {
                   await sleep(100 * (i + 1));  // Wait before retry
                   continue;
               }
               throw err;
           }
       }
   }

ACID Properties — The Complete Deep Dive

ACID is what makes a database trustworthy. Every property solves a specific type of failure. Let's go deep into each one!

ACID = Atomicity + Consistency + Isolation + Durability

Each property protects against a different category of failure:

Atomicity    → protects against PARTIAL FAILURES (crash mid-transaction)
Consistency  → protects against INVALID STATES (constraint violations)
Isolation    → protects against CONCURRENCY PROBLEMS (race conditions)
Durability   → protects against DATA LOSS (crash after commit)

A — Atomicity

Definition: A transaction is ALL or NOTHING.
Every operation in a transaction either ALL SUCCEED or ALL FAIL.
There is NO partial success.

Real-World: Bank Transfer
==========================

BEGIN TRANSACTION;
  UPDATE accounts SET balance = balance - 1000 WHERE acc_id = 101;  -- Step 1
  UPDATE accounts SET balance = balance + 1000 WHERE acc_id = 202;  -- Step 2
COMMIT;

Scenario 1: Both succeed → COMMIT → both changes saved ✅
Scenario 2: Step 1 succeeds, Step 2 FAILS (server crash, constraint error)
            → AUTOMATIC ROLLBACK → Step 1 is UNDONE ✅
            → Account 101 gets its ₹1000 back
            → Database is back to the state before the transaction

WITHOUT ATOMICITY:
  Step 1 succeeds (₹1000 deducted from 101)
  Server crashes
  Step 2 never runs (₹1000 never added to 202)
  ₹1000 has VANISHED from the system! 😱

How Atomicity is implemented:
================================
Database keeps a UNDO LOG (rollback log).
Every change writes the BEFORE image to the undo log first.
If transaction fails → replay undo log in reverse → restore original state.

ROLLBACK LOG EXAMPLE:
  Before: acc_101.balance = 5000, acc_202.balance = 2000
  Step 1 runs: acc_101.balance = 4000
    Undo log: "acc_101 was 5000"
  CRASH! → Read undo log → SET acc_101.balance = 5000 ← RESTORED!

All-or-nothing is guaranteed by the undo log + ROLLBACK command.

C — Consistency

Definition: A transaction brings the database from one VALID state
to another VALID state. All defined rules and constraints are satisfied.

Consistency is about the RULES of your data — not just about transactions.
The database must ALWAYS obey:
  - NOT NULL constraints
  - UNIQUE constraints
  - FOREIGN KEY constraints
  - CHECK constraints
  - Triggers
  - Application-defined business rules

Example — Consistency in Action:
==================================

CREATE TABLE accounts (
    acc_id INT PRIMARY KEY,
    balance DECIMAL(10,2) NOT NULL CHECK (balance >= 0),  ← can't go negative!
    owner_id INT REFERENCES users(user_id)                ← FK constraint
);

BEGIN TRANSACTION;
UPDATE accounts SET balance = balance - 5000 WHERE acc_id = 101;
-- Current balance is ₹2000. After deducting ₹5000 → -₹3000!
-- CHECK (balance >= 0) is VIOLATED!
-- → Transaction is ROLLED BACK automatically!
-- → Database stays at ₹2000 (valid state) ✅

WHO enforces consistency?
  Database: constraints, foreign keys, triggers
  Application: business logic before writing to DB
  Both working together = full consistency

Example — Consistency with Foreign Keys:
==========================================
BEGIN;
INSERT INTO orders (order_id, customer_id, amount)
VALUES (1001, 9999, 5000);  ← customer_id 9999 doesn't exist!
-- FOREIGN KEY violation!
-- → Transaction rolled back
-- → No orphaned order in the system ✅
COMMIT;

IMPORTANT: Consistency is the ONLY ACID property NOT enforced
solely by the DB engine. It also requires correct application logic!

I — Isolation

Definition: Concurrent transactions execute as if they were SERIAL
(one after another). Each transaction is ISOLATED from others.

Isolation is the property that DIRECTLY prevents race conditions.
But full isolation (serializable) has a performance cost.
SQL gives you 4 isolation LEVELS to choose the right tradeoff.

THE 4 ISOLATION LEVELS:
=========================

Level 1: READ UNCOMMITTED (weakest — fastest)
  Can see OTHER transactions' uncommitted changes.
  All 4 problems possible: dirty read, non-repeatable, phantom, lost update.
  Almost never used in production.

  Transaction 1                    Transaction 2
  BEGIN                            BEGIN
                                   UPDATE balance = 9999 (not committed)
  SELECT balance → 9999  ← DIRTY READ!
                                   ROLLBACK
  -- T1 read data that never existed!

Level 2: READ COMMITTED (most common default)
  Only sees COMMITTED data from other transactions.
  Dirty reads prevented. Non-repeatable and phantom still possible.
  Default in: PostgreSQL, Oracle, SQL Server, DB2.

  Transaction 1                    Transaction 2
  BEGIN                            BEGIN
  SELECT balance → 1000            UPDATE balance = 2000
                                   COMMIT
  SELECT balance → 2000  ← CHANGED! (non-repeatable read)
  COMMIT

Level 3: REPEATABLE READ
  Rows you've read stay stable within your transaction.
  Dirty + non-repeatable reads prevented. Phantom reads still possible.
  Default in: MySQL (InnoDB uses gap locks to also prevent phantoms!).

  Transaction 1                    Transaction 2
  BEGIN                            BEGIN
  SELECT balance → 1000
                                   UPDATE balance = 2000 COMMIT
  SELECT balance → 1000  ← SAME! Stable read ✅
                                   INSERT new employee COMMIT
  SELECT COUNT(*) → 5  ← PHANTOM! (new row appeared) ⚠️
  COMMIT

Level 4: SERIALIZABLE (strictest — slowest)
  Full isolation. Transactions run as if serial (one at a time).
  All 4 problems prevented: dirty, non-repeatable, phantom, lost update.
  Achieved via range locks or MVCC snapshot at transaction start.

  Transaction 1                    Transaction 2
  BEGIN                            BEGIN
  SELECT COUNT(*) WHERE sal > 50000 → 4
                                   INSERT employee with sal=70000
                                   COMMIT ← BLOCKED! T1 has range lock
  SELECT COUNT(*) WHERE sal > 50000 → 4  ← same! No phantom ✅
  COMMIT ← T2 can now proceed

ISOLATION LEVEL COMPARISON:
══════════════════════════════════════════════════════════════
Level              Dirty   Non-Repeat  Phantom   Performance
READ UNCOMMITTED   YES     YES         YES       Fastest
READ COMMITTED     NO      YES         YES       Fast
REPEATABLE READ    NO      NO          YES*      Medium
SERIALIZABLE       NO      NO          NO        Slowest

*MySQL InnoDB also prevents phantoms at REPEATABLE READ via gap locks!
How to Set Isolation Level:
=============================

-- MySQL
SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN;
-- your queries
COMMIT;

-- PostgreSQL
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- your queries
COMMIT;

-- SQL Server
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
BEGIN TRANSACTION;
-- your queries
COMMIT;
MVCC — Multi-Version Concurrency Control:
==========================================
Most modern databases (PostgreSQL, MySQL InnoDB, Oracle) use MVCC.
Instead of blocking reads with locks, MVCC keeps multiple VERSIONS of a row.

Reader sees the version that existed at transaction start time.
Writer creates a NEW version of the row.
→ Readers NEVER block Writers
→ Writers NEVER block Readers
→ Maximum concurrency with correct isolation!

Timeline (PostgreSQL with READ COMMITTED):
  T1 starts: reads row at timestamp T=100 (balance=1000)
  T2 updates balance to 2000 at T=110, commits
  T1 reads again: NOW sees T=110 version → 2000 (READ COMMITTED)

Timeline (PostgreSQL with REPEATABLE READ):
  T1 starts at T=100, takes snapshot
  T2 updates balance at T=110, commits
  T1 reads at T=120: still sees T=100 snapshot → 1000 ✅
  → T1's reads are perfectly repeatable!

D — Durability

Definition: Once a transaction is COMMITTED, it is PERMANENT.
Even if the server crashes the very millisecond after COMMIT,
the data will still be there when the server restarts.

Scenario without Durability (BAD):
  User clicks "Place Order"
  Database writes data to memory
  Database returns success to user
  CRASH! Memory lost!
  User paid but order is GONE → catastrophic!

How Durability is implemented:
================================

1. WRITE-AHEAD LOG (WAL) / REDO LOG:
   Before making any change, the database writes the change to a LOG FILE on DISK.
   The log file is written FIRST (write-ahead), then the actual data pages.

   Steps for a COMMIT:
     a. Write change to WAL log on disk (fast sequential write)
     b. Return "committed" to client
     c. Apply change to data pages (can happen later, asynchronously)

   If crash happens after step (a): redo log replays → data recovered ✅
   If crash happens before step (a): transaction was not committed → OK ✅

2. FSYNC:
   After writing to WAL, database calls fsync() to ensure data is
   physically written to disk platters — not just OS buffer!
   Without fsync → OS could buffer the write → crash → data lost!

3. CHECKPOINTS:
   Periodically, database flushes dirty (modified) pages from memory to disk.
   On crash recovery: read last checkpoint + replay WAL from that point.
   Without checkpoints → WAL grows forever and recovery takes forever.

4. REPLICATION (Additional Durability):
   For extra protection, commit is only acknowledged after:
   - Writing to primary server's WAL
   - AND replication to at least one replica's disk (synchronous replication)
   Even if primary server explodes, replica has the committed data!

Durability tradeoff:
   fsync ON  → safe, durable → slightly slower (disk I/O per commit)
   fsync OFF → dangerous, but MUCH faster (only for dev/test environments!)

   -- PostgreSQL: disable fsync (NEVER in production!)
   SET synchronous_commit = off;  -- async, faster, but risk of data loss
   SET synchronous_commit = on;   -- default, safe, durable

Isolation Levels as the Solution to Race Conditions

Race Condition             Prevented By
─────────────────────────────────────────────────────────
Dirty Read                 READ COMMITTED and above
Non-Repeatable Read        REPEATABLE READ and above
Phantom Read               SERIALIZABLE (or gap locks)
Lost Update                REPEATABLE READ + FOR UPDATE, or SERIALIZABLE

The connection between isolation levels and race conditions:

               READ        READ        REPEATABLE   SERIAL-
               UNCOMMITTED COMMITTED   READ         IZABLE
Lost Update    possible    possible    possible*    prevented
Dirty Read     possible    prevented   prevented    prevented
Non-Repeatable possible    possible    prevented    prevented
Phantom Read   possible    possible    possible**   prevented

*  Lost update prevented with SELECT FOR UPDATE at any level
** MySQL InnoDB prevents phantom at REPEATABLE READ via gap locks

CHOOSING THE RIGHT LEVEL:
Financial systems (payments, banking) → SERIALIZABLE or READ COMMITTED + FOR UPDATE
Most web applications                 → READ COMMITTED (good balance)
Analytics / reporting (read-only)     → READ UNCOMMITTED (max speed)
Critical business reports             → REPEATABLE READ (stable snapshot)

Real-World Scenarios — How to Solve Them

Scenario 1: Flash Sale — 100 users, only 1 item left in stock

-- inventory table:
-- product_id | quantity
--     1      |    1      ← only 1 left!

-- 100 users click "Buy" at the same time!

-- ❌ WRONG approach — race condition!
BEGIN;
SELECT quantity FROM inventory WHERE product_id = 1;
-- 100 transactions all read quantity = 1
-- 100 transactions all think: "1 > 0, I can sell!"
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 1;
-- quantity goes to -99! 😱
COMMIT;

-- ✅ FIX 1 — Pessimistic Locking (FOR UPDATE)
BEGIN;
SELECT quantity FROM inventory
WHERE product_id = 1
FOR UPDATE;          ← only ONE transaction gets through at a time!
IF quantity > 0:
    UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 1;
    INSERT INTO orders ...
    COMMIT;
ELSE:
    ROLLBACK;  ← out of stock!

-- ✅ FIX 2 — Atomic UPDATE with CHECK (no separate SELECT needed!)
UPDATE inventory
SET quantity = quantity - 1
WHERE product_id = 1
AND quantity > 0;    ← guard condition in the UPDATE itself!

IF rows_affected = 1:
    success! Create order.
ELSE:
    out of stock!

-- The WHERE quantity > 0 check + atomic UPDATE prevents overselling!
-- This is the most efficient approach for high concurrency flash sales!

Scenario 2: Generating Unique Order Numbers

-- Many services generate order IDs concurrently
-- You need globally unique, sequential IDs

-- ❌ WRONG — race condition on MAX()
SELECT MAX(order_id) + 1 FROM orders;  ← two transactions can get same MAX!
INSERT INTO orders (order_id, ...) VALUES (max_plus_1, ...);
-- Duplicate order IDs! 😱

-- ✅ FIX 1 — AUTO_INCREMENT (let database handle it)
CREATE TABLE orders (
    order_id INT AUTO_INCREMENT PRIMARY KEY,
    ...
);
INSERT INTO orders (...) VALUES (...);  -- DB assigns unique ID atomically!

-- ✅ FIX 2 — SEQUENCE (PostgreSQL)
CREATE SEQUENCE order_seq START 1000 INCREMENT 1;
SELECT nextval('order_seq');  ← atomic, unique, never repeated

-- ✅ FIX 3 — UUID (universally unique, no collision)
INSERT INTO orders (order_id, ...) VALUES (UUID(), ...);
-- No central coordination needed! Works across distributed systems!

Scenario 3: Double Booking Prevention (Hotel/Airline)

-- Two users try to book the same hotel room on same dates

-- ❌ WRONG — race condition
SELECT COUNT(*) FROM bookings
WHERE room_id = 101 AND date = '2025-01-15';
-- Both T1 and T2 see count = 0 → both think room is available!
INSERT INTO bookings (room_id, date, user_id) VALUES (101, '2025-01-15', user1);
INSERT INTO bookings (room_id, date, user_id) VALUES (101, '2025-01-15', user2);
-- Room double-booked! 😱

-- ✅ FIX 1 — UNIQUE constraint (database enforces it!)
ALTER TABLE bookings
ADD UNIQUE KEY uq_room_date (room_id, date);

-- Now the second INSERT will fail with UNIQUE violation!
-- Application catches the error → "Room already booked"

-- ✅ FIX 2 — FOR UPDATE with check
BEGIN;
SELECT * FROM bookings
WHERE room_id = 101 AND date = '2025-01-15'
FOR UPDATE;                     ← Lock! No other transaction can check/book

IF no_rows_returned:
    INSERT INTO bookings ...    ← Safe! We have the lock
    COMMIT;
ELSE:
    ROLLBACK;                   ← Already booked!

Interview Questions

Q1: What is a race condition in SQL and how do you prevent it?

"A race condition in SQL happens when two or more concurrent transactions access the same data simultaneously and the final result depends on the order of execution, leading to incorrect data. For example, two users reading a balance of ₹1000 and both deducting ₹800 — both think there's enough balance, but the actual balance should go negative. Prevention methods include: using transactions with appropriate isolation levels, pessimistic locking with SELECT FOR UPDATE to serialize access, optimistic locking with a version column to detect conflicts, and using atomic operations like UPDATE ... WHERE quantity > 0 that combine the read and write check into one statement."

Q2: What is the difference between Dirty Read, Non-Repeatable Read, and Phantom Read?

"A Dirty Read happens when Transaction 1 reads data that Transaction 2 has modified but not yet committed — if T2 rolls back, T1 read invalid data. A Non-Repeatable Read happens when the same row is read twice within the same transaction but returns different values — because another committed transaction updated that row between the two reads. A Phantom Read happens when the same query returns different sets of rows within the same transaction — because another committed transaction inserted or deleted rows between the two queries. Each is prevented by progressively stricter isolation levels: READ COMMITTED prevents dirty reads, REPEATABLE READ prevents non-repeatable reads, and SERIALIZABLE prevents phantom reads."

Q3: What is the difference between Pessimistic and Optimistic Locking?

"Pessimistic locking assumes conflicts will happen and prevents them by acquiring a lock before reading the data — typically with SELECT FOR UPDATE. Other transactions must wait until the lock is released. It is safe but reduces concurrency and can cause deadlocks. Optimistic locking assumes conflicts are rare, so it reads data without locking. Before writing, it checks whether the data changed since it was read — usually via a version column. If the data changed, the transaction retries. It allows higher concurrency but can cause retry storms if conflicts are frequent. Pessimistic locking is best for financial systems with high contention; optimistic locking is best for read-heavy web apps with low write conflict."

Q4: What is a Deadlock and how do you prevent it?

"A deadlock happens when Transaction 1 holds a lock on Resource A and waits for Resource B, while Transaction 2 holds a lock on Resource B and waits for Resource A — both wait forever. The database detects this cycle and rolls back one transaction as the deadlock victim. Prevention strategies: always acquire locks in the same order across all transactions so cycles cannot form; keep transactions short to minimize lock hold time; use SELECT FOR UPDATE NOWAIT to fail immediately rather than wait; implement retry logic in the application for deadlock errors; and use lower isolation levels where possible to reduce locking."

Q5: Explain ACID properties in detail.

"Atomicity means a transaction is all-or-nothing — if any step fails, all changes are rolled back using the undo log, so the database never has partial state. Consistency means the database goes from one valid state to another, always obeying constraints like NOT NULL, UNIQUE, FOREIGN KEY, and CHECK — this is enforced by both the database engine and application logic. Isolation means concurrent transactions don't interfere with each other — each sees the database as if it were the only transaction, controlled through isolation levels and locking or MVCC. Durability means committed transactions survive crashes — enforced through write-ahead logging where changes are written to disk before being acknowledged, and fsync calls ensure the data is physically on disk and not just in OS buffer."

Q6: What is SELECT FOR UPDATE and when would you use it?

"SELECT FOR UPDATE acquires an exclusive lock on the selected rows, preventing other transactions from reading with a lock or writing to those rows until the current transaction commits or rolls back. Use it when you need to read a value and then update it based on that value within the same transaction — the classic read-modify-write pattern. For example, reading inventory quantity and then decrementing it, or reading an account balance and then deducting from it. Without SELECT FOR UPDATE, two transactions could both read the same value and both write based on it, causing a lost update. The NOWAIT variant fails immediately if the lock is unavailable, and SKIP LOCKED is useful for job queues to process rows that are not already being processed by another worker."

Q7: How does MVCC prevent blocking between readers and writers?

"MVCC — Multi-Version Concurrency Control — keeps multiple versions of each row, one for each point in time it was changed. When a reader starts a transaction, it takes a snapshot of the database at that moment. All its reads see only the version of each row that existed at snapshot time, regardless of what writers do concurrently. Writers create new versions of rows rather than overwriting in place. This means readers never block writers and writers never block readers — they operate on different versions of the same data. The database periodically cleans up old versions through a process called vacuum or garbage collection. This is how PostgreSQL, MySQL InnoDB, and Oracle achieve high concurrency while maintaining isolation."


Quick Recap

ConceptDescriptionSolution
Race Condition Two transactions read/write same data concurrently → wrong result Locking, Isolation Levels, Atomic Operations
Lost Update T2 overwrites T1's write because both read the old value SELECT FOR UPDATE or SERIALIZABLE
Dirty Read Reading uncommitted data from another transaction READ COMMITTED and above
Non-Repeatable Read Same row read twice gives different values (UPDATE by other tx) REPEATABLE READ and above
Phantom Read Same query returns different row count (INSERT/DELETE by other tx) SERIALIZABLE (or gap locks)
Shared Lock (S) Multiple readers can hold simultaneously, blocks writers SELECT FOR SHARE
Exclusive Lock (X) Only one holder, blocks all readers and writers SELECT FOR UPDATE
Pessimistic Lock Lock before read — prevents conflict upfront SELECT FOR UPDATE
Optimistic Lock No lock on read — detect conflict at write time Version column + check on UPDATE
Deadlock T1 waits for T2, T2 waits for T1 — circular wait Consistent lock order, short transactions, NOWAIT, retry
Atomicity All or nothing — no partial transactions Undo log + ROLLBACK
Consistency Always valid state — constraints never violated Constraints, FK, triggers, app logic
Isolation Concurrent transactions don't interfere Isolation levels + MVCC + Locking
Durability Committed data survives crashes Write-Ahead Log (WAL) + fsync + replication
MVCC Row versioning — readers never block writers Built into PostgreSQL, MySQL InnoDB, Oracle

Key Points to Remember

  • Race condition = two transactions, same data, wrong final result due to timing
  • 4 concurrency problems: Lost Update → Dirty Read → Non-Repeatable → Phantom (increasing severity)
  • SELECT FOR UPDATE = exclusive lock = only one transaction at a time → prevents lost update
  • SELECT FOR SHARE = shared lock = many readers, no writers
  • Pessimistic = lock first, safe but slower | Optimistic = version check, faster but retry needed
  • Deadlock = always lock in SAME ORDER across all transactions to prevent it
  • SKIP LOCKED = perfect for job queues (workers grab different rows in parallel)
  • NOWAIT = fail immediately if locked → better than waiting and risking deadlock
  • Atomicity = undo log | Durability = redo log (WAL)
  • Isolation is the ACID property that directly prevents race conditions
  • MVCC = multiple row versions → readers never block writers → high concurrency
  • READ COMMITTED is the safe default for most applications
  • SERIALIZABLE = fully safe but slowest — use only when data correctness is critical
  • Atomic UPDATE ... WHERE quantity > 0 is the best pattern for inventory/flash sales
  • UNIQUE constraint is the best way to prevent double-booking at the database level

Keep coding, keep learning! See you in the next one!