Episode 10 - Thread Pool in libuv
Hey everyone! Welcome back to the Node.js tutorial series. Today we are going to learn about the Thread Pool in libuv!
This is a very important concept that will help you understand how Node.js handles heavy operations behind the scenes.
What we will cover:
- How Thread Pool Works
- Default Thread Pool Size
- When Does libuv Use Thread Pool?
- Is Node.js Single-threaded or Multi-threaded?
- Changing Thread Pool Size
- Networking - Does it Use Thread Pool?
- epoll and kqueue
- File Descriptors and Socket Descriptors
- Event Emitters
- Streams and Buffers
How Thread Pool Works
Whenever you start an asynchronous task that the OS cannot handle in a non-blocking way, Node.js offloads it to libuv's thread pool. For example, when reading a file, libuv uses one of the threads in its thread pool.
Thread Pool Workflow:
=====================
Your Code: fs.readFile("file.txt", callback)
│
▼
V8 Engine
"This is async!"
│
▼
┌─────────────────────────────────────────┐
│ libuv │
│ │
│ Thread Pool (Default: 4 threads) │
│ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│ │Thread 1│ │Thread 2│ │Thread 3│ │Thread 4│
│ │ BUSY │ │ FREE │ │ FREE │ │ FREE │
│ │(file) │ │ │ │ │ │ │
│ └────────┘ └────────┘ └────────┘ └────────┘
│ │
└─────────────────────────────────────────┘
│
▼
Operating System
(Reads the file)
The file system (fs) call is assigned to a thread in the pool, and that thread makes a request to the OS.
Important: While the file is being read, the thread in the pool is fully occupied and cannot perform any other tasks!
Once the file reading is complete, the engaged thread is freed up and becomes available for other operations.
Example: File Read + Crypto Operation
=====================================
1. fs.readFile() → Assigned to Thread 1
2. crypto.pbkdf2() → Assigned to Thread 2
Thread 1: Reading file... BUSY
Thread 2: Hashing... BUSY
Thread 3: FREE
Thread 4: FREE
When file read completes:
Thread 1: FREE (available for new tasks)
Thread 2: Still hashing... BUSY
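You can try the scenario above with a short script. This is only a minimal sketch: the temp file name, password, salt, and iteration count are arbitrary values picked for the demo, and which pool thread picks up which task is decided by libuv, not by your code.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const crypto = require('crypto');
const { promisify } = require('util');

// Demo file (hypothetical name) so the script has something to read
const demoFile = path.join(os.tmpdir(), `pool-demo-${process.pid}.txt`);
fs.writeFileSync(demoFile, 'hello from the thread pool');

// Both calls return immediately; the work runs on libuv pool threads
const fileTask = fs.promises.readFile(demoFile, 'utf8');
const hashTask = promisify(crypto.pbkdf2)('password', 'salt', 100000, 64, 'sha512');

console.log('Main thread is free while both tasks run');

// Wait for both tasks, then clean up the demo file
const bothDone = Promise.all([fileTask, hashTask]).then(([text, key]) => {
  fs.unlinkSync(demoFile);
  console.log('File:', text);
  console.log('Hash bytes:', key.length);
  return { text, keyLength: key.length };
});
```

The "Main thread is free" line prints first, because neither task blocks the JavaScript thread.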
Default Thread Pool Size
In Node.js, the default size of the thread pool is 4 threads:
UV_THREADPOOL_SIZE = 4
Now, suppose you make 5 simultaneous file reading calls. What happens?
5 Simultaneous File Reads:
==========================
fs.readFile("file1.txt", cb1); → Thread 1 ✅
fs.readFile("file2.txt", cb2); → Thread 2 ✅
fs.readFile("file3.txt", cb3); → Thread 3 ✅
fs.readFile("file4.txt", cb4); → Thread 4 ✅
fs.readFile("file5.txt", cb5); → WAITING! ⏳
Thread Pool Status:
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Thread 1│ │Thread 2│ │Thread 3│ │Thread 4│
│ BUSY │ │ BUSY │ │ BUSY │ │ BUSY │
│(file1) │ │(file2) │ │(file3) │ │(file4) │
└────────┘ └────────┘ └────────┘ └────────┘
file5.txt: "I'm waiting for a thread to be free!" 😢
4 file calls will occupy 4 threads, and the 5th one will wait until one of the threads is free!
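A common way to see this waiting in practice is to start 5 CPU-heavy crypto tasks and print when each one finishes. The sketch below uses crypto.pbkdf2 instead of file reads because it keeps a pool thread busy long enough to measure. The iteration count is an arbitrary demo value.

```javascript
const crypto = require('crypto');

const TASKS = 5; // one more than the default pool size of 4
const start = Date.now();

// Wrap crypto.pbkdf2 in a promise that resolves with the elapsed time
function hashOnce(id) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2('password', 'salt', 200000, 64, 'sha512', (err) => {
      if (err) return reject(err);
      const elapsed = Date.now() - start;
      console.log(`Task ${id} finished after ${elapsed} ms`);
      resolve({ id, elapsed });
    });
  });
}

// Start all 5 tasks at once; the pool decides when each one runs
const allTasks = Promise.all(
  Array.from({ length: TASKS }, (_, i) => hashOnce(i + 1))
);
```

With the default pool size on a machine with at least 4 cores, you will usually see four tasks finish at roughly the same time and the fifth noticeably later. The exact numbers depend on your hardware.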
Q: When does libuv use the Thread Pool?
Whenever you perform tasks like:
- File System (fs) operations
- DNS lookups with dns.lookup() (Domain Name System)
- Cryptographic methods (crypto module)
- Compression (zlib module)
libuv uses the thread pool!
Thread Pool Usage:
==================
✅ Uses Thread Pool:
- fs.readFile()
- fs.writeFile()
- crypto.pbkdf2()
- crypto.randomBytes()
- dns.lookup()
- zlib.gzip()
❌ Does NOT Use Thread Pool:
- setTimeout()
- setImmediate()
- HTTP requests (networking)
- TCP/UDP connections
Q: Is Node.js Single-threaded or Multi-threaded?
Now that you have enough knowledge, let's answer this important question!
The Answer:
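- Your JavaScript code (synchronous code and every callback) runs on a single thread, so in that sense Node.js is single-threaded
- For asynchronous tasks such as fs, crypto, dns.lookup(), and zlib, Node.js relies on libuv's thread pool, so the runtime as a whole is multi-threaded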
Node.js Threading:
==================
Synchronous Code:
-----------------
let a = 10;
let b = 20;
console.log(a + b);
→ Single Thread (V8's main thread)
Asynchronous Code:
------------------
fs.readFile("file.txt", callback);
crypto.pbkdf2(password, salt, ...);
→ Multi-threaded (libuv thread pool)
Important Note: The order of completion is not guaranteed when using the thread pool. Whichever task finishes first gets its callback called first!
// Example: Order not guaranteed!
fs.readFile("small.txt", () => console.log("Small file"));
fs.readFile("large.txt", () => console.log("Large file"));
// Output could be:
// "Small file" then "Large file"
// OR
// "Large file" then "Small file"
// Depends on which file finishes reading first!
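If your program needs the results in a fixed order, one option is to use the promise-based fs API with Promise.all, which returns results in the order you passed the promises in, no matter which read finishes first. This is a sketch that goes slightly beyond the episode. The temp directory and file contents are made up for the demo.

```javascript
const fs = require('fs/promises');
const os = require('os');
const path = require('path');

async function readInOrder() {
  // Create two demo files (hypothetical names) in a temp directory
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'order-demo-'));
  const small = path.join(dir, 'small.txt');
  const large = path.join(dir, 'large.txt');
  await fs.writeFile(small, 'small');
  await fs.writeFile(large, 'x'.repeat(1024 * 1024));

  // Both reads run concurrently, but the result array follows input order
  const [largeText, smallText] = await Promise.all([
    fs.readFile(large, 'utf8'),
    fs.readFile(small, 'utf8'),
  ]);

  await fs.rm(dir, { recursive: true, force: true });
  return [largeText.length, smallText.length];
}

const ordered = readInOrder();
ordered.then((lengths) => console.log(lengths)); // [ 1048576, 5 ]
```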
Q: Can you change the size of the Thread Pool?
A: Yes!
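You can change the size of the thread pool by setting the UV_THREADPOOL_SIZE environment variable. libuv reads it once, when the pool is first created, so set it before any thread pool work starts.
// Change thread pool size to 8 (put this at the very top of your entry file)
process.env.UV_THREADPOOL_SIZE = '8';
// Now libuv creates 8 threads instead of 4!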
Or set it before running your Node.js application:
# Linux/Mac
UV_THREADPOOL_SIZE=8 node app.js
# Windows (CMD)
set UV_THREADPOOL_SIZE=8 && node app.js
# Windows (PowerShell)
$env:UV_THREADPOOL_SIZE=8; node app.js
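To check which value your process was started with, you can read the variable back. As far as I know, Node.js has no public API that reports the live pool size, so the sketch below only shows what was configured. configuredPoolSize is a hypothetical helper written for this example, and the fallback of 4 mirrors the default mentioned above.

```javascript
// Hypothetical helper: returns the configured size, or 4 when the variable is unset or invalid
function configuredPoolSize(env = process.env) {
  const parsed = Number.parseInt(env.UV_THREADPOOL_SIZE, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 4;
}

console.log('Configured thread pool size:', configuredPoolSize());
```

Run it as `UV_THREADPOOL_SIZE=8 node app.js` and it reports 8. Run it without the variable and it reports 4.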
When to increase thread pool size?
If your production system involves:
- Heavy file handling
- Many cryptographic operations
- Lots of DNS lookups
You can adjust the thread pool size accordingly to better suit your needs!
Example: File-heavy Application
===============================
// Default: 4 threads
// 100 users uploading files simultaneously
// Only 4 can be processed at a time!
// Solution: Increase thread pool
process.env.UV_THREADPOOL_SIZE = 16;
// Now 16 file operations can happen simultaneously!
Q: Do API Requests Use the Thread Pool?
Suppose you have a server with many incoming requests, and users are hitting APIs. Do these APIs use the thread pool?
A: NO!
API Requests:
=============
User 1 → HTTP Request → Does NOT use thread pool!
User 2 → HTTP Request → Does NOT use thread pool!
User 3 → HTTP Request → Does NOT use thread pool!
...
User 1000 → HTTP Request → Still does NOT use thread pool!
Networking is handled differently!
How Networking Works in libuv
In the libuv library, when it interacts with the OS for networking tasks, it uses sockets. Networking operations occur through these sockets.
Each socket has a socket descriptor, also known as a file descriptor (although this has nothing to do with the file system).
Networking Flow:
================
Incoming Request
│
▼
┌─────────────┐
│ Socket │ ← Socket Descriptor (fd)
└──────┬──────┘
│
▼
┌─────────────┐
│ libuv │ ← Uses OS mechanisms
└──────┬──────┘
│
Uses epoll/kqueue
(NOT thread pool!)
When a request arrives on a socket and you want to read from or write to that connection, the simple approach is a blocking call, where a thread waits until the data is ready. That approach does not scale:
- Creating a separate thread for each connection is NOT practical
- Especially when dealing with thousands of requests!
epoll and kqueue - OS-Level Mechanisms
Instead, the system uses efficient mechanisms provided by the OS:
- epoll (on Linux)
- kqueue (on macOS)
- IOCP (on Windows)
These mechanisms handle multiple file descriptors (sockets) without needing a thread per connection!
epoll/kqueue Mechanism:
=======================
Traditional Approach (Bad):
---------------------------
1000 connections = 1000 threads needed! 😱
(Not scalable, resource intensive)
epoll/kqueue Approach (Good):
-----------------------------
1000 connections = 1 monitoring mechanism! 😊
(Highly scalable, efficient)
How it works:
1. Create epoll/kqueue descriptor
2. It monitors MANY sockets at once
3. OS kernel notifies libuv of any activity
4. libuv handles only active connections
This approach allows the server to handle a large number of connections efficiently without creating a thread for each one!
The kernel-level mechanisms, like epoll and kqueue, provide a scalable way to manage multiple connections, significantly improving performance and resource utilization in a high-concurrency environment.
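You never call epoll or kqueue yourself in Node.js, but you can see the effect from JavaScript. The sketch below starts one TCP echo server and opens 50 connections to it at the same time, all served by the single JavaScript thread. The client count, the messages, and the connectOnce helper are made up for this demo.

```javascript
const net = require('net');

const CLIENTS = 50; // arbitrary number of concurrent connections for the demo

// One server, one JavaScript thread; libuv watches all sockets through the OS
const server = net.createServer((socket) => {
  socket.once('data', (chunk) => socket.end(`echo: ${chunk}`));
});

// Hypothetical helper: open one connection, send a message, collect the reply
function connectOnce(port, id) {
  return new Promise((resolve, reject) => {
    const client = net.connect(port, '127.0.0.1', () => client.write(`client ${id}`));
    let reply = '';
    client.setEncoding('utf8');
    client.on('data', (data) => { reply += data; });
    client.on('end', () => resolve(reply));
    client.on('error', reject);
  });
}

// Listen on a random free port, connect all clients, then shut the server down
const replies = new Promise((resolve, reject) => {
  server.listen(0, '127.0.0.1', () => {
    const { port } = server.address();
    Promise.all(Array.from({ length: CLIENTS }, (_, i) => connectOnce(port, i)))
      .then(resolve, reject)
      .finally(() => server.close());
  });
});

replies.then((all) => console.log(`Served ${all.length} connections on one thread`));
```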
File Descriptors (FDs) and Socket Descriptors
File Descriptors (FDs) are integral to Unix-like operating systems, including Linux and macOS. They are used by the operating system to manage open files, sockets, and other I/O resources.
Socket descriptors are a special type of file descriptor used to manage network connections. They are essential for network programming, allowing processes to communicate over a network.
File Descriptors:
=================
FD 0: stdin (Standard Input)
FD 1: stdout (Standard Output)
FD 2: stderr (Standard Error)
FD 3: Your file (when you open a file)
FD 4: Socket (when you create a connection)
FD 5: Another socket
...
Everything is a "file descriptor" in Unix!
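Node.js exposes these numbers directly. Here is a small sketch using fs.openSync, which returns the file descriptor as a plain integer. The demo file name is hypothetical, and the exact number you get depends on what your process already has open (typically 3 or higher, since 0, 1, and 2 are taken).

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical demo file in the temp directory
const demoPath = path.join(os.tmpdir(), `fd-demo-${process.pid}.txt`);
fs.writeFileSync(demoPath, 'hello');

// openSync returns the integer file descriptor
const fd = fs.openSync(demoPath, 'r');
console.log('File descriptor:', fd);

// Read through the descriptor instead of the path
const buffer = Buffer.alloc(5);
const bytesRead = fs.readSync(fd, buffer, 0, buffer.length, 0);
console.log(`Read ${bytesRead} bytes:`, buffer.toString());

// Always release the descriptor when you are done
fs.closeSync(fd);
fs.unlinkSync(demoPath);
```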
Event Emitters
Event Emitters are a core concept in Node.js, used to handle asynchronous events. They allow objects to emit named events that can be listened to by other parts of the application.
The EventEmitter class is provided by the Node.js events module.
const EventEmitter = require('events');
// Create an EventEmitter instance
const myEmitter = new EventEmitter();
// Register event listener using 'on'
myEmitter.on('greet', (name) => {
  console.log(`Hello, ${name}!`);
});
// Emit the event
myEmitter.emit('greet', 'John');
OUTPUT: Hello, John!
Key Methods:
- on(event, listener) - Register an event listener
- emit(event, ...args) - Trigger an event and pass any arguments to its listeners
- once(event, listener) - Listen only once
- removeListener(event, listener) - Remove a listener
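Here is a short sketch of how on, once, and removeListener behave together. The event name 'ping' and the log array are made up for this example.

```javascript
const EventEmitter = require('events');

const emitter = new EventEmitter();
const log = [];

// A named function, so it can be removed later
const onPing = (n) => log.push(`on:${n}`);
emitter.on('ping', onPing);
emitter.once('ping', (n) => log.push(`once:${n}`));

emitter.emit('ping', 1); // both listeners run
emitter.emit('ping', 2); // only the 'on' listener runs
emitter.removeListener('ping', onPing);
emitter.emit('ping', 3); // no listeners left, nothing happens

console.log(log); // [ 'on:1', 'once:1', 'on:2' ]
```

Note that emit calls the listeners synchronously, in the order they were registered.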
// Practical Example: Custom Event
const EventEmitter = require('events');
class UserService extends EventEmitter {
  createUser(name) {
    console.log(`Creating user: ${name}`);
    // Emit event after user is created
    this.emit('userCreated', { name, createdAt: new Date() });
  }
}
const userService = new UserService();
// Listen for userCreated event
userService.on('userCreated', (user) => {
  console.log('New user created:', user);
  // Send welcome email, log to database, etc.
});
userService.createUser('John');
OUTPUT:
Creating user: John
New user created: { name: 'John', createdAt: 2024-01-15T10:30:00.000Z }
Streams
Streams in Node.js are objects that facilitate reading from or writing to a data source in a continuous fashion. Streams are particularly useful for handling large amounts of data efficiently.
Types of Streams:
=================
1. Readable - Read data from source
2. Writable - Write data to destination
3. Duplex - Both read and write
4. Transform - Modify data while reading/writing
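To make the stream types concrete, here is a minimal sketch that connects a Readable, a Transform, and a Writable with pipeline. The upperCase and sink streams and the sample words are invented for this example.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Transform: modifies each chunk as it passes through
const upperCase = new Transform({
  transform(chunk, encoding, callback) {
    callback(null, chunk.toString().toUpperCase());
  },
});

// Writable: collects whatever arrives at the end of the pipe
const collected = [];
const sink = new Writable({
  write(chunk, encoding, callback) {
    collected.push(chunk.toString());
    callback();
  },
});

// Readable.from turns an array into a readable source
const finished = new Promise((resolve, reject) => {
  pipeline(Readable.from(['node ', 'js ', 'streams']), upperCase, sink, (err) =>
    err ? reject(err) : resolve(collected.join(''))
  );
});

finished.then((text) => console.log(text)); // NODE JS STREAMS
```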
// Example: Reading a large file with streams
const fs = require('fs');
// Instead of loading entire file into memory:
// const data = fs.readFileSync('large-file.txt'); // BAD for large files!
// Use streams:
const readStream = fs.createReadStream('large-file.txt');
readStream.on('data', (chunk) => {
  console.log('Received chunk:', chunk.length, 'bytes');
});
readStream.on('end', () => {
  console.log('File reading complete!');
});
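Streams are most useful when you chain them. The sketch below compresses a file chunk by chunk with zlib, so the whole file never has to sit in memory. It uses pipeline from stream/promises (available in recent Node.js versions), which also forwards errors from any stream in the chain. The file names and contents are made up for the demo.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const { pipeline } = require('stream/promises');

async function gzipFile() {
  // Create a demo source file (hypothetical name) in a temp directory
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'stream-demo-'));
  const source = path.join(dir, 'large-file.txt');
  const target = `${source}.gz`;
  fs.writeFileSync(source, 'line of text\n'.repeat(100000));

  // Read -> compress -> write, one chunk at a time
  await pipeline(
    fs.createReadStream(source),
    zlib.createGzip(),
    fs.createWriteStream(target)
  );

  const sizes = {
    original: fs.statSync(source).size,
    compressed: fs.statSync(target).size,
  };
  fs.rmSync(dir, { recursive: true, force: true });
  return sizes;
}

const gzipResult = gzipFile();
gzipResult.then((sizes) => console.log(sizes));
```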
Buffers
Buffers are used to handle binary data in Node.js. They represent a fixed-length sequence of bytes.
// Creating buffers
// From string
const buf1 = Buffer.from('Hello');
console.log(buf1); // <Buffer 48 65 6c 6c 6f>
// Allocate buffer of specific size
const buf2 = Buffer.alloc(10);
console.log(buf2); // <Buffer 00 00 00 00 00 00 00 00 00 00>
// Convert buffer to string
console.log(buf1.toString()); // "Hello"
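Two things often surprise people about buffers: their length counts bytes rather than characters, and they cannot grow. Here is a small sketch of both, with sample strings chosen only for illustration.

```javascript
// Byte length vs. character length ('\u00e9' is the letter e with an acute accent)
const text = 'h\u00e9llo';
const buf = Buffer.from(text, 'utf8');
console.log(text.length); // 5 characters
console.log(buf.length); // 6 bytes, the accented letter takes two bytes in UTF-8
console.log(buf.toString('hex')); // 68c3a96c6c6f

// Buffers are fixed-length: extra data is cut off, not appended
const fixed = Buffer.alloc(4);
const written = fixed.write('abcdefgh');
console.log(written, fixed.toString()); // 4 abcd
```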
Quick Recap
| Concept | Description |
|---|---|
| Thread Pool | 4 threads by default for heavy operations |
| Uses Thread Pool | fs, crypto, dns, zlib operations |
| Doesn't Use Thread Pool | Networking (HTTP, TCP, UDP) |
| UV_THREADPOOL_SIZE | Environment variable to change pool size |
| epoll/kqueue | OS mechanisms for efficient networking |
| Event Emitters | Emit and listen to custom events |
| Streams | Handle data in chunks (efficient for large data) |
Interview Questions
Q: Is Node.js single-threaded or multi-threaded?
"Your JavaScript code, including callbacks, runs on a single thread (the main thread running V8). For asynchronous operations like file system, crypto, and DNS lookups, libuv uses a thread pool, so the runtime as a whole is multi-threaded. The default thread pool size is 4."
Q: What is the default thread pool size and how to change it?
"The default thread pool size is 4. You can change it by setting the UV_THREADPOOL_SIZE environment variable, for example: process.env.UV_THREADPOOL_SIZE = 8"
Q: Do HTTP requests use the thread pool?
"No. Networking operations like HTTP requests don't use the thread pool. Instead, libuv uses OS-level mechanisms like epoll (Linux), kqueue (macOS), or IOCP (Windows) to handle many connections efficiently."
Q: What operations use the thread pool?
"File system operations (fs module), cryptographic operations (crypto module), DNS lookups (dns.lookup), and compression (zlib module) use the thread pool."
Key Points to Remember
- Thread pool default size = 4 threads
- fs, crypto, dns, zlib use thread pool
- Networking does NOT use thread pool
- UV_THREADPOOL_SIZE to change pool size
- Execution order not guaranteed with thread pool
- epoll/kqueue handle networking efficiently
- JavaScript runs on a single thread; libuv's thread pool adds threads for certain async work
- Event Emitters for custom events
- Streams for handling large data
- Buffers for binary data
What's Next?
Now you understand how the thread pool works in libuv! In the next episode, we will:
- Build a Node.js server
- Handle HTTP requests
- Create REST APIs
Keep coding, keep learning! See you in the next one!