cluster

Clusters in Node.js

What are Clusters?

Clusters allow Node.js applications to run multiple instances of the same code on different CPUs or cores. This is useful for distributing workloads and improving performance.

How to Create a Cluster?

To create a cluster, use the cluster module:

import cluster from "node:cluster";

The cluster module provides two types of processes:

  • Primary Process: The process that creates and manages child processes.

  • Child Process (Worker): Processes that run the application code.

Primary Process

The primary process is responsible for:

  • Creating child processes (workers).

  • Managing worker processes (e.g., listening for events, restarting crashed workers).

  • Distributing incoming requests to worker processes.

Child Process (Worker)

Worker processes are responsible for:

  • Executing the application code.

  • Sharing server ports with other worker processes.

  • Accepting incoming requests and processing them independently.

Code Snippet for Creating a Cluster

Here's a simplified code snippet that creates a cluster:

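import cluster from "node:cluster";
import http from "node:http";
import os from "node:os";

// Number of logical CPU cores reported by the operating system
const numCPUs = os.cpus().length;

if (cluster.isPrimary) {
  // Create worker processes equal to the number of available CPUs
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }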

  // Listen for worker exit events
  cluster.on("exit", (worker) => {
    console.log(`Worker ${worker.process.pid} exited`);
  });
} else {
  // Create an HTTP server in each worker
  http
    .createServer((req, res) => {
      res.writeHead(200);
      res.end("Hello world!");
    })
    .listen(8000);

  console.log(`Worker ${process.pid} started`);
}

Real-World Applications

Clusters are useful in applications that require high performance and scalability. Some real-world applications include:

  • Web servers that handle high traffic volumes.

  • Data processing applications that need to be distributed across multiple cores.

  • Machine learning applications that require parallel processing.


Simplified Explanation of Node.js Cluster Module

Imagine a group of workers who are waiting for tasks to do. The cluster module creates these workers, which are separate from the main process.

How Workers Get Tasks

There are two ways for workers to get tasks:

  1. Round-robin: The primary process listens for incoming connections and hands them to the workers in turn. This is the default method on all platforms except Windows.

  2. Primary-to-worker: The primary process creates the listening socket and sends it to the interested workers. The workers then accept connections directly. This method should be faster in theory, but in practice the operating system scheduler tends to spread connections unevenly across the workers.
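The choice between these two approaches is exposed as cluster.schedulingPolicy. The following is a minimal sketch, assuming it runs in the primary before any worker has been forked, since the setting is effectively frozen once the first worker is spawned. The variable name policyName is only for illustration.

```javascript
const cluster = require("node:cluster");

if (cluster.isPrimary) {
  // SCHED_RR selects round-robin; SCHED_NONE leaves distribution to the
  // operating system. Set this before the first cluster.fork() call.
  cluster.schedulingPolicy = cluster.SCHED_RR;

  const policyName =
    cluster.schedulingPolicy === cluster.SCHED_RR ? "round-robin" : "os-managed";
  console.log(`Scheduling policy: ${policyName}`);
}
```

The same choice can be made without code through the NODE_CLUSTER_SCHED_POLICY environment variable, whose valid values are "rr" and "none".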

Worker Differences from Main Process

When working with workers, there are a few differences to keep in mind:

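  1. Listening on File Descriptors: A call to server.listen({fd: 7}) is forwarded to the primary process, so file descriptor 7 as the primary sees it is listened on and the resulting handle is passed to the worker. The worker's own idea of what descriptor 7 refers to is not used.

  2. Explicit Handle Listening: Using server.listen(handle) will cause the worker to use the given handle directly, rather than asking the primary process for one.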

  3. Random Ports: When using server.listen(0), workers will all listen on the same "random" port. To use unique ports, generate them based on the worker ID.
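The unique-port idea from the last item can be sketched as follows. The helper name portForWorker and the base port 3000 are assumptions made for this illustration; the only cluster fact relied on is that worker ids are small integers starting at 1.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: derive a per-worker port from a base port.
// Worker ids start at 1, so worker 1 gets basePort + 1.
function portForWorker(basePort, workerId) {
  return basePort + workerId;
}

if (cluster.isWorker) {
  const http = require("node:http");
  const port = portForWorker(3000, cluster.worker.id);
  // Each worker listens on its own port instead of sharing one.
  http.createServer((req, res) => res.end("ok")).listen(port);
}
```

Note that workers listening on distinct ports no longer share incoming connections, so this only makes sense when each worker needs to be addressed individually.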

Why Use the Cluster Module?

The cluster module is often used for networking, such as:

  • Web servers (like running multiple instances of a Node.js server to handle more requests)

  • Chat servers

However, it can also be used for other applications that require separate processes, such as:

  • Parallel data processing

  • Background tasks

Real-World Example

Let's say you want to run a web server that can handle more requests:

const cluster = require("cluster");
const express = require("express");

if (cluster.isPrimary) {
  // Primary process
  for (let i = 0; i < 4; i++) {
    cluster.fork();
  }
} else {
  // Worker process
  const app = express();
  app.get("/", (req, res) => {
    res.send("Hello from worker " + cluster.worker.id);
  });
  app.listen(3000);
}

In this example:

  • The primary process starts 4 worker processes.

  • Each worker process runs a separate instance of the web server, listening on port 3000.

  • Incoming connections are distributed to the workers in a round-robin fashion (the default on every platform except Windows).

  • The worker serves a simple response indicating its ID.

This allows you to scale your web server by adding more worker processes.


Simplified Explanation of Worker Class

What is a Worker?

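Imagine you have a big task that needs to be completed. Instead of doing it all yourself, you can divide it into smaller parts and hire workers to do them. In Node.js, each of these workers is a separate operating system process with its own memory and event loop, so a crash in one worker does not take the others down with it.

The Worker class represents one of these workers. It contains the public information about a worker and provides methods to control it. In the primary, Worker objects are obtained through cluster.workers or as the return value of cluster.fork(); in a worker, the object for the current process is cluster.worker.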

Extends: {EventEmitter}

This means that Worker inherits all the properties and methods of the EventEmitter class. An EventEmitter can trigger events that other objects can listen to and respond to.

Properties

  • id: A unique number that identifies the worker.

  • process: The underlying Node.js process that is running the worker.

  • exitedAfterDisconnect: true if the worker exited after .disconnect() was called, false if it exited any other way, and undefined while it has not exited.

Methods

  • disconnect(): Disconnects the worker from the cluster.

  • send(message): In the primary, sends a message to the worker. In a worker, sends a message to the primary.

  • kill([signal]): Kills the worker process.

  • isDead(): Checks if the worker is dead.

  • on("event", listener): Adds an event listener to the worker. For example, you can listen for the "online" event to know when the worker becomes available.

Real-World Applications

  • Parallel Processing: Using multiple workers allows you to process data in parallel, making your applications faster.

  • Task Queuing: Workers can be used to manage a queue of tasks, ensuring that all tasks are processed in the correct order.

  • Fault Tolerance: If a worker fails, the primary can listen for its 'exit' event and fork a replacement worker to keep the application running. The cluster module does not do this on its own.

Example

const cluster = require("node:cluster");

if (cluster.isPrimary) {
  // In the primary process
  const worker = cluster.fork();

  cluster.on("online", (w) => {
    console.log(`Worker ${w.id} is now online.`);
  });

  worker.on("message", (message) => {
    console.log("Received message from worker:", message);
  });
} else {
  // In a worker process
  const worker = cluster.worker;

  worker.on("disconnect", () => {
    console.log("Worker is disconnecting.");
  });

  worker.send({ message: "Hello primary!" });
}

Event: 'disconnect'

Explanation:

The 'disconnect' event is emitted when a worker process in a cluster disconnects. It's similar to the 'disconnect' event emitted by the entire cluster, but this event is specific to the individual worker process that disconnected.

Simplified Example:

Imagine you have a cluster of worker processes handling tasks. If one of those workers suddenly loses connection, the cluster will emit a 'disconnect' event for that specific worker process.

Code Snippet:

const cluster = require("cluster");

cluster.fork().on("disconnect", () => {
  console.log("Worker disconnected!");
});

Real-World Example:

In a web server, you might use a cluster of worker processes to handle incoming requests. If one of the workers crashes or disconnects, the cluster will emit a 'disconnect' event for that worker. You can use this event to log the disconnect, notify a monitoring system, or automatically spawn a new worker process to replace the disconnected one.

Potential Applications:

  • Monitoring and logging worker process disconnects

  • Automatically replacing disconnected worker processes

  • Notifying external systems of worker process disconnections


Event: 'error'

This event is emitted on a Worker object and is the same event that child_process.fork() provides: it fires when the worker process could not be spawned, could not be killed, or when sending a message to it failed. Within a worker, process.on('error') may also be used.

Example:

const cluster = require("node:cluster");

if (cluster.isPrimary) {
  // Create four workers and listen for 'error' on each Worker object
  for (let i = 0; i < 4; i++) {
    const worker = cluster.fork();

    // The listener receives a single Error argument
    worker.on("error", (err) => {
      console.error(`Worker ${worker.id} reported an error: ${err.message}`);
    });
  }
}

Real-world applications:

  • Monitoring worker processes for errors

  • Replacing worker processes that could not be spawned or could no longer be reached (a crash of a running worker is reported through 'exit' instead)


Event: 'exit'

Explanation:

Imagine a worker in a cluster of computers as a robot in a factory. When the robot finishes its task or has a problem, it needs to let the boss, the primary process, know. This 'exit' event is that robot's way of communicating with the boss.

Parameters:

  • code: Number. If the robot finished its task normally, this number will tell the boss how the task went. A 0 means success, while other numbers indicate errors.

  • signal: String. If the robot had a problem and had to be stopped, this string will tell the boss what happened. For example, it might say 'SIGHUP' if the robot received a hang-up signal from the boss.

Example:

// In the boss's code (primary process)
const cluster = require("node:cluster");
const worker = cluster.fork(); // Create a new robot

worker.on("exit", (code, signal) => {
  // When the robot finishes or has a problem, the boss will get a message
  if (signal) {
    console.log(`The robot was stopped because of a problem: ${signal}`);
  } else if (code !== 0) {
    console.log(`The robot had an error: ${code}`);
  } else {
    console.log("The robot finished its task successfully!");
  }
});

Real-World Applications:

This event is helpful for the boss to monitor its robots and know when they need attention. For example, if a robot has an error, the boss can restart it or investigate the issue.


Event: 'listening'

The 'listening' event is emitted on a Worker object in the primary process when that worker starts listening for incoming connections on a specific address. It is not emitted inside the worker itself.

  • address {Object}

This object contains the address and port that the worker is listening on.

// Example
const cluster = require("cluster");
const worker = cluster.fork();

// Listen for the 'listening' event
worker.on("listening", (address) => {
  console.log(
    `Worker ${worker.id} is listening on ${address.address}:${address.port}`
  );
});

Real-World Applications

The 'listening' event can be used to:

  • Log when a worker starts listening for connections.

  • Monitor the status of workers in a cluster.

  • Detect workers that never start listening, for example by combining the event with a startup timeout.


Event: 'message'

  • message {Object}

  • handle {undefined|Object}

Simplified Explanation:

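When you run a Node.js script using the cluster module, it creates multiple worker processes. Each worker can exchange messages with the primary process (the one that started the workers) over an IPC channel. Workers have no direct channel to each other, so worker-to-worker traffic has to be relayed through the primary. In the primary, the 'message' event on a Worker object is emitted when a message arrives from that worker. Inside a worker, process.on('message') receives messages from the primary.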

Real-World Example:

Let's say you have a script that needs to perform a lot of calculations and you want to split the work between multiple processes. You can use the cluster module to create worker processes and have them share the calculations. To make the workers communicate, you can use the 'message' event:

// Primary process
const cluster = require("node:cluster");

if (cluster.isPrimary) {
  const workers = [];

  // Create worker processes
  for (let i = 0; i < 4; i++) {
    workers.push(cluster.fork());
  }

  // Send messages to workers
  for (const worker of workers) {
    worker.send({ message: "Hello from primary!" });
  }

  for (const worker of workers) {
    worker.on("message", (message) => {
      console.log(`Worker ${worker.id} replied: ${message.message}`);
    });
  }
} else {
  // Worker process
  process.on("message", (message) => {
    console.log(`Message received from primary: ${message.message}`);
    process.send({ message: `Hello from worker ${cluster.worker.id}!` });
  });
}

In this example, the primary process creates four worker processes. It sends a message to each worker, and then listens for messages from the workers. Each worker replies to the primary process with a message of its own.

Other Applications:

The 'message' event can be used for various purposes, such as:

  • Sharing data between workers (relayed through the primary)

  • Coordinating work between workers (relayed through the primary)

  • Sending notifications from workers to the primary process


Event: 'online'

Explanation:

When a cluster worker process starts running successfully and becomes available to handle tasks, it emits the 'online' event. This event is triggered only in the parent process, not in the worker process itself.

Simplified Example:

Imagine you have a cluster of workers that are handling tasks. When a new worker is started and becomes online, the parent process will receive the 'online' event, indicating that the worker is now ready to process requests.

Code Example:

const cluster = require("cluster");

cluster.fork().on("online", () => {
  console.log("Worker is now online");
});

In this example, the 'online' event is triggered when the worker process starts successfully. The event handler logs a message to the console, indicating that the worker is now available.

Real-World Application:

The 'online' event can be used to monitor worker startup in a cluster. For example, you can start a timer when a worker is forked and treat the worker as failed if 'online' has not fired before the timer expires. The event says nothing about workers that fail later; those are reported through 'disconnect' and 'exit'.


Simplified Explanation of worker.disconnect() Method in Node.js

What is worker.disconnect()?

worker.disconnect() is a method used to disconnect a worker process from the primary process in a Node.js cluster.

When to Use worker.disconnect()?

You would typically use worker.disconnect() when you want to:

  • Shut down a worker process gracefully.

  • Close all servers that the worker process is listening on.

How It Works:

  • In a worker process, worker.disconnect() closes all servers, waits for the 'close' event on each of them, and then disconnects the IPC channel to the primary process.

  • In the primary process, worker.disconnect() sends an internal message to the worker process, causing it to call worker.disconnect() on itself.

Important Note:

  • After calling worker.disconnect(), the exitedAfterDisconnect property on the worker object is set to true.

  • After a server is closed, it will no longer accept new connections (other listening workers may still accept them), but existing connections will be allowed to close as usual.

  • Existing client connections are not automatically closed by workers.

Real-World Application:

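Consider a scenario where you have a cluster of worker processes running a web server. When a worker process is no longer needed (e.g., due to low traffic), you can use worker.disconnect() to gracefully shut it down. The worker stops accepting new connections, lets existing ones finish, and then exits cleanly. Because long-lived connections can delay this indefinitely, it can be useful to add a timeout that kills the worker if the 'disconnect' event has not fired after some time.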

Improved Code Example:

const cluster = require("node:cluster");

if (cluster.isPrimary) {
  // Fork a worker
  const worker = cluster.fork();

  // Send a 'shutdown' message to the worker after a period of inactivity
  setTimeout(() => {
    worker.send("shutdown");
  }, 2000);

  // Listen for the 'disconnect' event on the worker
  worker.on("disconnect", () => {
    console.log(`Worker ${worker.id} disconnected.`);
  });
} else if (cluster.isWorker) {
  process.on("message", (msg) => {
    if (msg === "shutdown") {
      // Close server connections and other resources
      // Then call `cluster.worker.disconnect()` to initiate graceful shutdown
      cluster.worker.disconnect();
    }
  });
}

worker.exitedAfterDisconnect

  • Explanation:

    The worker.exitedAfterDisconnect property tells you if a worker process exited because you explicitly told it to disconnect (using .disconnect()) or if it exited for some other reason.

    If the property is true, it means the worker exited because you called .disconnect(). If it's false, it means the worker exited in some other way, such as a crash. While the worker has not exited, the property is undefined.

  • Simplified Explanation:

    Imagine you have a bunch of worker processes doing tasks for you. If you want one of them to stop, you can ask it to "disconnect" nicely. This sets the exitedAfterDisconnect property to true.

    If the worker process crashes or exits for any other reason, the exitedAfterDisconnect property will be false.

  • Real World Application:

    The primary process can use the exitedAfterDisconnect property to decide whether or not to respawn a worker process. If the property is true, the primary process knows that the worker exited voluntarily and doesn't need to be respawned. If the property is false, the primary process knows that the worker exited accidentally and should be respawned.

  • Code Example:

cluster.on("exit", (worker, code, signal) => {
  if (worker.exitedAfterDisconnect === true) {
    console.log("The worker exited voluntarily. No need to respawn.");
  } else {
    console.log("The worker exited accidentally. Respawning.");
    cluster.fork();
  }
});

Worker ID

Explanation: Each worker process (a separate instance of your Node.js application) that's created by the cluster module is assigned a unique ID number. This ID helps distinguish between different worker processes.

Code Snippet:

const cluster = require("cluster");

cluster.on("online", (worker) => {
  console.log(`Worker ${worker.id} is online.`);
});

Real-World Application: In a real-world application, you can use worker IDs to track individual worker processes and handle them differently based on their ID. For example, you could assign specific tasks to different workers based on their ID.

Example: Consider a web server application that serves multiple client requests. Each worker process could be assigned a range of client IDs to handle. This allows you to distribute requests evenly across workers and improves scalability.

Cluster API:

  • cluster.workers: An object, keyed by worker id, that holds the active Worker objects. It is only available in the primary process, and a worker is removed from it once it has disconnected and exited.
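Because cluster.workers is a plain object keyed by id, iterating over its values reaches every active worker. The sketch below uses a hypothetical helper named broadcast and a made-up { cmd: "reload" } message; neither is part of the cluster API.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: send the same message to every worker in a
// cluster.workers-style object (worker id -> Worker). Returns the count.
function broadcast(workers, message) {
  let sent = 0;
  for (const worker of Object.values(workers)) {
    worker.send(message);
    sent += 1;
  }
  return sent;
}

if (cluster.isPrimary) {
  // cluster.workers only contains workers that have been forked so far.
  const count = broadcast(cluster.workers, { cmd: "reload" });
  console.log(`Notified ${count} workers`);
}
```

Passing the workers object as a parameter, rather than reading cluster.workers inside the helper, keeps the function easy to exercise with stand-in objects.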


worker.isConnected()

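This function returns true if the worker is connected to its primary process through its IPC channel, and false otherwise. A worker is connected after it has been created and is disconnected once the 'disconnect' event has been emitted.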

Simplified Explanation:

Imagine a worker process as a child process that's working for a primary process, like a manager. The worker.isConnected() function lets you check if the child process is still talking to its manager.

Code Example:

// In the primary process:
const cluster = require("cluster");

const worker = cluster.fork();

worker.on("message", (message) => {
  if (worker.isConnected()) {
    // The worker is still connected to the primary
  } else {
    // The worker has disconnected
  }
});

Real-World Applications:

  • Load Balancing: In a cluster of worker processes, the primary can check if a worker has disconnected and launch a new one to replace it.

  • Error Handling: If a worker disconnects unexpectedly, the primary can log the error and investigate why it happened.

  • Process Monitoring: The primary can monitor the status of its worker processes and ensure they're all running smoothly.
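One common use of this check is to guard worker.send(), which cannot deliver anything once the IPC channel is closed. A minimal sketch, assuming a hypothetical helper named safeSend:

```javascript
// Hypothetical helper: only send when the IPC channel is still open.
// Returns true if the message was handed to worker.send(), false if
// the worker was already disconnected.
function safeSend(worker, message) {
  if (!worker.isConnected()) {
    return false;
  }
  worker.send(message);
  return true;
}

// Usage in the primary (assumes `worker` came from cluster.fork()):
// if (!safeSend(worker, { cmd: "ping" })) {
//   console.log(`Worker ${worker.id} is no longer connected`);
// }
```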


worker.isDead()

This function checks if a worker process has terminated. It returns true if the worker has exited or has been terminated by a signal, and false otherwise.

const cluster = require("node:cluster");

if (cluster.isPrimary) {
  // Create a worker and check if it's dead
  const worker = cluster.fork();
  console.log(`Worker ${worker.process.pid} is dead: ${worker.isDead()}`); // false

  // Termination is asynchronous, so check again once 'exit' has fired
  worker.on("exit", () => {
    console.log(`Worker ${worker.process.pid} is dead: ${worker.isDead()}`); // true
  });

  // Kill the worker
  worker.process.kill("SIGTERM");
}

Real-world applications

This function can be useful for monitoring worker processes and taking appropriate actions when a worker dies, such as restarting it or logging an error.

For example, a web application could use this function to automatically restart a worker process if it crashes, ensuring that the application remains available to users.


What is worker.kill()?

worker.kill() is a function in Node.js's cluster module that allows you to kill (stop) a worker process.

How does worker.kill() work?

In the primary process, worker.kill() disconnects worker.process and then kills it with the given signal (SIGTERM by default). In the worker, it simply kills the process with that signal.

Why use worker.kill()?

You might need to use worker.kill() if a worker process becomes unresponsive or if you want to stop the process.

Simplified example:

const cluster = require("cluster");

// Create a worker
const worker = cluster.fork();

// Kill the worker after 10 seconds
setTimeout(() => {
  worker.kill();
}, 10000);

In this example, the worker process will be killed after 10 seconds.

Real-world applications:

worker.kill() can be useful in situations where you need to stop a worker process that is misbehaving or no longer needed. For example, you might use worker.kill() to stop a worker process that is consuming too many resources or is causing errors.

Potential applications:

  • Stopping unresponsive worker processes

  • Shutting down worker processes when the server is shutting down

  • Terminating worker processes that are no longer needed
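A common pattern for the shutdown cases above is to ask for a graceful disconnect first and fall back to a forced kill. The sketch below assumes a hypothetical helper named stopWorker and uses a stand-in FakeWorker class so that it runs without forking; a real worker would come from cluster.fork().

```javascript
const { EventEmitter } = require("node:events");

// Hypothetical helper: ask a worker to disconnect, then force-kill it if
// the 'disconnect' event has not arrived within graceMs milliseconds.
function stopWorker(worker, graceMs = 5000) {
  const timer = setTimeout(() => worker.kill("SIGKILL"), graceMs);
  // Register the listener before disconnecting so the event is not missed.
  worker.once("disconnect", () => clearTimeout(timer));
  worker.disconnect();
  return timer;
}

// Stand-in for a real Worker, for illustration only.
class FakeWorker extends EventEmitter {
  constructor() {
    super();
    this.killed = false;
  }
  disconnect() {
    this.emit("disconnect"); // pretend the channel closed immediately
  }
  kill() {
    this.killed = true;
  }
}

const demo = new FakeWorker();
stopWorker(demo, 100);
console.log(`Force-killed: ${demo.killed}`);
```

Because the stand-in disconnects immediately, the timer is cleared and kill() is never reached. With a real worker that holds long-lived connections, the timer is what bounds the shutdown time.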


worker.process

  • {ChildProcess}

Essentially, worker.process is a representation of the child process that is created when a new worker is spawned using child_process.fork(). The worker.process object holds information about the child process, such as its process ID (PID), status, and the communication channels between the parent and child processes.

In a worker process, worker.process is the global process object. This means that within a worker, you can use process in the same way that you would in a standalone Node.js script.

For example, you can use the process.exit(0) method to terminate the worker process. Additionally, workers will call process.exit(0) if the 'disconnect' event occurs on process and .exitedAfterDisconnect is not true. This protects against accidental disconnection: a worker that has lost its IPC channel to the primary does not keep running unsupervised.

Real-World Use Case

Clustered workers can be used in a variety of real-world applications, such as:

  • Web servers: Balancing requests across multiple worker processes can improve the performance and scalability of a web server.

  • Background tasks: Long-running tasks, such as data processing or email sending, can be offloaded to worker processes, freeing up the main event loop for more responsive tasks.

  • Microservices: Workers can be used to implement small, independent services that can be deployed and managed separately from the main application.

  • Fault tolerance: In the event of a worker process failure, the primary can fork a replacement from an 'exit' listener, which helps keep the application available.

Here is an example of a complete code implementation for a simple web server that uses clustered workers:

const cluster = require("cluster");
const http = require("http");

if (cluster.isPrimary) {
  // In the primary process, create worker processes
  const numCPUs = require("os").cpus().length;
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // In a worker process, create an HTTP server
  http
    .createServer((req, res) => {
      res.writeHead(200);
      res.end("Hello World!");
    })
    .listen(8080);
}

When this script is run, the primary process will create multiple worker processes, each of which will run its own copy of the HTTP server. This allows the web server to handle multiple requests concurrently and improves its overall performance.


Simplified Explanation of worker.send:

Imagine a cluster of computers working together like a team. Each computer is like a "worker," and there's a "primary" computer coordinating them.

What does worker.send do?

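In the primary, worker.send() sends a message to that specific worker. In a worker, it sends a message to the primary and is identical to process.send(). Workers cannot address each other directly. You can also send a special "handle" along with the message.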

Sending a Message:

Let's say the primary wants to tell a worker, "Hey, do this task."

// In the primary process
const worker = cluster.fork();
worker.send({ task: "process data" });

In the worker process, there's a 'message' event that listens for messages from the primary. When it gets the message, it starts processing the data.

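Sending a Handle:

A "handle" lets the primary pass a live network object to a worker. The optional second argument of worker.send() accepts an object such as a net.Server or net.Socket, and the receiving process gets it as the second argument of its 'message' listener. A file descriptor returned by fs.openSync() is just a number that is only meaningful inside the process that opened it, so placing it in the message does not share the file.

// In the primary process
const net = require("node:net");
const server = net.createServer();
server.listen(0, () => {
  // Pass the listening server as the handle (second argument)
  worker.send({ task: "accept connections" }, server);
});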

Potential Applications:

  • Distributed computing: Divide a large task into smaller parts and distribute them to workers for faster processing.

  • Load balancing: Automatically distribute tasks based on the workload of each worker, ensuring efficient use of resources.

  • Error handling: Send error messages from workers to the primary for centralized logging and debugging.

Code Examples:

Primary sends a message to a worker:

const cluster = require("cluster");

if (cluster.isPrimary) {
  const worker1 = cluster.fork();
  worker1.send("Hello from the primary!");
}

Worker receives a message from the primary:

if (cluster.isWorker) {
  process.on("message", (msg) => {
    console.log(msg); // Prints 'Hello from the primary!'
  });
}

Primary sends a message with a handle to a worker:

// In the primary process
const net = require("node:net");
const server = net.createServer();
server.listen(0, () => {
  // The handle travels as the second argument, not inside the message
  worker1.send({ task: "accept connections" }, server);
});

Worker receives a message with a handle:

// In the worker process
process.on("message", (msg, handle) => {
  const { task } = msg;
  console.log("Task:", task, "- handle received:", handle !== undefined);
});

Simplified Explanation of the 'disconnect' Event:

What it means: The 'disconnect' event is emitted after the IPC channel between a worker process and the primary process has disconnected. There may be a delay between the 'disconnect' and 'exit' events. A disconnect can happen for several reasons:

  • The worker process has crashed or exited.

  • The worker process has been terminated.

  • The worker process has been disconnected manually.

When it's useful: You should handle this event if you want to know when a worker process is no longer available. You can then take appropriate actions, such as respawning the worker or logging the event.

Code Example:

// Import the cluster module
const cluster = require("cluster");

// Create a worker
cluster.fork();

// Listen for the 'disconnect' event on the cluster; the worker that
// disconnected is passed to the listener
cluster.on("disconnect", (worker) => {
  console.log(`Worker ${worker.id} has disconnected`);
});

Real-World Application: In a real-world application, you might use the 'disconnect' event to:

  • Log the disconnect event for debugging purposes.

  • Respawn the worker process to ensure that your application continues to function properly.

  • Send a notification to other workers in the cluster to let them know that one of their peers has disconnected.


'exit' Event

When a worker process in a Node.js cluster exits, the cluster module emits an 'exit' event. The listener receives the exited worker together with its exit code and the signal that caused the exit, if any.

Code

The code argument contains the exit code of the worker process if it exited on its own: 0 for a normal exit and a non-zero integer for an error. If the worker was killed by a signal, code is null.

Signal

The signal argument contains the name of the signal (for example 'SIGHUP') that caused the worker process to be killed. If the worker exited on its own, this will be null.

Event Handler

To handle the 'exit' event, you can add a listener using the cluster.on() method. The event handler function will receive three arguments:

  • worker: The worker process that exited.

  • code: The exit code of the worker.

  • signal: The name of the signal that caused the exit.
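The relationship between code and signal can be made explicit with a small formatting function. This is a sketch only: describeExit is a hypothetical name, and it assumes that code is null whenever a signal ended the process.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: turn the (code, signal) pair into a readable reason.
function describeExit(code, signal) {
  if (signal) {
    return `killed by signal ${signal}`;
  }
  if (code === 0) {
    return "exited cleanly";
  }
  return `exited with error code ${code}`;
}

if (cluster.isPrimary) {
  // The listener is inert until a worker that was forked elsewhere exits.
  cluster.on("exit", (worker, code, signal) => {
    console.log(`Worker ${worker.id} ${describeExit(code, signal)}`);
  });
}
```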

Example

The following code shows how to handle the 'exit' event and restart the exited worker:

cluster.on('exit', (worker, code, signal) => {
  console.log(`worker ${worker.process.pid} died (${signal || code}). restarting...`);
  cluster.fork();
});

Real-World Applications

The 'exit' event is useful for monitoring worker processes and restarting them if they fail. This can help to ensure that your application remains available even if individual workers encounter errors.

One potential application of the 'exit' event is in a web server. The cluster module does not restart workers on its own, but if a worker crashes, an 'exit' listener like the one above can fork a replacement so that the web server remains responsive.


Event: 'fork'

  • worker {cluster.Worker}

The 'fork' event is emitted when a new worker process is created in a cluster. This event can be used to perform custom actions, such as logging or setting up timeouts, when a new worker is created.

Example: Setting up timeouts for worker processes

// Create a timeout for each worker
const timeouts = [];
cluster.on("fork", (worker) => {
  timeouts[worker.id] = setTimeout(() => {
    console.error(`Worker ${worker.id} has not responded in time.`);
  }, 2000);
});

// Clear the timeout when the worker starts listening
cluster.on("listening", (worker, address) => {
  clearTimeout(timeouts[worker.id]);
});

// Clear the timeout when the worker exits
cluster.on("exit", (worker, code, signal) => {
  clearTimeout(timeouts[worker.id]);
});

In this example, a timeout is created for each worker process when it is forked. If the worker process does not start listening within 2 seconds, an error message is logged. The timeout is cleared when the worker process starts listening or exits.

Potential applications

The 'fork' event can be used to perform a variety of tasks, such as:

  • Logging worker activity

  • Setting up timeouts for worker processes

  • Creating custom metrics or monitoring dashboards

  • Performing any other custom actions that need to be performed when a new worker process is created


Event: 'listening'

When a worker calls listen(), the primary process will receive a 'listening' event. This event tells the primary that the worker is listening on a specific IP address and port.

Parameters:

  • worker: The worker that is listening.

  • address: An object containing the address, port, and addressType of the socket that the worker is listening on.

Example:

// In the primary process
cluster.on("listening", (worker, address) => {
  console.log(
    `Worker ${worker.id} is listening on ${address.address}:${address.port}`
  );
});

// In a worker process
server.listen(8080);
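Besides address and port, the address object carries an addressType field: 4 for TCPv4, 6 for TCPv6, -1 for a Unix domain socket, and "udp4" or "udp6" for UDP. The sketch below labels those values; describeAddressType is a hypothetical helper name.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: label the addressType field of the address object.
function describeAddressType(addressType) {
  switch (addressType) {
    case 4:
      return "TCPv4";
    case 6:
      return "TCPv6";
    case -1:
      return "Unix domain socket";
    case "udp4":
      return "UDPv4";
    case "udp6":
      return "UDPv6";
    default:
      return "unknown";
  }
}

if (cluster.isPrimary) {
  cluster.on("listening", (worker, address) => {
    const kind = describeAddressType(address.addressType);
    console.log(`Worker ${worker.id} listening (${kind}) on port ${address.port}`);
  });
}
```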

Real-World Applications:

  • Monitoring worker processes: The primary process can use the 'listening' event to keep track of which workers are listening on which ports. This can be useful for troubleshooting and load balancing.

  • Automatic failover: If a worker process crashes, the primary process can fork a new worker that listens on the same port, and use its 'listening' event to confirm that it is serving again.

  • Scaling: When the primary process forks additional workers in response to load, the 'listening' event tells it when each new worker is ready to accept connections. The event itself carries no load information, so the decision to scale has to come from elsewhere.


Event: 'message'

Simplified Explanation:

Whenever a worker process (a separate process created by the cluster) sends a message to the primary process (the one that started the workers), the 'message' event is triggered.

Parameters:

  • worker: The worker process that sent the message.

  • message: An object containing the message data.

  • handle: Either undefined or the handle object (such as a net.Socket or net.Server) that the worker passed as the second argument of send().

How to Use It:

To listen for the 'message' event, you can use the following code:

cluster.on("message", (worker, message, handle) => {
  // Handle the message here
});

Real-World Examples:

  • Worker: Sends updates on its progress to the primary process.

  • Primary: Monitors the status of all workers and responds accordingly.

  • Communication: Workers and the primary can exchange data and commands using this event.

Potential Applications:

  • Load Balancing: The primary can distribute tasks to workers based on their messages.

  • Error Handling: Workers can report errors to the primary, which can handle them centrally.

  • Data Collection: Workers can send data to the primary for aggregation and analysis.


Event: 'online'

Explanation:

When you create multiple processes (called workers) in your Node.js application using the cluster module, these workers will start running separately. After a worker is created, it sends a message to the main process (called the primary) to let it know that it's ready to start working. This message is called the 'online' event.

Code Snippet:

// In the primary process (main.js):
const cluster = require("cluster");

// Create workers (only in the primary process)
if (cluster.isPrimary) {
  for (let i = 0; i < 4; i++) {
    cluster.fork();
  }

  // Listen for the 'online' event from each worker
  cluster.on("online", (worker) => {
    console.log(`Worker with ID ${worker.id} is now online.`);
  });
}

Real-World Application:

By default every worker runs the same code, but in a real-world scenario you might give workers different roles, for instance through environment variables passed to cluster.fork(). For example:

  • One worker could be dedicated to processing user requests.

  • Another worker could handle heavy calculations.

  • A third worker could focus on database operations.

By using multiple workers, you can spread the load across multiple cores or processors, making your application more efficient and responsive.

Simplified Example:

Imagine you have a lemonade stand with multiple employees. Each employee (worker) is responsible for taking orders and making lemonade. When an employee is ready to start taking orders, they come to the manager (primary) and say "I'm ready!" This is like the 'online' event in the cluster module. The manager notes that the employee is now ready to work and can start taking orders from customers.

Additional Notes:

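  • The 'online' event is emitted once for each worker, after the worker process has started running. The earlier 'fork' event is emitted as soon as the primary forks the worker.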

  • The 'online' event is only emitted in the primary process.


Event: 'setup'

  • Purpose:

    • Notifies listeners every time cluster.setupPrimary() is called.

  • Event Object: settings

    • The cluster.settings object as it was when setupPrimary() was called. It is advisory only, because several calls to setupPrimary() can be made in a single tick.

  • Example:

    // This is just an example, not a complete implementation.
    cluster.on("setup", (settings) => {
      // Access and utilize the cluster settings for the primary process.
      console.log(settings);
    });

Potential Applications:

  • Accessing and using the cluster settings in the primary process to make decisions or perform specific actions based on the configuration.
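The following sketch shows the relationship between cluster.setupPrimary(), cluster.settings, and the 'setup' event. The argument values are made up for illustration, and no worker is forked.

```javascript
const cluster = require("node:cluster");

if (cluster.isPrimary) {
  // The handler runs asynchronously, after each setupPrimary() call.
  cluster.on("setup", (settings) => {
    console.log("setup event, args are now:", settings.args);
  });

  // Hypothetical settings, chosen only to make the event visible.
  cluster.setupPrimary({ args: ["--role", "demo"], silent: true });

  // cluster.settings is already updated when setupPrimary() returns.
  console.log("silent:", cluster.settings.silent);
}
```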


Simplified Explanation

What is disconnect()?

Imagine you have a group of computers (workers) that are controlled by a main computer (primary). When you use the disconnect() method in the primary, it calls .disconnect() on every worker in cluster.workers. Once all workers are disconnected, the internal handles are closed, which lets the primary exit gracefully if nothing else is keeping it running.

Why Would You Use disconnect()?

You might want to use disconnect() if you're finished using your workers and want to shut them down gracefully. This will allow the primary computer to end its own process without any issues.

Code Snippet

Here's a simplified example of how to use disconnect():

// In the primary process
cluster.disconnect(function () {
  console.log("All workers disconnected and handles closed.");
});

When you run this code, it will call .disconnect() on each worker in the cluster. Once all the workers are disconnected, the callback function will be called and it will print a message to the console.

Real-World Applications

Here are a few real-world applications of disconnect():

  • Shutting down a worker cluster after a task is complete.

  • Gracefully handling unexpected errors in the primary process.

  • Resetting a worker cluster to a known state.
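For the shutdown use case, the callback style can be wrapped in a promise. This is a sketch under assumptions: disconnectAll is a hypothetical helper name, and reacting to SIGTERM is only one possible trigger.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: promise wrapper around the callback-style
// cluster.disconnect(). Must be called in the primary process.
function disconnectAll() {
  return new Promise((resolve) => cluster.disconnect(resolve));
}

if (cluster.isPrimary) {
  // Shut the workers down when the process is asked to terminate.
  process.once("SIGTERM", async () => {
    await disconnectAll();
    console.log("All workers disconnected; exiting.");
    process.exit(0);
  });
}
```

The promise also resolves when no workers exist, because the callback is still invoked in that case.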

Potential Improvements

The callback is called with no arguments, so it cannot be used to detect errors. A more defensive version of the snippet above checks that it is running in the primary, because cluster.disconnect() can only be called from the primary process:

// In the primary process
if (cluster.isPrimary) {
  cluster.disconnect(() => {
    console.log("All workers disconnected and handles closed.");
  });
}

This version skips the call in worker processes, where cluster.disconnect() is not available.


cluster.fork([env])

  • env {Object} Key/value pairs to add to worker process environment.

  • Returns: {cluster.Worker}

Spawn a new worker process

This method can only be called from the primary process. It creates a new worker process and returns a cluster.Worker object representing that worker.

Syntax

cluster.fork([env]);

Parameters

  • env {Object} (Optional) An object containing key/value pairs to add to the worker process environment.

Return value

  • {cluster.Worker} A cluster.Worker object representing the new worker process.

Example

const cluster = require("cluster");

if (cluster.isPrimary) {
  // Create a new worker process
  const worker = cluster.fork();

  // Listen for messages from the worker process
  worker.on("message", (msg) => {
    console.log(`Received message from worker: ${msg}`);
  });

  // Send a message to the worker process
  worker.send("Hello from primary!");
} else {
  // This code runs in the worker process

  // Listen for messages from the primary process
  process.on("message", (msg) => {
    console.log(`Received message from primary: ${msg}`);
  });

  // Send a message to the primary process
  process.send("Hello from worker!");
}

Potential applications

Clustering is useful for scaling up an application to handle more requests. By creating multiple worker processes, the application can distribute the load and improve performance.

Clustering can also be used to improve fault tolerance. If one worker process crashes, the other workers can continue to serve requests.


Simplified Explanation:

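In a Node.js cluster, there can be multiple worker processes running simultaneously. The cluster.isMaster property tells you whether the current process is the "master" process that supervises the workers. It is a deprecated alias for cluster.isPrimary (deprecated since Node.js v16.0.0), so new code should use cluster.isPrimary instead.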

Usage:

The following code snippet demonstrates how to use cluster.isMaster:

const cluster = require("cluster");

if (cluster.isMaster) {
  // Code that runs in the master process
} else {
  // Code that runs in the worker processes
}

Applications in the Real World:

  • Load balancing: The master process can distribute incoming requests to the worker processes, ensuring that all workers are utilized and that no single worker becomes overloaded.

  • Fault tolerance: If a worker process crashes, the master process can detect this through the 'exit' event and fork a replacement, so that the application remains responsive.

  • Concurrency: By using multiple worker processes, a cluster can handle a high volume of concurrent requests, making it suitable for applications that require real-time responsiveness.

Improved Code Snippet:

The following code snippet provides a more complete example of how to use cluster.isMaster in a cluster application:

const cluster = require("cluster");
const numCPUs = require("os").cpus().length;

if (cluster.isMaster) {
  // Create worker processes
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Monitor worker processes
  cluster.on("exit", (worker, code, signal) => {
    console.log(
      `Worker ${worker.process.pid} died with code ${code} and signal ${signal}`
    );
    cluster.fork();
  });
} else {
  // Worker process code
  require("./worker.js");
}

In this example, the master process creates multiple worker processes based on the number of CPU cores available on the system. It also monitors the worker processes and automatically restarts any that crash. The worker processes are responsible for running the actual application code.
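A restart handler that re-forks on every exit also replaces workers that were shut down on purpose. One possible refinement, sketched here with a hypothetical shouldRestart helper, is to check worker.exitedAfterDisconnect and the exit code before forking again.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: decide whether an exited worker should be replaced.
function shouldRestart(worker, code, signal) {
  // exitedAfterDisconnect is true when the worker exited because of
  // .disconnect(), which indicates a voluntary shutdown.
  if (worker.exitedAfterDisconnect) return false;
  // A non-zero exit code or a terminating signal is treated as a crash.
  return code !== 0 || signal != null;
}

if (cluster.isPrimary) {
  cluster.on("exit", (worker, code, signal) => {
    if (shouldRestart(worker, code, signal)) {
      console.log(`Worker ${worker.id} crashed; forking a replacement`);
      cluster.fork();
    }
  });
}
```

A production setup would probably also limit how quickly replacements are forked, so that a worker that crashes on startup does not cause a restart loop.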


cluster.isPrimary

Simplified Explanation:

In a Node.js cluster, there is one primary process and multiple worker processes. The primary process is responsible for managing the worker processes and distributing incoming connections to them.

cluster.isPrimary is a boolean property that tells you if the current process is the primary process. If it's true, then you're in the primary process; otherwise, you're in a worker process.

Real World Example:

Let's say you have a web server that you want to scale up to use more CPU cores. You can use Node.js's cluster module to create a cluster of worker processes that will handle incoming requests.

A single script can cover both roles by branching on cluster.isPrimary:

const cluster = require("cluster");

if (cluster.isPrimary) {
  console.log("I am the primary process.");

  // Create a number of worker processes
  for (let i = 0; i < 4; i++) {
    cluster.fork();
  }
} else {
  console.log("I am a worker process.");

  // Handle incoming requests
  require("http")
    .createServer((req, res) => {
      res.end("Hello world!\n");
    })
    .listen(8080);
}

In this example, the primary process will create 4 worker processes. Each worker process will handle incoming requests and respond with a simple message.

Potential Applications:

  • Scaling web applications: By creating a cluster of worker processes, you can handle more incoming requests and improve the performance of your web application.

  • Parallel processing: You can use worker processes to parallelize tasks, such as processing large datasets or performing complex calculations.

  • Role separation: With cluster.setupPrimary() or per-worker environment variables, worker processes can run different parts of your application in isolation. This can make the application more modular, although all workers still run on the same machine under one primary.


cluster.isWorker

  • Simplified Explanation:

    Checks if the current Node.js process is a "worker" process in a cluster of processes.

    • A cluster is a group of related processes that share the same port and communicate with each other.

    • A worker process is one of the multiple processes in the cluster that handles requests.

    cluster.isWorker returns true if the current process is a worker process, and false if it is the main process (called the "primary" process).

  • Code Snippet:

    const cluster = require('cluster');
    
    if (cluster.isWorker) {
      // This code is executed in a worker process
    } else {
      // This code is executed in the primary process
    }

  • Real-World Applications:

    • Scaling applications: Clusters allow you to create multiple worker processes to handle increased load, distributing requests across multiple CPU cores on the same machine.

    • Fault tolerance: If one worker process fails, the other processes can continue handling requests, ensuring high availability.

    • Parallelizing tasks: You can create multiple worker processes to perform different tasks in parallel, such as processing data or performing calculations.
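Because cluster.isWorker and cluster.isPrimary are always opposites, either one can drive role-specific behavior. The sketch below uses a hypothetical describeRole helper to build a log prefix.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: describe the role of the current process.
function describeRole() {
  // cluster.worker is only available inside worker processes.
  return cluster.isWorker ? `worker ${cluster.worker.id}` : "primary";
}

console.log(`[${describeRole()}] pid ${process.pid} started`);
```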


Simplified Explanation

Scheduling Policy

In Node.js's cluster module, you can control how incoming connections are distributed among worker processes. You can choose between two options:

Round-robin (SCHED_RR):

  • The primary process listens on the port, accepts new connections, and hands them to the workers in round-robin fashion.

  • This is the default on all platforms except Windows, and it helps keep one worker from being overloaded while others are idle.

System Default (SCHED_NONE):

  • The primary process creates the listen socket and passes it to the workers, which accept connections directly. The operating system decides which worker gets each connection.

  • In practice this can lead to uneven distribution, with some workers handling far more connections than others.

Implementation

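You can set the scheduling policy in your cluster script using the cluster.schedulingPolicy property. It is a global setting that is effectively frozen once the first worker is forked or cluster.setupPrimary() is called, whichever comes first, so set it before either of those. Use one of the following:

// Set round-robin scheduling policy
cluster.schedulingPolicy = cluster.SCHED_RR;

// Or: set system default scheduling policy
cluster.schedulingPolicy = cluster.SCHED_NONE;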

Alternatively, you can set the policy using the NODE_CLUSTER_SCHED_POLICY environment variable. Valid values are rr and none.

NODE_CLUSTER_SCHED_POLICY=rr node cluster.js
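To check which policy is in effect, the numeric constants can be mapped to readable names. The policyName helper and its labels are made up for this sketch, and no worker is forked.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: map the policy constants to readable names.
function policyName(policy) {
  if (policy === cluster.SCHED_RR) return "round-robin";
  if (policy === cluster.SCHED_NONE) return "operating system";
  return "unknown";
}

if (cluster.isPrimary) {
  // Reflects the platform default or NODE_CLUSTER_SCHED_POLICY.
  console.log(`Initial policy: ${policyName(cluster.schedulingPolicy)}`);

  // Must happen before the first fork() or setupPrimary() call.
  cluster.schedulingPolicy = cluster.SCHED_NONE;
  console.log(`Policy now: ${policyName(cluster.schedulingPolicy)}`);
}
```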

Real-World Applications

Round-robin:

  • Used when you want connections spread evenly across workers.

  • Suitable for most servers; it is the default on every platform except Windows.

System Default:

  • Used when you want workers to accept connections directly, without each one passing through the primary.

  • Worth measuring before adopting, because the resulting distribution is often unbalanced.

Complete Code Example

const cluster = require("cluster");
const numCPUs = require("os").cpus().length;

if (cluster.isPrimary) {
  // Set round-robin scheduling policy (before the first fork)
  cluster.schedulingPolicy = cluster.SCHED_RR;

  // Create a worker for each CPU
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  // Listen for messages from workers
  cluster.on("message", (worker, message) => {
    console.log(`Message from worker ${worker.id}: ${message}`);
  });
} else {
  // Each worker reports in to the primary
  process.send(`hello from process ${process.pid}`);
}

Cluster Settings in Node.js

In Node.js, the cluster module allows you to create multiple processes (called "workers") that execute the same code. These workers run concurrently, sharing a common port. To control the behavior of these workers, you pass settings to cluster.setupPrimary() before forking. The effective values can then be read from cluster.settings, but that object is not meant to be changed or set manually.

execArgv

This is an array of strings that specify the arguments that will be passed to the Node.js executable when the workers are created. These arguments are typically used to enable debugging or other special features.

For example, to raise the old-generation heap limit of the workers to 1024 MiB, you could use the following setting:

cluster.setupPrimary({ execArgv: ["--max-old-space-size=1024"] });

exec

This is the file path to the worker file. This file contains the code that will be executed by each worker.

For example, let's say you have a worker file named worker.js in the directory your application is started from. You would set the exec setting as follows:

cluster.setupPrimary({ exec: "worker.js" });

args

This is an array of strings that specify the arguments that will be passed to the worker script.

For example, to pass a command-line argument to the workers, you could use the following setting:

cluster.setupPrimary({ args: ["--some-option"] });

cwd

This is the current working directory of the worker process. This is the directory in which the worker script will be executed.

For example, to set the working directory to a specific folder, you could use the following setting:

cluster.setupPrimary({ cwd: "/home/user/my-project" });

serialization

This specifies the type of serialization used for sending messages between processes. Possible values are 'json' and 'advanced'. When the option is not set, messages are serialized as JSON. 'advanced' uses a format based on the structured clone algorithm, which can carry more built-in types such as Map, Set, and BigInt, but it is not necessarily faster than JSON.

For example, to use the advanced serialization, you could use the following setting:

cluster.setupPrimary({ serialization: "advanced" });

silent

This specifies whether or not to send output from the workers to the parent process's stdio. The default is false. When set to true, worker output is not forwarded to the primary's stdout and stderr; it stays available as streams on worker.process.

For example, to suppress the output from the workers, you could use the following setting:

cluster.setupPrimary({ silent: true });

stdio

This is an array that configures the stdio of the forked processes. This configuration must contain an 'ipc' entry, as the cluster module relies on IPC to function.

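For example, to pipe the stdin, stdout, and stderr of the workers to the primary, where they can be read from worker.process and written to files if needed, you could use the following setting. When stdio is provided, it overrides silent:

cluster.setupPrimary({ stdio: ["pipe", "pipe", "pipe", "ipc"] });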

uid and gid

These settings allow you to set the user and group identity of the workers. This is useful for running the workers with specific privileges or permissions.

For example, to run the workers as a specific user, you could use the following settings:

cluster.setupPrimary({ uid: 1000, gid: 1000 });

inspectPort

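This setting allows you to set the inspector port of the workers, which is useful for debugging them. The value can be a number, or a function that takes no arguments and returns a number. By default each worker gets its own port, incremented from the primary's process.debugPort.

For example, to set the inspector port to 9229 when running a single worker, you could use the following setting. With several workers, a function that returns a different port for each one avoids conflicts:

cluster.setupPrimary({ inspectPort: 9229 });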

windowsHide

This setting is only applicable on Windows systems. It specifies whether or not to hide the console window that would normally be created for each worker.

For example, to hide the console window, you could use the following setting:

cluster.setupPrimary({ windowsHide: true });

Real-World Applications

The cluster module is useful in a variety of real-world applications, including:

  • Scaling web applications to handle high traffic

  • Running long-running tasks in parallel

  • Distributing computations across multiple CPU cores of one machine

Potential Applications

Here are some potential applications of the cluster settings:

  • Using execArgv to enable debugging or performance monitoring on the workers

  • Using exec and args to pass different configuration options to different workers

  • Using cwd to run the workers in a specific directory, such as the directory where the application's source code is located

  • Using serialization to send values that JSON cannot represent, such as Map, Set, or BigInt, between the workers and the parent process

  • Using silent to suppress the output from the workers for cleaner logs

  • Using stdio to pipe the stdout and stderr of the workers to the primary for logging or analysis

  • Using uid and gid to run the workers with specific privileges or permissions

  • Using inspectPort to debug the workers using the Chrome DevTools

  • Using windowsHide to hide the console window on Windows systems for a more streamlined user experience


cluster.setupMaster([settings])

Configures the settings used to fork workers. This is a deprecated alias for cluster.setupPrimary().

Deprecated since Node.js v16.0.0: use cluster.setupPrimary() instead.

Parameters:

  • settings {Object} The settings to apply to workers forked after the call.

Properties:

  • exec {String} File path to the worker script.

  • args {Array} String arguments passed to the worker script.

  • silent {Boolean} Whether to stop sending worker output to the parent's stdio.

  • stdio {Array} Configures the stdio of the forked workers. It must contain an 'ipc' entry.

Example:

const cluster = require("cluster");

// Configure the script and arguments used by workers forked from now on.
cluster.setupMaster({
  exec: "worker.js",
  args: ["--use", "https"],
});

// Start the cluster.
cluster.fork();
cluster.fork();

Real World Application:

This can be used to run workers from a dedicated script instead of re-running the main file. Restarting crashed workers is a separate concern: setupMaster() does not do it, so the primary has to listen for the 'exit' event and call cluster.fork() again.


cluster.setupPrimary([settings])

This method is used to configure the primary process in a cluster. It allows you to specify settings that will be applied to all future workers that are created using .fork().

Settings:

The following settings can be configured using .setupPrimary():

  • exec: The path to the worker script.

  • args: An array of arguments to pass to the worker script.

  • silent: A boolean value that indicates whether or not to suppress output from the worker script.

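  • stdio: An array that configures the stdio of the forked workers. It must contain an 'ipc' entry.

  • execArgv, cwd, serialization, uid, gid, inspectPort, windowsHide: The remaining cluster settings described earlier.

Environment variables are the one worker attribute that cannot be set here. Pass them to cluster.fork([env]) instead.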

Usage:

To use .setupPrimary(), simply pass an object containing the desired settings as its argument. For example:

cluster.setupPrimary({
  exec: "worker.js",
  args: ["--use", "https"],
  silent: true,
});

This will configure the primary process to create workers that execute the worker.js script with the --use and https arguments, and will suppress output from the workers.

Real-World Applications:

.setupPrimary() can be used to configure workers for a variety of different applications. For example, you could use .setupPrimary() to:

  • Create workers that listen on different ports.

  • Create workers that run with different command-line arguments.

  • Create workers that execute different scripts.

Potential Applications:

  • Web server: You could use .setupPrimary() to create workers that handle different types of requests, such as HTTP and HTTPS requests.

  • Database server: You could use .setupPrimary() to create workers that handle different types of database queries.

  • Data processing: You could use .setupPrimary() to create workers that process different types of data.

Code Snippet:

The following code snippet shows how to use .setupPrimary() to create workers that listen on different ports:

const cluster = require("cluster");

cluster.setupPrimary({
  exec: "worker.js",
  args: ["--port", "8080"],
});

cluster.fork();

cluster.setupPrimary({
  exec: "worker.js",
  args: ["--port", "8081"],
});

cluster.fork();

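This code creates two workers. Assuming worker.js reads the --port argument and listens on that port, they listen on ports 8080 and 8081, respectively. The second setupPrimary() call only affects workers forked after it, so the first worker keeps its original settings.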
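For the example above to work, worker.js has to read the --port argument itself. A minimal sketch of that side, with a hypothetical parsePort helper and an assumed fallback of 8080, could look like this:

```javascript
// worker.js (hypothetical): read the port chosen by the primary.
function parsePort(argv, fallback = 8080) {
  const index = argv.indexOf("--port");
  if (index === -1) return fallback;
  // A missing or non-numeric value falls back to the default.
  const port = Number.parseInt(argv[index + 1], 10);
  return Number.isInteger(port) && port >= 0 && port <= 65535 ? port : fallback;
}

console.log(`This worker would listen on port ${parsePort(process.argv)}`);
```

The values in args appear in the worker's process.argv after the Node.js executable and the script path.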


cluster.worker

Summary:

The cluster.worker property represents the current cluster worker process. It is only available within worker processes.

Detailed Explanation:

In a Node.js cluster, the primary process (also known as the parent) creates multiple child processes called workers. Each worker has its own separate event loop and memory space. The cluster.worker property allows you to access information and manipulate the current worker process.

Example:

if (cluster.isWorker) {
  console.log(`Worker ID: ${cluster.worker.id}`);
}

This code prints the ID of the current worker process.

Real-World Applications:

The cluster.worker property can be useful in various scenarios, such as:

  • Identifying the current worker: You can use the id property to identify the specific worker process.

  • Logging and debugging: You can tag log lines with cluster.worker.id so that output from different workers can be told apart.

  • Controlled shutdown: You can call cluster.worker.disconnect() or cluster.worker.kill() from inside the worker to take it out of service.

Complete Code Example:

This code creates a cluster with two workers and uses the cluster.worker property to log information from each worker:

// Primary process
const cluster = require("node:cluster");

if (cluster.isPrimary) {
  for (let i = 0; i < 2; i++) {
    cluster.fork();
  }
} else {
  // Worker process
  const id = cluster.worker.id;
  console.log(`Worker ${id}: Logging some info...`);
}

Output (the order of the lines may vary):

Worker 1: Logging some info...
Worker 2: Logging some info...

cluster.workers

  • Type: Object

  • Description:

    • Stores active worker objects keyed by their id.

    • Allows easy iteration through all workers.

    • Only available in the primary process.

cluster.workers Properties

  • Keys: Worker IDs

  • Values: Worker objects

Example

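const cluster = require("cluster");

if (cluster.isPrimary) {
  // cluster.workers is only populated in the primary process
  cluster.fork();
  cluster.fork();

  // Iterate through all workers
  for (const worker of Object.values(cluster.workers)) {
    // Send a message to each worker
    worker.send("Hello from the primary process!");
  }
}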

Application

  • Load Balancing: Distribute tasks across multiple workers to improve performance.

  • Fault Tolerance: A worker is removed from cluster.workers after it has disconnected and exited, so the object shows which workers are still available to take over.

  • Scalability: Easily add or remove workers as needed to handle changing workloads.
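Iterating over cluster.workers is the basis for broadcasting. The sketch below uses a hypothetical broadcast helper that skips workers whose IPC channel is already closed; the { type: "reload" } message is made up for the example.

```javascript
const cluster = require("node:cluster");

// Hypothetical helper: send a message to every connected worker and
// return how many workers it was sent to.
function broadcast(workers, message) {
  let sent = 0;
  for (const worker of Object.values(workers)) {
    if (worker.isConnected()) {
      worker.send(message);
      sent++;
    }
  }
  return sent;
}

if (cluster.isPrimary) {
  // With no workers forked yet, nothing is sent.
  const count = broadcast(cluster.workers, { type: "reload" });
  console.log(`Notified ${count} workers`);
}
```

Passing the workers object in as a parameter keeps the helper testable with stand-in objects.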