Scaling CPU-Intensive Node Apps

Darren DeRidder

@73rhodes

Is Node fast?

It depends...

Simon Willison, "Evented I/O based web servers, explained using bunnies"

Some terms

  • multi-core / multi-cpu system / hyperthreading
  • single-threaded, multi-process, multi-threaded
  • CPU-bound vs. I/O-bound apps

Multi-Threaded                     Multi-Process
Single process                     Multiple processes
Threads share same memory space    Each process has own memory space
Fewer resources                    More resources
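
To make the CPU-bound vs. I/O-bound distinction concrete, here is a minimal sketch of what CPU-bound work does to a single-threaded Node process. The `fibonacci` helper and the 10 ms timer are illustrative assumptions, not part of the original slides. The timer cannot fire until the synchronous computation returns.

```javascript
// Naive recursive Fibonacci: deliberately slow and purely CPU-bound.
function fibonacci(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
}

const scheduled = Date.now();

// Ask for a callback in 10 ms. It can only run once the call stack is empty.
setTimeout(() => {
  const waited = Date.now() - scheduled;
  console.log(`timer fired after ${waited} ms (asked for 10 ms)`);
}, 10);

// Synchronous CPU-bound work: nothing else runs until this returns.
const result = fibonacci(35);
console.log(`fibonacci(35) = ${result}`);
```

An I/O-bound app spends most of its time waiting, so the event loop stays free. A CPU-bound app like this one holds the only thread, which is the case the rest of the talk addresses.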

The cluster Module


 
  const cluster = require('cluster');
 
					

if (cluster.isPrimary) {
    // Primary process here.
    // Fork workers.
    // Handle messages from workers.
    // Allocate work to worker processes.
    // Finally tell workers to shut down.
    // Print result when all workers exit.
} else {
    // Worker process here.
    // Handle messages from primary process.
    // Send messages back to primary process.
}
					

 
  // Fork workers
  const numWorkers = require('os').cpus().length;

  for (let i = 0; i < numWorkers; i++) {
  	cluster.fork();
  }
 
          

 
  // Primary sending messages to workers.
 
  for (const id in cluster.workers) {
    cluster.workers[id].send('Hello from primary');
  }
 
					

 
  // Primary handling events from workers.
 
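  cluster.on("message", myMessageHandler); // (worker, message, handle)
  cluster.on("exit", myExitHandler);       // (worker, code, signal)

  // "error" is emitted on individual workers, not on cluster.
  for (const id in cluster.workers) {
    cluster.workers[id].on("error", myErrorHandler);
  }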
 
					

 
  // Worker
 
  process.on("message", (msg) => {
     console.log(`Got message "${msg}" from primary.`);
     process.send(`Hello from worker ${cluster.worker.id}`);
  });
 
					

 
  // Shutting down
  cluster.on("exit", (worker, code, signal) => {
     if(Object.keys(cluster.workers).length === 0) {
        // finalization
     }
  });
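
The fragments above can be combined into one runnable file. This is a sketch under stated assumptions rather than the original demo: the `fibonacci` helper, the task list, and the message shape `{ n, value }` are all hypothetical. The primary hands one task at a time to each worker, collects results, and disconnects workers when the queue is empty.

```javascript
const cluster = require('cluster');
const os = require('os');

// Hypothetical CPU-heavy job.
function fibonacci(n) {
  return n < 2 ? n : fibonacci(n - 1) + fibonacci(n - 2);
}

if (cluster.isPrimary) {
  const tasks = [25, 26, 27, 28, 29, 30];   // hypothetical inputs
  const results = new Map();                // input -> result
  const numWorkers = Math.min(os.cpus().length, tasks.length);

  // Hand the next task to a worker, or tell it to shut down.
  const dispatch = (worker) => {
    if (tasks.length > 0) {
      worker.send({ n: tasks.shift() });
    } else {
      worker.disconnect();
    }
  };

  // A worker reports "online" once it is running and can receive messages.
  cluster.on('online', dispatch);

  // Store each result, then give the same worker more work.
  cluster.on('message', (worker, msg) => {
    results.set(msg.n, msg.value);
    dispatch(worker);
  });

  // Print the results when the last worker has exited.
  cluster.on('exit', () => {
    if (Object.keys(cluster.workers).length === 0) {
      console.log(results);
    }
  });

  for (let i = 0; i < numWorkers; i++) {
    cluster.fork();
  }
} else {
  // Worker: compute and reply, echoing the input so the primary can match it.
  process.on('message', ({ n }) => {
    process.send({ n, value: fibonacci(n) });
  });
}
```

Giving a worker a new task only when it reports back keeps faster workers busy, at the cost of one message round trip per task.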
 
				 
Alternative: use pm2 cluster mode, e.g. `pm2 start app.js -i max`.

The worker_threads Module


 
  const {
  	Worker,
  	isMainThread,
  	parentPort,
  	workerData,
  	threadId
  } = require('worker_threads');
 
					

 
  if (isMainThread) {
      // Main thread here.
      // Spawn worker threads.
      // Set up message handlers.
      // Allocate work to workers.
  } else {
      // Worker thread here.
      // Set up message handlers.
      // Do work, send messages to main thread.
  }
 
					

 
  const numCpus = require('os').cpus().length;
  const threads = new Set();

  for (let i = 0; i < numCpus; i++) {
      threads.add(new Worker(__filename, { workerData: 3 }));
  }
 
					

 
  for (const worker of threads) {
      worker.on("message", (msg) => { /* handle it */ });
      worker.on("exit", (code) => { /* handle it */ });
      worker.on("error", console.error);
  }
 
					

 
  // Worker thread (its id is available as threadId).
  // fibonacci() stands for any CPU-heavy function defined elsewhere.

  for (let i = 0; i < workerData; i++) {
      const result = fibonacci(40);
      parentPort.postMessage(result);
  }
 
					

When to use cluster

  • for networking; more requests in parallel
  • workers share the same port; the primary hands out connections round-robin (the default on every platform except Windows)
  • each process gets its own `libuv`, memory, etc.

When to use worker_threads

  • long-running CPU-bound tasks
  • threads live in one process and share `libuv` (each still gets its own event loop)
  • uses fewer resources

General advice 1

  • Know if your app is a good candidate.
  • Know the pitfalls; google "Threads are Evil".
  • Know the CPU architecture:
    lscpu (Linux)
    sysctl -n hw.ncpu (macOS)
    node -p "require('os').cpus()"
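
The same information is available from inside a Node program. A small sketch follows. The fallback logic is an assumption for older Node releases, where `os.availableParallelism()` does not exist.

```javascript
const os = require('os');

// One entry per logical core, so hyperthreads are counted separately.
const cpus = os.cpus();
const models = [...new Set(cpus.map((cpu) => cpu.model))];

console.log(`logical cores: ${cpus.length}`);
console.log(`model(s): ${models.join(', ')}`);

// Prefer os.availableParallelism() where it exists; otherwise fall back
// to the core count (or 1 if the platform reports no CPU information).
const parallelism = typeof os.availableParallelism === 'function'
  ? os.availableParallelism()
  : cpus.length || 1;

console.log(`suggested worker count: ${parallelism}`);
```

Note that the count is of logical cores. On a hyperthreaded machine, CPU-bound work may not speed up in proportion to that number.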

General advice 2

  • Don't expect, e.g., a 4-fold speedup on 4 cores.
  • Not all tasks scale well (e.g. simple iteration).
  • Spawning threads and forking processes has overhead. Better to create a "pool" and reuse it.
  • Plan how to divide work (e.g. round-robin) and how to correlate results.
  • Avoid lock-in; weigh benefits / complexity.

Thanks!