ES6 Iterator in depth

An iterator is a pattern for pulling data from a data source one item at a time. The pattern has been around in programming for a long time; in JavaScript, it is a new feature in ES6.

Interfaces

The following interfaces are defined in ES6:

iterator interfaces

  • Iterator: This interface defines three methods:
    • next(): [required] Returns the next IteratorResult object
    • return(): [optional] Stops the iterator and returns an IteratorResult object
    • throw(): [optional] Signals an error and returns an IteratorResult object
  • IteratorResult: This interface defines the following properties:
    • value: [optional] The current iteration value; can be any value
    • done: Indicates the completion status of the iterator; can be any truthy / falsy value
  • Iterable: This interface defines only one method:
    • @@iterator(): [required] Returns an Iterator

Many built-in data structures in JavaScript implement the Iterable interface. Since JavaScript doesn't have interfaces, these "interfaces" are really just a convention. That said, we can create our own iterators that adhere to them.
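For example, here is a minimal hand-written object that satisfies all three interfaces (a sketch; the name oneToThree is illustrative):

```javascript
// An iterable that yields the numbers 1..3, following the
// Iterator / IteratorResult / Iterable conventions above
const oneToThree = {
  // Iterable: @@iterator() returns an Iterator
  [Symbol.iterator]() {
    let n = 0;
    return {
      // Iterator: next() returns an IteratorResult
      next() {
        n++;
        return n <= 3
          ? { value: n, done: false }         // IteratorResult
          : { value: undefined, done: true }; // completed
      }
    };
  }
};

const results = [];
for (const item of oneToThree) {
  results.push(item);
}
//results: [1, 2, 3]
```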

Iterability

Looping through a collection and processing each of its items is a very common operation. JavaScript provides a number of ways to iterate through a collection, such as the for loop, map, filter, forEach… ES6 iterators bring the concept of iterability to the language.

data consumers - data sources

It's nearly impossible (and unwise) for data consumers to support every data source directly. The iterator interfaces solve this problem: any data source that wants to be consumed just needs to implement the interface, and data consumers use the resulting iterator to pull values from the sources they consume.
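Concretely, a consumer pulls values by repeatedly calling next() on the iterator it obtains from the source; for example, driving an array's iterator by hand:

```javascript
const source = ['a', 'b'];

// Ask the data source for its iterator via the Iterable interface
const it = source[Symbol.iterator]();

// Pull values one at a time via the Iterator interface
const first = it.next();  //{ value: 'a', done: false }
const second = it.next(); //{ value: 'b', done: false }
const end = it.next();    //{ value: undefined, done: true }
```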

Iterator consumption

for..of loop

The ES6 for..of loop consumes conforming iterable data sources.

Array

Arrays and Typed Arrays are iterable.

let arr = [1, 2, 3];
for(let item of arr) {
  console.log(item);//1 2 3
}
String

Strings are also iterable.

for(let c of "hello") {
  console.log(c);//h e l l o
}
Map

Maps are iterable over their entries. Each entry is an array [key, value].

const m = new Map();
m.set('a', 'A');
m.set('b', 'B');
for(let pair of m) {
  console.log(pair); //['a', 'A']
                     //['b', 'B']
}
Set

Sets are iterable over their elements.

const set = new Set();
set.add('a');
set.add('b');
set.add('c');
for(let e of set) {
  console.log(e); //a
                  //b
                  //c
}
arguments
function test(){
  for(let param of arguments){
    console.log(param);
  }
}
test('a', 'b', 'c'); //a b c
Iterable computed data

Some ES6 data structures, such as Map, Set, and Array, have the following methods that return iterables:

  • entries()
const set = new Set();
set.add('a');
set.add('b');
set.add('c');
for(let e of set.entries()) {
  console.log(e); //['a', 'a']
                  //['b', 'b']
                  //['c', 'c']
}

const m = new Map();
m.set('a', 'A');
m.set('b', 'B');
for(let pair of m.entries()) {
  console.log(pair); //['a', 'A']
                     //['b', 'B']
}
  • keys()
const set = new Set();
set.add('a');
set.add('b');
set.add('c');
for(let e of set.keys()) {
  console.log(e); //a
                  //b
                  //c
}

const m = new Map();
m.set('a', 'A');
m.set('b', 'B');
for(let key of m.keys()) {
  console.log(key); //a
                    //b
}
  • values()
const set = new Set();
set.add('a');
set.add('b');
set.add('c');
for(let e of set.values()) {
  console.log(e); //a
                  //b
                  //c
}

const m = new Map();
m.set('a', 'A');
m.set('b', 'B');
for(let value of m.values()) {
  console.log(value); //A
                      //B
}

Destructuring (via Array pattern)

Destructuring via Array patterns works with iterables.

const set = new Set();
set.add('a');
set.add('b');
set.add('c');
set.add('d');
let [a, b, ...cd] = set;
//a='a'
//b='b'
//cd=['c', 'd']
const m = new Map();
m.set('a', 'A');
m.set('b', 'B');
let [e1, e2] = m;
//e1: ['a', 'A']
//e2: ['b', 'B']

Spread operator

The spread operator can be used to insert iterables into an array.

const set = new Set();
set.add('b');
set.add('c');

let items = ['a', ...set, 'd'];//["a", "b", "c", "d"]

And the spread operator can be used to convert an iterable to an array:

const map = new Map();
map.set('b', 'B');
map.set('c', 'C');

let keys= [...map.keys()];//["b", "c"]

APIs accepting iterables

Several APIs accept iterables:

  • Map([iterable]), WeakMap([iterable]), Set([iterable]), WeakSet([iterable])
  • Array.from([iterable]), Promise.all([iterable]), Promise.race([iterable])
let set = new Set("abca"); //a string is iterable: Set {'a', 'b', 'c'}
let map = new Map([[1,"a"],[2,"b"],[3,"c"]]);
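To see that these APIs consume the protocol itself rather than only arrays, we can feed Array.from a hand-rolled iterable (the countdown object is illustrative):

```javascript
// A countdown iterable: 3, 2, 1
const countdown = {
  [Symbol.iterator]() {
    let n = 3;
    return {
      next() {
        return n > 0
          ? { value: n--, done: false }
          : { value: undefined, done: true };
      }
    };
  }
};

// Array.from consumes any iterable
const arr = Array.from(countdown); //[3, 2, 1]
```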

Custom Iterator

In addition to the built-in iterators, we can create our own. All we need to do is adhere to the interfaces above: implement a method whose key is [Symbol.iterator], and have that method return an iterator, i.e. an object that iterates over the items of the iterable.

let idGen = {
  [Symbol.iterator](){
    return this;
  },
  next(){
    return {value: Math.random(), done: false}
  }
}
let count = 0;
for(let id of idGen) {
  console.log(id);
  if (count > 4) {
    break;
  }
  count++;
}
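idGen above never finishes on its own; a custom iterator signals completion by returning done: true. A sketch of a finite range iterable (the name range is illustrative):

```javascript
function range(start, end) {
  return {
    [Symbol.iterator]() {
      let current = start;
      return {
        next() {
          return current < end
            ? { value: current++, done: false }
            : { value: undefined, done: true }; // signals completion
        }
      };
    }
  };
}

const collected = [];
for (const n of range(1, 4)) {
  collected.push(n);
}
//collected: [1, 2, 3]
```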

Array-like objects are not iterable by default

let arrayLikeObj = {
  0: 'a',
  1: 'b',
  2: 'c',
  3: 'd',
  length: 4
}
for(var e of arrayLikeObj) { //TypeError: arrayLikeObj[Symbol.iterator] is not a function
  console.log(e);
}

We can make array-like objects iterable by implementing the method @@iterator:

let arrayLikeObj = {
  0: 'a',
  1: 'b',
  2: 'c',
  3: 'd',
  length: 4,
  [Symbol.iterator]: Array.prototype[Symbol.iterator]
}
for(var e of arrayLikeObj) {
  console.log(e);
}
//Output:
//a
//b
//c
//d

Optional return(..) and throw(..)

The optional methods return(..) and throw(..) are not implemented in the built-in iterators.

return(..) is a signal to an iterator that the consuming code is complete and will not pull any more values from it. The producer can use this signal to perform clean-up steps such as closing file handles, database connections, etc.

throw(..) is used to signal an exception/error to an iterator. We will take a deeper look at this in the Generators topic.

In for..of loops, early termination can be caused by:

  • break
  • return
  • throw

(continue only skips to the next iteration; it does not terminate the loop.)

In these cases, the for..of loop lets the iterator know about the termination by calling its return(..) method.

var Fib = {
    [Symbol.iterator]() {
        var n1 = 1, n2 = 1;
        return {
            next() {
                var current = n2;
                n2 = n1;
                n1 = n1 + current;
                return { value: current, done: false };
            },
            return(v) {
                console.log("Fibonacci sequence terminated.");
                return { value: v, done: true };
            }
        }
    }

};

for (let n of Fib) {
  console.log(n);
  if (n > 20) {
    break;
  }
}
//1
//1
//2
//3
//5
//8
//13
//21
//Fibonacci sequence terminated.

Some techniques with ES6 default parameter values, spread and rest

Default behavior

Default parameter values let function parameters be initialized with default values when no value or undefined is passed.

function join(arr=[], sep=','){
    return arr.join(sep);
}

join();//""
join([1,2,3]); //"1,2,3"
join(["javascript", "is", "awesome"], " "); //"javascript is awesome"

We can also specify a function as a default value.

import rp from 'request-promise';

function jsonParser(body, response) {
    if (/^application\/json.*/i.test(response.headers['content-type'])){
        return JSON.parse(body);
    }
    return body;
}
function fetch(url, transform=jsonParser) {
    return rp({
        url: url,
        transform: transform
    });
}
Required parameters

Another technique with default parameter values is to let a function declare required parameters. This is really useful when designing APIs that need parameter validation.

function required(param){
    throw new Error(`${param} is required`);
}
const Storage = {
    setItem: function setItem(key = required('key'), value=required('value')){
           //implementation code goes here
    },
    getItem: function getItem(key = required('key')){
    }    
}

Storage.setItem();//Uncaught Error: key is required
Storage.setItem('key1');//Uncaught Error: value is required
Storage.setItem('key1', 'value1'); //OK
Copy arrays and modify them

In ES5, we can use Array#concat or Array#slice to make a copy of an array.

var arr = [1, 2, 3, 4, 5];
var arr2 = arr.slice(0); //1, 2, 3, 4, 5
var arr3 = [].concat(arr); //1, 2, 3, 4, 5

In ES6, copying an array is even easier with the spread operator.

const arr = [1, 2, 3, 5, 6];
const arr2 = [...arr]; //1, 2, 3, 5, 6
const b = [...arr.slice(0, 3), 4, ...arr.slice(3)];//1, 2, 3, 4, 5, 6
Copy objects and modify them

In ES5, we can borrow a utility function such as jQuery#extend or _.assign to make a copy of an object:

var o = {
    name: 'John',
    age: 30,
    title: 'Software Engineer',
}
var o2 = _.assign({}, o);
var o3 = _.assign({}, o, {age: 25});

In ES6, we can use the built-in function Object.assign:

const o = {
    name: 'John',
    age: 30,
    title: 'Software Engineer',
}
const o2 = Object.assign({}, o);
const o3 = Object.assign({}, o, {age: 25});

Another way is to use the spread (…) operator:

const o2 = {
    ...o
}
const o3 = {
    ...o,
    age: 25
}

Note: the spread operator for objects is not part of ES6; object rest/spread was standardized later, in ES2018. If you're using a transpiler like Babel, you're covered.

Avoid Function.prototype.apply

Some functions such as Math.max, Math.min, Date, etc. require a list of parameters.

Math.max(1, 100, 90, 20);
new Date(2016, 7, 13);

What if we have the parameter values contained in an array? A workaround is to use Function.prototype.apply(thisArg, argsArray):

var numbers = [1, 100, 90, 20];
Math.max.apply(null, numbers); // 100

In ES6, this can be solved easily with spread operator:

var numbers = [1, 100, 90, 20];
Math.max(...numbers);

var parts = [2016, 7, 13];
var d = new Date(...parts);
Forget arguments

arguments is an array-like object that is available inside function bodies. It represents the list of arguments that were passed in when the function was invoked. There are some gotchas:

  • arguments is not an array. We need to convert it to an array if we want to use array methods such as slice, concat, etc.
  • arguments may be shadowed by a function parameter or a variable with the same name.
function doSomething(arguments) {
    console.log(arguments);
}
doSomething(); //undefined
doSomething(1); //1

function doSomething2() {
    var arguments = 1;
    console.log(arguments);
}
doSomething2();// 1
doSomething2(2, 3, 4); // 1

In ES6, we can forget arguments completely. With the rest (…) operator, we can collect all the arguments passed to a function call:

function doSomething(...args) {
    console.log(args);
}

doSomething(1, 2, 3, 4); //[1, 2, 3, 4]

With the rest operator, all arguments passed to doSomething are collected into args. Better still, args is a real array, so we don't need the extra conversion step that arguments required.
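A rest parameter can also follow named parameters, which arguments cannot express directly; a small sketch (the joinWith helper is illustrative):

```javascript
// Everything after the separator is collected into a real array
function joinWith(sep, ...parts) {
  // parts is a genuine Array, so array methods work directly on it
  return parts.filter(p => p !== '').join(sep);
}

const joined = joinWith('-', '2016', '07', '13'); //'2016-07-13'
```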

All in one

In this part, we will combine the techniques above in a more complex case: implementing a fetch-like API. For simplicity, we build the API on top of the request-promise module.

function fetch(url, options){

}

The first step is parameter checking:

//ES5
import rp from 'request-promise';
function fetch(url, options){
    var requestURL = url || '';
    var opts = options || {};
    ...
}
//ES6
import rp from 'request-promise';
function fetch(url='', options={}){
   ...
}

We also need to check some properties of options object:

function jsonParser(body, response) {
    if (/^application\/json.*/i.test(response.headers['content-type'])){
        return JSON.parse(body);
    }
    return body;
}
//ES5
import rp from 'request-promise';
function fetch(url, options){
    var requestURL = url || '';
    var opts = options || {};
    var method = opts.method || 'get';
    var headers = opts.headers || {'content-type': 'application/json'};
    var transform = jsonParser;
    ...
}
//ES6
import rp from 'request-promise';
function fetch(url='', {method='get',
                        headers={'content-type': 'application/json'},
                        transform=jsonParser}){

}

In the ES6 version of the API, we use destructuring to extract some properties (method, headers, and transform) and give them default values. This doesn't work if we don't pass the options object, because we can't match a pattern against undefined:

fetch();//TypeError: Cannot match against 'undefined' or 'null'

This can be fixed by a default value:

//ES6
import rp from 'request-promise';
function fetch(url='', {method='get',
                        headers={'content-type': 'application/json'},
                        transform=jsonParser} = {}){
    return rp({
        url: url,
        method: method,
        headers: headers,
        transform: transform
    });
}

As client code may pass properties other than method, headers, and transform, we need to copy all the remaining properties as well:

//ES5
import rp from 'request-promise';
function fetch(url, options){
    var requestURL = url || '';
    var opts = options || {};
    var method = opts.method || 'get';
    var headers = opts.headers || {'content-type': 'application/json'};
    var transform = jsonParser;
    //copy all properties and then overwrite some
    opts = _.assign({}, opts, {method: method, headers: headers, transform: transform})

    return rp(opts);
}

In ES6, we collect the remaining properties with the rest operator:

function fetch(url='', {method='get',
                        headers={'content-type': 'application/json'},
                        transform=jsonParser,
                        ...otherOptions} = {}){

}

and use the spread operator to pass those properties on to the target function:

function fetch(url='', {method='get',
                        headers={'content-type': 'application/json'},
                        transform=jsonParser,
                        ...otherOptions} = {}){
    return rp({
        url: url,
        method: method,
        headers: headers,
        transform: transform,
        ...otherOptions
    });   
}

And finally, with the object literal property shorthand, we can write:

function fetch(url='', {method='get',
                        headers={'content-type': 'application/json'},
                        transform=jsonParser,
                        ...otherOptions} = {}){
    return rp({
        url,
        method,
        headers,
        transform,
        ...otherOptions
    });   
}

ES6 classes: The hidden truths


For many developers, especially those coming from traditional languages such as Java, C#, or PHP, and those with a great passion for the OOP paradigm, ES6 may feel like a huge win, as the most-wanted feature in JavaScript has finally arrived: class. But is it all roses?

Old wine in a new bottle

We declare a class in ES6 like this:

class A {
}

When we inspect the type of A

typeof A

it prints out function (and evaluating A itself in Chrome's console shows class A {}). So A is just a special kind of function: it isn't callable without new. Trying to invoke A() directly throws a TypeError:

Uncaught TypeError: Class constructor A cannot be invoked without 'new'

So far, it seems nothing fundamentally new has shipped with ES6.

Just convenient syntax

Let's see what the JavaScript engine does when it deals with class:

class Rectangle {
    constructor(width, height) {
        this.height= height;
        this.width = width;
    }

    getArea() {
        return this.height* this.width;
    }
    toString() {
        return `Rectangle: width(${this.width}), height(${this.height})`;
    }
    static create(width, height) {
        return new Rectangle(width, height);
    }
}

class Square extends Rectangle {
    constructor(height) {
        super(height, height);
    }
    toString() {
        return `Square: length(${this.height})`;
    }
}

const s1 = new Square(10);

Below is what the JavaScript engine will see:

Object diagrams

Again, there is nothing fundamentally new with ES6 class.

It's just syntactic sugar on top of prototypal inheritance.
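To make the "syntactic sugar" claim concrete, the Rectangle class above can be approximated with plain ES5 constructor-and-prototype code (a sketch; super and other semantic details are simplified):

```javascript
// ES5 approximation of the Rectangle class above
function Rectangle(width, height) {
  this.width = width;
  this.height = height;
}
// Instance methods live on the prototype object
Rectangle.prototype.getArea = function () {
  return this.height * this.width;
};
// A "static method" is just a property of the constructor function
Rectangle.create = function (width, height) {
  return new Rectangle(width, height);
};

const r = Rectangle.create(3, 4);
//r.getArea(): 12
```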

An apple vs. an orange

Let's look at the differences between ES6 classes and classes in traditional OOP languages.

Concept

In traditional OOP languages, a class is a blueprint / template from which instances are created.

In JavaScript, a class is just a constructor function.

Behavior

In traditional languages, when instances are created, methods, properties, etc. are copied down from parent classes to child classes, and then from the class to new instances.
OOP behavior

In JavaScript, by contrast, there is no such copying from class to class or from class to instance. There are just links between objects.
JS behavior
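These links can be observed directly with Object.getPrototypeOf (a small illustrative hierarchy):

```javascript
class Base {}
class Derived extends Base {}

const d = new Derived();

// Instances link to the class's prototype object, and the prototype
// objects of the classes link to each other; nothing is copied.
const linkToDerived = Object.getPrototypeOf(d) === Derived.prototype;           //true
const linkToBase = Object.getPrototypeOf(Derived.prototype) === Base.prototype; //true
```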

Features

class is at the heart of OOP languages, so it's no surprise that those languages support many features such as class variable scopes, multiple inheritance, static blocks, nested classes, etc.

In JavaScript, class is just syntactic sugar on top of prototypal inheritance, and feature support is limited. Currently, only the following are supported:

  • constructor
  • instance method
  • static method

Note that class properties (fields) are not supported in ES6.

So, comparing Javascript class to OOP language class is like comparing an apple to an orange.

Awaiting AWS resources

Normally, when we work with AWS using the AWS SDK, we need to wait for AWS resources to reach a specific status, such as an EC2 instance running, a Kinesis stream active, or an OpsWorks deployment successful, before we can continue. This can be done by continuously polling the AWS resources until they reach the desired status.

Below is sample code for polling a newly created Kinesis stream until it's active.

function waitForStreamActive(streamName){
    let count = 0;
    const interval = 5000;
    const maxTries = 15;
    return (function wait(){
        return describeStream({StreamName : streamName}).then((data)=>{
            if(data.StreamDescription.StreamStatus === 'ACTIVE'){
                return Promise.resolve(streamName);
            } else {
                count++;
                logger.info(`Waiting for the stream ${streamName} active: ${count}`);
                //The stream is not active yet. Wait for some seconds
                if(count < maxTries){
                    return Promise.delay(interval).then(wait);
                } else {
                    return Promise.reject(`Max tries ${count} reached but the stream ${streamName} still not active`);
                }
            }
        });
    }());
}

We don't want to wait forever. In the code above, when one poll completes, we wait 5 seconds (interval) before the next poll, and we make at most 15 tries (maxTries). If the resource isn't in the desired status after maxTries, we terminate the polling process.

I kept writing this polling code myself (partly because I was in a rush) until I realized that the AWS SDK provides an API for exactly this need (see waitFor):

waitFor(state, params, callback) ⇒ void

As waitFor is defined on the abstract class (AWS.Service), we need to consult the specific resource class for the supported state names.

So, the code above can be rewritten using the AWS API waitFor as follows:

kinesis.waitFor('streamExists', {StreamName: 'stream name'})
    .then(function(data){
        console.log(data);
    })
    .catch(function(err) {
        console.error(err);
    });

Sadly, the AWS SDK for Node doesn't seem to let us configure the interval and maxTries parameters. I didn't believe it at first (because I know the AWS SDK for Ruby does allow this) until I read the documentation carefully and found the hard-coded parameters in kinesis-2013-12-02.waiters2.json:

{
  "version": 2,
  "waiters": {
    "StreamExists": {
      "delay": 10,
      "operation": "DescribeStream",
      "maxAttempts": 18,
      "acceptors": [
        {
          "expected": "ACTIVE",
          "matcher": "path",
          "state": "success",
          "argument": "StreamDescription.StreamStatus"
        }
      ]
    }
  }
}

Note: in the code samples above, AWS's callback-style APIs such as kinesis.describeStream, kinesis.waitFor… are converted to Promise style using a Promise library like bluebird.

Understanding middleware pattern in express.js


The term middleware (middle-ware, literally the software in the middle) may cause confusion for the inexperienced, especially those coming from the enterprise programming world. That is because in enterprise architecture, middleware brings to mind software suites that shield developers from many low-level, difficult issues, letting them concentrate on business logic.

In express.js, a middleware function is defined as:

Middleware functions are functions that have access to the request object (req), the response object (res), and the next middleware function in the application’s request-response cycle. The next middleware function is commonly denoted by a variable named next.

A middleware function has the following signature:

function(req, res, next) { ... }

There is a special kind of middleware called error-handling middleware. This kind is special because it takes four arguments instead of three, which is how express.js recognizes it as error-handling middleware:

function(err, req, res, next) {...}

Middleware functions can perform the following tasks:

  • Logging requests
  • Authenticating / authorizing requests
  • Parsing the body of requests
  • Ending a request-response lifecycle
  • Calling the next middleware function in the stack

These tasks are not the core concern (business logic) of an application. Instead, they are cross-cutting concerns applicable throughout the application, affecting it as a whole.

The request-response lifecycle through the middleware pipeline is as follows:

[Diagram: request-response lifecycle through the middleware pipeline]

  1. The first middleware function (A) in the pipeline is invoked to process the request.
  2. Each middleware function may end the request by sending a response to the client, invoke the next middleware function (B) by calling next(), or hand the request over to an error-handling middleware by calling next(err) with an error argument.
  3. Each middleware function receives as input the result of the previous middleware function.
  4. If the request reaches the end of the pipeline without being handled, we can assume a 404 error:
app.use((req, res) => {
    res.writeHead(404, { 'Content-Type': 'text/html' });
    res.end("Cannot " + req.method.toUpperCase() + " " + req.url);
});
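The dispatch behind such a pipeline can be sketched without Express in a few lines (a simplified runner; real express.js also handles routing, error-handling middleware, etc.):

```javascript
// A minimal middleware pipeline: each function receives (req, res, next)
// and either ends the cycle or passes control on by calling next().
function runPipeline(middlewares, req, res) {
  function dispatch(i) {
    if (i >= middlewares.length) return; // fell off the end of the pipeline
    middlewares[i](req, res, () => dispatch(i + 1));
  }
  dispatch(0);
}

const log = [];
const req = {};
const res = {};
runPipeline([
  (req, res, next) => { log.push('logger'); next(); }, // cross-cutting task
  (req, res, next) => { log.push('auth'); next(); },   // another one
  (req, res, next) => { res.body = 'done'; }           // ends the cycle: no next()
], req, res);
//log: ['logger', 'auth'], res.body: 'done'
```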

As we can see, the idea behind the middleware pattern is not new. We can consider the middleware pattern in express.js a variant of the Chain of Responsibility pattern.

This pattern has some benefits:

  • It avoids coupling the sender of a request to its receiver by giving more than one object a chance to handle the request. The receiver and the sender have no explicit knowledge of each other.
  • It gives flexibility in distributing responsibilities among objects. We can add or change responsibilities for handling a request by adding to or changing the chain at run-time.

Understanding execFile, spawn, exec and fork in node


In Node, the child_process module provides four different methods for executing external applications:

  1. execFile
  2. spawn
  3. exec
  4. fork

All of these methods are asynchronous. Calling any of them returns an object that is an instance of the ChildProcess class.

[Diagram: child_process methods]

Which method is right depends on what we need. We will take a detailed look at each of them.


1. execFile

what?

Executes an external application, given optional arguments, and invokes a callback with the buffered output after the application exits. Below is the method signature (from the Node.js documentation):

child_process.execFile(file[, args][, options][, callback])

how?

In the example below, the node program is executed with the argument --version. When the external application exits, the callback function is called with the stdout and stderr output of the child process; the stdout output is buffered internally.
Running the code below prints out the current Node version.

const execFile = require('child_process').execFile;
const child = execFile('node', ['--version'], (error, stdout, stderr) => {
    if (error) {
        console.error('stderr', stderr);
        throw error;
    }
    console.log('stdout', stdout);
});

How does Node know where to find the external application?

Through the PATH environment variable, which specifies a set of directories where executable programs are located. If an external application is on PATH, it can be located without an absolute or relative path to it.

when ?

execFile is used when we just need to execute an application and get its output. For example, we can use execFile to run an image-processing application like ImageMagick to convert an image from PNG to JPG when we only care whether it succeeds or not. execFile should not be used when the external application produces a large amount of data that we need to consume in real time.

2. spawn

what?

The spawn method spawns an external application in a new process and returns a streaming interface for I/O.

child_process.spawn(command[, args][, options])
  • command <String> The command to run
  • args <Array> List of string arguments
  • options <Object>
    • cwd <String> Current working directory of the child process
    • env <Object> Environment key-value pairs
    • stdio <Array> | <String> Child's stdio configuration. (See options.stdio)
    • detached <Boolean> Prepare child to run independently of its parent process. Specific behavior depends on the platform. (See options.detached)
    • uid <Number> Sets the user identity of the process. (See setuid(2).)
    • gid <Number> Sets the group identity of the process. (See setgid(2).)
    • shell <Boolean> | <String> If true, runs command inside of a shell. Uses ‘/bin/sh’ on UNIX, and ‘cmd.exe’ on Windows. A different shell can be specified as a string. The shell should understand the -c switch on UNIX, or /s /c on Windows. Defaults to false (no shell).
  • return: <ChildProcess>

how?

const spawn = require('child_process').spawn;
const fs = require('fs');
function resize(req, resp) {
    const args = [
        "-", // use stdin
        "-resize", "640x", // resize width to 640
        "-resize", "x360<", // resize height if it's smaller than 360
        "-gravity", "center", // sets the offset to the center
        "-crop", "640x360+0+0", // crop
        "-" // output to stdout
    ];

    const streamIn = fs.createReadStream('./path/to/an/image');
    const proc = spawn('convert', args);
    streamIn.pipe(proc.stdin);
    proc.stdout.pipe(resp);
}

In the Node function above (an express.js controller function), we read an image file using a stream. Then we use the spawn method to spawn the convert program (see imagemagick.org) and feed the ChildProcess proc with the image stream. As the proc object produces data, we write that data to resp (a Writable stream), so users see the image immediately, without having to wait for the whole image to be converted (resized).

when?

As spawn returns a stream-based object, it's great for handling applications that produce large amounts of data, or for working with data as it is read in. And since it's stream-based, all the usual stream benefits apply:

  • Low memory footprint
  • Automatically handle back-pressure
  • Lazily produce or consume data in buffered chunks.
  • Evented and non-blocking
  • Buffers allow you to work around the V8 heap memory limit

3. exec

what?

This method spawns a subshell, executes the command in that shell, and buffers the generated data. When the child process completes, the callback function is called with:

  • the buffered data, when the command executes successfully
  • an error (an instance of Error), when the command fails
child_process.exec(command[, options][, callback])
  • command <String> The command to run, with space-separated arguments
  • options <Object>
    • cwd <String> Current working directory of the child process
    • env <Object> Environment key-value pairs
    • encoding <String> (Default: ‘utf8’)
    • shell <String> Shell to execute the command with (Default: '/bin/sh' on UNIX, 'cmd.exe' on Windows. The shell should understand the -c switch on UNIX or /s /c on Windows. On Windows, command-line parsing should be compatible with cmd.exe.)
    • timeout <Number> (Default: 0)
    • maxBuffer <Number> largest amount of data (in bytes) allowed on stdout or stderr; if exceeded, the child process is killed (Default: 200*1024)
    • killSignal <String> (Default: ‘SIGTERM’)
    • uid <Number> Sets the user identity of the process. (See setuid(2).)
    • gid <Number> Sets the group identity of the process. (See setgid(2).)
  • callback <Function> called with the output when process terminates
  • Return: <ChildProcess>

Compared to execFile and spawn, exec doesn't have an args argument, because exec allows us to execute more than one command in a shell. When using exec, if we need to pass arguments to a command, they should be part of the whole command string.

how?

The following code snippet prints out, recursively, all items under the current folder:

const exec = require('child_process').exec;
exec('for i in $( ls -LR ); do echo item: $i; done', (e, stdout, stderr)=> {
    if (e instanceof Error) {
        console.error(e);
        throw e;
    }
    console.log('stdout ', stdout);
    console.log('stderr ', stderr);
});

When running a command in a shell, we have access to all the functionality supported by that shell, such as pipes, redirects, etc.:

const exec = require('child_process').exec;
exec('netstat -aon | find "9000"', (e, stdout, stderr)=> {
    if (e instanceof Error) {
        console.error(e);
        throw e;
    }
    console.log('stdout ', stdout);
    console.log('stderr ', stderr);
});

In the example above, Node spawns a subshell and executes the command netstat -aon | find "9000" in that subshell. The command string includes two commands:

  • netstat -aon: the netstat command with the arguments -aon
  • find "9000": the find command with the argument "9000"

The first command displays all active TCP connections (-a), owning process ids (-o), and addresses and ports in numerical form (-n) on which the computer is listening. The output of this command feeds into the second command, which finds the lines containing 9000. On success, a line like the following prints out:

TCP    0.0.0.0:9000           0.0.0.0:0              LISTENING       11180

when?

exec should be used when we need shell functionality such as pipes, redirects, backgrounding, etc.

Notes
  • exec executes the command in a shell, which maps to /bin/sh (Linux) and cmd.exe (Windows)
  • Executing a command in a shell using exec is powerful. However, exec should be used with caution, as shell injection can be exploited. Whenever possible, use execFile instead: invalid arguments passed to execFile yield an error rather than being interpreted by a shell

4. fork

what?

The child_process.fork() method is a special case of child_process.spawn() used specifically to spawn new Node.js processes. Like child_process.spawn(), a ChildProcess object is returned. The returned ChildProcess will have an additional communication channel built-in that allows messages to be passed back and forth between the parent and child.

The fork method opens an IPC channel that allows message passing between Node processes:

  • In the child process, process.on('message') and process.send('message to parent') can be used to receive and send data
  • In the parent process, child.on('message') and child.send('message to child') are used

Each process has its own memory and its own V8 instance, costing at least roughly 30 ms of startup time and 10 MB of memory each.

child_process.fork(modulePath[, args][, options])
  • modulePath <String> The module to run in the child
  • args <Array> List of string arguments
  • options <Object>
    • cwd <String> Current working directory of the child process
    • env <Object> Environment key-value pairs
    • execPath <String> Executable used to create the child process
    • execArgv <Array> List of string arguments passed to the executable (Default: process.execArgv)
    • silent <Boolean> If true, stdin, stdout, and stderr of the child will be piped to the parent; otherwise they will be inherited from the parent. See the 'pipe' and 'inherit' options for child_process.spawn()'s stdio for more details (Default: false)
    • uid <Number> Sets the user identity of the process. (See setuid(2).)
    • gid <Number> Sets the group identity of the process. (See setgid(2).)
  • Return: <ChildProcess>

how?

//parent.js
const cp = require('child_process');
const n = cp.fork(`${__dirname}/sub.js`);

n.on('message', (m) => {
  console.log('PARENT got message:', m);
});

n.send({ hello: 'world' });
//sub.js
process.on('message', (m) => {
  console.log('CHILD got message:', m);
});

process.send({ foo: 'bar' });

when?

Since the Node main process is single-threaded, long-running tasks like heavy computation tie up the main process. As a result, incoming requests can't be serviced and the application becomes unresponsive. Offloading long-running tasks from the main process by forking a new Node process lets the application keep serving incoming requests and stay responsive.

Summary

Which method should be used to execute an external application can be summarized in the image below.
