Cross Origin Techniques

To understand the techniques, we need to understand the problem we’re trying to solve: what are cross-origin requests?

But first, what is an origin?

An origin defines where a resource lives.

The following URLs:

have the same origin: https://dzone.com

And the following URLs have different origins:

URL                                                   Origin
https://canho.me/2016/06/18/awaiting-aws-resources    https://canho.me
http://canho.me                                       http://canho.me
https://api.canho.me                                  https://api.canho.me
https://canho.me:5000                                 https://canho.me:5000
file:///D:/projects/node-sample/package.json          null

So, the origin is everything in the URL up until the path. In other words, the origin is a combination of the scheme, host, and port.
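
As a quick check of this definition, the standard URL class (a global in browsers and in Node.js) exposes the origin of a URL directly. The snippet below is a minimal sketch and the variable names are illustrative. Note that runtimes may serialize the origin of file: URLs differently, so that case is left out.

```javascript
// Compute the origin of a few URLs with the standard URL class.
const urls = [
    'https://canho.me/2016/06/18/awaiting-aws-resources',
    'http://canho.me',
    'https://api.canho.me',
    'https://canho.me:5000'
];

const origins = urls.map(function (url) {
    // URL#origin serializes scheme + host + port (a default port is omitted)
    return new URL(url).origin;
});

origins.forEach(function (origin, i) {
    console.log(urls[i] + ' -> ' + origin);
});
```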

Same-origin vs Cross-origin

When we talk about same-origin and cross-origin, we are comparing the origins of two parties: the client and the server. The origin of the client making the request is the client origin; the origin of the server receiving the request is the server origin.

A request is a same-origin request when the client origin and the server origin are exactly the same.

Otherwise, the request is a cross-origin request.
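
The comparison can be sketched as a small helper. The function name isSameOrigin is hypothetical; it simply compares scheme, host, and port as described above.

```javascript
// Two URLs are same-origin when scheme, host, and port all match.
function isSameOrigin(clientUrl, serverUrl) {
    const client = new URL(clientUrl);
    const server = new URL(serverUrl);
    return client.protocol === server.protocol &&
        client.hostname === server.hostname &&
        client.port === server.port;
}

console.log(isSameOrigin('https://canho.me/blog', 'https://canho.me/api/users')); // true
console.log(isSameOrigin('https://canho.me/blog', 'https://api.canho.me/users')); // false
console.log(isSameOrigin('https://canho.me/blog', 'http://canho.me/blog'));       // false
```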

But what is the problem anyway? Why should I care? The same-origin policy!
The same-origin policy restricts how a document or script loaded from one origin can interact with a resource from another origin. It is a critical security mechanism for isolating potentially malicious documents.

The same-origin policy is necessary, but it is restrictive enough to cause problems for servers that span multiple domains, and it makes it hard for a server to open up its APIs to a new world of users.

Below are some techniques for dealing with cross-origin requests:

JSONP

JSONP (JSON with padding) is the oldest technique. It relies on the fact that the browser doesn’t impose the same-origin policy on the script tag, so we use a script tag to make cross-origin requests:

<!DOCTYPE html>
<html>
    <head>
        <script>
            // Callback invoked by the JSONP response with the list of users
            function loadUsers(users) {

            }
        </script>
        <script src="https://api.github.com/users?jsoncallback=loadUsers"></script>
    </head>
    <body>
    <div id="users">
    </div>


    </body>
</html>

The important part of the script tag is the jsoncallback parameter. Its value is the name of an existing function, loadUsers. When sending a response to the client, the server first pads the response with the name of the callback function, like this:

loadUsers([{"id": "user1",...}, {"id": "user2"}])

When the client receives the response, it calls the callback function with the actual data returned by the server.
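
The padding step on the server can be sketched as follows. This is a minimal illustration, not the code of any particular API; the helper name padJsonp and the identifier check are assumptions added for the example.

```javascript
// Hypothetical server-side helper: wrap a JSON payload in a call to the
// callback named by the client (the "padding" in JSONP).
function padJsonp(callbackName, data) {
    // Accept only plain identifiers so the response cannot carry arbitrary script
    if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(callbackName)) {
        throw new Error('Invalid callback name: ' + callbackName);
    }
    return callbackName + '(' + JSON.stringify(data) + ')';
}

const body = padJsonp('loadUsers', [{ id: 'user1' }, { id: 'user2' }]);
console.log(body); // loadUsers([{"id":"user1"},{"id":"user2"}])
```

Because the browser executes the response as a script, the callback name is validated before it is echoed back.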

JSONP only supports GET requests. Ideally, it is used for sharing public data.

Cross-origin messaging

HTML5’s postMessage method allows documents from different origins to communicate with each other.

A page that wants to send cross-origin requests embeds a document from the server in an iframe and uses postMessage to communicate with that iframe. As the iframe and the server share the same origin, requests from the iframe to the server are same-origin requests.

As postMessage is a low-level API, we may want a library that acts as an abstraction layer and provides high-level messaging semantics on top of postMessage. Several such libraries exist.

As postMessage is widely supported (see http://caniuse.com/#search=postmessage), this technique can be used in most cases.
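
On the receiving side, a message handler should check where each message came from before using it. The sketch below assumes a hypothetical factory, createMessageHandler, and calls the handler with event-like objects so it can run outside a browser. In a page, the handler would be registered with window.addEventListener('message', handler), and the other side would send with postMessage(message, targetOrigin).

```javascript
// Build a "message" event handler that ignores events from untrusted origins.
// trustedOrigin and onData are illustrative parameters.
function createMessageHandler(trustedOrigin, onData) {
    return function handleMessage(event) {
        if (event.origin !== trustedOrigin) {
            return false; // drop messages from any other origin
        }
        onData(event.data);
        return true;
    };
}

const received = [];
const handler = createMessageHandler('https://api.canho.me', function (data) {
    received.push(data);
});

// Stand-ins for the MessageEvent objects a browser would deliver
handler({ origin: 'https://api.canho.me', data: { users: [] } });
handler({ origin: 'https://evil.example', data: 'ignored' });
console.log(received.length); // 1
```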

Using proxy server

The same-origin policy is imposed by the browser on JavaScript code running in the browser. There is no such policy on the server. So we can use the client’s own server as a proxy that receives requests from the client and forwards them to the target server.

Using this technique enables almost any type of cross-origin requests.
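
The forwarding step of such a proxy might look like the sketch below. It is an assumption-laden illustration: forwardToUpstream is a hypothetical name, only GET is handled, and the fetch implementation is injectable so the example runs without a network (by default it would use the global fetch available in recent Node.js versions).

```javascript
// Hypothetical forwarding step of a proxy running on the client's own server.
async function forwardToUpstream(upstreamOrigin, pathAndQuery, fetchImpl) {
    const doFetch = fetchImpl || fetch;
    // Resolve the incoming path against the origin of the real API
    const target = new URL(pathAndQuery, upstreamOrigin);
    const upstream = await doFetch(target.href, { method: 'GET' });
    return { status: upstream.status, body: await upstream.text() };
}

// Usage with a stand-in for fetch that records the URL it was asked for
const fakeFetch = async function (url) {
    return { status: 200, text: async function () { return 'fetched ' + url; } };
};

forwardToUpstream('https://api.canho.me', '/users?page=2', fakeFetch)
    .then(function (response) {
        console.log(response.status, response.body);
    });
```

The browser only ever talks to its own origin; the cross-origin hop happens server to server, where no same-origin policy applies.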

CORS

CORS (Cross-Origin Resource Sharing) is a W3C spec that allows cross-origin communication. CORS works by adding new HTTP headers that allow servers to describe the set of origins that are permitted to interact with the server. Most of this technique is a matter of server configuration.

Figure: CORS flow with a preflight request

CORS headers are prefixed with Access-Control-:

  • Access-Control-Allow-Origin (required): This header must be included in all valid responses. Possible values: * or a specific origin
  • Access-Control-Allow-Methods: indicates the methods allowed when accessing the resource
  • Access-Control-Allow-Headers: used in response to a preflight request to indicate which HTTP headers can be used when making the actual request
  • Access-Control-Allow-Credentials: indicates if the server allows credentials during CORS requests
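
To make the headers above concrete, here is a minimal sketch of how a server might choose them for a given request. The function name corsHeaders, the allow-list, and the method and header lists are illustrative assumptions, not a recommended policy.

```javascript
// Build CORS response headers for a request, given a list of allowed origins.
function corsHeaders(requestOrigin, allowedOrigins) {
    if (!allowedOrigins.includes(requestOrigin)) {
        return {}; // no CORS headers: the browser will not expose the response
    }
    return {
        // Echo the specific origin; '*' cannot be combined with credentials
        'Access-Control-Allow-Origin': requestOrigin,
        'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE',
        'Access-Control-Allow-Headers': 'Content-Type, Authorization',
        'Access-Control-Allow-Credentials': 'true',
        // Tell caches that the response depends on the requesting origin
        'Vary': 'Origin'
    };
}

const allowed = ['https://canho.me'];
console.log(corsHeaders('https://canho.me', allowed));
console.log(corsHeaders('https://evil.example', allowed)); // {}
```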

A great post about CORS can be found here

CORS can be used as a modern alternative to the JSONP pattern. While JSONP supports only the GET request method, CORS also supports other types of HTTP requests. Using CORS enables a web programmer to use regular XMLHttpRequest, which supports better error handling than JSONP. (wiki)

Summary

  • CORS is the standardized mechanism for making cross-origin requests. A large part of CORS is server configuration. Almost all browsers support CORS.
  • The first 3 techniques follow the same pattern: a proxy object receives the request from the client and sends it to the server.
  • The first 3 techniques require custom code. This leads to additional maintenance cost.

Awaiting AWS resources

Normally, when we work with AWS using the AWS SDK, we need to wait for AWS resources to reach a specific status (an EC2 instance is running, a Kinesis stream is active, an OpsWorks deployment has succeeded…) before we can continue. This can be done by continuously polling the AWS resources until they reach the desired status.

Below is sample code for polling a newly created Kinesis stream until it’s active.

function waitForStreamActive(streamName){
    let count = 0;
    const interval = 5000; // milliseconds between polls
    const maxTries = 15;
    return (function wait(){
        return describeStream({StreamName : streamName}).then((data)=>{
            if(data.StreamDescription.StreamStatus === 'ACTIVE'){
                return streamName;
            }
            count++;
            logger.info(`Waiting for the stream ${streamName} to become active: ${count}`);
            // The stream is not active yet. Wait before polling again
            // (Promise.delay is provided by bluebird)
            if(count < maxTries){
                return Promise.delay(interval).then(wait);
            }
            throw new Error(`Max tries ${count} reached but the stream ${streamName} is still not active`);
        });
    }());
}

We don't want to wait forever. In the code above, when a poll completes, we wait 5 seconds (interval) before the next poll, and we make at most 15 tries (maxTries). If the resource hasn't reached the desired status after maxTries, we stop polling.
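
The same interval and maxTries logic can be pulled out into a generic helper that uses only native Promises. This is a sketch under assumptions: pollUntil and its options are hypothetical names, and check stands in for any call such as describeStream that reports whether the resource is ready.

```javascript
// Generic polling helper using native Promises.
// check() must return a Promise resolving to true once the resource is ready.
function pollUntil(check, options) {
    const interval = options.interval; // milliseconds between tries
    const maxTries = options.maxTries;
    let count = 0;
    function attempt() {
        return check().then(function (ready) {
            if (ready) {
                return count + 1; // number of tries it took
            }
            count++;
            if (count >= maxTries) {
                throw new Error('Max tries ' + count + ' reached');
            }
            return new Promise(function (resolve) {
                setTimeout(resolve, interval);
            }).then(attempt);
        });
    }
    return attempt();
}

// Usage with a stand-in check that becomes ready on the third call
let calls = 0;
pollUntil(function () { return Promise.resolve(++calls >= 3); }, { interval: 10, maxTries: 15 })
    .then(function (tries) { console.log('ready after ' + tries + ' tries'); });
```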

I kept writing my own polling code (partly because I was in a rush) until I realized that the AWS SDK provides an API for this need (see waitFor):

waitFor(state, params, callback) ⇒ void

As waitFor is defined on an abstract class (AWS.Service), we need to consult the specific service class for the supported state names.

So, the code above can be rewritten using the AWS waitFor API as follows:

waitFor('streamExists', {StreamName: 'stream name'})
    .then(function(data){
        console.log(data);
    })
    .catch(function(err) {
        console.error(err);
    });

Sadly, the AWS SDK for Node doesn't seem to allow us to configure the interval and maxTries parameters. I hadn't expected that (because I know that the AWS SDK for Ruby does allow it) until I read the documentation carefully and found the hard-coded parameters stored in kinesis-2013-12-02.waiters2.json:

{
  "version": 2,
  "waiters": {
    "StreamExists": {
      "delay": 10,
      "operation": "DescribeStream",
      "maxAttempts": 18,
      "acceptors": [
        {
          "expected": "ACTIVE",
          "matcher": "path",
          "state": "success",
          "argument": "StreamDescription.StreamStatus"
        }
      ]
    }
  }
}

Note: In the code samples above, AWS's callback-style APIs such as kinesis.describeStream, kinesis.waitFor… are converted to Promise style using a Promise library like bluebird.

Understanding middleware pattern in express.js

The term middleware (middle-ware, literally the software in the middle) may be confusing for inexperienced developers, and especially for those coming from the enterprise programming world. That is because in enterprise architecture, middleware brings to mind software suites that shield developers from many low-level, difficult issues, allowing them to concentrate on business logic.

In express.js, a middleware function is defined as follows:

Middleware functions are functions that have access to the request object (req), the response object (res), and the next middleware function in the application’s request-response cycle. The next middleware function is commonly denoted by a variable named next.

A middleware function has the following signature:

function(req, res, next) { ... }

There is a special kind of middleware: error-handling middleware. It is special because it takes four arguments instead of three, which allows express.js to recognize it as error-handling middleware:

function(err, req, res, next) {...}
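
As an illustration of the two signatures, here is one middleware of each kind written as plain functions. The names requestLogger and errorHandler are hypothetical; in an application they would be registered with app.use(requestLogger) and app.use(errorHandler).

```javascript
// A regular middleware: logs the request, then passes control on.
function requestLogger(req, res, next) {
    console.log(req.method + ' ' + req.url);
    next();
}

// An error-handling middleware: note the four parameters.
function errorHandler(err, req, res, next) {
    res.statusCode = 500;
    res.end('Internal error: ' + err.message);
}

// The two kinds differ in their declared parameter count.
console.log(requestLogger.length); // 3
console.log(errorHandler.length);  // 4
```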

Middleware functions can perform the following tasks:

  • Logging requests
  • Authenticating / authorizing requests
  • Parsing the body of requests
  • Ending the request-response cycle
  • Calling the next middleware function in the stack

These tasks are not core concerns (business logic) of an application. Instead, they are cross-cutting concerns that apply throughout the application and affect all of it.

The request-response lifecycle through the middleware pipeline is as follows:

  1. The first middleware function (A) in the pipeline is invoked to process the request.
  2. Each middleware function may end the request by sending a response to the client, invoke the next middleware function (B) by calling next(), or hand the request over to an error-handling middleware by calling next(err) with an error argument.
  3. Each middleware function receives as input the result of the previous middleware function.
  4. If the request reaches the last middleware in the pipeline, we can assume a 404 error:

app.use((req, res) => {
    res.writeHead(404, { 'Content-Type': 'text/html' });
    res.end("Cannot " + req.method.toUpperCase() + " " + req.url);
});
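
The lifecycle described above can be imitated in a few lines. This is a simplified illustration, not express.js's actual implementation; runPipeline is a hypothetical name, and handlers are told apart by their parameter count only.

```javascript
// Simplified illustration of a middleware pipeline.
function runPipeline(middlewares, req, res) {
    let index = 0;
    function next(err) {
        // Pick the next handler for the current mode: 4-argument handlers
        // when an error is pending, all other handlers when it is not
        while (index < middlewares.length) {
            const fn = middlewares[index++];
            if (err && fn.length === 4) {
                return fn(err, req, res, next);
            }
            if (!err && fn.length < 4) {
                return fn(req, res, next);
            }
        }
    }
    next();
}

const trace = [];
runPipeline([
    function a(req, res, next) { trace.push('A'); next(); },
    function b(req, res, next) { trace.push('B'); next(new Error('boom')); },
    function c(req, res, next) { trace.push('C'); next(); }, // skipped after the error
    function onError(err, req, res, next) { trace.push('error: ' + err.message); }
], {}, {});
console.log(trace.join(' -> ')); // A -> B -> error: boom
```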

As we can see, the idea behind the middleware pattern is not new. We can consider middleware pattern in express.js a variant of

This pattern has some benefits:

  • Avoid coupling the sender of a request to the receiver by giving more than one object a chance to handle the request. Both the receiver and the sender have no explicit knowledge of each other.
  • Flexibility in distributing responsibilities among objects. We add or change responsibilities for handling a request by adding to or changing the chain at run-time

Executing async tasks serially with Array#reduce

Suppose that we’re assigned the task of writing a migration tool for a database with the following requirements:

  • The tool will read a list of SQL scripts and execute them serially, one after another.
  • Each script will run once the previous script has completed.
  • If any script execution fails, no more scripts will execute.

This can be done by using a library function such as async#reduce (or async#series):

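async.reduce(files, Promise.resolve(), function(prevPromise, file, callback){
    prevPromise.then(function(){
        return readFileAsync(file, {encoding: 'utf-8'});
    }).then(function(query){
        return execDbAsync(query);
    }).then(function(data){
        // The first argument is the error; the second becomes the next memo,
        // which must be a promise because the next iteration calls .then on it
        callback(null, Promise.resolve(data));
    }).catch(callback);
}, function(err, result){
    // err is set if any script failed; result is the final memo
});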

Without an additional library, the same problem of executing async tasks serially can be solved with the Array#reduce method:

files.reduce(function(prevPromise, curr, curIdx, arr){
    return prevPromise.then(function(){
        return readFileAsync(curr, {encoding: 'utf-8'});
    })
    .then(function(query){
        return execDbAsync(query);
    });
}, Promise.resolve());

The code above returns a Promise which:

  • gets resolved when all chained promises get resolved.
  • gets rejected when any of the chained promises gets rejected.

Compared to async#reduce, the built-in Array#reduce method has some benefits:

  • No additional library needed
  • No callback
  • Less code, as we don’t need to explicitly invoke a callback to signal that a task has completed

The execution order looks like this:

read (script 1) -> exec (script 1) -> read (script 2) -> exec (script 2) -> … -> read (script n) -> exec (script n)
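
This order can be observed with a small self-contained sketch. The read and exec functions here are stand-ins for readFileAsync and execDbAsync that only record when they run; runSerially and the log parameter are names introduced for the example.

```javascript
// Run the Array#reduce chain with stand-in tasks that record their order.
function runSerially(files, log) {
    // Stand-in for readFileAsync: finishes after a short delay
    function readFileAsync(file) {
        return new Promise(function (resolve) {
            setTimeout(function () {
                log.push('read(' + file + ')');
                resolve('-- contents of ' + file);
            }, 5);
        });
    }
    // Stand-in for execDbAsync: records the script it was asked to run
    function execDbAsync(file) {
        log.push('exec(' + file + ')');
        return Promise.resolve();
    }
    return files.reduce(function (prevPromise, file) {
        return prevPromise
            .then(function () { return readFileAsync(file); })
            .then(function () { return execDbAsync(file); });
    }, Promise.resolve());
}

const log = [];
runSerially(['1.sql', '2.sql'], log).then(function () {
    console.log(log.join(' -> '));
    // read(1.sql) -> exec(1.sql) -> read(2.sql) -> exec(2.sql)
});
```

Even though each read is asynchronous, the second script is not read until the first one has been executed, because each step is chained onto the promise returned by the previous iteration.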