Consuming a DynamoDB stream in the browser

At ironSource we wanted to build a serverless web application for one of our internal systems, which relies heavily on DynamoDB and Lambda.

To do that, we needed to consume a DynamoDB stream directly from the browser, with no server mediating between the browser and the stream. The result was this module, open sourced under the MIT license on ironSource's GitHub account.

Basically...

Browserify this:

var DynamoDBStream = require('dynamodb-stream')
var aws = require('aws-sdk')
var domReady = require('domready')

aws.config.region = 'us-east-1'
aws.config.credentials = new aws.CognitoIdentityCredentials({
    IdentityPoolId: '[your cognito pool id here]'
})

domReady(function () {
    // obtain (unauthenticated) credentials from Cognito before touching the stream
    aws.config.credentials.get(function (err) {
        if (err) { return console.error('error getting credentials', err) }

        var ddbStream = new DynamoDBStream(new aws.DynamoDBStreams(), '[your dynamodb stream arn here]')

        // update your local app state in these handlers
        ddbStream.on('insert record', function (record) {})
        ddbStream.on('remove record', function (record) {})
        ddbStream.on('modify record', function (newRecord, oldRecord) {})

        // poll the stream once a second
        setInterval(function () {
            ddbStream.fetchStreamState(function (err) {
                if (err) { return console.error('error fetching stream state', err) }
                console.log('stream state fetched successfully')
            })
        }, 1000)
    })
})

And you're done.

What's going on here?

When the DOM is ready, we use Cognito to assume an unauthenticated role. In a real-world scenario, this will probably be a prelude to upgrading to an authenticated role. (more details here)

Once we have set up the authentication (and possibly authorization), we proceed to create an instance of DynamoDBStream. This instance exposes three events that you can use to modify your local app state: 'insert record', 'remove record' and 'modify record'.

Then, we set up a polling procedure. The polling is needed to get updates from the stream. This is not very efficient, but it will do for now. (Hi AWS folks, if you're reading this, it would be super cool if you exposed a websocket interface for Kinesis and DynamoDB streams.)
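One thing to watch with setInterval is that a slow request can still be in flight when the next tick fires. Here is a minimal sketch of an alternative that schedules the next poll only after the previous one has called back. It assumes fetchStreamState(callback) behaves as in the example above. The pollStream helper and its injectable schedule parameter are hypothetical and not part of dynamodb-stream.

```javascript
// Hypothetical helper, not part of dynamodb-stream: polls again only after
// the previous fetchStreamState() call has called back.
// `schedule` defaults to setTimeout and is injectable for testing.
function pollStream(ddbStream, delayMs, onError, schedule) {
    schedule = schedule || setTimeout
    var stopped = false

    function tick() {
        if (stopped) { return }
        ddbStream.fetchStreamState(function (err) {
            if (err && onError) { onError(err) }
            // wait delayMs after completion, whether it succeeded or not
            if (!stopped) { schedule(tick, delayMs) }
        })
    }

    tick()

    // call the returned function to stop polling
    return function stop() { stopped = true }
}
```

In the example above this would replace the setInterval call with something like pollStream(ddbStream, 1000, console.error).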

That's basically it.

Ingredients
  • dynamodb-stream module
  • A DynamoDB table with a stream enabled (new and old images)
  • A Cognito identity pool whose role allows the required DynamoDB stream actions: DescribeStream, GetRecords, GetShardIterator (for the purpose of this exercise we will use an unauthenticated pool).
  • domready module
  • browserify installed globally

Additional notes
  • I'm not sure how many concurrent connections a DynamoDBStream can support.
  • In a real-world scenario, one would start by calling fetchStreamState(), then scan the whole table, populate the local state, and only then start mutating it from the stream events. More information about this approach can be found here
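
The bootstrap order from the last note could be sketched roughly like this. The event names and argument order come from the example above; everything else is an assumption for illustration. Records are treated as plain objects with a unique key attribute (here a hypothetical `id`), and scanTable is a function you would supply yourself that calls back with all the table's items. Neither createLocalState nor bootstrap is part of dynamodb-stream.

```javascript
// A sketch only: keeps a local copy of the table in a plain object,
// indexed by the attribute named in `keyName` (a hypothetical key name).
function createLocalState(keyName) {
    var items = {}
    return {
        items: items,
        // upsert an item (used for scan results, inserts and modifications)
        put: function (record) { items[record[keyName]] = record },
        remove: function (record) { delete items[record[keyName]] }
    }
}

// `scanTable(callback)` is a hypothetical function that calls back with
// (err, arrayOfItems); it is not provided by dynamodb-stream.
function bootstrap(ddbStream, scanTable, state, done) {
    // 1. call fetchStreamState() once, as the note above suggests
    ddbStream.fetchStreamState(function (err) {
        if (err) { return done(err) }
        // 2. scan the whole table and populate the local state
        scanTable(function (err, records) {
            if (err) { return done(err) }
            records.forEach(state.put)
            // 3. only now start mutating the state from stream events
            ddbStream.on('insert record', state.put)
            ddbStream.on('modify record', function (newRecord) { state.put(newRecord) })
            ddbStream.on('remove record', state.remove)
            done(null, state)
        })
    })
}
```

After done is called, the polling procedure can start, and each later fetchStreamState() call feeds the handlers registered in step 3.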

Why did I write this module?

After reviewing the other methods described below, I realized that I needed to implement something from scratch if I wanted to connect the browser directly to AWS services without any middleware.

What is Cognito?

Advertised heavily as a service for saving mobile user data in the cloud, Amazon Cognito exposes a second (or third?) generic authentication mechanism called Identity Pools which can be used to gain access to AWS services and resources.

Cognito is designed to work from a client (such as a browser or a mobile device) instead of a server, and supports both authenticated and "unauthenticated" access to AWS. There is also a cherry on top of the Cognito cake: it integrates with other SSO services like Facebook, Twitter, Google and even custom providers.
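
As a rough illustration of the difference between the two kinds of access, the options passed to aws.CognitoIdentityCredentials gain a Logins map once the user has signed in with a provider. The sketch below assumes Facebook as the provider and assumes you already have a Facebook access token from its login flow; cognitoParams is a hypothetical helper name.

```javascript
// Sketch: builds the options object for aws.CognitoIdentityCredentials.
// Without a token the identity is unauthenticated; with one, the Logins map
// links the identity to the provider. 'graph.facebook.com' is the provider
// key Cognito uses for Facebook.
function cognitoParams(identityPoolId, facebookToken) {
    var params = { IdentityPoolId: identityPoolId }
    if (facebookToken) {
        params.Logins = { 'graph.facebook.com': facebookToken }
    }
    return params
}

// usage, in the browser:
// aws.config.credentials = new aws.CognitoIdentityCredentials(cognitoParams(poolId, token))
```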


Alternatives for consuming a DynamoDB stream

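Gaining access with Cognito is only the first step. Next comes the actual consumption of the stream. The DynamoDB documentation describes several methods for processing a DynamoDB stream: the low level API, the DynamoDB Streams Kinesis adapter, and Lambda functions using DynamoDB triggers.

I'll briefly go over them.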

Low Level API

In my opinion, the AWS JavaScript SDK in general, and the DynamoDB Streams API in particular, leave a lot to be desired. The API is extremely verbose and, in the case of streams, does not implement the full functionality needed to consume streams properly.
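
To give a sense of that verbosity, here is a rough sketch of reading a single batch of records from the first shard with the low level API. The client argument is assumed to be an aws.DynamoDBStreams instance, and readFirstShard is a hypothetical name. The sketch deliberately ignores everything a real consumer needs to handle: multiple shards, following NextShardIterator, shard splits and closed shards.

```javascript
// Sketch: three nested calls just to read one batch from one shard.
// `client` is assumed to be an instance of aws.DynamoDBStreams.
function readFirstShard(client, streamArn, callback) {
    client.describeStream({ StreamArn: streamArn }, function (err, data) {
        if (err) { return callback(err) }
        var shards = data.StreamDescription.Shards
        if (shards.length === 0) { return callback(null, []) }

        var params = {
            StreamArn: streamArn,
            ShardId: shards[0].ShardId,
            ShardIteratorType: 'TRIM_HORIZON' // start from the oldest available record
        }
        client.getShardIterator(params, function (err, data) {
            if (err) { return callback(err) }
            client.getRecords({ ShardIterator: data.ShardIterator }, function (err, data) {
                if (err) { return callback(err) }
                // data.NextShardIterator would be needed for the next batch
                callback(null, data.Records)
            })
        })
    })
}
```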

DynamoDB streams Kinesis adapter

Sadly, the Kinesis API suffers from the same affliction as the low level API. However, Amazon provides a Java implementation called KCL (and KPL for producers) that encapsulates the complex logic needed to consume a stream.

Unfortunately, the rest of the SDK implementations (like JavaScript) depend on this Java implementation: in order to consume a stream or produce data, one needs to run KCL or KPL in a separate process and communicate with it using a custom JSON protocol.

In my opinion, when running a node.js server on a micro machine, it's quite an overkill to run a separate JVM process simply to be able to communicate with the stream.

Of course, inside a browser, it will be hard, if not impossible, to use the adapter.

Lambda function using DynamoDB triggers

"Basically, I love lambda"

It's such a good instrument for so many things.

DynamoDB triggers and Lambda take most of the pain away from stream consumption. The coding is very simple: the function is invoked with one or more stream records, you process them and enjoy life. No need to iterate over shards or any of that stuff. Lambda, however, is another server-side-only solution.
