Middleware is easily one of Extend’s (and Webtask.io’s) most powerful features. You can use it for simple “run this before” style actions or to build full compilers that support other languages. Recently, though, a great question was posted on our support forums:

A client of mine would like to ingest device generated event data into our platform. I’d like to be able to use Extend for normalizing the data I receive into an output that can be processed by my post task execution scripts. Since this device generated data has a fairly high throughput (up to 10 req/s), I want to batch the normalization process so I can use 1 webtask execution to normalize an array of events.

This is an interesting question. They wanted to let their users write extensions that process an atomic unit of data, while the platform executes those extensions in batches. I wrote up a quick demo for the user and I’d like to share how I implemented it.

The Extension

First, I began by imagining a hypothetical extension related to processing events. Events would have an ID, timestamp, and status. Status represents how “serious” the event is; if it is over a certain threshold, we want to modify the data to flag the event as something that needs follow-up. Here’s that sample code:

module.exports = async function(event) {

  //status values range from 1 (ok!) to 5 (very, very bad!)
  if(event.status && event.status > 3) event.followup = true;
  else event.followup = false;

  return event;
    
};

A couple of things to note here. First, I’m using the async keyword even though my actual logic is synchronous. I’ll do this when prototyping code that I believe will most likely be asynchronous in the future. It’s not required here, but it helps future-proof the code, and my middleware is going to assume it’s asynchronous. Secondly - the function expects one argument, event, which represents the event fired by an IoT device. This is not one of the supported programming models, but this is where the power of middleware truly comes into play: the end user doesn’t need to know anything about Extend, context objects, or anything else. Finally, the logic itself simply adds a followup flag to events that need to be checked.
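To see that contract in action, here’s a quick way to exercise the extension directly; the id and timestamp values are made up for the demo:

```javascript
// Same extension logic as above, assigned to a local name for the demo.
const normalize = async function(event) {
  // status values range from 1 (ok!) to 5 (very, very bad!)
  if (event.status && event.status > 3) event.followup = true;
  else event.followup = false;
  return event;
};

// A status of 5 is over the threshold, so the event gets flagged.
normalize({ id: 'evt-1', timestamp: 1522072800000, status: 5 })
  .then(event => console.log(event.followup)); // true
```

Because of the async keyword, the function always returns a promise, which is exactly what the middleware below relies on.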

The Middleware/Compiler

Now let’s take a look at the middleware (which is also acting as a compiler):

const rawBody = require('raw-body');

/*
This middleware is passed an array of events and runs the compiled
webtask (the extension) against each one.
*/

module.exports = () => {

	return (req, res, next) => {
		// Grab the raw extension source and compile it into a function.
		let handler = req.webtaskContext.compiler.script;
		req.webtaskContext.compiler.nodejsCompiler(handler, (e, func) => {
			if(e) return next(e);

			rawBody(req, {encoding:'utf-8'}, (err, body) => {
				if(err) return next(err);

				let data = JSON.parse(body);

				// Run the compiled extension over every event in the batch.
				let promises = data.map(d => func(d));

				Promise.all(promises)
				.then(results => {
					return res.end(JSON.stringify(results));
				})
				.catch(next);

			});

		});
	}

}

So what do we have going on here? The core of the middleware acts on the extension script, which it grabs via req.webtaskContext.compiler.script. It then compiles that script using the Node.js compiler passed to all middleware. The net result is a function object, represented here by the variable func, that is an executable version of the extension code shown earlier.

Once we have that, we can look at the data passed in. As the comment says, the assumption is that an array of objects was passed in. Right now the middleware will not correctly handle invalid input, but it would be trivial to add that support.
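As a sketch of what that support could look like, a small guard (entirely hypothetical, not part of the demo) could reject malformed bodies before any extension code runs:

```javascript
// Hypothetical guard that could run before processing the batch.
// Returns an Error describing the problem, or null if the input is usable.
function validateBatch(body) {
  let data;
  try {
    data = JSON.parse(body);
  } catch (e) {
    return new Error('Request body is not valid JSON');
  }
  if (!Array.isArray(data)) {
    return new Error('Expected a JSON array of events');
  }
  return null;
}
```

In the middleware, a non-null return value would simply be passed to next() so the platform reports the failure.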

Since the extension is asynchronous (ok, not really, but that’s alright), we make use of the Promise object’s all method to handle waiting for all of the calls to wrap up.
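One detail that makes this safe: Promise.all resolves with results in the same order as the input array, regardless of when each promise settles, so each normalized event lines up with the event that produced it. A tiny illustration, where the delay helper exists only for the demo:

```javascript
// Resolves with `value` after `ms` milliseconds.
const delay = (ms, value) =>
  new Promise(resolve => setTimeout(() => resolve(value), ms));

// The second promise settles first, but the results keep input order.
Promise.all([delay(20, 'first'), delay(5, 'second')])
  .then(results => console.log(results)); // [ 'first', 'second' ]
```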

And that’s pretty much it. This middleware assumes the caller is doing the batching, but it would be possible to devise a system where a large array of data is passed in and stored (perhaps with the built-in Storage system), paired with a CRON-based task that handles getting batches of ten items at a time. Again - the flexibility is there for you to build it as you see fit - but none of this impacts or adds complexity to the simple customization the end user was able to write.