Serverless IoT Analytics with OpenWhisk Part 3 — How to keep state?
This is the third part of a series.
FaaS is stateless: it provides neither long-term storage nor short-term memory. If we want to keep “something” across invocations, it has to be stored externally.
Watson IoT Analytics has two kinds of “state”. One is the rules and their triggering actions. The other is for stateful rule processing, the so-called “frequency requirement” on rules.
Let’s look at rules and triggering actions first.
In the previous parts, I simply embedded them in the code. That works fine. But wait! Hard-coded data? Shouldn’t we use a database?
That’s not hard to do, so let’s start by using Cloudant to store the rules and triggering actions. Both are stored as documents in Cloudant, and a map-reduce view provides an easy way to get all rules with their associated triggering actions embedded.
My revised OpenWhisk action now sends a query to Cloudant to get all rules (together with their associated triggering actions):
I’ve added three parameters for the Cloudant-related configuration. They need to be set as default parameters, since my OpenWhisk action is invoked by the Message Hub trigger, which knows nothing about Cloudant.
> wsk action update iot-analytics-3 iot-analytics-3.js
> wsk action update iot-analytics-3 -p cloudant_username <cloudant account> -p cloudant_password <cloudant password> -p cloudant_db <name of cloudant database>
Now the fun part: since we are using an external database, is this slower than the embedded version?
Not by much, it seems. The 95th percentile duration is 0.64 (embedded) vs. 0.68 (Cloudant) seconds, which is close enough to ignore. However, the 99th percentile shows the impact of the remote database query at the slower end (worst cases): 0.74 vs. 1.02 seconds.
Note that I did use a simple cache to avoid querying Cloudant on every invocation, which is likely why the latencies are comparable. Most invocations simply used the cached data in memory. As mentioned in part 2, OpenWhisk action instances do get “reused/cached”, and this kind of optimization is pretty common (and very useful for performance).
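The cache can be sketched like this. Globals declared outside `main` survive across invocations whenever OpenWhisk reuses a warm action container; the TTL value here is my choice for illustration, not from the original code:

```javascript
// Per-container cache: these survive between invocations of a warm container.
let cachedRules = null;
let cachedAt = 0;
const TTL_MS = 60 * 1000; // refresh at most once a minute (arbitrary choice)

// fetchFromCloudant is whatever function actually queries the database;
// it is only called on a cold start or when the cache has gone stale.
async function getRules(fetchFromCloudant) {
  const now = Date.now();
  if (!cachedRules || now - cachedAt > TTL_MS) {
    cachedRules = await fetchFromCloudant();
    cachedAt = now;
  }
  return cachedRules;
}
```

A cold container pays for one Cloudant round trip; every warm invocation within the TTL window reads straight from memory.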
For the record, this already works well enough for my use case.
BTW, Watson IoT Analytics does similar caching. In fact, it does not query the database at all: all rules and triggering actions are kept in a local cache, and a notification mechanism injects changes into running instances. This makes extremely short processing times possible.
Embedded vs. External Database
Before we move on, I want to drill down on this one further.
It’s probably against common (good) programming practice to embed data in code. But if the network to the database is not fast enough, the embedded approach is one of the few choices for near real-time IoT use cases.
I’ve been pondering this for some time. The downside of the embedded approach is increased complexity in function deployment management. When functions are deployed, we need some kind of templating (fragment replacement) to inject the data into the code. This should be built into the deployment pipeline, with the database (where the data is stored) as input. We obviously need to track the “base” code for every deployed function, as well as the link between a deployed function and the data injected into it. When new code is released or data changes, we need to replace/upgrade all related deployed functions.
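The fragment-replacement step itself is simple. A hypothetical sketch of the pipeline stage that renders a tenant’s rules into the code template before running `wsk action update` (the placeholder token and template are my invention):

```javascript
// Inject a tenant's rules into a code template. The pipeline would write
// the result to a file and deploy it with `wsk action update`.
function renderAction(template, rules) {
  // Replace the placeholder token with the rules as a JSON literal.
  return template.replace('/*__RULES__*/', JSON.stringify(rules));
}

// A toy template; the real one would be the full action source.
const template =
  'const RULES = /*__RULES__*/;\n' +
  'function main(msg) { /* evaluate RULES against msg */ }';

const deployed = renderAction(template, [{ condition: 'temp > 30' }]);
```

The tracking burden described above is exactly the bookkeeping around this function: which template version and which rules document produced each deployed artifact.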
That sounds complicated! But there are a few reasons why this may not be as bad an idea as it seems:

- Before FaaS, having general code serve different “data” meant we only needed to run one or just a few servers. This is called multitenancy. With FaaS, this is no longer necessary: since FaaS only charges for actual usage, we can conveniently deploy one instance per “tenant”. The unit can be per user or even per rule. No use, no cost, no worry.
- FaaS also takes care of elastic scaling for each deployed function individually, through peak, off-peak, and even no-traffic periods.
- The data to be injected into any one function is small: just one or a few rules of a single user.
- FaaS deployment is both easy and fast, so instantly redeploying the related functions when data changes is no big deal.
If the latency of using an external database is a concern, this could be a viable option.
That being said, as long as the external database works fine, I will not pursue this approach at least for now.
On to the second kind of state: Watson IoT Analytics rules can have a frequency requirement specified. It is basically a count- and/or time-based constraint controlling whether rule triggering is carried out.
Stateful processing requires short-term storage for intermediate results. The fastest short-term storage would be in-process memory or host storage; unfortunately, FaaS provides neither. To implement the “frequency requirement” on rules, we need external storage to keep the “state”.
For example, for the third option, “Trigger only the first time conditions are met and reset when conditions are no longer met”, we can save each incoming message’s timestamp along with its rule-match result in Cloudant. Whenever there is a rule match, we fetch the list of messages sorted by timestamp in descending order. If the immediately previous message did not have a match, we trigger the rule’s associated actions; otherwise, we don’t.
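The decision itself reduces to a small pure function. This sketch assumes the documents fetched from Cloudant have the shape `{ ts, matched }` and arrive sorted by timestamp descending, so `history[0]` is the current message:

```javascript
// "Trigger only the first time conditions are met and reset when
// conditions are no longer met": fire only when the current message
// matches and the immediately previous one did not.
function shouldTrigger(history) {
  if (history.length === 0 || !history[0].matched) return false;
  const previous = history[1];
  // No previous message, or the previous one did not match (i.e. the
  // condition was "reset" before this match): trigger now.
  return !previous || !previous.matched;
}
```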
However, there’s another restriction of FaaS (at least in the offerings currently on the market) that makes stateful “streaming” processing difficult: there’s no ordering guarantee on invocations. The following is from OpenWhisk’s documentation:
Invocations of an action are not ordered. If the user invokes an action twice from the command line or the REST API, the second invocation might run before the first. If the actions have side effects, they might be observed in any order.
Additionally, there is no guarantee that actions will execute atomically. Two actions can run concurrently and their side effects can be interleaved. OpenWhisk does not ensure any particular concurrent consistency model for side effects. Any concurrency side effects will be implementation-dependent.
The only way around this is to roll our own “buffering/windowing/watermarking”. I doubt that’s worth the effort. At this point, if message order is required, I’d say use a stream processing platform.
The simple cache does not account for changes to the rules and/or triggering actions on the Cloudant side. To add stale-data eviction, we can use an OpenWhisk Cloudant trigger to receive notifications of such changes and force the corresponding OpenWhisk actions to reload their cached data accordingly.
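One possible shape for this, sketched below. Since we cannot reach into warm containers directly, the handler invoked by the Cloudant changes trigger bumps a tiny “rules version” record, and the analytics action compares its cached version against it on each invocation, which is much cheaper than re-reading every rule. All names here are my invention; in a real deployment the version record would itself live in Cloudant:

```javascript
// Stand-in for a small version document stored in Cloudant.
let versionDoc = { _id: 'rules-version', version: 0 };

// Invoked by the OpenWhisk Cloudant changes trigger whenever a rule or
// triggering-action document changes: bump the version so caches go stale.
function main(change) {
  versionDoc.version += 1; // in reality: update the doc in Cloudant
  return { invalidated: change.id, version: versionDoc.version };
}

// Used by the analytics action before trusting its cached rules.
function cacheIsStale(cachedVersion) {
  return cachedVersion === null || cachedVersion < versionDoc.version;
}
```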