The Challenges Of Data Ingestion
Functional services, like AWS Lambda and Azure Functions, are a hot topic in cloud service offerings. They are interesting for many reasons, including their scalability and isolation. But what I find most fascinating about them is how they can make it easy to create pipelines of message transformation.
I have worked at a few companies over the years that depended on scalable ingestion of data through RESTful APIs. In the past, many services solved this problem by over-provisioning resources at every step of a small ingestion pipeline. For example, an ingestion service might have beefy routers splitting traffic to beefier sharded servers that then save the data into even beefier databases. Thankfully, the past few years have produced better solutions.
Kafka (Or The Message Queuing System You Love To Hate)
One major contribution to better data ingestion was Apache Kafka, which is a sort of message queuing system on steroids. Message queuing systems existed before Kafka, but resiliency and durability of messages were hard to guarantee. Kafka made these properties easy to achieve (as long as your consumers could handle idempotency), even if the service behind a Kafka queue was getting hammered for a period of time and couldn't keep up.
However, I also have plenty of experience with the challenges of developing for, monitoring, and scaling Kafka. In short, Kafka is a very small piece of a puzzle that requires a lot of extra tooling built on top of it in order to safely get messages into queues and retrieve them on the other side.
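The idempotency caveat mentioned above is one small example of that extra work. Here is a minimal sketch of an idempotent consumer, assuming every message carries a unique "id" field and that an in-memory set is good enough to remember what has been handled. Both are assumptions made for illustration (a real consumer would likely keep that state in a durable store), and the function and field names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Set


def consume_idempotently(
    messages: Iterable[Dict[str, str]],
    apply: Callable[[Dict[str, str]], None],
    seen_ids: Set[str],
) -> int:
    """Apply each message at most once, keyed on its (hypothetical) 'id' field.

    Returns the number of messages that were actually applied.
    """
    applied = 0
    for message in messages:
        message_id = message["id"]
        if message_id in seen_ids:
            # A redelivered message: skip it so the side effect happens once.
            continue
        apply(message)
        # Only mark the message as seen after it was applied successfully,
        # so a failure in apply() leaves it eligible for a retry.
        seen_ids.add(message_id)
        applied += 1
    return applied
```

With this shape, replaying the same batch after a consumer restart or a retry does no harm: the second pass applies nothing new.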
In Steps Functional Services
I like to think of functional services as more than just compute units like AWS Lambda Functions. To me, functional services also encompass how messages reach these units, and then how these units propagate new messages on to other services in an orderly fashion. This expanded scope includes services like AWS API Gateway and AWS Kinesis Streams. API Gateway adds a layer in front of AWS Lambda to provide a RESTful interface to your functions. Kinesis Streams is an analogue of Apache Kafka.
With these three services you can create a RESTful data ingestion pipeline that is both horizontally scalable, due to the underlying dynamics of functional compute units, and temporally resilient to outages or scalability issues elsewhere in a stack, due to the ability of queuing systems to buffer messages.
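To make the front half of that pipeline concrete, here is a rough sketch of a Lambda handler that sits behind API Gateway and forwards each request body to a Kinesis stream. It assumes API Gateway's Lambda proxy integration, where the request body arrives as a string under "body" and the response needs "statusCode" and "body". The stream name, the record layout, and the make_ingest_handler helper are hypothetical, and the put_record callable is injected so the sketch runs without AWS credentials.

```python
import json
import uuid
from typing import Any, Callable, Dict


def make_ingest_handler(put_record: Callable[..., Any], stream_name: str):
    """Build a Lambda handler that writes request bodies to a stream.

    put_record is expected to accept StreamName, Data, and PartitionKey
    keyword arguments, as the boto3 Kinesis client's put_record does.
    """

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        try:
            payload = json.loads(event.get("body") or "")
        except json.JSONDecodeError:
            return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}

        # Hypothetical record layout: a generated id plus the caller's payload.
        message_id = str(uuid.uuid4())
        record = {"id": message_id, "payload": payload}
        put_record(
            StreamName=stream_name,
            Data=json.dumps(record).encode("utf-8"),
            PartitionKey=message_id,
        )
        # 202 Accepted: the message is queued, not yet processed downstream.
        return {"statusCode": 202, "body": json.dumps({"id": message_id})}

    return handler
```

In a deployed function you might build the handler once at import time with something like make_ingest_handler(boto3.client("kinesis").put_record, "my-stream"), where the stream name is a placeholder. The handler does nothing but validate and enqueue, so it stays cheap to scale horizontally while the stream absorbs bursts.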
But more than that, the best functional services make it easy to combine individual services in a manner that guarantees resiliency and durability. As an example, AWS makes this possible by linking Kinesis Streams to Lambda Functions without requiring any logic in between.
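On the consuming side, the only code you write is the function itself. Below is a minimal sketch of a Lambda handler triggered by a Kinesis stream, assuming each record's payload is JSON. Kinesis delivers record data base64-encoded under Records[].kinesis.data; the JSON payload and the print placeholder are assumptions for illustration.

```python
import base64
import json
from typing import Any, Dict, List


def decode_kinesis_records(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Decode the base64-encoded JSON payloads in a Kinesis-triggered event."""
    decoded = []
    for record in event.get("Records", []):
        raw = base64.b64decode(record["kinesis"]["data"])
        decoded.append(json.loads(raw))
    return decoded


def handler(event: Dict[str, Any], context: Any) -> Dict[str, int]:
    messages = decode_kinesis_records(event)
    for message in messages:
        # Placeholder for real work, such as writing to a database.
        print(message)
    return {"processed": len(messages)}
```

If the handler raises, Lambda's default behavior for stream sources is to retry the batch, which is where the buffering comes from. It is also why the processing step should be safe to run more than once.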
The Next Frontier
We now have easy-to-use individual functional services. To complete the message processing revolution we need an easy way to design entire web applications using functional architecture patterns. This was my own personal motivation for creating Stackery. Request a demo today to see if it revolutionizes how you handle your data pipelines!