Asynchronous Data Pipelines
From the early 2010s onwards, multiple companies rushed to develop frameworks and tools to handle the ‘Big Data’ they generated, but many discovered that the volume of data they actually processed was rather small. The real problem was that traditional techniques were often found wanting when it came to processing data in new formats and structures.
There were business lessons there, but the tech industry's response was to flex the boundaries of traditional processing models and offer lightweight, local-first solutions for processing data. A well-written sequence of independent but related data processing tasks is the common starting point for technology and product teams designing such processes today, as sketched below.
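To make that idea concrete, here is a minimal sketch of such a pipeline using Python's asyncio, with three independent stages connected only by queues. The stage names (extract, transform, load) and the sample records are illustrative assumptions, not a prescribed design.

```python
import asyncio


async def extract(queue: asyncio.Queue) -> None:
    # Produce raw records for downstream stages (hypothetical sample data).
    for record in ({"id": 1, "value": " 42 "}, {"id": 2, "value": "7"}):
        await queue.put(record)
    await queue.put(None)  # Sentinel: no more records.


async def transform(in_queue: asyncio.Queue, out_queue: asyncio.Queue) -> None:
    # Clean each record independently of how it was produced.
    while (record := await in_queue.get()) is not None:
        record["value"] = int(record["value"].strip())
        await out_queue.put(record)
    await out_queue.put(None)


async def load(queue: asyncio.Queue) -> None:
    # Consume cleaned records; here we simply print them.
    while (record := await queue.get()) is not None:
        print(record)


async def main() -> None:
    raw, clean = asyncio.Queue(), asyncio.Queue()
    # The three stages run concurrently, coupled only through the queues.
    await asyncio.gather(extract(raw), transform(raw, clean), load(clean))


if __name__ == "__main__":
    asyncio.run(main())
```

Because each stage knows nothing about the others beyond the queue it reads from or writes to, any one of them can be rewritten, tested, or scaled on its own, which is the practical benefit of treating a pipeline as independent but related tasks.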
A common use case is the processing of datasets, where a dataset is a collection of related data. In a perfect world, all data would be well structured and consistent, but in real-life situations you typically need...