DevOps has established itself as a critically important discipline, developing and leveraging unique tooling and processes to deliver many types of applications more rapidly, efficiently and reliably. As data has become a more central component in these applications, DataOps has emerged to work hand-in-hand with DevOps to better support the role of data in the modern enterprise.
Yet a new generation of applications, such as real-time, data-streaming and microservices applications, a growing share of which run natively in and across the cloud and the datacenter, is already threatening to upend the established practices and tools that have been the foundation of DataOps and DevOps collaboration. DevOps is built around an agile mindset, but what must the discipline itself do to adapt?
These new applications are born of the concept of fast data: acting and reacting to data immediately as it hits the enterprise. In contrast to classic big-data applications that gather, aggregate, store and only later process information, typically in batches, fast data focuses on immediacy: transforming, refining and analyzing data as it arrives, enabling users, applications and services to act quickly on that information. In many instances this is in support of a new breed of intelligent applications expected to act on the data without requiring any human intervention.
The organizational benefits come in the form of speed and improved insight. While the former might be obvious, the latter is no less important: Knowing and acting on what’s happening right now significantly improves accuracy, nimbleness and responsiveness.
For DevOps, success in this new era of fast data applications is not possible by applying existing practices and solutions. It requires an evolution, both in those tools/practices themselves, and in how teams approach and think about the overall process of being part of a data-driven organization.
That’s because a number of things change in the world of fast data:
- Applications are no longer self-contained but composed of many interdependent components that may stretch across geographies, infrastructure and disciplines.
- Data flows are the critical link between these components.
- Data is being collected, processed, analyzed and acted upon before it reaches traditional repositories such as data lakes or data warehouses (if it ever reaches them at all).
- Fast data applications in production must always be on to keep up with the flow of data, and must frequently handle significant scaling challenges, both up and down.
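The core contrast between batch and fast data can be made concrete with a minimal sketch: events are refined and acted upon the moment they arrive, without first landing in a warehouse or data lake. The event shape, field names and threshold below are purely illustrative assumptions, not a real pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Event:
    sensor_id: str   # hypothetical field names, for illustration only
    reading: float


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Refine events in flight: drop obviously bad readings."""
    for e in events:
        if e.reading >= 0:
            yield e


def act(events: Iterable[Event], threshold: float) -> List[str]:
    """Act on each event the moment it arrives: no batch, no landing zone."""
    alerts = []
    for e in events:
        if e.reading > threshold:
            alerts.append(f"alert:{e.sensor_id}")
    return alerts


stream = iter([Event("s1", 10.0), Event("s2", -1.0), Event("s3", 99.5)])
print(act(transform(stream), threshold=50.0))
```

Because `transform` and `act` are generators and per-event loops, each event flows through the whole chain individually; a batch version of the same logic would first collect the full dataset before any processing began.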
As an example of these new considerations, contemplate a classic scenario that might consist of a database and content application, with middle-tier logic in between. A typical DevOps process might involve an update to that mid-tier logic that is then deployed, tested and iterated.
In the realm of fast data, you might instead have eight different data services that talk to a dozen different components, all of which communicate with yet more downstream processing applications. Deploying any change in isolation, without an understanding of the various data service and application interfaces, might cause the whole system to fall apart. Working in the world of fast data requires the DevOps team to work with DataOps to validate that individual components can be updated and deployed without harm, or to establish how to deploy the whole system, or portions of it, as a set. These considerations weren't in play when the number of interface points was relatively small and relied on standard protocols or processes.
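One way to deploy a set of interdependent services "as a set" is to treat the data flow as a dependency graph and roll out upstream producers before the consumers that read from them. A minimal sketch using the standard library's `graphlib`, with entirely hypothetical service names:

```python
from graphlib import TopologicalSorter

# Hypothetical data flow: each service maps to the upstream
# services it consumes data from.
data_flow = {
    "ingest":    [],
    "enrich":    ["ingest"],
    "score":     ["enrich"],
    "dashboard": ["score", "enrich"],
}


def deploy_order(flow: dict) -> list:
    """Order services so every producer is deployed before its consumers."""
    return list(TopologicalSorter(flow).static_order())


print(deploy_order(data_flow))
```

In a real rollout each step would also drain in-flight data and verify interface compatibility before moving on; the graph only tells you a safe *order*, not that any individual change is safe.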
Fortunately, there are steps DevOps can take to successfully enable these new fast data applications, including:
- Recognize the necessity of delivering streaming/real-time/fast data infrastructure not only for the production environment, but also for the development environment. Most development and testing for monolithic batch applications can be done on basic infrastructure. However, development and testing of fast data applications requires infrastructure that can stream and process data on demand. DevOps teams need to deploy data infrastructure that can easily support that.
- Work with DataOps professionals and/or principles to implement technologies and processes that make it easy to move fast data applications from testbeds into full production environments. Fast data applications are typically composed of a set of services, so moving an application into production requires a carefully coordinated and orchestrated process. DevOps needs to make that process as simple and reliable for developers as possible.
- Support an always-on environment. Fast data applications by definition need to act on data as soon as it arrives. That means that maintenance downtime for upgrades or reconfiguration is not allowed. DataOps teams need to identify and choose technologies that are designed to operate without interruption or disruptive degradation not only in the face of failures but also in the face of day-to-day maintenance and operations.
- Design for the cloud from the start. Fast data applications are typically deployed all or part in the cloud, collecting, acting on and distributing data inside and outside the enterprise. That makes it critical for DevOps teams to choose fast data technologies that offer cloud-native resiliency, scalability (up and down), high performance and ease of configuration/operation.
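The always-on requirement above is usually met with rolling upgrades: replace one replica of a stream-processing service at a time, gating each step on a health check, so the flow is never fully interrupted. The sketch below is a toy model of that pattern; replica names, the `upgrade` callable and the `healthy` probe are stand-ins for real deployment tooling and readiness checks.

```python
from typing import Callable, List


def healthy(replica: str) -> bool:
    # Stand-in for a real readiness probe (HTTP health endpoint, etc.).
    return True


def rolling_upgrade(replicas: List[str],
                    upgrade: Callable[[str], None]) -> List[str]:
    """Upgrade one replica at a time so the service is never fully down."""
    upgraded = []
    for r in replicas:
        upgrade(r)  # drain, redeploy and restart just this replica
        if not healthy(r):
            raise RuntimeError(f"{r} failed its health check; halting rollout")
        upgraded.append(r)
    return upgraded


log = []
print(rolling_upgrade(["worker-0", "worker-1", "worker-2"], upgrade=log.append))
```

Orchestrators such as Kubernetes automate this same loop (with surge and unavailability limits), which is one reason the cloud-native point above matters: the platform, not the team, should be doing this choreography.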
As mentioned earlier, it's also important that DevOps teams shift their perspective from individual application components to the data flows those components and their infrastructure support.
Holistically, data is flowing through a series of interconnected components toward an outcome; the components themselves are just transitional steps in that journey. Changes to any given component must be framed in terms of implications both upstream and downstream. If a component is changed, does the upstream source need to adjust accordingly? Does the adjustment impact downstream users or applications that rely on the data flow? In larger scenarios this upstream/downstream journey might entail multiple hops through different systems and business processes. It's no longer possible to approach changes with a break/fix mentality focused on a few interconnected components; teams must instead consider the deployment or revision of an entire data flow that might rely on a large number of microservices.
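The upstream/downstream questions above amount to reachability over the data-flow graph: given a changed component, everything downstream of it needs re-validation. A minimal sketch of that impact analysis, again with hypothetical component names:

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical flow: each component maps to its direct downstream consumers.
downstream: Dict[str, List[str]] = {
    "ingest":  ["enrich"],
    "enrich":  ["score", "archive"],
    "score":   ["alerts"],
    "archive": [],
    "alerts":  [],
}


def impacted(changed: str) -> Set[str]:
    """Breadth-first walk: everything reachable downstream of a change."""
    seen: Set[str] = set()
    queue = deque(downstream.get(changed, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(downstream.get(node, []))
    return seen


print(sorted(impacted("enrich")))
```

Maintaining even a simple registry like this, jointly owned by DevOps and DataOps, turns "what might this change break?" from guesswork into a query.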
Fast data applications—from retail and manufacturing to financial services, media/entertainment and more—aren’t just coming in the future, they’re already here and are enabling a wealth of new business opportunities. DevOps teams need to adjust their practices and tools to best accommodate this new paradigm that’s equally about the data flow and the applications. And they can start preparing for this new fast data landscape by asking a single question: What could I do if I knew what the customers, partners and systems were doing right now?