Every enterprise battles vendor lock-in constantly. Ask any IT leader what keeps them up at night, and vendor lock-in is near the top of the list, if not at the very top. And while network, storage and compute have long been democratized by virtualization, one discipline has held out as the last bastion of vendor lock-in: databases.
How bad can it be? Take Amazon, for example. The company exuberantly celebrated retiring its last internal Oracle database. People familiar with the matter suggested that replacing Oracle was at least a decade in the making, probably longer. It is anybody’s guess how much effort it took to accomplish this feat. A big celebration was in order, indeed.
Now, let’s take a look at database virtualization. It is a relatively new discipline, and the idea of virtualizing critical systems like data warehouses has made a big splash recently. But does it apply to your systems? And the ubiquitous question: Does it come with performance penalties?
What Is Database Virtualization?
Database virtualization is the concept of abstracting applications and databases from each other. Using virtualization, an application written for, say, an Oracle database can run on an open source database like PostgreSQL without making changes to SQL or to associated APIs. As an aside, this is not to be confused with data virtualization—which is really a data integration play.
A database virtualization platform translates queries and data in real time. Through emulation, it bridges functional gaps between the source and destination system so applications remain unchanged while moving to a different database. Database virtualization doesn’t make migrations easier; it makes them effectively obsolete.
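To make the idea of query translation concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how any particular product works: the `translate` function and the `REWRITE_RULES` table are hypothetical names, and the three rules shown cover only a handful of well-known differences between Oracle and PostgreSQL syntax. It also covers only the translation half of the story; emulating features the destination lacks is a much larger task.

```python
import re

# Hypothetical rewrite rules: (pattern, replacement).
# Assumption: a real platform would parse SQL into a syntax tree rather than
# use regular expressions, which also rewrite matches inside string literals.
REWRITE_RULES = [
    # Oracle's NVL(a, b) behaves like the standard COALESCE(a, b) for two arguments.
    (re.compile(r"\bNVL\s*\(", re.IGNORECASE), "COALESCE("),
    # SYSDATE is Oracle-specific; CURRENT_TIMESTAMP is a close (not exact) stand-in.
    (re.compile(r"\bSYSDATE\b", re.IGNORECASE), "CURRENT_TIMESTAMP"),
    # PostgreSQL has no DUAL table; a SELECT without FROM is allowed instead.
    (re.compile(r"\s+FROM\s+DUAL\b", re.IGNORECASE), ""),
]


def translate(oracle_sql: str) -> str:
    """Rewrite a few Oracle-specific constructs into PostgreSQL-friendly SQL."""
    translated = oracle_sql
    for pattern, replacement in REWRITE_RULES:
        translated = pattern.sub(replacement, translated)
    return translated


if __name__ == "__main__":
    # The application keeps sending Oracle-style SQL; the layer rewrites it in flight.
    print(translate("SELECT NVL(NULL, 'n/a'), SYSDATE FROM DUAL"))
```

In this sketch the application's SQL never changes. Only the layer between the application and the database knows that the destination speaks a different dialect.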
The idea is rather compelling. So much so that Amazon recently developed and released its own database virtualization product, one targeted at small standalone database systems.
Why You Need Database Virtualization
Here’s a simple rule of thumb. The cost of migrating a typical data warehouse without virtualization is about the total cost of the database over the next five years. So, if your current data warehouse runs you, say, $5 million a year (which is typical for a mid-range system), your cost to migrate it will probably be close to $25 million. Moreover, the project could take up to five years.
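The rule of thumb is simple enough to write down as a back-of-the-envelope calculation. The sketch below is only that: the function name `estimate_migration_cost` and the `horizon_years` parameter are made up for illustration, and the five-year multiplier is the article's rule of thumb, not a measured figure.

```python
def estimate_migration_cost(annual_cost_usd: int, horizon_years: int = 5) -> int:
    """Rule-of-thumb estimate: migration cost is roughly the database's
    total running cost over the next `horizon_years` years."""
    return annual_cost_usd * horizon_years


if __name__ == "__main__":
    # A mid-range warehouse at $5 million a year, per the example above.
    estimate = estimate_migration_cost(5_000_000)
    print(f"Estimated migration cost: ${estimate:,}")
```

Plugging in your own annual spend gives a quick sense of the budget a conventional migration would compete for.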
But wait; there’s more. The true downside of conventional database migrations is their abysmal failure rate. Analysts typically put it above 80%. For large data warehouse systems, it seems to be higher still. That’s what makes databases the ultimate bastion of vendor lock-in.
As an IT leader, you naturally don’t have the time, the budget or the appetite for such a risky undertaking; you’d probably rather deploy that kind of money toward creating new revenue. Database virtualization lets you do just that: Sidestep these risks and, instead, focus on what’s really relevant.
When to Use Database Virtualization
Database virtualization is particularly suitable for replatforming data warehouse workloads. These workloads are notorious for their complexity and for the effort a rewrite would entail. Often, these workloads have been grown and curated for years—sometimes decades. Maintaining the workload while replatforming makes economic sense: It means preserving and protecting long-standing investments.
However, for database virtualization to be successful, both the source and destination systems need to be of comparable performance and scale. This should be obvious: Moving a workload from, say, a Teradata system to a single PostgreSQL instance is unlikely to satisfy your business users.
But there is good news. Over the past few years, cloud data warehouses have increasingly reached competitive performance targets. They have become viable alternatives to legacy on-premises appliances. From an IT leader’s perspective, that is great timing. Most IT leaders are currently tasked with sunsetting these appliances over the course of the next few years. Database virtualization couldn’t be any timelier.
Goodbye, Vendor Lock-In
Virtualization, as a general concept, has revolutionized the IT stack over the past 20 years. Server virtualization, which started as a niche for QA and developers, became the foundation of cloud computing; virtualization then spread to network and storage. Databases are one of the few remaining disciplines that have bucked the virtualization trend—until now.
Not surprisingly, the premise of database virtualization resonates greatly with practitioners. And even though the initial success in the data warehousing arena is yet to be replicated in other areas, it seems like it’s just a matter of time before the concept becomes ubiquitous.