DBOS Cloud overturns database-on-OS conventions for speed

Postgres pioneer Mike Stonebraker and Spark creator Matei Zaharia have cofounded a venture whose main product is a database-oriented operating system or DBOS — a high-performance distributed database that runs OS services on top.


PostgreSQL pioneer Mike Stonebraker and Spark creator Matei Zaharia, along with other computer scientists at MIT and Stanford, have come up with a new database-oriented operating system (DBOS) to help with the development of greenfield web applications.

They have set up a company, DBOS Inc., to make the OS available to developers.

Its first product, DBOS Cloud, launched Tuesday, is a transactional serverless application platform, a category sometimes described as functions-as-a-service (FaaS). It is offered on Amazon Web Services (AWS), runs on the open-source virtual machine monitor Firecracker, and is powered by the DBOS operating system.

It consists of three main components: an open source DBOS SDK currently for TypeScript, a DBOS Time Travel Debugger, and the underlying OS.

The company said it will help developers build and run serverless functions, workflows, and applications, adding that it comes with features such as time-travel debugging and SQL-accessible observability data. 

Genesis of DBOS and DBOS Cloud

But how did Stonebraker, Zaharia and the other researchers come together to build DBOS and what was their rationale?

Over three years ago, Stonebraker told InfoWorld, he identified that the rise in demand for data and compute had thrown up a new challenge for databases—storing operating system states of large magnitude. Around that time he attended a talk by Zaharia, who is also the CTO of Databricks, where he heard the latter “complain” about the performance of PostgreSQL. 

The Databricks CTO, according to Stonebraker, was explaining how his company was performing OS scheduling.

“Zaharia said that Databricks is routinely managing a ‘million-ish’ Spark sub tasks on a cloud and there’s no possible way that the company can run at that scale and use traditional OS scheduling techniques. Instead, Zaharia said that Databricks was putting all the scheduling information in a Postgres database, and doing scheduling as a SQL application,” Stonebraker explained.
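To make the idea of scheduling as a SQL application concrete, here is a minimal sketch in TypeScript. It is not Databricks' or DBOS's code: the `Task` shape, the `claimNextTask` helper, and the SQL in the comment are all assumptions made for illustration. An in-memory array stands in for the Postgres table so the example runs on its own.

```typescript
// Hypothetical in-memory stand-in for a scheduling table.
// The schema is an assumption, not Databricks' or DBOS's actual design.
type TaskState = "pending" | "running" | "done";

interface Task {
  id: number;
  priority: number; // higher values are scheduled first
  state: TaskState;
  worker: string | null;
}

// Roughly what the same step could look like as a Postgres statement
// (illustrative only):
//   UPDATE tasks SET state = 'running', worker = $1
//   WHERE id = (SELECT id FROM tasks
//               WHERE state = 'pending'
//               ORDER BY priority DESC, id ASC
//               LIMIT 1
//               FOR UPDATE SKIP LOCKED)
//   RETURNING *;
function claimNextTask(tasks: Task[], worker: string): Task | null {
  // filter() returns a new array, so sorting does not reorder the "table".
  const next = tasks
    .filter((t) => t.state === "pending")
    .sort((a, b) => b.priority - a.priority || a.id - b.id)[0];
  if (!next) return null;
  next.state = "running";
  next.worker = worker;
  return next;
}

const tasks: Task[] = [
  { id: 1, priority: 1, state: "pending", worker: null },
  { id: 2, priority: 5, state: "pending", worker: null },
  { id: 3, priority: 5, state: "done", worker: "w0" },
];

// Worker "w1" claims the highest-priority pending task.
const claimed = claimNextTask(tasks, "w1");
```

The point of the sketch is that a scheduling decision becomes an ordinary query over task rows, so scaling the scheduler is largely a matter of scaling the database.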

Stonebraker reached out to Zaharia soon after, realizing that “there is a whole bunch of commercial companies that can’t use traditional OS capabilities at scale.”

Their discussions led to the birth of DBOS, as the founders decided to run a database management system at the bottom of their new stack, and then run all OS services on top of it.

“We built enough of this along with the team to prove that this inverted OS is about as fast as whatever enterprises were using or currently doing. Essentially, this meant that enterprises could get everything in the database with no drop in performance,” Stonebraker said.

Data provenance

As the database logs everything, the team’s next task was to develop a data provenance system that minimizes the use of the Linux-based kernel.

“We have a very sophisticated provenance system that gets spooled into a data warehouse,” Stonebraker said, adding that this allows DBOS to eliminate many layers, such as Linux, Kubernetes, any other transactional file systems, and any high availability delivery system.

The elimination of layers, according to the company, provides benefits in terms of cost, complexity, and reduced attack surface.

“You don’t need containers or orchestration layers, and you write less code because the OS is doing more for you,” Stonebraker explained, adding that the environment is simple to maintain and to monitor for abnormal events, without sacrificing speed relative to existing products.

The other advantage, according to Stonebraker, is the ability of the OS to roll back quickly after adverse events, such as a ransomware attack.

“In the event of an attack, the system can be backed up to a specific time as it has the entire event log to skirt past around the offensive transaction. The backup takes seconds to minutes in contrast to other offerings where it may take days or weeks,” the founder explained.
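The recovery idea Stonebraker describes can be sketched as replaying an event log up to a chosen point in time, optionally leaving out the offending transaction. The TypeScript below is a simplified illustration, not DBOS's implementation: the `LogEntry` fields and the `restore` function are hypothetical, and a real system would also have to deal with later transactions that depend on the one being skipped.

```typescript
// Hypothetical event-log entry; field names are assumptions for illustration.
interface LogEntry {
  txId: number;
  timestamp: number;    // logical commit time
  key: string;
  value: string | null; // null records a delete
}

// Rebuilds state by replaying the log in commit order up to `asOf`,
// omitting any transactions listed in `skipTx`.
function restore(
  log: LogEntry[],
  asOf: number,
  skipTx: Set<number> = new Set<number>()
): Map<string, string> {
  const state = new Map<string, string>();
  const ordered = log.slice().sort((a, b) => a.timestamp - b.timestamp);
  for (const entry of ordered) {
    if (entry.timestamp > asOf || skipTx.has(entry.txId)) continue;
    if (entry.value === null) state.delete(entry.key);
    else state.set(entry.key, entry.value);
  }
  return state;
}

const log: LogEntry[] = [
  { txId: 1, timestamp: 10, key: "invoice:1", value: "paid" },
  { txId: 2, timestamp: 20, key: "invoice:2", value: "open" },
  { txId: 3, timestamp: 30, key: "invoice:1", value: "ENCRYPTED" }, // simulated attack
  { txId: 4, timestamp: 40, key: "invoice:2", value: "paid" },
];

// Option 1: roll back to just before the attack (loses transaction 4).
const beforeAttack = restore(log, 29);

// Option 2: replay everything except the offending transaction.
const withoutAttack = restore(log, 40, new Set<number>([3]));
```

In this toy log, the first option returns the state as of time 29, while the second keeps the legitimate update made after the attack and leaves `invoice:1` untouched by it.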

After the development of the provenance system, the team built a programming interface for developers with a focus on the cloud rather than on-premises systems.

“We wrote a software-as-a-service (SaaS) programming environment on top of our database system,” Stonebraker said, adding that it was a TypeScript-based environment.

It lets developers write a collection of micro-operations connected into a graph. These are ingested into the database, which applies concurrency control to prevent parallel-programming bugs. The environment also includes a debugger for applications, he said.
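As a rough picture of what a graph of micro-operations might look like, the TypeScript sketch below defines a few steps with dependencies, orders them, and applies each one as an all-or-nothing update. It deliberately does not use the DBOS SDK: `Step`, `topoOrder`, and `runWorkflow` are hypothetical names, and the sketch shows only dependency ordering and atomic commits, not the database's concurrency control.

```typescript
// Hypothetical types for illustration; this is not the DBOS SDK API.
type State = Record<string, number>;

interface Step {
  name: string;
  dependsOn: string[];
  run: (state: State) => void; // mutates a private draft of the state
}

// Orders steps so each runs after its dependencies (depth-first topological sort).
function topoOrder(steps: Step[]): Step[] {
  const byName = new Map<string, Step>();
  for (const s of steps) byName.set(s.name, s);
  const ordered: Step[] = [];
  const status = new Map<string, "visiting" | "done">();
  const visit = (name: string): void => {
    if (status.get(name) === "done") return;
    if (status.get(name) === "visiting") throw new Error(`cycle at ${name}`);
    const step = byName.get(name);
    if (!step) throw new Error(`unknown step ${name}`);
    status.set(name, "visiting");
    step.dependsOn.forEach((dep) => visit(dep));
    status.set(name, "done");
    ordered.push(step);
  };
  steps.forEach((s) => visit(s.name));
  return ordered;
}

// Runs each step against a copy of the state and commits the copy only if the
// step completes, so a step that throws leaves no partial writes behind.
function runWorkflow(steps: Step[], initial: State): State {
  let committed: State = { ...initial };
  for (const step of topoOrder(steps)) {
    const draft: State = { ...committed };
    step.run(draft); // on failure the draft is discarded and the error propagates
    committed = draft;
  }
  return committed;
}

const steps: Step[] = [
  { name: "charge", dependsOn: ["reserve"], run: (s) => { s.balance -= 30; } },
  { name: "reserve", dependsOn: [], run: (s) => { s.stock -= 1; } },
  { name: "ship", dependsOn: ["charge"], run: (s) => { s.shipped = 1; } },
];

const result = runWorkflow(steps, { stock: 5, balance: 100 });
```

Although the steps are declared out of order, the dependency graph determines that `reserve` runs before `charge` and `charge` before `ship`.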

Cloud first

Although the team decided to launch DBOS in the cloud first, that’s not its only target.

“Over time, once we get traction, then we will probably pivot to the enterprise because that’s where large amounts of money are,” Stonebraker said, adding that enterprise software sales cycles are typically “very long.”

To get it to run on-premises, the team will need to add support for the POSIX set of standard interfaces for Unix.

DBOS has published technical documentation to help developers start using the platform.

In terms of pricing, DBOS Cloud's free tier offers a million service calls per month and a system data retention time of three days, using Amazon RDS Postgres.

Enterprises or developers can choose to use DBOS Cloud with other databases but will have to raise a request for customization.

Will there be many takers for DBOS?

While several analysts, including IDC’s Carl Olofson, dbInsight’s Tony Baer and Constellation Research’s Holger Mueller, attest to DBOS’ positive impact on reducing the time taken to develop an application and the security advantages of the platform, they highlight certain drawbacks and concerns.

Mueller wondered whether DBOS the company can scale. “Will a small team at DBOS be able to run an OS, database, observability, workflow and cyber stack as good as the combination of the best of breed vendors?” he asked.

Olofson also pointed out that in this era of specialized database management systems, such as key-value, time-series, and document, among others, a relational database system might not be able to address all needs.

On cybersecurity, Olofson pointed out that though DBOS has good security features, the biggest cause of data theft and loss is the use of false credentials, usually obtained through techniques such as phishing attacks.

“No DBMS technology can prevent a bad actor with apparently legitimate credentials from stealing or destroying data,” Olofson said.

Copyright © 2024 IDG Communications, Inc.