
Veritas: Shared Verifiable Databases and Tables in the Cloud

Lindsey Allen†, Panagiotis Antonopoulos†, Arvind Arasu†, Johannes Gehrke†, Joachim Hammer†, James Hunter†, Raghav Kaushik†, Donald Kossmann†, Jonathan Lee†, Ravi Ramamurthy†, Srinath Setty†, Jakub Szymaszek†, Alexander van Renen‡, Ramarathnam Venkatesan†

†Microsoft Corporation ‡Technische Universität München

ABSTRACT
In this paper we introduce shared, verifiable database tables, a new abstraction for trusted data sharing in the cloud.

KEYWORDS
Blockchain, data management

1 INTRODUCTION
Our economy depends on interactions – buyers and sellers, suppliers and manufacturers, professors and students, etc. Consider the scenario where there are two companies A and B, where B supplies widgets to A. A and B are exchanging data about the latest price for widgets and the dates when widgets were shipped. The price of a widget fluctuates, and thus A bids for a quantity of widgets. In our (simplified) bidding process across the two companies, A asks for a quantity of widgets, B makes an offer, and then A rejects or accepts the offer. Once A has accepted an offer, a contract exists, and B goes ahead and ships the widgets.

The two companies need to share data, and they would like an easy way to share asks, offers, prices, contracts, and shipping information. Thus they need an easy capability to share data. But with sharing and cooperation also come disputes, and thus companies A and B would like the shared data to have an immutable audit log so that, in case any disputes arise, all interactions with this shared table can be made available for auditing by a third party. This does not mean that the shared tables should be publicly accessible, but that selected parts of the log, the interactions with the table, can be made available to auditors, that the log is tamperproof, and that an auditor can quickly and cheaply reconstruct the shared state as of some point of time in the past and then audit state changes going forward.

How is this achieved today? Today, companies A and B share data by agreeing upon common APIs and data exchange formats that they use to make web service calls to each other. This solves the data sharing issue; however, it does not achieve immutability, nor does it enable the capability of selective auditing. An auditor would have to go through the databases at A and B and trust that nobody tampered with the local logs. The auditors need special tools to derive an old state of the database from the logs, and the answers to read-only queries do not appear in the log at all. The problem is that neither reads nor updates are logged in an immutable way, and thus they cannot be audited by a third party without trusting that the local databases at A and B have not been tampered with.

This article is published under a Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0/), which permits distribution and reproduction in any medium as well as derivative works, provided that you attribute the original work to the author(s) and CIDR 2019.
CIDR’19, January 2019, CA
© 2019 ACM ISBN 978-x-xxxx-xxxx-x/YY/MM.
https://doi.org/10.1145/nnnnnnn.nnnnnnn

A recent class of new technology under the umbrella term "blockchain" [11, 13, 22, 28, 33] promises to solve this conundrum. For example, companies A and B could write the state of these shared tables into a blockchain infrastructure (such as Ethereum [33]). The blockchain then gives them transaction properties, such as atomicity of updates, durability, and isolation, and the blockchain allows a third party to go through its history and verify the transactions. Both companies A and B would have to write their updates into the blockchain, wait until they are committed, and also read any state changes from the blockchain as a means of sharing data across A and B. To implement permissioning, A and B can use a permissioned blockchain variant that allows transactions only from a set of well-defined participants; e.g., Ethereum [33].

This blockchain technology is quite a step forward, but also a few steps backwards, since with the power of these new blockchain technologies also come new problems. The application that handles interactions between A and B has to be rewritten for the new blockchain platform, and the primitives that the new platform provides are very different from SQL, the lingua franca of existing data platforms. In addition, the transaction throughput of blockchains is low, and there is no query optimization or sophisticated query processing. Today, to gain immutability and auditability with new blockchain platforms, we give up decades of research in data management – and the hardened, enterprise-ready code that implements these ideas.

In this paper, we propose two new infrastructure abstractions to address the above issues. First, we propose the notion of a blockchain database. A blockchain database system is a database system with a regular SQL interface; however, it provides the same guarantees of immutability of state transitions, trust, and open-source verifiability as a blockchain. Thus A and B can simply move their shared state to a blockchain database and interact with it as with any other database, using SQL. Such a blockchain database would address many of the limitations of blockchains, such as the lack of a SQL interface. However, since the blockchain database is a physically separate database system, interactions with it still need to happen through a middle tier; for example, Company A cannot simply write transactions across its own database system and the blockchain database; it would need to create a distributed transaction across the two systems, a capability that in practice is always achieved through queues with asynchronous, idempotent writes. Thus, while an interesting step forward in terms of capabilities, a blockchain database is always at arm's length from the database infrastructure at A.

In order to close this gap, we also propose the abstraction of a shared database table in the cloud that is part of the databases of both companies A and B. Assume that A and B are both customers of BigCloud, running their database infrastructure on the BigCloud SQL Database System. We will enable A to create a shared table, a table that A can share with B, and that now appears in both A’s and


B’s databases. The table can be used like any other table, and it can be synchronously accessed by both A and B; both companies can write application logic, including ACID transactions, that includes this table.

By putting together the idea of a blockchain database and the notion of a shared table, we arrive at the notion of a shared, verifiable table, which is part of the database of both companies A and B and has an immutable, accessible log with clean auditability capabilities.

How will we build such a shared, verifiable table? We propose to build on existing database technology; this gives us many benefits, but it also opens up several technical challenges which we start to address in this paper. The first challenge is to integrate blockchain technology into a database system while maintaining the trust properties of a blockchain and the performance and productivity properties of a database system. One idea would be to leverage the logging capabilities of modern database systems for durability (i.e., the transaction log), but it turns out that the transaction log does not have the right information to verify the database system. A second challenge is to implement the shared table abstraction and to leverage possible optimization opportunities that arise from it.

Note that there is another challenge that is often studied in the context of blockchains: confidentiality. The nature of most blockchains (like Bitcoin [26]) is that all participants see all transactions so that they can validate the correctness of these transactions (e.g., no double-spending of the same bitcoin). Recently, confidentiality has been addressed in novel blockchain proposals such as Coco [11], Quorum [28], or Corda [13]. It turns out that confidentiality (and the specific blockchain protocol used) is to a large extent orthogonal to the issues discussed in this paper, so all these novel blockchain proposals are relevant to our work, but discussing confidentiality guarantees in sufficient detail is beyond the scope of this work.

Our Contributions. Our contributions in this paper are as follows:

• We introduce the notion of a verifiable (blockchain) database and the notion of a shared table (Section 2).

• We discuss how to implement verifiable databases (Section 3).

• We discuss how to implement shared, verifiable tables (Section 4).

• We describe experimental results from a prototype system (Section 5).

We discuss related work in Section 6, and we conclude in Section 7.

2 DATABASES AND TABLES THAT CAN BE SHARED AND VERIFIED

There are many different ways in which two (or more) parties such as Companies A and B can interact in the cloud. Figure 1 shows the traditional approach using Web Services (REST) or RPC. Company A carries out transactions using its local database dbA. Transactions that involve an external entity, such as ordering widgets for a given price from Company B, are executed as follows: All updates that can be executed locally on dbA are committed synchronously; all requests to B are carried out asynchronously: The request is written

Figure 1: Web Services (A and B interact via REST calls / RPC)

into a local queue synchronously as part of the transaction so that the external requests are not lost in the event of failures. The local queue is part of dbA, and thus no distributed transaction is needed. An asynchronous job will read the requests from the queue and make a Web Services call to Company B. The middle tier of B then receives the data and writes it to its own local database, dbB. If the transaction at dbB fails, then A may retry the transaction, and in case of repeated failures, it needs to execute a compensating transaction on dbA using Sagas or another nested transaction model [18].
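The queue-based pattern described above is commonly known as a transactional outbox, and can be sketched as follows. The schema, the function names, and the use of SQLite in place of dbA and dbB are illustrative assumptions for this sketch, not part of any system discussed in the paper.

```python
import sqlite3
import uuid

# dbA: A's local database, holding both the business table and the outbox queue.
db_a = sqlite3.connect(":memory:")
db_a.executescript("""
    CREATE TABLE orders (id TEXT PRIMARY KEY, widget_qty INTEGER, price REAL);
    CREATE TABLE outbox (request_id TEXT PRIMARY KEY, payload TEXT,
                         sent INTEGER DEFAULT 0);
""")

def place_order(qty, price):
    """Local write + queued external request in ONE local transaction."""
    request_id = str(uuid.uuid4())
    with db_a:  # a single local transaction; no distributed transaction needed
        db_a.execute("INSERT INTO orders VALUES (?, ?, ?)",
                     (request_id, qty, price))
        db_a.execute("INSERT INTO outbox (request_id, payload) VALUES (?, ?)",
                     (request_id, f"ORDER qty={qty} price={price}"))
    return request_id

# dbB: B's local database; the PRIMARY KEY on request_id makes delivery idempotent.
db_b = sqlite3.connect(":memory:")
db_b.execute("CREATE TABLE inbox (request_id TEXT PRIMARY KEY, payload TEXT)")

def deliver_pending():
    """Async job: ship queued requests to B; retries are harmless."""
    rows = db_a.execute(
        "SELECT request_id, payload FROM outbox WHERE sent = 0").fetchall()
    for rid, payload in rows:
        with db_b:  # INSERT OR IGNORE: re-delivery after a crash is idempotent
            db_b.execute("INSERT OR IGNORE INTO inbox VALUES (?, ?)",
                         (rid, payload))
        with db_a:
            db_a.execute("UPDATE outbox SET sent = 1 WHERE request_id = ?",
                         (rid,))

place_order(100, 9.5)
deliver_pending()
deliver_pending()  # a retry delivers nothing new
print(db_b.execute("SELECT COUNT(*) FROM inbox").fetchone()[0])  # 1
```

A failed transaction at dbB would, as the text notes, eventually require a compensating transaction on dbA; that part is omitted here.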

The resulting architecture, which is shown in Figure 1, has been in use to drive electronic business for decades. It works extremely well once a framework for data exchange and trust between Companies A and B has been established. Setting it up, however, is cumbersome, and there is no good way to audit all the interactions between A and B. When there is a dispute, it is difficult to piece together what has happened, especially for more complex queries. If prices for widgets fluctuate frequently, for instance, it is difficult for A to prove that it placed an order at a lower price before B increased the price. While simple instances can be solved as a one-off by digitally signing API calls, the auditing of more complex queries, such as a query over accounts payable that aggregates an outstanding balance, is much more challenging. In general, the architecture of Figure 1 works best for large enterprises which have the resources and business volume to make it worthwhile to set up, and which have the technical infrastructure to implement this architecture and the legal scale to audit it.

2.1 Blockchain Databases
Recently, there has been a big hype around blockchain technology [22, 26, 28, 33], which allows entities to do business in a more agile way, without prior formal legal agreements. Figure 2 shows how A and B can collaborate using this architecture. Both A and B have their local databases and queues to effect reliable (exactly-once) and asynchronous calls to external entities – just as in Figure 1. The crucial difference to Figure 1 is that all communication, such as making quotes and placing orders, is made through the blockchain. The blockchain provides a total order so that an auditor can reconstruct the exact sequence of events across A and B. Thus the auditor can prove that A placed its order before B increased its price. Using cryptographic methods, the blockchain is made immutable so that neither A nor B can tamper with the history of interactions, and they cannot delete or reorder events.

The architecture of Figure 2 solves the trust and auditability issues of Figure 1. From an information infrastructure and database


Figure 2: Blockchain (A and B interact through a blockchain)

perspective, however, it is not ideal. Logically, the blockchain is a database which orders transactions and stores them durably. Modern blockchains such as Ethereum are even able to implement integrity constraints in the form of smart contracts. Unfortunately, today's generation of blockchains cannot compete with modern database systems in terms of performance, query processing capabilities, productivity, and tooling. The blockchain community has realized that there is a significant gap between their existing technology and the capability of a modern database system. There are now a gazillion start-ups adding these basic database features to blockchains, but it will take years if not decades to catch up.

Instead of adding database capabilities to blockchains, we propose to address the problem from the opposite direction: We add trust and auditability to existing database management systems. Figure 3 shows this concept. Instead of writing all transactions that span trust zones into a blockchain, these transactions are written into a shared verifiable database (which we also often call a blockchain database). A verifiable database is an extension of a traditional DBMS (e.g., Microsoft SQL DB, MySQL, or Oracle) or a cloud database system (e.g., Microsoft SQL Azure or Amazon Aurora), and it contains all the tables with the shared data – in our example, the quotes and orders that keep the information accessible to A and B and any other parties that participate in this marketplace. The most important feature of this blockchain database is that it is verifiable. The provider who operates this database cannot be trusted because the provider might collude with A or B, the database might be hacked, or the database system source code might contain backdoors. The requirement is that the verifiable database gives the same trust guarantees as the blockchain in the architecture of Figure 2 – it contains an immutable, totally ordered log, and the state transitions can be verified by an auditor. This way, A and B can prove to an auditor that they behaved correctly and executed their actions in the right order.

Recently, Amazon announced QLDB [27], which was inspired by this blockchain database trend. In its current version, the QLDB service provider (i.e., Amazon) must be trusted, and thus QLDB provides different trust properties. In particular, the QLDB provider can drop transactions spuriously.

2.2 Shared, Verifiable Tables
While the concept of a blockchain database sounds attractive at first glance, it does not provide much additional value compared

Figure 3: Verifiable Database

Figure 4: Verifiable Tables

to the architecture of Figure 2. Application developers and administrators still need to worry about distributed transactions across two databases and need to use reliable queues to implement transactions; there is asynchrony involved across different systems, and the middle tier needs to deal with one more system.

To have a truly seamless database experience and support trusted collaboration in the cloud, we propose the concept of a shared, verifiable table, which integrates the tables from the blockchain database of Figure 3 directly into the databases of A and B – as if they were to share a single, common instance of this table.

Figure 4 shows how A and B collaborate using shared, verifiable tables. A writes all widget orders into a shared order table in its local database dbA. A can update and query this table just like any other table; there is no queuing delay, and A can write SQL and transactions across the shared tables and its private tables. However, the same instance of this table is also visible to B, which has the same capabilities as A against the shared, verifiable table. In addition, updates to all shared tables of a database are written to a tamper-proof and auditable log, which could be implemented as a blockchain. More precisely, there is an N:1 relationship between shared tables and tamper-proof logs. A, B, and any auditor who can access the log can see and inspect it. An auditor can now prove whether A placed an order and whether it happened before or after the latest price change. Likewise, B can publish all its price changes into a shared, verifiable quotes table which is also visible to A.
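As a rough illustration of this programming model (not the actual Veritas implementation), a single SQLite database can stand in for the shared table that appears in both parties' databases, with a trigger appending every update to an append-only log table. The schema and trigger are invented for this sketch; a real tamper-proof log would additionally hash-chain and sign its entries.

```python
import sqlite3

# One SQLite database stands in for the shared table visible to both A and B.
db = sqlite3.connect(":memory:")
db.executescript("""
    -- A's private table:
    CREATE TABLE private_inventory (widget TEXT, on_hand INTEGER);
    -- The shared, verifiable table:
    CREATE TABLE shared_orders (id INTEGER PRIMARY KEY, widget TEXT, qty INTEGER);
    -- Append-only log of all updates to the shared table:
    CREATE TABLE shared_log (seq INTEGER PRIMARY KEY AUTOINCREMENT, entry TEXT);
    CREATE TRIGGER log_orders AFTER INSERT ON shared_orders
    BEGIN
        INSERT INTO shared_log (entry)
        VALUES ('INSERT order ' || NEW.id || ' ' || NEW.widget || ' x' || NEW.qty);
    END;
""")

# One ACID transaction spanning A's private table and the shared table,
# with no queuing delay and no distributed transaction:
with db:
    db.execute("INSERT INTO shared_orders (widget, qty) VALUES ('sprocket', 40)")
    db.execute("INSERT INTO private_inventory VALUES ('sprocket', -40)")

print([row[0] for row in db.execute("SELECT entry FROM shared_log")])
```

The log table is what an auditor would consume; in the architecture of the paper, that log is what gets anchored in a blockchain.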


In the next section, we discuss how to implement a blockchain database and how to implement shared, verifiable tables.

3 IMPLEMENTING VERIFIABLE DATABASES
The previous section introduced two new abstractions: the verifiable database and the verifiable table. A verifiable database has all the features of a regular database: it speaks SQL and supports stored procedures, complex query processing, indexes, integrity constraints, transactions, backup and recovery, etc. Furthermore, clients connect to a verifiable database using drivers such as ODBC, JDBC, etc. A verifiable database is a platform to share data and collaborate across clients, just like any other traditional database system. Analogously, a shared, verifiable table has all the features of a regular table of a relational database.

What makes verifiable databases and tables special is that they support tamper-evident collaboration across mutually untrusted entities. They provide verification mechanisms with which each party can verify the actions (updates) of all other parties and provide proof for its own actions. Further, all parties can verify that the state of the shared database (or shared table) and its responses to queries are consistent with the prior actions of legitimate actors. These guarantees are cryptographic. Unauthorized parties (hackers or operators with administrative privileges) cannot tamper with the state of the verifiable database (or verifiable table) without being detected by the verification mechanism.

Just like blockchains, a verifiable database system that supports verifiable databases and tables relies on cryptographic identities to establish the legitimacy and originator of actions. The system is permissioned and tracks the identities (public keys) of authorized participants. An authorized participant signs all interactions (queries) with the system, and the signatures are checked for authenticity during verification. Queries are also associated with unique identifiers to prevent replay attacks.
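A minimal sketch of this request-authentication scheme follows: each query carries a unique identifier (nonce) and a signature that is checked against the set of registered participants. The paper's design uses public-key signatures; the HMAC below is a stdlib-only stand-in for a real signature scheme, and all names are illustrative assumptions.

```python
import hashlib
import hmac
import uuid

# Registered participants and their keys (stand-in for tracked public keys).
SECRETS = {"A": b"key-of-A", "B": b"key-of-B"}

def sign_query(party, sql):
    """A participant signs a query together with a fresh nonce."""
    nonce = str(uuid.uuid4())
    mac = hmac.new(SECRETS[party], f"{nonce}|{sql}".encode(), hashlib.sha256)
    return {"party": party, "nonce": nonce, "sql": sql, "sig": mac.hexdigest()}

seen_nonces = set()  # nonces already accepted by the system

def verify_query(msg):
    """Check legitimacy, authenticity, and freshness of a signed query."""
    if msg["party"] not in SECRETS:
        return "unknown participant"
    expected = hmac.new(SECRETS[msg["party"]],
                        f'{msg["nonce"]}|{msg["sql"]}'.encode(),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, msg["sig"]):
        return "bad signature"
    if msg["nonce"] in seen_nonces:
        return "replay detected"  # the unique identifier prevents replays
    seen_nonces.add(msg["nonce"])
    return "ok"

msg = sign_query("A", "UPDATE orders SET qty = 10")
print(verify_query(msg))  # ok
print(verify_query(msg))  # replay detected
```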

A second critical building block is that the verifiable database system logs all client requests and the effects of these requests. As part of this log, the system adds information (typically hashes) which allows auditors and participants to verify whether the database behaves correctly. We call all agents that audit the database in this way verifiers. Since confidentiality is out of the scope of this paper, we assume that all verifiers have access to the whole log generated by the verifiable database. Frameworks like Coco [11], Quorum [28], Spice [30], or Corda [13] address confidentiality in blockchains, and the ideas of these frameworks are applicable to our work because they are in principle orthogonal.
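The hash-based log can be sketched as a simple hash chain: each entry commits to a client request, its effects, and the hash of the previous entry, so tampering with any past entry invalidates every later hash. This toy structure is an assumption for illustration, not the paper's actual log format.

```python
import hashlib
import json

def append(log, request, effects):
    """Append a log entry chained to the previous entry's hash."""
    prev = log[-1]["hash"] if log else "0" * 64
    body = json.dumps({"request": request, "effects": effects, "prev": prev},
                      sort_keys=True)
    log.append({"request": request, "effects": effects, "prev": prev,
                "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(log):
    """Recompute the chain; any tampering breaks it."""
    prev = "0" * 64
    for entry in log:
        body = json.dumps({"request": entry["request"],
                           "effects": entry["effects"],
                           "prev": prev}, sort_keys=True)
        if entry["prev"] != prev or \
           entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev = entry["hash"]
    return True

log = []
append(log, "INSERT order 1", ["orders: +1 row"])
append(log, "UPDATE order 1", ["orders: qty 10 -> 20"])
print(verify(log))                            # True
log[0]["effects"] = ["orders: qty 10 -> 99"]  # tamper with history
print(verify(log))                            # False
```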

Given these building blocks, there are many different ways to implement a verifiable database system. Essentially, there are three fundamental design questions:

• How do the verifiers consume the log of the verifiable database and generate consensus on the state of the verifiable database?

• How do the verifiers check whether the verifiable database does the right thing?

• Do verifiers check transactions synchronously or asynchronously?

The remainder of this section describes our proposals to address these three questions in the Veritas system that we are

Figure 5: Verifiers and Blockchain

currently building at MSR. We address these questions in the context of verifiable databases. We discuss the specifics of verifiable tables in Section 4.

3.1 Verification Architecture
Figure 5 shows one possible way to add verifiers to a verifiable database (Figure 3). The verifiers consume the log produced by the verifiable database, check whether everything is okay (see the next subsection), and write their vote into a blockchain.

In this architecture, the blockchain is only used to store the votes of verifiers. Depending on the specific blockchain technology, the cost and performance of blockchain operations vary, but they are always expensive. So, it is advantageous to minimize the number of blockchain operations. In the architecture of Figure 5, verifiers can further reduce the number of times they write to the blockchain by batching their votes. Instead of acknowledging each log record of the verifiable database, they record into the blockchain the position up to which they have verified all operations of the verifiable database. If there is a dispute, all parties can reconcile the history with the log of the verifiable database (which needs to be archived) and the votes of the verifiers in the blockchain. This protocol is called “Caesar Consensus”, and it was first devised for the Volt system [31].
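The batching idea can be sketched as follows: rather than one blockchain write per log record, a verifier records only a high-water mark, the position up to which it has verified everything. The vote format, the batch size, and the list standing in for the blockchain are illustrative assumptions.

```python
blockchain = []  # stand-in for the (expensive) blockchain holding votes
BATCH = 100      # vote once per 100 verified log records (assumed policy)

class Verifier:
    def __init__(self, name):
        self.name = name
        self.verified_upto = 0

    def consume(self, log_position):
        # ... verification of the records up to log_position happens here ...
        self.verified_upto = log_position
        if log_position % BATCH == 0:
            # batched vote: record only the high-water mark
            blockchain.append((self.name, log_position))

v = Verifier("verifier-1")
for pos in range(1, 251):   # consume 250 log records
    v.consume(pos)

print(len(blockchain))      # 2 blockchain writes instead of 250
print(blockchain[-1])       # ('verifier-1', 200)
```

Records 201 to 250 are verified but not yet voted on; a dispute over them would be reconciled against the archived log, as the text describes.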

Disseminating the log of the verifiable database to the verifiers across a wide-area network can be expensive and slow. One variant of the architecture of Figure 5 is to leverage emerging technology for trusted execution environments (TEEs) and collocate the TEE with the verifiable database. Examples of TEEs are Intel SGX [14], ARM TrustZone [2], Windows VBS [32], or FPGAs [17]. In this variant, the TEE consumes the log, verifies that the verifiable database operates correctly, informs all other verifiers, and writes its vote into the blockchain. This variant is inspired by the Coco framework [11]; in addition to performance enhancements, this approach can also be used to implement confidentiality in the architecture of Figure 5. To make this approach secure, the TEE needs to be attested by all parties (and other verifiers) in order to make sure that it does not run malicious code and verifies correctly.

3.2 Fine-grained vs. Coarse-grained Verification

The second design question addresses the inner workings of the verifiers. The "blockchain way" of building verifiers is to implement verifiers as replicated state machines [24]. That is, the verifiable database logs all requests of clients and the answers returned to the clients. Each verifier implements exactly the same logic as the

Veritas: Shared Verifiable Databases and Tables in the Cloud CIDR’19, January 2019, CA

verifiable database system: It re-runs the client requests and checks whether it obtains the same results. If the results match, the verifier votes that everything is okay; otherwise, it signals a violation. To make the log reproducible, the results of nondeterministic functions (e.g., random() or getTimeOfDay()) need to be logged, too. We call this approach coarse-grained verification.

Conceptually, coarse-grained verification is simple. However, it is also costly because it repeats expensive database operations such as joins and sorting. (For instance, it is easier to verify the result of a sort operator than to sort the sequence again.) Furthermore, the coarse-grained approach requires that the verifier maintain a full copy of the whole database. As a result, the coarse-grained approach does not compose well with the TEE variant described in the previous subsection. State-of-the-art TEEs (e.g., Intel SGX) have severe memory limitations and cannot carry out I/O operations reliably, making it difficult to embed a full-fledged DBMS into a TEE. Another severe issue of the coarse-grained verification approach is that the verifier inherits all bugs of the database system: The verifier has a large trusted computing base (TCB), and it is virtually impossible to verify the correctness of the software that drives the verifier.
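The sort example can be made concrete: re-sorting costs O(n log n) (and repeats the database's work), whereas checking a claimed sort result takes a single linear pass, an adjacent-order check plus a multiset comparison against the input. This is a generic sketch of the idea, not Veritas code.

```python
from collections import Counter

def verify_sorted(claimed, original):
    """Check a claimed sort result in O(n) instead of re-sorting in O(n log n)."""
    in_order = all(a <= b for a, b in zip(claimed, claimed[1:]))
    # the claimed output must also be a permutation of the input:
    same_elems = Counter(claimed) == Counter(original)
    return in_order and same_elems

data = [5, 3, 8, 1, 8]
print(verify_sorted(sorted(data), data))  # True
print(verify_sorted([1, 3, 5, 8], data))  # False: an element went missing
```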

As an alternative, a fine-grained verifier only checks the logical properties of the results and effects of all database operations. This approach is particularly attractive for databases because the semantics of database operations are (fairly well) standardized. A brute-force version of a fine-grained verifier is also straightforward to implement: It involves logging every record that is read and created as an intermediate or final query result. The current version of Veritas implements such brute-force fine-grained verification. It works well for simple, OLTP-style database statements (point lookups and updates). This approach also parallelizes verification well, following the design devised in [1]. Parallelizing verification is important so that the verifiers can keep up with the verifiable database system, which has a high degree of concurrency. More research, however, is needed to optimize this approach for more complex queries. Furthermore, more research on the fine-grained approach is needed to check whether the verifiable database system used a correct query plan (for complex queries) and whether the verifiable database system guaranteed the desired isolation level.

Veritas uses the fine-grained approach for three reasons: (a) verification is potentially cheaper, as the sorting example above shows; (b) a small footprint; and (c) security. The fine-grained approach has a small TCB, and fine-grained verifiers can be virtually stateless. For instance, the verifier of the Concerto key-value store [1], which is the design we adopted for Veritas, is composed of two simple AES blocks implemented in an FPGA and provably verifies the effects of a key-value store with hundreds of thousands of lines of code. It is also fully streaming and only keeps a few hashes as internal state. In contrast, a coarse-grained verifier is a full-fledged DBMS with millions of lines of code and involves replicating the whole database. A small TCB and footprint are particularly important to deploy a verifier using TEEs such as Intel SGX and FPGAs that have memory constraints.

3.3 Online vs. Deferred Verification

The third design question has attracted a great deal of attention in the research literature [8, 16]. Online verification means that a transaction cannot commit in the verifiable database until it has been verified by all (or a quorum) of the verifiers. This approach is important for transactions that have external effects; e.g., dispensing cash at an ATM. In contrast, deferred verification batches transactions and periodically validates the effects of this set of transactions in an asynchronous way.

It turns out that in the absence of external effects that cannot be rolled back, online and deferred verification provide the same trust guarantees. If the verification of a transaction fails in the online model, then just rolling back that specific transaction might not be enough. In fact, the transaction might be perfectly okay, and verification fails because of an integrity violation of the verifiable database caused, e.g., by a malicious admin long before this transaction. So, online verification does not guarantee immediate detection of malicious behavior, a common misunderstanding of online verification.

4 IMPLEMENTING SHARED, VERIFIABLE TABLES

Essentially, all the design considerations to implement trust (i.e., distributed verifiers) in a verifiable database discussed in Section 3 apply in the same way to the implementation of shared, verifiable tables. Conceptually, the big difference between verifiable databases and verifiable tables is concurrency control. Concurrency control is centralized in a verifiable database and implemented in the same way as in any traditional database system (e.g., using Snapshot Isolation [4] or two-phase locking [20]). In contrast, concurrency control is fundamentally distributed in a system that supports shared, verifiable tables.

As shown in Figure 4, verifiable tables are a way to integrate data from a shared, verifiable database into a local database. The advantage is that applications need not worry about distributed transactions across a local and a shared database. Distributed transactions are fundamental to the shared, verifiable table concept as each node is autonomous and carries out transactions on the shared (and private) tables independently. So, rather than letting the application mediate between the local database and the blockchain or verifiable database, the database system needs to implement distributed coordination of transactions to support verifiable tables.

There are many ways to implement distributed transactions [5]. Recently, there has been a series of works on implementing distributed Snapshot Isolation [7, 15]. For Veritas, we chose a scheme that is based on a centralized ordering service and that is inspired by the Hyperledger blockchain protocol.

Figure 6 depicts an example scenario with three parties, denoted as Node 0, Node 1, and Node 2. Each of these nodes operates a Veritas database on behalf of its organization, which includes private tables only visible to users of that organization and shared, verifiable tables visible to users of all three organizations. Users and applications of each node issue transactions against private and shared tables in a transparent way; that is, users and application programs need not be aware of the existence of other nodes and organizations, and they need not be aware of the difference between private and


Figure 6: Shared Tables Implementation Architecture. Three nodes (Node 0, Node 1, Node 2), each executing local transactions, are connected through a Broadcast Service (Kafka brokers) and a Timestamp (Ordering) Service.

shared tables. The database system at each node deals with all issues concerning distributed transactions, keeping the copies of shared tables consistent, and making all updates to shared tables verifiable.

Transactions that involve only private tables are executed only locally. Coordination is only needed for transactions that involve reads and updates of shared tables. For such “global” transactions, the database system does two things, as shown in Figure 6:

• Concurrency Control: We use timestamp-based concurrency control in our prototype implementation of Veritas.

• Disseminate Log and Voting: We use a Kafka-based broadcast service to propagate all updates of a shared table to all nodes. Each node embeds its votes (following the “Caesar Consensus Protocol” of Section 3.1) into the log that it ships to all other nodes.

Again, there are many ways to effect distributed concurrency control for a system that supports shared tables like Veritas. For simplicity, we chose an optimistic protocol that is based on timestamp ordering with a centralized timestamp service à la Hyperledger in our prototype implementation of Veritas. At the beginning of a global transaction that accesses a shared table, Veritas calls the timestamp service to get a global timestamp for the transaction. This timestamp determines the commit order of all global transactions.
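At its core, the centralized ordering service only needs to hand out monotonically increasing global timestamps. A minimal sketch (the prototype uses ZooKeeper for this role; the class name here is a hypothetical illustration):

```python
import itertools
import threading

class TimestampService:
    """Sketch of a centralized ordering service: hands out strictly
    monotonically increasing global timestamps, which fix the commit
    order of all global transactions."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def next_timestamp(self):
        # The lock makes the single point of ordering explicit; in the
        # prototype this role is played by a separate service process.
        with self._lock:
            return next(self._counter)
```

Because every global transaction obtains its timestamp before executing, any two nodes that later merge each other's logs agree on a single total commit order without further negotiation.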

Once a global transaction has received a (global) timestamp, it is executed locally at the Veritas node speculatively (or optimistically). While the global transaction is executed, it is not yet known whether it will commit successfully. In Veritas, a transaction can be aborted due to concurrency control conflicts or user aborts, just like in any other database system. Furthermore, transactions on shared tables can fail in Veritas if they are not verified by the other Veritas nodes.

To execute such transactions, each Veritas node keeps two data structures: (1) a commit watermark, TC, that indicates the highest global timestamp for which the node knows whether all global transactions with a timestamp less than TC committed successfully or aborted; and (2) a version history of all records of all shared tables. This version history contains the latest version of each record committed by a global transaction before TC and all versions of the record created by speculative transactions with a timestamp higher than TC.
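These two data structures can be sketched as follows. The class and method names are illustrative assumptions (the prototype persists its state in Redis, not in Python dictionaries):

```python
from collections import defaultdict

class VeritasNodeState:
    """Sketch of the per-node state from the text: a commit
    watermark TC and a per-record version history."""

    def __init__(self):
        self.tc = 0  # highest ts below which all outcomes are known
        # key -> list of (timestamp, value, status) versions,
        # where status is 'committed' or 'speculative'
        self.versions = defaultdict(list)

    def latest_committed(self, key):
        # Latest version committed by a global transaction before TC.
        committed = [v for v in self.versions[key]
                     if v[2] == "committed" and v[0] <= self.tc]
        return max(committed, default=None, key=lambda v: v[0])

    def speculative(self, key):
        # All versions of transactions with timestamp above TC.
        return [v for v in self.versions[key] if v[0] > self.tc]

    def garbage_collect(self, key):
        # Keep exactly what the text describes: the latest committed
        # version before TC plus everything still speculative.
        keep = self.speculative(key)
        latest = self.latest_committed(key)
        if latest is not None:
            keep = [latest] + keep
        self.versions[key] = sorted(keep)
```

Advancing TC after a batch of resolved transactions lets a node prune old versions while retaining everything needed to validate still-speculative transactions.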

Figure 7: State of a Veritas Node. Each node keeps committed state, speculative updates, and non-shared state (in a Redis database), along with a local (unresolved) transaction log and a global (resolved) transaction log (in an in-memory table); local logs are shipped and remote logs are merged.

Periodically, every Veritas node ships all log records that involve reads and writes of shared tables, as well as the begin-of-transaction, commit, and abort log records of all global transactions, to all other Veritas nodes using the Broadcast Service (Figure 6). When a Veritas node receives log records from another node, it applies that log and verifies the effects of all committed transactions. (The node buffers log records of active transactions until it receives the commit or abort log records for those transactions.) While applying the log, the node checks for read/write and write/write conflicts using its version store of records of the shared tables. If the node detects a conflict, it marks the transaction as aborted and ignores all its updates to its copy of the shared table. If the transaction can be verified and does not create any conflicts, the transaction is marked as committed and the node's copy of the shared tables is updated accordingly.
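The conflict check performed while merging a remote log can be sketched as follows. The function and field names are hypothetical; the check itself is the standard optimistic-validation rule implied by the text, namely that a transaction aborts if any record it read or wrote has a version newer than its snapshot:

```python
def apply_remote_transaction(txn, version_store):
    """Validate and apply one remote global transaction against the
    local version store (key -> list of (timestamp, value))."""
    ts = txn["ts"]
    for key in txn["read_set"] | txn["write_set"]:
        versions = version_store.get(key, [])
        # Any version installed between this transaction's snapshot
        # and its own timestamp is a read/write or write/write
        # conflict, so the transaction is marked aborted and its
        # updates are ignored.
        if any(txn["snapshot_ts"] < v_ts < ts for v_ts, _ in versions):
            return "aborted"
    # No conflicts: install the writes and mark the transaction
    # committed.
    for key, value in txn["writes"].items():
        version_store.setdefault(key, []).append((ts, value))
    return "committed"
```

Because every node receives the same logs in the same global timestamp order, all nodes reach the same commit/abort decision for each global transaction independently.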

Figure 7 depicts the state and log maintained by each Veritas node. In addition to the private (non-shared) tables, every Veritas node keeps a clean copy of all shared tables; this clean copy contains the state of all records of shared tables as of time TC, the commit watermark. In the current Veritas prototype, the non-shared tables and the clean shared tables are persisted in a Redis database. Furthermore, the Veritas node keeps all (known) versions of records of shared tables created by speculative transactions, i.e., transactions after TC whose outcome the node does not yet know. Finally, each Veritas node keeps a local log of all updates to private (non-shared) tables and a global log of updates to shared tables.

In this scheme, the (centralized) timestamp service is a potential vulnerability of the system because it must be trusted. Again, there are many possible ways to implement distributed transactions, including variants that do not rely on a centralized service. Exploring the full space of these protocols and variants is beyond the scope of this paper. The goal of this paper is to give a flavor of one possible way to interleave concurrency control and verification in a distributed system that implements shared tables.

5 EXPERIMENTAL RESULTS

This section contains the results of initial performance experiments done with the Veritas prototype described in Sections 3 and 4. The goal of these experiments was to study the overheads of verification


as a function of the number of nodes. Furthermore, we wanted to study the interplay of distributed concurrency control and verification, depending on data contention.

5.1 Benchmark Environment

We used a variant of the YCSB benchmark [12] for all performance experiments reported in this paper. We used a database with only a single (shared) table and varied the number of records in that shared table. Every record of the shared table was a key/value pair; both key and value were short strings of eight bytes.

We studied workloads with get() and put() operations only. All operations accessed a random record of the shared table using a uniform distribution across all records. While our prototype supports more general transactions, here we report results for simple transactions with a single read or a single update (a read followed by a write). We used three different workload mixes: (A) update-heavy (50% of operations are updates), (B) read-heavy (5% update rate), and (C) read-only (0% update rate).
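A workload of this shape is easy to reproduce. The generator below is a sketch under stated assumptions (uniform key choice over 8-byte keys, a configurable update fraction); it is not the benchmark driver used in the paper, which is part of the C# prototype:

```python
import random

def ycsb_workload(num_records, update_fraction, num_ops, seed=42):
    """Yield a YCSB-style stream of single-record operations:
    uniform access over 8-byte keys, mixing gets and puts
    according to update_fraction (A=0.5, B=0.05, C=0.0)."""
    rng = random.Random(seed)
    for _ in range(num_ops):
        key = f"{rng.randrange(num_records):08d}"  # 8-byte key
        if rng.random() < update_fraction:
            # put: write a fresh 8-byte value
            yield ("put", key, f"{rng.randrange(10**8):08d}")
        else:
            yield ("get", key)

# Workload A (update-heavy) over a 100K-record table:
ops = list(ycsb_workload(100_000, 0.5, 1000))
```

Fixing the seed makes runs repeatable, which matters when comparing commit rates across configurations.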

In all experiments, we ran a fixed number of transactions (one million, if not stated otherwise) and measured the time it took to execute this batch of transactions across all nodes of the system. The same number of transactions originated at every node; e.g., for a four-node system, 250,000 transactions originated at each node. Furthermore, we measured the number of transactions aborted due to concurrency conflicts caused by the optimistic protocol described in Section 4. From this number, we computed the commit rate: (total number of transactions - aborted transactions) / total number of transactions. From these two measurements, we derived the throughput as the number of committed transactions per second. We varied the number of nodes (Experiment 1) and the number of records in the shared table (Experiment 2).
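The two derived metrics follow directly from the definitions above:

```python
def commit_rate(total_txns, aborted_txns):
    """Commit rate as defined in the text:
    (total - aborted) / total."""
    return (total_txns - aborted_txns) / total_txns

def throughput(total_txns, aborted_txns, elapsed_seconds):
    """Committed transactions per second."""
    return (total_txns - aborted_txns) / elapsed_seconds
```

For example, one million transactions with 50,000 aborts give a commit rate of 0.95, and the throughput counts only the committed transactions.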

This paper only reports on experiments done with the shared table approach (Section 4). The throughput and commit rate of verifiable databases in the Veritas implementation (Section 3) are almost the same as in a traditional database system, as Veritas does verification asynchronously and in batches. The only overheads are the creation of digital signatures at clients and the shipping of the log to the verifiers.

We ran all Veritas nodes in Azure on VMs with four virtual cores and 16 GB of main memory (i.e., Azure D4s v3 instances). Each node ran a Redis database to store its copy of the shared database persistently. The Timestamp service shown in Figure 6 was implemented using ZooKeeper. The Broadcast service was implemented using Kafka. Both the Timestamp service and the Broadcast service were deployed on separate Azure VMs (again with four cores and 16 GB of main memory).

We implemented the Veritas prototype in C#; it currently hasabout 1500 lines of code. Each node batches transactions beforeshipping the log to the Broadcast service. The batch size was 10,000transactions in all experiments reported in this paper.

5.2 Experiment 1: Vary Number of Nodes

Figure 8 shows the throughput of Veritas for a database with 100K records and all three workload types (update-heavy, read-heavy, and read-only). Figure 8 shows the results for Veritas configurations

Figure 8: Throughput: Vary Nodes, 100K DB. Throughput (transactions/sec) for 1 to 4 nodes and workloads A, B, and C.

Figure 9: Commit Rate: Vary Nodes, 100K DB. Commit fraction for 1 to 4 nodes and workloads A, B, and C.

with 2, 3, and 4 nodes. As a baseline, the figure also shows a configuration with one Veritas node. With one node, Veritas behaves like any conventional database system as no coordination with other nodes is required.

As expected, there is a tax for distributed concurrency control and verification in Veritas. For the update-heavy workload (A), the throughput with distributed concurrency control and verification is only slightly more than half of the single-node throughput. With a more sophisticated distributed concurrency control protocol, we could probably improve these results. However, the figure shows that even for read-only workloads (C), the best case for the optimistic concurrency control scheme currently implemented in Veritas, the overhead is substantial. Nevertheless, Figure 8 also shows that the overhead is not catastrophic; even with the simple scheme currently used in Veritas, it is possible to achieve tens of thousands of transactions per second.

Figure 9 shows the commit rates for this experiment. Obviously and expectedly, the commit rate drops with a growing number of nodes as the number of conflicts increases with the number of concurrent nodes. Furthermore, as expected, the drop is most significant for the update-heavy workload (A). Comparing the commit


Figure 10: Throughput: 4 Nodes, vary database size (number of records). Throughput (transactions/sec) for database sizes of 100 to 100,000 records and workloads A, B, and C.

rates of Figure 9 with the throughput drop shown in Figure 8, it can be inferred that, as a rule of thumb, about half of the throughput drop can be explained by the impact of distributed concurrency control and half of the drop is due to the complexity of verification. The exact split depends on the number of nodes (the more nodes, the higher the relative impact of distributed concurrency control), but the 50/50 split is a nice rule of thumb for deployments with few nodes.

5.3 Experiment 2: Vary Database Size

To investigate the impact of concurrency control and data contention in more detail, we did a series of experiments with a varying number of records in the shared table (while keeping the number of nodes fixed at four): the fewer records, the higher the number of conflicts, and, thus, the higher the number of aborts due to our simple optimistic concurrency control scheme. Figures 10 and 11 show the results. As expected, the throughput increases with the size of the database and a correspondingly lower number of conflicts and aborts. In the extreme case of an update-heavy workload with a very small database of only 100 records, the throughput is roughly 1/4 of the throughput of a read-only workload. This result shows that (a) Veritas does not starve, even with extremely high data contention, and (b) delivers roughly the expected throughput of 1/(number of nodes), as the coordination effort is proportional to the number of nodes. This result is encouraging because it shows that the implementation of shared tables in Veritas does not create any additional bottlenecks: indeed, verification and distributed concurrency control have a price, as shown in Experiment 1, but the bottlenecks of a database system that uses verification (and distributed trust) are the same as the bottlenecks of a traditional database system.

6 RELATED WORK

Blockchain Systems: Bitcoin [26] is one of the first blockchain systems proposed for decentralized cryptocurrencies. Bitcoin is a

Figure 11: Commit Rate: 4 Nodes, vary database size (number of records). Commit fraction for database sizes of 100 to 100,000 records and workloads A, B, and C.

public blockchain that relies on cryptographic puzzles (proof-of-work) and incentives for decentralized trust. Ethereum [33] generalizes these ideas to more general computation. Both Bitcoin and Ethereum are public blockchains, meaning anyone can join the network and participate in the blockchain protocol.

For the enterprise applications we are interested in, private blockchains are more relevant. Examples of such blockchains include Hyperledger [22] and Quorum [28]. These systems use different mechanisms such as PBFT [9] for consensus rather than proof-of-work. But their verification methodology is similar to that of public blockchains and relies on mirroring the state and rerunning transactions at every node. Recent blockchain proposals such as Coco [11], Ekiden [10], and Intel Sawtooth Lake [29] rely on TEEs such as Intel SGX [14] for confidentiality and performance. As noted above, Veritas can be combined with similar ideas to achieve confidentiality. Another interesting approach to confidentiality is Corda [13], which does not use a single global ledger of transactions, but a decentralized network of notaries to prevent double-spending attacks.

Database Systems: BigchainDB [6] is a recent system that overlays blockchain consensus ideas such as PBFT over a database system (MongoDB [25]). While the overall goals of BigchainDB are similar to ours, there are fundamental architectural differences. BigchainDB, similar to traditional blockchains, relies on replicating the state and rerunning transactions at every node. Further, the entire MongoDB code is part of the TCB of the system. The Veritas architecture allows the database state and query processing to be centralized, with a much smaller verifier TCB. Verifiable databases are related to the large body of work on ensuring data integrity in outsourced settings [1, 3, 23, 34]. Most of this work deals with verifying data integrity for analytical workloads [3, 34] with poor update performance [23]. Concerto [1] is a recent system that addresses the update performance concerns of previous systems and is the source for many of the technical ideas underlying Veritas.

Other work: Google Fusion Tables [21] is a system for sharing and distributed editing of tables, but it does not provide the verifiability


guarantees required for blockchain-style applications. There is also a lot of work on data integration that is relevant [19].

7 CONCLUSIONS

We introduced two new abstractions. A verifiable database system creates an immutable log and allows an auditor to check the validity of query answers and updates that the database produced. A shared, verifiable table creates the same abstraction at the granularity of a table, which can be shared and thus be part of many different database instances in the cloud, all of which operate on the table as if it were a single table.

Experimental results with our prototype system Veritas show that our abstractions scale with the verification overhead and with the distributed concurrency control scheme.

REFERENCES

[1] Arvind Arasu, Ken Eguro, Raghav Kaushik, Donald Kossmann, Pingfan Meng, Vineet Pandey, and Ravi Ramamurthy. 2017. Concerto: A High Concurrency Key-Value Store with Integrity. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD Conference 2017, Chicago, IL, USA, May 14-19, 2017. 251–266.

[2] ARM TrustZone 2018. Arm TrustZone. https://developer.arm.com/technologies/trustzone.

[3] Sumeet Bajaj and Radu Sion. 2013. CorrectDB: SQL Engine with Practical Query Authentication. PVLDB 6, 7 (2013), 529–540.

[4] Hal Berenson, Phil Bernstein, Jim Gray, Jim Melton, Elizabeth O’Neil, and Patrick O’Neil. 1995. A critique of ANSI SQL isolation levels. In ACM SIGMOD Record, Vol. 24. ACM, 1–10.

[5] Philip A. Bernstein, Vassos Hadzilacos, and Nathan Goodman. 1987. Concurrency Control and Recovery in Database Systems. Addison-Wesley. http://research.microsoft.com/en-us/people/philbe/ccontrol.aspx

[6] BigchainDB 2018. BigchainDB: The Blockchain Database. https://www.bigchaindb.com/whitepaper/bigchaindb-whitepaper.pdf.

[7] Carsten Binnig, Stefan Hildenbrand, Franz Färber, Donald Kossmann, Juchang Lee, and Norman May. 2014. Distributed snapshot isolation: global transactions pay globally, local transactions pay locally. VLDB J. 23, 6 (2014), 987–1011. https://doi.org/10.1007/s00778-014-0359-9

[8] Manuel Blum, William S. Evans, Peter Gemmell, Sampath Kannan, and Moni Naor. 1994. Checking the Correctness of Memories. Algorithmica 12, 2/3 (1994), 225–244.

[9] Miguel Castro and Barbara Liskov. 2002. Practical byzantine fault tolerance and proactive recovery. ACM Trans. Comput. Syst. 20, 4 (2002), 398–461.

[10] Raymond Cheng, Fan Zhang, Jernej Kos, Warren He, Nicholas Hynes, Noah M. Johnson, Ari Juels, Andrew Miller, and Dawn Song. 2018. Ekiden: A Platform for Confidentiality-Preserving, Trustworthy, and Performant Smart Contract Execution. CoRR abs/1804.05141 (2018). http://arxiv.org/abs/1804.05141

[11] Coco 2017. Coco framework. https://github.com/Azure/coco-framework.

[12] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2008. Benchmarking cloud serving systems with YCSB. Symposium on Cloud Computing (2008).

[13] Corda 2018. Corda blockchain. https://www.corda.net/.

[14] Victor Costan and Srinivas Devadas. 2016. Intel SGX Explained. IACR Cryptology ePrint Archive 2016 (2016), 86. http://eprint.iacr.org/2016/086

[15] Jiaqing Du, Sameh Elnikety, and Willy Zwaenepoel. 2013. Clock-SI: Snapshot Isolation for Partitioned Data Stores Using Loosely Synchronized Clocks. In IEEE 32nd Symposium on Reliable Distributed Systems, SRDS 2013, Braga, Portugal, 1-3 October 2013. IEEE Computer Society, 173–184. https://doi.org/10.1109/SRDS.2013.26

[16] Cynthia Dwork, Moni Naor, Guy N. Rothblum, and Vinod Vaikuntanathan. 2009. How Efficient Can Memory Checking Be?. In Theory of Cryptography, 6th Theory of Cryptography Conference, TCC 2009, San Francisco, CA, USA, March 15-17, 2009. Proceedings. 503–520.

[17] Ken Eguro and Ramarathnam Venkatesan. 2012. FPGAs for trusted cloud computing. In 22nd International Conference on Field Programmable Logic and Applications (FPL), Oslo, Norway, August 29-31, 2012. 63–70.

[18] Hector Garcia-Molina, Dieter Gawlick, Johannes Klein, Karl Kleissner, and Kenneth Salem. 1991. Modeling Long-Running Activities as Nested Sagas. IEEE Data Eng. Bull. 14, 1 (1991), 14–18.

[19] Behzad Golshan, Alon Y. Halevy, George A. Mihaila, and Wang-Chiew Tan. 2017. Data Integration: After the Teenage Years. In Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017. 101–106.

[20] Jim Gray and Andreas Reuter. 1990. Transaction Processing: Concepts and Techniques.

[21] Alon Y. Halevy. 2013. Data Publishing and Sharing using Fusion Tables. In CIDR 2013, Sixth Biennial Conference on Innovative Data Systems Research, Asilomar, CA, USA, January 6-9, 2013, Online Proceedings.

[22] Hyperledger 2018. Hyperledger - Blockchain for Business. https://www.hyperledger.org/.

[23] Rohit Jain and Sunil Prabhakar. 2013. Trustworthy data from untrusted databases. In 29th IEEE International Conference on Data Engineering, ICDE 2013, Brisbane, Australia, April 8-12, 2013. 529–540.

[24] Leslie Lamport. 1984. Using Time Instead of Timeout for Fault-Tolerant Distributed Systems. ACM Trans. Program. Lang. Syst. 6, 2 (1984), 254–280.

[25] MongoDB 2018. MongoDB. https://www.mongodb.com/.

[26] Satoshi Nakamoto. 2008. Bitcoin: A peer-to-peer electronic cash system.

[27] QLDB 2018. Amazon Quantum Ledger Database (QLDB). https://aws.amazon.com/qldb/.

[28] Quorum 2018. Quorum Advanced Blockchain Technology. https://www.jpmorgan.com/global/Quorum.

[29] Sawtooth Lake 2018. Intel Sawtooth Lake. https://intelledger.github.io/.

[30] Srinath Setty, Sebastian Angel, Trinabh Gupta, and Jonathan Lee. 2018. Verifiable and private execution of concurrent services with Spice. In OSDI. To appear.

[31] Srinath Setty, Soumya Basu, Lidong Zhou, Michael Lowell Roberts, and Ramarathnam Venkatesan. 2017. Enabling secure and resource-efficient blockchain networks with VOLT. Technical Report MSR-TR-2017-38, Microsoft Research.

[32] Windows VBS 2018. Windows Virtualization Based Security. https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-vbs.

[33] Gavin Wood. 2018. Ethereum: A secure decentralized transaction ledger. http://gavwood.com/paper.pdf.

[34] Yupeng Zhang, Daniel Genkin, Jonathan Katz, Dimitrios Papadopoulos, and Charalampos Papamanthou. 2017. vSQL: Verifying Arbitrary SQL Queries over Dynamic Outsourced Databases. In 2017 IEEE Symposium on Security and Privacy, SP 2017, San Jose, CA, USA, May 22-26, 2017. 863–880.