

Efficient Cross-Shard Transaction Execution in Sharded Blockchains

Sourav Das, University of Illinois at Urbana-Champaign, [email protected]

Vinith Krishnan, University of Illinois at Urbana-Champaign, [email protected]

Ling Ren, University of Illinois at Urbana-Champaign, [email protected]

ABSTRACT

Sharding is a promising blockchain scaling solution. But it currently suffers from high latency and low throughput when it comes to cross-shard transactions, i.e., transactions that require coordination from multiple shards. The root cause of these limitations arises from the use of the classic two-phase commit protocol, which involves locking assets for extended periods of time. This paper presents Rivet, a new paradigm for blockchain sharding that achieves lower latency and higher throughput for cross-shard transactions. Rivet has a single reference shard running consensus, and multiple worker shards maintaining disjoint states and processing a subset of transactions in the system. Rivet obviates the need for consensus within each worker shard, and as a result, tolerates more failures within a shard and lowers communication overhead. We prove the correctness and security of Rivet. An evaluation of our prototype implementation atop 50+ AWS EC2 instances demonstrates the latency and throughput improvements for cross-shard transactions of Rivet over the baseline 2PC. As part of our evaluation effort and an independent contribution, we also propose a more realistic framework for evaluating sharded blockchains by creating a benchmark based on real Ethereum transactions.

1 INTRODUCTION

A typical blockchain system replicates storage and computations among all its nodes and runs a single consensus algorithm involving all nodes [28, 38]. Such a global replication approach has limited scalability and throughput. Sharding has emerged as a promising approach to address the long-standing quest for blockchain scalability [9, 14, 23, 25, 37, 40]. Sharding improves scalability by partitioning different responsibilities and resources to different sets of nodes. A sharded blockchain can potentially shard its storage, communication, and computation.

A critical design component of a sharded ecosystem is its mechanism to handle cross-shard transactions, i.e., transactions that involve more than one shard. Cross-shard transactions are essential to sharded blockchains as they enable users to atomically interact with multiple shards; in other words, a sharded blockchain without such support is uninteresting as it degenerates to running multiple independent blockchains. Popular examples of cross-shard transactions include atomic exchange of assets maintained at different shards [19], and atomically booking a flight ticket and a hotel room where the two are being sold in different shards [2, 4].

A number of prior works [9, 14, 23, 25, 37, 40] proposed sharding schemes under different settings. These protocols can linearly scale intra-shard transactions, i.e., transactions that can be processed within a single shard, by adding more shards to the system. However, existing works encounter a performance bottleneck when it comes to cross-shard transactions. All of the above works adopt the two-phase commit (2PC) protocol to execute cross-shard transactions. While 2PC is the simplest and most well-known atomic commit protocol, it requires nodes to lock assets for an extended period of time, leading to higher latency and lower throughput for cross-shard transactions.

A new paradigm for sharded blockchains. In this paper, we aim to address the above limitation with a new framework for sharded blockchains called Rivet. Rivet achieves lower confirmation latency and better throughput for cross-shard transactions at a modest cost of higher intra-shard latency (but not throughput). We give an overview of Rivet below.

Rivet has a single reference shard and multiple worker shards. Each shard can be either permissioned or permissionless. This paper focuses on the permissioned setting. In particular, we assume that every node in a shard is aware of the identities of the other nodes of its own shard and of the reference shard.[1] The reference shard runs a consensus layer and maintains its own blockchain. Each worker shard maintains a disjoint set of states in the system. Each worker shard executes blocks of transactions involving it and vouches for the validity of the resulting state. It is important to note that worker shards do not run consensus on these blocks – instead, they periodically submit hash digests of worker blocks to the reference shard. Cross-shard transactions are also submitted to the reference shard by users in the system. The worker shard commitments and cross-shard transactions are then finalized and ordered by the consensus layer of the reference shard. When a set of cross-shard transactions is finalized, each worker shard locally executes the subset of these transactions that are relevant to it, atop the latest committed states. To do that, a worker shard needs to download the data needed by these transactions from other shards, along with accompanying proofs showing the validity of the data (under the latest commitments).

Rivet offers two main advantages over the classic 2PC approach. Firstly, Rivet obviates the need to run consensus within each worker shard. As a result, each worker shard in Rivet requires fewer replicas[2] and runs a simpler and cheaper (using less communication) protocol, compared to the 2PC approach. Secondly, Rivet improves the confirmation latency and throughput of cross-shard transactions. Specifically, a cross-shard transaction gets confirmed as soon as a single worker shard involved in the transaction locally executes it and adds it to a certified worker block (§4). This holds independent of the number of shards involved, and is in sharp contrast to the 2PC approach, where cross-shard transactions are

[1] Note that this assumption is implicit in all committee-based Byzantine Fault Tolerant consensus protocols.
[2] We use the terms replica and node interchangeably in this paper.

arXiv:2007.14521v2 [cs.CR] 29 Jan 2021


delayed by the slowest participating shards. As a consequence of the lower latency, cross-shard transactions in Rivet lock data items for a shorter amount of time (i.e., they are made available to future transactions sooner), leading to higher throughput for cross-shard transactions. Not running consensus protocols in worker shards also comes with a downside: intra-shard transactions are finalized only when their state commitments get included in the reference chain. This results in a higher latency for intra-shard transactions compared to 2PC.

We implement Rivet (and a 2PC baseline) atop the open-source Quorum client [6]. On a side note, our implementation supports the generic smart contract execution model, whereas prior sharding proposals focus on the Unspent Transaction Output (UTXO) [28] transaction model; this might be of independent interest to the reader.

An evaluation framework for sharded blockchains. While attempting to compare Rivet against the 2PC approach, we find ourselves (and the field of blockchain sharding) in need of a better evaluation framework. Currently, the evaluation methodology of existing works is ad-hoc and artificial. In particular, most of them randomly allocate (synthetic) transactions to shards. Clearly, such a random allocation would make the vast majority of transactions cross-shard and fail to capture the characteristics of a realistic sharded blockchain.

In light of this, we try to characterize the Ethereum transaction history with the aim of better understanding the interactions within and across shards had we sharded Ethereum. We proceed to create a benchmark for sharded blockchains. At a high level, our benchmark represents interactions between accounts as a graph and partitions them into different shards while minimizing the number of cross-shard transactions. Overall, we observe less than 30% cross-shard transactions among different shards, as opposed to over 90% cross-shard transactions arising from a random allocation of accounts to shards [14, 40]. We observe that our approach partitions major services along with their users into different shards. Thus, we believe this benchmark gives a more realistic way to evaluate sharded blockchain systems (ours and future ones).

Experimental Evaluation. We then evaluate them using our benchmark on a testbed of 50+ AWS EC2 instances with realistic network delays. Our evaluation illustrates that almost all cross-shard transactions in Rivet are confirmed within one worker block interval from their inclusion in the reference chain. Furthermore, Rivet has approximately 35% better throughput for cross-shard transactions in comparison to the 2PC based design. In addition, the vast majority (> 99%) of state variables accessed by cross-shard transactions are unlocked immediately in Rivet, unlike in 2PC.

In summary, we make the following contributions:
• We present Rivet, a novel sharded system that has lower confirmation latency for cross-shard transactions, tolerates more failures, and has better block utilization than existing approaches. We supplement our claims with theoretical proofs of their correctness and security.
• We analyze historical Ethereum transactions to better characterize the benefits of sharding in permissionless blockchains and use our analysis to create a realistic benchmark for evaluating sharded blockchains.
• We implement both Rivet and 2PC atop an open-source Quorum client and rigorously evaluate them using our benchmark on a testbed of 50+ AWS EC2 instances. Our evaluations further corroborate our design choices.

Paper Organization. We give the background in §2. We describe our methodology to analyze the Ethereum transaction history and our findings in §3. We present the detailed design of Rivet in §4 and argue about its correctness and security in §5 (with formal proofs in Appendix 9). §6 describes our prototype implementation of Rivet and 2PC as well as experimental results. We describe related work in §7 and end with a discussion of future research directions in §8.

2 PRELIMINARIES AND NOTATION

Sharded Blockchains. Sharding is a prominent approach used to improve the performance and scalability of current blockchain protocols. The main idea is to split the overheads of processing transactions among multiple smaller groups of nodes. These groups, also called shards, work in parallel to maximize performance (involving transaction processing and state update) while requiring each node to communicate with fewer other nodes. This allows the system to scale to a large number of participating nodes.

To begin with, the participating nodes must be partitioned into different groups in such a manner that no shard is overwhelmed by too many malicious nodes. This is typically done by partitioning the nodes randomly into approximately equal-sized groups [23, 25, 40]. The group sizes are set so that, once partitioned, the fraction of faulty nodes in each shard remains below a certain threshold.
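The sampling argument above — pick a shard size so that a randomly drawn shard exceeds its fault threshold only with small probability — can be made concrete with a hypergeometric tail computation. This is a generic sketch under illustrative parameters, not the exact thresholds or security analysis of the cited protocols.

```python
from math import comb

# Probability that a uniformly random shard of size m, drawn from n nodes
# of which f are faulty, contains MORE than a fraction t of faulty nodes.
# This is the upper tail of the hypergeometric distribution.
def shard_failure_prob(n, f, m, t):
    bad = int(t * m)  # largest tolerable number of faulty nodes in a shard
    total = comb(n, m)
    return sum(comb(f, i) * comb(n - f, m - i)
               for i in range(bad + 1, m + 1)) / total

# Illustrative parameters (not from the paper): 2000 nodes, 25% faulty
# overall, shards of 100 nodes, per-shard tolerance 1/3.
p = shard_failure_prob(2000, 500, 100, 1 / 3)
assert 0.0 <= p < 1.0
```

Larger shards drive this tail probability down, which is exactly the tension sharding designs balance: bigger shards are safer but scale worse.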

The state of the blockchain is then partitioned among the different shards, i.e., a disjoint set of accounts is assigned to each shard so that nodes in one shard only process transactions associated with those accounts. Sharding is expected to improve performance because, hopefully, most transactions are “local” to a single shard and only require the participation of the replicas maintaining that shard. We call these transactions intra-shard transactions.

Cross-shard transactions and 2PC. Transactions that involve multiple shards are called cross-shard transactions. Execution of cross-shard transactions requires some coordination mechanism among the participating shards. Most existing sharding schemes use 2PC to atomically execute cross-shard transactions. Moreover, they primarily focus on the UTXO-based model, where each transaction uses unspent tokens as inputs to create a new transaction with fresh unspent outputs. We use an example to illustrate how such a sharding system works.

Say a user creates a cross-shard transaction that takes two unspent tokens, 𝑢1 on a shard 𝑋1 and 𝑢2 on a second shard 𝑋2, and moves them to a third shard 𝑋3. The creator of the transaction, referred to as the client, broadcasts this transaction to the two input shards 𝑋1 and 𝑋2. On receiving this transaction, the two input shards first validate it, i.e., check whether the tokens are indeed unspent; if so, an input shard locks the input and produces an approval certificate (e.g., signed by sufficiently many replicas within the shard) confirming the validity of the input. If, on the contrary, one of the inputs is invalid, e.g., the associated token has already been spent, the corresponding input shard produces a rejection certificate indicating the invalidity of the input. This is the first (locking) phase in classic 2PC. Note that once an input is locked, no future transaction can use the input until it is unlocked.

The client waits for the certificates from all input shards, and if all input shards unanimously approve the transaction, it sends the transaction along with all the certificates to the output shard(s) (𝑋3 in our example). On receiving the cross-shard transaction and the unanimous approval certificates, the output shard adds the desired token to the appropriate account and sends a confirmation certificate to the client. Alternatively, if any of the input shards rejects its input, every output shard rejects the transaction. The client also forwards the approval or rejection certificates to every input shard. An input shard marks the input as spent if there are unanimous approval certificates, or else unlocks the input for future transactions. This is the second phase of the standard 2PC protocol.
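The two phases above can be sketched in a few lines. This is a minimal single-process simulation for intuition only: the shard class, token names, and tuple-encoded certificates are illustrative assumptions, not the protocol messages or signatures of any cited system.

```python
# Sketch of 2PC for a cross-shard UTXO transfer (illustrative only).
class Shard:
    def __init__(self, name, unspent):
        self.name = name
        self.unspent = set(unspent)  # unspent tokens this shard maintains
        self.locked = set()

    # Phase 1: validate and lock an input token; emit a certificate.
    def prepare(self, token):
        if token in self.unspent and token not in self.locked:
            self.locked.add(token)
            return ("approve", self.name, token)
        return ("reject", self.name, token)

    # Phase 2: mark spent on unanimous approval, else unlock the input.
    def finalize(self, token, all_approved):
        self.locked.discard(token)
        if all_approved:
            self.unspent.discard(token)

def cross_shard_transfer(inputs, output_shard, new_token):
    certs = [shard.prepare(tok) for shard, tok in inputs]       # phase 1
    ok = all(c[0] == "approve" for c in certs)
    for shard, tok in inputs:                                    # phase 2
        shard.finalize(tok, ok)
    if ok:
        output_shard.unspent.add(new_token)
    return ok

x1, x2, x3 = Shard("X1", {"u1"}), Shard("X2", {"u2"}), Shard("X3", set())
assert cross_shard_transfer([(x1, "u1"), (x2, "u2")], x3, "u3")
assert "u3" in x3.unspent and "u1" not in x1.unspent
```

Note that between `prepare` and `finalize` the inputs are locked and unavailable to other transactions — the window Rivet aims to shrink.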

Smart contracts. Smart contracts are programs consisting of a set of functions that are identified by unique addresses. Each smart contract maintains its state, a set of disjoint key-value pairs, that can be modified according to the program logic of the contract. Smart contracts are created by sending transactions containing their code. Upon creation, users can invoke functions in them by sending transactions to the contract address. Functions of smart contracts can also be invoked by other smart contracts.

A transaction invokes a function by specifying the appropriate contract address, the function, and the required arguments to the function. On receiving a transaction, the proposer of a block validates the transaction before including it in its proposal. Once included in a proposal, transactions are executed atop some initial state, and their execution results in a new state. The state transition is deterministic and is denoted by the function Π. Specifically, if a transaction tx is executed on top of an initial state state, then the resulting state is state′ = Π(state, tx). Sometimes, we overload the notation to apply the transition function Π to an ordered list of transactions, which should be interpreted as executing the transactions in the ordered list one by one.
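The overloaded notation can be read as a left fold of Π over the transaction list. The sketch below uses a toy balance-map state and a (sender, receiver, amount) transaction encoding — assumptions for illustration, not the paper's execution model.

```python
from functools import reduce

# state' = Π(state, tx): here state is a balance map and a tx is a
# (sender, receiver, amount) triple -- a toy encoding for illustration.
def apply_tx(state, tx):
    sender, receiver, amount = tx
    new = dict(state)  # Π is a pure function: do not mutate the input state
    new[sender] = new.get(sender, 0) - amount
    new[receiver] = new.get(receiver, 0) + amount
    return new

# Π overloaded on an ordered list: execute the transactions one by one.
def apply_block(state, txs):
    return reduce(apply_tx, txs, state)

s0 = {"alice": 10, "bob": 0}
s1 = apply_block(s0, [("alice", "bob", 3), ("bob", "carol", 1)])
assert s1 == {"alice": 7, "bob": 2, "carol": 1}
```

Determinism of Π is what lets worker shards vouch for a resulting state with only a hash digest: any replica re-executing the same ordered list from the same initial state obtains the same result.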

3 A BENCHMARK FOR SHARDED BLOCKCHAINS

Prior sharding works partition the state among shards in a uniformly random manner. Clearly, such a random partitioning does not capture a realistic workload for sharded blockchains. In particular, it will result in a dominant fraction of cross-shard transactions [14, 40]. Intuitively, one would expect a lot more structure and patterns in the mix of cross-shard and intra-shard transactions than in simulated transactions with random access patterns. Consequently, evaluation results from these contrived benchmarks may significantly depart from reality and fail to accurately reflect the performance of sharded blockchains. In this section, we seek to create a benchmark suitable for sharded blockchains by intelligently partitioning the workload of Ethereum, which is a leading blockchain supporting general computation in the real world.

To this end, we partition the Ethereum state in such a way that cross-shard interactions are minimized. We analyze our results and observe that major “services” are assigned to different shards. Moreover, many other accounts interact with one major service frequently, and they are assigned to the same shard as that service.


Figure 1: Sample graph with eight accounts {𝑎1, . . . , 𝑎8} and six transactions {tx1, . . . , tx6} marked with regions of different colors. Vertex weights are four-element tuples (degree, state size, gas usage, transaction size).

We believe this will be close to the ecosystem of a realistic sharded blockchain, and the benchmark created this way is a good candidate for evaluating sharded blockchains in this paper as well as in future works.

3.1 Methodology

We take five thousand different blocks, in five ranges of one thousand blocks each, starting approximately at the 7.3 millionth block. We represent the interactions between accounts as an undirected graph. Each account is a vertex. Edge weights denote the number of transactions that involve the corresponding two accounts. For every transaction that involves both accounts 𝑢 and 𝑣, the edge weight of (𝑢, 𝑣) is incremented by one. If a transaction involves more than two accounts, it contributes one unit of weight to all edges in the clique formed by these accounts.
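The graph construction above — one unit of weight on every clique edge per transaction — can be sketched as follows. The list-of-account-tuples encoding of transactions is an assumption for illustration.

```python
from itertools import combinations
from collections import Counter

# Build the undirected interaction graph: vertices are accounts, and the
# weight of edge (u, v) counts transactions that touch both u and v.
# A transaction touching more than two accounts adds one unit of weight
# to every edge of the clique over those accounts.
def build_graph(transactions):
    edge_weight = Counter()
    for accounts in transactions:
        for u, v in combinations(sorted(set(accounts)), 2):
            edge_weight[(u, v)] += 1
    return edge_weight

# tx5 and tx6 from Figure 1: tx5 touches a6, a7, a8; tx6 touches a6, a7.
g = build_graph([("a6", "a7", "a8"), ("a6", "a7")])
assert g[("a6", "a7")] == 2   # accessed by both tx5 and tx6, as in the text
assert g[("a6", "a8")] == 1
```

Sorting the accounts before forming pairs keeps the edge key canonical, so (𝑢, 𝑣) and (𝑣, 𝑢) accumulate into the same undirected edge.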

Our partitioning scheme is inspired by techniques used in distributed database partitioning; the connection is explored in §7. As mentioned, we hope to partition the accounts into a number of disjoint shards while minimizing the number of cross-shard transactions. However, a blunt partitioning approach would simply put all accounts in a single shard and eliminate cross-shard transactions. Thus, we need additional constraints to avoid this trivial partition. To this end, we require the partition to be more or less balanced in terms of activities. In particular, we assign every vertex four different weights: (1) the account's storage size (measured in bytes), (2) the total degree of the vertex, i.e., the total number of transactions that access the vertex, (3) the total amount of computation (measured in gas) used by the transactions accessing the account, and (4) the total size of the transactions accessing the account. These four weights measure the storage, frequency of involvement, computation, and communication associated with an account, respectively.

We then seek to partition the graph into non-overlapping shards such that the total weight of the cross-shard edges is minimized (i.e., a min-cut) and all shards are balanced within a constant factor in terms of each of the four aggregated weights. For each of the four metrics (storage, frequency of involvement, computation, and communication), the aggregated weight of a shard is the sum of the corresponding weights of the vertices assigned to the shard. We use the Metis tool [21] – a heuristic tool for constrained 𝑘-way graph partitioning – to perform the partitioning for different values of 𝑘.
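Given a candidate assignment of accounts to shards, the two quantities the objective trades off — the cut weight and the per-metric shard loads — can be evaluated with a short sketch like the one below. The Metis invocation itself is omitted; the data structures and names here are illustrative assumptions.

```python
# Total weight of edges whose endpoints land in different shards (the cut).
def cut_weight(edge_weight, shard_of):
    return sum(w for (u, v), w in edge_weight.items()
               if shard_of[u] != shard_of[v])

# Aggregated vertex weights per shard for each of the four metrics
# (storage, frequency of involvement, computation, communication).
def shard_loads(vertex_weights, shard_of, k, n_metrics=4):
    loads = [[0] * n_metrics for _ in range(k)]
    for v, ws in vertex_weights.items():
        for i, w in enumerate(ws):
            loads[shard_of[v]][i] += w
    return loads

edges = {("a1", "a2"): 1, ("a2", "a3"): 2}
weights = {"a1": (1, 1, 2, 4), "a2": (1, 2, 2, 4), "a3": (1, 2, 2, 4)}
part = {"a1": 0, "a2": 0, "a3": 1}
assert cut_weight(edges, part) == 2           # only (a2, a3) crosses shards
assert shard_loads(weights, part, 2)[0] == [2, 3, 4, 8]
```

A partitioner like Metis searches over assignments to minimize `cut_weight` subject to each column of `shard_loads` staying within a constant factor across shards.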

Figure 1 illustrates our approach on a sample state with eight accounts {𝑎1, . . . , 𝑎8} and six transactions {tx1, . . . , tx6} indicated by the colored regions. Accounts accessed by a transaction are enclosed by its region. For example, transaction tx5 accesses 𝑎6, 𝑎7, and 𝑎8. Account 𝑎4 is accessed by tx2, tx3, and tx4. Edge weights between a pair of nodes represent the number of times the pair has been accessed by common transactions. For example, edge (𝑎6, 𝑎7) has a weight of 2 as the pair has been accessed by both tx5 and tx6. Also, the aggregated weights of accounts in both partitions are balanced.

Figure 2: Average fraction of transactions that are cross-shard with varying number of worker shards 𝑘 (observed versus the desired target 1/(1 + 𝑘)), evaluated by partitioning a trace of approximately 750 thousand historical Ethereum transactions. The average is taken over 5 different ranges of 1000 blocks each.

3.2 Partitioning Results and Analysis

The first decision we need to make in creating a benchmark is how many shards we should have in total. We employ a heuristic discussed below. We have already discussed that we aim to make every shard obtained by partitioning have roughly the same number of transactions to process, resulting in a balanced workload. Following the same principle, we would also like to make the number of cross-shard transactions roughly the same as the number of intra-shard transactions per shard, again resulting in a balanced workload between the worker shards and the reference shard (or coordinator shard). This means we should try to make the fraction of cross-shard transactions roughly the reciprocal of the total number of shards, i.e., 1/(𝑘 + 1) where 𝑘 is the number of worker shards. Figure 2 plots the fraction of cross-shard transactions obtained and the desired target of 1/(𝑘 + 1) as a function of the number of shards. Naturally, the fraction of cross-shard transactions increases with the number of shards (in the extreme case of a single worker shard, all transactions are intra-shard), while the desired fraction decreases. The two curves intersect roughly when the number of worker shards is 6. This is the number of worker shards we will use in our experiments.
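The two quantities being intersected can be computed with a small sketch (our own helper names; the transaction/assignment layout is the same illustrative schema as before):

```python
def cross_shard_fraction(txs, assignment):
    """Fraction of transactions whose accessed accounts span more than
    one shard under the given account-to-shard assignment."""
    cross = sum(1 for tx in txs
                if len({assignment[a] for a in tx["accounts"]}) > 1)
    return cross / len(txs)

def desired_fraction(k):
    """Balance heuristic from the text: with k worker shards plus one
    reference shard handling the cross-shard transactions, workloads
    match when the cross-shard fraction is about 1 / (k + 1)."""
    return 1 / (k + 1)
```

Sweeping `k`, one would pick the value where `cross_shard_fraction` of the resulting partition crosses `desired_fraction(k)` (six worker shards for the Ethereum trace used here).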

Intra- and cross-shard activities. We observe the fraction of cross-shard transactions as well as intra-shard transactions to confirm that the partitioning indeed reduces cross-shard activity. Figure 3 illustrates the results. The number on an edge is the fraction of transactions involving those two shards (and possibly others). The number on a self-edge is the fraction of transactions involving only that shard. As anticipated, there are far fewer cross-shard transactions than intra-shard transactions.

On a deeper look within each shard, we observe that each shard has a few popular contracts that drive high intra-shard activity. For example, all the accounts of Binance [3], a popular cryptocurrency exchange, are assigned to shard 𝑋1, and these accounts frequently interact with each other. Shard 𝑋2 has Ethermine [5], a popular Ethereum mining pool, which frequently pays the miners in the pool, whose accounts are mostly assigned to the same shard. Shard 𝑋3's most popular account is the "Tether token" [7] contract, a popular ERC20 token with a value pegged to the US dollar.

Figure 3: Fraction of intra-shard and cross-shard transactions in the obtained partition with six worker shards 𝑋1 to 𝑋6. Self-edge weights denote the fraction of intra-shard transactions (out of all transactions), while other edge weights denote the fraction of transactions (out of all transactions) involving those two shards.

Figure 4: Cumulative fraction of cross-shard transactions in all six shards in terms of data downloaded from other shards (in bytes).

Data transfer between shards. Since every worker shard maintains a disjoint subset of the entire state, nodes of a worker shard sometimes need to download state information from nodes of other worker shards to execute cross-shard transactions. Figure 4 illustrates the cumulative distribution of cross-shard transactions in terms of the amount of data transfer needed. Observe that, in every worker shard, more than 95% of cross-shard transactions require transferring at most 128 bytes of data (4 values) from other shards to execute the transaction locally. This shows that the data transfer needed for local execution of cross-shard transactions is minimal.

Potential gaps from future sharded blockchains. Despite our efforts to mimic realistic workloads, we would like to acknowledge the potential gap between our workload and real-world sharded blockchains (when they come into existence). Since cross-shard transactions are inherently more expensive (they involve locking and data transfer between shards), users may take intelligent measures to reduce the number of cross-shard transactions they issue. For example, a user who repeatedly uses a service from a shard other than their home shard may decide to create an account in that other shard and transfer some tokens to it. Further, applications or contracts that expect to receive a large number of cross-shard transactions might adopt programming practices that distribute their address space to reduce conflicts between different transactions. These behaviors and practices may lead to a further reduction in cross-shard transactions in comparison to our benchmark. The dominating activities on the Ethereum blockchain (and other blockchains) today come from trading, exchanges, and mining pools. This will likely change if blockchains are to find more practical applications. It is hard to predict what applications will prevail and what characteristics (related to sharding) they will exhibit. The methodology in this section represents our best effort in creating a sharded blockchain workload given the data available at the time of writing.

4 RIVET DESIGN

4.1 System Model

We adopt the standard partially synchronous network model, i.e., a network that oscillates between periods of synchrony and periods of asynchrony. During periods of synchrony, all messages sent by honest replicas adhere to a known delay bound Δ. During periods of asynchrony, messages can be delayed arbitrarily. In theoretical works, the partial synchrony model [16] is often stated differently (e.g., using an unknown Global Stabilization Time, GST) for rigor or convenience, but the essence is to capture the practical oscillating timing model mentioned above. A protocol in partial synchrony ensures safety (consistency) even under periods of asynchrony, and provides liveness only during periods of synchrony.

This paper focuses on the permissioned setting.³ We assume that at most 𝑓 replicas can be faulty in each shard. A consensus protocol tolerating 𝑓 faults under partial synchrony needs at least 3𝑓 + 1 replicas [16]. Thus, in 2PC-based protocols, every shard has 3𝑓 + 1 replicas. In Rivet, only the reference shard has 3𝑓 + 1 replicas; every worker shard has 2𝑓 + 1 replicas. All faulty replicas are controlled by a single adversary A and can deviate arbitrarily from the prescribed protocol. All non-faulty replicas are honest and strictly follow the prescribed protocol. We assume that A cannot break standard cryptographic constructions such as hash functions and signature schemes.

4.2 Overview

In Rivet there are 𝑘 + 1 shards {𝑋0, 𝑋1, · · · , 𝑋𝑘 } in total. Shard 𝑋0 is referred to as the reference shard and runs a fault-tolerant consensus protocol. All cross-shard transactions in Rivet are included and ordered by the reference shard. The other 𝑘 shards are called worker shards, and they maintain disjoint subsets of the system state. The worker shards verify and prove the validity of their states, but they do not need to run consensus. Hence, as mentioned, the reference shard has 3𝑓 + 1 replicas while each worker shard has 2𝑓 + 1 replicas.

Each worker shard maintains a sequence of certified blocks, where each block includes some intra-shard transactions and a hash of its predecessor. We refer to this sequence of blocks as the worker chain. Replicas within a worker shard append certified blocks to the worker chain by collecting at least 𝑓 + 1 distinct signatures from replicas within the shard. Once a block is certified, the worker shard submits a commitment of the resulting state to the reference chain, to be finalized by the reference shard. We note again that, instead of running a consensus protocol per worker shard, Rivet only requires worker shards to certify the validity of worker blocks per the protocol specification (see §4.5 for a precise definition of valid blocks). A worker block is finalized when a reference block containing its commitment is finalized in the reference chain.

³It is conceivable to make (some or all) shards permissionless by running a permissionless blockchain per shard. We leave this direction as future work.

Figure 5 illustrates the high-level idea behind Rivet with an example. Say a user creates a cross-shard transaction ctx that involves two shards 𝑋𝑎 and 𝑋𝑏 . Let state𝑎0 and state𝑏0 be the latest committed states from 𝑋𝑎 and 𝑋𝑏 respectively. Also, let tx1 and tx2 be two intra-shard transactions. Here, ctx is first included in block 𝑃𝑟 by the reference shard; then replicas in 𝑋𝑎 and 𝑋𝑏 execute ctx atop the latest committed states state𝑎0 and state𝑏0 . After executing ctx, both shards independently execute some intra-shard transactions, e.g., 𝑋𝑎 executes tx1 and 𝑋𝑏 executes tx2, and update their commitments to the latest execution results state𝑎1 and state𝑏1 .

A careful reader may note that the core approach in Rivet can be viewed as a locking scheme. At every state commitment, each shard implicitly locks its entire state for potential future cross-shard transactions. However, despite locking the state, a worker shard optimistically proceeds to execute and certify new intra-shard transactions atop the locked state, hoping that no conflicting cross-shard transactions will appear in the reference chain before it commits the updated state. If indeed no cross-shard transactions involving a worker shard appear in the reference chain, the new state commitment gets added to the reference chain and the worker shard makes progress. On the other hand, if some conflicting cross-shard transactions appear before the next state commitment, Rivet forces worker shards to execute those cross-shard transactions first, before any new intra-shard transactions. In doing so, a worker shard may have to discard some certified blocks in its worker chain. We report statistics on how often a worker shard has to discard its certified blocks in §6.

Note that every worker shard can execute cross-shard transactions as soon as it notices them in a finalized reference block, independently of the status of other involved shards. This is in sharp contrast to 2PC, where each shard waits for every other shard specified in the transaction to lock its state first, and only then proceeds to execute the cross-shard transaction atop the locked states. Indeed, this proactive nature of state commitments allows Rivet to execute cross-shard transactions more efficiently than 2PC.

Figure 5: Overview of Rivet with two shards 𝑋𝑎, 𝑋𝑏 and a cross-shard transaction ctx involving both shards. tx1 and tx2 are intra-shard transactions to 𝑋𝑎 and 𝑋𝑏 respectively. Here 𝐵 (𝑎)𝑖 and 𝐵 (𝑎)𝑖+1 (resp. 𝐵 (𝑏)𝑗 and 𝐵 (𝑏)𝑗+1) are the two certified worker blocks at shard 𝑋𝑎 (resp. 𝑋𝑏 ). Actions taken by 𝑋𝑎 (resp. 𝑋𝑏 ) are indicated as 1a, · · · , 5a (resp. 1b, · · · , 5b) and they occur in the specified sequence. Here we overload notation to use state𝑎 as the cryptographic digest of state𝑎 . Note that the commitment com of a worker shard block includes other information in addition to the corresponding state; we represent it with ellipses (· · · ) in this figure as it is not relevant for understanding the overview.

4.3 Data structures

Worker shard blocks. A certified worker block 𝐵𝑖 at height 𝑖 at shard 𝑋𝑎 consists of the following components:

𝐵𝑖 = ⟨hash𝑖−1, 𝑟𝑖 , state𝑖 , T𝑖 ⟩ (1)

Here, hash𝑖−1 is the hash of the parent worker block, 𝑟𝑖 is the height of the latest known reference block, and T𝑖 is an ordered list of intra-shard transactions. Let Q (𝑎)𝑟𝑖 be the ordered list of cross-shard transactions that involve 𝑋𝑎 and are included in reference blocks between the last commitment from 𝑋𝑎 and height 𝑟𝑖 (both inclusive). Then state𝑖 , the state at the end of 𝐵𝑖 , is the state resulting from executing Q (𝑎)𝑟𝑖 followed by the transactions in T𝑖 . When 𝐵𝑖−1 with state state𝑖−1 is the latest committed block from 𝑋𝑎 :

state𝑖 = Π(Π(state𝑖−1, Q (𝑎)𝑟𝑖 ), T𝑖 )

The certificate of a worker block is a signature from at least 𝑓 + 1 distinct replicas within the shard.
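Treating the state as a key-value map and Π as the sequential application of an ordered transaction list, the worker-shard state transition state𝑖 = Π(Π(state𝑖−1, Q (𝑎)𝑟𝑖 ), T𝑖 ) can be sketched as follows; modeling each transaction as a dict of key writes is our simplification, standing in for contract execution.

```python
def apply_txs(state, txs):
    """Pi: apply an ordered transaction list to a key-value state.
    Each transaction is modeled simply as a dict of key -> new value
    writes; real execution would run the transaction's code instead."""
    out = dict(state)
    for tx in txs:
        out.update(tx["writes"])
    return out

def next_worker_state(state_prev, q_cross, t_intra):
    """state_i = Pi(Pi(state_{i-1}, Q^(a)_{r_i}), T_i): cross-shard
    transactions from the reference chain run first, then the block's
    own intra-shard transactions."""
    return apply_txs(apply_txs(state_prev, q_cross), t_intra)
```

The nesting order matters: the cross-shard batch from the reference chain is applied before the block's intra-shard list, exactly as in the equation.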

Reference shard blocks. A reference shard block 𝑃𝑟 at height 𝑟 consists of the following components:

𝑃𝑟 = ⟨hash𝑟−1,C𝑟 ,Q𝑟 ⟩ (2)

Similar to worker blocks, hash𝑟−1 is the hash of the parent reference block, C𝑟 is a list of block commitments from (not necessarily all) worker shards, and Q𝑟 is an ordered list of new cross-shard transactions. Sometimes, we use Q𝑟,𝑠 to denote the ordered list of cross-shard transactions that appear between reference blocks at heights 𝑟 and 𝑠 (both inclusive).

We explain the contents of the block commitments, along with their purpose, and the description of cross-shard transactions in the upcoming paragraphs.

Block commitments. Worker shards generate a block commitment, denoted com, after every worker block and broadcast it to the reference shard. Block commitments in Rivet serve two purposes. First, the reference shard uses these commitments to order blocks inside a worker shard; second, a commitment from a shard also acts as a promise to every other shard that all future cross-shard transactions will be executed atop the latest committed state. Also, as some commitments can get delayed by the network, Rivet allows worker shards to certify newer blocks and directly submit the commitment of any successor of the latest committed block. Specifically, for a block 𝐵𝑖 at height 𝑖 , its commitment com𝑖 consists of a cryptographic digest of state𝑖 , the resulting state after 𝐵𝑖 , and a hash chain H𝑖 of certified block hashes starting from the last committed block up to the current block 𝐵𝑖 , i.e., com𝑖 = ⟨state𝑖 , H𝑖 ⟩. Replicas in the reference shard use this hash chain to validate that the block indeed extends the last finalized block from that worker shard. When clear from the context, we overload the notation state𝑖 to denote the cryptographic digest of state𝑖 .
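The hash-chain check performed by reference-shard replicas can be sketched as follows; representing the chain as explicit (parent, block) hash pairs is our modeling choice, since the reference shard sees only hashes, not the worker blocks themselves.

```python
def valid_hash_chain(hash_chain, last_committed_hash):
    """hash_chain: list of (parent_hash, block_hash) links, one per
    certified worker block since the last committed one, oldest first.
    The chain is valid iff it starts at the last committed block and
    every link's parent equals the previous link's block hash."""
    prev = last_committed_hash
    for parent_hash, blk_hash in hash_chain:
        if parent_hash != prev:
            return False
        prev = blk_hash
    return True
```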

Intra-shard transaction. Intra-shard transactions in Rivet specify the identity of the function they wish to invoke and the appropriate function parameters. Creators of intra-shard transactions send these transactions to replicas of the worker shard storing the state required for their execution. The worker shard replicas then gossip the transactions among themselves and include them in the next available worker block.

Cross-shard transaction. In addition to the information specified in intra-shard transactions, every cross-shard transaction also specifies its potential read-write set in its description. The creator of a cross-shard transaction uses ideas akin to Optimistic Lock Location Prediction (OLLP) [36] to generate the read-write set. We include the read-write set in the description of a cross-shard transaction to indicate the subset of shards necessary for executing the transaction, along with the keys these shards need to exchange for its execution. Also, unlike intra-shard transactions, the creator of a cross-shard transaction sends the transaction directly to at least one honest replica of the reference shard. These replicas then gossip the transactions among themselves and include them in new reference blocks, as described in the next section.

4.4 Reference Shard Protocol

Replicas in the reference shard run a standard consensus protocol, such as PBFT [11] or HotStuff [39], to finalize newly proposed blocks and append them to the reference chain. For concreteness, we use HotStuff [39] as the underlying consensus protocol in the reference shard. In this section, we primarily focus on the rules for proposing a new block, as we use the remaining parts of the HotStuff protocol as they are.

As in HotStuff, we use views with one leader per view. In every view, the leader of that view is responsible for driving consensus on new blocks. Let 𝐿 be the leader of the current view. To propose a new reference block 𝑃𝑟 at height 𝑟 , 𝐿 includes a subset of valid block commitments and cross-shard transactions. In 𝑃𝑟 , the state commitments of worker shards are ordered before the cross-shard transactions, and they are chosen as follows.

Let 𝑋 be a shard whose latest commitment appearing in the reference chain up to the parent block of the proposal 𝑃𝑟 is com𝑙 , with state state𝑙 for block 𝐵𝑙 . Let com𝑙 appear in reference block 𝑃𝑠 . Then, a new commitment com𝑗 = ⟨state𝑗 , H𝑗 ⟩ from 𝑋 for a block 𝐵𝑗 reporting a reference block at height 𝑢 is valid if and only if:

(1) Each worker shard block whose hash appears in the hash chain H𝑗 has been signed by at least 𝑓 + 1 distinct replicas in 𝑋 ; and


Figure 6: Illustration of valid (com𝑗 ) and invalid (com𝑖 ) block commitments from worker shard 𝑋 available at a reference chain block proposer 𝐿 prior to its new proposal 𝑃𝑟 . Here, com𝑙 is the latest known block commitment from 𝑋 that has been included in reference block 𝑃𝑠 . 𝑃𝑡 is the latest reference block that includes a cross-shard transaction (ctx2) involving 𝑋 .

(2) The block 𝐵𝑗 extends the latest committed block 𝐵𝑙 of the shard; 𝐿 validates this using the hash chain H𝑗 included in the commitment com𝑗 .

(3) No cross-shard transaction involving 𝑋 appears after the reference block at height 𝑢 reported in com𝑗 , up until the parent block of 𝑃𝑟 .

For example, in Figure 6, 𝑃𝑟 at height 𝑟 is the block 𝐿 wants to propose and 𝑃𝑠 at height 𝑠 is the reference block that includes the latest commitment com𝑙 of shard 𝑋 . Also, let 𝑃𝑡 at height 𝑡 be the last reference block that includes a cross-shard transaction involving 𝑋 . Let com𝑖 and com𝑗 be two newly available commitments from 𝑋 . The commitment com𝑖 is invalid, as it violates the third condition above: the reference height reported in com𝑖 is less than 𝑡 , and 𝑃𝑡 includes ctx2, a cross-shard transaction involving 𝑋 . On the other hand, assuming H𝑗 is a valid hash chain, com𝑗 is a valid state commitment since 𝑢 ≥ 𝑡 . We summarize the procedure for validating state commitments in Algorithm 1.

A cross-shard transaction ctx in 𝐿's transaction pool is considered valid if and only if ctx does not intend to read a key that some preceding cross-shard transaction already intends to write. Let Xctx = {𝑋1, 𝑋2, · · · , 𝑋𝑚} be the subset of shards involved in executing ctx. Let R(𝑎)ctx be the set of keys from shard 𝑋𝑎 ∈ Xctx that ctx mentions in its read set. Also, let com𝑎 be the latest block commitment by shard 𝑋𝑎 , and let W(𝑎) be the set of keys mentioned in the write sets of cross-shard transactions included after the commitment com𝑎 . Then, ctx is considered valid if and only if,

R(𝑎)ctx ∩W(𝑎) = ∅, ∀𝑋𝑎 ∈ Xctx (3)

Stated differently, this validation check ensures that every cross-shard transaction reads only keys that have not been written to by any other cross-shard transaction since the last commitment. Refer to Algorithm 1 for precise details. This is important, as it enables replicas in a worker shard to prove and validate the correctness of the data they exchange during execution of cross-shard transactions (see §4.6).
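The leader-side validity check of Equation (3) amounts to a disjointness test between a transaction's per-shard read set and the accumulated write locks; the greedy selection loop below is our sketch of how a proposer might apply it (Algorithm 1 in the paper is the authoritative version).

```python
def is_valid_ctx(read_set, write_locks):
    """Eq. (3): ctx is valid iff, for every involved shard, its read
    set is disjoint from the keys written by cross-shard transactions
    included since that shard's latest commitment."""
    return all(keys.isdisjoint(write_locks.get(shard, set()))
               for shard, keys in read_set.items())

def select_valid(pool, write_locks):
    """Greedy inclusion by the leader: include each valid ctx and fold
    its write set into the accumulated per-shard write locks."""
    included = []
    for ctx in pool:
        if is_valid_ctx(ctx["reads"], write_locks):
            included.append(ctx)
            for shard, keys in ctx["writes"].items():
                write_locks.setdefault(shard, set()).update(keys)
    return included
```

This mirrors the Figure 7 walk-through: a transaction reading a key already write-locked by an earlier cross-shard transaction is rejected, while conflict-free transactions are included and extend the lock set.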

Figure 7 provides an illustration of how the proposer 𝐿 validates the cross-shard transactions it observes before including them in its next proposal. Let 𝑃𝑟−1 be the latest reference block with cross-shard transactions ctx1, ctx2 known to 𝐿. Also, let com𝑎1 and com𝑏1 be the latest commitments from shards 𝑋𝑎 and 𝑋𝑏 respectively. Say 𝐿 wants to propose the next block 𝑃𝑟 . Let the keys key(𝑎)𝑖 and key(𝑏)𝑖 denote the states maintained by shards 𝑋𝑎 and 𝑋𝑏 respectively, and let 𝑅ctx and 𝑊ctx denote the read-write set mentioned in the description of a cross-shard transaction ctx. Lastly, assume that 𝐿 has already included com𝑎1 , ctx3, and ctx4 in the 𝑃𝑟 it has created so far.

Figure 7: Illustration of valid (ctx5, ctx6) and invalid (ctx7, ctx8) cross-shard transactions in the transaction pool of the current reference chain leader 𝐿, as validated by 𝐿 before including them in the next reference block 𝑃𝑟 . Transactions shaded in blue are valid and transactions shaded in red are invalid.

Now, among the remaining transactions in 𝐿's transaction pool, i.e., {ctx5, ctx6, ctx7, ctx8} in our example, ctx7 and ctx8 cannot be included in 𝑃𝑟 : ctx7 aims to read from key(𝑏)1, on which ctx1 already holds a write-lock, and ctx8 aims to read from keys key(𝑏)3 and key(𝑏)4, which are in the write sets of ctx3 and ctx4 respectively. On the contrary, ctx5 and ctx6 do not have any read-write conflicts with the cross-shard transactions included so far.

4.5 Worker Shard Protocol

Although the protocol for a worker shard is not a consensus protocol, we borrow ideas from the popular leader-based paradigm in consensus protocols. In particular, similar to the reference shard protocol, the worker shard protocol proceeds in views. Views are numbered by monotonically increasing integers, and for each view 𝑣, one worker replica, namely the replica with identity 𝑣 mod 𝑛, serves as the leader of the view. The leader is responsible for proposing new worker blocks, getting them certified by the worker shard, and submitting them to be finalized in the reference chain.

Similar to HotStuff [39], we use a rotating-leader approach for worker shards, i.e., views are incremented after every fixed interval and the appropriate node is chosen as the leader of the new view. By doing so, we obviate the need for the explicit leader-replacement protocol required by protocols such as PBFT [11].

Next, we describe the detailed protocol within a view; it is summarized in Figure 8. Within each view the protocol has two phases: a proposal phase and a certification phase.

The proposal phase. The leader 𝐿 of the current view in a worker shard 𝑋 proposes a new block 𝐵𝑖 at height 𝑖 by broadcasting a propose message to the other replicas within the shard. Recall that each new proposal reports the state that results from executing all the cross-shard transactions (if any) known to the leader, followed by some intra-shard transactions. All these transactions are executed atop the last committed state of the shard.

When the latest known worker block 𝐵𝑖−1 of shard 𝑋 is already committed, 𝐿 extends it by first executing the cross-shard transactions that appear since the commitment of 𝐵𝑖−1 atop the committed state, and then executing some intra-shard transactions. Alternatively, when the latest known block 𝐵𝑖−1 is not yet committed, 𝐿 extends 𝐵𝑖−1 if and only if no cross-shard transactions appear since the reference block 𝑃𝑟𝑖−1 reported in 𝐵𝑖−1. Otherwise, 𝐿 proposes a new block atop the latest committed block, say 𝐵𝑗 , from shard 𝑋 , after executing all the cross-shard transactions known since the last commitment.

In every view 𝑣:
(1) Propose. The leader of view 𝑣, 𝐿, creates a new block 𝐵𝑖 at height 𝑖 following Algorithm 3 and broadcasts it to all replicas within the shard.
(2) Certification. On hearing the proposal 𝐵𝑖 , non-leader replicas validate 𝐵𝑖 by running Algorithm 4. On successful validation, a replica signs 𝐵𝑖 and sends the signature to 𝐿. 𝑓 + 1 distinct valid signatures on 𝐵𝑖 form a certificate for 𝐵𝑖 . Once 𝐵𝑖 is certified, 𝐿 creates its commitment com𝑖 and broadcasts it to the reference shard.

Figure 8: Summary of the worker shard protocol in a view 𝑣.

Figure 9 illustrates this through an example where 𝐿 proposes the new block atop 𝐵𝑖−1 only if Q𝑟𝑖−1,𝑟𝑖 is empty, i.e., no new cross-shard transactions involving 𝑋 appear after reference block 𝑃𝑟𝑖−1 . Otherwise, 𝐿 proposes the next worker block atop the latest committed block 𝐵𝑗 from the worker shard, after executing all transactions in Q𝑟𝑗 ,𝑟𝑖 . Recall that Q𝑟𝑗 ,𝑟𝑖 denotes the set of relevant cross-shard transactions included in reference blocks since the inclusion of com𝑗 in reference block 𝑃𝑟𝑗 . We summarize this in Algorithm 3.

Figure 9: Illustration of the protocol followed by an honest proposer 𝐿 in a worker shard to propose a new block. Here 𝐵𝑗 with state state𝑗 is the latest committed block, the commitment com𝑗 for block 𝐵𝑗 is included in reference block 𝑃𝑟𝑗 , and 𝑃𝑟𝑖 is the latest reference block known to 𝐿.
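The parent-selection rule an honest leader follows (the core of Algorithm 3, whose full text is not reproduced here) can be condensed to the following sketch; the function and argument names are ours.

```python
def choose_parent(latest_certified, latest_committed, pending_ctxs):
    """latest_certified: the newest certified worker block known to L.
    latest_committed: the newest block whose commitment is finalized
    on the reference chain. pending_ctxs: cross-shard transactions
    involving this shard included after the reference block reported
    by latest_certified. Returns the block the new proposal extends."""
    if latest_certified == latest_committed:
        return latest_committed   # tip is committed: extend it
    if not pending_ctxs:
        return latest_certified   # optimistically extend uncommitted tip
    return latest_committed       # conflicts: uncommitted blocks are discarded
```

The third branch is where a worker shard discards certified-but-uncommitted blocks, the event whose frequency is reported in §6.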

The certification phase. Each honest replica 𝑛, upon receiving the proposal 𝐵𝑖 , replies with a vote message if the replica is in the same view as the proposal and the proposal is valid. A valid proposal satisfies the following properties, which we summarize in Algorithm 4.

(1) 𝐵𝑖 extends 𝐵𝑙 , the latest committed block known to 𝑛, and the reference block known to 𝑛 is at a height greater than or equal to that of the reference block mentioned in 𝐵𝑖 .

(2) The state mentioned in the proposal satisfies the properties of an honest proposal mentioned earlier.

On receiving valid signatures from 𝑓 + 1 distinct replicas, the leader 𝐿 aggregates them into a certificate and sends the block commitment com𝑖 for 𝐵𝑖 to the reference shard. It is then the responsibility of the replicas in the reference shard to include com𝑖 in the next available reference block. As mentioned earlier, once the commitment com𝑖 , or a commitment of one of its successor blocks, is included in the reference chain, the worker block 𝐵𝑖 is finalized.

Informally, since there are at most 𝑓 Byzantine replicas in each worker shard, 𝑓 + 1 distinct signatures on 𝐵𝑖 imply that at least one honest replica has validated 𝐵𝑖 per Algorithm 4 and checked that 𝐵𝑖 satisfies all the requirements. This ensures that every certified block adheres to the protocol specification. Similarly, since there are at least 𝑓 + 1 honest replicas in every worker shard, requiring only 𝑓 + 1 signatures ensures that an honest leader can successfully create the required certificate with the help of honest replicas alone. We argue this formally in the analysis (§5).
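The quorum-intersection argument above can be checked numerically; this tiny sketch (names ours) just encodes the two inequalities for a worker shard of 𝑛 = 2𝑓 + 1 replicas.

```python
def worker_shard_quorums(f):
    """n = 2f+1 replicas per worker shard; a certificate needs f+1
    signatures. Any certificate then contains at least one honest
    signer, and the >= f+1 honest replicas can certify on their own."""
    n = 2 * f + 1
    cert = f + 1
    honest_min = n - f
    assert cert > f            # every certificate intersects the honest set
    assert honest_min >= cert  # honest replicas alone suffice to certify
    return n, cert
```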

All worker shard replicas then enter the next view after a pre-specified time interval, and the cycle continues.

4.6 Execution of Cross-shard Transaction

Once a cross-shard transaction 𝑞 appears in the reference chain, the shards involved in its execution, X𝑞 = {𝑋1, · · · , 𝑋𝑚}, exchange the values corresponding to the keys mentioned in the read set of 𝑞. Specifically, replicas within every shard 𝑋 ∈ X𝑞 request from replicas of the remaining shards the committed values of the addresses mentioned in the read set of 𝑞 that are not maintained by 𝑋 . On receiving responses from shards in X𝑞 , each replica validates the received values against the appropriate state commitments. Upon successful validation, the proposer of the next worker block executes the cross-shard transactions in the order they appear in the reference block.

To avoid a data download for every cross-shard transaction, Rivet batches cross-shard transactions and sends a single download request for all the keys used by all the transactions in one reference block. Also, during execution, each shard updates its local state whenever a transaction writes to keys maintained by the local shard.

A few subtleties arise in this process. First, every shard should be aware of the description of every function that gets executed as part of a cross-shard transaction. For example, if a cross-shard transaction executes functions from two different shards, both shards should be aware of the function descriptions. Rivet addresses this by tagging each smart contract as global or local. Every shard stores the descriptions of all global contracts. Rivet uses cross-shard transactions to create global contracts, and cross-shard transactions in Rivet can only invoke functions of global contracts. Local smart contracts are created using intra-shard transactions and only accept intra-shard transactions. Although this may appear to result in considerable overhead, the overhead can be avoided through better programming practices. As illustrated in [18, 22], most of the contracts in Ethereum are copies of each other. Hence, a better programming practice would be to create standard global libraries for common functionalities such as ERC20 tokens and exchanges.

The second subtlety arises from a potential mismatch between the read-write set mentioned in the description of a transaction and the read-write set actually accessed by the transaction during its execution within the worker shards. In such scenarios, each replica aborts execution of the transaction, reverts all the changes caused by its execution so far, and proceeds to the next cross-shard transaction.
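The abort-and-revert handling can be sketched as executing against a scratch copy of the state and committing only on a match; the `execute` callback returning the set of touched keys is our stand-in for instrumented contract execution.

```python
def execute_with_declared_rw(state, declared_keys, execute):
    """Run `execute` (a stand-in for contract execution) on a scratch
    copy of the state; it returns the set of keys it actually touched.
    Commit the copy only if every touched key was declared in the
    transaction's read-write set; otherwise abort, reverting all
    changes by keeping the original state."""
    scratch = dict(state)
    touched = execute(scratch)
    if touched <= declared_keys:
        return scratch, True    # committed
    return state, False         # aborted: all changes reverted
```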

5 ANALYSIS

In this section, we provide a proof sketch of the safety and liveness guarantees of Rivet. The detailed proofs are deferred to Appendix 9.


We then analyze the performance of Rivet and compare it against the 2PC-based approach.

5.1 Safety and Proof Sketch of Liveness

Safety of Rivet follows directly from the safety of the Byzantine fault-tolerant consensus protocol used in the reference shard. This holds even during periods of asynchrony, as a partially synchronous consensus algorithm is safe under asynchrony. To elaborate, the consensus algorithm in the reference shard provides a global order for all transactions, the intra-shard ones as well as the cross-shard ones. Each transaction is associated with a unique reference block that finalizes it in the reference chain. Transactions are hence ordered first by their heights in the reference chain, and then by their positions inside the reference block.

Besides an agreed-upon total order, Rivet also ensures that every worker block commitment finalized in the reference chain represents a valid state. The reference shard ensures that at most one worker block at any given height from a worker shard gets finalized, and that it extends a previously finalized worker block from that shard.

We next show that Rivet makes progress during periods of synchrony, i.e., when messages between honest replicas get delivered within a bounded delay of Δ. Specifically, during periods of synchrony, the reference chain and each worker chain make progress. To see this, consider a shard 𝑋 , let 𝑃𝑟 be the latest reference block, and let 𝐿𝑟 be the honest leader of the next view. Let 𝑡 be the time instant when 𝑃𝑟 is created. Then by time 𝑡 + Δ, 𝐿𝑟 will know about 𝑃𝑟 . Hence, every honest replica of shard 𝑋 , including the proposer of the next block 𝐿, will be aware of the block 𝑃𝑟 by time 𝑡 + 2Δ. Also, since each replica of every shard is connected with at least one honest replica of every other shard, by time 𝑡 + 4Δ every honest replica of shard 𝑋 will have the state required to execute the cross-shard transactions in blocks up to 𝑃𝑟 .

Hence, when 𝐿 proposes the next block at time 𝑡 + 4Δ, every honest replica will respond immediately with its signature. Thus, by time 𝑡 + 6Δ, 𝐿 will collect a certificate for its proposal, and by time 𝑡 + 7Δ, the block commitment will reach an honest replica of the reference shard. Also, by time 𝑡 + 8Δ it will reach the leader of the reference shard. This implies that the commitment of the worker block, or of one of its successors, will appear in the next reference block. Refer to Appendix 9 for the detailed proof.

5.2 Performance Analysis

It is easy to see that in both Rivet and the 2PC-based approach, each worker shard only stores a subset of the entire state. Hence, both approaches achieve state sharding. Also, each worker shard validates only a subset of all intra-shard transactions, plus the cross-shard transactions it is a part of. Hence, both protocols achieve computation sharding. Moreover, since the reference shard runs a standard BFT consensus protocol, the communication complexity of finalizing a reference block is the same as that of the underlying consensus protocol; e.g., HotStuff only requires linear communication.

Contrary to reference blocks, finalization of a worker block 𝐵𝑖 in Rivet involves two steps: certification of 𝐵𝑖 and finalization of the reference block that includes the commitment com𝑖 of 𝐵𝑖. It is easy to see from §4.5 that certification of every worker block involves only one round of communication: the leader broadcasts a new proposal to each replica, and the replicas respond with their signatures. Since both these steps have linear communication costs, the overall block certification protocol has linear cost as well. Hence, assuming a linear consensus protocol in the reference chain, the overall communication cost of finalizing a worker block is also linear. An important point to note is that each reference block will potentially include numerous block commitments and cross-shard transactions simultaneously, so the communication overhead is amortized.

Confirmation latency of transactions. A cross-shard transaction ctx is finalized as soon as ctx gets included in a reference block. The state atop which ctx should be executed has also been finalized by then. The only thing remaining is to get the actual execution result, i.e., the resulting state modification due to executing ctx. Since every shard involved executes ctx deterministically atop an identical state, this execution result becomes available as soon as one worker block containing ctx gets certified. It does not matter which participating worker shard does so first. Hence, the confirmation latency of a cross-shard transaction is measured as the time elapsed from its inclusion in the reference chain until the first worker block containing it gets certified.

We measure the confirmation latency of an intra-shard transaction as the time elapsed between its inclusion in a worker block and the finalization of that worker block. Note that in 2PC, worker blocks are finalized immediately, and so are the intra-shard transactions in them.
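The two latency definitions above can be written down directly. The sketch below is illustrative (the function names and timestamp representation are ours, not part of the Rivet implementation):

```python
def cross_shard_latency(ref_inclusion_time, worker_cert_times):
    """Confirmation latency of a cross-shard transaction in Rivet:
    time from its inclusion in the reference chain until the FIRST
    worker block containing it is certified. Any participating shard
    suffices, since all shards execute it deterministically atop an
    identical state."""
    return min(worker_cert_times) - ref_inclusion_time

def intra_shard_latency(worker_inclusion_time, worker_finalization_time):
    """Confirmation latency of an intra-shard transaction in Rivet:
    time from its inclusion in a worker block until that worker block
    is finalized (its commitment is finalized in the reference chain)."""
    return worker_finalization_time - worker_inclusion_time
```

For example, if a cross-shard transaction is included in the reference chain at time 10 and the participating shards certify containing worker blocks at times 17, 14, and 21, its confirmation latency is 4 time units.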

6 IMPLEMENTATION & EVALUATION

We implement Rivet and 2PC atop the open-source Quorum client version 2.4.0 [6]. Quorum is a fork of the Ethereum Go client; it inherits Ethereum's smart contract execution platform and implements a permissioned consensus protocol based on the Istanbul BFT (IBFT) and Tendermint [10] consensus algorithms.⁴ For Rivet, we use the IBFT implementation for the reference shard, and we implement the protocol described in §4.5 for each worker shard. For 2PC, we use the IBFT implementation for all shards.

Given the exploding number of sharding proposals [9, 13, 14, 23, 26, 37, 40], it is difficult to replicate each of their unique (and vastly different) parameter settings and system models. Since they all adopt the 2PC paradigm [31], we believe comparing with 2PC in our experimental setup best illustrates the benefits and trade-offs of Rivet. Existing 2PC-based approaches primarily focus on the UTXO model or other specialized computation models (cf. §7), so we need to implement additional support to extend 2PC to a generic smart contract model. We describe the implementation details of 2PC with generic computation in §6.1.

6.1 2PC Implementation Details

A coordinator shard manages all cross-shard transactions [14]. We refer to the blockchain maintained by the coordinator shard as the coordinator chain. Users send cross-shard transactions along with their potential read-write sets to the coordinator shard. The leader of the coordinator shard validates these transactions for read-write

⁴Saltini and Hyland-Wood [32] discuss a liveness bug in the original design of IBFT. The bug has since been fixed [33].



conflicts and, on successful validation, proposes them for inclusion in the coordinator chain. Every cross-shard transaction, upon its inclusion in the coordinator chain, acquires an explicit lock on the set of keys in its read-write set. Similar to Rivet, the coordinator shard includes a new cross-shard transaction only if the transaction does not conflict with any of the pending cross-shard transactions.

Worker shards monitor the coordinator chain for new cross-shard transactions. Upon noticing a new cross-shard transaction ctx, involved worker shards commit to a state of the keys mentioned in the read-set of ctx. Each commitment also carries a proof generated by running consensus within the worker shard. Once the worker proposal is finalized, every replica locks the keys mentioned in the read-write set of ctx against any other conflicting transaction until it executes ctx. Once commitments from all involved shards appear in the coordinator chain, these shards follow the same procedure as Rivet for data fetching and transaction execution. Upon execution, worker shard replicas unlock the keys in ctx and send an acknowledgment message to the coordinator shard; at this point, the keys become accessible to future intra-shard transactions.
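The coordinator's conflict check described above can be sketched as a small lock table keyed by read-write sets. This is a hypothetical illustration of the mechanism, not the actual Quorum-based implementation; the class and method names are ours:

```python
class Coordinator:
    """Sketch of 2PC coordinator-side locking: a new cross-shard
    transaction is included only if its read-write set is disjoint
    from the keys locked by pending cross-shard transactions."""

    def __init__(self):
        self.locked = {}  # key -> txid currently holding the lock

    def try_include(self, txid, rw_set):
        # Reject on any conflict with a pending cross-shard transaction.
        if any(key in self.locked for key in rw_set):
            return False
        # Otherwise acquire explicit locks on all keys in the read-write set.
        for key in rw_set:
            self.locked[key] = txid
        return True

    def ack(self, txid):
        # On acknowledgment (after execution by the worker shards),
        # release the locks so future transactions can access the keys.
        self.locked = {k: t for k, t in self.locked.items() if t != txid}
```

A conflicting transaction stays blocked until the acknowledgment releases the locks, which is precisely the source of the extra latency and lower inclusion rate measured in §6.3.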

6.2 Experimental Setup

Our experimental setup consists of six worker shards and one reference shard. Each shard tolerates 𝑓 = 3 Byzantine faults. Thus, each worker shard in Rivet consists of 7 nodes (2𝑓 + 1) and the reference shard consists of 10 nodes (3𝑓 + 1). Every shard in 2PC consists of 10 nodes (3𝑓 + 1). We run all nodes on Amazon Web Services (AWS) t3a.medium virtual machines (VMs) with one node per VM. Each VM has 2 vCPUs, 4 GB RAM, and 5 Gbps network bandwidth. The operating system is Ubuntu 18.04 and the Golang compiler version is 1.13.6.
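The replica counts above follow directly from the fault thresholds: Rivet's worker shards avoid intra-shard consensus and need only 2𝑓 + 1 replicas, while any BFT-consensus shard needs 3𝑓 + 1. A one-line check (illustrative; the function name is ours):

```python
def shard_sizes(f):
    """Replica counts per shard for f Byzantine faults: Rivet worker
    shards need 2f+1 replicas (no intra-shard consensus), while the
    reference shard -- and every shard in 2PC -- runs BFT consensus
    and needs 3f+1 replicas."""
    return {"rivet_worker": 2 * f + 1, "bft_shard": 3 * f + 1}
```

With 𝑓 = 3 this gives the 7-node worker shards and 10-node BFT shards used in the experiments.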

Node and network topology. We create an overlay network among nodes with the following connectivity. Nodes within a shard are pairwise connected, i.e., form a complete graph. In addition, each node is connected to 𝑓 + 1 randomly chosen nodes from every other shard. We mimic a setting where each node is placed in one of 10 geographical locations across different continents. Instead of placing nodes physically there, we use the measured ping latency [1] for every pair of locations and use the Linux tc tool to insert the corresponding delay into every message. We maintain the same network topology and network latency for all our experiments.
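One simple way to turn a measured ping matrix into tc delays is to halve each round-trip time, assuming symmetric links. The sketch below is illustrative; the RTT values and helper names are hypothetical, and the emitted command is a basic `tc netem` invocation rather than the paper's exact configuration:

```python
# Hypothetical RTT matrix (ms) between three locations; the paper uses
# measured pings from wondernetwork.com [1] across 10 locations.
rtt_ms = {("NV", "LDN"): 72.0, ("NV", "SYD"): 198.0, ("LDN", "SYD"): 270.0}

def one_way_delay(a, b):
    """Half the measured RTT approximates the one-way delay to inject
    per message, assuming symmetric links (an assumption here)."""
    key = (a, b) if (a, b) in rtt_ms else (b, a)
    return rtt_ms[key] / 2.0

def netem_command(dev, a, b):
    # Illustrative Linux tc invocation adding a fixed delay on interface `dev`.
    return f"tc qdisc add dev {dev} root netem delay {one_way_delay(a, b):.0f}ms"
```

For instance, a 72 ms RTT between two locations becomes a 36 ms injected delay on each message in that direction.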

Evaluation methodology. We run both Rivet and 2PC for approximately 50 reference and coordinator blocks after an initial stabilization period. Every worker shard in both Rivet and 2PC generates blocks every I𝑤 = 5 seconds. We set the block interval of the reference chain and the coordinator chain to I𝑟 = 10 seconds. We test both designs using the benchmark we created in §3, using Ethereum transactions from 6000 blocks starting at block height 7.39 M. This trace comprises ~14,000 cross-shard transactions. To facilitate this evaluation, we initialize each shard with the code and state of the relevant smart contracts. In all our experiments, we broadcast a new batch of cross-shard transactions of fixed size after every reference or coordinator block. We refer to this batch size as the cross-shard input rate and test both designs with cross-shard input rates of 100, 200, and 300.

6.3 Experimental Results

Confirmation latency. Figure 10 gives the average confirmation latency of cross-shard transactions in Rivet and 2PC under varying cross-shard input rates. The confirmation latency of a cross-shard transaction is the time elapsed from the first time a reference shard replica attempts to include the transaction in a block until it is executed by one of the participating worker shards. We further divide this latency into two parts: wait latency and execution latency. The wait latency is the time elapsed between the first attempt to include the transaction in a reference block and its commitment in the reference blockchain. The execution latency is the time elapsed from the transaction's commitment in the reference blockchain until its execution.

The wait latency is similar for Rivet and 2PC, but Rivet has a shorter execution latency. The reason is that the cross-shard execution latency of Rivet depends only on the worker shard block generation interval I𝑤. In contrast, the cross-shard execution latency in 2PC requires at least one additional reference block, i.e., it is approximately I𝑟 + I𝑤.
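The back-of-the-envelope comparison above can be made concrete with the experiment's parameters (illustrative sketch; the function name is ours):

```python
def expected_execution_latency(i_w, i_r, design):
    """Rough cross-shard execution latency per the argument in §6.3:
    Rivet waits only on the next worker block (interval i_w), whereas
    2PC additionally waits for at least one more coordinator/reference
    block (interval i_r)."""
    return i_w if design == "rivet" else i_r + i_w
```

With I𝑤 = 5 and I𝑟 = 10 seconds as in the experiments, this predicts roughly 5 seconds for Rivet versus roughly 15 seconds for 2PC, consistent with the gap in Figure 10.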

We then turn to the confirmation latency of intra-shard transactions, shown in Figure 11. In Rivet, the latency comes from the fact that worker blocks are finalized only when their commitments are finalized in the reference chain. In 2PC, the latency is a result of locking: if an intra-shard transaction needs to read or write a locked account, it is delayed until the lock is released. In our benchmark, most intra-shard transactions do not conflict with locked cross-shard assets, so this latency is insignificant. After an intra-shard transaction is executed in 2PC, it is finalized very quickly by the worker shard consensus protocol. Hence, Rivet has worse intra-shard latency than 2PC, and this is the major trade-off in Rivet.

Transaction throughput. For a given cross-shard input rate, we measure the throughput of cross-shard transactions as the ratio of the average number of cross-shard transactions included per reference block to the cross-shard input rate. Similarly, for intra-shard transactions, we measure throughput as the ratio (averaged over shards) of the total number of intra-shard transactions included in the worker shards to the total number of intra-shard transactions fired during the experiment.
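The two throughput metrics just defined can be computed as follows (illustrative sketch; the function names and sample numbers are ours):

```python
def cross_shard_throughput(txs_per_ref_block, input_rate):
    """Cross-shard throughput: average number of cross-shard
    transactions included per reference block, normalized by the
    cross-shard input rate."""
    avg_included = sum(txs_per_ref_block) / len(txs_per_ref_block)
    return avg_included / input_rate

def intra_shard_throughput(included_per_shard, fired_per_shard):
    """Intra-shard throughput: per-shard ratio of included to fired
    intra-shard transactions, averaged over shards."""
    ratios = [inc / fired for inc, fired in zip(included_per_shard, fired_per_shard)]
    return sum(ratios) / len(ratios)
```

For example, if reference blocks include 80, 70, and 90 cross-shard transactions at an input rate of 100, the cross-shard throughput is 0.8.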

Figure 12 illustrates the cross-shard throughput of Rivet and 2PC under varying cross-shard input rates. Recall from §4.4 and §6.1 that the reference and coordinator chains include only non-conflicting transactions, and hence not all cross-shard transactions can be immediately included in the reference chain. In all our experiments with Rivet, we observe that the cross-shard output rate is greater than 75% of the cross-shard input rate, whereas the cross-shard output rate of 2PC is approximately 60% of the corresponding input rate. The reason is that 2PC holds locks on certain accounts for at least one intermediate coordinator block, which results in more conflicts during the inclusion of newer cross-shard transactions. In Rivet, on the other hand, conflicts can be resolved before the next reference block. Lastly, as anticipated, the absolute cross-shard output rate increases linearly with the cross-shard input rate, as the proposer of the reference chain in Rivet (coordinator chain in 2PC) has a larger number of transactions to choose from for each new reference block.



Figure 10: Average confirmation latency of cross-shard transactions with I𝑤 = 5 seconds and cross-shard input rates of 100, 200, and 300. (Latency in seconds vs. cross-shard input rate; series: Rivet Execution, Rivet Wait, 2PC Execution, 2PC Wait.)

Figure 11: Average confirmation latency of intra-shard transactions with I𝑤 = 5 seconds and cross-shard input rates of 100, 200, and 300. (Latency in seconds vs. cross-shard input rate; series: Rivet, 2PC.)

Figure 12: Average cross-shard transaction throughput for I𝑤 = 5 seconds and cross-shard input rates of 100, 200, and 300. (Cross-shard throughput vs. cross-shard input rate; series: Rivet, 2PC.)

Figure 13: Average intra-shard transaction throughput with I𝑤 = 5 seconds and cross-shard input rates of 100, 200, and 300. (Intra-shard throughput vs. cross-shard input rate; series: Rivet, 2PC.)

Figure 13 illustrates the intra-shard throughput of both Rivet and 2PC for varying cross-shard input rates. As anticipated, in Rivet almost all available intra-shard transactions are included in every worker block in every shard. The slightly lower throughput of 2PC is due to conflicts between locked cross-shard transactions and available intra-shard transactions. As described earlier, since the number of such conflicting intra-shard transactions is very small compared to the total number of intra-shard transactions, the reduction in throughput in 2PC is barely noticeable. In conclusion, for our benchmark, both 2PC and Rivet achieve almost optimal throughput for intra-shard transactions.

Other findings (not shown). In addition to the above results, we observe some other findings that are consistent across our experiments. More than 99% of the state commitments in Rivet are included in the immediate successor reference block. Furthermore, we observe at most one re-organization per shard during the entire duration of the protocol. This very low re-organization rate is a consequence of our dynamic re-scheduling of the block-proposal time instant in a way that enables future leaders to commit their blocks with higher probability. These properties ensure that the intra-shard latency in Rivet is less than one reference block interval. Similarly, almost all commit messages in 2PC are also included in the immediate successor coordinator block. Each node successfully downloads the data required for a cross-shard transaction within the first two seconds of hearing about the transaction.

7 RELATED WORK

Rivet is partially inspired by the approach of Deterministic Transaction Execution (DTE) in distributed databases [36]. In DTE, all servers (shards in our case) first agree on an ordered list of transactions and then deterministically execute them in the agreed order. Abadi et al. [8] give a great overview of the recent progress and improvements of DTE. DTE can be made to avoid single points of failure by replicating each server across multiple replicas using a crash fault-tolerant consensus protocol such as Paxos [24]. At some level, Rivet can be viewed as a method to make DTE Byzantine fault tolerant. But Rivet also differs from fault-tolerant DTE in two major ways. First, Rivet tolerates Byzantine failures without using any consensus algorithm within the worker shards. Second, DTE globally orders all transactions before executing them; in contrast, in Rivet, cross-shard transactions are ordered before being executed, whereas intra-shard transactions are optimistically executed before being ordered.

Blockchain sharding. Previous blockchain sharding proposals primarily focus on increasing the overall throughput of the entire system, with minimal emphasis on characterizing and handling cross-shard transactions [9, 13, 14, 26, 37, 40]. As summarized in [31], almost all prior works use minor variants of 2PC for cross-shard transactions.

RS-Coin [13] and Omniledger [23] are client-driven sharded systems in the UTXO model where cross-shard transactions are executed using 2PC. RS-Coin is a permissioned system, whereas Omniledger considers a permissionless model. Chainspace [9] also uses a variant of 2PC for cross-shard transactions, where it substitutes the client with an inter-shard consensus protocol called S-BAC. RapidChain [40] also considers a UTXO-based model where cross-shard transactions are replaced by dummy transactions at every



participating shard. These dummy transactions maintain the semantic properties of the original cross-shard transactions. To execute a cross-shard transaction, the shards involved run the 2PC protocol, with every output shard playing the role of the 2PC transaction coordinator and the input shards acting as servers.

Monoxide [37] partitions its participants into shards (zones), where nodes in each zone run PoW. Monoxide also adopts a UTXO-based data model and runs 2PC for cross-shard transactions. Cross-shard transactions are executed in the initiator shards, and the proofs are then sent to the receiver shards. Since Monoxide uses PoW, a receiver shard needs to wait for a long duration before it can confidently use the certificates from an initiator shard. Cross-shard transactions in [14] use two-phase locking (2PL) and 2PC to achieve atomicity and isolation. To defend against attacks from clients who can lock up shared resources for long periods, they replace clients with a distributed committee. They demonstrate that RapidChain does not achieve atomicity in the non-UTXO model.

State partitioning. Our partitioning technique shares similarities with Schism [12], a database partitioning system for distributed databases. Schism models the database as a graph, where a vertex denotes a single record/tuple and an edge connects two records if they are accessed by the same transaction. A recent work, OptChain [29], improves the placement of transactions in a sharded blockchain to reduce the fraction of cross-shard transactions. In contrast to our graph representation, OptChain models transactions as nodes and transaction dependencies as edges. It deals only with the UTXO model. It also places more emphasis on temporal balancing, where the number of nodes in each shard must be the same at all times. A concurrent work [34] focuses on increasing throughput by creating individual shards for transactions that solely access one particular contract and a single shard for transactions that access multiple contracts. These works do not address the problem of efficiently executing cross-shard transactions.
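The Schism-style access graph described above is straightforward to build from a transaction trace. The following is a minimal sketch under our own representation (transactions as sets of record keys; edge weights count co-accesses), not Schism's actual implementation:

```python
from collections import defaultdict
from itertools import combinations

def build_access_graph(transactions):
    """Build a weighted co-access graph: vertices are records, and an
    edge connects two records accessed by the same transaction, with
    the weight counting how many transactions co-access them.

    transactions: iterable of sets of record keys."""
    weight = defaultdict(int)
    for records in transactions:
        # Every pair of records touched by one transaction gets an edge.
        for a, b in combinations(sorted(records), 2):
            weight[(a, b)] += 1
    return dict(weight)
```

A graph partitioner (e.g., METIS [21]) can then cut this graph so that heavily co-accessed records land in the same shard, minimizing cross-shard transactions.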

Sharding and off-chain solutions. Off-chain solutions [15, 17, 20, 27, 30, 35] represent an alternative direction for improving blockchain scalability. We observe that off-chain solutions and sharding solutions have deep connections. This is not obvious at all from the current state of the literature, partly because the two approaches start out with very different motivations. Off-chain solutions shard part of their state/UTXOs among many subsets of 𝑛 nodes (𝑛 = 2 for payment channels). These nodes process local transactions, maintain the latest information about the assigned state, and use the consensus engine, i.e., the blockchain, to order them globally relative to other shards. Recent off-chain protocols such as [15, 20, 35] extend dispute resolution using incentives. At their core, sharding schemes have a similar structure. Typically they use full-fledged consensus within every shard and some coordination scheme (so far 2PC) between shards to get rid of the global consensus engine. Our paper deviates from this conventional wisdom by removing consensus from the worker shards; it is thus a hybrid of sharding and off-chain scalability solutions.

8 CONCLUSION AND FUTURE DIRECTIONS

We have presented Rivet, a new paradigm for executing cross-shard transactions in a sharded system. Rivet has lower latency and higher throughput for cross-shard transactions in comparison with the 2PC approach. Also, only the reference shard in Rivet is required to run a consensus protocol; worker shards only vouch for the validity of blocks, and hence they require fewer replicas and less communication.

It is plausible to substitute the reference chain with a hierarchy of reference chains, each coordinating commitments and cross-shard transactions among a subset of worker shards. Such a hierarchical design would allow the system to process more cross-shard transactions concurrently. Furthermore, such a design may better exploit locality of interaction between different subsets of shards. Extending our approach to a hierarchical design is a promising future research direction.

ACKNOWLEDGMENTS

The authors would like to thank Amit Agarwal, Jong Chan Lee, and Zhuolun Xiang for numerous discussions related to the paper. The authors would also like to thank the Quorum open source community for their responses to queries related to the Quorum implementation.

REFERENCES
[1] Global Ping Latency. https://wondernetwork.com/pings [Online; accessed 15-March-2020].
[2] 2019. What is the train-and-hotel problem? https://github.com/ethereum/wiki/wiki/Sharding-FAQ#what-is-the-train-and-hotel-problem
[3] 2020. Binance: Bitcoin Exchange | Cryptocurrency Exchange. https://www.binance.com/en
[4] 2020. Distributed transaction. https://en.wikipedia.org/wiki/Distributed_transaction
[5] 2020. Ethermine - Ethereum (ETH) mining pool. https://ethermine.org/
[6] 2020. Quorum: A permissioned implementation of Ethereum supporting data privacy. https://github.com/jpmorganchase/quorum
[7] 2020. Tether — Stable digital cash on the Blockchain. https://tether.to/
[8] Daniel J Abadi and Jose M Faleiro. 2018. An overview of deterministic database systems. Commun. ACM 61, 9 (2018), 78–88.
[9] Mustafa Al-Bassam, Alberto Sonnino, Shehar Bano, Dave Hrycyszyn, and George Danezis. 2017. Chainspace: A sharded smart contracts platform. arXiv preprint arXiv:1708.03778 (2017).
[10] Ethan Buchman. 2016. Tendermint: Byzantine fault tolerance in the age of blockchains. Ph.D. Dissertation.
[11] Miguel Castro, Barbara Liskov, et al. 1999. Practical Byzantine fault tolerance. In Proceedings of the Third Symposium on Operating Systems Design and Implementation. 173–186.
[12] Carlo Curino, Evan Jones, Yang Zhang, and Sam Madden. 2010. Schism: A Workload-Driven Approach to Database Replication and Partitioning. Proc. VLDB Endow. 3, 1–2 (Sept. 2010), 48–57. https://doi.org/10.14778/1920841.1920853
[13] George Danezis and Sarah Meiklejohn. 2015. Centrally banked cryptocurrencies. arXiv preprint arXiv:1505.06895 (2015).
[14] Hung Dang, Tien Tuan Anh Dinh, Dumitrel Loghin, Ee-Chien Chang, Qian Lin, and Beng Chin Ooi. 2019. Towards scaling blockchain systems via sharding. In Proceedings of the 2019 International Conference on Management of Data. ACM, 123–140.
[15] Sourav Das, Vinay Joseph Ribeiro, and Abhijeet Anand. 2019. YODA: Enabling computationally intensive contracts on blockchains with Byzantine and selfish nodes. In Proceedings of the 26th Annual Network and Distributed System Security Symposium.
[16] Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. 1988. Consensus in the presence of partial synchrony. Journal of the ACM (JACM) 35, 2 (1988), 288–323.
[17] Stefan Dziembowski, Lisa Eckey, Sebastian Faust, and Daniel Malinowski. 2019. Perun: Virtual payment hubs over cryptocurrencies. In 2019 IEEE Symposium on Security and Privacy (SP). IEEE, 106–123.
[18] Ningyu He, Lei Wu, Haoyu Wang, Yao Guo, and Xuxian Jiang. 2019. Characterizing code clones in the Ethereum smart contract ecosystem. arXiv preprint arXiv:1905.00272 (2019).
[19] Maurice Herlihy. 2018. Atomic cross-chain swaps. In Proceedings of the 2018 ACM Symposium on Principles of Distributed Computing. 245–254.
[20] Harry Kalodner, Steven Goldfeder, Xiaoqi Chen, S Matthew Weinberg, and Edward W Felten. 2018. Arbitrum: Scalable, private smart contracts. In 27th USENIX Security Symposium (USENIX Security 18). 1353–1370.
[21] George Karypis and Vipin Kumar. 1998. A software package for partitioning unstructured graphs, partitioning meshes, and computing fill-reducing orderings of sparse matrices. (1998).
[22] Lucianna Kiffer, Dave Levin, and Alan Mislove. 2018. Analyzing Ethereum's Contract Topology. In Proceedings of the Internet Measurement Conference 2018. 494–499.
[23] Eleftherios Kokoris-Kogias, Philipp Jovanovic, Linus Gasser, Nicolas Gailly, Ewa Syta, and Bryan Ford. 2018. Omniledger: A secure, scale-out, decentralized ledger via sharding. In 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 583–598.
[24] Leslie Lamport. 2019. The part-time parliament. In Concurrency: the Works of Leslie Lamport. 277–317.
[25] Loi Luu, Viswesh Narayanan, Chaodong Zheng, Kunal Baweja, Seth Gilbert, and Prateek Saxena. 2016. A secure sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 17–30.
[26] Alex Manuskin, Michael Mirkin, and Ittay Eyal. 2019. Ostraka: Secure blockchain scaling by node sharding. arXiv preprint arXiv:1907.03331 (2019).
[27] Andrew Miller, Iddo Bentov, Surya Bakshi, Ranjit Kumaresan, and Patrick McCorry. 2019. Sprites and state channels: Payment networks that go faster than Lightning. In International Conference on Financial Cryptography and Data Security. Springer, 508–526.
[28] Satoshi Nakamoto et al. 2008. Bitcoin: A peer-to-peer electronic cash system. (2008).
[29] Lan N Nguyen, Truc DT Nguyen, Thang N Dinh, and My T Thai. 2019. OptChain: Optimal transactions placement for scalable blockchain sharding. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). IEEE, 525–535.
[30] Joseph Poon and Thaddeus Dryja. 2016. The Bitcoin Lightning Network: Scalable off-chain instant payments. (2016).
[31] Pingcheng Ruan, Gang Chen, Tien Tuan Anh Dinh, Qian Lin, Dumitrel Loghin, Beng Chin Ooi, and Meihui Zhang. 2019. Blockchains and distributed databases: a twin study. arXiv preprint arXiv:1910.01310 (2019).
[32] Roberto Saltini and David Hyland-Wood. 2019. Correctness analysis of IBFT. arXiv preprint arXiv:1901.07160 (2019).
[33] Roberto Saltini and David Hyland-Wood. 2019. IBFT 2.0: A safe and live variation of the IBFT blockchain consensus protocol for eventually synchronous networks. arXiv preprint arXiv:1909.10194 (2019).
[34] Yuechen Tao, Bo Li, Jingjie Jiang, Hok Chu Ng, Cong Wang, and Baochun Li. On sharding open blockchains with smart contracts. In International Conference on Data Engineering.
[35] Jason Teutsch and Christian Reitwießner. 2019. A scalable verification solution for blockchains. arXiv preprint arXiv:1908.04756 (2019).
[36] Alexander Thomson, Thaddeus Diamond, Shu-Chun Weng, Kun Ren, Philip Shao, and Daniel J Abadi. 2012. Calvin: Fast distributed transactions for partitioned database systems. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. 1–12.
[37] Jiaping Wang and Hao Wang. 2019. Monoxide: Scale out blockchains with asynchronous consensus zones. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19). 95–112.
[38] Gavin Wood et al. 2014. Ethereum: A secure decentralised generalised transaction ledger. Ethereum project yellow paper 151 (2014), 1–32.
[39] Maofan Yin, Dahlia Malkhi, Michael K Reiter, Guy Golan Gueta, and Ittai Abraham. 2019. HotStuff: BFT consensus with linearity and responsiveness. In Proceedings of the 2019 ACM Symposium on Principles of Distributed Computing. ACM, 347–356.
[40] Mahdi Zamani, Mahnush Movahedi, and Mariana Raykova. 2018. RapidChain: Scaling blockchain via full sharding. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. ACM, 931–948.

9 PROOF OF SAFETY AND LIVENESS

In this section we argue the safety and liveness of Rivet. Informally, safety captures the idea that Rivet ensures a globally agreed order on a set of transactions, and liveness captures the idea that newer transactions are continuously included in the global order. Additionally, we prove that for every intra-shard transaction, every replica of the corresponding shard executes the transaction atop an identical starting state. Furthermore, for every cross-shard transaction, all the replicas involved from all the relevant shards execute the cross-shard transaction atop an identical initial state. Since transaction execution is deterministic, this implies that the final state after executing the cross-shard transaction is also identical.

Next, we formally define the safety and liveness conditions and prove that Rivet ensures them.

Definition 9.1. (Safety) If an honest replica (either a worker replica or a reference replica) outputs a block 𝐵 at height 𝑖, every other honest replica of the same shard will also output block 𝐵 at height 𝑖.

Definition 9.2. (Liveness) Transactions (both cross-shard and intra-shard) sent to honest replicas in every shard are included in the blockchain within a finite amount of time.

Safety of Rivet follows directly from the safety of the Byzantine fault-tolerant consensus protocol used in the reference shard. This holds true even during periods of asynchrony because the partially synchronous consensus algorithm provides safety even under asynchrony. To elaborate, the consensus algorithm in the reference shard provides a global order for all transactions, the intra-shard ones as well as the cross-shard ones. Each transaction is associated with a unique reference block that finalizes it in the reference chain. Transactions are hence ordered first by their heights in the reference chain, and then by their positions inside the reference block.
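The global order just described is a lexicographic key over (reference block height, position within block). A minimal sketch (illustrative; the dictionary layout is ours):

```python
def global_order_key(tx):
    """Transactions are ordered first by the height of the reference
    block that finalizes them, then by their position inside that block."""
    return (tx["ref_height"], tx["pos_in_block"])

txs = [
    {"id": "c", "ref_height": 5, "pos_in_block": 0},
    {"id": "a", "ref_height": 4, "pos_in_block": 1},
    {"id": "b", "ref_height": 4, "pos_in_block": 0},
]
ordered = sorted(txs, key=global_order_key)
```

Here `b` precedes `a` (same height, earlier position), and both precede `c` (lower height), mirroring the ordering rule in the text.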

Besides an agreed-upon total order, Rivet also ensures that every worker block commitment finalized in the reference chain represents a valid state. The reference shard ensures that at most one worker block at any given height from a shard gets finalized and that it extends a previously finalized worker block from that shard. We next formally prove that this implies every honest worker shard replica will have the same sequence of worker blocks in its local blockchain.

Lemma 9.3. For any given shard 𝑋, the commitment of at most one worker block at any given height from shard 𝑋 is included in the reference chain.

Proof. For the sake of contradiction, assume that the state commitments com and com′ of two worker shard blocks 𝐵 and 𝐵′, respectively, both at height 𝑖, were included in the reference chain. Without loss of generality, let com be included before com′, and let 𝑃′ be the reference block that includes com′. Then, according to Algorithm 2, no honest replica will vote for 𝑃′. But since 𝑃′ is a committed reference block, at least 𝑓 + 1 honest replicas must have voted for 𝑃′. Hence we get a contradiction. □

We next argue that when two honest replicas in a worker shard output their local chains, the chain of one replica will be a prefix of the chain output by the other. For two chains (blockchains) 𝐴, 𝐵, we use 𝐴 ≼ 𝐵 to denote that chain 𝐴 is a prefix of chain 𝐵. Hence, for Rivet:

Theorem 9.4. Let 𝑛, 𝑛′ be two arbitrary honest replicas of a given shard 𝑋, and let C, C′ be their respective local chains. Then either C ≼ C′ or C′ ≼ C.

Proof. For the sake of contradiction, assume that C ̸≼ C′ and C′ ̸≼ C. Also, let 𝐵∗ ∈ C ∩ C′ be the last block where both the



chains agree. Also, let 𝐵, 𝐵′ be the latest committed blocks of C, C′, respectively. Observe that height(𝐵) > height(𝐵∗) and height(𝐵′) > height(𝐵∗). Hence, block 𝐵 does not extend 𝐵′ and vice versa.

From Lemma 9.3, we know that blocks 𝐵 and 𝐵′ have distinct heights. Without loss of generality, let the height of 𝐵 be smaller than the height of 𝐵′. Suppose the commitment of block 𝐵 appears before the commitment of 𝐵′, and let 𝑃′ be the reference block that includes the commitment of 𝐵′. Since 𝐵′ does not extend 𝐵, no honest reference replica will vote for the validity of 𝑃′. However, since 𝑃′ is a committed reference block, at least 𝑓 + 1 honest replicas must have voted for 𝑃′, resulting in a contradiction.

Alternatively, if the commitment of 𝐵′ appears before the commitment of 𝐵, then no honest replica will vote for a reference block that includes the commitment of 𝐵. Hence, we again get a contradiction by the same argument as above. □

We next argue the liveness of Rivet during periods of synchrony. As with safety, the liveness of the reference shard follows directly from the liveness guarantee of the underlying Byzantine fault-tolerant consensus protocol. Hence, we focus on the liveness of worker shards. In particular, we prove the liveness of worker shards in two steps. First, we show that during periods of synchrony, if the leader 𝐿 of a worker shard in view 𝑣 is honest, and the reference shard also has an honest leader during the same time interval, then 𝐿 will be able to successfully commit a new worker block onto the reference chain. Second, we argue that in Rivet such periods, where the leaders of the reference shard and the worker shard are simultaneously honest, occur infinitely often.

Let 𝑛𝑟 be the number of replicas in the reference shard and 𝑓𝑟 be the maximum number of Byzantine replicas in the reference shard. Similarly, let 𝑛𝑤 and 𝑓𝑤 be the number of replicas and the maximum number of Byzantine replicas in a worker shard, respectively. Recall that 𝑛𝑟 = 3𝑓𝑟 + 1 and 𝑛𝑤 = 2𝑓𝑤 + 1. Also, let 𝛼 be the ratio between the rate of view changes of a worker shard 𝑋 and that of the reference shard. For example, 𝛼 = 1 implies that view changes in 𝑋 and the reference shard happen at the same speed, while 𝛼 = 2 implies that view changes in worker shard 𝑋 happen twice as fast as those of the reference shard.
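The two fault thresholds can be made concrete in a few lines; note that a worker shard needs only an honest majority (𝑛𝑤 = 2𝑓𝑤 + 1) rather than the 3𝑓 + 1 replicas required for consensus on the reference shard (function names are ours, for illustration):

```python
def reference_shard_size(f_r: int) -> int:
    # BFT consensus on the reference shard tolerates f_r out of 3*f_r + 1
    # Byzantine replicas.
    return 3 * f_r + 1

def worker_shard_size(f_w: int) -> int:
    # Worker shards run no consensus of their own, so an honest majority
    # (f_w out of 2*f_w + 1) suffices.
    return 2 * f_w + 1

# For the same shard size n = 7, a worker shard tolerates more failures:
assert reference_shard_size(2) == 7   # reference shard tolerates f = 2
assert worker_shard_size(3) == 7      # worker shard tolerates f = 3
```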

Claim 1. Let 𝑣 and 𝑣′ ≥ 𝑣 + 𝑛𝑟 be any two distinct views of the reference shard. Then, for every worker shard 𝑋, 𝑛𝑟 ≥ 𝑛𝑤, and 𝛼 ≥ 1, there exists an honest view 𝑣∗ between views 𝑣 and 𝑣′, i.e., a view with an honest reference leader 𝐿𝑟, such that during view 𝑣∗, an honest replica 𝐿𝑤 of 𝑋 is the leader of 𝑋. From here on, we refer to such a view as an HH view.

Proof. Let 𝑣′ = 𝑣 + 𝑛𝑟. Then, between views 𝑣 and 𝑣′, exactly 𝛼𝑛𝑟 replicas become leader of 𝑋. Moreover, at least 𝛼(𝑛𝑟 − 𝑓𝑟) of these worker leaders co-exist with honest reference shard leaders. Also, among the 𝛼𝑛𝑟 worker leaders, at most 𝑓𝑤⌊𝛼𝑛𝑟/𝑛𝑤⌋ + 𝑓𝑤 of them are Byzantine. Thus, 𝛼(𝑛𝑟 − 𝑓𝑟) > 𝑓𝑤⌊𝛼𝑛𝑟/𝑛𝑤⌋ + 𝑓𝑤 implies that at least one honest worker leader co-exists with an honest reference leader.

Solving this inequality, we get the condition:

𝛼(𝑛𝑟 − 𝑓𝑟) > 𝑓𝑤⌊𝛼𝑛𝑟/𝑛𝑤⌋ + 𝑓𝑤

𝛼(𝑛𝑟 − 𝑓𝑟) ≥ (𝑓𝑤/𝑛𝑤)(𝛼𝑛𝑟 + 1)

𝛼(𝑛𝑟 − 𝑓𝑟) ≥ (1/2)(𝛼𝑛𝑟 + 1)

𝛼(2𝑓𝑟 + 1) ≥ (1/2)(𝛼(3𝑓𝑟 + 1) + 1)

𝛼𝑓𝑟 + 𝛼 − 1 ≥ 0    (4)

For 𝛼 ≥ 1, inequality (4) always holds. □
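The counting argument above can be checked exhaustively for small parameters. The sketch below assumes simple round-robin leader rotation in both shards (with worker views rotating 𝛼 times faster), which is one concrete scheduling model consistent with the claim; `has_hh_view` and the rotation rule are our illustrative constructions, not Rivet's specified leader election:

```python
from itertools import combinations

def has_hh_view(n_r, f_r, n_w, f_w, alpha, byz_ref, byz_wrk):
    """Return True if some reference view in a window of n_r consecutive views
    has honest leaders in both shards simultaneously (an HH view).
    byz_ref / byz_wrk are the sets of Byzantine replica indices."""
    for v in range(n_r):
        ref_leader = v % n_r
        # Worker leaders that co-exist with reference view v (alpha worker
        # views elapse per reference view).
        wrk_leaders = {(alpha * v + k) % n_w for k in range(alpha)}
        if ref_leader not in byz_ref and any(w not in byz_wrk for w in wrk_leaders):
            return True
    return False

# Worst-case check for n_r = 3*2 + 1 = 7, n_w = 2*2 + 1 = 5, alpha = 1:
# no placement of f_r = 2 Byzantine reference replicas and f_w = 2 Byzantine
# worker replicas can avoid an HH view within n_r reference views.
n_r, f_r, n_w, f_w = 7, 2, 5, 2
assert all(has_hh_view(n_r, f_r, n_w, f_w, 1, set(br), set(bw))
           for br in combinations(range(n_r), f_r)
           for bw in combinations(range(n_w), f_w))
```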

An immediate corollary of the above claim is that in an infinite execution of the protocol, such pairs of simultaneously honest reference and worker leaders occur infinitely often. We next use Claim 1 to prove the liveness of Rivet.

Theorem 9.5. During periods of synchrony, in an HH view, the corresponding leader of worker shard 𝑋 can successfully commit a new worker block onto the reference chain.

Proof. During an HH view, let 𝑣𝑤 and 𝑣𝑟 be the views of the worker and reference shard, respectively, and let 𝐿𝑤 and 𝐿𝑟 be the corresponding honest leaders of views 𝑣𝑤 and 𝑣𝑟. Also, let 𝑡𝑟 be the time instant at which 𝐿𝑟 enters view 𝑣𝑟. Then by time 𝑡𝑟 + Δ, 𝐿𝑟 will have received all messages sent in view 𝑣𝑟 − 1, and by time 𝑡𝑟 + 2Δ, 𝐿𝑤 (in fact, every worker shard replica) will have received the latest reference block. Since 𝐿𝑤 is connected to 𝑓𝑤 + 1 replicas from each shard, by time 𝑡𝑟 + 4Δ it will have the data required to execute cross-shard transactions up to the latest reference block. Then, if 𝐿𝑤 proposes a new block 𝐵 at time 𝑡𝑟 + 4Δ, it will have a certificate for 𝐵 by time 𝑡𝑟 + 6Δ. If 𝐿𝑤 forwards the block to 𝑓𝑟 + 1 reference replicas at time 𝑡𝑟 + 6Δ, then by time 𝑡𝑟 + 8Δ, 𝐿𝑟 will know about the commitment of 𝐵. Hence, if 𝐿𝑟 proposes its block after time 𝑡𝑟 + 8Δ, the proposed block will include the commitment of 𝐵. By the liveness guarantee of the underlying Byzantine fault tolerant consensus protocol of the reference chain, the commitment of 𝐵 will then be finalized, which implies that worker block 𝐵 is finalized. □
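The chain of deadlines in this proof can be tabulated explicitly. The sketch below is just the proof's bookkeeping, with Δ as the synchronous delay bound; nothing here is new protocol logic:

```python
# Offsets are measured from t_r, the instant the honest reference leader L_r
# enters view v_r. DELTA is the network delay bound during synchrony.
DELTA = 1  # one abstract time unit

events = [
    (0 * DELTA, "L_r enters view v_r"),
    (1 * DELTA, "L_r has received all messages of view v_r - 1"),
    (2 * DELTA, "all worker replicas have the latest reference block"),
    (4 * DELTA, "L_w has cross-shard data from f_w + 1 replicas of each shard"),
    (4 * DELTA, "L_w proposes worker block B"),
    (6 * DELTA, "L_w has a certificate for B; forwards to f_r + 1 reference replicas"),
    (8 * DELTA, "L_r learns the commitment of B"),
]

# The events are chronologically ordered, and L_r can include B's commitment
# in any block it proposes after t_r + 8 * DELTA.
assert all(t1 <= t2 for (t1, _), (t2, _) in zip(events, events[1:]))
assert events[-1][0] == 8 * DELTA
```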


Algorithm 1 Reference shard block creation at replica (leader) 𝐿 in view 𝑣

1: ⊲ Inputs to the block proposal algorithm
2: 𝐵ℓ^(𝑖): latest committed block of shard 𝑋𝑖, for each shard
3: 𝑃𝑠^(𝑖): reference block mentioned in 𝐵ℓ^(𝑖), for each 𝑖
4: 𝑃𝑙: latest reference block known to 𝐿
5: 𝑃𝑙+1: block 𝐿 wants to propose
6:
7: ⊲ We refer to the cross-shard transactions that appear after the last commit from shard 𝑋𝑖 as the pending cross-shard transactions of 𝑋𝑖 and denote them by 𝑄𝑖
8: 𝑄𝑖: cross-shard transactions that appear after 𝑃𝑠^(𝑖) and involve 𝑋𝑖
9: R^(𝑖): keys that cross-shard transactions in 𝑄𝑖 intend to read from
10: W^(𝑖): keys that cross-shard transactions in 𝑄𝑖 intend to write to
11:
12: 𝑛𝑒𝑤ctx: new cross-shard transactions available at 𝐿
13: 𝑛𝑒𝑤com: new state commitment transactions available at 𝐿
14:
15: ⊲ Pick a set of valid state commitments 𝑆𝑙+1 from the available state commitments
16: 𝑆𝑙+1 ← ∅
17: for each shard 𝑋𝑖 do
18:     𝑣𝑎𝑙𝑖𝑑𝑋𝑖 ← ∅ ⊲ Set of new valid state commitments from shard 𝑋𝑖
19:     for each state commitment com𝑗 = ⟨𝐵𝑗, H𝑗⟩ from 𝑋𝑖 in 𝑛𝑒𝑤com do
20:         if 𝐵𝑗 extends 𝐵ℓ^(𝑖) & 𝐵𝑗 is certified & Q𝑠(𝑖),𝑙 is empty then
21:             𝑣𝑎𝑙𝑖𝑑𝑋𝑖 ← 𝑣𝑎𝑙𝑖𝑑𝑋𝑖 ∪ {com𝑗}
22:     com∗𝑖 ← max{𝑣𝑎𝑙𝑖𝑑𝑋𝑖} ⊲ Pick the commitment with the longest hash chain, i.e., highest |H𝑗|
23:     𝑆𝑙+1 ← 𝑆𝑙+1 ∪ {com∗𝑖}
24:     𝑄𝑖 ← ∅; W^(𝑖) ← ∅; R^(𝑖) ← ∅ ⊲ Reset cross-shard transactions and the associated read-write sets
25:
26: ⊲ Pick a set of new cross-shard transactions 𝐶𝑙+1 from the available cross-shard transactions
27: 𝐶𝑙+1 ← ∅
28: for each ctx ∈ 𝑛𝑒𝑤ctx do
29:     let 𝑋ctx be the set of shards required for executing ctx
30:     for 𝑋𝑗 ∈ 𝑋ctx do
31:         let 𝑅ctx^(𝑗), 𝑊ctx^(𝑗) be the read and write sets of ctx in 𝑋𝑗, respectively
32:         if 𝑅ctx^(𝑗) ∩ W^(𝑗) = ∅ then
33:             𝐶𝑙+1 ← 𝐶𝑙+1 ∪ {ctx}; W^(𝑗) ← W^(𝑗) ∪ 𝑊ctx^(𝑗); R^(𝑗) ← R^(𝑗) ∪ 𝑅ctx^(𝑗) ⊲ Update pending cross-shard transactions
34:
35: 𝑃𝑙+1 ← ⟨𝑙 + 1, hash(𝑃𝑙), 𝑆𝑙+1, 𝐶𝑙+1⟩
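Lines 26–33 of Algorithm 1 implement a greedy admission rule: a cross-shard transaction is included only if its read set is disjoint from the writes already scheduled on every involved shard. A sketch of that rule in Python (the transaction encoding as `shard_id -> (read_set, write_set)` dicts is our illustration, not Rivet's wire format):

```python
def select_cross_shard(new_ctx, pending_writes):
    """Greedily pick cross-shard transactions whose reads do not conflict
    with writes already pending on any involved shard.

    new_ctx: list of dicts mapping shard_id -> (read_set, write_set)
    pending_writes: dict mapping shard_id -> set of keys pending a write
    """
    selected = []
    for ctx in new_ctx:
        # The disjointness check must hold on every shard the tx touches.
        if all(reads.isdisjoint(pending_writes.get(shard, set()))
               for shard, (reads, _) in ctx.items()):
            selected.append(ctx)
            for shard, (_, writes) in ctx.items():
                pending_writes.setdefault(shard, set()).update(writes)
    return selected

tx1 = {"X1": ({"a"}, {"b"}), "X2": ({"c"}, {"d"})}
tx2 = {"X1": ({"b"}, {"e"})}   # reads "b", which tx1 writes -> rejected
tx3 = {"X2": ({"e"}, {"f"})}   # no conflict -> accepted
picked = select_cross_shard([tx1, tx2, tx3], {})
assert picked == [tx1, tx3]
```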


Algorithm 2 Validation of block 𝑃𝑟 = ⟨𝑟, hash(𝑃𝑟−1), 𝑆𝑟, 𝐶𝑟⟩ at replica 𝑛 of the reference shard

1: ⊲ Inputs to the block validation algorithm. For simplicity, assume 𝑃𝑟−1 is the latest reference block known to 𝑛.
2: 𝐵ℓ^(𝑖): latest committed block of shard 𝑋𝑖 up to 𝑃𝑟−1, for each shard
3: 𝑃𝑠^(𝑖): reference block mentioned in 𝐵ℓ^(𝑖), for each 𝑖
4:
5: 𝑄𝑖: cross-shard transactions that appear after 𝑃𝑠^(𝑖) and involve 𝑋𝑖
6: R^(𝑖): keys that cross-shard transactions in 𝑄𝑖 intend to read from
7: W^(𝑖): keys that cross-shard transactions in 𝑄𝑖 intend to write to
8:
9: for every com𝑗 = ⟨𝐵𝑗, H𝑗⟩ ∈ 𝑆𝑟 do
10:     if ¬(𝐵𝑗 extends 𝐵ℓ^(𝑖) & 𝐵𝑗 is certified & Q𝑠(𝑖),𝑙 is empty) then
11:         output invalid 𝑃𝑟; return
12:
13: for every com𝑗 = ⟨𝐵𝑗, H𝑗⟩ ∈ 𝑆𝑟 do
14:     let 𝑋𝑗 be the worker shard that commits com𝑗
15:     𝑄𝑗 ← ∅; W^(𝑗) ← ∅; R^(𝑗) ← ∅ ⊲ Temporarily set 𝑄𝑗, W^(𝑗), and R^(𝑗) to empty sets
16:
17: for each ctx ∈ 𝐶𝑟 do
18:     let 𝑋ctx be the set of shards required for executing ctx
19:     for 𝑋𝑗 ∈ 𝑋ctx do
20:         let 𝑅ctx^(𝑗), 𝑊ctx^(𝑗) be the read and write sets of ctx in 𝑋𝑗, respectively
21:         if 𝑅ctx^(𝑗) ∩ W^(𝑗) ≠ ∅ then
22:             output invalid 𝑃𝑟; reset 𝑄𝑖, W^(𝑖), and R^(𝑖) to their original state ∀𝑖 ∈ [𝑘]
23:             return
24:         else
25:             𝑄𝑗 ← 𝑄𝑗 ∪ {ctx}; W^(𝑗) ← W^(𝑗) ∪ 𝑊ctx^(𝑗); R^(𝑗) ← R^(𝑗) ∪ 𝑅ctx^(𝑗)
26:
27: output valid 𝑃𝑟
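Lines 17–25 of Algorithm 2 replay the leader's admission rule and reject the block on the first read/write conflict. A standalone sketch of that check (same illustrative `shard_id -> (read_set, write_set)` encoding as before; not Rivet's wire format):

```python
def validate_cross_shard(c_r, pending_writes):
    """Return True iff every ctx in the proposed set C_r passes the same
    read/write disjointness check a correct leader would have applied."""
    # Work on a copy so a rejection leaves the caller's state untouched,
    # mirroring the 'reset to original state' step of Algorithm 2.
    writes = {s: set(w) for s, w in pending_writes.items()}
    for ctx in c_r:
        for shard, (reads, _) in ctx.items():
            if reads & writes.get(shard, set()):
                return False          # output invalid P_r
        for shard, (_, w) in ctx.items():
            writes.setdefault(shard, set()).update(w)
    return True                       # output valid P_r

good = [{"X1": ({"a"}, {"b"})}, {"X1": ({"c"}, {"d"})}]
bad  = [{"X1": ({"a"}, {"b"})}, {"X1": ({"b"}, {"d"})}]  # second tx reads "b"
assert validate_cross_shard(good, {})
assert not validate_cross_shard(bad, {})
```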

Algorithm 3 Block creation at replica (leader) 𝐿 of a worker shard 𝑋

1: Inputs.
2: 𝐵𝑖: latest committed block of 𝑋 with state state𝑖
3: 𝑃𝑖: reference block at height 𝑟𝑖 that includes com𝑖 = ⟨state𝑖, …⟩
4: 𝐵𝑗: latest worker block of 𝑋 (possibly 𝐵𝑖 = 𝐵𝑗)
5: 𝑃𝑗: latest reference block at height 𝑟𝑗 known to 𝐿
6: Q𝑖,𝑗: cross-shard transactions involving 𝑋 that appear between reference blocks 𝑃𝑖 and 𝑃𝑗 (both inclusive)
7: T𝑖,𝑗: intra-shard transactions between worker blocks 𝐵𝑖 and 𝐵𝑗 (both inclusive)
8: 𝑇𝑗+1: newly available intra-shard transactions
9:
10: if Q𝑖,𝑗 is not empty (even when 𝑖 = 𝑗) then
11:     parent block := 𝐵𝑖
12:     pick state𝑖 as the starting state
13:     execute Q𝑖,𝑗 and T𝑖,𝑗+1 = T𝑖,𝑗 ∪ 𝑇𝑗+1 atop state𝑖, i.e., state𝑗+1 ← Π(Π(state𝑖, Q𝑖,𝑗), T𝑖,𝑗+1)
14:     𝐵𝑗+1 ← ⟨hash(𝐵𝑖), state𝑗+1, T𝑖,𝑗+1, 𝑟𝑖, hash(𝑃𝑖)⟩
15: else
16:     parent block := 𝐵𝑗
17:     pick state𝑗 as the starting state
18:     execute 𝑇𝑗+1 atop state𝑗, i.e., state𝑗+1 ← Π(state𝑗, 𝑇𝑗+1)
19:     𝐵𝑗+1 ← ⟨hash(𝐵𝑗), state𝑗+1, 𝑇𝑗+1, 𝑟𝑗, hash(𝑃𝑗)⟩
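The two branches of Algorithm 3 differ only in whether pending cross-shard transactions force the new block to build on the committed block 𝐵𝑖 instead of the latest block 𝐵𝑗. A condensed sketch, where `apply_fn` plays the role of the transition function Π and all block/transaction structures are our toy illustrations:

```python
def create_worker_block(committed, latest, pending_ctx, new_tx, apply_fn):
    """committed: B_i (dict with 'hash', 'state', and 'txs_since' = T_{i,j});
    latest: B_j (dict with 'hash', 'state'); pending_ctx: Q_{i,j};
    new_tx: T_{j+1}; apply_fn(state, txs) stands in for Pi."""
    if pending_ctx:
        # Build on the committed block: first execute the pending cross-shard
        # txs, then all intra-shard txs since B_i together with the new ones.
        txs = committed["txs_since"] + new_tx
        state = apply_fn(apply_fn(committed["state"], pending_ctx), txs)
        parent = committed
    else:
        # No pending cross-shard txs: extend the latest (possibly uncommitted)
        # block with the new intra-shard txs only.
        txs = new_tx
        state = apply_fn(latest["state"], new_tx)
        parent = latest
    return {"parent": parent["hash"], "state": state, "txs": txs}

# Toy state: a dict of balances; a "tx" is (key, delta).
def apply_fn(state, txs):
    s = dict(state)
    for k, d in txs:
        s[k] = s.get(k, 0) + d
    return s

b_i = {"hash": "hB_i", "state": {"a": 5}, "txs_since": [("a", 1)]}
b_j = {"hash": "hB_j", "state": {"a": 6}}
blk = create_worker_block(b_i, b_j, pending_ctx=[("a", 10)],
                          new_tx=[("a", 2)], apply_fn=apply_fn)
assert blk["parent"] == "hB_i" and blk["state"] == {"a": 18}
```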


Algorithm 4 Validation of block 𝐵𝑗 = ⟨hash(𝐵𝑗′), state𝑗, 𝑇𝑗, 𝑟𝑗, hash(𝑃𝑗)⟩ at replica 𝑛 of a worker shard 𝑋

1: Inputs.
2: 𝐵𝑗: newly proposed block at height 𝑗
3: 𝐵𝑗′: parent of block 𝐵𝑗 with state state𝑗′
4: 𝑃𝑗: reference block mentioned in 𝐵𝑗 at height 𝑟𝑗
5: 𝑇𝑗: intra-shard transactions included in 𝐵𝑗
6: 𝐵𝑙: latest committed block of 𝑋 with state state𝑙 known to 𝑛
7: 𝑃𝑙: reference block at height 𝑟𝑙 that includes com𝑙 = ⟨state𝑙, …⟩
8: 𝑃𝑘: latest reference block at height 𝑟𝑘 known to 𝑛
9: Q𝑗′,𝑗: cross-shard transactions involving 𝑋 included in reference blocks after 𝐵𝑗′ and up to 𝑃𝑗, if any
10: Q𝑗+1,𝑘: cross-shard transactions involving 𝑋 included in reference blocks after 𝑃𝑗 and up to 𝑃𝑘, if any
11: Q𝑙,𝑗: cross-shard transactions involving 𝑋 included in reference blocks from 𝑃𝑙 up to 𝑃𝑗, if any
12:
13: state′𝑗 ← ⊥
14: if 𝐵𝑗 extends 𝐵𝑙 & height(𝑃𝑘) ≥ height(𝑃𝑗) & Q𝑗+1,𝑘 is empty then ⊲ "extends" means "is a descendant of"
15:     ⊲ Here height(𝑃) refers to the height of the block; also, Q𝑎,𝑏 is empty if height(𝑃𝑎) > height(𝑃𝑏)
16:     if 𝐵𝑗′ = 𝐵𝑙 then
17:         state′𝑗 ← Π(Π(state𝑗′, Q𝑙,𝑗), 𝑇𝑗)
18:     else if height(𝐵𝑗′) > height(𝐵𝑙) & 𝐵𝑗′ is certified & Q𝑗′,𝑗 is empty then
19:         state′𝑗 ← Π(state𝑗′, 𝑇𝑗)
20:
21: if state′𝑗 ≠ ⊥ and state′𝑗 = state𝑗 then
22:     sign; return
23: do not sign; return
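The essence of Algorithm 4 is deterministic re-execution: the validator recomputes the state from the parent and signs only if it matches the state claimed in the block (lines 13–23). A sketch of that final check, with `apply_fn` again standing in for Π and the toy block encoding being our illustration:

```python
def validate_worker_block(block, parent_state, pending_ctx, apply_fn):
    """Recompute the state and compare it to the state claimed in the block.
    pending_ctx is Q_{l,j} when the parent is the committed block, and empty
    in the certified-parent branch (a proposal with cross-shard txs still
    pending against its parent is rejected before reaching this check)."""
    recomputed = apply_fn(parent_state, pending_ctx) if pending_ctx else parent_state
    recomputed = apply_fn(recomputed, block["txs"])
    return recomputed == block["state"]   # sign iff the states match

# Toy state: a dict of balances; a "tx" is (key, delta).
def apply_fn(state, txs):
    s = dict(state)
    for k, d in txs:
        s[k] = s.get(k, 0) + d
    return s

honest = {"txs": [("a", 2)], "state": {"a": 7}}
forged = {"txs": [("a", 2)], "state": {"a": 9}}  # claims an unjustifiable state
assert validate_worker_block(honest, {"a": 5}, [], apply_fn)       # sign
assert not validate_worker_block(forged, {"a": 5}, [], apply_fn)   # do not sign
```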
