Fluid Sharding

Since it looks like we are soon finalizing the concepts for our cryptographic commitment primitives (epoch commitment chains), I believe we now have all the high level building blocks to start speccing a more concise fluid sharding framework.

Everything that I am going to propose builds on top of my earlier proposal of “untangling the tangle” by introducing shard markers as a coordinate for outputs and transactions in a sharding space (see Scaling IOTA Part 2 - Untangling the Tangle | by Hans Moog | Medium).

This coordinate system allows us to define a “location for the data and the execution” and can be used to associate different regions of the tangle with different groups of responsible nodes similar to a DHT (distributed hash table).

Let’s for now assume the sharding space is a cyclical structure e.g. a 32 bytes hash where the last value of the hash is right next to the first one:

circular_space

For simplicity reasons, we can imagine to unroll this cyclical structure into a 1-dimensional representation:

In contrast to my earlier proposal where things were completely fluid, we introduce an overlay topology which defines fixed regions of this shard space that will make up the shards (similar to the “epishards” earlier proposed by Bart). Let’s for now assume we slice the tangle into 8 equal parts and enumerate these shards from 0 to 7 (an arbitrary number - we can introduce protocols and mechanism to automatically adjust this topology based on demand later on and a 32 byte hash gives us potentially 2^256 - 1 shards):

Nodes can still freely decide how many and which shards they want to monitor but they need to monitor at least 3 consecutive shards (more on that later):

Each of these shards will produce their own cryptographic commitment chain comitting to their respective activity.

Since each commitment is associated to some “space coordinate” in the sharding space but also to some “time coordinate” in respect to “passed epochs since genesis” - we associate each commitment with a space/time coordinate:

So far, the commitment is a merkle root over all of the “sub-roots” of a given epoch (included messages, confirmed txs, ledger state and so on) and we can use this commitment to produce succinct proofs inside a shard:

Commitment(space/time) = MerkleRoot(stateRoots_{space/time})

We now want to extend this definition to also capture information of other shards. For this, we change the definition of a commitment in the following way:

Commitment(space/time) = MerkleRoot(
\>\>\>\>\>\>\>\>Commitment(space -1/time - 1),
\>\>\>\>\>\>\>\>MerkleRoot(stateRoots_{space/time}),
\>\>\>\>\>\>\>\>Commitment(space + 1/time - 1)
)

This results in a situation where each commitment doesn’t just commit to the activity of its own shard but also to the previous commitment of its neighboring shards (left and right) - forming something like a meta-tangle of commitments wrapping around the entire tangle like a mesh and indirectly connecting all existing shards:

Since each commitment is just a single hash, all nodes can download this entire meta-tangle giving them access to the state commitments of all other shards. This meta-tangle essentially forms the equivalent of ETH’s beacon chain.

Since the commitment tangle is circular, each space/time coordinate that is sufficiently old is reachable through both paths (left and right). Nodes can use this information to validate that the commitments they see from one side of the network match with the commitments they see from the other side.

This prevents malicious shards from presenting different commitments to different “sides” of the shard space so the only thing a malicious shard could do is to create a fraudulent commitment and this should ideally be addressed with some fisherman algorithm that allows actors to create fraud proofs for the commitments of other shards while they are propagating - resulting in a better 1 of N trust assumption (ideally prior to notarization).

As soon as a space/time coordinate is referenced by our own chain, it becomes “known to us” and we can “access” this space-time coordinate for proofs of inclusion which enables inter-shard transactions:

To send a transaction to another shard, we simply create a transaction in the originating shard that moves the funds to the given destination shard. This transaction is only processed in the originating shard but since the resulting state is no longer relevant for this shard, nodes will only include the state in their commitment but will no longer store the actual metadata of the Output (the output becomes unspendable in the originating shard - we essentially burn its funds).

As soon as the space/time coordinate that represents the containing commitment is known to the destination shard, the moved output can be presented to the destination shard together with a proof of inclusion against the corresponding space/time coordinate. This will allows us to mint the moved tokens in the destination shard.

Of course moving funds between distant shards will take a certain amount of time that is dependent on the amount of shards that exist + the interval at which we produce commitments.

To enable faster real-time payments between distant shards we introduce a new unlock condition for outputs:

The output can only be unlocked if:

  1. a given public key signed the transaction and the issuer can present a “proof of inclusion” of a certain state (its hash) in an earlier commitment of the same shard
  2. alternatively after some time it can be unlocked by an alternative public key without such a proof.

Furthermore we assume that there will be service providers that will want to provide the ability to move funds between distant shards in real-time for some small fee and that these service-providers will be incentivized enough to operate at least 1 node in each shard.

What users can do now is to approach these service providers and negotiate a FEE payed in IOTA that will be payed for instantly moving a certain AMOUNT of funds to another shard. They will accordingly send the funds to an Output in the destination shard that has the unlock condition, that the public key of the service provider can spend the output, if he can prove that he sent AMOUNT - FEE funds to a certain address in an earlier transaction.

The service provider can now wait for a transaction to be confirmed in the originating shard and since he knows that it will become part of the next epoch commitment, he can hand out the funds in the destination shard immediately. He will later be able to reimburse his expense by collecting the entire funds (including the fee) by claiming it from the state commitment.

If the service provider does not send the funds or does not provide the proof that he sent the funds before the timelock expires, the user can unlock the funds and recover his tokens.

This whole process is accordingly completely trustless and is essentially equivalent, to how optimistic rollups enable fast withdrawals.

I think that we can most probably enhance these concepts with things like data availability sampling and or maybe sampling the honesty of other shards via SPV but especially early on when the tangle is not too fragmented yet this should already be sufficient to enable a first version of fluid sharding.

Eventually this can also be combined with ZkProofs on L1, so that every message effectively validates the entire tangle + history. The recursive nature of zkproof systems like plonky matches exactly the structure of our meta-tangle.

so TL;DR:

  1. we entangle the commitments to form a beacon-tangle that is tracked by all nodes.
  2. we burn funds moved to another shard in the originating shard and mint them in the destination shard after proving the state was included in the corresponding space/time commitment.
14 Likes

Hi there hans,

let’s say there are multiple ways to shard the tangle, just like a cake there are many ways to cut the pieces. for instance:
1- shard by geographic proximity. this might be a decent choice for local payment networks. but it’s NOT secure because nearby shards would then have economic incentive to double spend far away shards and it would be hard to have shards monitor far away shards.
2- shard by data interaction, meaning dApps that interact with each other get grouped into the same shard. this is optimal for smart contract platforms like Ethereum. I’m not sure what use case will iota end up fulfilling to decide if this is the ideal way in our case.
3- shard by tps load. Load balancing tps between shards in a way that each shard get an equal amount of the volume.

In the beacon chain design, which every node will have to participate in, the beacon will have to keep track of the cross-shard events, especially the value transfers in order to maintain an equivalent amount of security and prevent shards from double spending each other.
Moreover, we need to eleminate the beacon chain, as well as minimize the number of shards and here’s why.

The more shards there are the more cross-shard transactions there will be, which will bloat the beacon chain.
lets say we have N number of shards { s1, s2… sN }
under the assumption that an account is equally likely to interact with any other account, the probability a transaction is going to be a cross shard event is N-1/N. that’s a reasonable assumption to make given that we can’t predict the frequency accounts will interact with each other between shards.
with a beacon chain design this has a dimishing returns with the more shards we create because say with have 100 TPS capacity per node, and 3 shards.
with 3 shards the probability a tx is going to be a cross-shard event is 2/3 = 0.66 and therefore nodes will process 33 local TPS and 66 cross-shard events.
all 3 shards will handle 99 tps plus the 66 mutual cross-shard events. 66 + 99 = 165. that’s 65 extra TPS. compared to if we had a single shard with 100 TPs

with 4 shards the probability a tx is going to be a cross-shard event is 3/4 = 0.75 and therefore nodes will process 25 local TPS and 75 cross-shard events. all 4 shards in total will process 175 TPS. that’s 75 extra TPS.
and so on with 5 or 6 shards… at this rate it’s clear the beacon chain has diminishing returns the more shards we have. i.e. in 3 shards we get +65% tps and with 4 shards +75% tps. that’s only +10% tps difference for an extra shard

1 Like

It’s hard for me to find in your comment where you’re taking into account any of the ideas Hans introduces above? A meta-tangle, a node tracking any 3 of 8 shards, no mention of a classic beacon chain.

Hans idea reads to me more like a *Byzantine Shard Reconciliation schema than a cross-shard beacon chain. But admittedly I’m only a basic civilian when it comes to this stuff.

*Term I just made up.

1 Like

you’re right, hans model is valid although too abstract. what i’m hinting at is that there will have to be one of three ways to shard the tangle, and each one will require keeping track of each shard’s balance as well as value transfers between shards in order to maintain the security. Han’s model can only work in a proof of stake setting to have an economic disincentive to prevent double spends between shards.

I don’t see Proof of Stake being required here?

PoS is required here. if you’re not going to create a shared security or provide an economic guarantee between shards a.k.a PoS. is it really a single tangle anymore? or just many tangles that don’t share security? would you be comfortable sending your precious iotas to a distant shard with validators you don’t know?
sharding is a pretty exhausted topic at this point.

1 Like

A few remarks/questions that come to mind:

  1. assigning validators/nodes to shards has to be randomized, otherwise it will become an easy task to become a majority in 3 neighboring shards and issue a double spend in the middle one, so we also require some global list of shardspaces and their (rotating) validator sets?

  2. imagine having 1000 shards and an attacker having 1% of network supply, won’t he be a majority in his shard now (which is already enough to exploit the “bridge” service providers you proposed)?

  3. How do we move tokens from one shard to another? Only nodes from the input-shard can consume the input, so the marker and therefor shard of the tx remains unchanged?

  4. What would be a reason to move funds to another shard in the first place?

  5. How do you consume 2 UTXOs from different shards in a TX? (potentially related to question 4)