Merkle-trees in milestones for securing the ledger state in "white flag"

hans.moog · March 5, 2020, 11:08pm

In the white flag approach, we apply ledger state changes by traversing the past cone of a milestone in a deterministic order and then applying the corresponding ledger state changes in that same order. Conflicts or invalid transactions are ignored but stay in the tangle.

Since the tangle can now contain conflicts, we can only apply the correct changes, if we start with the correct initial state.

Node operators usually use snapshot files to bootstrap their nodes. Since snapshots are simple text files containing an exported version of the ledger state, it would be very easy for an attacker to modify these files. Anybody who starts his node with such a modified snapshot file, would track a completely different version of the tangle.

We therefore need a way to “encode” the ledger state in the milestone, so that anybody who would use a tampered snapshot, could detect a divergence from the official state as soon as it happens.

One idea was, to include the root of a merkle tree in the milestone, that encodes the “applied” transactions and the corresponding ledger state, which can then be checked by the nodes before applying the changes locally.

I propose to use a sparse merkle tree that uses bitwise XOR operations to combine the applied tx hashes into a merkle tree like structure to calculate a DIFF_N (where N is the number of the milestone), that encodes the list of transaction, that were applied by this milestone.

This DIFF_N hash of the changes can now be used, to encode the whole state of the ledger STATE_N by combining it with the state of the previous milestone in the following way:

Merkle%20Proof%20(2)

This STATE_N can now be published in the milestone and used by the nodes to verify that the transactions they would apply based on their ledger state, are indeed the same ones as intended by the milestone (by doing the same calculations and then compare the locally computed STATE_N with the published one.

Using XOR when constructing the merkle tree instead of a normal hash operation, gives us several benefits:

It is much faster!
We can make the tree sparse, which means that if we have less than 2^x applied transactions, then we can simply skip calculating the hashes for the missing leaves.

This is due to two features of the XOR operation:
1. A \oplus A = empty (and therefore empty \oplus empty = empty)
2. A \oplus empty = A
which allows us to compress the calculations at the end (see example picture).

Note: It (using XOR) comes with the trade-off of loosing information about the order of the leaves but since we already have the order encoded in the tangle structure, this seems to be fine.

gal.rogozinski · March 6, 2020, 2:05pm

If we only want to make sure that full nodes calculated the same state as the coo node then this is good I think.

If we also want to also create proof of inclusions that can’t be easily faked this won’t do if I understand correctly. I think it will be nice if we can have light nodes that only track milestones and are able to query full nodes for proof. Maybe those light nodes can easily be ran on IOT devices?

hans.moog · March 6, 2020, 2:38pm

That’s actually true - I didn’t think about that Then we wouldn’t even need the combination with the previous state, but only the DIFF.

I guess we can also build a sparse tree with hashes using the same mechanism but hash(A + B) instead of A \oplus B

gal.rogozinski · March 8, 2020, 3:17pm

Did you mean hash(A || B) (just making sure there are no misunderstandings)?
Also, if we are not using XOR is the tree still sparse? If yes, then why? You want to do proofs on non-inclusion??

Anyhows, I am trying to understand the STATE idea. You want this to be published along DIFF calculations on the milestone? Or only the STATE?

I understood that you want the latter. If so, it can be done, but I just don’t understand the advantages of using STATE and not just DIFF. I see that you want to the merkle root to connect between the current milestone and the previous one, but what are the benefits? What attack are you trying to prevent?
Currently we made compass give out indexes in consecutive order, isn’t this enough?

Anyhows, here are my thoughts:

a. Maybe we should use bundle hashes instead of transaction hashes (assuming we are in the world that doesn’t have atomic txs yet). Enough to enforce the following I believe for this: Change bundle validity rules for enabling more robust node architecture · Issue #1746 · iotaledger/iri · GitHub

b. The requirements are that with the merkle roots a node should easily verify:

The “applied” transactions (or bundles) are the same one the coo applied
The ledger diff the node applies is the same diff the the coo applies

One might can claim that if you have (1) then (2) is given for free.
Even though this is a valid claim there might be a bug in how the node calculates the diffs in different implementations, even though it is a simple calculation…

So maybe we should also add the hash of the state diff. To be clear, a state of diff is a list of address,int pairs that signify the balance change per address (not the entire ledger state).

I guess (b) can be considered paranoid, so not sure how I feel about this idea (even though I proposed it)…

hans.moog · March 9, 2020, 11:46am

No, I meant a concatenation of the corresponding bytes.

You can still skip the last hashes and “jump” to the next highest level when calculating the root as soon as you .reach the end. As long as the behavior is deterministic, everybody can do the same calculations.

Only the STATE - chaining the states together like this allows us to build proofs of inclusion over several milestones (so you don’t need to store all milestones forever, but can instead compress them to another merkle proof again)

I agree - we use the bundle tail hash here

I would say they are effecitvely equivalent.

That should not be necessary, as the balance changes are “caused” by the transactions. It would not add any additional information. You can reconstruct the balance changes from the hahes in a deterministic way.

wolfgang.welz · May 6, 2020, 1:50pm

The ideas of this post have also been formalized as part of Protocol RFC-0012.

william.sanders · May 7, 2020, 4:50am

So how do we modify this post coordicide? The snapshots will also just be a downloadable file, and so the same attack exists, right?

hans.moog · May 7, 2020, 7:34am

No, because we dont use whiteflag

william.sanders · May 7, 2020, 7:50am

Im probably just being dumb here, but whats the difference? Even without white flag, you need the correct initial state to execute the protocol, right?

gal.rogozinski · May 17, 2020, 5:55pm

I think you’re right @william.sanders

I don’t know the answer… but I think ultimately we will need to provide some sort of proof to be published alongside a snapshot that can be easily verified.

gal.rogozinski · May 17, 2020, 7:04pm

I see this is not in the RFC
Reading this quote again, I see that I don’t really understand how this would work…

So @hans.moog can you please mansplain it to me?
What would the proof look like for an inclusion of a tx that is 2 milestones below the pruning depth for example?

william.sanders · May 18, 2020, 4:17am

Oh, I see my mistake, I was just being dumb. Without white flag, just proving your transaction was approved by a milestone is enough to prove that it was valid. However, now you have to prove that it wasnt actually ignored before the next milestone.

gal.rogozinski · May 18, 2020, 11:24am

I thought you asked about how to prove that a local snapshot someone is using is not altered