Dedicated Throughput for Validation Blocks

Let me propose a challenge! We currently have a way to deal with validation blocks, but we would like to improve on it. Let’s brainstorm!

:handshake: Validation Blocks. Validation Blocks are blocks regularly issued by committee members (a subset of the validator set) to reach agreement on blocks and transactions issued during the epoch, reducing communication overhead and minimizing confirmation time. The committee has size committeeTotalSeats (set to some value between 25 and 50). A committee member is expected to issue exactly one validation block every frequencyValidationBlock seconds (set to either 0.5 or 1). All honest committee members reference their own latest validation block, forming a chain of validation blocks from each validator.
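For a sense of scale, the aggregate load implied by these parameters can be worked out directly. This is a minimal sketch, assuming every committee member is honest and issues at exactly the preset rate; the function name is illustrative and not part of any IOTA codebase.

```python
def committee_block_rate(committee_total_seats: int, frequency_validation_block: float) -> float:
    """Validation blocks per second issued by an honest committee in aggregate."""
    return committee_total_seats / frequency_validation_block


# Extremes of the parameter ranges quoted in the post.
low = committee_block_rate(25, 1.0)   # smallest committee, one block per second each
high = committee_block_rate(50, 0.5)  # largest committee, one block every 0.5 s each
print(low, high)  # 25.0 100.0
```

So the dedicated throughput has to carry somewhere between 25 and 100 validation blocks per second before any misbehaviour is considered.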

:hourglass: Dedicated throughput. To account for this intense activity, validation blocks do not need to burn Mana to be issued, nor do they consume deficit in the DRR scheduler. DRR is good at adapting to changing rates from each issuer and maximising the total throughput based on who is active, but in the case of validation blocks we have a fixed set of active issuers producing blocks at a preset rate, so we can simplify the approach and provide better properties thanks to these assumptions. Our intuition is that validation blocks can bypass the normal scheduler and get a preferential route in the data flow.

:imp: Spam prevention. An additional concern is spam protection: since these blocks do not consume any Mana or deficit, a malicious committee member may produce a large number of validation blocks, effectively saturating the available network resources. Detecting misbehavior is easy (only one validation block may be produced every frequencyValidationBlock seconds), but it is important to discourage it to avoid orphaning large portions of the future cone.

:memo: Homework. I would like to invite the IOTA community and researchers to formulate a solution to the problem of dedicated throughput for validation blocks in order to optimize the current approach. Let me clarify again: we currently have a solution that is being implemented in IOTA Core, which I won’t reveal here because, IMO, thinking out of the box is usually much better!

6 Likes
  1. Introduce a Priority Queue: Create a priority queue specifically for validation blocks within the data flow. This queue will prioritize the processing of validation blocks, ensuring they are handled with higher priority over other types of blocks. This will allow for faster confirmation and reduce communication overhead.

  2. Implement Rate Limiting: Set a maximum limit on the number of validation blocks that can be issued within a specific time period. This will prevent malicious committee members from flooding the network with an excessive number of validation blocks. Any blocks exceeding the limit can be ignored or penalized in some way.

  3. Dynamic Frequency Adjustment: Instead of having a fixed frequency for producing validation blocks, implement a dynamic frequency adjustment mechanism. The frequency can be adjusted based on the network load and the number of active committee members. This way, the validation block issuance rate can be optimized to achieve maximum throughput while avoiding network congestion.

  4. Validation Block Reward Mechanism: Introduce an extra reward mechanism for producing valid validation blocks (for example, double the mana). This would incentivize committee members to issue validation blocks, ensuring a consistent supply without compromising the network. The extra reward can be in the form of tokens or reputation points, encouraging committee members to actively participate while discouraging spam.

3 Likes

Can you explain why the naive approach of yeeting them off the committee as soon as they violate the minimum frequencyValidationBlock won’t work?

Split the queue in 2 parts, Q1 (validation blocks), Q2 (regular blocks).
Assuming 1s voting frequency, validators get assigned some virtual mana for each slot (say 20 mana just to be safe), each block burns 1 mana.
The scheduler prioritizes Q1, so it picks from Q1 until Q1 contains no more blocks with positive mana (alternatively, skip the virtual mana and alternate between Q1 and Q2).
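The two-queue idea above can be sketched roughly as follows. This is a toy model under stated assumptions: the class and method names are hypothetical, blocks are plain strings, and the virtual mana allowance is reset by an explicit `start_slot` call rather than by any real slot clock.

```python
class TwoQueueScheduler:
    """Toy scheduler: Q1 holds validation blocks, Q2 holds regular blocks."""

    def __init__(self, virtual_mana_per_slot=20):
        self.virtual_mana_per_slot = virtual_mana_per_slot
        self.q1 = []    # list of (issuer, block_id), in arrival order
        self.q2 = []    # list of block_id, in arrival order
        self.mana = {}  # issuer -> virtual mana left in the current slot

    def start_slot(self, validators):
        # Assumption: every validator gets the same fresh allowance per slot.
        self.mana = {v: self.virtual_mana_per_slot for v in validators}

    def submit_validation(self, issuer, block_id):
        self.q1.append((issuer, block_id))

    def submit_regular(self, block_id):
        self.q2.append(block_id)

    def next_block(self):
        # Serve Q1 while it holds a block whose issuer still has positive mana.
        for i, (issuer, block_id) in enumerate(self.q1):
            if self.mana.get(issuer, 0) > 0:
                self.q1.pop(i)
                self.mana[issuer] -= 1  # each validation block burns 1 virtual mana
                return block_id
        # Otherwise fall back to regular blocks.
        if self.q2:
            return self.q2.pop(0)
        return None


s = TwoQueueScheduler(virtual_mana_per_slot=2)
s.start_slot(["v1"])
for block in ["a", "b", "c"]:
    s.submit_validation("v1", block)
s.submit_regular("r1")
order = [s.next_block() for _ in range(4)]
print(order)  # ['a', 'b', 'r1', None]
```

In this sketch the third validation block stays in Q1 unscheduled once the issuer's allowance is spent, so regular traffic is not starved.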

This faces the same issue of “overspending”, where regular block-issuers are penalized with locked funds. So in a similar way we can punish validators through the performance factor and possibly through token slashing (provided some proof of spam).

1 Like

I would also choose @Buddhini’s solution with two schedulers, assuming that there is an easy way to detect such blocks on arrival. A different scheduler seems like the right thing since rules for these blocks deviate a lot.

I would take three values slotExpectedValidationBlocks, slotReceivedValidationBlocks and slotMaximumValidationBlocks, the latter being the maximum amount of validation blocks that can be sent during a slot (essentially duration divided by frequency). As soon as we receive the first validation block from a user in a slot, we set the expected blocks to 1 and start counting it up every frequencyValidationBlock seconds until the maximum is reached.

If the received blocks exceed the expected blocks by a certain threshold (note that slight network hiccups might skew our perspective slightly), we delay them.
If the expected blocks are way higher than the received ones, we stop counting up expected blocks until we receive another. That fixes attacks with bursting.

If somebody exceeds the maximum, we should have signed proof that he sent too many blocks and can slash him.
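A minimal sketch of the three counters described above, for a single validator in a single slot. The names `threshold` and `max_lead`, their default values, and the returned strings are assumptions made for illustration; the post does not fix them.

```python
class ValidatorSlotCounter:
    """Tracks expected vs. received validation blocks for one validator in one slot."""

    def __init__(self, slot_duration=10.0, frequency=1.0, threshold=1, max_lead=3):
        self.frequency = frequency
        self.threshold = threshold  # how far received may run ahead of expected
        self.max_lead = max_lead    # how far expected may run ahead of received
        self.maximum = int(slot_duration / frequency)  # slotMaximumValidationBlocks
        self.expected = 0           # slotExpectedValidationBlocks
        self.received = 0           # slotReceivedValidationBlocks
        self.last_tick = None

    def _advance(self, now):
        # Count expected up once per elapsed interval, unless the validator has
        # fallen silent (this pause is what limits a later burst).
        while now - self.last_tick >= self.frequency:
            self.last_tick += self.frequency
            lead = self.expected - self.received
            if self.expected < self.maximum and lead < self.max_lead:
                self.expected += 1

    def on_block(self, now):
        if self.last_tick is None:
            self.last_tick = now
            self.expected = 1
        else:
            self._advance(now)
        self.received += 1
        if self.received > self.maximum:
            return "slashable"
        if self.received > self.expected + self.threshold:
            return "delay"
        return "accept"


c = ValidatorSlotCounter()
print(c.on_block(0.0), c.on_block(0.1), c.on_block(0.2))  # accept accept delay
```

With the default values, a validator that sends one block and then stays silent for eight seconds finds `expected` frozen at 4, so it cannot release eight blocks in a single burst afterwards.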

Also stop validation blocks with bad timestamps of course. That might be another way to detect such abuse, but timestamps aren’t enforced to be 100% exact and so can be forged to a certain degree.

Do validation blocks have a sequence number inside of a slot? In this case that would be really easy to just immediately slash for having a seq higher than allowed or two messages with the same seq.

Also just curious, how often do you plan to rotate the committee? I have seen everything from 10 seconds to 24 hours now…

2 Likes

Hi Jack, thanks for your proposals, I’ll give some feedback on each one, but in general my comment is that we need some more detail to turn these suggestions into concrete proposals that we can incorporate into our solution.

  1. I agree that it would be important to provide higher priority to validation blocks as they are more important to the correct functioning of the protocol than basic blocks, but if we completely prioritise them over basic blocks, this would allow a spamming validator to cause starvation of the basic blocks in the queue and no actual data would be able to be written to the Tangle.
  2. We definitely need to rate limit the validation blocks, and this would help deal with the above issue of spamming validators causing starvation of basic blocks, but “ignored or penalized in some way” is not really specific enough to turn into a proposal. How can we ignore a validation block and be sure that others also ignore this validation block? They may arrive at different times and in a different order at each node, or a validator may intentionally send different traffic to different parts of the network. Similarly, any penalty must be provable and agreed upon by everyone, so the key is to make sure everyone can learn of the misbehaviour. We need to be more specific here with the proposal.
  3. Dynamic rate is an interesting suggestion, but what exactly would we respond to? When the congestion is high we require fewer validation blocks? What would this help with exactly? Also, what algorithm would we use to adjust this rate?
  4. This part of your proposal is right on target and is a key part of our existing solution! The reward a validator receives is key to incentivise the behaviour we want, so issuing blocks at the desired frequency is part of the “performance factor” calculation that determines validator reward, among other measurements of their performance such as how they attach those blocks to the tangle and how many other blocks they approve etc.
1 Like

Sure, I can try to explain. It’s not that “yeeting” the validators off the committee won’t work, but that could mean a lot of things and the devil is in the details. It depends on your “yeeting criteria” and when exactly the yeeting takes effect, but there are many ways I can think of that this would not work. For example, if my policy is to simply block all traffic from a validator if they go over their allowed rate, this could be problematic. If I don’t forward the excess blocks then how will others know that the validator has offended? What if those validation blocks are not accepted and included in a commitment? Then newly joining nodes will never know of their misbehavior. Does your proposal involve some sort of proof of misbehavior, and if so, who issues this?

So going back to the question again, I think blocking offending validators from the committee is a crucial part of any proposal, but it needs to be done in a provable way and agreed upon, so we need a concrete set of criteria and details of how that can be done.

1 Like

Thanks for your suggestion @Buddhini, I definitely agree that entirely separate bandwidth reserved for validation blocks and basic blocks should be the starting point, and that’s how our current solution works. However, the proposal involving a Mana quota has a pitfall, and it’s one we discussed a lot when we were introducing Mana to the scheduler and deciding between the mana burn priority queue and our DRR scheduler.
The problem is this:
If you have a fixed quota for your queue and you simply stop scheduling stuff once that quota runs out, then nodes could end up with inconsistent sets of blocks. Suppose I am a malicious validator with a quota of ten blocks and I send a different ten blocks to each of my neighbours. Each neighbour then selects these blocks as tips, but these new blocks will never be able to be solidified by the other neighbours. These approaches involving any fixed threshold are tricky to handle and can lead to high orphanage rates as we found in our experiments for Mana burn.
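The divergence described above can be reproduced with a toy model. The assumptions here are deliberately crude: two neighbours only, each sees the blocks sent to it directly before anything arrives via gossip, and a node simply ignores everything past the quota. None of this is taken from IOTA Core.

```python
def scheduled_blocks(arrival_order, quota):
    """Toy node: schedules the first `quota` blocks from one validator, ignores the rest."""
    return set(arrival_order[:quota])


QUOTA = 10
sent_to_a = [f"a{i}" for i in range(QUOTA)]  # ten blocks sent only to neighbour A
sent_to_b = [f"b{i}" for i in range(QUOTA)]  # ten different blocks sent only to neighbour B

# Each neighbour sees its own direct blocks first, then the other's via gossip.
view_a = scheduled_blocks(sent_to_a + sent_to_b, QUOTA)
view_b = scheduled_blocks(sent_to_b + sent_to_a, QUOTA)

print(view_a & view_b)  # set()
```

Both neighbours have used up the validator's quota, yet they share no scheduled block, so tips chosen by one can never be solidified by the other.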

Completely agree with your thoughts regarding punishment of overspending, so the question is, how can we obtain a proof of spam, and what would be a reasonable penalty? Certainly we would reduce the performance factor for this, but do you have any further thoughts on slashing? This is definitely an area we want to look into which is not part of our current solution.

Also @Werner, thanks for your proposal, and appreciate the level of detail. Replying here because I am only allowed three consecutive replies…

If I understand correctly, what you are proposing for rate limiting essentially implements a first in first out (FIFO) queue with a fixed service rate. You basically slow down the arriving traffic and only allow it to pass through at the desired rate. So what you would end up with is a separate FIFO queue for each validator’s blocks, each with a service rate of one block every frequencyValidationBlock seconds! This is in essence what we have in our current solution, and as the person who proposed it, I can say that I like it :sweat_smile: Please correct me if I misunderstood the suggestion.
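A per-validator FIFO with a fixed service rate can be sketched as below. This is only an illustration of the general technique, not the code in IOTA Core; the class name, the `release` polling model and the explicit `now` argument are all assumptions.

```python
from collections import defaultdict, deque


class PerValidatorFifo:
    """One FIFO queue per validator, releasing at most one block per service interval."""

    def __init__(self, service_interval=1.0):
        self.service_interval = service_interval  # seconds, e.g. frequencyValidationBlock
        self.queues = defaultdict(deque)
        self.next_release = {}  # validator -> earliest time of the next release

    def enqueue(self, validator, block_id):
        self.queues[validator].append(block_id)

    def release(self, now):
        """Return the blocks that may be forwarded at time `now`."""
        released = []
        for validator, queue in self.queues.items():
            if queue and now >= self.next_release.get(validator, now):
                released.append((validator, queue.popleft()))
                self.next_release[validator] = now + self.service_interval
        return released


fifo = PerValidatorFifo(service_interval=1.0)
for block in ["a", "b", "c"]:
    fifo.enqueue("v1", block)  # v1 sends a burst of three
fifo.enqueue("v2", "x")        # v2 behaves

print(fifo.release(0.0))  # [('v1', 'a'), ('v2', 'x')]
print(fifo.release(0.5))  # []
print(fifo.release(1.0))  # [('v1', 'b')]
```

The burst from `v1` is not dropped, only slowed to the preset rate, and it does not delay `v2` at all.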

Stopping any block with a bad timestamp is already taken care of in other parts of the protocol, so that’s all good.

Validation blocks currently do not have a sequence number, but that is not to say that they can’t be given one. We have discussed enforcing a rule that each validator must approve their previous validation block with their current validation block, which would achieve a similar thing by making a chain. I wouldn’t like to enforce a strict rule about skipping blocks in a sequence, as this could get messy: if some block in the sequence does not get accepted, the accepted ledger would look as though a validator has misbehaved. That rule seems too strict to me.

Committee rotation is every epoch, and that is more in the region of 24 hours, but not precisely determined yet. Slots are currently set to 10 seconds, but we definitely don’t rotate the committee every slot; that is far too frequent.

2 Likes

Hey @andrew.cullen ,
this seems to be the same problem as with regular block-issuers, who exceed their quota (according to manaburn) by sending different blocks to different nodes (and not having enough mana for all of them).
I guess since you said the new blocks would be orphaned, your solution involves keeping the blocks in the scheduler, maybe we can solve this differently for validation blocks and force solidification by scheduling the parents.
The attack then stops once the attacker has a negative mana balance in each scheduler.

Regarding punishment and proof of spam - you can simply send a block referencing all validation blocks of the attacker in a certain slot, and if their number exceeds 10+tolerance, then punishment happens. You can attach proofs of inclusion of those blocks as well.
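Checking such a proof could look roughly like this. It is a minimal sketch: the tuple layout, the function name and the default tolerance are made up for illustration, and signature or inclusion-proof verification is left out entirely.

```python
def is_spam_proof(referenced_blocks, accused, slot, allowed_per_slot=10, tolerance=2):
    """referenced_blocks: iterable of (block_id, issuer, slot) tuples cited by the proof block."""
    distinct = {
        block_id
        for block_id, issuer, block_slot in referenced_blocks
        if issuer == accused and block_slot == slot
    }
    # Only distinct blocks by the accused in the given slot count towards the limit.
    return len(distinct) > allowed_per_slot + tolerance


evidence = [(f"blk{i}", "mallory", 7) for i in range(13)]
print(is_spam_proof(evidence, "mallory", 7))       # True
print(is_spam_proof(evidence[:12], "mallory", 7))  # False
```

Counting distinct block IDs means that referencing the same block twice does not inflate the evidence.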

I can imagine many different punishments (hrhr):

  • set performance factor to 0 for entire epoch
  • remove validator from committee for this epoch (starting from 2 slots after the proof has been approved)
  • (additionally) slash some of the locked funds (0%-1% depending on the degree of spam)
  • (additionally) lock the validator out of committee selection for 7 epochs
    or any combination of those
2 Likes

Remove the difference between “validation blocks” and normal blocks. Enforce a minimum issuance rate for validators and compensate them with a mana boost.

However, to do this requires a simplification of the way validators broadcast their opinions in the tangle. This requires three relatively straightforward concepts:

ONE

First you need some kind of counter that tracks the depth of every block. Something like the magnitude of the heaviest mana-weighted path from a given block to genesis.

This value is an integer. I’ll call it the “Mana-Weighted Depth” of a block, or MWD. MWD can be tracked and booked on arrival with some simple logic :

  1. A node N issues a new block B
  2. Look at the “mana-weighted depth” of B’s parents
  3. Pick the largest value
  4. If Node N is a validator, increase the value from step 3 by N’s validator weight.
  5. Otherwise add zero.
  6. Book this updated MWD integer with the new block

Now every block is booked with an integer expressing its mana-weighted depth.
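The booking steps above amount to a one-line rule. A minimal sketch, assuming validator weights are known integers for the epoch and that genesis has an MWD of 0; the function and variable names are illustrative.

```python
def book_mwd(parent_mwds, issuer, validator_weights):
    """Mana-Weighted Depth of a new block: largest parent MWD plus the issuer's weight."""
    largest_parent = max(parent_mwds, default=0)      # steps 2 and 3
    return largest_parent + validator_weights.get(issuer, 0)  # steps 4 and 5


weights = {"V1": 5, "V2": 3}  # hypothetical validator weights
genesis = 0
a = book_mwd([genesis], "V1", weights)    # validator block on genesis
b = book_mwd([genesis], "user", weights)  # non-validator block adds zero
c = book_mwd([a, b], "V2", weights)       # takes the larger parent, then adds V2's weight
print(a, b, c)  # 5 0 8
```

Because only the parents' stored integers are read, booking needs no walk through the past cone.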

TWO

Next you need a way to measure something like the weight approving a block, as witnessed by another block. Something like “relative approval weight”

For example: Block A references another Block B, somewhere in its past cone. To find the relative approval weight that block A “sees” approving Block B, you subtract the MWD of B from the MWD of A.

This integer is the relative approval weight of block B, from the point of view of Block A. It is also trivial to calculate.

THREE

The third and final piece is to change the way validators express their opinions in the tangle. This piece is somewhat counter intuitive:

  1. A validator node N issues a block B that has two parents Px and Py in conflicting realities, X and Y respectively
  2. Find Blocks X and Y at the root of realities X and Y
  3. Find the relative approval weight of P(x) with respect to Block X
  4. Find the relative approval weight of P(y) with respect to Block Y
  5. Pick the Parent with the higher relative approval weight, with respect to its reality
  6. Assign the validator N’s opinion to whichever reality has the greater relative approval weight.
  7. Book block B in the winning “reality”

This logic is a little weird but it basically says: validators don’t have to form or express an opinion explicitly in the tangle. Instead, we can just measure how much weight a validator “sees” approving a given conflict, and assume the validator approves whichever reality has more approval weight.
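The derivation of a validator's opinion can be sketched as follows. It assumes every block's MWD has already been booked and is available in a dictionary; the tie-break in favour of X is an arbitrary assumption, since the proposal does not say what happens on equal weights.

```python
def relative_approval_weight(mwd, later_block, earlier_block):
    """Weight that `later_block` sees approving `earlier_block`: the MWD difference."""
    return mwd[later_block] - mwd[earlier_block]


def derive_opinion(mwd, parent_x, parent_y, root_x, root_y):
    """Return the reality ("X" or "Y") in which a block with these two parents is booked."""
    weight_x = relative_approval_weight(mwd, parent_x, root_x)
    weight_y = relative_approval_weight(mwd, parent_y, root_y)
    return "X" if weight_x >= weight_y else "Y"  # tie-break towards X is an assumption


# Hypothetical booked MWD values for the two conflict roots and the two parents.
mwd = {"X": 10, "Px": 25, "Y": 12, "Py": 20}
print(derive_opinion(mwd, "Px", "Py", "X", "Y"))  # X
```

Here Px sees a weight of 15 on top of X while Py sees only 8 on top of Y, so the validator's block is booked in reality X.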

Validators do not need to do anything special to vote. They just issue blocks and attach them to the tangle like any other block. Validator opinions are derived from the data structure itself.

Since calculating and tracking MWD and Relative Approval Weight are lightweight operations, no tangle walks or complex logic is required. And since we have consensus on validator weights going into an epoch, all nodes will agree on the opinion of all validators. And since validators are essentially compelled to vote for the heaviest subtangle, you will still get consensus on the conflict.

Now you can treat validator blocks just like any other blocks. Give validators a mana boost and require them to issue transactions at a minimum rate, to guarantee confirmation times.

1 Like

I do not see any issue with adding a sequence number. It also does not force a chain: just increase it by 1 for every statement you make. This would allow very easy detection of spam, since a too high or duplicate sequence number is all we need to reliably prove somebody sent too many validation blocks.

Skipping numbers or receiving them in the wrong order needn’t be an issue either, as we use our expected blocks to prevent somebody from sending all messages in a single burst. The validator would just need to have measures in place to not accidentally break the rules.
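The sequence number check could be as small as this. A sketch under assumptions not stated in the thread: numbering starts at 0 in each slot, and a repeated sequence number only counts as a duplicate if it belongs to a different block, so that receiving the same block twice via gossip is harmless.

```python
def check_sequence(seen, seq, block_id, max_per_slot):
    """seen: dict mapping sequence number -> block_id for one validator in one slot."""
    if seq >= max_per_slot:
        return "too_high"   # more statements than the slot allows
    if seq in seen and seen[seq] != block_id:
        return "duplicate"  # two different blocks signed with the same number
    seen[seq] = block_id
    return "ok"


seen = {}
results = [
    check_sequence(seen, 0, "blk-a", 10),
    check_sequence(seen, 3, "blk-b", 10),   # skipping numbers is tolerated
    check_sequence(seen, 3, "blk-b", 10),   # same block again: harmless
    check_sequence(seen, 3, "blk-c", 10),   # different block, same number
    check_sequence(seen, 10, "blk-d", 10),  # beyond the slot maximum
]
print(results)  # ['ok', 'ok', 'ok', 'duplicate', 'too_high']
```

Either of the last two results, together with the signed blocks involved, would serve as evidence.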

1 Like

Hey @andrew.cullen ,
i thought again about the issue of solidification (block-issuers/validators spamming different blocks to different nodes).

I believe a FIFO queue with a fixed service rate (your current implementation) is basically the same as a fixed quota for the slot, just with a different granularity (e.g. 20 bp10s vs 2 bps).

The way this issue is/will be handled/mitigated (correct me if i’m wrong), is to recognize (and not attach to) “suspicious” blocks to lower the risk of becoming orphaned.

I believe we could soften this specific issue of malicious validators a bit by removing validator blocks from the tip-selection of regular block-issuers. So now we can have at most N (= number of validators) orphans, which isn’t so bad considering it’s just votes (and not actual user transactions).
To avoid validator orphans, we could additionally force solidification of validator votes (just schedule their parents with them), so in worst case a malicious validator will additionally consume N tps (forced solidifications, given a validator references at most 1 block of each other validator).
When scheduling a validators vote due to a forced solidification, we can raise our suspicion of that validator and eventually not attach to it anymore.
(This seems okay to me, until we consider many malicious validators cooperating)

But i’ll stop on the solidification issue for now (as there are thousands of different scenarios to consider and in fact it appears to be a fundamental unsolvable problem to me…)

1 Like

Thanks for your reply and detailed approach! Your solution reminds me of the good old times of “Tangle 1.0” analysis :slight_smile: Right now, we tend to make use of slots and epochs, which in general permit simpler and quicker computations. Additionally, it is kinda problematic to deal with synchronization for new nodes when one has to locally compute certain quantities.

But let me reply to your approach in more details:

  1. If I understand correctly, the MWD of a block B is the weight of the heaviest chain of validation blocks in the past cone of block B, as standard blocks do not contribute to the MWD calculation?

  2. What is the advantage of picking the largest value instead of the smallest (or any other combination)?

  3. As for the relative approval weight, even though B is in the past cone of A, MWD(A) may be computed according to a past cone which does not include B. Does this affect your algorithm?

  4. You can treat validator blocks just like any other blocks

Normal blocks burn Mana and it is something we do not want to apply (a validator may run out of Mana while helping the network to progress!).

  5. Give validators a mana boost and require them to issue transactions at a minimum rate, to guarantee confirmation times

How would you enforce a minimum rate for validator blocks?

2 Likes

The malicious validator can force a “cascade” of solidification requests, as it is not possible to know how many missing blocks there are.

This makes the attack stop, but it does not guarantee that all nodes end up with a consistent view of validation blocks from the malicious node. As Andrew mentioned, a hard cap on throughput per issuer/validator is likely to introduce consistency issues.

As for the punishment, another possibility is to decrement the Mana amount depending on the amount of spam.

I remember we proposed to use sequence numbers in the context of the Adaptive PoW but this was causing problems, in particular to deal with collisions: what to do when two blocks with same sequence number arrive? To keep both or to ignore both?

BTW, I guess we can achieve the same behavior as with sequence numbers by just considering the block timestamp, as we know we cannot have more than one block in a given slot interval.

We can limit forced solidification for votes to depth 2 (and fall back to basic behavior on failed attempts).
If validators do not attach to “suspicious” blocks - where forced solidification triggers suspicion - this should result in successful forced solidifications of honest validators? :thinking:
We could adjust the depth of forced solidifications based on the rate of validators in our queue (e.g. 2bps) and voting frequency.

So say there is 1s voting frequency and 2bps fixed service rate limit for validators, we can set max solidification depth to 2bps*1s = 2 blocks and add some tolerance.
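The arithmetic above, written out. A minimal sketch; the function name is made up, and rounding up is an assumption for cases where the product is not a whole number.

```python
import math


def max_solidification_depth(service_rate_bps, voting_interval_s, tolerance=0):
    """Blocks a validator can have pushed through the queue per voting interval, plus slack."""
    return math.ceil(service_rate_bps * voting_interval_s) + tolerance


print(max_solidification_depth(2, 1.0))               # 2
print(max_solidification_depth(2, 1.0, tolerance=1))  # 3
```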

Well, forced solidifications was supposed to tackle this issue.

Is it even possible to not have inconsistency issues without forcing solidifications (adding synchronicity)? Aren’t we getting into FLP territory here (probably mixing smth up here) ?

As far as i understand, there is a natural quota per slot anyway, based on the bandwidth and the exact algorithm of the scheduler, so i don’t quite understand how lowering that quota per validator causes that issue, the purpose was to reduce it :thinking:

Assume there are 10k nodes in a network, 50 validators. The malicious validator creates 100m different votes and sends 10k to each of the 10k nodes.
How can this not result in inconsistency issues?

The only way i can spot to somehow deal with this is to follow the 50 validators and their local tangles by solidifying their blocks.

A completely different solution which i highly like is having a rotating slot leader and remaining validators attesting to it.
Would solve many issues (SC sequencing, consistency issues, finality time, bandwidth, etc.)

1 Like

I remember we proposed to use sequence numbers in the context of the Adaptive PoW but this was causing problems, in particular to deal with collisions: what to do when two blocks with same sequence number arrive? To keep both or to ignore both?

This is indeed a tricky question. Too high sequence numbers can be safely dropped without any issue, but this does not work with collisions.

The attacking scenario would therefore be to massively spam the input queues with duplicate sequence numbers. At this point we have pretty clear evidence on the misbehavior and can file punishment requests containing the signed transactions (no reference needed).

At this point we could deem every transaction from the attacker in the slot as invalid, but some of these transactions are likely already processed by other validators (or we might have processed them ourselves) and might even be referenced.
So consensus would have to be found on which transactions are part of the ledger.

My suggestion to work around this would be to stop scheduling anything from that queue upon finding a duplicate sequence number and “dislike” everything he sent in this slot including its children so there is a high chance it will get orphaned (we’d eventually wipe the queue). In case of referenced transactions the committee would need to decide if they still include them.
If this results in more blocks than allowed inside the ledger, we treat these as if they came through the regular scheduler and deduct mana - potentially getting the attacker deep into the negatives.

By also requiring positive mana to participate in the validation scheduler (even though we normally don’t deduct it), this could result in a long ban from validation. However, this would require a higher deposit as issuers, or you would also need to lock some mana as collateral that will be consumed in such cases.

BTW, I guess we can achieve the same behavior as with sequence numbers by just considering the block timestamp, as we know we cannot have more than one block in a given slot interval

Good catch! By dividing a slot into n intervals, we essentially have them set implicitly and don’t need another field.
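Using the timestamp as an implicit sequence number could look like this. A minimal sketch, assuming timestamps are integers in milliseconds and the slot start is known; the function names are illustrative.

```python
def interval_index(timestamp_ms, slot_start_ms, frequency_ms):
    """Index of the interval within the slot that a timestamp falls into."""
    return (timestamp_ms - slot_start_ms) // frequency_ms


def find_collisions(blocks, slot_start_ms, frequency_ms):
    """blocks: iterable of (block_id, timestamp_ms) from one validator in one slot."""
    first_in_interval = {}
    collisions = []
    for block_id, timestamp_ms in blocks:
        idx = interval_index(timestamp_ms, slot_start_ms, frequency_ms)
        if idx in first_in_interval:
            # A second block in the same interval plays the role of a duplicate number.
            collisions.append((idx, first_in_interval[idx], block_id))
        else:
            first_in_interval[idx] = block_id
    return collisions


blocks = [("a", 100_200), ("b", 101_100), ("c", 101_900), ("d", 103_000)]
print(find_collisions(blocks, slot_start_ms=100_000, frequency_ms=1_000))
# [(1, 'b', 'c')]
```

Blocks `b` and `c` both fall into interval 1, which is the timestamp equivalent of two blocks carrying the same sequence number.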

1 Like