Berserk detection protocol for FPC and CA

It is known from both mathematics of FPC paper and simulations that berserk attacker can cause more havoc than cautious or semi-cautious. Probabilities of agreement failure increase when nodes can send two different opinions to different nodes. Moreover, it is known that in order to to be byzantine secure CA consensus requires opinion justification. However, such opinion justification allows for the detection of berserk nodes only if the malicious node is in the cycle of length 4.

To improve the security of both consensus mechanisms we propose the protocols that allow for the detection of berserk attackers in both FPC and CA. Details of berserk detection might be different for FPC and CA, however, the basic idea behind both is very similar.

{\Large{\text{Berserk detection in FPC}}}

The protocol allows every node to ask any other node for a list of votes received during the previous round of FPC voting. We call such a list v-list. Presumably, a request for v-list would be attached to the query request. Requesting v-list would not be mandatory, each node can ask for it with a certain probability as long as it has bandwidth capacity. To model the behavior of the system we assume that this probability is the same for all of the nodes and equals p (p is the probability that an arbitrary query request includes a request for v-list).

Assume that in the last round a node y received k votes submitted by nodes z_1,...,z_k. If a node x asks y for the v-list, y sends him votes submitted by z_1,...,z_k without signatures. This reduces the message size. If x detects any trace of suspicious behavior it will ask for confirming signatures with additional messages. Therefore message sends to x include only (possibly compressed) id of z_i and a single bit value of a vote (or possibly few more, see further improvements section for details of voting with history of votes). If detected proof of berserk behavior is gossip to the network and malicious actor is dropped by everybody.

{\Large{\text{Expected number of rounds before berserk detection}}}

Consider the following scenario: Among N honest nodes there is a single berserk node. In the last round, it was queried k times and it sends f \cdot k votes 0 and (1-f)k votes 1 to different nodes in the network. Let us call the first group G_0 and the other one G_1.

Probability that a node x requests v-list of a node from G_0 equals p fk/N and p (1-f)k/N for G_1. Assuming that both events are independent and keeping everything to the first order (which is reasonable when either N is large or p is small), probability that x receives two v-lists that allow for the detection berserk behavior equals

P(x\text{ receives opinion from } G_0) = 1-(1-p fk/N)^k \approx p fk^2/N,
P(x\text{ receives opinion from } G_1) = 1-(1-p (1-f)k/N)^k \approx p (1-f)k^2/N,
P(x {\text{ receives opinion from both: }} G_1 {\text{ and }} G_2) = \frac{p^2 (1-f)fk^4}{N^2}.

If N honest nodes follow this procedure then, the probability that some node detects berserk equals

P({\text{Finding berserk}}) = \frac{p^2 (1-f)fk^4}{N}.

The attacker is ‘‘maximally berserk’’ when f= (1-f) = 1/2. For this attacker and N = 1000, k=20 and p=0.1 detection probability equals 0.4. For p=0.01 it drops to 0.004. In the case of p=0.1 attacker should be found after 3 rounds, for p=0.01 after 250. Assuming that the full FPC voting takes \sim 30 rounds, for p=0.01 berserk should be detected after \sim 8 full FPC voting schemes.

{\Large{\text{Message overhead}}}

The id of a node takes 32 bytes. Assume that it can be compressed to x bytes. Note that since exchanging v-list is taking place together with FPC voting, the node can sign vote and v-list once. Therefore the message overhead for just v-list should take be on average

p k\cdot (x+1/3).

Assume p =0.1, k=20 and x = 6 then on average network requires additional 13 bytes more than 2000 expected to be used in FPC without berserk detection. Note that p=0.1 results in almost immediate berserk detection. Even in this case, the messages are going to be on average \sim 0.7 \% bigger.

{\Large{\text{Further improvements}}}

In the protocol presented above nodes compare votes from the single, previous round. However, we can increase the efficiency of the protocol by comparing not only the last vote but rather entire voting history. By history we mean a list of all votes cast by a given node on the conflict in different rounds. For example, if a node is in 30-th round on voting on particular conflict its history consist of 30 bits, each of them corresponds to the vote in one of 30 round. This modification would not be communication costly, the size of the exchanged messages would increase only by only a few bytes.

However, in order for such protocol to work, we would require each FPC queried node to send its entire voting history on a particular conflict. This ‘‘extended vote/vote history’’ would grow with each round of voting by one bit. Then as long as particular voting is taking place, all nodes would keep in the memory old v-lists. Then If they receive v-list from the same node in different rounds it compares votes on every available entry.

For example: when a node x receives the voting history from the same node y in the 20th and 30th rounds it would try fo find traces of berserk behavior in 20 rounds. With this method, the effectiveness of berserk detection increases greatly. Also, what is important sending even 32 additional votes with one bit per vote would not increase message size significantly. 4 additional bytes are insignificant with comparison to 64 required for signature and 50 for hash of the transaction and 32 for the public key (although two last can be compressed).

{\Large{\text{Version for CA}}}

A similar berserk detection protocol can be used for CA. One option would be to utilize an already discussed solution. Then the nodes would exchange information about the votes (votes history) on the layer outside the voting, for example on the gossip layer. However, such communication is very slow in comparison to CA. Therefore, the berserk detection most probably would be possible only when voting is over and potential damage is done. To overcome this problem we propose a modified solution that uses the opinion exchange layer.

Each node can prepare a vote history request. Such a request would have number \ell associated with it. The request is sent to one of the neighboring nodes which upon receiving such request decreases counter of number \ell by one and sends it to one of its nearest neighbors (randomly). The next neighbor also decreases \ell and passes the request along. Assume that a node x receives a request with number \ell = 0. Then it prepares a special message with all of the votes send it by its neighbors and justification of those votes (so in total m = 8 votes from current round and up to m\cdot m votes from the previous round). Note that its own voting history can be derived from those votes. The size of this message would be small as opinion can be expressed with a single byte and node identities can be compressed or even skipped if the list of neighbors is public. Moreover, signatures of neighbors should be skipped in the initial message as they can be requested when malicious behavior is detected. The only problematic part would be the signature of x. When ready the message is sent back using the same way it came from.

When \ell is comparable with graph diameter we assume that the final node which receives a request should is selected roughly uniformly random.

Let us model the probability when each node has m neighbors. Assume that a node sends a voting history request with probability p. Then the probability of finding single berserk who sends fk zeros and (1-f)k ones to its neighbors is analogical as in the case of FPC

P({\text{Finding berserk}}) = \frac{p^2 (1-f)fm^8}{N}.

Assume m=8, p=0.01, N=1000, f=1-f = 1/2 then this probability equals \approx 0.42. We want to note that the requirement of sending the entire history of vote can even increase this probability. Moreover, nodes on the way can keep the voting history in their buffers to have more vote histories to compare to.

{\Large{\text{Summary}}}

We have a viable strategy of detecting berserk attackers in both FPC and CA (however in the CA case we require that berserker is surrounded by honest node). To make it even more effective we propose a small change to the voting protocol. We suggest nodes to vote with entire history rather than only the most recent vote. As we showed this would not change the message size in a significant way.

How is this approximation made?

(1-x)^k \approx 1-kx

Ah, of course.

Why is the denominator not N^2?

We have N nodes following the same protocol. P(\text{Finding berserk}) doesn’t mean that “this particular” node finds berserk, it is the probability that there is at least one node that finds it. So N^2 from the denominator is canceled with N multiplied
in numerator (once again rough estimation).

In this case, replace the words “a certain” with “some”.

Furthermore, the value you computed is the “Expected value of the number of berserkers detected in a round”, and I am not sure if this number is important at all.

Your calculation is incorrect for two reasons. First, the probability of each node detecting a berserker is not independent, and so you cannot compute the whole from the parts. Second, even if the events were independent, then
F(\mbox{Finding berserker})=1-\big(1-\frac{p^2(1-f)fk^4}{ N^2}\big)^N
Which is a very different number.

Have you talked to one of our probability people about this?

Although now I look at this, I see that what I wrote above is approximately what you wrote :slight_smile: (although I am not sure if you can wave away the independence part)

It gives the information about expected number of rounds until detection of a single berserk (inverse of the probability).

Well… assuming that each node queries randomly and requests for the v-list is attached to the query, those events are independent. In what way a node x which querries randomly and does not share information about who does he query could influence a node y which does the same thing?

To be completely honest I already used violated “independence” in another formula: I approximated that picking votes from G_0 and G_1 is independent where actually it is not. Anyway, this was supposed to be is a rough approximation and I don’t think that in this case, it is unreasonable to assume that finding vote from G_1 does not significantly decrease the probability of finding vote from G_0.

Once again (1−x)^k\approx 1−kx.