Clocks and DLTs

william.sanders · April 28, 2020, 8:34am

Although we are all busy with coordicide, there is always time for a bit of theory. The ideas here were developed during many good chats over beer with other Iota researchers.

What is a clock

A tempo is a state machine whose state are enumerated by \mathbb{N} and state i can only transition to state i+1 (note, this is my own terminology and not a standard definition). Each state change is effectively “beat” or a “tick”. For example, consider a watch counting seconds since a certain point in time. Each time the watch ticks, the state changes.

Generally, a tempo can be very irregular. For example, the number of super bowl wins by the Green Bay Packers is a tempo. The tempo moved from 0 to 2 in the 60s, and then to 3 in the 90s and then to 4 in 2010. If the Packers do not improve their defence, the tempo will stay at 4 for a long time. More generally, imagine a tempo which records the number of flashes of a light. The light can flash 1000 in one hour, and then remain off for a century. The point is that tempos need not be regular.

We call a tempo a clock when it changes state regularly (again this definition is not from any scientific literature). Since this definition is not rigorous, we give a few examples. First, a watch counting seconds is most certainly a clock. Second, the flashing light tempo discussed earlier is not a clock. However, this tempo would be a clock if the flashes were a poisson point process, i.e. the light flashes randomly but at a particular rate with high probability.

Clocks thus are a tempo that corresponds to real time, or more generally a clock is a machine for measuring time. Defining a clock more precisely requires healthy doses of physics and philosophy. Although I will now move on, the reader may ruminate over this point with a pint.

Lamport Clocks

In Lamport’s landmark paper on distributed systems, he discusses a way of partially ordering messages in a distributed system. Each processor counts the number of messages they have seen. He refers to this number as a Lamport clock. Then, each message may include a Lamport Timestamp which is the time on the Lamport clock when the message was created.

The Lamport timestamps give a partial ordering. The order is partial because multiple messages can have the same timestamp. Lamport shows that these timestamps have a very important property: if message B depends on message A, then the Lamport timestamp of A is smaller than B. However, this only is true in an honest setting because the timestamps are unenforced.

Now a Lamport clocks is a tempo but not generally a clock in our terminology established above. Messages can be generated at any variable rate especially if one of the processors is byzantine. Thus the tempo can be completely irregular.

Clocks and DLTs

Every DLT needs to control the rate that information is added to the ledger, otherwise it can be spammed and overwhelm all the processors in the network. However, the word rate is meaningless without some sense of time. Thus, any DLT needs a clock! We will justify this statement further but first we give some examples of clocks featured in DLTS.

Proof of Work Clock: Any proof of work block chain is a clock. Indeed, the bitcoin white paper describes a blockchain as a timestamping algorithm. Any time a block is discovered, the clock “ticks”. Indeed, the difficulty is adjusted so that the discovery of blocks is always a poisson point process of constant rate.
Committee clock: Suppose we have a “committee” of n processors who each release a message every few seconds. Any time k< n committee member release a message, the clock “ticks”. Since k is less than n, the entire committee does not need to be honest. This type of clock is used in hash graph and also in the Aleph 0 project. Actually, our DRNG is also a clock by this reasoning.
Coordinator: Each time the coordinator releases the milestone, the “coordinator clock” ticks. Indeed, the milestones are used as a timestamp in many places in the current network. This is technically a committee clock with n=k=1.
Normal clocks: Some proof of stake block chains effectively use normal clocks for their rate control. Consider EOS: blocks are supposed to be issued at certain intervals. After each block is issued, the following blocks are essentially “votes” on whether or not the block was issued at the time.

Various DLTs use these clocks to control the rate messages (aka blocks or transactions) are created and distributed in the network. In the current protocol, the below max depth criterion uses milestones to prevent large bursts of transactions which cannot be validated quickly.

The rate messages are created need to be bounded, otherwise the network will be congested or nodes will not be able to process all the information. On the other hand, a minimum number of messages must be created otherwise the network is effectively down. Thus message creation must be correlated in someway with real time.

Let me rephrase using our earlier terminology: in any DLT (with an effective rate control system) each node’s Lamport clock tempo becomes a real clock (by our earlier definition). Moreover, DLT Lamport clocks are byzantine resistant.

Many DLTs go a step further and find find ways to essentially enforce Lamport timestamps. Consider block chains. In theory, the block number is effectively the number messages, or blocks, that a miner has seen up to that point. Miners are effectively incentivised/forced to put their block at the end of the longest chain. But this is equivalent to issuing a block with the correct Lamport timestamp.

Clocks and Coordicide

This might all be very interesting, but who cares? Well, let us examine how clocks are used in coordicide. First, we cap each nodes’ message processing rate by \nu. We use a combination of scheduling algorithms and back off notifications to ensure that large backlogs do not form causing large latencies or message drops even during an attack. Thus each node’s Lamport clock will tick away roughly at rate \nu with no problems. This is great news!

Coordicide also has timestamps measured in seconds which are enforced through FPC voting. These timestamps replace many of the roles that milestones provide in the current protocol, such as the below max depth criterion and snapshots. Also, seconds are used to measure difference in arrival times of conflicts, per the FCOB rule. Thus coordicide essentially uses two clocks: a normal clock and a Lamport clock.

I recommend however that we eventually only use Lamport clocks, and use Lamport timestamps instead of real time timestamps. Furthermore, I think delays between conflicts should also be measured in terms of the Lamport time. Note, Hans had a very similar idea last spring when we first discussed timestamps.

Besides natural elegance, here are some reasons why this would be a very good idea:

Currently, if no one issues any messages for 30ish minutes, the network will die because no one can issue a transaction with a valid timestamp that satisfies the below max depth rule. Now this would only happen during some global internet outage, and the community can come up with an ad hoc solution in this case. However, using Lamport time, the protocol would easily survive these outages because everyone’s clocks would simply pause during a network outage.
Currently DDOS attacks complicate certain aspects of the protocol. Indeed, we take these delays into account when setting levels of knowledge and judging timestamps conflicts. However, DDOS attacks would not affect a Lamport clock: a node will simply pause their clock till they come online again. Assuming the node receives all the mesages in the right order, each node would measure the correct arrival times and delays in Lamport time even durring a DDOS attack. As a result, the timestamp window could shrink to a much smaller value.

Let me be clear: we would only use Lamport time in a future version of the protocol and not the first version of coordicide. Indeed, we currently do not have time to sort out all the necessary research questions to implement this now. For instance we would need to know the following:

What is the network delay in “Lamport” time?
Are my claims here actually true? If a node is DDOSed, will it still measure the correct delays in Lamport time upon recovery?

mthcl · April 28, 2020, 10:28am

I recall those discussions; some points from my side:

Real time is something on which an approximate consensus (in the sense that we may assume that the clocks of most honest actors are reasonably precise) is already present. Why not use something that is given to us for free? (That maybe a healthy difference in our ways of thinking, though: I noticed that you generally try to make the system more “self-contained”, while I tend to rely more on “real-world mechanisms”.)
Those “rank” schemes are difficult to research in the presence of Byzantine actors. If I’m an attacker who wants to mess with these for some reason (by “mining” lots of messages in different places, releasing them all at once, sending some to one part of the network and other to another part etc. etc.), what is the amount of mess that I’m able to create, and with what probability am I successful? Moreover, even for honest actors, it is not clear where the NE is, i.e., am I incentivized to “interfere” with the ranks for any reason?
All this becomes even more messy in sharded systems: while we still know that the real time is more or less the same for almost everybody (modulo time zones), what would happen with ranks in the situation when the nodes only see a (small) part of the messages? There might be a possibility that the Lamport clocks would “diverge” in different part of the world…

Also:

(thinking about the plural in the above sentence) I’m still a bit more optimistic about the future…

This looks as quite a strong assumption to me.

william.sanders · April 28, 2020, 2:43pm

You raise some excellent points!

Since we are judging arrival times of messages, this approximate consensus is not as strong as we would really like. Just notice all the complications made by this big D little d business.

This is fundamentally different then rank. Assuming that our rate control algorithms work as well as we claim they do (and this is the key point!) we can be certain with high probability that all our Lamport clocks are well synched. In fact, they should be synched better than our normal clocks (this should be verified though before we implement this). So assuming that the Lamport clocks behave correctly, voting on “real time” time stamps should be the exact same as voting on Lamport timestamps: all users must report the correct time (approximately) or else theyre message will be downvoted by everyone.

Well you got me here Sharding would complicate this. Then again, rate control also is complicated in sharding too.

For short outages, this is actually quite reasonable. If you are my neighbor, using the schedule, I make a queue of messages I need to gossip to you. If you dont send me confirmation that you received a message, I try again until you get it or I give up. If I always broadcast my queue in order, you will always receive messages from an honest neighbor in the correct order too. Now attackers do have methods of increasing delays (by not gossiping messages and such) but this is true regardless of what clock you use, and should be guarded against by the flood gossiping.

I do admit though that I make some grandiose claims which would need verification.

mthcl · April 28, 2020, 3:03pm

Yeah, but my point still stands: even though it’s not very strong, it’s far from being useless.

luigi.vigneri · April 28, 2020, 3:10pm

That’s not correct. This is true in case I have a direct link between two nodes (e.g., connected by a cable). In the Internet, nodes may take a different path depending on what routers decide. Hence, it’s not unusual that transmission order won’t be preserved.

Hopefully, some research on congestion control exists in case of network slicing as well. However, the system model is often different, but I expect to find some relevant paper anyway.

luigi.vigneri · April 28, 2020, 3:14pm

Very nice article. Two comments:

have you thought of different clocks, for instance vector clocks? What would be their impact? I think the Lamport clock is the most trivial version.
while logical clocks are definitely useful in many contexts, in the rate control scenario we also need a notion of actual time (in seconds), because the bandwidth and the processing capabilities are measured with this unit (e.g., bit per second, operations per second).

william.sanders · April 29, 2020, 5:38am

Thank you!

I have not given serious thought to vector clocks. However, I think the hashgraft style DLTs seem to borrow ideas from them.
The rate control algorithms indeed have to use something other than a logical clock: the logical clock only becomes a clock if the rate control algorithms force it to be correlated with time somehow. Since the other options are basically PoW clocks or committee clocks, it makes sense that we use real clocks.