# Milestone selection algorithm

Introduction

In the current setting, the Coordinator periodically issues milestones. When a transaction is directly or indirectly referenced by a milestone, it is considered confirmed.

A milestone, as well as normal transactions, references two existing transactions. The choice of which transactions to approve is a non-trivial task as we would like to maximize the number of confirmed transactions. Currently, the trunk and the branch are chosen through a biased random walk. This algorithm, however, is slow because it requires the calculation of cumulative weights. In this report, we will propose an algorithm having the following two properties:

• Fast execution: as the milestone generation is periodic, the algorithm must terminate in short time (i.e., no more than a few seconds);
• Best effort confirmation rate: the size of the union of the past cones of the two referenced transactions should be large.

However, the second property cannot guarantee that the solution found is an optimal solution to the problem as the milestone selection problem is NP-hard.

What is the reason why we are interested in such a problem? First of all, we aim to improve the performance of the current network: when we get rid of the cumulative weight calculation for the random walk, we need an alternative algorithm for milestone selection. Furthermore, the problem is relevant from a theoretical point of view, and may be somehow applied to a Coo-less network for choosing the heaviest branch of the Tangle.

Problem description

The milestone selection problem can be formalized as follows:

Given a directed acyclic graph (DAG) \mathcal{D}, find the two nodes x, y\in\mathcal{D} that maximize their union set of descendants (i.e., approvees).

The problem is difficult to solve, and it is most likely NP-hard (the problem can indeed be reduced to set partitioning).
In fact, the number of descendants of node x depends on the number of descendants of its children nodes x_1 and x_2 minus the number of common descendants of x_1 and x_2. To calculate the common descendants, one cannot just know the number of descendants - but also what the descendants are.

Multi-start random greedy approach

In this article, we propose a greedy approach to solve the milestone selection problem. The idea is inspired by the observation of a Tangle built through the uniform tip selection algorithm. In this scenario, when no attackers are present in the network (we will relax this assumption later), we observe two key features:

1. We expect all branches to have more or less the same number of approvees, since the uniform tip selection would merge small side branches;
2. If a larger branch emerges for some reasons, a larger number of tips is expected to belong to it (see Figure 1).

Figure 1. Representation of the Tangle using uniform random tip selection. The yellow area represent the past cone of the dark green tip.

According to the above properties, a randomly chosen tip will belong to the heaviest branch whp (see the dark green tip of Figure 1). The choice of the second tip follows the same reasoning, hence it is chosen randomly.

A powerful attacker can decide to inflate the number of tips to have a higher probability to be selected. The attacker can, e.g., concentrate the tips around the genesis, introducing the blowball effect (Figure 2). We propose an iterative greedy approach in order to deal with similar attacks: at every step, the algorithm counts the number of descendants of the randomly chosen tips, and replaces randomly the tip with the lowest number of approvees. In case of attack, the algorithm is able to â€˜â€˜escapeâ€™â€™ easily from the blowball as the computation of descendants for those tips will terminate fast.

Figure 2. Tangle with a blowball attached around the genesis.

Here, we describe the algorithm more formally:

1. Choose a random tip x, and compute the list of its descendants X (to compute the descendants of a node, it is possible to use algorithms such as BFS or DFS which have time complexity of O(|V|+|E|), and, since in the Tangle |E| \leq 2 \cdot |V|, we get O(|V|)).
2. Choose a new random tip y, and compute the list of its descendants Y. For optimization purposes, one can compute Y\setminus X.
3. If |X| < |Y\setminus X|, then swap X and Y.
4. If |X\cup Y| \geq \Gamma, then terminate the algorithm (see next section for further information about the stopping parameter \Gamma>0). Otherwise, reiterate from point 2.

This algorithm may get stuck in some local maximum. In order to mitigate this problem, we propose to iterate the algorithm above multiple times (multi-start).

Termination policies

A fundamental requirement for iterative algorithms is to properly set the stopping criterion. In point 4., we introduce a parameter \Gamma inside the condition of termination of the algorithm. This condition relates on the fact that the coordinator knows how many transactions have been issued since the last milestone. Hence, it can make a guess on a reasonable value of \Gamma, i.e., the minimum number of confirmed transactions per second (CTPS) it wants to achieve.

However, depending on particular configurations of the Tangle, it can happen that the above condition is never satisfied. In this case, we can additionally set a maximum time of execution of the algorithm, or rather a maximum number of iterations.

Next steps

The future step are listed below:

• Define a rigorous termination policy to maximize the tradeoff speed/CTPS.
• Perform simulations. Currently, we have a Python simulator which creates a new Tangle (i.e., from the genesis), and applies the multi-start random greedy algorithm. Preliminary simulations show that the algorithm terminates in a few seconds running until 1000 transactions per second.

In Step 1, it should be mentioned that x is in the future cone of the previous milestone.

It would be interesting to see if this algorithm could be modified to set distributed milestones somehow.

Is this a necessary condition? Or could we allow that certain transactions are referenced only by some milestones?

In practice, my intuition says that he future cone of the milestone is more likely to be the heaviest branch, so the condition would almost always be satisfied.

I think having the milesotnes be ordered is fairly reasonable to ensure than none of them are ever orphaned. But it seems easy enough (and for your second point probably good anyways) to pick the first x in the future cone of the previous milestone.

What I am thinking, since anyhows we have to do a walk to count transactions, why not use Local Modifiers walk to choose the tips for the milestone

This is for the following reasons:

1. It is less gameable than (almost) URTS on a subset
2. In order to implement the secret transaction idea, we have to do a walk anyhowsâ€¦
3. Using LM will increase the likelihood of us using a tip straight from the get go that satisfies \Gamma

I guess the walk that starts from a secret checkpoint can have lower randomness and a walk that starts from a milestone can have higher randomness (just intuition)

Also I need to remind you that we need Below Max Depth (Perhaps it is obvious but I think it should be written down here)

The Below Max Depth Rule does need to be mentioned indeed.

Also, based on other work weve done, you dont need to do random walks to count transactions. Also LM has the same orphanage problems as CW based random walks, and and they do not really prevent lazyness.