In order to join the Tangle, a new transaction must choose two previous ones to approve. In general, it will choose two previously unapproved transactions, or tips. The method for choosing these two tips is called the tip selection algorithm.
This page describes provides both a basic introduction and a deep dive to how tip selection works.
Nodes are not obligated to follow the recommended tip selection algorithm. However it is designed so that if most nodes do follow it, the rest are incentivized to follow suit.
Tip selection is done by performing a weighted random walk from the genesis towards the tips. The walk stops when it reaches a tip. The walk is performed twice, and so two tips are chosen.
The walk is biased towards transactions with more cumulative weight, or more transactions referencing them. This creates an incentive to approve new transactions rather than old ones.
The following document describes what happens behind the scenes when a user requests the getTransactionsToApprove API. This API is invoked when a user wants to send a transaction to the Tangle, to do so he needs to request two suitable transactions to approve. The algorithm walks a subgraph of the Tangle and, according to some criteria, returns two tip transactions, that will be pointed to by the
branchTransactionfields of the newly issued transaction, for further details on transaction structure please refer to The Anatomy of a Transaction.
The process of the tip selection algorithm can be separated into four stages, that are detailed in the following sections:
- Rating Calculation
- Random Walk
The algorithm's goal is to return two non-conflicting tips as a successful result of the API call.
The algorithm has to walk a portion of the Tangle (subgraph) twice, stepping according to a rating calculation, selecting an
entryPointfor each walk.
Calculating weights for transactions is expensive in terms of both RAM and CPU. To avoid recomputing the weights due to new inbound transactions, we work with a fixed latest portion of the Tangle for the entire duration of the API call. We take the latest solid milestone issued by the coordinator, and we recur a user-specified
depthnumber of milestones in the past. The Tangle starting from this old milestone is called subgraph. The maximum allowed
depthnumber of milestones to recur back in the past is still subject to debate: it should be sufficiently small to allow the algorithm to terminate in a reasonable amount of time, while still providing good mixing to the obtained subgraph. Currently, a
maxDepthnumber has been heuristically defined and has a configuration default of 15. If the user calls the API with a
depthbigger than the
maxDepthvalue the call will return immediately with a
"Invalid depth input"message and 500 error code.
Mixing: going far enough in the past so that it is almost like starting from the Genesis transaction. This property is essential since we do not want to leave too many unconfirmed tips behind, and merge as many Tangle branches as possible when issuing a new milestone.
The random walks for both the
branchreturned tips will start from the same
latestSolidMilestone - depthmilestone transaction. If the user specifies a
referenceargument to the API call, the
branchwalk will start from the supplied transaction instead, assuming it is included in the considered subgraph. In other words, the
referencetransaction cannot fall beyond the
latestSolidMilestone - depthmilestone. In case the user specifies a
referencetransaction older than
depth, the API call fails after the rating calculation with a
"reference transaction is too old"error message.
In this phase the algorithm computes the rating of every transaction in the subgraph. These ratings will be subsequently transformed into weights during the random walk phase, these will be used to bias the walker's steps. The calculation will be performed only once, and used for both the
In order to be able to have different rating algorithms at our disposal, we created a generic interface to which every rating calculator should adhere.
Map<TxId -> Integer> calculate(TxId entryPoint)
Every rating calculator, being invoked with an
entryPointtransaction, should return a mapping of transaction IDs with their corresponding rating value expressed as integers.
The Cumulative Weight is a specific rating algorithm that gives each transaction a rating equivalent to the number of transactions which reference it. These transactions are called future set. To perform the calculation, the subgraph is first sorted in topological order using DFS.
Topological sorting: obtaining a structure for our subgraph such that if transaction A approves transaction B, then B comes before A, leaving the tips of our subgraph as last transactions elements in the structure.
For every transaction included in our sorted subgraph, we create a future set, containing direct and indirect approvers. The rating of each transaction is the size of its future set + 1 (the transaction's own weight).
entryPoint = latestSolidMilestone - depth entryPointTrunk = entryPoint entryPointBranch = reference or entryPoint ratings = CumulativeWeightCalculator.calculate(entryPointTrunk) class CumulativeWeightCalculator(RatingCalculator): def calculate(startTx): rating = dict() subgraph = Tangle(startTx) topologicalSubgraph = sortTopologically(subgraph) for tx in topologicalSubgraph: rating[tx] = len(futureSet(tx)) + 1 return rating
In order to preserve space while storing transaction's identifiers, we only store a portion of the transaction's hash bytes, truncating it to
PREFIX_LENGTHlength. Currently, this value has been hardcoded to 44 bytes, corresponding to 220 trits. While preserving a significant amount of space during the execution of the algorithm, this approach poses challenges by increasing the collision probability of the keys used to identify transactions.
Future Set Size
In order to cap the memory consumption of the algorithm, we allow to store up to
MAX_FUTURE_SET_SIZEnumber of approvers for the transaction we are considering, under the assumption that a higher rating score won't contribute significantly to bias the walker. This value has been heuristically hardcoded to 5000. Please note that this optimization, while capping memory usage during runtime, makes the walk to behave more randomly closer the beginning of the considered subgraph since the future sets of those transactions are more likely to have been capped to
MAX_FUTURE_SET_SIZE. The desired behavior is instead the contrary: we would like the beginning of the walk to be strongly biased towards the main branch while being more random closer to the tips, spreading the chance for any of them to get selected.
Once the transactions' ratings are computed, we can start our two walks to return the
In order to be able to have different walking algorithms at our disposal, we created a generic interface to which every walker implementation should adhere.
TxId walk(TxId entryPoint, Map<TxId -> Integer> ratings, WalkValidator validator)
Every walker, being invoked with an
ratingmapping structure, and a
validatorobject used to validate walker's steps, should return the selected tip transaction.
The WalkerAlpha is a random walker algorithm that traverses a subgraph towards the tips, whose entropy can be tuned by adjusting an
alphafactor. Namely, an
alpha = 0factor makes the walk purely random, transactions' rating does not bias the walk in any way. Conversely, with an
alpha = ∞, the highest-rated transaction will be always chosen as the next step. To summarize, the higher the
alphafactor is, the more likely it is for a high-rated transaction to be chosen as a next step in the walk. Finding an ideal value for the
alphafactor is still a topic of research, for more information please refer to this blog post.
Starting from the supplied
entryPoint, the algorithm walks towards the tips by selecting a single approver at each step, the choice of the path to take is biased by the supplied rating structure. At each step, the consistency of the ledger state is checked against the validator object. Each step should traverse a bundle until its tail, otherwise consistency could not be validated. Upon reaching an inconsistent state, the walker invalidates the step it just took, it backtracks to the previous tail and runs the approver selection again.
The probability to walk towards a specific approver gets calculated with the following formula, where
weightof a specific transaction.
Ratings are normalized and transformed into
weightswith the help of the
alphafactor. Finally, a
randomvalue between 0 and the sum of all the
weightsgets generated, and it gets subtracted by the approvers' weights iteratively until reaching the value of 0. The approver that turned the
randomvalue into 0 gets selected as the next step in the walk.
class WalkerAlpha(Walker): def walk(entryPoint, ratings, validator): step = entryPoint prevStep = None while step: approvers = getApprovers(step) prevStep = step step = nextStep(ratings, approvers, validator) # When there are no more steps it means we reached a tip return prevStep def nextStep(ratings, approvers, validator): approversWithRating = approvers.filter(a => ratings.contains(a)) # There is no valid approver, we are on a tip if len(approversWithRating) == 0: return None approversRatings = approverswithRating.map(a => ratings.get(a)) weights = ratingsToWeights(approversRatings) approver = weightedChoice(approversWithRating, weights) if approver is not None: tail = validator.findTail(approver) # If the selected approver is invalid we try again without him if validator.isInvalid(tail): approvers = approvers.remove(approver) return nextStep(ratings, approvers, validator) return tail return None def weightedChoice(approvers, weights): randomNumber = random(0, sum(weights)) for approver in approvers: randomNumber = randomNumber - weights.get(approver) if randomNumber <= 0: return approver def ratingsToWeights(ratings): highestRating = max(ratings) normalizedRatings = ratings.map(r => r - highestRating) weights = normalizedRatings.map(r => math.exp(r * alpha)) return weights
For the sake of simplicity, the above pseudo-code does not dig into the details of the transaction's validation, we are going to take a look at these here. A transaction is considered invalid if any of the following occurs:
- It is not solid and we cannot reconstruct its state, since a portion of the Tangle this transaction references is unknown.
- It references a transaction too far in the past, namely beyond
latestSolidMilestone - maxDepth. Please refer to section 1.1 for further explanation.
- The ledger state computed from it is not consistent, such as trying to spend missing funds or double-spending.
- The validator maintains a list of transactions that have been checked for validity. Every time a new transaction gets validated, it is also checked against these.
class WalkValidator: previousTransactions =  def isInvalid(transaction): previousTransactions.append(transaction) if notSolid(transaction): return True if belowMaxDepth(transaction): return True if inconsistent(transaction): return True if incosistentWithPreviousTransactions(transaction): return True return False
Please note that the same validator object gets passed for both walks, resulting in two tips that are consistent with each other.
Transactions in IOTA are issued in bundles: a chain of numbered transactions which consistency to the Tangle has to be considered atomically. When we walk towards an approver, we may also walk into the middle of a bundle, so we have to make sure we traverse it until its tail transaction, which finalizes the state change for the whole bundle, so we can validate it. Transactions participating in a bundle share the same
bundleHashand have an index, being the closing transaction, the tail, the one with index 0.
The two tips returned by the walks are checked for consistency between each other, we don't want to return to the user two transactions which states cannot be conciliated, since the user's transaction will turn out invalid and therefore never selected for approval.
- Approvers: the set of transactions that have a direct edge pointing towards the considered transaction.
- Future set: set of all direct and indirect approvers.
- Tip: a transaction that doesn't have any valid approvers.
- Solid: a transaction is considered to be solid if none of the Tangle history approved by this transaction is missing.
- State: the sum of all the changes introduced by all transactions that have been approved by a specific transaction recurring all the way back until the origin.
- Inconsistent: a transaction is considered inconsistent if it would lead to an invalid state such as spending missing funds or double-spending.
- Invalid: a transaction is considered invalid if it is either non-solid, inconsistent or it references a transaction which is too old.
- Milestone: a transaction periodically issued by the Coordinator.
- Confirmed: transactions directly or indirectly approved by a milestone are considered confirmed.
- TPS: Transactions Per Second seen by a node.
- CTPS: Confirmed Transactions per second.
- Rating: represents the likelihood for a transaction to be selected as the next walking hop by the Walker algorithm.
- Subgraph: the portion of the Tangle's history contained between two arbitrarily-chosen milestones.
- Depth: the number of milestones we recur back in the past to build the subgraph.
- Walker: the algorithm that, following pre-defined rules and biased by the transactions' rating, traverses a subgraph towards the tips.
- Entrypoint: the transaction from which the walker algorithm starts his walk.
- Reference: a user-defined transaction that will be used as an entrypoint for one of the two passes of the walker. this ensures that the selected tips will reference this transaction.
- Tail: transaction with
currentIndex == 0.
- Head: transaction with
currentIndex == lastIndex.
- Bundle: chain of numbered transactions that need to be considered atomically for a state change; a well-formed bundle starts with a head transaction followed by a chain of approving transactions (connected by
Trunk), and terminates with a tail transaction.
- Blowball: a specific Tangle formation that consists of a central transaction, usually a milestone, surrounded by a large number of tips, most of which invalid.