
Polkadot Protocol

This specification is Work-In-Progress and any content, structure, design and/or hyper/anchor-link is subject to change.

Formally, Polkadot is a replicated sharded state machine designed to resolve the issues of scalability and interoperability among blockchains. In Polkadot vocabulary, shards are called parachains, and the Polkadot relay chain is the part of the protocol ensuring global consensus among all the parachains. The Polkadot relay chain protocol, henceforward called the Polkadot protocol, can itself be considered a replicated state machine on its own. As such, the protocol can be specified by identifying the state machine and the replication strategy.

From a more technical point of view, the Polkadot protocol has been divided into two parts, the Polkadot Runtime and the Polkadot Host. The Runtime comprises the state transition logic for the Polkadot protocol and is designed to be upgradable via the consensus engine without requiring hard forks of the blockchain. The Polkadot Host provides the necessary functionality for the Runtime to execute its state transition logic, such as an execution environment, I/O, consensus, and network interoperability between parachains. The Polkadot Host is planned to be stable and mostly static for the lifetime of the Polkadot protocol, the goal being that most changes to the protocol are conducted by applying Runtime updates rather than by coordinating manual software updates with network participants.

Polkadot Host

With the current document, we aim to specify the Polkadot Host part of the Polkadot protocol as a replicated state machine. After defining the different types of hosts in Chapter 1, we proceed to specify the representation of a valid state of the protocol in Chapter 2. In the same chapter, we also identify the protocol states by explaining the Polkadot state transition and discussing the details of how the Polkadot Host interacts with the state transition function, i.e. the Runtime. Next, we specify the input messages triggering the state transition and the system behavior. In Chapter 4, we specify the communication protocols and network messages required for the Polkadot Host to communicate with other nodes in the network, such as exchanging blocks and consensus messages. In Chapter 5 and Chapter 6, we specify the consensus protocol, which is responsible for keeping all the replicas in the same state. Finally, the initial state of the machine is identified and discussed in Section A.3. A Polkadot Host implementation which conforms with this part of the specification should successfully be able to sync its states with the Polkadot network.

1. Overview

The Polkadot Protocol differentiates between different classes of Polkadot Hosts. Each class differs in its trust roots and in how actively or passively it interacts with the network.

1.1. Light Client

The light client is a mostly passive participant in the protocol. Light clients are designed to work in resource-constrained environments like browsers, mobile devices, or even on-chain. Their main objective is to follow the chain, query the full node for specific information about the recent state of the blockchain, and submit extrinsics (transactions). A light client does not maintain the full state; rather, it queries the full node for the latest finalized state and verifies the authenticity of the responses trustlessly. Details of the specification focused on Light Clients can be found in Chapter 7.

1.2. Full Node

While the full node is still a mostly passive participant in the protocol, it follows the chain by receiving and verifying every block in the chain. It maintains the full state of the blockchain by executing the extrinsics in blocks. Its role in the consensus mechanism is limited to following the chain and not producing blocks.

  • Functional Requirements:

    1. The node must populate the state storage with the official genesis state, elaborated further in Section A.3.

    2. The node should maintain a set of around 50 active peers at any time. New peers can be found using the discovery protocols (Section 4.4).

    3. The node should open and maintain the various required streams (Section 4.7) with each of its active peers.

    4. Furthermore, the node should send block requests (Section 4.8.2) to these peers to receive all blocks in the chain and execute each of them.

    5. The node should exchange neighbor packets (Section 4.8.6.1).

1.3. Authoring Node

The authoring node covers all the features of the full node, but instead of just passively following the protocol, it is an active participant, producing blocks and voting in GRANDPA.

  • Functional Requirements:

    1. Verify that the Host’s session key is included in the current Epoch’s authority set (Section 3.3.1).

    2. Run the BABE lottery (Chapter 5) and wait for the next assigned slot in order to produce a block.

    3. Gossip any produced blocks to all connected peers (Section 4.8.1).

    4. Run the catch-up protocol (Section 6.6.1) to make sure that the node is participating in the current round and not a past round.

    5. Run the GRANDPA rounds protocol (Chapter 6).

1.4. Relaying Node

The relaying node covers all the features of the authoring node, but also participates in the availability and validity process to process new parachain blocks, as described in Chapter 8.

2. States and Transitions

2.1. Introduction

Definition 1. Discrete State Machine (DSM)

A Discrete State Machine (DSM) is a state transition system that admits a starting state and whose set of states and set of transitions are countable. Formally, it is a tuple of

\$(\Sigma, S, s_0, \delta)\$

where

  • \$\Sigma\$ is the countable set of all possible inputs.

  • \$S\$ is a countable set of all possible states.

  • \$s_0 in S\$ is the initial state.

  • \$\delta\$ is the state-transition function, known as Runtime in the Polkadot vocabulary, such that

\$\delta : S \times \Sigma \rightarrow S\$
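As a non-normative illustration, the tuple \$(Sigma, S, s_0, delta)\$ could be modeled as follows; the trait and all names here are illustrative and not part of the specification.

    // A minimal sketch of the (Σ, S, s_0, δ) tuple as a Rust trait.
    trait DiscreteStateMachine {
        type Input;        // Σ: the countable set of all possible inputs
        type State: Clone; // S: the countable set of all possible states

        fn initial_state(&self) -> Self::State; // s_0, the initial state
        // δ: S × Σ → S, the state-transition function (the Runtime)
        fn delta(&self, state: Self::State, input: Self::Input) -> Self::State;
    }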
Definition 2. Path Graph

A path graph or a path of \$n\$ nodes, formally referred to as \$P_n\$, is a tree with two nodes of vertex degree 1 and the other \$n - 2\$ nodes of vertex degree 2. Therefore, \$P_n\$ can be represented by a sequence \$(v_1, ..., v_n)\$ where \$e_i = (v_i, v_(i + 1))\$ for \$1 <= i <= n - 1\$ is the edge which connects \$v_i\$ and \$v_(i + 1)\$.

Definition 3. Blockchain

A blockchain \$C\$ is a directed path graph. Each node of the graph is called a Block and indicated by \$B\$. The unique sink of \$C\$ is called the Genesis Block, and the source is called the \$"Head"\$ of \$C\$. For any edge \$(B_1, B_2)\$ where \$B_1 -> B_2\$, we say \$B_2\$ is the parent of \$B_1\$ and \$B_1\$ is the child of \$B_2\$. We indicate this by:

\$B_2 := P(B_1)\$

Each block refers to its parent by the hash value of the parent's header (Definition 10), making the path graph tamper proof, since any modification to a block would change its hash value and thereby invalidate the references held by its descendants.

The term "blockchain" can also be used as a way to refer to the network or system that interacts or maintains the directed path graph.

2.1.1. Block Tree

In the course of formation of a (distributed) blockchain, it is possible that the chain forks into multiple subchains in various block positions. We refer to this structure as a block tree:

Definition 4. Block Tree

The block tree of a blockchain, denoted by \$BT\$ is the union of all different versions of the blockchain observed by the Polkadot Host such that every block is a node in the graph and \$B_1\$ is connected to \$B_2\$ if \$B_1\$ is a parent of \$B_2\$.

When a block in the block tree gets finalized, there is an opportunity to prune from the block tree those branches that do not contain the most recent finalized blocks, as well as those that can never be finalized in the blockchain (Chapter 6), in order to free up resources.

Definition 5. Pruned Block Tree

By Pruned Block Tree, denoted by \$"PBT"\$, we refer to a subtree of the block tree obtained by eliminating all branches which do not contain the most recent finalized blocks (Definition 90). By pruning, we refer to the procedure of \$BT larr "PBT"\$. When there is no risk of ambiguity and it is safe to prune \$BT\$, we use \$"BT"\$ to refer to \$"PBT"\$.

Definition 6 gives the means to highlight various branches of the block tree.

Definition 6. Subchain

Let \$G\$ be the root of the block tree and \$B\$ be one of its nodes. By \$"Chain"(B)\$, we refer to the path graph from \$G\$ to \$B\$ in \$"BT"\$. Conversely, for a chain \$C = "Chain"(B)\$, we define the head of \$C\$ to be \$B\$, formally noted as \$B := bar C\$. We define \$|C|\$, the length of \$C\$ as a path graph.

If \$B'\$ is another node on \$"Chain"(B)\$, then by \$"SubChain"(B', B)\$ we refer to the subgraph of \$"Chain"(B)\$ path graph which contains \$B\$ and ends at \$B'\$ and by \$|"SubChain"(B', B)|\$ we refer to its length.

Accordingly, \$bbb C_(B')(BT)\$ is the set of all subchains of \$BT\$ rooted at \$B'\$. The set of all chains of \$BT\$, \$bbb C_G(BT)\$, is denoted by \$bbb C(BT)\$ or simply \$bbb C\$, for the sake of brevity.

Definition 7. Longest Chain

We define the following complete order over \$bbb C\$. For chains \$C_1, C_2 in bbb C\$, we have \$C_1 > C_2\$ if either \$|C_1| > |C_2|\$, or \$|C_1| = |C_2|\$ and the block arrival time (Definition 67) of \$bar C_1\$ is less than the block arrival time of \$bar C_2\$, from the subjective perspective of the Host. We define \$"Longest-Chain"(BT)\$ to be the maximum chain given by this order.
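As a non-normative sketch, this order can be expressed as a comparator over chains, assuming a hypothetical Chain type carrying its length and the subjective arrival time of its head:

    use std::cmp::Ordering;

    // Illustrative sketch: `len` is |C| and `head_arrival_time` is the
    // subjective arrival time of the chain head (e.g. milliseconds).
    struct Chain {
        len: u64,
        head_arrival_time: u64,
    }

    // C1 > C2 if it is longer, or, at equal length, if its head arrived earlier.
    fn compare_chains(c1: &Chain, c2: &Chain) -> Ordering {
        c1.len
            .cmp(&c2.len)
            .then(c2.head_arrival_time.cmp(&c1.head_arrival_time))
    }

    fn longest_chain(chains: &[Chain]) -> Option<&Chain> {
        chains.iter().max_by(|a, b| compare_chains(a, b))
    }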

Definition 8. Longest Path

\$"Longest-Path"(BT)\$ returns the path graph of \$BT\$ which is the longest among all paths in \$BT\$ and has the earliest block arrival time (Definition 67). \$"Deepest-Leaf"(BT)\$ returns the head of \$"Longest-Path"(BT)\$ chain.

Because every block in the blockchain contains a reference to its parent, it is easy to see that the block tree is de facto a tree. A block tree naturally imposes partial order relationships on the blocks as follows:

We say \$B\$ is a descendant of \$B'\$, formally noted as \$B > B'\$, if \$B\$ and \$B'\$ lie on a common chain \$C\$ and \$|"Chain"(B)| > |"Chain"(B')|\$. Respectively, we say that \$B'\$ is an ancestor of \$B\$, formally noted as \$B' < B\$, if they lie on a common chain and \$|"Chain"(B')| < |"Chain"(B)|\$.

2.2. State Replication

Polkadot nodes replicate each other’s state by syncing the history of the extrinsics. This, however, is only practical if a large set of transactions can be batched and synced at a time. The structure in which the transactions are journaled and propagated is known as a block of extrinsics (Section 2.2.1). As in any other replicated state machine, state inconsistency can occur between Polkadot replicas. Section 2.4.5 gives an overview of how a Polkadot Host node manages multiple variants of the state.

2.2.1. Block Format

A Polkadot block consists of a block header (Definition 10) and a block body (Definition 13). The block body, in turn, is made up of extrinsics, which represent a generalization of the concept of transactions. Extrinsics can contain any set of external data the underlying chain wishes to validate and track.

[Diagram: a Block is the concatenation of a BlockHeader and a BlockBody.]
Definition 10. Block Header

The header of block \$B\$, \$"Head"(B)\$, is a 5-tuple containing the following elements:

  • parent_hash: formally indicated as \$H_p\$, is the 32-byte Blake2b hash (Section A.1.1.1) of the SCALE encoded parent block header (Definition 12).

  • number: formally indicated as \$H_i\$, is an integer, which represents the index of the current block in the chain. It is equal to the number of ancestor blocks. The genesis block has number 0.

  • state_root: formally indicated as \$H_r\$, is the root of the Merkle trie, whose leaves implement the storage for the system.

  • extrinsics_root: is the field which is reserved for the Runtime to validate the integrity of the extrinsics composing the block body. For example, it can hold the root hash of the Merkle trie which stores an ordered list of the extrinsics being validated in this block. The extrinsics_root is set by the runtime and its value is opaque to the Polkadot Host. This element is formally referred to as \$H_e\$.

  • digest: this field is used to store any chain-specific auxiliary data, which could help the light clients interact with the block without the need of accessing the full storage as well as consensus-related data including the block signature. This field is indicated as \$H_d\$ (Definition 11).

[Diagram: BlockHeader — parent_hash (32 bytes), number (SCALE compact integer), state_root (32 bytes), extrinsic_root (32 bytes), num_digests (SCALE compact integer), followed by num_digests Digest items.]
Binary Format 1. Block Header
Definition 11. Header Digest

The header digest of block \$B\$, formally referred to as \$H_d(B)\$, is an array of digest items \$H_d^i\$ of varying data types (Definition 188) such that:

\$H_d(B) := H_d^1, ..., H_d^n\$

where each digest item can hold one of the following type identifiers:

\$H_d^i = { (4, to, (t, "id", m)), (5, to, (t, "id", m)), (6, to, (t, "id", m)), (8, to, (t)) :}\$
where
  • \$"id"\$ is a 4-byte ASCII encoded consensus engine identifier

  • \$"m"\$ is a scale encoded byte array containing the message payload

\$t = 4\$

Consensus Message, contains scale-encoded message \$m\$ from the Runtime to the consensus engine. The receiving engine is determined by the \$"id"\$ identifier:

\$"id" = tt "BABE"\$

a message to BABE engine (Definition 58)

\$"id" = tt "FRNK"\$

a message to GRANDPA engine (Definition 86)

\$"id" = tt "BEEF"\$

a message to BEEFY engine (Definition 87)

\$t = 5\$

Seal, is produced by the consensus engine and proves the authorship of the block producer. The engine used for this is provided through \$"id"\$ (at the moment BABE), while \$m\$ contains the scale-encoded signature (Definition 70) of the block producer. In particular, the Seal digest item must be the last item in the digest array and must be stripped off by the Polkadot Host before the block is submitted to any Runtime function including for validation. The Seal must be added back to the digest afterward.

\$t = 6\$

Pre-Runtime digest, contains messages from the consensus engines to the Runtime. Currently this is only used by BABE to pass the SCALE-encoded BABE Header (Definition 69) in \$m\$ with \$"id" = tt "BABE"\$.

\$t = 8\$

Runtime Environment Updated digest, indicates that changes regarding the Runtime code or heap pages (Section 2.6.3.1) occurred. No additional data is provided.

[Diagram: Digest — a one-byte type id followed by the variant’s payload; variants 4, 5, and 6 consist of a 4-byte ASCII engine identifier and a SCALE length-prefixed payload, while variant 8 carries no payload.]
Binary Format 2. Block Header Digest
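As a non-normative sketch, the digest item variants above can be modeled as an enumeration; SCALE (de)coding details are omitted and all names are illustrative:

    // 4-byte ASCII engine identifier, e.g. b"BABE", b"FRNK", b"BEEF"
    type EngineId = [u8; 4];

    enum DigestItem {
        ConsensusMessage { id: EngineId, message: Vec<u8> },   // t = 4
        Seal { id: EngineId, signature: Vec<u8> },             // t = 5, must be last
        PreRuntime { id: EngineId, message: Vec<u8> },         // t = 6
        RuntimeEnvironmentUpdated,                             // t = 8, no payload
    }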
Definition 12. Header Hash

The block header hash of block \$B\$, \$H_h(B)\$, is the hash of the header of block \$B\$ encoded by the SCALE codec:

\$H_h(B) := "Blake2b"("Enc"_(SC)("Head"(B)))\$
Definition 13. Block Body

The block body consists of a sequence of extrinsics, each encoded as a byte array. The content of an extrinsic is completely opaque to the Polkadot Host; as such, from the point of view of the Polkadot Host, it is simply a SCALE encoded array of byte arrays. The body of block \$B\$, represented as \$"Body"(B)\$, is defined to be:

\$"Body"(B) := "Enc"_(SC)(E_1,...,E_n)\$

Where each \$E_i in bbb B\$ is a SCALE encoded extrinsic.

[Diagram: BlockBody — num_transactions (SCALE compact integer), followed by num_transactions length-prefixed byte arrays, one per extrinsic.]

2.3. Extrinsics

The block body consists of an array of extrinsics. In a broad sense, extrinsics are data from outside of the state which can trigger state transitions. This section describes extrinsics and their inclusion into blocks.

2.3.1. Preliminaries

The extrinsics are divided into two main categories defined as follows:

Transaction extrinsics are extrinsics which are signed using either of the key types (Section A.1.4) and broadcasted between the nodes. Inherent extrinsics are unsigned extrinsics which are generated by Polkadot Host and only included in the blocks produced by the node itself. They are broadcasted as part of the produced blocks rather than being gossiped as individual extrinsics.

The Polkadot Host does not specify or limit the internals of each extrinsic; those are defined and dealt with by the Runtime (Definition 1). From the Polkadot Host's point of view, each extrinsic is simply a SCALE-encoded blob (Section A.2.2).

2.3.2. Transactions

Transactions are submitted and exchanged through Transactions network messages (Section 4.8.5). Upon receiving a Transactions message, the Polkadot Host decodes the SCALE-encoded blob and splits it into individually SCALE-encoded transactions.

Alternatively, transactions can be submitted to the host by an offchain worker through the Host API (Section B.6.2).

Any new transaction should be submitted to the Runtime (Section C.7.1). This allows the Polkadot Host to check the validity of the received transaction against the current state and to decide whether it should be gossiped to other peers. If the Runtime considers the submitted transaction valid, the Polkadot Host should store it for inclusion in future blocks. The whole process of handling new transactions is described in more detail by Algorithm 1.

Additionally, valid transactions that are supposed to be gossiped are propagated to connected peers of the Polkadot Host. While doing so, the Polkadot Host should keep track of peers already aware of each transaction. This includes peers which have already gossiped the transaction to the node, as well as those to whom the transaction has already been sent. This behavior is mandated to avoid resending duplicates and unnecessarily overloading the network. To that aim, the Polkadot Host should keep a transaction pool and a transaction queue, defined as follows:

Definition 14. Transaction Queue

The Transaction Queue of a block producer node, formally referred to as \$TQ\$ is a data structure which stores the transactions ready to be included in a block sorted according to their priorities (Section 4.8.5). The Transaction Pool, formally referred to as \$TP\$, is a hash table in which the Polkadot Host keeps the list of all valid transactions not in the transaction queue.
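As a non-normative sketch, the two structures could take the following shape; the fields and ordering rule are illustrative, and a real implementation must also respect the requires/provides dependencies between transactions, not only their priority:

    use std::cmp::Ordering;
    use std::collections::{BinaryHeap, HashMap};

    type Hash = [u8; 32]; // transaction hash
    type Tag = Vec<u8>;   // a provided/required tag

    struct ReadyTransaction {
        priority: u64,       // higher priority is dequeued first
        encoded: Vec<u8>,    // the SCALE-encoded extrinsic
        provides: Vec<Tag>,  // tags this transaction provides
        requires: Vec<Tag>,  // tags this transaction requires
    }

    // Order ready transactions by priority only (sketch).
    impl PartialEq for ReadyTransaction {
        fn eq(&self, other: &Self) -> bool {
            self.priority == other.priority
        }
    }
    impl Eq for ReadyTransaction {}
    impl PartialOrd for ReadyTransaction {
        fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
            Some(self.cmp(other))
        }
    }
    impl Ord for ReadyTransaction {
        fn cmp(&self, other: &Self) -> Ordering {
            self.priority.cmp(&other.priority)
        }
    }

    struct TransactionPools {
        // TQ: transactions ready for inclusion, sorted by priority.
        queue: BinaryHeap<ReadyTransaction>,
        // TP: valid transactions not (yet) in the queue, keyed by hash.
        pool: HashMap<Hash, Vec<u8>>,
    }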

Furthermore, Algorithm 1 updates the transaction pool and the transaction queue according to the received message:

Algorithm 1 Validate-Transactions-and-Store

1: \$L larr "Dec"_(SC)(M_T)\$
2: for all \$T in L: T !in TQ, T !in TP\$ do
3:    \$B_d larr "Head"("Longest-Chain"(BT))\$
4:    \$N larr H_n(B_d)\$
5:    \$R larr\$ Call-Runtime-Entry(\$tt "TaggedTransactionQueue_validate_transaction"\$, \$N\$, \$T\$)
6:    if Valid(\$R\$) then
7:       if Requires(\$R\$) \$sub uuu_(AA T in (TQ uu B_i | EE i < d))\$ Provided-Tags(\$T\$) then
8:          Insert-At(\$TQ\$, \$T\$, Requires(\$R\$), Priority(\$R\$))
9:       else
10:         Add-To(\$TP\$, \$T\$)
11:      end if
12:      Maintain-Transaction-Pool()
13:      if ShouldPropagate(\$R\$) then
14:         Propagate(\$T\$)
15:      end if
16:   end if
17: end for

where
  • \$M_T\$ is the transaction message.

  • \$"Dec"_(SC)\$ decodes the SCALE encoded message.

  • \$"Longest-Chain"\$ is defined in Definition 7.

  • \$tt "TaggedTransactionQueue_validate_transaction"\$ is a Runtime entrypoint specified in Section C.7.1 and \$Requires(R)\$, \$Priority(R)\$ and \$Propagate(R)\$ refer to the corresponding fields in the tuple returned by the entrypoint when it deems that \$T\$ is valid.

  • \$"Provided-Tags"(T)\$ is the list of tags that transaction \$T\$ provides. The Polkadot Host needs to keep track of tags that transaction \$T\$ provides as well as requires after validating it.

  • \$"Insert-At"(TQ,T,"Requires"(R),"Priority"(R))\$ places \$T\$ into \$TQ\$ approperietly such that the transactions providing the tags which \$T\$ requires or have higher priority than \$T\$ are ahead of \$T\$.

  • \$"Maintain-Transaction-Pool"\$ is described in Algorithm 2.

  • \$"ShouldPropagate"\$ indictes whether the transaction should be propagated based on the Propagate field in the ValidTransaction type as defined in Definition 225, which is returned by \$tt "TaggedTransactionQueue_validate_transaction"\$.

  • \$"Propagate"(T)\$ sends \$T\$ to all connected peers of the Polkadot Host who are not already aware of \$T\$.

Algorithm 2 Maintain-Transaction-Pool

1:Scan the pool for ready transactions

2:Move them to the transaction queue

3:Drop invalid transactions

This has not been defined yet.

2.3.3. Inherents

Inherents are unsigned extrinsics inserted into a block by the block author and as a result are not stored in the transaction pool or gossiped across the network. Instead they are generated by the Polkadot Host by passing the required inherent data, as listed in Table 1, to the Runtime method \$tt "BlockBuilder_inherent_extrinsics"\$ (Section C.6.3). Then the returned extrinsics should be included in the current block as explained in Algorithm 12.

Table 1. Inherent Data

Identifier    Value Type                 Description
timstap0      Unsigned 64-bit integer    Unix epoch time (Definition 181)

Definition 15. Inherent Data

Inherent-Data is a hashtable (Definition 192), an array of key-value pairs consisting of the inherent 8-byte identifier and its value, representing the totality of inherent extrinsics included in each block. The entries of this hash table which are listed in Table 1 are collected or generated by the Polkadot Host and then handed to the Runtime for inclusion (Algorithm 12).
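A minimal sketch of assembling this table, using the timstap0 entry from Table 1; the helper name is illustrative:

    use std::collections::HashMap;

    type InherentIdentifier = [u8; 8]; // 8-byte inherent identifier

    fn build_inherent_data(unix_time_ms: u64) -> HashMap<InherentIdentifier, Vec<u8>> {
        let mut data = HashMap::new();
        // "timstap0" is the 8-byte identifier from Table 1; its value is the
        // SCALE encoding of an unsigned 64-bit integer, i.e. its
        // fixed-width little-endian bytes.
        data.insert(*b"timstap0", unix_time_ms.to_le_bytes().to_vec());
        data
    }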

2.4. State Storage Trie

For storing the state of the system, the Polkadot Host implements a hash table storage where the keys are used to access each data entry. There is no assumption on the size of the key or on the size of the data stored under it, besides the fact that they are byte arrays with specific upper limits on their length. The limit is imposed by the encoding algorithms to store the key and the value in the storage trie (Section A.2.2.1).

2.4.1. Accessing System Storage

The Polkadot Host implements various functions to facilitate access to the system storage for the Runtime (Section 2.6.1). Here we formalize the access to the storage when it is being directly accessed by the Polkadot Host (in contrast to Polkadot runtime).

Definition 16. Stored Value

The \$sf "StoredValue"\$ function retrieves the value stored under a specific key in the state storage and is formally defined as:

\$sf "StoredValue" ": "cc K -> cc V\$ \$k -> {(v,"if " (k,v), "exists in state storage"),(phi,,"otherwise"):}\$

where \$cc K sub bbb B\$ and \$cc V sub bbb B\$ are respectively the set of all keys and values stored in the state storage. \$cc V\$ can be an empty value.

2.4.2. General Structure

In order to ensure the integrity of the state of the system, the stored data needs to be re-arranged and hashed in a radix tree, which hereafter we refer to as the State Trie or just Trie. This rearrangement is necessary to be able to compute the Merkle hash of the whole or part of the state storage, consistently and efficiently at any given time.

The trie is used to compute the merkle root (Section 2.4.4) of the state, \$H_r\$ (Definition 10), whose purpose is to authenticate the validity of the state database. Thus, the Polkadot Host follows a rigorous encoding algorithm to compute the values stored in the trie nodes to ensure that the computed Merkle hash, \$H_r\$, matches across the Polkadot Host implementations.

The trie is a radix-16 tree (Definition 17). Each key value identifies a unique node in the tree. However, a node in a tree might or might not be associated with a key in the storage.

Definition 17. Radix-r Tree

A Radix-r tree is a variant of a trie in which:

  • Every node has at most \$r\$ children where \$r = 2^x\$ for some \$x\$;

  • Each node that is the only child of its parent and does not represent a valid key is merged with its parent.

As a result, in a radix tree, any path whose interior vertices all have only one child and does not represent a valid key in the data set, is compressed into a single edge. This improves space efficiency when the key space is sparse.

When traversing the trie to a specific node, its key can be reconstructed by concatenating the subsequences of the keys which are stored either explicitly in the nodes on the path or implicitly in their position as a child of their parent.

To identify the node corresponding to a key value, \$k\$, first we need to encode \$k\$ in a way consistent with the trie structure. Because each node in the trie has at most 16 children, we represent the key as a sequence of 4-bit nibbles:

Definition 18. Key Encode

For the purpose of labeling the branches of the trie, the key \$k\$ is encoded to \$k_("enc")\$ using \$sf "KeyEncode"\$ functions:

\$k_("enc") := (k_("enc"_1), ..., k_("enc"_(2n))) := sf "KeyEncode"(k)\$

such that:

\$sf "KeyEncode": bbb B -> "Nibbles"^4\$ \$k |-> (k_("enc"_1),...,k_("enc"_(2n)))\$ \$(b_1,...,b_n) |-> (b_1^(1),b_1^2,b_2^1,b_2^2,...,b_n^1,b_n^2 )\$

where \$"Nibble"^4\$ is the set of all nibbles of 4-bit arrays and \$b_i^1\$ and \$b_i^2\$ are 4-bit nibbles, which are the big endian representations of \$b_i\$:

\$k_("enc"_i) := (b_i^1,b_i^2) := (b_i -: 16,b_i mod 16)\$

where \$mod\$ is the remainder operator and \$-:\$ is the integer division operator.

By looking at \$k_("enc")\$ as a sequence of nibbles, one can walk the radix tree to reach the node identifying the storage value of \$k\$.

2.4.3. Trie Structure

In this subsection, we specify the structure of the nodes in the trie as well as the trie structure:

Definition 19. Set of Nodes

We refer to the set of nodes of the Polkadot state trie by \$cc N\$. We use \$N in cc N\$ to refer to an individual node in the trie.

Definition 20. State Trie

The state trie is a radix-16 tree (Definition 17). Each node in the trie is identified with a unique key \$k_N\$ such that:

  • \$k_N\$ is the shared prefix of the key of all the descendants of \$N\$ in the trie.

and, at least one of the following statements holds:

  • \$(k_N, v)\$ corresponds to an existing entry in the State Storage.

  • \$N\$ has more than one child.

Conversely, if \$(k, v)\$ is an entry in the state storage, then there is a node \$N in cc N\$ such that \$k_N = k\$.

Definition 21. Branch

A branch node \$N_b in cc N_b\$ is a node which has one child or more. A branch node can have at most 16 children. A leaf node \$N_l in cc N_l\$ is a childless node. Accordingly:

\$cc N_b := {N_b in cc N | N_b " is a branch node"}\$ \$cc N_l := {N_l in cc N | N_l " is a leaf node"}\$

For each node, part of \$k_N\$ is built while the trie is traversed from the root to \$N\$ and another part of \$k_N\$ is stored in \$N\$ (Definition 22).

Definition 22. Aggregated Prefix Key

For any \$N in cc N\$, its key \$k_N\$ is divided into an aggregated prefix key, \$"pk"_N^("Agr")\$, aggregated by Algorithm 3 and a partial key, \$"pk"_N\$ of length \$0 <= l_("pk"_N)\$ in nibbles such that:

\$"pk"_N := (k_("enc"_i),...,k_("enc"_(i+l_("pk"_N))))\$

where \$"pk"_N^("Agr")\$ is a prefix subsequence of \$k_N\$; \$i\$ is the length of \$"pk"_N^("Agr")\$ in nibbles and so we have:

\$sf "KeyEncode"(k_N) = "pk"_N^("Agr") || "pk"_N = (k_("enc"_1), ..., k_("enc"_(i-1)),k_("enc"_i),k_("enc"_(i+l_("pk"_N))))\$

Part of \$"pk"_N^("Agr")\$ is explicitly stored in \$N\$’s ancestors. Additionally, for each ancestor, a single nibble is implicitly derived while traversing from the ancestor to its child included in the traversal path using the \$"Index"_N\$ function (Definition 23).

Definition 23. Index

For \$N in cc N_b\$ and \$N_c\$ a child of \$N\$, we define the \$sf "Index"_N\$ function as:

\$sf "Index"_N: {N_C in cc N | N_c " is a child of " N} -> "Nibbles"_1^4\$ \$N_c -> i\$

such that

\$k_(N_c) = k_N || i || "pk"_(N_c)\$

Algorithm 3 Aggregate-Key

Require: \$P_N := ("TrieRoot" = N_1, ..., N_j = N)\$

1: \$"pk"_N^("Agr") larr phi\$
2: for \$i = 1\$ to \$j - 1\$ do
3:    \$"pk"_N^("Agr") larr "pk"_N^("Agr") || "pk"_(N_i) || sf "Index"_(N_i)(N_(i+1))\$
4: end for
5: \$"pk"_N^("Agr") larr "pk"_N^("Agr") || "pk"_(N_j)\$
6: return \$"pk"_N^("Agr")\$

Assuming that \$P_N\$ is the path (Definition 2) from the trie root to node \$N\$, Algorithm 3 rigorously demonstrates how to build \$"pk"_N^("Agr")\$ while traversing \$P_N\$.

Definition 24. Node Value

A node \$N in cc N\$ stores the node value, \$v_N\$, which consists of the following concatenated data:

\$"Node Header"||"Partial Key"||"Node Subvalue"\$

Formally noted as:

\$v_N := "Head"_N||"Enc"_"HE"(pk_N)||sv_N\$
where \$"Head"_N\$ is the node header (Definition 25), \$pk_N\$ is the partial key (Definition 22), and \$sv_N\$ is the node subvalue (Definition 27).
Definition 25. Node Header

The node header, consisting of \$>= 1\$ bytes, \$N_1 ... N_n\$, specifies the node variant and the length of the partial key (Definition 22). Both pieces of information can be represented in bits within a single byte, \$N_1\$, where the number of bits used for the variant, \$v\$, and for the partial key length, \$p_l\$, varies.

\$v = { (01, "Leaf", p_l = 2^6), (10, "Branch Node with " k_N !in cc K, p_l = 2^6), (11, "Branch Node with " k_N in cc K, p_l = 2^6), (001, "Leaf containing a hashed subvalue", p_l = 2^5), (0001, "Branch containing a hashed subvalue", p_l = 2^4), (0000 0000, "Empty", p_l = 0), (0000 0001, "Reserved for compact encoding",) :}\$

If the value of \$p_l\$ is equal to the maximum possible value the bits can hold, such as 63 (\$2^6-1\$) in the case of the \$01\$ variant, then the value of the next 8 bits (\$N_2\$) is added to the length. This process is repeated for every \$N_n\$ where \$N_n = 2^8-1\$. Any value smaller than the maximum possible value of \$N_n\$ implies that the value of \$N_(n+1)\$ should not be added to the length. The hashed subvalue for variants \$001\$ and \$0001\$ is described in Definition 28.

Formally, the length of the partial key, \$"pk"_N^l\$, is defined as:

\$"pk"_N^l = p_l + N_n + N_(n+x) + ... + N_(n+x+y)\$

as long as \$p_l = m\$, \$N_(n+x) = 2^8-1\$ and \$N_(n+x+y) < 2^8-1\$, where \$m\$ is the maximum possible value that \$p_l\$ can hold.
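A non-normative sketch of decoding the partial key length from the node header, given the number of length bits in \$N_1\$ (6, 5, or 4 depending on the variant):

    // `first_byte` is N_1, `rest` holds the header bytes following N_1.
    // `length_bits` is the number of partial-key-length bits (6, 5 or 4).
    fn partial_key_length(first_byte: u8, length_bits: u32, rest: &[u8]) -> u64 {
        let mask = (1u16 << length_bits) as u8 - 1; // e.g. 0b0011_1111 for 6 bits
        let mut len = (first_byte & mask) as u64;
        if (first_byte & mask) == mask {
            // Saturated: keep adding whole bytes until one is smaller than 255.
            for &b in rest {
                len += b as u64;
                if b < 255 {
                    break;
                }
            }
        }
        len
    }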

2.4.4. Merkle Proof

To prove the consistency of the state storage across the network and its modifications both efficiently and effectively, the trie implements a Merkle tree structure. The hash value corresponding to each node needs to be computed rigorously to make the inter-implementation data integrity possible.

The Merkle value of each node should depend on the Merkle value of all its children as well as on its corresponding data in the state storage. This recursive dependency is encompassed into the subvalue part of the node value which recursively depends on the Merkle value of its children. Additionally, as Section 2.5.1 clarifies, the Merkle proof of each child trie must be updated first before the final Polkadot state root can be calculated.

We use the auxiliary function introduced in Definition 26 to encode and decode information stored in a branch node.

Definition 26. Children Bitmap

Suppose \$N_b, N_c in cc N\$ and \$N_c\$ is a child of \$N_b\$. We define bit \$b_i := 1\$ if and only if \$N_b\$ has a child at index \$i\$, and therefore define the ChildrenBitmap function as follows:

\$"ChildrenBitmap:"\$ \$cc N_b -> bbb B_2\$ \$N_b |-> (b_(15), ...,b_8,b_7,...,b_0)_2\$

where

\$b_i := {(1, EE N_c in cc N: k_(N_c) = k_(N_b)||i||pk_(N_c)),(0, "otherwise"):}\$
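A minimal sketch of this function, representing the children of a branch node as a 16-element array of optional child references:

    struct NodeHandle; // placeholder for a child reference (e.g. a Merkle value)

    // Bit i of the 16-bit bitmap is set iff the branch node has a child at index i.
    fn children_bitmap(children: &[Option<NodeHandle>; 16]) -> u16 {
        let mut bitmap = 0u16;
        for (i, child) in children.iter().enumerate() {
            if child.is_some() {
                bitmap |= 1 << i;
            }
        }
        bitmap
    }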
Definition 27. Subvalue

For a given node \$N\$, the subvalue of \$N\$, formally referred to as \$sv_N\$, is determined as follows:

\$sv_N := {("StoredValue"_("SC")),("Enc"_("SC")("ChildrenBitmap"(N)||"StoredValue"_("SC")||"Enc"_("SC")(H(N_(C_1))),...,"Enc"_("SC")(H(N_(C_n))))):}\$

where the first variant is a leaf node and the second variant is a branch node.

\$"StoredValue"_("SC") := {("Enc"_("SC")("StoredValue"(k_N)),"if StoredValue"(k_N) = v),(phi,"if StoredValue"(k_N) = phi):}\$

\$N_(C_1) ... N_(C_n)\$ with \$n <= 16\$ are the children nodes of the branch node \$N\$.

The trie deviates from a traditional Merkle tree in that the node value (Definition 24), \$v_N\$, is presented instead of its hash if it occupies less space than its hash.

Definition 28. Hashed Subvalue

To increase performance, a merkle proof can be generated by inserting the hash of a value into the trie rather than the value itself (which can be quite large). If merkle proof computation with node hashing is explicitly executed via the Host API (Section B.2.8.2), then any value larger than 32 bytes is hashed, resulting in that hash being used as the subvalue (Definition 27) under the corresponding key. The node header must specify the variant \$001\$ and \$0001\$ respectively for leaves containing a hash as their subvalue and for branches containing a hash as their subvalue (Definition 25).

Definition 29. Merkle Value

For a given node \$N\$, the Merkle value of \$N\$, denoted by \$H(N)\$ is defined as follows:

\$H: cc N -> uuu_(i=0)^(32) bbb B_i\$ \$H(N) := {(v_N, ||v_N|| < 32 " and " N != R),("Blake2b"(v_N), ||v_N|| >= 32 " or " N = R):}\$

Where \$v_N\$ is the node value of \$N\$ (Definition 24) and \$R\$ is the root of the trie. The Merkle hash of the trie is defined to be \$H(R)\$.
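A minimal sketch of this rule, reusing a hypothetical blake2b_256 helper:

    // Hypothetical helper: 32-byte Blake2b hash (Section A.1.1.1).
    fn blake2b_256(data: &[u8]) -> [u8; 32] {
        unimplemented!("provided by a Blake2b library")
    }

    // Inline the node value when it is shorter than 32 bytes (except for
    // the root node), otherwise hash it.
    fn merkle_value(node_value: &[u8], is_root: bool) -> Vec<u8> {
        if node_value.len() < 32 && !is_root {
            node_value.to_vec()
        } else {
            blake2b_256(node_value).to_vec()
        }
    }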

2.4.5. Managing Multiple Variants of State

Unless a node is committed to only update its state according to the finalized block (Definition 90), it is inevitable for the node to store multiple variants of the state (one for each block). This is, for example, necessary for nodes participating in the block production and finalization.

While the state trie structure (Section 2.4.3) facilitates and optimizes storing and switching between multiple variants of the state storage, the Polkadot Host does not specify how a node is required to accomplish this task. Instead, the Polkadot Host is required to implement \$"Set-State-At"\$ (Definition 30):

Definition 30. Set State At Block

The function:

\$"Set-State-At"(B)\$

in which \$B\$ is a block in the block tree (Definition 4), sets the content of the state storage equal to the resulting state of executing all extrinsics contained in the branch of the block tree from the genesis block to block \$B\$, including those recorded in \$B\$ itself.

For the definition of the state storage see Section 2.4.

2.5. Child Storage

As clarified in Section 2.4, the Polkadot state storage implements a hash table for inserting and reading key-value entries. The child storage works the same way but is stored in a separate and isolated environment. Entries in the child storage are not directly accessible via querying the main state storage.

The Polkadot Host supports as many child storages as required by Runtime and identifies each separate child storage by its unique identifying key. Child storages are usually used in situations where Runtime deals with multiple instances of a certain type of objects such as Parachains or Smart Contracts. In such cases, the execution of the Runtime entrypoint might result in generating repeated keys across multiple instances of certain objects. Even with repeated keys, all such instances of key-value pairs must be able to be stored within the Polkadot state.

In these situations, the child storage can be used to provide the isolation necessary to prevent any undesired interference between the state of separated instances. The Polkadot Host makes no assumptions about how child storages are used, but provides the functionality for it via the Host API (Section B.3).

2.5.1. Child Tries

The child trie specification is the same as the one described in Section 2.4.3. Child tries have their own isolated environment. Nonetheless, the main Polkadot state trie depends on them by storing a node (\$K_N, V_N\$) which corresponds to an individual child trie. Here, \$K_N\$ is the child storage key associated to the child trie, and \$V_N\$ is the Merkle value of its corresponding child trie computed according to the procedure described in Section 2.4.4.

The Polkadot Host API (Section B.3) allows the Runtime to provide the key \$K_N\$ in order to identify the child trie, followed by a second key in order to identify the value within that child trie. Every time a child trie is modified, the Merkle value \$V_N\$ of the child trie stored in the Polkadot state must be updated first. After that, the final Merkle value of the Polkadot state can be computed. This mechanism provides a proof of the full Polkadot state, including all its child states.

2.6. Runtime Interactions

Like any transaction-based transition system, Polkadot’s state is changed by executing an ordered set of instructions. These instructions are known as extrinsics. In Polkadot, the execution logic of the state transition function is encapsulated in a Runtime (Definition 1). For easy upgradability this Runtime is presented as a Wasm blob. Nonetheless, the Polkadot Host needs to be in constant interaction with the Runtime (Section 2.6.1).

In Section 2.3, we specify the process by which extrinsics are submitted, pre-processed, and validated by the Runtime and queued to be applied to the current state.

To make state replication feasible, Polkadot journals and batches series of its extrinsics together into a structure known as a block, before propagating them to other nodes, similar to most other prominent distributed ledger systems. The specification of the Polkadot block as well as the process of verifying its validity are both explained in Section 2.2.

2.6.1. Interacting with the Runtime

The Runtime (Definition 1) is the code implementing the logic of the chain. This code is decoupled from the Polkadot Host to make the logic of the chain easily upgradable without the need to upgrade the Polkadot Host itself. The general procedure to interact with the Runtime is described by Algorithm 4.

Algorithm 4 Interact-With-Runtime

Require: \$F, H_b(B), (A_1, ..., A_n)\$

1: \$S_B larr\$ Set-State-At(\$H_b(B)\$)
2: \$A larr "Enc"_(SC)((A_1, ..., A_n))\$
3: Call-Runtime-Entrypoint(\$R_B\$, \$RE_B\$, \$F\$, \$A\$, \$A_("len")\$)

where
  • \$F\$ is the runtime entrypoint call.

  • \$H_b(B)\$ is the block hash indicating the state at the end of \$B\$.

  • \$A_1,...,A_n\$ are arguments to be passed to the runtime entrypoint.

In this section, we describe the details upon which the Polkadot Host interacts with the Runtime. In particular, the \$"Set-State-At"\$ and \$"Call-Runtime-Entrypoint"\$ procedures called by Algorithm 4 are explained in Definition 30 and Definition 32, respectively. \$R_B\$ is the Runtime code loaded from \$S_B\$, as described in Definition 31, and \$RE_B\$ is the Polkadot Host API, as described in Definition 201.

2.6.2. Loading the Runtime Code

The Polkadot Host expects to receive the code for the Runtime of the chain as a compiled WebAssembly (Wasm) Blob. The current runtime is stored in the state database under the key represented as a byte array:

\$b := "3A,63,6F,64,65"\$

which is the ASCII byte representation of the string :code (Section A.3). As a result of storing the Runtime as part of the state, the Runtime code itself becomes state sensitive and calls to Runtime can change the Runtime code itself. Therefore the Polkadot Host needs to always make sure to provide the Runtime corresponding to the state in which the entrypoint has been called. Accordingly, we define \$R_B\$ (Definition 31).

The initial Runtime code of the chain is provided as part of the genesis state (Section A.3) and subsequent calls to the Runtime have the ability to, in turn, upgrade the Runtime by replacing this Wasm blob with the help of the storage API (Section B.2). Therefore, the executor must always load the latest Runtime from storage - or preferably detect Runtime upgrades (Definition 11) - either based on the parent block when importing blocks or the best/highest block when creating new blocks.

Definition 31. Runtime Code at State

By \$R_B\$, we refer to the Runtime code stored in the state storage at the end of the execution of block \$B\$.

2.6.3. Code Executor

The Polkadot Host executes the calls of Runtime entrypoints inside a Wasm Virtual Machine (VM), which in turn provides the Runtime with access to the Polkadot Host API. This part of the Polkadot Host is referred to as the Executor.

Definition 32 introduces the notation for calling the runtime entrypoint which is used whenever an algorithm of the Polkadot Host needs to access the runtime.

It is acceptable behavior that the Runtime panics during execution of a function in order to indicate an error. The Polkadot Host must be able to catch that panic and recover from it.

In this section, we specify the general setup for an Executor that calls into the Runtime. In Appendix C we specify the parameters and return values for each Runtime entrypoint separately.

By

\$"Call-Runtime-Entrypoint"(R,RE,"Runtime-Entrypoint",A,A_len)\$

we refer to the task of using the executor to invoke \$"Runtime-Entrypoint"\$ while passing the \$A_1, ..., A_n\$ arguments to it, using the encoding described in Section 2.6.3.2.

2.6.3.1. Memory Management

The Polkadot Host is responsible for managing the WASM heap memory starting at the exported symbol __heap_base as a part of implementing the allocator Host API (Section B.9), and the same allocator should be used for any other heap allocation to be used by the Polkadot Runtime.

The size of the provided WASM memory should be based on the value of the :heappages storage key (an unsigned 64-bit integer), where each page has a size of 64KB. This memory should be made available to the Polkadot Runtime for import under the symbol name memory.

2.6.3.2. Sending Data to a Runtime Entrypoint

In general, all data exchanged between the Polkadot Host and the Runtime is encoded using the SCALE codec described in Section A.2.2. Therefore, all Runtime entrypoints have the following identical Wasm function signature:

(func $runtime_entrypoint (param $data i32) (param $len i32) (result i64))

In each invocation of a Runtime entrypoint, the arguments which are supposed to be sent to the entrypoint need to be SCALE-encoded into a byte array \$B\$ (Section A.2.2) and copied into a section of Wasm shared memory managed by the shared allocator described in Section 2.6.3.1.

When the Wasm method, corresponding to the entrypoint, is invoked, two integers are passed as arguments. The first argument is set to the memory address of the byte array \$B\$ in Wasm memory. The second argument sets the length of the encoded data stored in \$B\$.

2.6.3.3. Receiving Data from a Runtime Entrypoint

The value which is returned from the invocation is a 64-bit integer representing two consecutive 32-bit integers, in which the least significant one indicates the pointer to the offset of the SCALE-encoded result returned by the entrypoint in the memory buffer. The most significant one provides the size of the blob.
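A minimal sketch of unpacking such a result on the host side:

    // The lower 32 bits are the pointer into Wasm memory, the upper 32 bits
    // are the length of the SCALE-encoded result.
    fn unpack_runtime_result(ret: i64) -> (u32, u32) {
        let ptr = (ret & 0xffff_ffff) as u32;  // least significant half: offset
        let len = ((ret as u64) >> 32) as u32; // most significant half: size
        (ptr, len)
    }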

2.6.3.4. Runtime Version Custom Section

For newer Runtimes, the Runtime version (Section C.4.1) can be read directly from the Wasm custom section with the name runtime_version. The content is a SCALE encoded structure as described in Section C.4.1.

Retrieving the Runtime version this way is preferred over calling the Core_version entrypoint since it involves significantly less overhead.

3. Synchronization

Many applications that interact with the Polkadot network must be able to retrieve certain information about the network. Depending on the utility, this includes validators that interact with Polkadot’s consensus and need access to the full state, either from the past or just the most up-to-date state, or light clients that are only interested in the minimum information required to verify some claims about the state of the network, such as the balance of a specific account. To allow implementations to quickly retrieve the required information, different types of synchronization protocols are available, respectively Full, Fast, and Warp sync, suited for different needs.

The associated network messages are specified in Section 4.8.

3.1. Warp Sync

Warp sync (Section 4.8.4) only downloads the block headers where authority set changes occurred, the so-called fragments (Definition 41), and verifies the GRANDPA justifications (Definition 45). This protocol allows nodes to arrive at the desired state much faster than fast sync.

3.2. Fast Sync

Fast sync works by downloading the block header history and validating the authority set changes (Section 3.3.1) in order to arrive at a specific (usually the most recent) header. After the desired header has been reached and verified, the state can be downloaded and imported (Section 4.8.3). Once this process has been completed, the node can proceed with a full sync.

3.3. Full Sync

The full sync protocol is the "default" protocol that is suited for many types of implementations, such as archive nodes (nodes that store everything), validators that participate in Polkadot’s consensus, and light clients that only verify claims about the state of the network. Full sync works by listening to announced blocks (Section 4.8.1) and requesting the blocks from the announcing peers, or just the block headers in the case of light clients.

The full sync protocol usually downloads the entire chain, but this is not a requirement. If an implementation only needs the latest, finalized state, it can combine full sync with protocols such as fast sync (Section 3.2) and/or warp sync (Section 3.1) to make synchronization as fast as possible.

3.3.1. Consensus Authority Set

Because Polkadot is a proof-of-stake protocol, each of its consensus engines has its own set of nodes represented by known public keys, which have the authority to influence the protocol in pre-defined ways explained in this Section. To verify the validity of each block, the Polkadot node must track the current list of authorities (Definition 33) for that block.

Definition 33. Authority List

The authority list of block \$B\$ for consensus engine \$C\$ noted as \$"Auth"_C(B)\$ is an array that contains the following pair of types for each of its authorities \$A in "Auth"_C(B)\$:

\$(pk_A,w_A)\$

\$pk_A\$ is the session public key (Definition 180) of authority \$A\$. And \$w_A\$ is an unsigned 64-bit integer indicating the authority weight. The value of \$"Auth"_C(B)\$ is part of the Polkadot state. The value for \$"Auth"_C(B_0)\$ is set in the genesis state (Section A.3) and can be retrieved using a runtime entrypoint corresponding to consensus engine \$C\$.

The authorities and their corresponding weights can be retrieved from the Runtime (Section C.10.1).

3.3.2. Runtime-to-Consensus Engine Message

The authority list (Definition 33) is part of the Polkadot state and the Runtime has the authority to update this list in the course of any state transitions. The Runtime informs the corresponding consensus engine about the changes in the authority set by adding the appropriate consensus message in the form of a digest item (Definition 11) to the block header of block \$B\$ which caused the transition in the authority set.

The Polkadot Host must inspect the digest header of each block and delegate consensus messages to the corresponding consensus engines. The BABE and GRANDPA consensus engines must react based on the type of consensus messages they receive. The active GRANDPA authorities can only vote for blocks that occurred after the finalized block in which they were selected. Any votes for blocks from before the new authority set came into effect will be rejected.

3.4. Importing and Validating Block

Block validation is the process by which a node asserts that a block is fit to be added to the blockchain. This means that the block is consistent with the current state of the system and transitions it to a new valid state.

New blocks can be received by the Polkadot Host via other peers (Section 4.8.2) or from the Host’s own consensus engine (Chapter 5). Both the Runtime and the Polkadot Host then need to work together to assure block validity. A block is deemed valid if the block author had authorship rights for the slot in which the block was produced, and if the transactions in the block constitute a valid transition of states. The former criterion is validated by the Polkadot Host according to the block production consensus protocol. The latter can be verified by the Polkadot Host invoking the \$tt "Core_execute_block"\$ Runtime entrypoint (Section C.4.2) as a part of the validation process. Any state changes created by this function on successful execution are persisted.

The Polkadot Host implements Algorithm 5 to assure the validity of the block.

Algorithm 5 Import-and-Validate-Block

Require: \$B, "Just"(B)\$

1: Set-Storage-State-At(\$P(B)\$)
2: if \$"Just"(B) != phi\$ then
3:    Verify-Block-Justification(\$B\$, \$"Just"(B)\$)
4:    if \$B\$ is Finalized and \$P(B)\$ is not Finalized then
5:       Mark-as-Final(\$P(B)\$)
6:    end if
7: end if
8: if \$H_p(B) !in "PBT"\$ then
9:    return
10: end if
11: Verify-Authorship-Right(\$"Head"(B)\$)
12: \$B larr\$ Remove-Seal(\$B\$)
13: \$R larr\$ Call-Runtime-Entry(\$tt "Core_execute_block"\$, \$B\$)
14: \$B larr\$ Add-Seal(\$B\$)
15: if \$R = "True"\$ then
16:    Persist-State()
17: end if

where
  • \$"Remove-Seal"\$ removes the Seal digest from the block (Definition 11) before submitting it to the Runtime.

  • \$"Add-Seal"\$ adds the Seal digest back to the block (Definition 11) for later propagation.

  • \$"Persist-State"\$ implies the persistence of any state changes created by \$tt "Core_execute_block"\$ (Section C.4.2) on successful execution.

  • \$"PBT"\$ is the pruned block tree (Definition 4).

  • \$"Verify-Authorship-Right"\$ is part of the block production consensus protocol and is described in Algorithm 10.

  • Finalized block and finality are defined in Chapter 6.

4. Networking

This chapter in its current form is incomplete and considered work in progress. The authors appreciate receiving requests for clarification or any reports regarding deviations from the current Polkadot network protocol. This can be done through filing an issue in the Polkadot Specification repository.

4.1. Introduction

The Polkadot network is decentralized and does not rely on any central authority or entity to achieve its fullest potential of provided functionality. The networking protocol is based on a family of open protocols, including protocols implemented by libp2p, e.g. the Kademlia distributed hash table, which is used for peer discovery.

This chapter walks through the behavior of the networking implementation of the Polkadot Host and defines the network messages. The implementation details of the libp2p protocols used are specified in external sources, as described in Section 4.2.

4.2. External Documentation

Complete specification of the Polkadot networking protocol relies on the following external protocols:

  • libp2p - libp2p is a modular peer-to-peer networking stack composed of many modules and different parts, including the multiplexing protocols.

  • libp2p addressing - The Polkadot Host uses the libp2p addressing system to identify and connect to peers.

  • Kademlia - Kademlia is a distributed hash table for decentralized peer-to-peer networks. The Polkadot Host uses Kademlia for peer discovery.

  • Noise - The Noise protocol is a framework for building cryptographic protocols. The Polkadot Host uses Noise to establish the encryption layer to remote peers.

  • yamux - yamux is a multiplexing protocol developed by HashiCorp. It is the de-facto standard for the Polkadot Host. Section 4.7 describes the subprotocol in more detail.

  • Protocol Buffers - Protocol Buffers is a language-neutral, platform-neutral mechanism for serializing structured data and is developed by Google. The Polkadot Host uses Protocol Buffers to serialize specific messages, as clarified in Section 4.8.

4.3. Node Identities

Each Polkadot Host node maintains an ED25519 key pair which is used to identify the node. The public key is shared with the rest of the network allowing the nodes to establish secure communication channels.

Each node must have its own unique ED25519 key pair. If two or more nodes use the same key, the network will interpret those nodes as a single node, which will result in unspecified behavior. Furthermore, the node’s PeerId as defined in Definition 34 is derived from its public key. PeerId is used to identify each node when they are discovered in the course of the discovery mechanism described in Section 4.4.

Definition 34. PeerId

The Polkadot node’s PeerId, formally referred to as \$P_(id)\$, is derived from the ED25519 public key and is structured based on the libp2p specification, but does not fully conform to the specification. Specifically, it does not support CID and the only supported key type is ED25519.

The byte representation of the PeerId is always of the following bytes in this exact order:

\$b_0 = 0\$ \$b_1 = 36\$ \$b_2 = 8\$ \$b_3 = 1\$ \$b_4 = 18\$ \$b_5 = 32\$ \$b_(6..37) = ...\$

where

  • \$b_0\$ is the multihash prefix of value \$0\$ (implying no hashing is used).

  • \$b_1\$ the length of the PeerId (remaining bytes).

  • \$b_2\$ and \$b_3\$ are a protobuf encoded field-value pair indicating the used key type (field \$1\$ of value \$1\$ implies ED25519).

  • \$b_4\$, \$b_5\$ and \$b_(6..37)\$ are a protobuf encoded field-value pair, where \$b_5\$ indicates the length of the public key, followed by the raw ED25519 public key itself, which varies for each Polkadot Host and is always 32 bytes (field \$2\$ contains the public key, which has a field value length prefix), as assembled in the sketch below.
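A minimal sketch assembling this byte representation from a raw 32-byte ED25519 public key:

    fn peer_id_bytes(ed25519_public: &[u8; 32]) -> Vec<u8> {
        let mut out = Vec::with_capacity(38);
        out.extend_from_slice(&[0, 36]);  // b_0, b_1: multihash prefix 0, length 36
        out.extend_from_slice(&[8, 1]);   // b_2, b_3: protobuf field 1 = key type ED25519
        out.extend_from_slice(&[18, 32]); // b_4, b_5: protobuf field 2, key length 32
        out.extend_from_slice(ed25519_public); // b_6..37: the raw public key
        out
    }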

4.4. Discovery mechanism

The Polkadot Host uses various mechanisms to find peers within the network, to establish and maintain a list of peers and to share that list with other peers from the network as follows:

  • Bootstrap nodes are hard-coded node identities and addresses provided by the genesis state (Section A.3).

  • mDNS is a protocol that performs a broadcast to the local network. Nodes that might be listening can respond to the broadcast. The libp2p mDNS specification defines this process in more detail. This protocol is an optional implementation detail for Polkadot Host implementers and is not required to participate in the Polkadot network.

  • Kademlia requests - invoking Kademlia requests, where nodes respond with their list of available peers. Kademlia requests are performed on a specific substream as described in Section 4.7.

4.5. Connection establishment

Polkadot nodes connect to peers by establishing a TCP connection. Once established, the node initiates a handshake with the remote peers on the encryption layer. An additional layer on top of the encryption layer, known as the multiplexing layer, allows a connection to be split into substreams, as described by the yamux specification, either by the local or remote node.

The Polkadot node supports two types of substream protocols. Section 4.7 describes the usage of each type in more detail:

  • Request-Response substreams: After the protocol is negotiated by the multiplexing layer, the initiator sends a single message containing a request. The responder then sends a response, after which the substream is then immediately closed. The requests and responses are prefixed with their LEB128 encoded length.

  • Notification substreams: After the protocol is negotiated, the initiator sends a single handshake message. The responder can then either accept the substream by sending its own handshake or reject it by closing the substream. After the substream has been accepted, the initiator can send an unbounded number of individual messages. The responder keeps its sending side of the substream open, despite not sending anything anymore, and can later close it in order to signal to the initiator that it no longer wishes to communicate.

    Handshakes and messages are prefixed with their LEB128 encoded lengths. A handshake can be empty, in which case the length prefix would be 0.
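As an illustration of this framing, the following is a minimal sketch of the unsigned LEB128 encoding used for these length prefixes; the function name is illustrative:

```rust
// A minimal sketch of unsigned LEB128: emit 7 bits per byte, least
// significant group first, setting the high bit while more bytes follow.
fn leb128_encode(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let mut byte = (value & 0x7f) as u8; // low 7 bits of the value
        value >>= 7;
        if value != 0 {
            byte |= 0x80; // continuation bit: more bytes follow
        }
        out.push(byte);
        if value == 0 {
            return out; // e.g. leb128_encode(0) == [0], an empty handshake
        }
    }
}
```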

Connections are established by using the following protocols:

  • /noise - a protocol that is announced when a connection to a peer is established.

  • /multistream/1.0.0 - a protocol that is announced when negotiating an encryption protocol or a substream.

  • /yamux/1.0.0 - a protocol used during yamux negotiation. See Section 4.7 for more information.

The Polkadot Host can establish a connection with any peer of which it knows the address. The Polkadot Host supports multiple networking protocols:

  • TCP/IP with addresses in the form of /ip4/1.2.3.4/tcp/30333 to establish a TCP connection and negotiate encryption and a multiplexing layer.

  • WebSocket with addresses in the form of /ip4/1.2.3.4/tcp/30333/ws to establish a TCP connection and negotiate the WebSocket protocol within the connection. Additionally, the encryption and multiplexing layers are negotiated within the WebSocket connection.

  • DNS addresses in form of /dns/example.com/tcp/30333 and /dns/example.com/tcp/30333/ws.

The addressing system is described in the libp2p addressing specification. After a base-layer protocol is established, the Polkadot Host will apply the Noise protocol to establish the encryption layer as described in Section 4.6.

4.6. Encryption Layer

The Polkadot protocol uses the libp2p Noise framework to build an encryption protocol. Noise is a framework for building encryption protocols, which libp2p utilizes for establishing encrypted communication channels. Refer to the libp2p Secure Channel Handshake specification for a detailed description.

Polkadot nodes use the XX handshake pattern to establish a connection between peers. The three following steps are required to complete the handshake process:

  1. The initiator generates a keypair and sends the public key to the responder. The Noise specification and the libp2p PeerId specification describe keypairs in more detail.

  2. The responder generates its own key pair and sends its public key back to the initiator. After that, the responder derives a shared secret and uses it to encrypt all further communication. The responder now sends its static Noise public key (which may change anytime and does not need to be persisted on disk), its libp2p public key and a signature of the static Noise public key signed with the libp2p public key.

  3. The initiator derives a shared secret and uses it to encrypt all further communication. It also sends its static Noise public key, libp2p public key and signature to the responder.

After these three steps, both the initiator and responder derive a new shared secret using the static and session-defined Noise keys, which are used to encrypt all further communication.

4.7. Protocols and Substreams

After the node establishes a connection with a peer, the use of multiplexing allows the Polkadot Host to open substreams. libp2p uses the yamux protocol to manage substreams and to allow the negotiation of application-specific protocols, where each protocol serves a specific utility.

The Polkadot Host uses multiple substreams whose usage depends on a specific purpose. Each substream is either a Request-Response substream or a Notification substream, as described in Section 4.5.

The prefixes on those substreams are known as protocol identifiers and are used to segregate communications to specific networks. This prevents any interference with other networks. dot is used exclusively for Polkadot; Kusama, for example, uses the protocol identifier ksmcc3. The Polkadot Host uses the following substream protocols:
  • /ipfs/ping/1.0.0 - Open a standardized substream libp2p to a peer and initialize a ping to verify if a connection is still alive. If the peer does not respond, the connection is dropped. This is a Request-Response substream.

    Further specification and reference implementation are available in the libp2p documentation.

  • /ipfs/id/1.0.0 - Open a standardized libp2p substream to a peer to ask for information about that peer. This is a Request-Response substream, but the initiator does not send any message to the responder and only waits for the response.

    Further specification and reference implementation are available in the libp2p documentation.

  • /dot/kad - Open a standardized substream for Kademlia FIND_NODE requests. This is a Request-Response substream, as defined by the libp2p standard.

    Further specification and reference implementation are available on Wikipedia and in the golang GitHub repository, respectively.

  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/light/2 - a request and response protocol that allows a light client to request information about the state. This is a Request-Response substream.

    The messages are specified in Section 7.4.

    For backwards compatibility reasons, /dot/light/2 is also a valid substream for those messages.
  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/block-announces/1 - a substream/notification protocol which sends blocks to connected peers. This is a Notification substream.

    The messages are specified in Section 4.8.1.

    For backwards compatibility reasons, /dot/block-announces/1 is also a valid substream for those messages.
  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/sync/2 - a request and response protocol that allows the Polkadot Host to request information about blocks. This is a Request-Response substream.

    The messages are specified in Section 4.8.2.

    For backwards compatibility reasons, /dot/sync/2 is also a valid substream for those messages.
  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/sync/warp - a request and response protocol that allows the Polkadot Host to perform a warp sync request. This is a Request-Response substream.

    The messages are specified in Section 4.8.4.

    For backwards compatibility reasons, /dot/sync/warp is also a valid substream for those messages.
  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/transactions/1 - a substream/notification protocol which sends transactions to connected peers. This is a Notification substream.

    The messages are specified in Section 4.8.5.

    For backwards compatibility reasons, /dot/transactions/1 is also a valid substream for those messages.
  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/grandpa/1 - a substream/notification protocol that sends GRANDPA votes to connected peers. This is a Notification substream.

    The messages are specified in Section 4.8.6.

    For backwards compatibility reasons, /paritytech/grandpa/1 is also a valid substream for those messages.
  • /91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3/beefy/1 - a substream/notification protocol which sends signed BEEFY statements, as described in Section 6.7, to connected peers. This is a Notification substream.

    The messages are specified in Section 4.8.7.

    For backwards compatibility reasons, /paritytech/beefy/1 is also a valid substream for those messages.

4.8. Network Messages

The Polkadot Host must actively communicate with the network in order to participate in the validation process or act as a full node.

The Polkadot network originally used SCALE encoding for all message formats. Since then, Protobuf has been adopted for certain messages. The encoding of each listed message is SCALE unless Protobuf is explicitly mentioned. Encoding and message formats are subject to change.

4.8.1. Announcing blocks

When the node creates or receives a new block, it must be announced to the network. Other nodes within the network will track this announcement and can request information about this block. The mechanism for tracking announcements and requesting the required data is implementation-specific.

Block announcements, requests and responses are sent over the substream as described in Definition 35.

The BlockAnnounceHandshake initializes a substream to a remote peer. Once established, all BlockAnnounce messages (Definition 36) created by the node are sent to the /dot/block-announces/1 substream.

Definition 35. Block Announce Handshake

The BlockAnnounceHandshake is a structure of the following format:

\$BA_h = "Enc"_("SC")(R, N_B, h_B, h_G)\$

where:

\$R = {(1,"The node is a full node"),(2,"The node is a light client"),(4,"The node is a validator"):}\$ \$N_B = "Best block number according to the node"\$ \$h_B = "Best block hash according to the node"\$ \$h_G = "Genesis block hash according to the node"\$
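A hedged sketch of this handshake as a SCALE-encodable structure is shown below, assuming the parity-scale-codec crate, Polkadot's 32-bit block numbers and 32-byte block hashes; the struct and field names are illustrative:

```rust
use parity_scale_codec::{Decode, Encode};

// A minimal sketch, not normative: the BlockAnnounceHandshake wire format.
#[derive(Encode, Decode)]
struct BlockAnnounceHandshake {
    role: u8,               // R: 1 = full node, 2 = light client, 4 = validator
    best_number: u32,       // N_B: best block number according to the node
    best_hash: [u8; 32],    // h_B: best block hash according to the node
    genesis_hash: [u8; 32], // h_G: genesis block hash according to the node
}
```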
Definition 36. Block Announce

The BlockAnnounce message is sent to the specified substream and indicates to remote peers that the node has either created or received a new block.

The message is a structure of the following format:

\$BA = "Enc"_("SC")("Head"(B),b)\$

where:

\$"Head"(B) = "Header of the announced block"\$ \$b = {(0,"Is not part of the best chain"),(1,"Is the best block according to the node"):}\$

4.8.2. Requesting Blocks

Block requests can be used to retrieve a range of blocks from peers. Those messages are sent over the /dot/sync/2 substream.

Definition 37. Block Request

The BlockRequest message is a Protobuf serialized structure of the following format:

Type       Id   Description                                                      Value

uint32     1    Bits of block data to request                                    \$B_f\$
oneof           Start from this block                                            \$B_s\$
Direction  5    Sequence direction, interpreted as Id 0 (ascending) if missing
uint32     6    Maximum amount (optional)                                        \$B_m\$

where
  • \$B_f\$ indicates all the fields that should be included in the request. It is a big-endian encoded bitmask; the flags of all desired fields are combined with a bitwise OR operation. For example, the \$B_f\$ value to request Header and Justification is 0001 0001 (17).

    Field          Value

    Header         0000 0001
    Body           0000 0010
    Justification  0001 0000

  • \$B_s\$ is a Protobuf structure indicating a varying data type (enum) of the following values:

    Type   Id   Description

    bytes  2    The block hash
    bytes  3    The block number

  • Direction is a Protobuf structure indicating the sequence direction of the requested blocks. The structure is a varying data type (enum) of the following format:

    Id   Description

    0    Enumerate in ascending order (from child to parent)
    1    Enumerate in descending order (from parent to canonical child)

  • \$B_m\$ is the maximum number of blocks to be returned. An implementation-defined maximum is used when unspecified.
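To make the field layout concrete, below is a hypothetical prost-style sketch of this message; the type and field names are illustrative and only the field Ids are taken from the table above:

```rust
use prost::{Message, Oneof};

// A minimal sketch, not normative: the BlockRequest Protobuf message.
#[derive(Clone, PartialEq, Message)]
pub struct BlockRequest {
    /// B_f: bitmask of the block data to request.
    #[prost(uint32, tag = "1")]
    pub fields: u32,
    /// B_s: start from this block, given either as hash (Id 2) or number (Id 3).
    #[prost(oneof = "FromBlock", tags = "2, 3")]
    pub from_block: Option<FromBlock>,
    /// Sequence direction, interpreted as 0 (ascending) if missing.
    #[prost(int32, tag = "5")]
    pub direction: i32,
    /// B_m: maximum amount of blocks to return (optional).
    #[prost(uint32, optional, tag = "6")]
    pub max_blocks: Option<u32>,
}

#[derive(Clone, PartialEq, Oneof)]
pub enum FromBlock {
    /// The block hash to start from.
    #[prost(bytes, tag = "2")]
    Hash(Vec<u8>),
    /// The block number to start from.
    #[prost(bytes, tag = "3")]
    Number(Vec<u8>),
}
```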

Definition 38. Block Response

The BlockResponse message is received after sending a BlockRequest message to a peer. The message is a Protobuf serialized structure of the following format:

Type                Id   Description

Repeated BlockData  1    Block data for the requested sequence

where BlockData is a Protobuf structure containing the requested blocks. Note that the optional values are either present or absent depending on the requested fields (bitmask value). The structure has the following format:

Type            Id   Description                                                        Value

bytes           1    Block header hash                                                  Definition 12
bytes           2    Block header (optional)                                            Definition 10
repeated bytes  3    Block body (optional)                                              Definition 13
bytes           4    Block receipt (optional)
bytes           5    Block message queue (optional)
bytes           6    Justification (optional)                                           Definition 78
bool            7    Indicates whether the justification is empty (i.e. should be ignored)

4.8.3. Requesting States

The Polkadot Host can request the state in form of a key/value list at a specified block.

When receiving state entries from the state response messages (Definition 40), the node can verify the entries with the entry proof (Id 1 in KeyValueStateEntry) against the merkle root in the block header (of the block specified in Definition 39). Once the state response message claims that all entries have been sent (Id 3 in KeyValueStateEntry), the node can use all collected entry proofs and validate them against the merkle root to confirm that claim.

See the synchronization chapter for more information (Chapter 3).

Definition 39. State Request

A state request is sent to a peer to request the state at a specified block. The message is a single 32-byte Blake2 hash which indicates the block from which the sync should start.

Depending on which substream is used, the remote peer either sends back a state response (Definition 40) on the /dot/sync/2 substream or a warp sync proof (Definition 41) on the /dot/sync/warp substream.

Definition 40. State Response

The state response is sent to the peer that initialized the state request (Definition 39) and contains a list of key/value entries with an associated proof. This response is sent continuously until all key/value pairs have been submitted.

Type                         Id   Description

repeated KeyValueStateEntry  1    State entries
bytes                        2    State proof

where KeyValueStateEntry is of the following format:

Type                 Id   Description

bytes                1    Root of the entry, empty if top level
repeated StateEntry  2    Collection of key/values
bool                 3    Equal to 'true' if there are no more keys to return

and StateEntry:

Type   Id   Description

bytes  1    The key of the entry
bytes  2    The value of the entry

4.8.4. Warp Sync

The warp sync protocol allows nodes to retrieve blocks from remote peers where authority set changes occurred. This can be used to speed up synchronization to the latest state.

See the synchronization chapter for more information (Chapter 3).

Definition 41. Warp Sync Proof

The warp sync proof message, \$P\$, is sent to the peer that initialized the state request (Definition 39) on the /dot/sync/warp substream and contains the accumulated proof of multiple authority set changes (Section 3.3.2). It is a data structure of the following format:

\$P = (f_x...f_y, c)\$

\$f_x...f_y\$ is an array consisting of warp sync fragments of the following format:

\$f_x = (B_h, J^(r,"stage")(B))\$

where \$B_h\$ is the last block header containing a digest item (Definition 11) signaling an authority set change, from which the next authority set change can be fetched. \$J^(r,"stage")(B)\$ is the GRANDPA justification (Definition 78) and \$c\$ is a boolean that indicates whether the warp sync has been completed.

4.8.5. Transactions

Transactions (Section 2.3) are sent directly to peers with which the Polkadot Host has an open transaction substream (Definition 42). Polkadot Host implementers should implement a mechanism that only sends a transaction once to each peer and avoids sending duplicates. Sending duplicate transactions might result in undefined consequences such as being blocked for bad behavior by peers.

The mechanism for managing transactions is further described in Section 2.3.

Definition 42. Transaction Message

The transactions message is the structure of how the transactions are sent over the network. It is represented by \$M_T\$ and is defined as follows:

\$M_T := "Enc"_("SC")(C_1,...,C_n)\$

in which:

\$C_i := "Enc"_("SC")(E_i)\$

Where each \$E_i\$ is a byte array and represents a separate extrinsic. The Polkadot Host is agnostic about the content of an extrinsic and treats it as a blob of data.

Transactions are sent over the /dot/transactions/1 substream.
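Since each extrinsic is an opaque blob, the whole message reduces to nested SCALE encodings. A minimal sketch, assuming the parity-scale-codec crate:

```rust
use parity_scale_codec::Encode;

// A minimal sketch, not normative: M_T is the SCALE encoding of the
// collection of extrinsics, each of which is itself SCALE encoded as an
// opaque byte array C_i = Enc_SC(E_i).
fn encode_transactions_message(extrinsics: &[Vec<u8>]) -> Vec<u8> {
    extrinsics.encode()
}
```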

4.8.6. GRANDPA Messages

The exchange of GRANDPA messages is conducted on the GRANDPA substream (Section 4.7). The process of creating and distributing these messages is described in Chapter 6. The underlying messages are specified in this section.

Definition 43. Grandpa Gossip Message

A GRANDPA gossip message, \$M\$, is a varying datatype (Definition 188) which identifies the message type that is cast by a voter followed by the message itself.

\$M = {(0,"Vote message", V_m),(1,"Commit message", C_m),(2,"Neighbor message", N_m),(3,"Catch-up request message",R_m),(4,"Catch-up message",U_m):}\$
where \$V_m\$ is the vote message (Definition 44), \$C_m\$ the commit message (Definition 46), \$N_m\$ the neighbor message (Definition 47), \$R_m\$ the catch-up request message (Definition 48) and \$U_m\$ the catch-up message (Definition 49).
Definition 44. GRANDPA Vote Messages

A GRANDPA vote message, \$M_v^(r,"stage")\$, is gossiped to the network by voter \$v\$ with the following structure:

\$M_v^(r,"stage")(B) := "Enc"_("SC")(r,"id"_(bbb V),"SigMsg")\$ \$"SigMsg" := ("msg","Sig"_(v_i)^(r,"stage"),v_("id"))\$ \$"msg" := "Enc"_("SC")("stage",V_v^(r,"stage")(B))\$
where
  • \$r\$ is an unsigned 64-bit integer indicating the GRANDPA round number (Definition 76).

  • \$"id"_(bbb V)\$ is an unsigned 64-bit integer indicating the authority Set Id (Definition 33).

  • \$"Sig"_(v_i)^(r,"stage")\$ is a 512-bit byte array containing the signature of the authority (Definition 77).

  • \$v_(id)\$ is a 256-bit byte array containing the ed25519 public key of the authority.

  • \$"stage"\$ is a 8-bit integer of value 0 if it’s a pre-vote sub-round, 1 if it’s a pre-commit sub-round or 2 if it’s a primary proposal message.

  • \$V_v^(r,"stage")(B)\$ is the GRANDPA vote for block \$B\$ (Definition 76).

This message is the sub-component of the GRANDPA gossip message (Definition 43) of type Id 0.

Definition 45. GRANDPA Compact Justification Format

The GRANDPA compact justification format is an optimized data structure to store a collection of pre-commits and their signatures to be submitted as part of a commit message. Instead of storing an array of justifications, it uses the following format:

\$J_(v_(0,...n))^(r,"comp") := ({V_(v_0)^(r,pc),... V_(v_n)^(r,pc)},{("Sig"_(v_0)^(r,pc),v_("id"_0)), ... ("Sig"_(v_n)^(r,pc),v_("id"_n))})\$
where
  • \$V_(v_i)^(r,pc)\$ is a 256-bit byte array containing the pre-commit vote of authority \$v_i\$ (Definition 76).

  • \$"Sig"_(v_i)^(r,pc)\$ is a 512-bit byte array containing the pre-commit signature of authority \$v_i\$ (Definition 77).

  • \$v_("id"_n)\$ is a 256-bit byte array containing the public key of authority \$v_i\$.

Definition 46. GRANDPA Commit Message

A GRANDPA commit message for block \$B\$ in round \$r\$, \$M_v^(r,"Fin")(B)\$, is a message broadcasted by voter \$v\$ to the network indicating that voter \$v\$ has finalized block \$B\$ in round \$r\$. It has the following structure:

\$M_v^(r,"Fin")(B) := "Enc"_("SC")(r,"id"_(bbb V),V_v^r(B),J_(v_(0,...n))^(r,"comp"))\$
where
  • \$r\$ is an unsigned 64-bit integer indicating the round number (Definition 76).

  • \$id_(bbb V)\$ is the authority set Id (Definition 33).

  • \$V_v^r(B)\$ is a 256-bit array containing the GRANDPA vote for block \$B\$ (Definition 75).

  • \$J_(v_(0,...n))^(r,"comp")\$ is the compacted GRANDPA justification containing observed pre-commit of authorities \$v_0\$ to \$v_n\$ (Definition 45).

This message is the sub-component of the GRANDPA gossip message (Definition 43) of type Id 1.

4.8.6.1. GRANDPA Neighbor Messages

Neighbor messages are sent to all connected peers, but they are not repropagated on reception. A message should be sent whenever the message's values change and at least every 5 minutes. The sender should take the recipient's state into account and avoid sending messages to peers that are using a different voter set or are in a different round. Messages received from a future voter set or round can be dropped and ignored.

Definition 47. GRANDPA Neighbor Message

A GRANDPA Neighbor Message is defined as:

\$M^("neigh") := "Enc"_("SC")(v,r,"id"_(bbb V),H_i(B_("last")))\$
where
  • \$v\$ is an unsigned 8-bit integer indicating the version of the neighbor message, currently 1.

  • \$r\$ is an unsigned 64-bit integer indicating the round number (Definition 76).

  • \$"id"_(bbb V)\$ is an unsigned 64-bit integer indicating the authority Id (Definition 33).

  • \$H_i(B_("last"))\$ is an unsigned 32-bit integer indicating the block number of the last finalized block \$B_("last")\$.

This message is the sub-component of the GRANDPA gossip message (Definition 43) of type Id 2.

4.8.6.2. GRANDPA Catch-up Messages

Whenever a Polkadot node detects that it is lagging behind the finality procedure, it needs to initiate a catch-up procedure. GRANDPA Neighbor messages (Definition 47) reveal the round number for the last finalized GRANDPA round which the node's peers have observed. This provides the means to identify a discrepancy in the latest finalized round number observed among the peers. If such a discrepancy is observed, the node needs to initiate the catch-up procedure explained in Section 6.6.1.

In particular, this procedure involves sending a catch-up request and processing catch-up response messages.

Definition 48. Catch-Up Request Message

A GRANDPA catch-up request message for round \$r\$, \$M_(i,v)^("Cat"-q)("id"_(bbb V),r)\$, is a message sent from node \$i\$ to its voting peer node \$v\$ requesting the latest status of a GRANDPA round \$r' > r\$ of the authority set \$bbb V_("id")\$, along with the justification of the status. It has the following structure:

\$M_(i,v)^(r,"Cat"-q) := "Enc"_("SC")(r,"id"_(bbb V))\$

This message is the sub-component of the GRANDPA Gossip message (Definition 43) of type Id 3.

Definition 49. Catch-Up Response Message

A GRANDPA catch-up response message for round \$r\$, \$M_(v,i)^("Cat"-s)("id"_(bbb V),r)\$, is a message sent by a node \$v\$ to node \$i\$ in response to a catch-up request \$M_(v,i)^("Cat"-q)("id"_(bbb V),r')\$ in which \$r >= r'\$ is the latest GRANDPA round for which \$v\$ has proof of finalization. It has the following structure:

\$M_(v,i)^("Cat"-s) := "Enc"_("SC")("id"_(bbb V), r, J_(0,...n)^(r,"pv")(B), J_(0,...m)^(r,"pc")(B),H_h(B'),H_i(B'))\$

where \$B\$ is the highest block which \$v\$ believes to be finalized in round \$r\$ (Definition 76) and \$B'\$ is the highest ancestor of all blocks voted on in the arrays of justifications \$J_(0,...n)^(r,"pv")(B)\$ and \$J_(0,...m)^(r,"pc")(B)\$ (Definition 78), with the exception of the equivocatory votes.

This message is the sub-component of the GRANDPA Gossip message (Definition 43) of type Id 4.

4.8.7. GRANDPA BEEFY

The BEEFY protocol is currently in early development and subject to change.

This section defines the messages required for the GRANDPA BEEFY protocol (Section 6.7). Those messages are sent over the /paritytech/beefy/1 substream.

Definition 50. Commitment

A commitment, \$C\$, contains the information extracted from the finalized block at height \$H_i(B_("last"))\$, as specified in the message body, and is a data structure of the following format:

\$C = (R_h,H_i(B_("last")),"id"_(bbb V))\$
where
  • \$R_h\$ is the MMR root of all the block header hashes leading up to the latest, finalized block.

  • \$H_i(B_("last"))\$ is the block number this commitment is for. Namely the latest, finalized block.

  • \$"id"_(bbb V)\$ is the current authority set Id (Definition 73).

Definition 51. Vote Message

A vote message, \$M_v\$, is a direct vote created by the Polkadot Host on every BEEFY round and is gossiped to its peers. The message is a data structure of the following format:

\$M_v = "Enc"_("SC")(C,A_("id")^("bfy"),A_("sig"))\$
where
  • \$C\$ is the BEEFY commitment (Definition 50).

  • \$A_("id")^("bfy")\$ is the ECDSA public key of the Polkadot Host.

  • \$A_("sig")\$ is the signature created with \$A_("id")^("bfy")\$ by signing the statement \$R_h\$ in \$C\$.

Definition 52. Signed Commitment

A signed commitment, \$M_("SC")\$, is a data structure of the following format:

\$M_("SC") = "Enc"_("SC")(C,S_n)\$ \$S_n = (A_0^("sig"),... A_n^("sig"))\$
where
  • \$C\$ is the BEEFY commitment (Definition 50).

  • \$S_n\$ is an array where its exact size matches the number of validators in the current authority set as specified by \$"id"_(bbb V)\$ (Definition 73) in \$C\$. Individual items are of the type Option (Definition 190) which can contain a signature of a validator which signed the same statement (\$R_h\$ in \$C\$) and is active in the current authority set. It’s critical that the signatures are sorted based on their corresponding public key entry in the authority set.

    For example, the signature of the validator at index 3 in the authority set must be placed at index 3 in \$S_n\$. If no signature is available for that validator, then the Option variant None is inserted (Definition 190). This sorting allows clients to map public keys to their corresponding signatures.

Definition 53. Signed Commitment Witness

A signed commitment witness, \$M_("SC")^w\$, is a light version of the signed BEEFY commitment (Definition 52). Instead of containing the entire list of signatures, it only claims which validators signed the statement.

The message is a datastructure of the following format:

\$M_("SC")^w = "Enc"_("SC")(C,V_(0,... n), R_("sig"))\$
where
  • \$C\$ is the BEEFY commitment (Definition 50).

  • \$V_(0,... n)\$ is an array where its exact size matches the number of validators in the current authority set as specified by \$"id"_(bbb V)\$ in \$C\$. Individual items are booleans which indicate whether the validator has signed the statement (true) or not (false). It’s critical that the boolean indicators are sorted based on their corresponding public key entry in the authority set.

    For example, the boolean indicator of the validator at index 3 in the authority set must be placed at index 3 in \$V_n\$. This sorting allows clients to map public keys to their corresponding boolean indicators.

  • \$R_("sig")\$ is the MMR root of the signatures in the original signed BEEFY commitment (Definition 52).

5. Block Production

5.1. Introduction

The Polkadot Host uses the BABE protocol for block production. It is designed based on Ouroboros Praos. BABE execution happens in sequential, non-overlapping phases known as epochs. Each epoch, in turn, is divided into a predefined number of slots. All slots in each epoch are sequentially indexed starting from 0. At the beginning of each epoch, the BABE node needs to run Algorithm 6 to find out in which slots it should produce a block and gossip it to the other block producers. In turn, the block producer node should keep a copy of the block tree and grow it as it receives valid blocks from other block producers. A block producer prunes the tree in parallel by eliminating branches that do not include the most recent finalized blocks (Definition 5).

5.1.1. Block Producer

A block producer, denoted by \$cc P_j\$, is a node running the Polkadot Host which is authorized to keep a transaction queue and which gets a turn in producing blocks.

5.1.2. Block Authoring Session Key Pair

The block authoring session key pair \$(sk_j^s,pk_j^s)\$ is an SR25519 key pair which the block producer \$cc P_j\$ signs with their account key (Definition 177). It is used to sign the produced block as well as to compute its lottery values in Algorithm 6.

Definition 54. Epoch and Slot

A block production epoch, formally referred to as \$cc E\$, is a period with a pre-known starting time and fixed-length during which the set of block producers stays constant. Epochs are indexed sequentially, and we refer to the \$n^(th)\$ epoch since genesis by \$cc E_n\$. Each epoch is divided into equal-length periods known as block production slots, sequentially indexed in each epoch. The index of each slot is called a slot number. The equal length duration of each slot is called the slot duration and indicated by \$cc T\$. Each slot is awarded to a subset of block producers during which they are allowed to generate a block.

Substrate refers to an epoch as "session" in some places, however, epoch should be the preferred and official name for these periods.

Definition 55. Epoch Duration

We refer to the number of slots in epoch \$cc E_n\$ by \$sc_n\$. \$sc_n\$ is set to the duration field in the returned data from the call of the Runtime entry BabeApi_configuration (Section C.11.1) at genesis. For a given block \$B\$, we use the notation \$s_B\$ to refer to the slot during which \$B\$ has been produced. Conversely, for slot \$s\$, \$cc B_s\$ is the set of blocks generated at slot \$s\$.

Definition 56 provides an iterator over the blocks produced during a specific epoch.

Definition 56. Epoch Subchain

By \$"SubChain"(cc E_n\$) for epoch \$cc E_n\$, we refer to the path graph of \$BT\$ containing all the blocks generated during the slots of epoch \$cc E_n\$. When there is more than one block generated at a slot, we choose the one which is also on \$"Longest-Chain"(BT)\$.

Definition 57. Equivocation

A block producer equivocates if they produce more than one block at the same slot. The proof of equivocation consists of the distinct headers, signed by the validator, that include the slot number.

The Polkadot Host must detect equivocations committed by other validators and submit those to the Runtime as described in Section C.11.6.

Definition 58. BABE Consensus Message

\$"CM"_b\$, the consensus message for BABE, is of the following format:

\$"CM"_b = {(1,("Auth"_C, r)),(2,A_i),(3,D):}\$
where:

  • 1 implies next epoch data: The Runtime issues this message on every first block of an epoch. The supplied authority set (Definition 33), \$"Auth"_C\$, and randomness (Definition 71), \$r\$, are used in the next epoch \$cc E_(n+1)\$.

  • 2 implies on disabled: A 32-bit integer, \$A_i\$, indicating the individual authority in the current authority list that should be immediately disabled until the next authority set change. This message's initial intention was to cause an immediate suspension of all authority functionality with the specified authority.

  • 3 implies next epoch descriptor: These messages are only issued on configuration change and in the first block of an epoch. The supplied configuration data are intended to be used from the next epoch onwards. \$D\$ is a varying datatype of the following format:

    \$D = {(1, (c,2_("nd"))):}\$

    where \$c\$ is the probability that a slot will not be empty (Definition 59). It is encoded as a tuple of two unsigned 64-bit integers \$(c_("numerator"),c_("denominator"))\$ which are used to compute the rational \$c = c_("numerator")/c_("denominator")\$.

    \$2_("nd")\$ describes what secondary slot (Definition 61), if any, is to be used. It is encoded as a one-byte varying datatype:

    \$s_"2nd" = { (0,->,"no secondary slot"), (1,->,"plain secondary slot"), (2,->,"secondary slot with VRF output") :}\$

5.2. Block Production Lottery

The babe constant (Definition 59) is initialized at genesis to the value returned by calling BabeApi_configuration (Section C.11.1). For efficiency reasons, it is generally updated by the Runtime through the next config data consensus message in the digest (Definition 11) of the first block of an epoch for the next epoch.

A block producer aiming to produce a block during \$cc E_n\$ should run Algorithm 6 to identify the slots it is awarded. These are the slots during which the block producer is allowed to build a block. The \$sk\$ is the block producer lottery secret key and \$n\$ is the index of the epoch for whose slots the block producer is running the lottery.

In order to ensure consistent block production, BABE uses secondary slots in case no authority won the (primary) block production lottery. Unlike the lottery, secondary slot assignees are known upfront publicly (Definition 61). The Runtime provides information on how or if secondary slots are executed (Section C.11.1), explained further in Definition 61.

Definition 59. BABE Constant

The BABE constant is the probability that a slot will not be empty and used in the winning threshold calculation (Definition 60). It’s expressed as a rational, \$(x, y)\$, where \$x\$ is the numerator and \$y\$ is the denominator.

Definition 60. Winning Threshold

The Winning threshold denoted by \$T_(cc E_n)\$ is the threshold that is used alongside the result of Algorithm 6 to decide if a block producer is the winner of a specific slot. \$T_(cc E_n)\$ is calculated as follows:

\$A_w =sum_(n=1)^(|"Auth"_C(B)|)(w_A in "Auth"_C(B)_n)\$ \$T_(cc E_n) := 1 - (1 - c)^(w_a/A_w)\$

where \$A_w\$ is the total sum of all authority weights in the authority set (Definition 33) for epoch \$cc E_n\$, \$w_a\$ is the weight of the block author and \$c in (0, 1)\$ is the BABE constant (Definition 59).

The numbers should be treated as 64-bit rational numbers.
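As a worked illustration of the formula, the following sketch computes the threshold with floating-point arithmetic for readability; a conforming implementation would use the fixed-point rational representation instead, and all names are illustrative:

```rust
// A minimal sketch, not normative: T = 1 - (1 - c)^(w_a / A_w).
fn winning_threshold(c_num: u64, c_den: u64, own_weight: u64, weights: &[u64]) -> f64 {
    let c = c_num as f64 / c_den as f64;        // BABE constant c in (0, 1)
    let a_w: u64 = weights.iter().sum();        // A_w: total authority weight
    let theta = own_weight as f64 / a_w as f64; // w_a / A_w
    1.0 - (1.0 - c).powf(theta)
}
```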

5.2.1. Primary Block Production Lottery

A block producer aiming to produce a block during \$cc E_n\$ should run the \$"Block-Production-Lottery"\$ algorithm to identify the slots it is awarded. These are the slots during which the block producer is allowed to build a block. The session secret key, \$sk\$, is the block producer lottery secret key and \$n\$ is the index of the epoch for whose slots the block producer is running the lottery.

Algorithm 6 Block-Production-Lottery

Require: sk, n

1: r ← Epoch-Randomness(n)
2: for i := 1 to sc_n do
3:     (π, d) ← VRF(r, i, sk)
4:     A[i] ← (d, π)
5: end for
6: return A

where \$"Epoch-Randomness"\$ is defined in (Definition 71), \$sc_n\$ is defined in Definition 55 , \$"VRF"\$ creates the BABE VRF transcript (Definition 62) and \$e_i\$ is the epoch index, retrieved from the Runtime (Section C.11.1). \$s_k\$ and \$p_k\$ is the secret key respectively the public key of the authority. For any slot \$s\$ in epoch \$n\$ where \$o < T_(cc E_n)\$ (Definition 60), the block producer is required to produce a block.

The secondary slots (Definition 61) run alongside the primary block production lottery and mainly serve as a fallback in case no authority was selected in the primary lottery.
Definition 61. Secondary Slots

Secondary slots work alongside primary slots to ensure consistent block production, as described in Section 5.2. The secondary assignee of a block is determined by calculating a specific value, \$i_d\$, which indicates the index in the authority set (Definition 33). The corresponding authority in that set has the right to author a secondary block. This calculation is done for every slot in the epoch, \$s in sc_n\$ (Definition 55).

\$p larr h("Enc"_"SC"(r,s))\$ \$i_d larr p mod A_l\$
where
  • \$r\$ is the Epoch randomness (Definition 71).

  • \$s\$ is the slot number (Definition 54).

  • \$"Enc"_"SC"(...)\$ encodes its inner value to the corresponding SCALE value.

  • \$h(...)\$ creates a 256-bit Blake2 hash from its inner value.

  • \$A_l\$ is the length of the authority list (Definition 33).

If \$i_d\$ points to the authority itself, the authority must claim the secondary slot by creating a BABE VRF transcript (Definition 62). The resulting values \$o\$ and \$p\$ are then used in the Pre-Digest item (Definition 69). In the case of secondary slots with plain outputs, i.e. a Pre-Digest of type 2, the transcript and therefore the VRF are skipped.
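A hedged sketch of this assignment, assuming the parity-scale-codec, sp-core and primitive-types crates; the function name and parameter types are illustrative:

```rust
use parity_scale_codec::Encode;
use primitive_types::U256;
use sp_core::hashing::blake2_256;

// A minimal sketch, not normative: the secondary-slot assignee index.
fn secondary_slot_author(randomness: [u8; 32], slot: u64, num_authorities: u32) -> u32 {
    let p = blake2_256(&(randomness, slot).encode());                // p ← h(Enc_SC(r, s))
    let id = U256::from_big_endian(&p) % U256::from(num_authorities); // i_d ← p mod A_l
    id.as_u32()
}
```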

Definition 62. BABE VRF Transcript

The BABE block production lottery requires a specific transcript structure (Definition 175). That structure is used by both primary slots (Algorithm 6) and secondary slots (Definition 61).

\$t_1 larr "Transcript"("'BABE'")\$ \$t_2 larr "append"(t_1, "'slot number'", s)\$ \$t_3 larr "append"(t_2, "'current epoch'", e_i)\$ \$t_4 larr "append"(t_3, "'chain randomness'", r)\$ \$t_5 larr "append"(t_4, "'vrf-nm-pk'", p_k)\$ \$t_6 larr "meta-ad"(t_5, "'VRFHash'", "False")\$ \$t_7 larr "meta-ad"(t_6, 64_"le", "True")\$ \$h larr "prf"(t_7, "False")\$ \$o = s_k * h\$ \$p larr "dleq_prove"(t_7, h)\$

The operators are defined in Definition 176, \$"dleq_prove"\$ in Definition 172. The computed outputs, \$o\$ and \$p\$, are included in the block Pre-Digest (Definition 69).

5.3. Slot Number Calculation

It is imperative for the security of the network that each block producer correctly determines the current slot numbers at a given time by regularly estimating the local clock offset in relation to the network (Definition 64).

The calculation described in this section is still to be implemented and deployed: For now, each block producer is required to synchronize its local clock using NTP instead. The current slot \$s\$ is then calculated by \$s = t_"unix"/cc T\$ where \$cc T\$ is defined in Definition 54 and \$t_"unix"\$ is defined in Definition 181. That also entails that slot numbers are currently not reset at the beginning of each epoch.

Polkadot does this synchronization without relying on any external clock source (e.g. NTP or GPS). To stay synchronized, each producer is therefore required to periodically estimate its local clock offset in relation to the rest of the network.

This estimation depends on the two fixed parameters \$k\$ (Definition 65) and \$s_(cq)\$ (Definition 66). These are chosen based on the results of a formal security analysis, currently assuming a \$1 s\$ clock drift per day and targeting a probability lower than \$0.5%\$ for an adversary to break BABE in 3 years, with resistance against a network delay up to \$1/3\$ of the slot time and a BABE constant (Definition 59) of \$c = 0.38\$.

All validators are then required to run Algorithm 8 at the beginning of each sync period (Definition 68) to update their synchronization using all block arrival times of the previous period. The algorithm should only be run once all the blocks in this period have been finalized, even if only probabilistically (Definition 65). The target slot to which to synchronize should be the first slot in the new sync period.

Definition 63. Slot Offset

Let \$s_i\$ and \$s_j\$ be two slots belonging to epochs \$cc E_k\$ and \$cc E_l\$. By Slot-Offset\$(s_i,s_j)\$ we refer to the function whose value is equal to the number of slots between \$s_i\$ and \$s_j\$ (counting \$s_j\$) on the time continuum. As such, we have Slot-Offset\$(s_i, s_i) = 0\$.


Definition 64. Relative Time Synchronization

The relative time synchronization is a tuple of a slot number and a local clock timestamp \$(s_"sync",t_"sync")\$ describing the last point at which the slot numbers have been synchronized with the local clock.

Algorithm 7 Slot-Time

Require: s

1: return t_sync + Slot-Offset(s_sync, s) × T

Algorithm 7. Slot-Time

where \$s\$ is the slot number.

Algorithm 8 Median-Algorithm

Require: 𝔈, s_sync

1: T_s ← {}
2: for B in 𝔈_j do
3:     t_est^B ← T_B + Slot-Offset(s_B, s_sync) × T
4:     T_s ← T_s ∪ t_est^B
5: end for
6: return Median(T_s)

Algorithm 8. Median-Algorithm
where
  • \$fr E\$ is the sync period used for the estimate.

  • \$s_"sync"\$ is the slot time to estimate.

  • \$"Slot-Offset"\$ is defined in Algorithm 7.

  • \$cc T\$ is the slot duration defined in Definition 54.
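A hedged sketch of this estimation; arrival times are in milliseconds, the pairs hold (T_B, s_B), and all names are illustrative:

```rust
// A minimal sketch, not normative: estimate the local time of the target
// slot s_sync as the median of per-block estimates
// t_est = T_B + Slot-Offset(s_B, s_sync) × T (assumes s_sync ≥ s_B).
fn median_slot_time(arrivals: &[(u64, u64)], s_sync: u64, slot_duration_ms: u64) -> u64 {
    let mut estimates: Vec<u64> = arrivals
        .iter()
        .map(|&(t_b, s_b)| t_b + (s_sync - s_b) * slot_duration_ms)
        .collect();
    estimates.sort_unstable();
    estimates[estimates.len() / 2] // Median(T_s)
}
```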

Definition 65. Pruned Best Chain

The pruned best chain \$C^(r^k)\$ is the longest selected chain (Definition 7) with the last \$k\$ blocks pruned. We choose \$k = 140\$. The last (probabilistically) finalized block is the last block in this pruned best chain.

Definition 66. Chain Quality

The chain quality \$s_(cq)\$ represents the number of slots that are used to estimate the local clock offset. Currently, it is set to \$s_(cq) = 3000\$.

The prerequisite for such a calculation is that each producer stores the arrival time of each block (Definition 67) measured by a clock that is otherwise not adjusted by any external protocol.

Definition 67. Block Arrival Time

The block arrival time of block \$B\$ for node \$j\$ formally represented by \$T_B^j\$ is the local time of node \$j\$ when node \$j\$ has received block \$B\$ for the first time. If the node \$j\$ itself is the producer of \$B\$, \$T_B^j\$ is set equal to the time that the block is produced. The index \$j\$ in \$T_B^j\$ notation may be dropped and B’s arrival time is referred to by \$T_B\$ when there is no ambiguity about the underlying node.

Definition 68. Sync Period

A sync period is an interval at which each validator (re-)evaluates its local clock offset. The first sync period \$fr E_1\$ starts just after the genesis block is released. Consequently, each sync period \$fr E_i\$ starts after \$fr E_(i-1)\$. The length of the sync period (Definition 66) is equal to \$s_(cq)\$ and expressed in the number of slots.

Figure 1. An exemplary result of Median Algorithm in first sync epoch with \$s_"cq" = 9\$ and \$k = 1\$.

5.4. Production Algorithm

Throughout each epoch, each block producer should run Algorithm 9 to produce blocks during the slots it has been awarded during that epoch. The produced block needs to carry the Pre-Digest (Definition 69) as well as the block signature (Definition 70) as Pre-Runtime and Seal digest items.

Definition 69. Pre-Digest

The Pre-Digest, or BABE header, \$P\$, is a varying datatype of the following format:

\$P = {(1,->,(a_"id",s,o,p)),(2,->,(a_"id",s)),(3,->,(a_"id",s,o,p)):}\$
where
  • 1 indicates a primary slot with VRF outputs, 2 a secondary slot with plain outputs and 3 a secondary slot with VRF outputs (Section 5.2). Plain outputs are no longer actively used and only exist for backwards compatibility reasons, e.g. to sync old blocks.

  • \$a_"id"\$ is the unsigned 32-bit integer indicating the index of the authority in the authority set (Section 3.3.1) who authored the block.

  • \$s\$ is the slot number (Definition 54).

  • \$o\$ is the VRF output (Algorithm 6, respectively Definition 61).

  • \$p\$ is the VRF proof (Algorithm 6, respectively Definition 61).

The Pre-Digest must be included as a digest item of Pre-Runtime type in the header digest (Definition 11) \$H_d(B)\$.

Algorithm 9 Invoke-Block-Authoring

Require: sk, pk, n, BT

1: A ← Block-Production-Lottery(sk, n)
2: for s ← 1 to sc_n do
3:     Wait-Until(Slot-Time(s))
4:     (d, π) ← A[s]
5:     if d < τ then
6:         C_Best ← Longest-Chain(BT)
7:         B_s ← Build-Block(C_Best)
8:         Add-Digest-Item(B_s, Pre-Runtime, E_id(BABE), H_BABE(B_s))
9:         Add-Digest-Item(B_s, Seal, S_B)
10:        Broadcast-Block(B_s)
11:    end if
12: end for

where \$"BT"\$ is the current block tree, \$"Block-Production-Lottery"\$ is defined in Algorithm 6 and \$"Add-Digest-Item"\$ appends a digest item to the end of the header digest \$H_d(B)\$ (Definition 11).

Definition 70. Block Signature

The Block Signature \$S_B\$ is a signature of the block header hash (Definition 12), defined as

\$m = "Sig"_("SR25519","sk"_j^s)(H_h(B))\$

\$m\$ should be included in \$H_d(B)\$ as the Seal digest item (Definition 11) of value:

\$(t, "id"("BABE"), m)\$

in which, \$t = 5\$ is the seal digest identifier and \$"id"("BABE")\$ is the BABE consensus engine unique identifier (Definition 11). The Seal digest item is referred to as the BABE Seal.
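A hedged sketch of producing this seal with the schnorrkel crate; the signing context b"substrate" is an assumption borrowed from common Substrate practice, not taken from this specification:

```rust
use schnorrkel::{signing_context, Keypair};

// A minimal sketch, not normative: sign the block header hash with the
// SR25519 session key to obtain the 64-byte BABE seal signature.
fn seal_block(session_key: &Keypair, header_hash: &[u8; 32]) -> [u8; 64] {
    let ctx = signing_context(b"substrate");
    session_key.sign(ctx.bytes(header_hash)).to_bytes()
}
```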

5.5. Epoch Randomness

At the beginning of each epoch \$cc E_n\$, the host will receive the randomness seed \$cc R_(cc E_(n+1))\$ (Definition 71) necessary to participate in the block production lottery in the next epoch \$cc E_(n+1)\$ from the Runtime, through the consensus message (Definition 58) in the digest of the first block.

Definition 71. Randomness Seed

For epoch \$cc E\$, there is a 32-byte randomness seed \$cc R_(cc E)\$ computed based on the previous epochs' VRF outputs. For \$cc E_0\$ and \$cc E_1\$, the randomness seed is provided in the genesis state (Section C.11.1). For any further epochs, the randomness is retrieved from the consensus message (Definition 58).

5.6. Verifying Authorship Right

When a Polkadot node receives a produced block, it needs to verify if the block producer was entitled to produce the block in the given slot by running Algorithm 10. Algorithm 11 runs as part of the verification process, when a node is importing a block.

Algorithm 10 Verify-Authorship-Right

Require: Head_s(B)

1: s ← Slot-Number-At-Given-Time(T_B)
2: E_c ← Current-Epoch()
3: (D_1, ..., D_|H_d(B)|) ← H_d(B)
4: D_s ← D_|H_d(B)|
5: H_d(B) ← (D_1, ..., D_|H_d(B)|-1)    // remove the seal from the digest
6: (id, Sig_B) ← Dec_SC(D_s)
7: if id ≠ Seal-Id then
8:     error "Seal missing"
9: end if
10: AuthorID ← AuthorityDirectory^(E_c)[H_BABE(B).SignerIndex]
11: Verify-Signature(AuthorID, H_h(B), Sig_B)
12: if ∃ B' ∈ BT : H_h(B') ≠ H_h(B) and s_B' = s_B and SignerIndex_B' = SignerIndex_B then
13:     error "Block producer is equivocating"
14: end if
15: Verify-Slot-Winner((d_B, π_B), s_B, AuthorID)

where
  • \$"Head"_s(B)\$ is the header of the block that’s being verified.

  • \$T_B\$ is \$B\$’s arrival time (Definition 67).

  • \$H_d(B)\$ is the digest sub-component (Definition 11) of \$"Head"(B)\$ (Definition 10).

  • The Seal \$D_s\$ is the last element in the digest array \$H_d(B)\$ as described in Definition 11.

  • \$"Seal-Id"\$ is the type index showing that a digest item (Definition 11) of varying type (Definition 189) is of type Seal.

  • \$"AuthorityDirectory"^(cc E_c)\$ is the set of Authority ID for block producers of epoch \$cc E_c\$.

    1. \$"AuthorId"\$ is the public session key of the block producer.

  • \$"BT"\$ is the pruned block tree (Definition 5).

  • \$"Verify-Slot-Winner"\$ is defined in Algorithm 11.

Algorithm 11 Verify-Slot-Winner

Require: B

1: E_c ← Current-Epoch()
2: ρ ← Epoch-Randomness(E_c)
3: Verify-VRF(ρ, H_BABE(B).(d_B, π_B), H_BABE(B).s, E_c)
4: if d_B ≥ τ then
5:     error "Block producer is not a winner of the slot"
6: end if

Algorithm 11. Verify-Slot-Winner
where
  1. \$"Epoch-Randomness"\$ is defined in Definition 71.

  2. \$H_"BABE"(B)\$ is the BABE header defined in Definition 69.

  3. \$(o,p)\$ is the block lottery result for block \$B\$ (Algorithm 6), respectively the VRF output (Definition 62).

  4. \$"Verify-VRF"\$ is described in Section A.1.3.

  5. \$T_(cc E_n)\$ is the winning threshold as defined in Definition 60.

5.7. Block Building Process

The block building process is triggered by Algorithm 9 of the consensus engine which in turn runs Algorithm 12.

Algorithm 12 Build-Block

Require: C_Best, s

1: P_B ← Head(C_Best)
2: Head(B) ← (H_p ← H_h(P_B), H_i ← H_i(P_B) + 1, H_r ← ϕ, H_e ← ϕ, H_d ← ϕ)
3: Call-Runtime-Entry(Core_initialize_block, Head(B))
4: I-D ← Call-Runtime-Entry(BlockBuilder_inherent_extrinsics, Inherent-Data)
5: for E in I-D do
6:     Call-Runtime-Entry(BlockBuilder_apply_extrinsics, E)
7: end for
8: while not End-Of-Slot(s) do
9:     E ← Next-Ready-Extrinsic()
10:    R ← Call-Runtime-Entry(BlockBuilder_apply_extrinsics, E)
11:    if Block-Is-Full(R) then
12:        break
13:    end if
14:    if Should-Drop(R) then
15:        Drop(E)
16:    end if
17: end while
18: Head(B) ← Call-Runtime-Entry(BlockBuilder_finalize_block, B)
19: B ← Add-Seal(B)

Algorithm 12. Build-Block
where
  • \$C_"Best"\$ is the chain head at which the block should be constructed ("parent").

  • \$s\$ is the slot number.

  • \$"Head"(B)\$ is defined in Definition 10.

  • \$"Call-Runtime-Entry"\$ is defined in Definition 32.

  • \$"Inherent-Data"\$ is defined in Definition 15.

  • \$"End-Of-Slot"\$ indicates the end of the BABE slot as defined Algorithm 8 respectively Definition 54.

  • \$"Next-Ready-Extrinsic"\$ indicates picking an extrinsic from the extrinsics queue (Definition 14).

  • \$"Block-Is-Full"\$ indicates that the maximum block size is being used.

  • \$"Should-Drop"\$ determines based on the result \$R\$ whether the extrinsic should be dropped or remain in the extrinsics queue and scheduled for the next block. The ApplyExtrinsicResult (Definition 217) describes this behavior in more detail.

  • \$"Drop"\$ indicates removing the extrinsic from the extrinsic queue (Definition 14).

  • \$"Add-Seal"\$ adds the seal to the block (<<>>) before sending it to peers. The seal is removed again before submitting it to the Runtime.

6. Finality

6.1. Introduction

The Polkadot Host uses the GRANDPA finality protocol to finalize blocks. Finality is obtained by consecutive rounds of voting by the validator nodes. Validators execute the GRANDPA finality process in parallel to Block Production as an independent service. In this section, we describe the different functions that the GRANDPA service performs to successfully participate in the block-finalization process.

Definition 72. GRANDPA Voter

A GRANDPA voter, \$v\$, represented by a key pair \$(K_v^("pr"),v_("id"))\$ where \$K_v^("pr")\$ represents an ed25519 private key, is a node running the GRANDPA protocol and broadcasting votes to finalize blocks in a Polkadot Host-based chain. The set of all GRANDPA voters for a given block \$B\$ is indicated by \$bbb V_B\$. In that regard, we have [To do: change function name, only call at genesis, adjust V_B over the sections]

\$bbb V = tt "grandpa_authorities"(B)\$

where \$tt "grandpa_authorities"\$ is a function entrypoint of the Runtime described in Section C.10.1. We refer to \$bbb V_B\$ as \$bbb V\$ when there is no chance of ambiguity.

Analogously we say that a Polkadot node is a non-voter node for block \$B\$, if it does not own any of the key pairs in \$bbb V_B\$.

Definition 73. Authority Set Id

The authority set Id (\$"id"_(bbb V)\$) is an incremental counter which tracks the number of authority list changes that have occurred (Definition 86). Starting with the value of zero at genesis, the Polkadot Host increments this value by one every time a Scheduled Change or a Forced Change occurs. The authority set Id is an unsigned 64-bit integer.

Definition 74. GRANDPA State

The GRANDPA state, \$"GS"\$, is defined as:

\$"GS" := {bbb V, "id"_(bbb V),r}\$
where
  • \$bbb V\$: is the set of voters.

  • \$"id"_(bbb V)\$: is the authority set ID (Definition 73).

  • \$r\$: is the voting round number.

Definition 75. GRANDPA Vote

A GRANDPA vote or simply a vote for block \$B\$ is an ordered pair defined as

\$V(B) := (H_h(B),H_i(B))\$

where \$H_h(B)\$ and \$H_i (B)\$ are the block hash (Definition 12) and the block number (Definition 10).

Definition 76. Voting Rounds

Voters engage in a maximum of two sub-rounds of voting for each round \$r\$. The first sub-round is called pre-vote and the second sub-round is called pre-commit.

By \$V_v^(r,"pv")\$ and \$V_v^(r,"pc")\$ we refer to the vote cast by voter \$v\$ in round \$r\$ (for block \$B\$) during the pre-vote and the pre-commit sub-round respectively.

Voting is done by means of broadcasting voting messages (Section 4.8.6) to the network. Validators inform their peers about the block finalized in round \$r\$ by broadcasting a commit message (Algorithm 14).

Definition 77. Vote Signature

\$"Sign"_(v_i)^(r,"stage")\$ refers to the signature of a voter for a specific message in a round and is formally defined as:

\$"Sign"_(v_i)^(r,"stage") := "Sig"_("ed25519")("msg",r,"id"_(bbb V))\$
where
  • \$"msg"\$: is an byte array containing the message to be signed (Definition 75).

  • \$r\$: is an unsigned 64-bit integer is the round number.

  • \$"id"_(bbb V)\$: is an unsigned 64-bit integer indicating the authority set Id (Definition 33).

Definition 78. Justification

The justification for block \$B\$ in round \$r\$, \$J^(r,"stage")(B)\$, is a vector of pairs of the type:

\$(V(B'),"Sign"_(v_i)^(r,"stage")(B'),v_("id"))\$

in which either

\$B' >= B\$

or \$V_(v_i)^(r,"pc")(B')\$ is an equivocatory vote.

In all cases, \$"Sign"_(v_i)^(r,"stage")(B')\$ is the signature (Definition 77) of voter \$v_("id") in bbb V_B\$ broadcasted during either the pre-vote (stage = pv) or the pre-commit (stage = pc) sub-round of round \$r\$. A valid justification must only contain up to one valid vote from each voter and must not contain more than two equivocatory votes from each voter.

We say \$J^(r,"pc")(B)\$ justifies the finalization of \$B' >= B\$ for a non-voter node \$n\$ if the number of valid signatures in \$J^(r,"pc")(B)\$ for \$B'\$ is greater than \$2/3|bbb V_B|\$.

Note that \$J^(r,"pc")(B)\$ can only be used by a non-voter node to finalize a block. In contrast, a voter node can only be assured of the finality (Definition 90) of block \$B\$ by actively participating in the voting process. That is by invoking Algorithm 14.

The GRANDPA protocol dictates how an honest voter should vote in each sub-round, which is described by Algorithm 14. After defining what constitutes a vote in GRANDPA, we define how GRANDPA counts votes.

Definition 80. Equivocation

Voter \$v\$ equivocates if they broadcast two or more valid votes to blocks during one voting sub-round. In such a situation, we say that \$v\$ is an equivocator and any vote \$V_v^(r,"stage")(B)\$ cast by \$v\$ in that sub-round is an equivocatory vote, and

\$cc E^(r,"stage")\$

represents the set of all equivocator voters in sub-round stage of round \$r\$. When we want to refer to the number of equivocators whose equivocation has been observed by voter \$v\$, we refer to it by:

\$cc E_("obs"(v))^(r,"stage")\$

The Polkadot Host must detect equivocations committed by other validators and submit those to the Runtime as described in Section C.10.2.

A vote \$V_v^(r,"stage") = V(B)\$ is invalid if

  • \$H_h(B)\$ does not correspond to a valid block.

  • \$B\$ is not an (eventual) descendant of a previously finalized block.

  • \$M_v^(r,"stage")\$ does not bear a valid signature.

  • \$"id"_(bbb V)\$ does no match the current \$bbb V\$.

  • \$V_v^(r,"stage")\$ is an equivocatory vote.

For validator \$v\$, the set of observed direct votes for block \$B\$ in round \$r\$, formally denoted by \$"VD"_("obs"(v))^(r,"stage")(B)\$, is equal to the union of:

  • the set of valid votes \$V_(v_i)^(r,"stage")\$ cast in round \$r\$ and received by \$v\$ such that \$V_(v_i)^(r,"stage") = V(B)\$.

We refer to the set of total votes observed by voter \$v\$ in sub-round stage of round \$r\$ by \$V_("obs"(v))^(r,"stage")\$.

The set of all observed votes by \$v\$ in the sub-round stage of round \$r\$ for block \$B\$, \$V_("obs"(v))^(r,"stage")\$ is equal to all of the observed direct votes cast for block \$B\$ and all of the \$B\$’s descendants defined formally as:

\$V_("obs"(v))^(r,"stage")(B) := uuu_(v_i in bbb V, B < B') "VD"_("obs"(v))^(r,"stage")(B')\$

The total number of observed votes for Block \$B\$ in round \$r\$ is defined to be the size of that set plus the total number of equivocator voters:

\$#V_("obs"(v))^(r,"stage")(B) := |V_("obs"(v))^(r,"stage")(B)|+|cc E_("obs"(v))^(r,"stage")|\$

Note that for genesis state we always have \$#V_("obs"(v))^(r,"pv")(B) = |bbb V|\$.

Let \$V_("unobs"(v))^(r,"stage")\$ be the set of voters whose vote in the given stage has not been received. We define the total number of potential votes for Block \$B\$ in round \$r\$ to be:

\$#V_("obs"(v),"pot")^(r,"stage")(B) := |V_("obs"(v))^(r,"stage")(B)|+|V_("unobs"(v))^(r,"stage")|+"Min"(1/3|bbb V|,|bbb V|-|V_("obs"(v))^(r,"stage")(B)|-|V_("unobs"(v))^(r,"stage")|)\$

The current pre-voted block \$B_v^(r,"pv")\$, also known as the GRANDPA GHOST, is the block chosen by Algorithm 17:

\$B_v^(r,"pv") := "GRANDPA-GHOST"(r)\$

Finally, we define when a voter \$v\$ sees a round as completable, that is when they are confident that \$B_v^(r,"pv")\$ is an upper bound for what is going to be finalized in this round.

Definition 85. Completable Round

We say that round \$r\$ is completable if \$|V_("obs"(v))^(r,"pc")| + |cc E_("obs"(v))^(r,"pc")| > 2/3 |bbb V|\$ and for all \$B' > B_v^(r,"pv")\$:

\$|V_("obs"(v))^(r,"pc")| - |cc E_("obs"(v))^(r,"pc")| - |V_("obs"(v))^(r,"pc")(B')| > 2/3|bbb V|\$

Note that in practice we only need to check the inequality for those \$B' > B_v^(r,"pv")\$ where \$|V_("obs"(v))^(r,"pc")(B')| > 0\$.

\$"CM"_g\$, the consensus message for GRANDPA, is of the following format:

\$"CM"_g = {(1,("Auth"_C,N_("delay"))),(2,(m,"Auth"_C,N_("delay"))),(3,"A"_i),(4,N_("delay")),(5,N_("delay")):}\$
where
  • \$N_("delay")\$ is an unsigned 32-bit integer indicating how deep in the chain the announcing block must be before the change is applied.

  • 1 implies scheduled change: Schedule an authority set change after the given delay of \$N_("delay") := |"SubChain"(B,B')|\$ where \$B'\$ is the block where the change is applied. The earliest digest of this type in a single block will be respected, unless a force change is present, in which case the force change takes precedence.

  • 2 implies forced change: Schedule a forced authority set change after the given delay of \$N_("delay") := |"SubChain"(B,m + B')|\$ where \$B'\$ is the block where the change is applied. The earliest digest of this type in a block will be respected. Forced changes are explained further in Section 6.5.

  • 3 implies on disabled: An index to the individual authority in the current authority list (Definition 33) that should be immediately disabled until the next authority set change. When an authority gets disabled, the node should stop performing any authority functionality from that authority, including authoring blocks and casting GRANDPA votes for finalization. Similarly, other nodes should ignore all messages from the indicated authority which pertain to their authority role.

  • 4 implies pause: A signal to pause the current authority set after the given delay of \$N_("delay") := |"SubChain"(B,B')|\$ where \$B'\$ is the block where the change is applied. Once applied, the authorities should stop voting.

  • 5 implies resume: A signal to resume the current authority set after the given delay of \$N_("delay") := |"SubChain"(B,B')|\$ where \$B'\$ is the block where the change is applied. Once applied, the authorities should resume voting.
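For illustration, the five variants of \$"CM"_g\$ could be modeled as a Rust enum along the following lines; the type aliases are placeholders and the layout is a sketch, not a normative encoding.

```rust
/// Placeholder types: a 32-byte authority public key and an (id, weight) list.
type AuthorityId = [u8; 32];
type AuthorityList = Vec<(AuthorityId, u64)>;

/// Sketch of the GRANDPA consensus message variants described above.
enum GrandpaConsensusMessage {
    /// 1: scheduled change, applied after `delay` blocks.
    ScheduledChange { next_authorities: AuthorityList, delay: u32 },
    /// 2: forced change, with `m` marking the starting block of the delay.
    ForcedChange { m: u32, next_authorities: AuthorityList, delay: u32 },
    /// 3: disable the authority at this index in the current set.
    OnDisabled(u64),
    /// 4: pause the current authority set after `delay` blocks.
    Pause(u32),
    /// 5: resume the current authority set after `delay` blocks.
    Resume(u32),
}

fn main() {
    let _pause = GrandpaConsensusMessage::Pause(5);
}
```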

The BEEFY protocol is still under construction. The following part will be updated in the future and certain information will be clarified.

\$"CM"_y\$, the consensus message for BEEFY (Section 6.7), is of the following format:

\$"CM"_y = {(1,(V_B,V_i)),(2,A_i),(3,R):}\$
where
  • 1 implies that the remote authorities have changed: \$V_B\$ is the array of the new BEEFY authorities’ public keys and \$V_i\$ is the identifier of the remote validator set.

  • 2 implies on disabled: an index to the individual authority in \$V_B\$ that should be immediately disabled until the next authority change.

  • 3 implies MMR root: a 32-byte array containing the MMR root.

6.2. Initiating the GRANDPA State

In order to participate coherently in the voting process, a validator must initiate its state and sync it with other active validators. In particular, considering that voting happens in distinct rounds where each round is assigned a unique sequential round number \$r_v\$, a joining validator needs to determine and set its round counter \$r\$ equal to the voting round \$r_n\$ currently underway in the network. The mandated initialization procedure for the GRANDPA protocol for a joining validator is described in detail in Algorithm 13.

The process of joining a new voter set is different from the one of rejoining the current voter set after a network disconnect. The details of this distinction are described further in this section.

6.2.1. Voter Set Changes

A GRANDPA voter node which is initiating GRANDPA protocol as part of joining a new authority set is required to execute Algorithm 13. The algorithm mandates the initialization procedure for GRANDPA protocol.

The GRANDPA round number resets to 0 for every authority set change.

Voter set changes are signaled by the Runtime via a consensus engine message (Section 3.3.2). When authorities process such messages, they must not vote on any block with a higher number than the block at which the change is supposed to happen. The new authority set should reinitiate the GRANDPA protocol by executing Algorithm 13.

Algorithm 13 Initiate-Grandpa

Input: \$r_("last"), B_("last")\$

1: Last-Finalized-Block ← \$B_("last")\$
2: Best-Final-Candidate(0) ← \$B_("last")\$
3: GRANDPA-GHOST(0) ← \$B_("last")\$
4: Last-Completed-Round ← 0
5: \$r_n\$ ← 1
6: Play-Grandpa-Round(\$r_n\$)

Algorithm 13. Initiate-Grandpa

where \$B_("last")\$ is the last block which has been finalized on the chain (Definition 90). \$r_("last")\$ is equal to the latest round the voter has observed that other voters are voting on. The voter obtains this information through various gossiped messages including those mentioned in Definition 90. \$r_("last")\$ is set to 0 if the GRANDPA node is initiating the GRANDPA voting process as a part of a new authority set. This is because the GRANDPA round number resets to 0 for every authority set change.

6.3. Rejoining the Same Voter Set

When a voter node rejoins the network after a disconnect from the voter set, and provided that there has been no change to the voter set during the disconnect, the node must continue performing the GRANDPA protocol in the same state it was in before getting disconnected from the network, ignoring any possible progress in GRANDPA finalization. Following reconnection, the node eventually gets updated to the current GRANDPA round and synchronizes its state with the rest of the voting set through the catch-up process (Section 6.6.1).

6.4. Voting Process in Round \$r\$

For each round \$r\$, an honest voter \$v\$ must participate in the voting process by following Algorithm 14.

Algorithm 14 Play-Grandpa-Round

Require: \$r\$

1: \$t_(r,v)\$ ← Current local time
2: primary ← Derive-Primary(\$r\$)
3: if \$v\$ = primary then
4: Broadcast(\$M_v^(r-1,"Fin")\$(Best-Final-Candidate(\$r-1\$)))
5: if Best-Final-Candidate(\$r-1\$) ≥ Last-Finalized-Block then
6: Broadcast(\$M_v^(r-1,"Prim")\$(Best-Final-Candidate(\$r-1\$)))
7: end if
8: end if
9: Receive-Messages(until Time ≥ \$t_(r,v) + 2 xx T\$ or \$r\$ is completable)
10: \$L\$ ← Best-Final-Candidate(\$r-1\$)
11: \$N\$ ← Best-PreVote-Candidate(\$r\$)
12: Broadcast(\$M_v^(r,"pv")(N)\$)
13: Receive-Messages(until \$B_v^(r,"pv") >= L\$ and (Time ≥ \$t_(r,v) + 4 xx T\$ or \$r\$ is completable))
14: Broadcast(\$M_v^(r,"pc")(B_v^(r,"pv"))\$)
15: repeat
16: Receive-Messages()
17: Attempt-To-Finalize-At-Round(\$r\$)
18: until \$r\$ is completable and Finalizable(\$r\$) and Last-Finalized-Block ≥ Best-Final-Candidate(\$r-1\$)
19: Play-Grandpa-Round(\$r+1\$)
20: repeat
21: Receive-Messages()
22: Attempt-To-Finalize-At-Round(\$r\$)
23: until Last-Finalized-Block ≥ Best-Final-Candidate(\$r\$)
24: if Last-Completed-Round < \$r\$ then
25: Last-Completed-Round ← \$r\$
26: end if

Algorithm 14. Play-Grandpa-Round
where
  • \$T\$ is sampled from a log-normal distribution whose mean and standard deviation are equal to the average network delay for a message to be sent and received from one validator to another.

  • \$"Derive-Primary"\$ is described in Algorithm 15.

  • The condition of completability is defined in Definition 85.

  • \$"Best-Final-Candidate"\$ function is explained in Algorithm 16.

  • \$"Attempt-To-Finalize-At-Round"(r)\$ is described in Algorithm 19.

  • \$"Finalizable"\$ is defined in Algorithm 20.

Algorithm 15 Derive-Primary

Input: \$r\$

1: return \$r mod |bbb V|\$

Algorithm 15. Derive-Primary

where \$r\$ is the GRANDPA round whose primary is to be determined.
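Algorithm 15 amounts to a single modular reduction; a one-line Rust equivalent (with hypothetical names) is:

```rust
/// Primary voter for round `r`: the validator at index r mod |V|.
fn derive_primary(round: u64, n_validators: u64) -> u64 {
    round % n_validators
}

fn main() {
    assert_eq!(derive_primary(7, 5), 2); // round 7 with 5 voters -> index 2
}
```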

Algorithm 16 Best-Final-Candidate

Input: \$r\$

1: \$B_v^(r,"pv")\$ ← GRANDPA-GHOST(\$r\$)
2: if \$r = 0\$ then
3: return \$B_v^(r,"pv")\$
4: else
5: \$cc C larr {B' | B' <= B_v^(r,"pv"), #V_("obs"(v),"pot")^(r,"pc")(B') > 2/3|bbb V|}\$
6: if \$cc C = O/\$ then
7: return \$B_v^(r,"pv")\$
8: else
9: return \$E in cc C : H_n(E) = max(H_n(B') | B' in cc C)\$
10: end if
11: end if

Algorithm 16. Best-Final-Candidate

where \$#V_("obv"(v),pot)^(r,pc)\$ is defined in Definition 83.

Algorithm 17 GRANDPA-GHOST

Input: \$r\$

1: if \$r = 0\$ then
2: \$G larr B_("last")\$
3: else
4: \$L\$ ← Best-Final-Candidate(\$r-1\$)
5: \$cc G = {B > L | #V_("obs"(v))^(r,"pv")(B) >= 2/3|bbb V|}\$
6: if \$cc G = O/\$ then
7: \$G larr L\$
8: else
9: \$G in cc G : H_n(G) = max(H_n(B) | B in cc G)\$
10: end if
11: end if
12: return \$G\$

Algorithm 17. GRANDPA-GHOST
where
  • \$B_("last")\$ is the last block which has been finalized on the chain (Definition 90).

  • \$#V_("obs"(v))^(r,pv)(B)\$ is defined in Definition 82.

Algorithm 18 Best-PreVote-Candidate

Input: \$r\$

1: \$B_v^(r,"pv")\$ ← GRANDPA-GHOST(\$r\$)
2: if Received(\$M_(v_("primary"))^(r,"prim")(B)\$) and \$B_v^(r,"pv") >= B > L\$ then
3: \$N larr B\$
4: else
5: \$N larr B_v^(r,"pv")\$
6: end if
7: return \$N\$

Algorithm 18. Best-PreVote-Candidate

Algorithm 19 Attempt-To-Finalize-At-Round

Require: \$r\$

1: \$L\$ ← Last-Finalized-Block
2: \$E\$ ← Best-Final-Candidate(\$r\$)
3: if \$E >= L\$ and \$#V_("obs"(v))^(r,"pc")(E) > 2/3|bbb V|\$ then
4: Last-Finalized-Block ← \$E\$
5: if \$M_v^(r,"Fin")(E) !in\$ Received-Messages then
6: Broadcast(\$M_v^(r,"Fin")(E)\$)
7: return
8: end if
9: end if

Algorithm 19. Attempt-To-Finalize-At-Round

Algorithm 20 Finalizable

Require: \$r\$

1: if \$r\$ is not Completable then
2: return False
3: end if
4: \$G\$ ← GRANDPA-GHOST(\$J^(r,"pv")(B)\$)
5: if \$G = O/\$ then
6: return False
7: end if
8: \$E_r\$ ← Best-Final-Candidate(\$r\$)
9: if \$E_r != O/\$ and Best-Final-Candidate(\$r-1\$) \$<= E_r <= G\$ then
10: return True
11: else
12: return False
13: end if

Algorithm 20. Finalizable

where the condition for completability is defined in Definition 85.

Note that we might not always succeed in finalizing our best final candidate due to the possibility of equivocation. We might not even finalize anything in a round (although Algorithm 14 prevents us from moving to round \$r+1\$ before finalizing the best final candidate of round \$r-1\$). The example in Definition 88 serves to demonstrate a situation where the best final candidate of a round cannot be finalized during its own round:

Definition 88. Unfinalized Candidate

Let us assume that we have 100 voters and there are two blocks in the chain (\$B_1 < B_2\$). At round 1, we get 67 pre-votes for \$B_2\$ and at least one pre-vote for \$B_1\$ which means that \$"GRANDPA-GHOST"(1) = B_2\$.

Subsequently, potentially honest voters who could claim not to have seen all the pre-votes for \$B_2\$ but to have received the pre-votes for \$B_1\$ would pre-commit to \$B_1\$. In this way, we receive 66 pre-commits for \$B_1\$ and 1 pre-commit for \$B_2\$. Hence, we finalize \$B_1\$ since we have a threshold number of pre-commits (67 votes) for \$B_1\$.

At this point, though, we have \$tt "Best-Final-Candidate"(r) = B_2\$ as \$#V_("obs"(v),"pot")^(r,"pc")(B_2) = 67\$ and \$B_2 > B_1\$.

However, at this point, the round is already completable as we know that \$tt "GRANDPA-GHOST"(1) = B_2\$ is an upper limit on what we can finalize and nothing greater than \$B_2\$ can be finalized at \$r = 1\$. Therefore, the exit condition of the first loop of Algorithm 14 is satisfied and we must proceed to round 2.

Nonetheless, we must continue to attempt to finalize round 1 in the background as the condition of Algorithm 19 has not been fulfilled.

This prevents us from proceeding to round 3 until either:

  • We finalize \$B_2\$ in round 2, or

  • We receive an extra pre-commit vote for \$B_1\$ in round 1. This will make it impossible to finalize \$B_2\$ in round 1, no matter for whom the remaining pre-commits are cast (even considering the possibility of 1/3 of the voters equivocating), and therefore we have \$tt "Best-Final-Candidate"(r) = B_1\$.

Both scenarios satisfy the condition \$tt "Last-Finalized-Block" >= tt "Best-Final-Candidate"(r - 1)\$ and hence unblock Algorithm 14, albeit in different ways: the former by increasing \$tt "Last-Finalized-Block"\$ and the latter by decreasing \$tt "Best-Final-Candidate"(r - 1)\$.

6.5. Forced Authority Set Changes

In case of an emergency where the Polkadot network is unable to finalize blocks, such as in the event of a mass validator outage, the Polkadot governance mechanism must enact a forced change, which the Host must handle in a specific manner. Given that in such a case finality cannot be relied upon, the Host must detect the forced change (Definition 86) in a (valid) block and apply it to all forks.

The parameter \$m\$ in \$"CM"_g\$, which is specified by the governance mechanism, defines the starting block at which \$N_("delay")\$ is applied. This provides some degree of probabilistic consensus to the network, with the assumption that the forced change was received by most participants and that finality can be continued.

[Figure: Forks #A and #B branch from Block #100. A scheduled change digest (Authority Set #A respectively #B, delay 5) announced in Block #101 of each fork is applied only on that fork, at Block #106.]

Figure 2. Applying a scheduled change
[Figure: Forks #A and #B branch from Block #100. A pending scheduled change (Authority Set #A, delay 5) in Block #101'A is ignored, while a forced change (Authority Set #B, starting at m, delay 5) in Block #101'B is applied cross-fork on both forks at Block #(m+5).]

Figure 3. Applying a forced change

6.6. Block Finalization

Definition 89. Justified Block Header

The Justified Block Header is provided by the consensus engine and presented to the Polkadot Host, for the block to be appended to the blockchain. It contains the following parts:

  • block_header the complete block header (Definition 10) and denoted by \$"Head"(B)\$.

  • justification: as defined by the consensus specification, indicated by \$"Just"(B)\$ (Definition 78).

  • authority Ids: This is the list of the Ids of authorities, which have voted for the block to be stored and is formally referred to as \$A(B)\$. An authority Id is 256-bit.

Definition 90. Finalized

A Polkadot relay chain node \$n\$ should consider block \$B\$ as finalized if any of the following criteria hold for \$B' >= B\$:

  • \$V_("obs"(n))^(r,"pc")(B') > 2/3|bbb V_(B')|\$.

  • It receives a \$M_v^(r,"Fin")(B')\$ message in which \$J^r(B)\$ justifies the finalization (Definition 78).

  • It receives a block data message for \$B'\$ with \$"Just"(B')\$ (Definition 89) which justifies the finalization.

for:

  • Any round \$r\$ if the node \$n\$ is not a GRANDPA voter.

  • Only for round \$r\$ for which the node \$n\$ has invoked Algorithm 14, and round \$r+1\$, if \$n\$ is a GRANDPA voter and has already caught up to its peers according to the process described in Section 6.6.1.

Note that all Polkadot relay chain nodes are supposed to process GRANDPA commit messages regardless of their GRANDPA voter status.

6.6.1. Catching up

When a Polkadot node (re)joins the network, it requests the history of state transitions in the form of blocks, which it is missing.

Nonetheless, the process is different for a GRANDPA voter node. When a voter node joins the network, it needs to gather the justifications (Definition 78) of the rounds it has missed. Through this process, it can safely join the voting process of the current round, on which the voting is taking place.

6.6.1.1. Sending the catch-up requests

When a Polkadot voter node has the same authority list as a peer voter node who is reporting a higher number for the finalized round field, it should send a catch-up request message (Definition 48) to the reporting peer. This will allow the node to catch up to the more advanced finalized round, provided that the following criteria hold:

  • The peer node is a GRANDPA voter, and:

  • The last known finalized round for the Polkadot node is at least 2 rounds behind the finalized round for the peer.
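A compact way to express this trigger condition (a sketch with hypothetical parameters, not an implementation) is:

```rust
/// A node sends a catch-up request when it shares the authority set with a
/// peer voter whose reported finalized round is at least 2 rounds ahead.
fn should_send_catchup_request(
    our_set_id: u64,
    peer_set_id: u64,
    peer_is_voter: bool,
    our_round: u64,
    peer_round: u64,
) -> bool {
    our_set_id == peer_set_id && peer_is_voter && peer_round >= our_round + 2
}

fn main() {
    assert!(should_send_catchup_request(4, 4, true, 10, 12));
    assert!(!should_send_catchup_request(4, 4, true, 10, 11));
}
```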

6.6.1.2. Processing the catch-up requests

Only GRANDPA voter nodes are required to respond to catch-up requests. Additionally, it is only GRANDPA voters who are supposed to send catch-up requests. As such, GRANDPA voters can safely ignore catch-up requests from non-voter nodes. When a GRANDPA voter node receives a catch-up request message, it needs to execute Algorithm 21. Note: a voter node should not respond to catch-up requests for rounds that are actively being voted on; those are the rounds for which Algorithm 14 has not concluded.

Algorithm 21 Process-Catchup-Request

Input: \$M_(i,v)^("Cat"-q)("id"_(bbb V),r)\$

1: if \$M_(i,v)^("Cat"-q)("id"_(bbb V),r)."id"_(bbb V) != "id"_(bbb V)\$ then
2: error "Catching up on different set"
3: end if
4: if \$i !in bbb P\$ then
5: error "Requesting catching up from a non-peer"
6: end if
7: if \$r >\$ Last-Completed-Round then
8: error "Catching up on a round in the future"
9: end if
10: Send(\$i, M_(v,i)^("Cat"-s)("id"_(bbb V),r)\$)

Algorithm 21. Process-Catchup-Request

where
  • \$M_(i,v)^("Cat"-q)("id"_(bbb V),r)\$ is the catch-up message received from peer \$i\$ (Definition 48).

  • \$"id"_(bbb V)\$ is the voter set id with which the serving node is operating

  • \$r\$ is the round number for which the catch-up is requested.

  • \$bbb P\$ is the set of immediate peers of node \$v\$.

  • \$tt "Last-Completed-Round"\$ is initiated in Algorithm 13 and gets updated by Algorithm 14.

  • \$M_(v,i)^("Cat"-s)("id"_(bbb V),r)\$ is the catch-up response (Definition 49).

6.6.1.3. Processing catch-up responses

A catch-up response message contains critical information for the requester node to update its view on the active rounds which are being voted on by GRANDPA voters. As such, the requester node should verify the content of the catch-up response message and subsequently update its view of the state of the finality of the relay chain according to Algorithm 22.

Algorithm 22 Process-Catchup-Response

Input: \$M_(v,i)^("Cat"-s)("id"_(bbb V),r)\$

1: \$"id"_(bbb V), r, J^(r,"pv")(B), J^(r,"pc")(B), H_h(B'), H_i(B')\$ ← \$"Dec"_("SC")(M_(v,i)^("Cat"-s)("id"_(bbb V),r))\$
2: if \$M_(v,i)^("Cat"-s)("id"_(bbb V),r)."id"_(bbb V) != "id"_(bbb V)\$ then
3: error "Catching up on different set"
4: end if
5: if \$r <=\$ Leading-Round then
6: error "Catching up into the past"
7: end if
8: if \$J^(r,"pv")(B)\$ is not valid then
9: error "Invalid pre-vote justification"
10: end if
11: if \$J^(r,"pc")(B)\$ is not valid then
12: error "Invalid pre-commit justification"
13: end if
14: \$G\$ ← GRANDPA-GHOST(\$J^(r,"pv")(B)\$)
15: if \$G = O/\$ then
16: error "GHOST-less Catch-up"
17: end if
18: if \$r\$ is not completable then
19: error "Catch-up round is not completable"
20: end if
21: if \$J^(r,"pc")(B)\$ does not justify \$B'\$ finalization then
22: error "Unjustified Catch-up target finalization"
23: end if
24: Last-Completed-Round ← \$r\$
25: if \$i in bbb V\$ then
26: Play-Grandpa-Round(\$r+1\$)
27: end if

Algorithm 22. Process-Catchup-Response

where \$M_(v,i)^("Cat"-s)("id"_(bbb V),r)\$ is the catch-up response received from node \$v\$ (Definition 49).

6.7. Bridge design (BEEFY)

The BEEFY protocol is currently in early development and subject to change. The specification has not been completed yet.

BEEFY (Bridge Efficiency Enabling Finality Yielder) is a secondary protocol to GRANDPA to support efficient bridging between the Polkadot network (relay chain) and remote, segregated blockchains, such as Ethereum, which were not built with Polkadot interoperability in mind. The protocol allows participants of the remote network to verify finality proofs created by the Polkadot relay chain validators. In other words: clients in the Ethereum network should be able to verify that the Polkadot network is at a specific state.

Storing all the information necessary to verify the state of the remote chain, such as the block headers, is too expensive. BEEFY stores the information in a space-efficient way and clients can request additional information over the protocol.

6.7.1. Preliminaries

Definition 91. Merkle Mountain Ranges

Merkle Mountain Ranges, MMR, are used as an efficient way to send block headers and signatures to light clients.

MMRs have not been defined yet.
Definition 92. Statement

The statement is the same piece of information which every relay chain validator is voting on. Namely, the MMR root of all the block header hashes leading up to the latest, finalized block.

Definition 93. Witness Data

Witness data contains the statement (Definition 92), an array indicating which validators of the Polkadot network voted for the statement (but not the signatures themselves) and an MMR root of the signatures. The indicators of which validators voted for the statement are just claims and provide no proofs. The network message is defined in Definition 53 and the relayer saves it on the chain of the remote network.

Definition 94. Light Client

A light client is an abstract entity in a remote network such as Ethereum. It can be a node or a smart contract with the intent of requesting finality proofs from the Polkadot network. A light client reads the witness data (Definition 93) from the chain, then requests the signatures directly from the relayer in order to verify those.

The light client is expected to know who the validators are and has access to their public keys.

Definition 95. Relayer

A relayer (or "prover") is an abstract entity which takes finality proofs from the Polkadot network and makes those available to the light clients. Inherently, the relayer tries to convince the light clients that the finality proofs have been voted for by the Polkadot relay chain validators. The relayer operates offchain and can for example be a node or a collection of nodes.

6.7.2. Voting on Statements

The Polkadot Host signs a statement (Definition 92) and gossips it as part of a vote (Definition 51) to its peers on every new, finalized block. The Polkadot Host uses ECDSA for signing the statement, since Ethereum has better compatibility for it compared to SR25519 or ED25519.

6.7.3. Committing Witnesses

The relayer (Definition 95) participates in the Polkadot network by collecting the gossiped votes (Definition 51). Those votes are converted into the witness data structure (Definition 93). The relayer saves the data on the chain of the remote network. How frequently witnesses are saved on remote networks is undefined.

6.7.4. Requesting Signed Commitments

A light client (Definition 94) fetches the witness data (Definition 93) from the chain. Once the light client knows which validators apparently voted for the specified statement, it needs to request the signatures from the relayer to verify whether the claims are actually true. This is achieved by requesting signed commitments (Definition 52).

How those signed commitments are requested by the light client and delivered by the relayer varies among networks or implementations. On Ethereum, for example, the light client can request the signed commitments in form of a transaction, which results in a response in form of a transaction.

7. Light Clients

7.1. Requirements for Light Clients

We list the requirements of a light client categorized along the three dimensions of functionality, efficiency, and security.

  • Functional Requirements:

    1. Update state (Section 2.4) to reflect the latest view of the blockchain via synchronization with full nodes.

    2. (Optional) Verify validity of runtime transitions (Section 2.6).

    3. Make queries for data at the latest block height or across a range of blocks.

    4. Append extrinsics (Section 2.3) to the blockchain via full nodes.

  • Efficiency Requirements:

    1. Efficient bootstrapping and syncing: initialization and update functions of the state have tractable computation and communication complexity and grow at most linearly with the chain size. Generally, the complexity is proportional to the number of GRANDPA validator set changes.

    2. Querying operations happen by requesting the key-value pairs from a full node.

    3. Further, verifying the validity of responses by the full node is logarithmic in the size of the state.

  • Security Requirements:

    1. Secure bootstrapping and synchronizing: the probability that an adversarial full node convinces a light client of a forged blockchain state is negligible.

    2. Secure querying: the probability that an adversary convinces a light client to accept a forged account state is negligible.

    3. Assure that the submitted extrinsics are appended in a successor block, or inform the user in case of failure.

  • Polkadot Specific Requirements:

    1. The client MUST be able to connect to a relay chain using chain state.

    2. The client MUST be able to retrieve checkpoint state from a trusted source to speed up initialization.

    3. The client MUST be able to subscribe/unsubscribe to/from any polkadot-spec-conformant relay chain (Polkadot, Westend, Kusama).

    4. The client MUST be able to subscribe/unsubscribe to/from parachains that do not use custom protocols or cryptography methods other than those that Polkadot, Westend and Kusama use.

    5. The client MUST support the following RPC methods: rpc_methods, chainHead_unstable_follow, chainHead_unstable_unfollow, chainHead_unstable_unpin, chainHead_unstable_storage, chainHead_unstable_call, chainHead_unstable_stopCall, transaction_unstable_submitAndWatch, and transaction_unstable_unwatch.

    6. The client MUST support the @substrate/connect connection extension protocol: ToApplicationError, ToApplicationChainReady, ToApplicationRpc, ToExtensionAddChain, ToExtensionAddWellKnownChain, ToExtensionRpc, ToExtensionRemoveChain.

7.2. Warp Sync for Light Clients

Warp sync (Section 4.8.4) only downloads the block headers where authority set changes occurred, the so-called fragments (Definition 41), and verifies the GRANDPA justifications (Definition 78). This protocol allows nodes to arrive at the desired state much faster than fast sync. Warp sync is primarily designed for light clients. Although warp sync could be used by full nodes, the sync process may lack information to cater to the complete functionality set of full nodes.

For light clients, it is too expensive to download the state (approx. 550MB) to respond to queries. Rather, the queries are submitted to the Full node and only the response of the full node is validated using the hash of the state root. Requests for warp sync are performed using the /dot/sync/warp Request-Response substream, the corresponding network messages are detailed in Section 4.7.

Light clients base their trust in the provided snapshots and in the ability to slash GRANDPA votes for equivocation for the period they are syncing via warp sync. Full nodes and above, in contrast, verify each block individually.

In theory, the warp sync process takes the Genesis Block as input and outputs the hash of the state trie root at the latest finalized block. This root hash acts as proof to further validate the responses to queries by the full node. The warp sync works by starting from a trusted specified block (e.g. from a snapshot) and verifying the block headers only at the authority set changes.

Eventually, the light client verifies the finality of the block returned by a full node to ensure that the block is indeed the latest finalized block. This entails two things: (1) checking the authenticity of the GRANDPA justification messages from Genesis to the last finalized block, and (2) checking the timestamp of the last finalized block to ensure that no other blocks might have been finalized at a later timestamp.

Long-Range Attack Vulnerabilities: Warp syncing is particularly vulnerable to what are called long-range attacks. The authorities allowed to finalize blocks can generate multiple proofs of finality for multiple different blocks of the same height, hence they can finalize more than one chain at a time. It is possible for two-thirds of the validators that were active at a certain past block \$N\$ to collude and decide to finalize a different block \$N'\$, even when \$N\$ was finalized for the first time several weeks or months in the past. When a client then warp syncs, it can be tricked into considering this alternative block \$N'\$ as the finalized one. In practice, to mitigate long-range attacks, the starting point of warp syncing should not be too far in the past. How far exactly depends on the logic of the chain's Runtime. For example, in Polkadot, the starting block for the sync should be at most 28 days old, to be within the purview of the slashing period for misbehaving nodes. Hence, even though in theory warp sync can start from the Genesis Block, this is not advisable in practice.
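A trivial guard implementing the 28-day rule above might look like this in Rust; the constant and names are illustrative, and real implementations derive the bound from the chain's Runtime logic.

```rust
/// Polkadot's slashing period, per the mitigation described above.
const MAX_SNAPSHOT_AGE_SECS: u64 = 28 * 24 * 60 * 60;

/// Accept a warp-sync starting point only if it is recent enough that
/// equivocations within the synced range are still slashable.
fn starting_point_acceptable(snapshot_unix_time: u64, now_unix_time: u64) -> bool {
    now_unix_time.saturating_sub(snapshot_unix_time) <= MAX_SNAPSHOT_AGE_SECS
}

fn main() {
    let now = 1_700_000_000;
    assert!(starting_point_acceptable(now - 86_400, now));       // 1 day old
    assert!(!starting_point_acceptable(now - 40 * 86_400, now)); // 40 days old
}
```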

We outline the warp sync process, abstracting out details of verifying the finality and how the full node to sync with is selected.

Algorithm 23 Warp-Sync-Light-Clients

Input: BlockHeader startBlock, the initial block to start the sync. May not be the Genesis Block.

Output: CommitmentRootHash root, the state trie root hash of the latest finalized block.

1: fullNode ← SelectFullNode()
2: latestBlockHeader, grandpaJustifications ← SyncWithNode(fullNode)
3: isVerified ← verifyAuthoritySetChange(grandpaJustifications) ∧ verifyFinality(latestBlockHeader)
4: if isVerified then
5: return Some(getCommitmentRootHash(latestBlockHeader))
6: end if
7: throw ERROR

Algorithm 23. Warp-Sync-Light-Clients

Abstraction of warp sync and verification of the latest block’s finality. SelectFullNode: determines the full node that the light client syncs with. SyncWithNode: returns the header of the latest finalized block and a list of GRANDPA justifications provided by the full node. verifyAuthoritySetChange: verification algorithm which checks the authenticity of the headers only at the end of each era where the authority set changes, iterating until the latest era is reached. verifyFinality: verifies the finality of the latest block using the GRANDPA justification messages.

The warp syncing process is closely coupled with the state querying procedure used by the light client. We outline the process of querying the state by a light client and validating the response.

Algorithm 24 Querying-State-Light-Clients

Input: Query \$q\$, BlockHeight \$h\$, CommitmentRootHash \$"root"\$

Output: Maybe Result \$"res"\$

1: (\$"res"\$, \$pi\$) ← QueryFullNode(\$q\$, \$h\$)
2: if \$"validityCheck"_("root")("res", pi)\$ then
3: return Some(\$"res"\$)
4: end if
5: throw ERROR

Algorithm 24. Querying-State-Light-Clients

Querying state algorithm. QueryFullNode: returns the response to the query \$q\$ at block height \$h\$, requested from the full node. \$"validityCheck"_("root")\$: predicate that checks the validity of the response \$"res"\$ and the associated Merkle proof \$pi\$ by matching them against the commitment root hash \$"root"\$ obtained as a result of warp sync.

7.3. Runtime Environment for Light Clients

Technically, though a runtime execution environment is not necessary to build a light client, most clients require interacting with the Runtime and the state of the blockchain for integrity checks at the minimum. One can imagine application scenarios like an on-chain light client which only listens to the latest state without ever adding extrinsics. Current implementations of light nodes (e.g. Smoldot) use wasmtime as their runtime environment to drastically simplify the code. The performance of wasmtime is satisfying enough to not require a native runtime. The details of the Runtime API that the environment needs to support can be found in Appendix C.

7.4. Light Client Messages

Light clients are applications that fetch the data they require from a Polkadot node, together with an associated proof to validate the data. This makes it possible to interact with the Polkadot network without having to run a full node or having to trust the remote peers. The light client messages make this functionality possible.

All light client messages are protobuf encoded and are sent over the /dot/light/2 substream.

7.4.1. Request

A message containing all possible request messages. All requests are sent as part of this message.

Type              Id   Description
oneof (request)        The request type

Where the request can be one of the following fields:

Type                     Id   Description
RemoteCallRequest        1    A remote call request (Definition 96)
RemoteReadRequest        2    A remote read request (Definition 98)
RemoteReadChildRequest   4    A remote read child request (Definition 100)

7.4.2. Response

A message containing all possible response messages. All responses are sent as part of this message.

Type               Id   Description
oneof (response)        The response type

Where the response can be one of the following fields:

Type                 Id   Description
RemoteCallResponse   1    A remote call response (Definition 97)
RemoteReadResponse   2    A remote read response (Definition 99)

7.4.3. Remote Call Messages

Execute a call to a contract at the given block.

Definition 96. Remote Call Request

Remote call request.

Type     Id   Description
bytes    2    Block at which to perform call
string   3    Method name
bytes    4    Call data

Definition 97. Remote Call Response

Remote call response.

Type     Id   Description
bytes    2    An Option type (Definition 190) containing the call proof or None if proof generation failed.

7.4.4. Remote Read Messages

Read a storage value at the given block.

Definition 98. Remote Read Request

Remote read request.

Type             Id   Description
bytes            2    Block at which to perform call
repeated bytes   3    Storage keys

Definition 99. Remote Read Response

Remote read response.

Type     Id   Description
bytes    2    An Option type (Definition 190) containing the read proof or None if proof generation failed.

7.4.5. Remote Read Child Messages

Read a child storage value at the given block.

Definition 100. Remote Read Child Request

Remote read child request.

Type     Id   Description
bytes    2    Block at which to perform call
bytes    3    Child storage key, relative to the child type storage location
bytes    6    Storage keys

The response is the same as for the Remote Read Request message, respectively Definition 99.
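Putting the field tables above together, the request `oneof` could be modeled in Rust as follows; this is an illustrative hand-written type, not the protobuf-generated one, with the field numbers noted in comments.

```rust
/// Hand-modeled light client request variants (ids from the tables above).
enum LightClientRequest {
    /// id 1: remote call (Definition 96).
    RemoteCall { block: Vec<u8>, method: String, data: Vec<u8> },
    /// id 2: remote read (Definition 98).
    RemoteRead { block: Vec<u8>, keys: Vec<Vec<u8>> },
    /// id 4: remote read child (Definition 100).
    RemoteReadChild { block: Vec<u8>, child_key: Vec<u8>, keys: Vec<Vec<u8>> },
}

fn main() {
    let _req = LightClientRequest::RemoteRead {
        block: vec![0u8; 32],
        keys: vec![b":code".to_vec()],
    };
}
```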

7.5. Storage for Light Clients

The light client requires persistent storage for saving the state of the blockchain. In addition, it requires efficient serialization and deserialization methods to transform SCALE-encoded (Section A.2.2) network traffic for storing and reading from the persistent storage.

8. Availability & Validity

Polkadot serves as a replicated shared-state machine designed to resolve scalability issues and interoperability among blockchains. The validators of Polkadot execute transactions and participate in the consensus of Polkadot's primary chain, the so-called relay chain. Parachains are independent networks that maintain their own state and are connected to the relay chain. Those parachains can take advantage of the relay chain consensus mechanism, including sending and receiving messages to and from other parachains. Parachain nodes that send parachain blocks, known as candidates, to the validators in order to be included in the relay chain are referred to as collators.

The Polkadot relay chain validators are responsible for guaranteeing the validity of both relay chain and parachain blocks. Additionally, the validators are required to keep enough parachain blocks that should be included in the relay chain available in their local storage, in order to make those retrievable by peers who lack the information and need to reliably confirm the issued validity statements about parachain blocks. The Availability & Validity (AnV) protocol consists of multiple steps for successfully upholding those responsibilities.

Parachain blocks themselves are produced by collators (Section 8.1), whereas the relay chain validators only verify their validity (and later, their availability). It is possible that the collators of a parachain produce multiple parachain block candidates for a child of a specific block. Subsequently, they send the block candidates to the relay chain validators who are assigned to the specific parachain. The assignment is determined by the Runtime (Section 8.2). Those validators are then required to check the validity of submitted candidates (Section 8.3), then issue and collect statements (Section 8.2.1) about the validity of candidates to other validators. This process is known as candidate backing. Once a candidate meets the specified criteria for inclusion, the selected relay chain block author then chooses any of the backed candidates for each parachain and includes those into the relay chain block (Section 8.2.2).

Every relay chain validator must fetch the proposed candidates and issue votes on whether they have the candidate saved in their local storage, the so-called availability votes (Section 8.4.1), and must also collect the votes sent by other validators and include them in the relay chain state (Section 8.2.2). This process ensures that only those relay chain blocks get finalized in which each candidate is available on enough validator nodes.

Parachain candidates contained in non-finalized relay chain blocks must then be retrieved by a secondary set of relay chain validators, unrelated to the candidate backing process, who are randomly assigned based on a VRF lottery to determine the validity of specific parachains and are then required to vote on the validity of those candidates. This process is known as approval voting (Section 8.5). If a validator does not have the candidate data, it must recover the candidate data (Section 8.4.2).

8.1. Collations

Collations are candidates proposed (Definition 131) to the Polkadot relay chain validators. The Polkadot network protocol is agnostic to the candidate production mechanism each parachain uses and does not specify or mandate any such production method (e.g. BABE-GRANDPA, Aura, etc.). Furthermore, the relay chain validator host implementation itself does not directly interpret or process the internal transactions of the candidate, but rather relies on the parachain Runtime to validate the candidate (Section 8.3). Collators, which are parachain nodes that produce candidate proposals and send them to the relay chain validators, must prepare pieces of data (Definition 101) in order to correctly comply with the requirements of the parachain protocol.

Definition 101. Collation

A collation is a datastructure which contains the proposed parachain candidate, including an optional parachain Runtime update and upward messages. The collation datastructure, \$C\$, is of the following format:

\$C = (M,H,R,h,P,p,w)\$ \$M = (u_n,…u_m)\$ \$H = (z_n,…z_m)\$
where
  • \$M\$ is an array of upward messages (Definition 137), \$u\$, interpreted by the relay chain itself.

  • \$H\$ is an array of outbound horizontal messages (Definition 139), \$z\$, interpreted by other parachains.

  • \$R\$ is an Option type (Definition 190) which can contain a parachain Runtime update. The new Runtime code is an array of bytes.

  • \$h\$ is the head data (Definition 133) produced as a result of execution of the parachain specific logic.

  • \$P\$ is the PoV block (Definition 132).

  • \$p\$ is an unsigned 32-bit integer indicating the number of processed downward messages (Definition 138).

  • \$w\$ is an unsigned 32-bit integer indicating the watermark up to which all inbound HRMP messages have been processed by the parachain.

8.2. Candidate Backing

The Polkadot validator receives an arbitrary number of parachain candidates with associated proofs from untrusted collators. The assigned validators of each parachain (Definition 136) must verify and select a specific quantity of the proposed candidates and issue those as backable candidates to their peers. A candidate is considered backable when at least 2/3 of all assigned validators have issued a Valid statement about that candidate, as described in Section 8.2.1. Validators can retrieve information about assignments via the Runtime APIs (Section C.9.2 and Section C.9.3, respectively).
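The backability threshold reduces to a comparison on integer counts; a hedged one-liner in Rust:

```rust
/// A candidate is backable once at least 2/3 of the assigned validators
/// issued a Valid statement (Section 8.2.1); integer math avoids rounding.
fn is_backable(valid_statements: u64, assigned_validators: u64) -> bool {
    3 * valid_statements >= 2 * assigned_validators
}

fn main() {
    assert!(is_backable(4, 5));  // 4/5 >= 2/3
    assert!(!is_backable(3, 5)); // 3/5 <  2/3
}
```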

8.2.1. Statements

The assigned validator checks the validity of the proposed parachain blocks (Section 8.3) and issues Valid statements (Definition 102) to its peers if the verification succeeded. Broadcasting a Valid statement for a candidate that failed verification is a slashable offense. The validator must only issue one Seconded statement, based on an arbitrary metric, which implies an explicit vote for a candidate to be included in the relay chain.

This protocol attempts to produce as many backable candidates as possible, but does not attempt to determine a final candidate for inclusion. Once a parachain candidate has been seconded by at least one other validator and enough Valid statements have been issued about that candidate to meet the 2/3 quorum, the candidate is ready to be included in the relay chain (Section 8.2.2).

The validator issues validity votes in the form of a validator protocol message (Definition 114).

Definition 102. Statement

A statement, \$S\$, is a datastructure of the following format:

\$S = (d,A_i,A_s)\$ \$d = {(1,->,C_r),(2,->,C_h):}\$
where
  • \$d\$ is a varying datatype where 1 indicates that the validator “seconds” a candidate, meaning that the candidate should be included in the relay chain, followed by the committed candidate receipt (Definition 105), \$C_r\$. 2 indicates that the validator has deemed the candidate valid, followed by the candidate hash.

  • \$C_h\$ is the candidate hash.

  • \$A_i\$ is the validator index in the authority set that signed this statement.

  • \$A_s\$ is the signature of the validator.

8.2.2. Inclusion

The Polkadot validator includes the backed candidates as parachain inherent data (Definition 103) into a block as described in Section 2.3.3. The relay chain block author decides, based on whatever metric, which candidate should be selected for inclusion, as long as that candidate is valid and meets the validity quorum of 2/3+ as described in Section 8.2.1. The candidate approval process (Section 8.5) ensures that only those relay chain blocks are finalized in which each candidate for each availability core meets the requirement of 2/3+ availability votes.

Definition 103. Parachain Inherent Data

The parachain inherent data contains backed candidates and is included when authoring a relay chain block. The datastructure, \$I\$, is of the following format:

\$I = (A,T,D,P_h)\$ \$T = (C_0,…C_n)\$ \$D = (*d_n,…d_m)\$ \$C = (R,V,i)\$ \$V = (a_n,…a_m)\$ \$a = {(1,->,s),(2,->,s):}\$ \$A = (L_n,…L_m)\$ \$L = (b,v_i,s)\$
where
  • \$A\$ is an array of signed bitfields by validators claiming the candidate is available (or not). The array must be sorted by validator index corresponding to the authority set (Definition 33).

  • \$T\$ is an array of backed candidates for inclusion in the current block.

  • \$D\$ is an array of disputes.

  • \$P_h\$ is the parachain parent head data (Definition 133).

  • \$d\$ is a dispute statement (Section 8.7.2.1).

  • \$R\$ is a committed candidate receipt (Definition 105).

  • \$V\$ is an array of validity votes themselves, expressed as signatures.

  • \$i\$ is a bitfield of indices of the validators within the validator group (Definition 136).

  • \$a\$ is either an implicit or explicit attestation of the validity of a parachain candidate, where 1 implies an implicit vote (in correspondence of a Seconded statement) and 2 implies an explicit attestation (in correspondence of a Valid statement). Both variants are followed by the signature of the validator.

  • \$s\$ is the signature of the validator.

  • \$b\$ the availability bitfield (Section 8.4.1).

  • \$v_i\$ is the validator index of the authority set (Definition 33).

Definition 104. Candidate Receipt

A candidate receipt, \$R\$, contains information about the candidate and a proof of the results of its execution. It’s a datastructure of the following format:

\$R = (D,C_h)\$

where \$D\$ is the candidate descriptor (Definition 106) and \$C_h\$ is the hash of candidate commitments (Definition 107).

Definition 105. Committed Candidate Receipt

The committed candidate receipt, \$R\$, contains information about the candidate and the result of its execution that is included in the relay chain. This type is similar to the candidate receipt (Definition 104), but actually contains the execution results rather than just their hash. It’s a datastructure of the following format:

\$R = (D,C)\$

where \$D\$ is the candidate descriptor (Definition 106) and \$C\$ is the candidate commitments (Definition 107).

Definition 106. Candidate Descriptor

The candidate descriptor, \$D\$, is a unique descriptor of a candidate receipt. It’s a datastructure of the following format:

\$D = (p,H,C_i,V,B,r,s,p_h,R_h)\$
where
  • \$p\$ is the parachain Id (Definition 134).

  • \$H\$ is the hash of the relay chain block the candidate is executed in the context of.

  • \$C_i\$ is the collator’s public key.

  • \$V\$ is the hash of the persisted validation data (Definition 227).

  • \$B\$ is the hash of the PoV block.

  • \$r\$ is the root of the block’s erasure encoding Merkle tree.

  • \$s\$ is the collator’s signature of the concatenated components \$p\$, \$H\$, \$R_h\$ and \$B\$.

  • \$p_h\$ is the hash of the parachain head data (Definition 133) of this candidate.

  • \$R_h\$ is the hash of the parachain Runtime.
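For reference, the descriptor's fields map onto a struct along these lines; the concrete types shown (32-byte hashes and keys, 64-byte signature) are assumptions for illustration, not a normative layout.

```rust
type Hash = [u8; 32];

/// Field-for-field sketch of the candidate descriptor (Definition 106).
struct CandidateDescriptor {
    para_id: u32,                         // p: parachain Id
    relay_parent: Hash,                   // H: relay chain block hash
    collator: [u8; 32],                   // C_i: collator public key
    persisted_validation_data_hash: Hash, // V
    pov_hash: Hash,                       // B
    erasure_root: Hash,                   // r
    signature: [u8; 64],                  // s: over (p, H, R_h, B)
    para_head: Hash,                      // p_h: hash of the head data
    validation_code_hash: Hash,           // R_h: hash of the parachain Runtime
}

fn main() {
    let _d = CandidateDescriptor {
        para_id: 1000,
        relay_parent: [0; 32],
        collator: [0; 32],
        persisted_validation_data_hash: [0; 32],
        pov_hash: [0; 32],
        erasure_root: [0; 32],
        signature: [0; 64],
        para_head: [0; 32],
        validation_code_hash: [0; 32],
    };
}
```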

Definition 107. Candidate Commitments

The candidate commitments, \$C\$, is the result of the execution and validation of a parachain (or parathread) candidate whose produced values must be committed to the relay chain. Those values are retrieved from the validation result (Definition 109). A candidate commitment is a datastructure of the following format:

\$C =(M_u,M_h,R,h,p,w)\$
where
  • \$M_u\$ is an array of upward messages sent by the parachain. Each individual message, m, is an array of bytes.

  • \$M_h\$ is an array of individual outbound horizontal messages (Definition 139) sent by the parachain.

  • \$R\$ is an Option value (Definition 190) that can contain a new parachain Runtime in case of an update.

  • \$h\$ is the parachain head data (Definition 133).

  • \$p\$ is an unsigned 32-bit integer indicating the number of downward messages that were processed by the parachain. It is expected that the parachain processes the messages from first to last.

  • \$w\$ is an unsigned 32-bit integer indicating the watermark which specifies the relay chain block number up to which all inbound horizontal messages have been processed.

8.3. Candidate Validation

Candidates submitted by collators must have their validity verified by the assigned Polkadot validators. For each candidate to be valid, the validator must successfully verify the following conditions in the following order:

  1. The candidate does not exceed any parameters in the persisted validation data (Definition 227).

  2. The signature of the collator is valid.

  3. Validate the candidate by executing the parachain Runtime (Section 8.3.1).

If all steps are valid, the Polkadot validator must create the necessary candidate commitments (Definition 107) and submit the appropriate statement for each candidate (Section 8.2.1).

8.3.1. Parachain Runtime

Parachain Runtimes are stored in the relay chain state, and can either be fetched by the parachain Id or the Runtime hash via the relay chain Runtime API as described in Section C.9.7 and Section C.9.8 respectively. The retrieved parachain Runtime might need to be decompressed based on the magic identifier as described in Section 8.3.2.

In order to validate a parachain block, the Polkadot validator must prepare the validation parameters (Definition 108), then use its local Wasm execution environment (Section 2.6.3) to execute the validate_block parachain Runtime API by passing on the validation parameters as an argument. The parachain Runtime function returns the validation result (Definition 109).

Definition 108. Validation Parameters

The validation parameters structure, \$P\$, is required to validate a candidate against a parachain Runtime. It’s a datastructure of the following format:

\$P = (h,b,B_i,S_r)\$
where
  • \$h\$ is the parachain head data (Definition 133).

  • \$b\$ is the block body (Definition 132).

  • \$B_i\$ is the latest relay chain block number.

  • \$S_r\$ is the relay chain block storage root (Section 2.4.4).

Definition 109. Validation Result

The validation result is returned by the validate_block parachain Runtime API after attempting to validate a parachain block. Those results are then used in candidate commitments (Definition 107), which then will be inserted into the relay chain via the parachain inherent data (Definition 103). The validation result, \$V\$, is a datastructure of the following format:

\$V = (h,R,M_u,M_h,p,w)\$ \$M_u = (m_0,…m_n)\$ \$M_h = (t_0,…t_n)\$
where
  • \$h\$ is the parachain head data (Definition 133).

  • \$R\$ is an Option value (Definition 190) that can contain a new parachain Runtime in case of an update.

  • \$M_u\$ is an array of upward messages sent by the parachain. Each individual message, m, is an array of bytes.

  • \$M_h\$ is an array of individual outbound horizontal messages (Definition 139) sent by the parachain.

  • \$p\$ is an unsigned 32-bit integer indicating the number of downward messages that were processed by the parachain. It is expected that the parachain processes the messages from first to last.

  • \$w\$ is an unsigned 32-bit integer indicating the watermark which specifies the relay chain block number up to which all inbound horizontal messages have been processed.

8.3.2. Runtime Compression

Runtime compression is not documented yet.

8.4. Availability

8.4.1. Availability Votes

The Polkadot validator must issue a bitfield (Definition 141) which indicates votes for the availability of candidates. Issued bitfields can be used by the validator and other peers to determine which backed candidates meet the 2/3+ availability quorum.

Candidates are inserted into the relay chain in the form of parachain inherent data (Section 8.2.2) by a block author. A validator can retrieve that data by calling the appropriate Runtime API entry (Section C.9.3), then create a bitfield indicating for which candidates the validator has availability data stored and broadcast it to the network (Definition 118). When sending the bitfield distribution message, the validator must ensure \$B_h\$ is set appropriately, thereby clarifying which state the bitfield refers to, given that candidates can vary based on the chain fork.

Missing availability data of candidates must be recovered by the validator as described in Section 8.4.2. If previously issued bitfields are no longer accurate, i.e. the availability data has been recovered or the candidate of an availability core has changed, the validator must create a new bitfield and broadcast it to the network. Candidates must be kept available by validators for a specific amount of time. If a candidate does not receive any backing, validators should keep it available for about one hour, in case the state of backing does change. Backed and even approved candidates (Section 8.5) must be kept by validators for about 25 hours, since disputes (Section 8.6) can occur and the candidate needs to be checked again.

The validator issues availability votes in form of a validator protocol message (Definition 115).

8.4.2. Candidate Recovery

The availability distribution of the Polkadot validator must be able to recover parachain candidates that the validator is assigned to, in order to determine whether the candidate should be backed (Section 8.2) respectively whether the candidate should be approved (Section 8.5). Additionally, peers can send availability requests as defined in Definition 122 and Definition 124 to the validator, which the validator should be able to respond to.

Candidates are recovered by sending requests for specific indices of erasure encoded chunks (Section A.4.1). A validator should request chunks by picking peers randomly and must recover at least \$f+1\$ chunks, where \$n=3f+k\$ and \$k in {1,2,3}\$. \$n\$ is the number of validators as specified in the session info, which can be fetched by the Runtime API as described in Section C.9.11.
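The chunk threshold follows directly from \$n = 3f + k\$; a small sketch (names illustrative):

```rust
/// Minimum number of erasure-coded chunks needed for recovery: f + 1,
/// where n = 3f + k and k is in {1, 2, 3}.
fn recovery_threshold(n_validators: u64) -> u64 {
    let f = n_validators.saturating_sub(1) / 3; // largest f with 3f < n
    f + 1
}

fn main() {
    assert_eq!(recovery_threshold(10), 4);    // n = 3*3 + 1
    assert_eq!(recovery_threshold(300), 100); // n = 3*99 + 3
}
```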

8.5. Approval Voting

The approval voting process ensures that only valid parachain blocks are finalized on the relay chain. After backable parachain candidates were submitted to the relay chain (Section 8.2.2), which can be retrieved via the Runtime API (Section C.9.3), validators need to determine their assignments for each parachain and issue approvals for valid candidates, respectively disputes for invalid candidates. Since it cannot be expected that each validator verifies every single parachain candidate, this mechanism ensures that enough honest validators are selected to verify parachain candidates in order to prevent the finalization of invalid blocks. If an honest validator detects an invalid block which was approved by one or more validators, the honest validator must issue a dispute, which will cause escalation, resulting in consequences for all malicious parties, i.e. slashing. This mechanism is described more in Section 8.5.1.

8.5.1. Assignment Criteria

Validators determine their assignment based on a VRF mechanism, similar to the BABE consensus mechanism. First, validators generate an availability core VRF assignment (Definition 111), which indicates which availability core a validator is assigned to. Then a delayed availability core VRF assignment is generated which indicates at what point a validator should start the approval process. The delays are based on “tranches” (Section 8.5.2).

An assigned validator never broadcasts their assignment until it is relevant. Once the assigned validator is ready to check a candidate, the validator broadcasts their assignment by issuing an approval distribution message (Definition 119), where \$M\$ is of variant 0. Other assigned validators that receive that network message must keep track of it, expecting an approval vote to follow shortly after. Assigned validators can retrieve the candidate by using the availability recovery (Section 8.4.2) and then validate the candidate (Section 8.3).

The validator issues approval votes in form of a validator protocol message (Definition 114) respectively disputes (Section 8.6).

8.5.2. Tranches

Validators use a subjective, tick-based system to determine when the approval process should start. A validator starts the tick-based system when new availability core candidates have been proposed, which can be retrieved via the Runtime API (Section C.9.3), and increments the tick every 500 milliseconds. Each tick/increment is referred to as a “tranche”, represented as an integer, starting at 0.
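The tick-to-tranche mapping is a simple division on elapsed time; an illustrative sketch:

```rust
/// Tranche for a candidate first seen at `seen_ms`, with one tranche per
/// 500 ms tick (tranche 0 starts immediately).
fn current_tranche(seen_ms: u64, now_ms: u64) -> u64 {
    now_ms.saturating_sub(seen_ms) / 500
}

fn main() {
    assert_eq!(current_tranche(0, 0), 0);
    assert_eq!(current_tranche(0, 1_250), 2); // 1.25 s elapsed -> tranche 2
}
```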

As described in Section 8.5.1, the validator first executes the VRF mechanism to determine which parachains (availability cores) the validator is assigned to, then an additional VRF mechanism for each assigned parachain to determine the delayed assignment. The delayed assignment indicates the tranche at which the validator should start the approval process. A tranche of value 0 implies that the assignment should be started immediately, while assignees of later tranches wait until it is their turn to issue assignments, determined by their subjective, tick-based system.

Validators are required to track broadcasted assignments by other validators assigned to the same parachain, including verifying the VRF output. Once a valid assignment from a peer was received, the validator must wait for the following approval vote within a certain period as described in Section C.9.11 by orienting itself on its local, tick-based system. If the waiting time after a broadcasted assignment exceeds the specified period, the validator interprets this behavior as a “no-show”, indicating that more validators should commit on their tranche until enough approval votes have been collected.

If enough approval votes have been collected as described in Section C.9.11, then assignees of later tranches do not have to start the approval process. Therefore, this tranche system serves as a mechanism to ensure that enough candidate approvals from a random set of validators are created without requiring all assigned validators to check the candidate.

Definition 110. Relay VRF Story

The relay VRF story is an array of random bytes derived from the VRF submitted within the block by the block author. The relay VRF story, \$T\$, is used as input to determine approval voting criteria and is generated the following way:

\$T = "Transcript"(b_r,b_s,e_i,A)\$
where
  • \$"Transcript"\$ constructs a VRF transcript (Definition 175).

  • \$b_r\$ is the BABE randomness of the current epoch (Definition 71).

  • \$b_s\$ is the current BABE slot (Definition 54).

  • \$e_i\$ is the current BABE epoch index (Definition 54).

  • \$A\$ is the public key of the authority.
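As an illustration, the transcript construction could be sketched with the merlin library as follows; the transcript and item labels used here are assumptions of this sketch, not normative constants:

use merlin::Transcript;

// Illustrative construction of the relay VRF story transcript T from
// Definition 110; labels are placeholders.
fn relay_vrf_story(
    babe_randomness: &[u8; 32], // b_r
    babe_slot: u64,             // b_s
    epoch_index: u64,           // e_i
    authority_key: &[u8; 32],   // A
) -> Transcript {
    let mut t = Transcript::new(b"A&V RC-VRF");
    t.append_message(b"randomness", babe_randomness);
    t.append_u64(b"slot", babe_slot);
    t.append_u64(b"epoch", epoch_index);
    t.append_message(b"authority", authority_key);
    t
}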

Definition 111. Availability Core VRF Assignment

An availability core VRF assignment is computed by a relay chain validator to determine which availability core (Definition 135) a validator is assigned to and should vote for approvals. Computing this assignment relies on the VRF mechanism, transcripts and STROBE operations described further in Section A.1.3.

The Runtime dictates how many assignments should be conducted by a validator, as specified in the session info, which can be retrieved via the Runtime API (Section C.9.11). The number of assignments is referred to as “samples”. For each iteration over the number of samples, the validator calculates an individual assignment, \$T\$, where the little-endian encoded sample number, \$s\$, is incremented by one. At the beginning of the iteration, \$s\$ starts at value 0.

The validator executes the following steps to retrieve a (possibly valid) core index:

\$t_1 larr "Transcript"("'A&V MOD'")\$ \$t_2 larr "append"(t_1, "'RC-VRF'", R_s)\$ \$t_3 larr "append"(t_2, "'sample'", s)\$ \$t_4 larr "append"(t_3, "'vrf-nm-pk'", p_k)\$ \$t_5 larr "meta-ad"(t_4, "'VRFHash'", "False")\$ \$t_6 larr "meta-ad"(t_5, 64_("le"), "True")\$ \$i larr "prf"(t_6, "False")\$ \$o = s_k * i\$

where \$s_k\$ is the secret key, \$p_k\$ is the public key and \$64_("le")\$ is the integer 64 encoded as little endian. \$R_s\$ is the relay VRF story as defined in Definition 110. Following:

\$t_1 larr "Transcript"("'VRFResult'")\$ \$t_2 larr "append"(t_1, "''", "'A&V CORE'")\$ \$t_3 larr "append"(t_2, "'vrf-in'", i)\$ \$t_4 larr "append"(t_3, "'vrf-out'", o)\$ \$t_5 larr "meta-ad"(t_4, "''", "False")\$ \$t_6 larr "meta-ad"(t_5, 4_"le", "True")\$ \$r larr "prf"(t_6, "False")\$ \$c_i = r mod a_c\$

where \$4_("le")\$ is the integer 4 encoded as little endian, \$r\$ is the 4-byte challenge interpreted as a little endian encoded integer and \$a_c\$ is the number of availability cores used during the active session, as defined in the session info retrieved by the Runtime API (Section C.9.11). The resulting integer, \$c_i\$, indicates the parachain Id (Definition 134). If the parachain Id doesn’t exist, as can be retrieved by the Runtime API (Section C.9.3), the validator discards that value and continues with the next iteration. If the Id does exist, the validator continues with the following steps:

\$t_1 larr "Transcript"("'A&V ASSIGNED'")\$ \$t_2 larr "append"(t_1, "'core'", c_i)\$ \$p larr "dleq_prove"(t_2, i)\$

where \$"dleq_prove"\$ is described in Definition 172. The resulting values of \$o\$, \$p\$ and \$s\$ are used to construct an assignment certificate (Definition 113) of kind 0.

Definition 112. Delayed Availability Core VRF Assignment

The delayed availability core VRF assignment determines at what point a validator should start the approval process, as described in Section 8.5.2. Computing this assignment relies on the VRF mechanism, transcripts and STROBE operations described further in Section A.1.3.

The validator executes the following steps:

\$t_1 larr "Transcript"("'A&V DELAY'")\$ \$t_2 larr "append"(t_1,"'RC-VRF'",R_s)\$ \$t_3 larr "append"(t_2, "'core'",c_i)\$ \$t_4 larr "append"(t_3, "'vrf-nm-pk'", p_k)\$ \$t_5 larr "meta-ad"(t_4, "'VRFHash'", "False")\$ \$t_6 larr "meta-ad"(t_5, 64_("le"), "True")\$ \$i larr "prf"(t_6, "False")\$ \$o = s_k * i\$ \$p larr "dleq_prove"(t_6, i)\$

The resulting value \$p\$ is the VRF proof (Definition 171). \$"dleq_prove"\$ is described in Definition 172.

The tranche, \$d\$, is determined as:

\$t_1 larr "Transcript"("'VRFResult'")\$ \$t_2 larr "append"(t_1, "''", "'A&V TRANCHE'")\$ \$t_3 larr "append"(t_2, "'vrf-in'", i)\$ \$t_4 larr "append"(t_3, "'vrf-out'", o)\$ \$t_5 larr "meta-ad"(t_4, "''", "False")\$ \$t_6 larr "meta-ad"(t_5, 4_("le"), "True")\$ \$c larr "prf"(t_6, "False")\$ \$d = c mod (d_c+d_z) - d_z\$
where
  • \$d_c\$ is the number of delay tranches in total, as specified by the session info, retrieved via the Runtime API (Section C.9.11).

  • \$d_z\$ is the zeroth delay tranche width as specified by the session info, retrieved via the Runtime API (Section C.9.11).

The resulting tranche, \$d\$, cannot be less than \$0\$. If the tranche is less than \$0\$, then \$d = 0\$. The resulting values \$o\$, \$p\$ and \$c_i\$ are used to construct an assignment certificate (Definition 113) of kind 1.
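In code, the clamping amounts to a saturating subtraction; a minimal sketch:

// d = (c mod (d_c + d_z)) - d_z, clamped at zero.
fn delayed_tranche(c: u32, n_delay_tranches: u32, zeroth_delay_tranche_width: u32) -> u32 {
    (c % (n_delay_tranches + zeroth_delay_tranche_width))
        .saturating_sub(zeroth_delay_tranche_width)
}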

Definition 113. Assignment Certificate

The Assignment Certificate proves to the network that a Polkadot validator is assigned to an availability core and is therefore qualified for the approval of candidates, as clarified in Definition 111. This certificate contains the computed VRF output and is a datastructure of the following format:

\$(k, o, p)\$ \$k = {(0,->,s),(1,->,c_i):}\$

where \$k\$ indicates the kind of the certificate, respectively the value 0 proves the availability core assignment (Definition 111), followed by the sample number \$s\$, and the value 1 proves the delayed availability core assignment (Definition 112), followed by the core index \$c_i\$ (Section C.9.3). \$o\$ is the VRF output and \$p\$ is the VRF proof.
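Rendered as a Rust datastructure, the certificate could be sketched as follows (type and field names are illustrative):

enum AssignmentCertKind {
    AvailabilityCore { sample: u32 }, // kind 0 (Definition 111)
    DelayedCore { core_index: u32 },  // kind 1 (Definition 112)
}

struct AssignmentCert {
    kind: AssignmentCertKind, // k, with its trailing s or c_i payload
    vrf_output: [u8; 32],     // o
    vrf_proof: [u8; 64],      // p
}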

8.6. Disputes

Disputes are not documented yet.

8.7. Network Messages

The availability and validity process requires certain network messages to be exchanged between validators and collators.

8.7.1. Notification Messages

The notification messages are exchanged between validators, including messages sent by collators to validators. The protocol messages are exchanged based on a streaming notification substream (Section 4.5). The messages are SCALE encoded (Section A.2.2).

Definition 114. Validator Protocol Message

The validator protocol message is a varying datatype used by validators to broadcast relevant information about certain steps in the A&V process. Specifically, this includes the backing process (Section 8.2) and the approval process (Section 8.5). The validator protocol message, \$M\$, is a varying datatype of the following format:

\$M = {(1,->,M_f),(3,->,M_s),(4,->,M_a):}\$
where
  • \$M_f\$ is a statement distribution message (Definition 117).

  • \$M_s\$ is a bitfield distribution message (Definition 118).

  • \$M_a\$ is an approval distribution message (Definition 119).

Definition 115. Collation Protocol Message

The collation protocol message, \$M\$, is a varying datatype of the following format:

\$M = {(0,->,M_c):}\$

where \$M_c\$ is the collator message (Definition 116).
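Expressed as SCALE-encoded Rust enums (Section A.2.2), the two protocol messages above could be sketched as follows; the type names are illustrative and the message payloads are stubbed out as opaque byte arrays:

use parity_scale_codec::{Decode, Encode};

// Placeholder payloads; the real message bodies are given in
// Definitions 116-119.
type StatementDistributionMessage = Vec<u8>;
type BitfieldDistributionMessage = Vec<u8>;
type ApprovalDistributionMessage = Vec<u8>;
type CollatorMessage = Vec<u8>;

#[derive(Encode, Decode)]
enum ValidatorProtocolMessage {
    #[codec(index = 1)]
    StatementDistribution(StatementDistributionMessage), // M_f
    #[codec(index = 3)]
    BitfieldDistribution(BitfieldDistributionMessage), // M_s
    #[codec(index = 4)]
    ApprovalDistribution(ApprovalDistributionMessage), // M_a
}

#[derive(Encode, Decode)]
enum CollationProtocolMessage {
    #[codec(index = 0)]
    CollatorProtocol(CollatorMessage), // M_c
}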

Definition 116. Collator Message

The collator message is sent as part of the collator protocol message (Definition 115). The collator message, \$M\$, is a varying datatype of the following format:

\$M = {(0,->,(C_i,P_i,C_s)),(1,->,H),(4,->,(B_h,S)):}\$
where
  • \$M\$ is a varying datatype where 0 indicates the intent to advertise a collation and 1 indicates the advertisement of a collation to a validator. 4 indicates that a collation sent to a validator was seconded.

  • \$C_i\$ is the public key of the collator.

  • \$P_i\$ is the parachain Id (Definition 134).

  • \$C_s\$ is the signature of the collator using the PeerId of the collator’s node.

  • \$H\$ is the hash of the parachain block (Definition 132).

  • \$S\$ is a full statement (Definition 102).

Definition 117. Statement Distribution Message

The statement distribution message is sent as part of the validator protocol message (Definition 114) and indicates the validity vote of a validator for a given candidate, described further in Section 8.2.1. The statement distribution message, \$M\$, is a varying datatype of the following format:

\$M = {(0,->,(B_h,S)),(1,->,S_m):}\$ \$S_m = (B_h,C_h,A_i,A_s)\$
where
  • \$M\$ is a varying datatype where 0 indicates a signed statement and 1 contains metadata about a seconded statement with a larger payload, such as a runtime upgrade. The candidate itself can be fetched via the request/response message (Definition 128).

  • \$B_h\$ is the hash of the relay chain parent, indicating the state this message is for.

  • \$S\$ is a full statement (Definition 102).

  • \$A_i\$ is the validator index in the authority set (Definition 33) that signed this message.

  • \$A_s\$ is the signature of the validator.

Definition 118. Bitfield Distribution Message

The bitfield distribution message is sent as part of the validator protocol message (Definition 114) and indicates the availability vote of a validator for a given candidate, described further in Section 8.4.1. The bitfield distribution message, \$M\$, is a datastructure of the following format:

\$M = {(0,->,(B_h,P)):}\$ \$P = (d,A_i,A_s)\$
where
  • \$B_h\$ is the hash of the relay chain parent, indicating the state this message is for.

  • \$d\$ is the bitfield array (Definition 141).

  • \$A_i\$ is the validator index in the authority set (Definition 33) that signed this message.

  • \$A_s\$ is the signature of the validator.

Definition 119. Approval Distribution Message

The approval distribution message is sent as part of the validator protocol message (Definition 114) and indicates the approval vote of a validator for a given candidate, described further in Section 8.5.1. The approval distribution message, \$M\$, is a varying datatype of the following format:

\$M = {(0,->,((C,I)_0,…,(C,I)_n)),(1,->,(V_0,…,V_n)):}\$ \$C = (B_h,A_i,c_a)\$ \$c_a = (c_k,P_o,P_p)\$ \$c_k = {(0,->,s),(1,->,c_i):}\$ \$V = (B_h,I,A_i,A_s)\$
where
  • \$M\$ is a varying datatype where 0 indicates assignments for candidates in recent, unfinalized blocks and 1 indicates approvals for candidates in some recent, unfinalized block.

  • \$C\$ is an assignment criterion which refers to the candidate under which the assignment is relevant by the block hash.

  • \$I\$ is an unsigned 32-bit integer indicating the index of the candidate, corresponding to the order of the availability cores (Section C.9.3).

  • \$B_h\$ is the relay chain block hash where the candidate appears.

  • \$A_i\$ is the authority set Id (Definition 33) of the validator that created this message.

  • \$A_s\$ is the signature of the validator issuing this message.

  • \$c_a\$ is the certification of the assignment.

  • \$c_k\$ is a varying datatype where 0 indicates an assignment based on the VRF that authorized the relay chain block where the candidate was included, followed by a sample number, \$s\$. 1 indicates an assignment story based on the VRF that authorized the relay chain block where the candidate was included combined with the index of a particular core. This is described further in Section 8.5.

  • \$P_o\$ is a VRF output and \$P_p\$ its corresponding proof.

8.7.2. Request & Response

The request & response network messages are sent and received between peers in the Polkadot network, including collators and non-validator nodes. Those messages are conducted on the request-response substreams (Section 4.5). The network messages are SCALE encoded as described in Section A.2.2.

Definition 120. PoV Fetching Request

The PoV fetching request is sent by clients who want to retrieve a PoV block from a node. The request is a datastructure of the following format:

\$C_h\$

where \$C_h\$ is the 256-bit hash of the PoV block. The response message is defined in Definition 121.

Definition 121. PoV Fetching Response

The PoV fetching response is sent by nodes to the clients who issued a PoV fetching request (Definition 120). The response, \$R\$, is a varying datatype of the following format:

\$R = {(0,->,B),(1,->,phi):}\$

where 0 is followed by the PoV block and 1 indicates that the PoV block was not found.

Definition 122. Chunk Fetching Request

The chunk fetching request is sent by clients who want to retrieve chunks of a parachain candidate. The request is a datastructure of the following format:

\$(C_h,i)\$

where \$C_h\$ is the 256-bit hash of the parachain candidate and \$i\$ is a 32-bit unsigned integer indicating the index of the chunk to fetch. The response message is defined in Definition 123.

Definition 123. Chunk Fetching Response

The chunk fetching response is sent by nodes to the clients who issued a chunk fetching request (Definition 122). The response, \$R\$, is a varying datatype of the following format:

\$R = {(0,->,C_r),(1,->,phi):}\$ \$C_r = (c,c_p)\$

where 0 is followed by the chunk response, \$C_r\$, and 1 indicates that the requested chunk was not found. \$C_r\$ contains the erasure-encoded chunk of data belonging to the candidate block, \$c\$, and \$c_p\$, that chunk’s proof in the Merkle tree. Both \$c\$ and \$c_p\$ are byte arrays of type \$(b_n…b_m)\$.
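A minimal sketch of this request/response pair as SCALE types, with illustrative names:

use parity_scale_codec::{Decode, Encode};

// Sketch of the chunk fetching pair (Definitions 122 and 123).
#[derive(Encode, Decode)]
struct ChunkFetchingRequest {
    candidate_hash: [u8; 32], // C_h
    index: u32,               // i
}

#[derive(Encode, Decode)]
enum ChunkFetchingResponse {
    #[codec(index = 0)]
    Chunk { chunk: Vec<u8>, proof: Vec<u8> }, // C_r = (c, c_p)
    #[codec(index = 1)]
    NoSuchChunk, // phi
}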

Definition 124. Available Data Request

The available data request is sent by clients who want to retrieve the PoV block of a parachain candidate. The request is a datastructure of the following format:

\$C_h\$

where \$C_h\$ is the 256-bit candidate hash to get the available data for. The response message is defined in Definition 125.

Definition 125. Available Data Response

The available data response is sent by nodes to the clients who issued an available data request (Definition 124). The response, \$R\$, is a varying datatype of the following format:

\$R = {(0,->,A),(1,->,phi):}\$ \$A = (P_{ov},D_{pv})\$

where 0 is followed by the available data, \$A\$, and 1 indicates that the requested candidate hash was not found. \$P_{ov}\$ is the PoV block (Definition 132) and \$D_{pv}\$ is the persisted validation data (Definition 227).

Definition 126. Collation Fetching Request

The collation fetching request is sent by clients who want to retrieve the advertised collation at the specified relay chain block. The request is a datastructure of the following format:

\$(B_h,P_{id})\$

where \$B_h\$ is the hash of the relay chain block and \$P_{id}\$ is the parachain Id (Definition 134). The response message is defined in Definition 127.

Definition 127. Collation Fetching Response

The collation fetching response is sent by nodes to the clients who issued a collation fetching request (Definition 126). The response, \$R\$, is a varying datatype of the following format:

\$R = {(0,->,(C_r,B)):}\$

where \$0\$ is followed by the candidate receipt (Definition 104), \$C_r\$, and the PoV block (Definition 132), \$B\$. This type does not notify the client about a collation that was not found.

Definition 128. Statement Fetching Request

The statement fetching request is sent by clients who want to retrieve statements about a given candidate. The request is a datastructure of the following format:

\$(B_h,C_h)\$

where \$B_h\$ is the hash of the relay chain parent and \$C_h\$ is the candidate hash that was used to create a committed candidate receipt (Definition 105). The response message is defined in Definition 129.

Definition 129. Statement Fetching Response

The statement fetching response is sent by nodes to the clients who issued a statement fetching request (Definition 128). The response, \$R\$, is a varying datatype of the following format:

\$R = {(0,->,C_r):}\$

where \$C_r\$ is the committed candidate receipt (Definition 105). No response is returned if no statement is found.

8.7.2.1. Dispute Request

The dispute request is sent by clients who want to issue a dispute about a candidate. The request, \$D_r\$, is a datastructure of the following format:

\$D_r = (C_r,S_i,I_v,V_v)\$ \$I_v = (A_i,A_s,k_i)\$ \$V_v = (A_i,A_s,k_v)\$ \$k_i = {(0,->,phi):}\$ \$k_v = {(0,->,phi),(1,->,C_h),(2,->,C_h),(3,->,phi):}\$
where
  • \$C_r\$ is the candidate that is being disputed. The structure is a candidate receipt (Definition 104).

  • \$S_i\$ is an unsigned 32-bit integer indicating the session index the candidate appears in.

  • \$I_v\$ is the invalid vote that makes up the request.

  • \$V_v\$ is the valid vote that makes this dispute request valid.

  • \$A_i\$ is an unsigned 32-bit integer indicating the validator index in the authority set (Definition 33).

  • \$A_s\$ is the signature of the validator.

  • \$k_i\$ is a varying datatype and implies the dispute statement. 0 indicates an explicit statement.

  • \$k_v\$ is a varying datatype and implies the dispute statement.

    • \$0\$ indicates an explicit statement.

    • \$1\$ indicates a seconded statement on a candidate, \$C_h\$, from the backing phase. \$C_h\$ is the hash of the candidate.

    • \$2\$ indicates a valid statement on a candidate, \$C_h\$, from the backing phase. \$C_h\$ is the hash of the candidate.

    • \$3\$ indicates an approval vote from the approval checking phase.

The response message is defined in Section 8.7.2.2.

8.7.2.2. Dispute Response

The dispute response is sent by nodes to the clients who issued a dispute request (Section 8.7.2.1). The response, \$R\$, is a varying type of the following format:

\$R = {(0,->,phi):}\$

where \$0\$ indicates that the dispute was successfully processed.

8.8. Definitions

Definition 130. Collator

A collator is a parachain node that sends parachain blocks, known as candidates (Definition 131), to the relay chain validators. The relay chain validators are not concerned with how the collator works or how it creates candidates.

Definition 131. Candidate

A candidate is a parachain block (Definition 132) submitted to the relay chain validators. A parachain block stops being referred to as a candidate as soon as it has been finalized.

Definition 132. Parachain Block

A parachain block or a Proof-of-Validity block (PoV block) contains the necessary data for the parachain specific state transition logic. Relay chain validators are not concerned with the inner structure of the block and treat it as a byte array.

Definition 133. Head Data

The head data contains information about a parachain block (Definition 132). The head data is returned by executing the parachain Runtime; relay chain validators are not concerned with its inner structure and treat it as a byte array.

Definition 134. Parachain Id

The Parachain Id is a unique, unsigned 32-bit integer which serves as an identifier of a parachain, assigned by the Runtime.

Definition 135. Availability Core

Availability cores are slots used to process parachains. The Runtime assigns each parachain to an availability core and validators can fetch information about the cores, such as parachain block candidates, by calling the appropriate Runtime API (Section C.9.3). Validators are not concerned with the internal workings from the Runtime’s perspective.

Definition 136. Validator Groups

Validator groups indicate which validators are responsible for creating backable candidates for parachains (Section 8.2), and are assigned by the Runtime (Section C.9.2). Validators are not concerned with the internal workings from the Runtime’s perspective. Collators can use this information for submitting blocks.

Definition 137. Upward Message

An upward message is an opaque byte array sent from a parachain to a relay chain.

Definition 138. Downward Message

A downward message is an opaque byte array received by the parachain from the relay chain.

Definition 139. Outbound HRMP Message

An outbound HRMP message (Horizontal Relay-routed Message Passing) is sent, from the perspective of the sending parachain, to another parachain by passing it through the relay chain. It’s a datastructure of the following format:

\$(I,M)\$

where \$I\$ is the recipient Id (Definition 134) and \$M\$ is an upward message (Definition 137).

Definition 140. Inbound HRMP Message

An inbound HRMP message (Horizontal Relay-routed Message Passing) is seen from the perspective of the recipient parachain, sent from another parachain by passing it through the relay chain. It’s a datastructure of the following format:

\$(N,M)\$

where \$N\$ is the unsigned 32-bit integer indicating the relay chain block number at which the message was passed down to the recipient parachain and \$M\$ is a downward message (Definition 138).

Definition 141. Bitfield Array

A bitfield array contains single-bit values which indicate whether a candidate is available. The number of items is equal to the number of availability cores (Definition 135) and each bit represents a vote on the corresponding core in the given order. Respectively, if the single bit equals 1, then the Polkadot validator claims that the availability core is occupied, there exists a committed candidate receipt (Definition 105) and that the validator has a stored chunk of the parachain block (Definition 132).
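To illustrate how these bitfields are consumed, a hedged sketch of tallying the votes for one core, assuming the usual 2/3 supermajority availability threshold (Section 8.4.1):

// Counting availability votes for a single core across the bitfields
// signed by each validator; the threshold is an assumption of this sketch.
fn core_is_available(bitfields: &[Vec<bool>], core: usize, n_validators: usize) -> bool {
    let votes = bitfields
        .iter()
        .filter(|b| b.get(core).copied().unwrap_or(false))
        .count();
    3 * votes > 2 * n_validators
}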

Polkadot Runtime

Description of various useful Runtime internals

9. Extrinsics

9.1. Introduction

An extrinsic is a SCALE encoded array consisting of a version number, signature, and varying data types indicating the resulting Runtime function to be called, including the parameters required for that function to be executed.

9.2. Preliminaries

Definition 142. Extrinsic

An extrinsic, \(tx\), is a tuple consisting of the extrinsic version, \(T_v\) (Definition 143), and the body of the extrinsic, \(T_b\).

\[tx := (T_v, T_b)\]

The value of \(T_b\) varies for each version. The current version 4 is described in Section 9.3.1.

Definition 143. Extrinsic Version

\(T_v\) is an 8-bit bitfield and defines the extrinsic version. The required format of an extrinsic body, \(T_b\), is dictated by the Runtime. Older or unsupported versions are rejected.

The most significant bit of \(T_v\) indicates whether the transaction is signed (\(1\)) or unsigned (\(0\)). The remaining 7 bits represent the version number. As an example, for extrinsic format version 4, a signed extrinsic represents \(T_v\) as 132 while an unsigned extrinsic represents it as 4.
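As a quick sanity check of this layout, a minimal sketch (the function name is illustrative):

// Version byte: MSB = signed flag, low 7 bits = format version.
fn extrinsic_version_byte(signed: bool, version: u8) -> u8 {
    ((signed as u8) << 7) | (version & 0b0111_1111)
}

fn main() {
    assert_eq!(extrinsic_version_byte(true, 4), 132); // signed, version 4
    assert_eq!(extrinsic_version_byte(false, 4), 4);  // unsigned, version 4
}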

9.3. Extrinsics Body

9.3.1. Version 4

Version 4 of the Polkadot extrinsic format is defined as follows:

\[T_b := (A_i, Sig, E, M_i, F_i(m))\]
where the individual components are specified in the following definitions.
Definition 144. Extrinsic Address

Account Id, \(A_i\), is the 32-byte address of the sender of the extrinsic as described in the external SS58 address format.

Definition 145. Extrinsic Signature

The signature, \(Sig\), is a varying data type indicating the used signature type, followed by the signature created by the extrinsic author. The following types are supported:

\[Sig := \begin{cases} 0, & \text{Ed25519, followed by: } (b_0, ...,b_{63}) \\ 1, & \text{Sr25519, followed by: } (b_0, ...,b_{63}) \\ 2, & \text{Ecdsa, followed by: } (b_0, ...,b_{64}) \end{cases}\]

Signature types vary in sizes, but each individual type is always fixed-size and therefore does not contain a length prefix. Ed25519 and Sr25519 signatures are 512-bit while Ecdsa is 520-bit, where the last 8 bits are the recovery ID.

The signature is created by signing payload \(P\).

\[\begin{aligned} P &:= \begin{cases} Raw, & \text{if } |Raw| \leq 256 \\ Blake2(Raw), & \text{if } |Raw| > 256 \\ \end{cases} \\ Raw &:= (M_i, F_i(m), E, R_v, F_v, H_h(G), H_h(B)) \\ \end{aligned}\]
where
  • \(M_i\): the module indicator (Definition 147).

  • \(F_i(m)\): the function indicator of the module (Definition 148).

  • \(E\): the extra data (Definition 146).

  • \(R_v\): a UINT32 containing the specification version (currently 14).

  • \(F_v\): a UINT32 containing the format version (currently 2).

  • \(H_h(G)\): a 32-byte array containing the genesis hash.

  • \(H_h(B)\): a 32-byte array containing the hash of the block which starts the mortality period, as described in Definition 149.
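A minimal sketch of this payload rule, assuming the blake2_256 helper from sp_core::hashing:

use sp_core::hashing::blake2_256;

// Sign Raw directly when it is at most 256 bytes, otherwise sign its
// Blake2b-256 hash.
fn signing_payload(raw: &[u8]) -> Vec<u8> {
    if raw.len() <= 256 {
        raw.to_vec()
    } else {
        blake2_256(raw).to_vec()
    }
}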

Definition 146. Extra Data

Extra data, \(E\), is a tuple containing additional meta data about the extrinsic and the system it is meant to be executed in.

\[E := (T_{mor}, N, P_t)\]
where
  • \(T_{mor}\): contains the SCALE encoded mortality of the extrinsic (Definition 149).

  • \(N\): a compact integer containing the nonce of the sender. The nonce must be incremented by one for each extrinsic created, otherwise the Polkadot network will reject the extrinsic.

  • \(P_t\): a compact integer containing the payment of the transactor, including the tip.

Definition 147. Module Indicator

\(M_i\) is an indicator for the Runtime to which Polkadot module, \(m\), the extrinsic should be forwarded to.

\(M_i\) is a varying data type pointing to every module exposed to the network.

\[M_i := \begin{cases} 0, & \text{System} \\ 1, & \text{Utility} \\ ... & \\ 7, & \text{Balances} \\ ... & \end{cases}\]
Definition 148. Function Indicator

\(F_i(m)\) is a tuple which contains an indicator, \(m_i\), for the Runtime to which function within the Polkadot module, \(m\), the extrinsic should be forwarded to. This indicator is followed by the concatenated and SCALE encoded parameters of the corresponding function, \(params\).

\[F_i(m) := (m_i, params)\]

The value of \(m_i\) varies for each Polkadot module, since every module offers different functions. As an example, the Balances module has the following functions:

\[Balances_i := \begin{cases} 0, & \text{transfer} \\ 1, & \text{set_balance} \\ 2, & \text{force_transfer} \\ 3, & \text{transfer_keep_alive} \\ ... & \end{cases}\]

9.3.2. Mortality

Definition 149. Extrinsic Mortality

Extrinsic mortality is a mechanism which ensures that an extrinsic is only valid within a certain period of the ongoing Polkadot lifetime. Extrinsics can also be immortal, as clarified in Section 9.3.2.2.

The mortality mechanism works with two related values:

  • \(M_{per}\): the period of validity in terms of block numbers from the block hash specified as \(H_h(B)\) in the payload (Definition 145). The requirement is \(M_{per} \geq 4\) and \(M_{per}\) must be a power of two, such as 32, 64, 128, etc.

  • \(M_{pha}\): the phase in the period that this extrinsic’s lifetime begins. This value is calculated with a formula and validators can use this value in order to determine which block hash is included in the payload. The requirement is \(M_{pha} < M_{per}\).

In order to tie a transaction’s lifetime to a certain block (\(H_i(B)\)) after it was issued, without wasting precious space for block hashes, block numbers are divided into regular periods and the lifetime is instead expressed as a "phase" (\(M_{pha}\)) from these regular boundaries:

\[M_{pha} = H_i(B)\ mod\ M_{per}\]

\(M_{per}\) and \(M_{pha}\) are then included in the extrinsic, as clarified in Definition 146, in the SCALE encoded form of \(T_{mor}\) (Section 9.3.2.2). Polkadot validators can use \(M_{pha}\) to figure out the block hash included in the payload, which will therefore result in a valid signature if the extrinsic is within the specified period or an invalid signature if the extrinsic "died".

9.3.2.1. Example

The extrinsic author chooses \(M_{per} = 256\) at block 10'000, resulting in \(M_{pha} = 16\). The extrinsic is then valid for blocks ranging from 10'000 to 10'256.

9.3.2.2. Encoding

\(T_{mor}\) refers to the SCALE encoded form of type \(M_{per}\) and \(M_{pha}\). \(T_{mor}\) is the size of two bytes if the extrinsic is considered mortal, or simply one byte with the value equal to zero if the extrinsic is considered immortal.

\[T_{mor} := Enc_{SC}(M_{per}, M_{pha})\]

The SCALE encoded representation of mortality \(T_{mor}\) deviates from most other types, as it’s specialized to be the smallest possible value, as described in Algorithm 25 and Algorithm 26.

If the extrinsic is immortal, specify a single byte with the value equal to zero.

Algorithm 25. Encode Mortality

Require: \(M_{per}, M_{pha}\)

1: return \(0\) if extrinsic is immortal

2: init \(factor = \text{Limit}(M_{per} >> 12, 1, \phi)\)

3: init \(left = \text{Limit}(\text{TZ}(M_{per}) - 1, 1, 15)\)

4: init \(right = \frac{M_{pha}}{factor} << 4\)

5: return \(left\ |\ right\)

Algorithm 26. Decode Mortality

Require: \(T_{mor}\)

1: return Immortal if \(T^{b0}_{mor} = 0\)

2: init \(enc = T^{b0}_{mor} + (T^{b1}_{mor} << 8)\)

3: init \(M_{per} = 2 << (enc\ mod\ (1 << 4))\)

4: init \(factor = \text{Limit}(M_{per} >> 12, 1, \phi)\)

5: init \(M_{pha} = (enc >> 4) \times factor\)

6: return \((M_{per}, M_{pha})\)
where
  • \(T^{b0}_{mor}\): the first byte of \(T_{mor}\).

  • \(T^{b1}_{mor}\): the second byte of \(T_{mor}\).

  • Limit(\(num\), \(min\), \(max\)): Ensures that \(num\) is between \(min\) and \(max\). If \(min\) or \(max\) is defined as \(\phi\), then there is no requirement for the specified minimum/maximum.

  • TZ(\(num\)): returns the number of trailing zeros in the binary representation of \(num\). For example, the binary representation of 40 is 0010 1000, which has three trailing zeros.

  • \(>>\): performs a binary right shift operation.

  • \(<<\): performs a binary left shift operation.

  • \(|\) : performs a bitwise OR operation.
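The following is a direct transcription of Algorithms 25 and 26 into Rust, a minimal sketch assuming a well-formed mortal era (a power-of-two period of at least 4 and \(M_{pha} < M_{per}\)):

/// Encode (M_per, M_pha) per Algorithm 25; `None` means immortal.
fn encode_mortality(mortality: Option<(u64, u64)>) -> Vec<u8> {
    match mortality {
        None => vec![0],
        Some((period, phase)) => {
            let factor = (period >> 12).max(1); // Limit(M_per >> 12, 1, φ)
            let left = (period.trailing_zeros() as u64 - 1).clamp(1, 15); // Limit(TZ(M_per) - 1, 1, 15)
            let right = (phase / factor) << 4;
            ((left | right) as u16).to_le_bytes().to_vec()
        }
    }
}

/// Decode T_mor per Algorithm 26; returns `None` for immortal.
fn decode_mortality(t_mor: &[u8]) -> Option<(u64, u64)> {
    if t_mor[0] == 0 {
        return None;
    }
    let enc = t_mor[0] as u64 + ((t_mor[1] as u64) << 8);
    let period = 2u64 << (enc % (1 << 4));
    let factor = (period >> 12).max(1);
    let phase = (enc >> 4) * factor;
    Some((period, phase))
}

fn main() {
    // The example from Section 9.3.2.1: period 256, phase 16.
    let enc = encode_mortality(Some((256, 16)));
    assert_eq!(decode_mortality(&enc), Some((256, 16)));
}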

10. Weights

10.1. Motivation

The Polkadot network, like any other permissionless system, needs to implement a mechanism to measure and limit usage in order to establish an economic incentive structure, to prevent network overload, and to mitigate DoS vulnerabilities. In particular, Polkadot enforces a limited time window for block producers to create a block, including limitations on block size, which can make the selection and execution of certain extrinsics too expensive and decelerate the network.

In contrast to some other systems, such as Ethereum, which implement fine-grained measurement for each low-level operation executed by smart contracts, known as gas metering, Polkadot takes a more relaxed approach by implementing a measuring system where the cost of a transaction (referred to as an "extrinsic") is determined before execution; this is known as the weight system.

The Polkadot weight system introduces a mechanism for block producers to measure the cost of running extrinsics and determine how "heavy" each is in terms of execution time. Within this mechanism, block producers can select a set of extrinsics and saturate the block to its fullest potential without exceeding any limitations (as described in Section 10.2.1). Moreover, the weight system can be used to calculate a fee for executing each extrinsic according to its weight (as described in Section 10.6.1).

Additionally, Polkadot introduces a specified block ratio (as defined in Section 10.2.1), ensuring that only a certain portion of the total block size gets used for regular extrinsics. The remaining space is reserved for critical, operational extrinsics required for the functionality by Polkadot itself.

To begin, we introduce in Section 10.2 the assumption upon which the Polkadot transaction weight system is designed. In Section 10.2.1, we discuss the limitation Polkadot needs to enforce on the block size. In Section 10.3, we describe in detail the procedure upon which the weight of any transaction should be calculated. In Section 10.5, we present how we apply this procedure to compute the weight of particular runtime functions.

10.2. Assumptions

In this section, we define the concept of weight and we discuss the considerations that need to be accounted for when assigning weight to transactions. These considerations are essential in order for the weight system to deliver its fundamental mission, i.e. the fair distribution of network resources and the prevention of network overload. In this regard, weights serve as an indicator of whether a block is considered full and how much space is left for remaining, pending extrinsics. Extrinsics which require too many resources are discarded. More formally, the weight system should:

  • prevent the block from being filled with too many extrinsics

  • avoid extrinsics whose execution takes too long, by assigning a transaction fee to each extrinsic proportional to its resource consumption.

These concepts are formalized in Definition 150 and Definition 153:

Definition 150. Block Length

For a block \(B\) with \(Head(B)\) and \(Body(B)\), the block length of \(B\), \(Len(B)\), is defined as the number of raw bytes of \(B\).

Definition 151. Target Time per Block

Targeted time per block, denoted by \(T(B)\), implies the number of seconds within which a new block should be produced by a validator. The transaction weights must consider \(T(B)\) in order to set restrictions on time-intensive transactions so that the block can be saturated to its fullest potential until \(T(B)\) is reached.

Definition 152. Available Block Ratio

The available block ratio reserved for normal extrinsics, denoted by \(R(B)\), is defined as the maximum weight of non-operational transactions in the Body of \(B\) divided by \(Len(B)\).

Definition 153. Block Limits

Polkadot block limits, as defined here, should be respected by each block producer for the produced block \(B\) to be deemed valid:

  • \(Len(B) \le 5 \times 1'024 \times 1'024 = 5'242'880\) Bytes

  • \(T(B) = 6\ seconds\)

  • \(R(B) \le 0.75\)

Definition 154. Weight Function

The Polkadot transaction weight function, denoted by \(\mathcal{W}\), is defined as follows:

\[\begin{aligned} \mathcal{W} &: \mathcal{E} \rightarrow \mathbb{N} \\ \mathcal{W} &: E \mapsto w \end{aligned}\]

where \(w\) is a non-negative integer representing the weight of the extrinsic \(E\). We define the weight of all inherent extrinsics as defined in Section 2.3.3 to be equal to 0. We extend the definition of the \(\mathcal{W}\) function to compute the weight of a block as the sum of the weights of all extrinsics it includes:

\[\begin{aligned} \mathcal{W} &: \mathcal{B}\rightarrow \mathbb{N} \\ \mathcal{W} &: B \mapsto \sum_{E\in B}\mathcal{W}(E) \end{aligned}\]

In the remainder of this section, we discuss the requirements with which the weight function needs to comply.

  • The computation of \(\mathcal{W}(E)\) must be determined before the execution of \(E\).

  • Due to the limited time window, computations of \(\mathcal{W}\) must be done quickly and consume few resources themselves.

  • \(\mathcal{W}\) must be self-contained and must not require I/O on the chain state. \(\mathcal{W}(E)\) must depend solely on the Runtime function representing \(E\) and its parameters.

Heuristically, "heaviness" corresponds to the execution time of an extrinsic. In that way, the \(\mathcal{W}\) value for various extrinsics should be proportional to their execution time. For example, if Extrinsic A takes three times longer to execute than Extrinsic B, then Extrinsic A should roughly weigh three times as much as Extrinsic B:

\[\mathcal{W}(A) \approx 3 \times \mathcal{W}(B)\]

Nonetheless, \(\mathcal{W}(E)\) can be manipulated depending on the priority of \(E\) the chain is supposed to endorse.

10.2.1. Limitations

In this section, we discuss how the limitations defined in Definition 153 translate into limitations on \(\mathcal{W}\). In order to translate those into concrete numbers, we need to identify an arbitrary maximum weight to which we scale all other computations. For that, we first define the block weight and then assume a maximum on it in Definition 155:

Definition 155. Block Weight

We define the block weight of block \(B\), formally denoted as \(\mathcal{W}(B)\), to be:

\[\mathcal{W}(B) = \sum^{|\mathcal{E}|}_{n = 0} \mathcal{W}(E_n)\]

We require that:

\[\mathcal{W}(B) < 2'000'000'000'000\]

The weights must fulfill the requirements as noted by the fundamentals and limitations, and can be assigned as the author sees fit. As a simple example, consider a maximum block weight of 1'000'000'000, an available ratio of 75%, and a targeted transaction throughput of 500 transactions: we could assign the (average) weight for each transaction at about 1'500'000. Block producers have an economic incentive to include as many extrinsics as possible (without exceeding limitations) into a block before reaching the targeted block time. Weights give indicators to block producers on which extrinsics to include in order to reach the block's fullest potential.
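The per-transaction figure in this example follows directly from the limits:

\[\frac{1'000'000'000 \times 0.75}{500} = 1'500'000\]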

10.3. Calculation of the weight function

In order to calculate the weight of block \(B\), \(\mathcal{W}(B)\), one needs to evaluate the weight of each transaction included in the block. Each transaction causes the execution of certain Runtime functions. As such, to calculate the weight of a transaction, those functions must be analyzed in order to determine parts of the code which can significantly contribute to the execution time and consume resources, such as loops, I/O operations, and data manipulation. Subsequently, the performance and execution time of each part is evaluated based on a variety of input parameters. Based on those observations, weights are assigned to Runtime functions or parameters which contribute to long execution times. These subcomponents of the code are discussed in Section 10.4.1.

The general algorithm to calculate \(\mathcal{W}(E)\) is described in the Section 10.4.

10.4. Benchmarking

Calculating the extrinsic weight solely based on theoretical complexity of the underlying implementation proves to be too complicated and unreliable at the same time. Certain decisions in the source code architecture, internal communication within the Runtime or other design choices could add enough overhead to make the asymptotic complexity practically meaningless.

On the other hand, benchmarking extrinsics in a black-box fashion (using random parameters) would most certainly miss corner cases and worst-case scenarios. Instead, we benchmark all available Runtime functions which are invoked in the course of execution of extrinsics with a large collection of carefully selected input parameters, and use the result of the benchmarking process to evaluate \(\mathcal{W}(E)\).

In order to select useful parameters, the Runtime functions have to be analyzed to fully understand which behaviors or conditions can result in expensive execution times, which is described in more detail in Section 10.4.1. Not every possible benchmarking outcome can be invoked by varying input parameters of the Runtime function. In some circumstances, preliminary work is required before a specific benchmark can be reliably measured, such as creating certain preexisting entries in the storage or other changes to the environment.

The Practical Examples (Section 10.5) covers the analysis process and the implementation of preliminary work in more detail.

10.4.1. Primitive Types

The Runtime reuses components, known as "primitives", to interact with the state storage. The execution cost of those primitives can be measured and a weight should be applied for each occurrence within the Runtime code.

For storage, Polkadot uses three different storage types across its modules, depending on the context:

  • Value: Operations on a single value. The final key-value pair is stored under the key:

        hash(module_prefix) + hash(storage_prefix)
  • Map: Operations on multiple values, datasets, where each entry has its corresponding, unique key. The final key-value pair is stored under the key:

        hash(module_prefix) + hash(storage_prefix) + hash(encode(key))
  • Double map: Just like Map, but uses two keys instead of one. This type is also known as "child storage", where the first key is the "parent key" and the second key is the "child key". This is useful in order to scope storage entries (child keys) under a certain context (parent key), which is arbitrary. Therefore, one can have separated storage entries based on the context. The final key-value pair is stored under the key:

        hash(module_prefix) + hash(storage_prefix)
          + hash(encode(key1)) + hash(encode(key2))
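As an illustration, a hedged sketch of deriving a Map storage key, assuming the Twox128 hasher for the two prefixes and Blake2-128 for the key; the concrete hasher choices are module-specific assumptions of this sketch:

use parity_scale_codec::Encode;
use sp_core::hashing::{blake2_128, twox_128};

// Follows the Map layout above: hash(module_prefix) + hash(storage_prefix)
// + hash(encode(key)).
fn map_storage_key<K: Encode>(module_prefix: &str, storage_prefix: &str, key: &K) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend(twox_128(module_prefix.as_bytes()));  // hash(module_prefix)
    out.extend(twox_128(storage_prefix.as_bytes())); // hash(storage_prefix)
    out.extend(blake2_128(&key.encode()));           // hash(encode(key))
    out
}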

Which storage type to use depends on the functionality of the Runtime module (or rather its sub-processes). In some cases, only a single value is required. In others, multiple values need to be fetched from or inserted into the database.

Those lower level types get abstracted over in each individual Runtime module using the decl_storage! macro. Therefore, each module specifies its own types that are used as input and output values. The abstractions do give indicators on what operations must be closely observed and where potential performance penalties and attack vectors are possible.

10.4.1.1. Considerations

The storage layout is mostly the same for every primitive type, primarily differentiated by using special prefixes for the storage key. Big differences arise in how the primitive types are used in the Runtime function, i.e. whether single values or entire datasets are being worked on. Single value operations are generally quite cheap and their execution time does not vary depending on the data being processed. However, excessive overhead can appear when I/O operations are executed repeatedly, such as in loops; especially when the number of loop iterations can be influenced by the caller of the function or by certain conditions in the state storage.

Maps, in contrast, have additional overhead when inserting or retrieving datasets, which vary in sizes. Additionally, the Runtime function has to process each item inside that list.

Indicators for performance penalties:

  • Fixed iterations and datasets - Fixed iterations and datasets can increase the overall cost of the Runtime functions, but the execution time does not vary depending on the input parameters or storage entries. A base Weight is appropriate in this case.

  • Adjustable iterations and datasets - If the amount of iterations or datasets depend on the input parameters of the caller or specific entries in storage, then a certain weight should be applied for each (additional) iteration or item. The Runtime defines the maximum value for such cases. If it doesn’t, it unconditionally has to and the Runtime module must be adjusted. When selecting parameters for benchmarking, the benchmarks should range from the minimum value to the maximum value, as described in Definition 156.

  • Input parameters - Input parameters that users pass on to the Runtime function can result in expensive operations. Depending on the data type, it can be appropriate to add additional weights based on certain properties, such as data size, assuming the data type allows varying sizes. The Runtime must define limits on those properties. If it doesn't, it unconditionally has to, and the Runtime module must be adjusted. When selecting parameters for benchmarking, the benchmarks should range from the minimum value to the maximum value, as described in Definition 156.

Definition 156. Maximum Value

What the maximum value should be really depends on the functionality that the Runtime function is trying to provide. If the choice for that value is not obvious, then it’s advised to run benchmarks on a big range of values and pick a conservative value below the targeted time per block limit as described in section Section 10.2.1.

10.4.2. Parameters

The input parameters vary greatly depending on the Runtime function and must therefore be carefully selected. The benchmarks should use input parameters which will most likely be used in regular cases, as intended by the authors, but must also consider worst-case scenarios and inputs which might decelerate or heavily impact the performance of the function. The input parameters should be randomized in order to cause various effects in behaviors on certain values, such as memory relocations and other outcomes that can impact performance.

It’s not possible to benchmark every single value. However, one should select a range of inputs to benchmark, spanning from the minimum value to the maximum value which will most likely exceed the expected usage of that function. This is described in more detail in Section 10.4.1.1. The benchmarks should run individual executions/iterations within that range, where the chosen parameters should give insight on the execution time. Selecting imprecise parameters or too extreme ranges might indicate an inaccurate result of the function as it will be used in production. Therefore, when a range of input parameters gets benchmarked, the result of each individual parameter should be recorded and optionally visualized, then the necessary adjustment can be made. Generally, the worst case scenario should be assigned as the weight value for the corresponding runtime function.

Additionally, given the distinction between theoretical and practical usage, the author reserves the right to make adjustments to the input parameters and assigned weights according to the observed behavior of the actual, real-world network.

10.4.2.1. Weight Refunds

When assigning the final weight, the worst-case scenario of each runtime function should be used. The runtime can then additionally "refund" the amount of weight which was overestimated once the runtime function is actually executed.

The Polkadot runtime only refunds weight if the difference between the assigned weight and the actual weight calculated during execution is greater than 20%.
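A minimal sketch of that refund rule (names are illustrative):

// Charge the actual weight only when the overestimate exceeds 20%
// of the assigned weight; otherwise charge the assigned weight.
fn charged_weight(assigned: u64, actual: u64) -> u64 {
    let overestimate = assigned.saturating_sub(actual);
    if 5 * overestimate > assigned {
        actual // refund the difference
    } else {
        assigned
    }
}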

10.4.3. Storage I/O cost

It is advised to benchmark the raw I/O operations of the database and assign "base weights" for each I/O operation type, such as insertion, deletion, querying, etc. When a runtime function is executed, the runtime can then add those base weights of each used operation in order to calculate the final weight.

10.4.4. Environment

The benchmarks should be executed on clean systems without interference of other processes or software. Additionally, the benchmarks should be executed on multiple machines with different system resources, such as CPU performance, CPU cores, RAM and storage speed.

10.5. Practical examples

This section walks through Runtime functions available in the Polkadot Runtime to demonstrate the analysis process as described in Section 10.4.1.

In order for certain benchmarks to produce conditions where resource heavy computation or excessive I/O can be observed, the benchmarks might require some preliminary work on the environment, since those conditions cannot be created with simply selected parameters. The analysis process shows indicators on how the preliminary work should be implemented.

10.5.1. Practical Example #1: request_judgement

In Polkadot, accounts can save information about themselves on-chain, known as the "Identity Info". This includes information such as display name, legal name, email address and so on. Polkadot offers a set of trusted registrars, entities elected by a Polkadot public referendum, which can verify the specified contact addresses of the identities, such as Email, and vouch on whether the identity actually owns those accounts. This can be achieved, for example, by sending a challenge to the specified address and requesting a signature as a response. The verification is done off-chain, while the final judgement is saved on-chain, directly in the corresponding Identity Info. It's also noteworthy that Identity Info can contain additional fields, set manually by the corresponding account holder.

Information such as legal name must be verified by ID card or passport submission.

The function request_judgement from the identity pallet allows users to request judgement from a specific registrar.

(func $request_judgement (param $req_index int) (param $max_fee int))
  • req_index: the index which is assigned to the registrar.

  • max_fee: the maximum fee the requester is willing to pay. The judgement fee varies for each registrar.

Studying this function reveals multiple design choices that can impact performance, as the following analysis shows.

10.5.1.1. Analysis

First, it fetches a list of current registrars from storage and then searches that list for the specified registrar index.

let registrars = <Registrars<T>>::get();
let registrar = registrars.get(reg_index as usize).and_then(Option::as_ref)
  .ok_or(Error::<T>::EmptyIndex)?;

Then, it searches for the Identity Info from storage, based on the sender of the transaction.

let mut id = <IdentityOf<T>>::get(&sender).ok_or(Error::<T>::NoIdentity)?;

The Identity Info contains all fields that have data in them, set by the corresponding owner of the identity, in an ordered form. It then proceeds to search for the specific field type that will be inserted or updated, such as the email address. If the entry can be found, the corresponding value is set to the value passed on as the function parameter (assuming the judgement is not "sticky", which implies it cannot be changed). If the entry cannot be found, the value is inserted at the index where a matching element can be inserted while maintaining sorted order. This results in memory reallocation, which increases resource consumption.

match id.judgements.binary_search_by_key(&reg_index, |x| x.0) {
  Ok(i) => if id.judgements[i].1.is_sticky() {
    Err(Error::<T>::StickyJudgement)