Erasure coding metrics

This document presents a reliability and a redundancy metric that can help tune the erasure coding mechanism to ensure reliability, efficiency and minimize reconstruction failures in the system.

Parameters

Name Type Description
honestPercent uint Fraction of chunks assumed to be honest - either sent by honest parents, or (unlikely) sent in a truly random fashion by malicious parents
totChunks uint Total number of chunks generated (by erasure coding) from the underlying message
reqChunks uint Number of chunks required to reconstruct the underlying message
rcvChunks uint Number of honest chunks received from parents in aggregate

Reliability metric

The reliability metric captures the (inverse of) the failure rate in the erasure coding system defined by the above parameters.

Combinatorics

Let nCr(n, r) denote the combinatorics operator. We need to calculate the probability of reconstructing the final message given rcvChunks truly random chunks in a reqChunks-out-of-totChunks erasure coded system. We ignore the fact that a honest peer wouldn't send duplicate chunks and model this as the probability of getting at least reqChunks unique chunks in a random sequence of rcvChunks size.

Define

  • h(n): In a sequence of rcvChunks, the number of sequences constructible with exactly n unique chunk types given n total chunk types
  • s(n): In a sequence of rcvChunks, the number of sequences constructible with exactly n unique chunk types given totChunks chunk types
  • r: Reliability metric which we wish to compute

We compute the reliability metric iteratively as follows

  • To compute h(1), we can construct only one unique sequence given one unique chunk type.
h(1) = 1
  • To compute h, we construct all sequences possible with at most n chunk types and remove sequences with less than n unique chunk types, computed by iteratively choosing k out of n possible chunks and subtracting sequences with those k chunks. This leaves sequences with exactly n chunk types.
h(n) = (n ** rcvChunks)
        - (nCr(n, n - 1) * h(n - 1))
        - (nCr(n, n - 2) * h(n - 2))
        ...
        - (nCr(n, 1) * h(1))
  • To compute s, we choose n chunk types out of totChunks and construct sequences with them.
s(n) = nCr(totChunks, n) * h(n)
  • We finally compute r by computing the fraction of sequences that have at most reqChunks - 1 chunk types and subtracting it from 1.
r = 1 - (s(1) + s(2) + ... + s(reqChunks - 1))/(totChunks ** rcvChunks)

Python reference code

import math
def nCr(n, r):
  return math.factorial(n)/math.factorial(r)/math.factorial(n-r)

h = [0]
for i in range(1, reqChunks):
  h += [i**rcvChunks]
  for j in range(1, i):
    h[i] -= nCr(i,j)*h[j]
  h[0] += nCr(totChunks,i)*h[i]

reliability = 1 - h[0]/totChunks**rcvChunks

print(reliability)

Redundancy metric

The redundancy metric captures the bandwidth multiple that nodes spend in aggregate in the erasure coding system defined by the above parameters.

This is simply given by (rcvChunks / reqChunks) / honestPercent assuming that in the erasure coding system, the total required size of chunks is pretty close to the original message size.