In the following, we will walk through the process of creating an
encoder and a decoder for a single buffer of data.

Kodo implements a number of different erasure correcting codes. In
this example, we have chosen to use a particular version of a RLNC
(Random Linear Network Code).

The Full RLNC is one of the most common RLNC variants, and provides
several of the advantages that RLNCs have over traditional erasure
correcting codes. However, for the time being we will just use it as
a standard erasure correcting code, namely to encode and decode some
data.

In the following, we will go through three of the key parameters to
choose when configuring an erasure correcting code:

The number of symbols

The symbol_size

The finite field, or more specifically the field size used.

In general if a block of data is to be encoded, it’s split into a number of
symbols of a specific size. If you multiply the number of
symbols with the symbol size, you get the total amount of data in bytes that
will be either encoded or decoded per generation.

Note

Sizes in Kodo are always measured in bytes. So if you see a
variable or function name that includes the word “size”,
bytes is the unit used.

Note

In network applications, a single symbol typically
corresponds to a single packet (for example, an UDP
datagram).

Denotes the number of symbols in a block/generation. Increasing this
number will have the following effects:

The computational complexity will increase, and can therefore slow
down the encoding/decoding.

For some variants of RLNC, the per-packet
overhead will increase due to added coding coefficients.

The per-symbol decoding delay will become larger. The reason for
this is that when we increase the number of symbols that are encoded
the decoder has to receive more symbols before decoding.

The protocol complexity can be decreased. If the number of symbols
is increased so that all the data which is to be encoded can fit in
a single generation, the protocol will only have to handle a single
generation. If multiple generations are needed, the receivers will
have to tell from which generations the server should send data, and
hence increasing the complexity of the protocol.

The need for a high field size decreases (which is an advantage
since, in short, a higher field size leads to higher complexity).
The reason for this is that when the decoder is only missing a few
symbols, the chance for it to receive a useful encoded symbol
decreases. This reduction depends on the field size (higher is
better). You pay this price at each generation, but if a generation
contains many symbols this issue becomes smaller. Furthermore with
many symbols, the generations will be bigger, and hence fewer
generations are needed.

Denotes the size of each symbol in bytes. The choice of symbol size
typically depends on the application. For network applications we may
choose the symbol size according to the network MTU (Maximum Transfer
Unit) so that datagrams do not get fragmented as they traverse the
network. In those cases symbols sizes are typically around 1300-1400
bytes. On the other hand for storage applications the symbol size is
typically much larger, e.g., in the order of several megabytes.

// Copyright Steinwurf ApS 2016.// Distributed under the "STEINWURF EVALUATION LICENSE 1.0".// See accompanying file LICENSE.rst or// http://www.steinwurf.com/licensing#include<cstdint>#include<algorithm>#include<iostream>#include<vector>#include<kodocpp/kodocpp.hpp>intmain(){//! [0]// Set the number of symbols (i.e. the generation size in RLNC// terminology) and the size of a symbol in bytesuint32_tmax_symbols=16;uint32_tmax_symbol_size=1400;//! [1]// In the following we will make an encoder/decoder factory.// The factories are used to build actual encoders/decoderskodocpp::encoder_factoryencoder_factory(kodocpp::codec::full_vector,kodocpp::field::binary8,max_symbols,max_symbol_size);kodocpp::encoderencoder=encoder_factory.build();kodocpp::decoder_factorydecoder_factory(kodocpp::codec::full_vector,kodocpp::field::binary8,max_symbols,max_symbol_size);kodocpp::decoderdecoder=decoder_factory.build();//! [2]std::vector<uint8_t>payload(encoder.payload_size());std::vector<uint8_t>data_in(encoder.block_size());// Just for fun - fill the data with random datastd::generate(data_in.begin(),data_in.end(),rand);//! [3]// Assign the data buffer to the encoder so that we may start// to produce encoded symbols from itencoder.set_const_symbols(data_in.data(),encoder.block_size());//! [4]// Create a buffer which will contain the decoded data, and we assign// that buffer to the decoderstd::vector<uint8_t>data_out(decoder.block_size());decoder.set_mutable_symbols(data_out.data(),decoder.block_size());//! [5]uint32_tencoded_count=0;while(!decoder.is_complete()){// Encode a packet into the payload bufferuint32_tbytes_used=encoder.write_payload(payload.data());std::cout<<"Bytes used = "<<bytes_used<<std::endl;++encoded_count;// Pass that packet to the decoderdecoder.read_payload(payload.data());}std::cout<<"Encoded count = "<<encoded_count<<std::endl;//! [6]// Check if we properly decoded the dataif(data_in==data_out){std::cout<<"Data decoded correctly"<<std::endl;}//! [7]return0;}

Initially we define the two parameters number of symbols and the
symbol_size.

1
2
3
4

// Set the number of symbols (i.e. the generation size in RLNC// terminology) and the size of a symbol in bytesuint32_tmax_symbols=16;uint32_tmax_symbol_size=1400;

In the given example, the codec and the finite field types are specified
by first two parameters of encoder and decoder factories. Since
fast finite field computations are not only useful in erasure
correcting codes, this part of the functionality has been split into a
second library called Fifi. The
Fifi library defines a number of different finite fields such as
binary, binary4, binary8. To switch to a different field you
can simple replace field::binary8 with one of the other field types,
e.g. field::binary.

Once the key parameters have been selected, we are ready to create an
encoder and a decoder to perform the actual coding.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17

// In the following we will make an encoder/decoder factory.// The factories are used to build actual encoders/decoderskodocpp::encoder_factoryencoder_factory(kodocpp::codec::full_vector,kodocpp::field::binary8,max_symbols,max_symbol_size);kodocpp::encoderencoder=encoder_factory.build();kodocpp::decoder_factorydecoder_factory(kodocpp::codec::full_vector,kodocpp::field::binary8,max_symbols,max_symbol_size);kodocpp::decoderdecoder=decoder_factory.build();

We use the encoder_factory and decoder_factory objects to configure
and build encoders and decoders. Invoking the build() function will return
a new encoder or decoder object. These objects actually contain C++ smart
pointers under the hood, so you can pass them around to other fuctions and
the underlying objects will be deleted when there are no more references
to them.

Before the encoding and decoding of data can begin, we define two buffers.

1
2
3
4
5

std::vector<uint8_t>payload(encoder.payload_size());std::vector<uint8_t>data_in(encoder.block_size());// Just for fun - fill the data with random datastd::generate(data_in.begin(),data_in.end(),rand);

The first buffer is the payload buffer. Once we start coding, this
buffer will contain a single encoded symbol which we can “transmit” to
the decoder. Besides the encoded symbol data, the payload buffer will
also contain a header that describes how the symbol was
encoded. The format and size of this data depends on the chosen
erasure correcting code. Fortunately we don’t have to worry about
that as long as we provide a large enough buffer. The needed size of
the buffer is returned by the payload_size function.

The second buffer, data_in, contains the data we wish to
encode. As mentioned earlier the size of this buffer is the number of
symbols multiplied by the symbol size. For convenience we can use the
block_size function to get this value. In this case we are not
encoding real data, so we just fill the data_in buffer with some
randomly generated data.

Once the buffers have been created we can call the set_const_symbols
function on the encoder to specify which buffer it should encode.

1
2
3

// Assign the data buffer to the encoder so that we may start// to produce encoded symbols from itencoder.set_const_symbols(data_in.data(),encoder.block_size());

We also create a third buffer, data_out, where the decoder will store
the decoded data. The size of this buffer is the same as the size of
data_in. Again, we call the block_size function to get the proper size.
We do not have to initialize this buffer, we just pass it as a parameter
of the set_mutable_symbols function, and the decoder will take control
of the buffer.

1
2
3
4

// Create a buffer which will contain the decoded data, and we assign// that buffer to the decoderstd::vector<uint8_t>data_out(decoder.block_size());decoder.set_mutable_symbols(data_out.data(),decoder.block_size());

Finally we have everything ready to start the coding. This is done in
a loop until the decoding is completed.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15

uint32_tencoded_count=0;while(!decoder.is_complete()){// Encode a packet into the payload bufferuint32_tbytes_used=encoder.write_payload(payload.data());std::cout<<"Bytes used = "<<bytes_used<<std::endl;++encoded_count;// Pass that packet to the decoderdecoder.read_payload(payload.data());}std::cout<<"Encoded count = "<<encoded_count<<std::endl;

We use a variable encoded_count to keep track of the number of
symbols we have encoded. When we finish, this number should match the
symbols as all data is passed safely to the decoder, we shall
later see examples where this is not necessarily the case. The loop
stops when the decoders is_complete function returns true. This
happens when all symbols have been decoded. The encoder encodes into
the payload buffer and returns the number of bytes used during the
encoding, and hence the number of bytes we would have to transmit
over the network. The payload is passed to the decoder which decodes
the encoded data and thereby increases its rank.

When the decoding process is completed, the data will be available in
the data_out buffer, so we check that its contents match the original
data be in data_in.

1
2
3
4
5

// Check if we properly decoded the dataif(data_in==data_out){std::cout<<"Data decoded correctly"<<std::endl;}