Endianness

Endianness refers to the ordering of components in a representation of an entity. For our purposes, we use it to refer to the ordering of bytes in multi-byte integers.

Overview

Let's look at how memory allocation and storage of multi-byte integers work. For illustration, we consider 32-bit integers. When you wish to store a 32-bit integer, the program allocates 4 bytes worth of memory.

We now have a choice of how to store the integer: the 4 bytes of the integer can be mapped to the 4 bytes of memory in any order, giving 4! = 24 possible orderings. As long as the same ordering is used when storing and retrieving every integer, the program functions without error.

The most popular endian systems in use today are the big endian and little endian systems.

Big endian

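Big endian stores bytes from most significant to least significant, so the most significant byte sits at the lowest memory address. A 32-bit integer 0x12345678 would be stored as (addresses increasing from left to right)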

+++++++++++++++++++++++++++++
| 0x12 | 0x34 | 0x56 | 0x78 |
+++++++++++++++++++++++++++++

Example systems: TCP, IBM z/Architecture

Little endian

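Little endian stores bytes from least significant to most significant, so the least significant byte sits at the lowest memory address. A 32-bit integer 0x12345678 would be stored as (addresses increasing from left to right)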

+++++++++++++++++++++++++++++
| 0x78 | 0x56 | 0x34 | 0x12 |
+++++++++++++++++++++++++++++

Example systems: Intel/AMD x86-64, RISC-V

Why should I care?

Endianness matters when serializing and deserializing data sent over a network between systems with different byte orderings. Consider the following serialization and deserialization steps, performed on a big endian and a little endian system respectively:

// Serialization in big endian system
unsigned char buf[4]; uint32_t i = 0xff;
memcpy(buf, &i, 4);

// Transmitted through the network as | 0x00 | 0x00 | 0x00 | 0xff |

// Deserialization in little endian system
unsigned char buf[4] = { 0x00, 0x00, 0x00, 0xff }; uint32_t i;
memcpy(&i, buf, 4);

// i now contains | 0x00 | 0x00 | 0x00 | 0xff | in memory
// which is 0xff000000 in the little endian system
// Very different from 0xff!!!

A mismatch in endianness causes the value read to be wildly different from the value written. Best practice is to fix one endianness for the transmitted data (called network byte order, conventionally big endian) and have each client convert to and from its own endianness (called host byte order).