Solidity: ABI encoding explained

Apr 11, 2024 by Arnaud Stoz | 1992 views

https://cylab.be/blog/334/solidity-abi-encoding-explained

If you have ever been curious about how Ethereum smart contracts work under the hood, or have taken part in a CTF where you had to exploit a weakness in a smart contract, you have probably stumbled upon the Solidity ABI encoding page. Although it is the reference document, it is not easy to read, even though the rules themselves are not really difficult. Let’s review how the encoding works with the help of a few examples.

Disclaimer

This post does not cover every type and tries to avoid mathematical expressions as much as possible. If you want a more complete description, please refer to the official documentation.

Data type

In Solidity there are two categories of data types when talking about encoding:

Dynamic: intuitively, a dynamic type is any type whose encoded length can vary. This category contains the following types:
- bytes
- string
- Any unsized array (T[] where T can be any type)
- A fixed-size array if its element type is dynamic (T[k] where T is a dynamic type)
- Any tuple if at least one of its elements is a dynamic type
Static: any type not mentioned above is considered a static type, for example:
- uint256
- bool
- ….
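The classification above can be sketched as a small recursive check on a type written as a string. This is only an illustration in Python: the helper names split_tuple and is_dynamic are made up for this post and are not part of any library, and the sketch assumes well-formed type strings such as "uint256[]" or "(bytes,bool)".

```python
def split_tuple(inner: str) -> list:
    """Split 'a,(b,c),d' on top-level commas only (hypothetical helper)."""
    parts, depth, current = [], 0, ""
    for ch in inner:
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            current += ch
    if current:
        parts.append(current)
    return parts


def is_dynamic(type_str: str) -> bool:
    """Return True if the type is dynamic according to the list above."""
    t = type_str.replace(" ", "")
    if t in ("bytes", "string"):
        return True
    if t.endswith("[]"):
        # Unsized array: always dynamic, whatever the element type.
        return True
    if t.endswith("]"):
        # Fixed-size array T[k]: dynamic only if T is dynamic.
        return is_dynamic(t[: t.rindex("[")])
    if t.startswith("(") and t.endswith(")"):
        # Tuple: dynamic if at least one element is dynamic.
        return any(is_dynamic(p) for p in split_tuple(t[1:-1]))
    # Everything else (uint256, bool, bytes32, address, ...) is static.
    return False


for example in ["uint256", "bytes", "string[2]", "(bytes,bool,uint256[])"]:
    print(example, is_dynamic(example))
```

Note that "bytes3" is static while "bytes" is dynamic, which is why the check compares the whole name instead of looking at a prefix.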

Static type encoding

Static type encoding is relatively intuitive. We will only describe the basic types here; if you want a more complete definition, please refer to the official documentation.

uint: the big-endian hexadecimal representation of the integer, padded with leading zeros to a length of 32 bytes. enc(72) = 0x0000000000000000000000000000000000000000000000000000000000000048
bool (encoded like a uint8, with 1 for true and 0 for false): enc(true) = 0x0000000000000000000000000000000000000000000000000000000000000001
bytes of fixed size: the sequence of bytes followed by trailing zero bytes to reach a length of 32 bytes. enc(0x123456) = 0x1234560000000000000000000000000000000000000000000000000000000000
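The three rules above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the function names enc_uint, enc_bool and enc_fixed_bytes are chosen for this post and do not come from an existing package.

```python
def enc_uint(value: int) -> bytes:
    """Unsigned integer as a 32-byte big-endian word (zeros on the left)."""
    return value.to_bytes(32, "big")


def enc_bool(value: bool) -> bytes:
    """A bool is encoded like an unsigned integer: 1 for true, 0 for false."""
    return enc_uint(1 if value else 0)


def enc_fixed_bytes(value: bytes) -> bytes:
    """Fixed-size bytes (bytes1..bytes32): zeros are added on the right."""
    if not 1 <= len(value) <= 32:
        raise ValueError("fixed-size bytes must hold between 1 and 32 bytes")
    return value.ljust(32, b"\x00")


print("0x" + enc_uint(72).hex())
print("0x" + enc_bool(True).hex())
print("0x" + enc_fixed_bytes(bytes.fromhex("123456")).hex())
```

The only difference between integers and fixed-size bytes is the side on which the zeros are added: integers are padded on the left, bytes on the right.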

Dynamic type encoding

The encoding of dynamic types works differently because, by definition, the size of the data is not known in advance.

bytes

For bytes of dynamic size, the encoding is simply the encoding of the length in bytes (as a uint256) followed by the bytes themselves, with trailing zeros added so that their length is a multiple of 32.

let bytes X = 0x12ab34
enc(X) = enc(3)enc(bytes3(0x12ab34))
enc(3) = 0x0000000000000000000000000000000000000000000000000000000000000003
enc(bytes3(0x12ab34)) = 0x12ab340000000000000000000000000000000000000000000000000000000000

Putting it all together:

enc(X) = 0x000000000000000000000000000000000000000000000000000000000000000312ab340000000000000000000000000000000000000000000000000000000000
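The same computation can be sketched in Python. The names enc_uint and enc_bytes are illustrative helpers written for this post, not functions from a library.

```python
def enc_uint(value: int) -> bytes:
    """Unsigned integer as a 32-byte big-endian word."""
    return value.to_bytes(32, "big")


def enc_bytes(value: bytes) -> bytes:
    """Dynamic bytes: length word, then the data padded on the right."""
    # Round the data length up to the next multiple of 32.
    padded_len = (len(value) + 31) // 32 * 32
    return enc_uint(len(value)) + value.ljust(padded_len, b"\x00")


x = bytes.fromhex("12ab34")
print("0x" + enc_bytes(x).hex())
```

With this sketch, a value of exactly 32 bytes gets no extra padding, and a value of 33 bytes takes two 32-byte words after the length.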

Unsized array

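The encoding of an unsized array follows the same pattern as for bytes: the first 32 bytes are the encoding of the length (the number of elements), followed by the encoding of each element. This plain concatenation holds when the elements are static, as in the example below; when the elements are dynamic, they are laid out like the elements of a tuple (see the next section).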

let uint256[] X = [12, 13, 14]
enc(X) = enc(3)enc(12)enc(13)enc(14)
enc(3) = 0x0000000000000000000000000000000000000000000000000000000000000003
enc(12) = 0x000000000000000000000000000000000000000000000000000000000000000c
enc(13) = 0x000000000000000000000000000000000000000000000000000000000000000d
enc(14) = 0x000000000000000000000000000000000000000000000000000000000000000e

Putting it all together:

enc(X) = 0x0000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000000000000000c000000000000000000000000000000000000000000000000000000000000000d000000000000000000000000000000000000000000000000000000000000000e
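As a quick check, here is a Python sketch of the same rule for an array of uint256. The helper names are made up for this post, and the sketch only handles static elements.

```python
def enc_uint(value: int) -> bytes:
    """Unsigned integer as a 32-byte big-endian word."""
    return value.to_bytes(32, "big")


def enc_uint_array(values) -> bytes:
    """uint256[]: number of elements, then each element as one word."""
    return enc_uint(len(values)) + b"".join(enc_uint(v) for v in values)


encoded = enc_uint_array([12, 13, 14])
# Print one 32-byte word per line to make the structure visible.
for i in range(0, len(encoded), 32):
    print(encoded[i:i + 32].hex())
```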

Tuple

The tuple is the most complex structure to encode. Each element of the tuple is encoded in order, according to the following pattern.

if this is a dynamic type, we first encode an offset that specifies where the encoded data can be found, and then put the data (encoded following the rules defined above) at that offset
if this is a static type, we encode the value directly.

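Note that a decoder accepts any offset as long as the data it points to does not overlap with other data. In practice, the data of the dynamic elements is placed right after the offsets and static values, in order.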

Let’s do a quick example to clarify this: imagine that the tuple (bytes, bool, uint[]) needs to be encoded with the value (0x1234, true, [1,2,3]). We have two dynamic types (bytes and uint[]) and one static type (bool).

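The encoding will look like the following: three 32-byte slots (the offset of the bytes data, the encoded bool, the offset of the uint[] data), followed by the data of the two dynamic elements in the same order.

Based on this layout we can calculate the encoding of this structure, with offsets counted in bytes from the start of the tuple encoding: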

offset of bytes = 0x0000000000000000000000000000000000000000000000000000000000000060
enc(bool) = 0x0000000000000000000000000000000000000000000000000000000000000001
offset of start uint[] = 0x00000000000000000000000000000000000000000000000000000000000000a0
enc(0x1234) = enc(2)enc(bytes2(0x1234)) = 0x00000000000000000000000000000000000000000000000000000000000000021234000000000000000000000000000000000000000000000000000000000000
enc([1,2,3]) = enc(3)enc(1)enc(2)enc(3) = 0x0000000000000000000000000000000000000000000000000000000000000003000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000000000000000000000020000000000000000000000000000000000000000000000000000000000000003

The three slots take 3 × 32 = 96 bytes, so the bytes data starts at offset 0x60. That data takes 64 bytes (0x40), so the uint[] data starts at 0x60 + 0x40 = 0xa0. The complete encoding is the concatenation of these five values, in the order listed.
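The offset computation is the part that is easiest to get wrong by hand, so here is a Python sketch of it. It assumes that each element has already been encoded and is passed together with a flag telling whether its type is dynamic. All function names are illustrative and written for this post.

```python
def enc_uint(value: int) -> bytes:
    return value.to_bytes(32, "big")


def enc_bool(value: bool) -> bytes:
    return enc_uint(1 if value else 0)


def enc_bytes(value: bytes) -> bytes:
    padded_len = (len(value) + 31) // 32 * 32
    return enc_uint(len(value)) + value.ljust(padded_len, b"\x00")


def enc_uint_array(values) -> bytes:
    return enc_uint(len(values)) + b"".join(enc_uint(v) for v in values)


def enc_tuple(parts) -> bytes:
    """parts is a list of (is_dynamic, already_encoded_bytes) pairs."""
    # A dynamic element occupies one 32-byte offset slot in the first part;
    # a static element occupies its full encoding there.
    head_size = sum(32 if is_dyn else len(data) for is_dyn, data in parts)
    head, tail = b"", b""
    for is_dyn, data in parts:
        if is_dyn:
            # The offset points past the slots and past the data already placed.
            head += enc_uint(head_size + len(tail))
            tail += data
        else:
            head += data
    return head + tail


encoded = enc_tuple([
    (True, enc_bytes(bytes.fromhex("1234"))),
    (False, enc_bool(True)),
    (True, enc_uint_array([1, 2, 3])),
])
for i in range(0, len(encoded), 32):
    print(encoded[i:i + 32].hex())
```

The sketch always places the dynamic data in order right after the slots, which gives the smallest offsets that do not overlap.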

Function call

A function call is made by first specifying the function selector and then appending the encoding of the arguments, following the rules explained above. Note that the arguments are always treated as a tuple.

The function selector is simply the first four bytes of the Keccak-256 hash of the function signature. The signature is the function name followed by the parenthesized list of parameter types, separated by commas and without spaces.

Imagine the following function: function bar(bytes2[2] memory, bool y) public pure {}. Using an online tool like this one we can calculate the Keccak-256 hash of this function signature: keccak256(bar(bytes2[2],bool)) = 2e91aa30111cff884664745eaee00a00e8ea3ed23683a23e1530af9dc231f652.

We then take only the first 4 bytes to get the function selector. FunctionSelector = 0x2e91aa30

Let’s now imagine we are calling this function with the arguments ([0x1234, 0xabcd], false). As both arguments are static types, the encoding is straightforward: the fixed-size array is encoded as its two elements one after the other, each bytes2 value being followed by trailing zeros.

enc([0x1234, 0xabcd]) = 0x1234000000000000000000000000000000000000000000000000000000000000abcd000000000000000000000000000000000000000000000000000000000000
enc(false) = 0x0000000000000000000000000000000000000000000000000000000000000000

So the complete calldata looks like:

functionSelector enc(args) =
0x2e91aa301234000000000000000000000000000000000000000000000000000000000000abcd0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

Please note that we applied the tuple encoding rule to the arguments. If any of the arguments is of a dynamic type, you have to follow the procedure for encoding a tuple containing dynamic types.
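Assembling the calldata can be sketched in Python as well. The selector below is simply the value quoted in this post and is not recomputed here: Keccak-256 is not available in the Python standard library (hashlib.sha3_256 is a different function), so a third-party package would be needed for that step. The helper names are made up for this illustration.

```python
def enc_uint(value: int) -> bytes:
    return value.to_bytes(32, "big")


def enc_bool(value: bool) -> bytes:
    return enc_uint(1 if value else 0)


def enc_fixed_bytes(value: bytes) -> bytes:
    """Fixed-size bytes (bytes1..bytes32): zeros are added on the right."""
    return value.ljust(32, b"\x00")


def build_calldata(selector: bytes, encoded_args: bytes) -> bytes:
    """Calldata is the 4-byte selector followed by the encoded arguments."""
    if len(selector) != 4:
        raise ValueError("a function selector is exactly 4 bytes long")
    return selector + encoded_args


# Assumption: selector taken as-is from the text above.
selector = bytes.fromhex("2e91aa30")

# Both arguments are static, so they are simply concatenated.
args = (
    enc_fixed_bytes(bytes.fromhex("1234"))
    + enc_fixed_bytes(bytes.fromhex("abcd"))
    + enc_bool(False)
)

calldata = build_calldata(selector, args)
print("0x" + calldata.hex())
```

Because of the 4-byte selector, the arguments in calldata are not aligned on 32-byte boundaries: the first argument word starts at byte 4, the second at byte 36, and so on.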

Conclusion

Now you have a better understanding of how encoding works in Solidity and how smart contract function calls work. You can now detect flawed logic in smart contracts and make your own smart contracts safer!

This blog post is licensed under CC BY-SA 4.0