Assembly Indices

r0qs edited this page Mar 28, 2025 · 1 revision

Subassembly indices are assigned based on a depth-first traversal (DFS) of the root object within the Yul object structure. This enumeration reflects the path of each subassembly in the .data section of the assembly's JSON output.

Indices for Indirect Subassemblies

When encoding the path to a subassembly, a reference to a subobject that is not a direct child of the current object is encoded as a negative 64-bit index: the xth such reference (counting from 0) receives the value 2^64 - 1 - x. For example, the hex value 0xffffffffffffffff is the unsigned 64-bit (uint64_t) representation of -1. This scheme allows both direct children and deeper references to be represented using a single integer field.

  • Direct Subassemblies:

    • Each object has a set of direct subassemblies, indexed sequentially starting from 0.
  • Indirect Subassemblies (non-direct children):

    • If an object is not a direct subassembly of another, it receives a negative index following this formula:
      y = 2^{64} - 1 - x
      
      where:
      • x is the number of indirect subassemblies already assigned an index (0 for the first one, 1 for the second, and so on).
      • y is the computed negative index.
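The formula can be checked with a few lines of Python. This is only a sketch of the arithmetic, not the compiler's code: the name `indirect_index` is hypothetical, and it assumes x counts from 0, as the example values on this page suggest.

```python
UINT64_MAX = 2**64 - 1  # 0xffffffffffffffff

def indirect_index(x: int) -> int:
    """Return y = 2^64 - 1 - x (hypothetical helper, assumes x starts at 0)."""
    if not 0 <= x <= UINT64_MAX:
        raise ValueError("x must fit in an unsigned 64-bit integer")
    return UINT64_MAX - x

for x in range(3):
    y = indirect_index(x)
    # y is congruent to -(x + 1) modulo 2^64, hence the "negative index" wording
    assert y == (-(x + 1)) % 2**64
    print(x, hex(y))
# 0 0xffffffffffffffff
# 1 0xfffffffffffffffe
# 2 0xfffffffffffffffd
```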

For example, consider the following Yul contract:

object "A" {
  code {
    sstore(0, dataoffset("B.C.D"))
    sstore(1, datasize("B.E"))
  }
  object "B" {
    code {
        mstore(0, datasize("C"))
    }
    object "C" {
      code {}
      object "D" {
        code { invalid() }
      }
    }
    object "E" {
      code { revert(0,0) }
    }
  }
}

And respective (simplified) assembly JSON output:

{
  ".code": [
    {
      "name": "PUSH [$]",
      "value": "000000000000000000000000000000000000000000000000ffffffffffffffff"
    },
    {
      "name": "PUSH",
      "value": "0"
    },
    {
      "name": "SSTORE"
    },
    {
      "name": "PUSH #[$]",
      "value": "000000000000000000000000000000000000000000000000fffffffffffffffe"
    },
    {
      "name": "PUSH",
      "value": "1"
    },
    {
      "name": "SSTORE"
    },
    {
      "name": "STOP"
    }
  ],
  ".data": {
    "0": {
      ".code": [
        {
          "name": "PUSH #[$]",
          "value": "0000000000000000000000000000000000000000000000000000000000000000"
        },
        {
          "name": "PUSH",
          "value": "0"
        },
        {
          "name": "MSTORE"
        },
        {
          "name": "STOP"
        }
      ],
      ".data": {
        "0": {
          ".code": [
            {
              "name": "STOP"
            }
          ],
          ".data": {
            "0": {
              ".code": [
                {
                  "name": "INVALID"
                }
              ]
            }
          }
        },
        "1": {
          ".code": [
            {
              "name": "PUSH",
              "value": "0"
            },
            {
              "name": "DUP1"
            },
            {
              "name": "REVERT"
            }
          ]
        }
      }
    }
  },
  "sourceList": []
}

All paths, relative to A as the root, are:

  • B: [0] through A
  • C: [0 0] through B.C
  • D: [0 0 0] through B.C.D
  • E: [0 1] through B.E
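The path list above can be reproduced with a small depth-first walk. The sketch below is illustrative only: the nested-dictionary model of the object tree and the function name `subobject_paths` are assumptions made for this example, not part of the compiler.

```python
def subobject_paths(obj, prefix=()):
    """Yield (dotted_name, path) for every subobject in depth-first order.

    `obj` maps each subobject name to its own subobjects; dictionary order
    stands in for the declaration order in the Yul source.
    """
    for i, (name, child) in enumerate(obj.items()):
        path = prefix + (i,)
        yield name, path
        for sub_name, sub_path in subobject_paths(child, path):
            yield f"{name}.{sub_name}", sub_path

# Subobjects of A, taken from the Yul example above
A = {"B": {"C": {"D": {}}, "E": {}}}

for name, path in subobject_paths(A):
    print(name, list(path))
# B [0]
# B.C [0, 0]
# B.C.D [0, 0, 0]
# B.E [0, 1]
```

Each path component is the key to follow in the nested .data sections of the JSON output.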

In terms of object hierarchy, we have the following indices:

  • [C] -> non-negative index 0 for C with path [0] as the first direct child of B
  • [B C D] -> index -1 for B.C.D with path [0 0 0]
  • [B E] -> index -2 for B.E with path [0 1]
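The assignment above can be mimicked with a small lookup table. This is a hedged sketch rather than the compiler's implementation: the class `SubPathEncoder` is hypothetical, and it assumes that indirect indices are handed out in order of first reference and that a repeated reference reuses its earlier index.

```python
UINT64_MAX = 2**64 - 1

class SubPathEncoder:
    """Hypothetical helper mapping paths (relative to one object) to indices."""

    def __init__(self):
        self.indirect = {}  # path tuple -> assigned index

    def encode(self, path):
        path = tuple(path)
        if not path:
            raise ValueError("path must not be empty")
        if len(path) == 1:
            return path[0]  # direct child: plain non-negative index
        if path not in self.indirect:
            # Assumption: the next free value counts down from 2^64 - 1
            self.indirect[path] = UINT64_MAX - len(self.indirect)
        return self.indirect[path]

enc = SubPathEncoder()          # encoder for references made from A
d = enc.encode([0, 0, 0])       # B.C.D, referenced first
e = enc.encode([0, 1])          # B.E, referenced second
print(hex(d), hex(e))           # 0xffffffffffffffff 0xfffffffffffffffe
```

References made inside B, such as datasize("C"), would use their own encoder, which is why C keeps the direct index 0 there.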

Interpretation of PUSH Instructions

Assembly instructions such as PUSH #[$] refer to subassemblies using indices that are derived from the DFS-based enumeration described earlier. For example:

{
  "name": "PUSH #[$]",
  "value": "000000000000000000000000000000000000000000000000fffffffffffffffe"
}

has the value below, which is index -2 and therefore refers to B.E (path [0 1]):

0xfffffffffffffffe == (2^64 - 2)
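To read such a value programmatically, the low 64 bits of the 32-byte hex string can be reinterpreted as a signed integer. The following is a minimal sketch; `decode_sub_index` is a hypothetical name, and treating every value at or above 2^63 as an indirect index is an assumption that holds for the examples on this page.

```python
def decode_sub_index(value_hex: str) -> int:
    """Return the signed index stored in a PUSH [$] or PUSH #[$] value."""
    low64 = int(value_hex, 16) & (2**64 - 1)  # keep only the uint64_t part
    # Assumption: values with the top bit set are negative (indirect) indices
    return low64 - 2**64 if low64 >= 2**63 else low64

instruction = {
    "name": "PUSH #[$]",
    "value": "000000000000000000000000000000000000000000000000fffffffffffffffe",
}
print(decode_sub_index(instruction["value"]))  # -2
```

A non-negative result is the index of a direct subassembly, while a negative result -(x + 1) identifies the indirect subassembly with x = 0, 1, and so on.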