Skip to content

Types

Quokka exports the type information recorded by IDA (structures, unions, enumerations, arrays and pointers) and exposes it through a hierarchy of Python objects. These objects are useful for understanding data layout, reconstructing high-level semantics, and cross-referencing types with the data or code that uses them.

Type hierarchy

CoreType                         ← abstract base for every type
├── BaseType          (IntEnum)  ← primitive (char, int, float, …)
├── ComplexType                  ← abstract base for named types
│   ├── EnumType                 ← C enum
│   ├── StructureType  (dict)    ← C struct
│   │   └── UnionType            ← C union (subclass of StructureType)
│   ├── ArrayType                ← C array (T[N])
│   └── PointerType              ← C pointer (T*)
├── EnumTypeMember               ← one value inside an EnumType
└── StructureTypeMember          ← one field inside a StructureType/UnionType

Every type object exposes a set of is_* boolean properties so you can check its kind without isinstance calls:

Property True for
is_base_type BaseType
is_enum EnumType
is_struct StructureType
is_union UnionType
is_array ArrayType
is_pointer PointerType
is_composite Any ComplexType (enum, struct, union, array, pointer)
is_member StructureTypeMember or EnumTypeMember

Accessing types from a Program

import quokka

prog = quokka.Program("binary.quokka", "binary")

# Iterate over every exported type
for t in prog.types:
    print(type(t).__name__, getattr(t, "name", t))

# Only structures
for struct in prog.structures:
    print(struct.name, struct.size, "bytes")

# Only enumerations
for enum in prog.enums:
    print(enum.name)

BaseType — primitives

BaseType is an IntEnum that represents the primitive C types IDA knows about:

Name C equivalent Size (bytes)
BYTE char 1
WORD short 2
DOUBLE_WORD int 4
QUAD_WORD int64_t 8
OCTO_WORD int128_t 16
FLOAT float 4
DOUBLE double 8
VOID void 0
UNKNOWN 0
from quokka.data_type import BaseType

bt = BaseType.DOUBLE_WORD
print(bt.size)   # 4
print(bt.c_str)  # <T:int>
print(bt.is_base_type)  # True

StructureType — C structs

StructureType behaves like a dict keyed by positional index (integer, starting at 0). Each value is a StructureTypeMember.

Key attributes

Attribute Type Description
name str Structure name as defined in IDA
size int Total size in bytes (0 if variable-length)
c_str str C declaration of the structure
comments list[str] Analyst comments
members list[StructureTypeMember] Members in declaration order
for struct in prog.structures:
    print(f"struct {struct.name} ({struct.size} bytes)")
    for member in struct.members:
        byte_offset = member.offset // 8
        byte_size   = member.size   // 8
        print(f"  +0x{byte_offset:02x}  {member.name}  ({byte_size} bytes)  type={member.type}")

Offsets and sizes are in bits

StructureTypeMember.offset and StructureTypeMember.size are expressed in bits, not bytes. Divide by 8 to get byte values. This representation is necessary to support bit-fields.

StructureTypeMember

Attribute Type Description
name str Field name
offset int Bit offset within the parent structure
size int Size in bits (0 for variable-length fields)
type TypeT Resolved type of the field
parent StructureType Back-reference to the containing structure
comments list[str] Analyst comments
data_refs_to list[AddressT] Addresses that reference this field
struct = next(prog.structures)
first_member = struct.members[0]
print(first_member.name)           # e.g. "next"
print(first_member.offset // 8)   # byte offset
print(first_member.type)           # e.g. <TPtr: next->...>

UnionType — C unions

UnionType is a subclass of StructureType. It works identically except that all members conceptually share offset 0 (all variants overlay the same memory). The dict is still keyed by positional index to avoid collisions.

for t in prog.types:
    if t.is_union:
        print(f"union {t.name} ({t.size} bytes)")
        for member in t.members:
            print(f"  variant {member.name}: {member.type}")

EnumType — C enumerations

Key attributes

Attribute Type Description
name str Enum name
size int Storage size in bytes (derived from base_type)
base_type BaseType Underlying integer type
members Iterable[EnumTypeMember] All enum values
c_str str C declaration
comments list[str] Analyst comments

EnumType is iterable and subscriptable by positional index. Members can also be accessed as attributes using their name:

for enum in prog.enums:
    print(f"enum {enum.name} (base: {enum.base_type})")

    # Iterate
    for member in enum:
        print(f"  {member.name} = {member.value}")

    # Positional access
    first = enum[0]

    # Attribute access by name
    val = enum.SOME_VALUE  # equivalent to enum["SOME_VALUE"]

EnumTypeMember

Attribute Type Description
name str Constant name
value int Integer value
size int Same as the parent enum's size
base_type BaseType Underlying integer type
parent EnumType Back-reference to the containing enum
comments list[str] Analyst comments
data_refs_to list[Data] Data objects that reference this constant

ArrayType — C arrays

Attribute Type Description
name str Type name
size int Total size in bytes
element_type TypeT Type of each element
array_size int Number of elements
c_str str C declaration
for t in prog.types:
    if t.is_array:
        print(f"{t.name}: {t.element_type}[{t.array_size}]  ({t.size} bytes)")

PointerType — C pointers

Attribute Type Description
name str Type name
size int Pointer size in bytes (4 or 8 depending on architecture)
pointed_type TypeT The type being pointed to
c_str str C declaration
for t in prog.types:
    if t.is_pointer:
        print(f"{t.name}{t.pointed_type}  (size={t.size})")

Adding new types

You can define new types from Python and persist them back to the disassembler database. New types are created via Program.add_type() and are marked with is_new=True so the backend knows to register them.

# Struct
prog.add_type("struct point { int x; int y; };")

# Enum
prog.add_type("enum color { RED=0, GREEN=1, BLUE=2 };")

# Typedef
prog.add_type("typedef unsigned int uint32;")

# Union
prog.add_type("union data { int i; float f; };")

Persisting new types

New types are appended to the protobuf types array, so prog.write() and prog.commit() automatically include them. When applied back to IDA, the backend reconstructs each new type from its c_str field using parse_decls().

# Save to .quokka only
prog.write()

# Or apply to IDA and re-export
prog.commit(database_file="binary.i64", overwrite=True)

Note

Duplicate type names are rejected -- add_type() raises QuokkaError if a type with the same name already exists in the program.

Cross-references from types

Complex types and their members carry cross-reference lists that tell you which data objects use them:

for struct in prog.structures:
    refs = struct.data_refs_to
    if refs:
        print(f"{struct.name} is referenced by {len(refs)} data object(s)")

    for member in struct.members:
        for addr in member.data_refs_to:
            print(f"  field {member.name} accessed from 0x{addr:x}")

The following cross-reference accessors are available on ComplexType:

Property Description
data_refs_to All data cross-references to this type
data_read_refs_to Read cross-references only
data_write_refs_to Write cross-references only

EnumTypeMember and StructureTypeMember also expose data_refs_to.

Checking types with is_* properties

Because a TypeT can be any of the concrete type classes it is often cleaner to use the boolean properties rather than isinstance:

def describe(t) -> str:
    if t.is_base_type:
        return f"primitive {t.c_str}"
    elif t.is_struct:
        return f"struct {t.name} ({len(t.members)} fields)"
    elif t.is_union:
        return f"union {t.name} ({len(t.members)} variants)"
    elif t.is_enum:
        return f"enum {t.name} ({sum(1 for _ in t.members)} values)"
    elif t.is_array:
        return f"array {t.element_type}[{t.array_size}]"
    elif t.is_pointer:
        return f"pointer → {t.pointed_type}"
    return "unknown"

for t in prog.types:
    print(describe(t))