Playing with Hooks

Hooks are the main interface for running user-defined actions during an execution or an exploration. TritonDSE enables hooking the following events:

address reached
instruction executed (all of them)
memory address read or written
register read or written
function reached (from its name)
end of an execution
thread context switch
new input creation (before it gets appended in the pool of seeds)

The library introduces a CallbackManager object, which is used to register callbacks. Every SymbolicExecutor contains one, exposed as its callback_manager attribute.

A SymbolicExplorator object also contains a callback_manager instance. In this case, the callbacks are transmitted to all subsequent SymbolicExecutor instances.
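As a rough sketch of how other events from the list above might be hooked, the helper below registers memory-access loggers on a callback manager. The method names register_memory_read_callback and register_memory_write_callback, and the callback arguments (a Triton MemoryAccess, plus the written value for writes), are assumptions about the TritonDSE API that this tutorial does not show; check them against the installed version. The helper name register_memory_logging is hypothetical.

```python
from typing import Any, List, Tuple


def register_memory_logging(callback_manager: Any) -> List[Tuple[str, int, int]]:
    """Hypothetical helper: record memory accesses in a list and return it.

    Assumes the callback manager exposes register_memory_read_callback and
    register_memory_write_callback (not shown in this tutorial).
    """
    log: List[Tuple[str, int, int]] = []

    def on_read(se, pstate, mem):
        # mem is assumed to be a Triton MemoryAccess
        log.append(("read", mem.getAddress(), mem.getSize()))

    def on_write(se, pstate, mem, value):
        # Assumed: write callbacks also receive the value being written
        log.append(("write", mem.getAddress(), mem.getSize()))

    callback_manager.register_memory_read_callback(on_read)
    callback_manager.register_memory_write_callback(on_write)
    return log
```

It would be called with the callback_manager of a SymbolicExecutor or SymbolicExplorator before running, and the returned list inspected afterwards.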

Let’s reuse the following base snippet:

[41]:

from triton import Instruction
from tritondse import SymbolicExecutor, Config, Seed, Program, ProcessState, SeedFormat, CompositeData

p = Program("crackme_xor")
config = Config(pipe_stdout=False, seed_format=SeedFormat.COMPOSITE)
seed = Seed(CompositeData(argv=[b"./crackme_xor", b"AAAAAAAAAAAA"]))

executor = SymbolicExecutor(config, seed)
executor.load(p)

WARNING:root:symbol __gmon_start__ imported but unsupported

I. Instruction hooking

Instruction hooking triggers a callback on every instruction executed, regardless of its address.

The signature for an instruction hook is the following:

Callable[['SymbolicExecutor', ProcessState, Instruction], None]

We can use it to print every instruction executed:

[42]:

def trace_inst(se: SymbolicExecutor, pstate: ProcessState, inst: Instruction):
    print(f"[tid:{inst.getThreadId()}] 0x{inst.getAddress():x}: {inst.getDisassembly()}")

executor.callback_manager.register_post_instruction_callback(trace_inst)

[43]:

executor.run()

[tid:0] 0x400460: xor ebp, ebp
[tid:0] 0x400462: mov r9, rdx
[tid:0] 0x400465: pop rsi
[tid:0] 0x400466: mov rdx, rsp
[tid:0] 0x400469: and rsp, 0xfffffffffffffff0
[tid:0] 0x40046d: push rax
[tid:0] 0x40046e: push rsp
[tid:0] 0x40046f: mov r8, 0x400680
[tid:0] 0x400476: mov rcx, 0x400610
[tid:0] 0x40047d: mov rdi, 0x4005b3
[tid:0] 0x400484: call 0x400440
[tid:0] 0x400440: jmp qword ptr [rip + 0x200bda]
[tid:0] 0x4005b3: push rbp
[tid:0] 0x4005b4: mov rbp, rsp
[tid:0] 0x4005b7: sub rsp, 0x20
[tid:0] 0x4005bb: mov dword ptr [rbp - 0x14], edi
[tid:0] 0x4005be: mov qword ptr [rbp - 0x20], rsi
[tid:0] 0x4005c2: cmp dword ptr [rbp - 0x14], 2
[tid:0] 0x4005c6: je 0x4005cf
[tid:0] 0x4005cf: mov rax, qword ptr [rbp - 0x20]
[tid:0] 0x4005d3: add rax, 8
[tid:0] 0x4005d7: mov rax, qword ptr [rax]
[tid:0] 0x4005da: mov rdi, rax
[tid:0] 0x4005dd: call 0x400556
[tid:0] 0x400556: push rbp
[tid:0] 0x400557: mov rbp, rsp
[tid:0] 0x40055a: mov qword ptr [rbp - 0x18], rdi
[tid:0] 0x40055e: mov dword ptr [rbp - 4], 0
[tid:0] 0x400565: jmp 0x4005a6
[tid:0] 0x4005a6: cmp dword ptr [rbp - 4], 4
[tid:0] 0x4005aa: jle 0x400567
[tid:0] 0x400567: mov eax, dword ptr [rbp - 4]
[tid:0] 0x40056a: movsxd rdx, eax
[tid:0] 0x40056d: mov rax, qword ptr [rbp - 0x18]
[tid:0] 0x400571: add rax, rdx
[tid:0] 0x400574: movzx eax, byte ptr [rax]
[tid:0] 0x400577: movsx eax, al
[tid:0] 0x40057a: sub eax, 1
[tid:0] 0x40057d: xor eax, 0x55
[tid:0] 0x400580: mov ecx, eax
[tid:0] 0x400582: mov rdx, qword ptr [rip + 0x200ab7]
[tid:0] 0x400589: mov eax, dword ptr [rbp - 4]
[tid:0] 0x40058c: cdqe
[tid:0] 0x40058e: add rax, rdx
[tid:0] 0x400591: movzx eax, byte ptr [rax]
[tid:0] 0x400594: movsx eax, al
[tid:0] 0x400597: cmp ecx, eax
[tid:0] 0x400599: je 0x4005a2
[tid:0] 0x40059b: mov eax, 1
[tid:0] 0x4005a0: jmp 0x4005b1
[tid:0] 0x4005b1: pop rbp
[tid:0] 0x4005b2: ret
[tid:0] 0x4005e2: mov dword ptr [rbp - 4], eax
[tid:0] 0x4005e5: cmp dword ptr [rbp - 4], 0
[tid:0] 0x4005e9: jne 0x4005f7
[tid:0] 0x4005f7: mov edi, 0x40069e
[tid:0] 0x4005fc: call 0x400430
[tid:0] 0x400430: jmp qword ptr [rip + 0x200be2]
[tid:0] 0x400601: mov eax, 0
[tid:0] 0x400606: leave
[tid:0] 0x400607: ret

The pre and post variants define whether the callback is called before or after the instruction is executed.

In this case we could also have used register_pre_instruction_callback, but the Instruction object would not be decoded yet, which prevents getting its disassembly.
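Printing is handy for a quick look, but a trace is often easier to analyse once collected. The sketch below builds a callback with the instruction-hook signature that stores each instruction in a list, plus a small helper counting mnemonics. The names make_tracer and mnemonic_histogram are hypothetical; only the Instruction accessors already used by trace_inst are assumed.

```python
from collections import Counter
from typing import Callable, List, Tuple


def make_tracer() -> Tuple[Callable, List[Tuple[int, str]]]:
    """Return an instruction callback and the list it fills."""
    trace: List[Tuple[int, str]] = []

    def record_inst(se, pstate, inst) -> None:
        # Same accessors as trace_inst; needs a decoded instruction,
        # so it is meant to be registered as a post-instruction callback.
        trace.append((inst.getAddress(), inst.getDisassembly()))

    return record_inst, trace


def mnemonic_histogram(trace: List[Tuple[int, str]]) -> Counter:
    """Count how often each mnemonic appears in a recorded trace."""
    return Counter(disasm.split()[0] for _, disasm in trace)
```

The returned callback would be passed to register_post_instruction_callback in place of trace_inst, and the list read once executor.run() returns.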

II. Address/Function hooking

We can hook any address and perform an associated action.

We can also hook any function as long as the symbol is set. They both have the same signature:

Callable[['SymbolicExecutor', ProcessState, Addr], None]

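For the purpose of the challenge, let’s hook the compare instruction and patch the ZF flag to force the loop to continue. Let’s also hook the puts function to print the string given to each call. Here puts is hooked as an imported routine, whose callback receives the routine name as an additional argument.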

[44]:

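def hook_cmp(se: SymbolicExecutor, pstate: ProcessState, addr: int):
    # AL holds the expected encoded byte, CL the encoded user input
    print(f"{pstate.cpu.al} - {pstate.cpu.cl}")
    pstate.cpu.zf = 1  # make the following je taken so the loop continues
    # se.abort()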

def hook_puts(se: SymbolicExecutor, pstate: ProcessState, routine: str, addr: int):
    s = pstate.memory.read_string(pstate.get_argument_value(0))
    print(f"puts: {s}")

[45]:

executor = SymbolicExecutor(config, seed)
executor.load(p)

# Remove trace printing callback
executor.callback_manager.reset()
executor.callback_manager.register_post_addr_callback(0x0400597, hook_cmp)
executor.callback_manager.register_post_imported_routine_callback("puts", hook_puts)

WARNING:root:symbol __gmon_start__ imported but unsupported

[46]:

executor.run()

49 - 21
62 - 21
61 - 21
38 - 21
49 - 21
puts: Win

We did not really win, since we forced the ZF flag, but we retrieved the encoded values on which the comparison is made.
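The printed values can already be decoded by hand. In the instruction trace of section I, each input character goes through sub eax, 1 then xor eax, 0x55 before the cmp, so the encoding is (c - 1) ^ 0x55 and its inverse is (v ^ 0x55) + 1. A minimal sketch, assuming the left-hand column is the expected byte (AL) and the right-hand column the encoded input (CL); the function names are hypothetical:

```python
def encode(char: str) -> int:
    """Encoding applied to each input byte: sub 1, then xor 0x55."""
    return (ord(char) - 1) ^ 0x55


def decode(value: int) -> str:
    """Inverse of encode: undo the xor, then add 1 back."""
    return chr((value ^ 0x55) + 1)


# Left-hand column printed by hook_cmp (expected encoded bytes)
expected = [49, 62, 61, 38, 49]
password = "".join(decode(v) for v in expected)
print(password)
```

As a sanity check, encode("A") gives 21, which matches the right-hand column printed for the seed made of A characters.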

III. Solving queries

We can modify our hook to directly ask the SMT solver which value of CL satisfies the comparison.

[47]:

from tritondse.types import SolverStatus

def hook_cmp2(se: SymbolicExecutor, pstate: ProcessState, addr: int):
    # CL contains the input of the user (hashed)

    # retrieve the symbolic value of both characters
    sym_al = pstate.read_symbolic_register(pstate.registers.al)
    sym_cl = pstate.read_symbolic_register(pstate.registers.cl)

    # Solve the constraint such that one matches the other
    status, model = pstate.solve(sym_al.getAst() == sym_cl.getAst())

    # If formula is SAT retrieve input values
    if status == SolverStatus.SAT:
        # Retrieve the values of the input variables involved in the CL expression (only one here)
        var_values = pstate.get_expression_variable_values_model(sym_cl, model)
        for var, value in var_values.items():
            print(f"{var}: {chr(value)}")
    else:
        print(status.name)

    pstate.cpu.zf = 1

[48]:

executor = SymbolicExecutor(config, seed)
executor.load(p)

executor.callback_manager.reset()
executor.callback_manager.register_post_addr_callback(0x0400597, hook_cmp2)

WARNING:root:symbol __gmon_start__ imported but unsupported

[49]:

executor.run()

argv[1][0]:8: e
argv[1][1]:8: l
argv[1][2]:8: i
argv[1][3]:8: t
argv[1][4]:8: e

IV. Hooking exploration events

We can similarly put callbacks on a SymbolicExplorator. In this case, the callback manager will be shared among all the SymbolicExecutor instances. Let’s hook every iteration to print some statistics:

[53]:

from tritondse import SymbolicExplorator, Config, Seed, Program, ProcessState, SeedFormat, CompositeData

def pre_exec_hook(se: SymbolicExecutor, state: ProcessState):
    print(f"input: {se.seed.hash} {se.seed.content.argv} ", end="")

def post_exec_hook(se: SymbolicExecutor, state: ProcessState):
    print(f"status:{se.seed.status.name}   [exitcode:{se.exitcode}]")

dse = SymbolicExplorator(Config(seed_format=SeedFormat.COMPOSITE), p)
dse.add_input_seed(Seed(CompositeData(argv=[b"./crackme_xor", b"AAAAAAAAAAAA"])))

dse.callback_manager.register_pre_execution_callback(pre_exec_hook)
dse.callback_manager.register_post_execution_callback(post_exec_hook)

dse.explore()

WARNING:root:symbol __gmon_start__ imported but unsupported

input: 0063e1d416400b0a0401dc471be64a8f [b'./crackme_xor', b'AAAAAAAAAAAA'] status:OK_DONE   [exitcode:0]
input: 415d0d4405119b88530788282aa06d7d [b'./crackme_xor', b'eAAAAAAAAAAA'] status:OK_DONE   [exitcode:0]

[53]:

<ExplorationStatus.IDLE: 2>
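Rather than printing from each hook, the statistics can be aggregated over the whole exploration. The class below is a sketch whose bound methods match the two-argument signature of pre_exec_hook and post_exec_hook; it only relies on the se.seed.hash and se.seed.status.name attributes used above. ExplorationStats is a hypothetical name.

```python
from collections import Counter
from typing import List


class ExplorationStats:
    """Collects seed hashes and final seed statuses across executions."""

    def __init__(self) -> None:
        self.inputs: List[str] = []
        self.statuses: Counter = Counter()

    def pre_exec(self, se, pstate) -> None:
        # Called before each execution: remember which seed is run
        self.inputs.append(se.seed.hash)

    def post_exec(self, se, pstate) -> None:
        # Called after each execution: tally the resulting seed status
        self.statuses[se.seed.status.name] += 1
```

An instance would be created before dse.explore(), with stats.pre_exec and stats.post_exec passed to register_pre_execution_callback and register_post_execution_callback respectively.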