Managing Seeds
This tutorial explains how to deal with input files manipulated and generated during the execution. Let’s reuse the following base snippet. The only difference here is that we are going to change the coverage strategy to PATH_COVERAGE
.
[80]:
from tritondse import SymbolicExecutor, Config, Seed, Program, ProcessState, SymbolicExplorator, CoverageStrategy, SeedFormat, CompositeData
config = Config(pipe_stdout=False, coverage_strategy=CoverageStrategy.PATH, seed_format=SeedFormat.COMPOSITE)
def post_exec_hook(se: SymbolicExecutor, state: ProcessState):
print(f"seed:{se.seed.hash} ({repr(se.seed.content.argv)}) [exitcode:{se.exitcode}]")
dse = SymbolicExplorator(config, Program("crackme_xor"))
dse.add_input_seed(Seed(CompositeData(argv=[b"./crackme", b"AAAAAAAAAAAAAAA"])))
dse.callback_manager.register_post_execution_callback(post_exec_hook)
dse.explore()
WARNING:root:symbol __gmon_start__ imported but unsupported
seed:e2f673d0fd7980a2bdad7910f0f6da7a ([b'./crackme', b'AAAAAAAAAAAAAAA']) [exitcode:0]
seed:b204f9c8720b4ee299a215ef4c9f168f ([b'./crackme', b'eAAAAAAAAAAAAAA']) [exitcode:0]
seed:cab6e4b729327d1e088c9d459e0340eb ([b'./crackme', b'elAAAAAAAAAAAAA']) [exitcode:0]
seed:c8f3df9e460142aed1158aa354d7179d ([b'./crackme', b'eliAAAAAAAAAAAA']) [exitcode:0]
seed:2cb80846ef5684501c73e1e19f595230 ([b'./crackme', b'elitAAAAAAAAAAA']) [exitcode:0]
seed:dc1d802d1c2796a1a21d96827ce1cae7 ([b'./crackme', b'eliteAAAAAAAAAA']) [exitcode:0]
[80]:
<ExplorationStatus.IDLE: 2>
Just by exploring all possible paths we managed to solve the challenge. Let’s play with the corpus.
Initial Corpus
There are two ways to provide an initial corpus:
providing in existing workspace directory and putting manually files in the worklist directory
via the API by adding the seed with
add_input_seed
, it will automatically be added in seeds to process.
Managing generated corpus
SymbolicExecutor: This class is solely meant to execute a single seed, not to produce new ones (that is the purpose of the explorator). However if a one wants to generate a new input to process in a callback, it is possible with the method enqueue_seed
. That method will just fill a list that will later be picked-up by the SymbolicExplorator
instance.
Here is a dummy hook function that manually negates a branching condition to generate a new input file:
[81]:
def hook_cmp(se: SymbolicExecutor, pstate: ProcessState, addr: int):
zf = pstate.cpu.zf # concrete value
sym_zf = pstate.read_symbolic_register(pstate.registers.zf)
# Revert the current value of zf to "negate" condition
status, model = pstate.solve(sym_zf.getAst() != zf)
if status == SolverStatus.SAT:
new_seed = se.mk_new_seed_from_model(model)
se.enqueue_seed(new_seed)
By default the status of the seed generated is FRESH
. However, one can directly assign it the CRASH
status if it is a faulty one.
SymbolicExplorator: All seeds generated during the exploration are moved in the Workspace
instance from the worklist folder to the appropriate one corpus, hangs and crashs. One can iterate them using the workspace instance.
[82]:
print("seed to process:", len(list(dse.workspace.iter_worklist())))
print("\nCorpus:")
for seed in dse.workspace.iter_corpus():
print(seed.filename)
seed to process: 0
Corpus:
2cb80846ef5684501c73e1e19f595230.00000064.tritondse.cov
dc1d802d1c2796a1a21d96827ce1cae7.00000064.tritondse.cov
b204f9c8720b4ee299a215ef4c9f168f.00000064.tritondse.cov
e2f673d0fd7980a2bdad7910f0f6da7a.00000064.tritondse.cov
c8f3df9e460142aed1158aa354d7179d.00000064.tritondse.cov
cab6e4b729327d1e088c9d459e0340eb.00000064.tritondse.cov
Setting seed status
During the execution one, can assign a status to the seed currently being executed in any callback. (Make sure not to override a status previously set by another callback). At the end of an execution if no status has been assigned, the seed is automatically assigned OK_DONE
.
## Seed Formats
In TritonDSE, a seed contains the user input that will be processed by the target program.
TritonDSE supports two seed formats: RAW
and COMPOSITE
.
RAW
seed is a contiguous buffer of bytes that will be injected in memory. By default it is done when the program reads fromstdin
.COMPOSITE
seeds allow providing more complex input to the program such asargv
or the content of files. Note thatstdin
is treated as a file. Their content is provided as aCompositeData
instance which is defined intritondse/seeds.py
as follows:
#Excerpt from /tritondse/seeds.py
class CompositeData:
argv: List[bytes]
"list of argv values"
files: Dict[str, bytes]
"dictionnary of files and the associated content (stdin is one of them)"
variables: Dict[str, bytes]
"user defined variables, that the use must take care to inject at right location"
The following example shows a complex input type where user-input are read on argv
, stdin
a file called filename.txt
and a variable var1
. Symbolic variables like var1
will not be injected automatically, the user has to do it himself at the appropriate location.
from tritondse import Seed, CompositeData
composite_seed = Seed(CompositeData(argv=[b"this", b"will", b"injected", b"into", b"argv"],\
files={"stdin": b"This will be injected when the program reads from stdin",\
"filename.txt": b"This will be injected when the program reads from filename.txt"\
},\
variables={"var1": b"The user is responsible for injecting this manually (more on this later)"}\
))
It is not necessary to provide all the fields in CompositeData
when creating a new seed. The following seeds are valid.
[83]:
composite_seed_1 = Seed(CompositeData(argv=[b"this", b"will", b"injected", b"into", b"argv"]))
composite_seed_2 = Seed(CompositeData(files={"stdin": b"This will be injected when the program reads from stdin"}))
When creating a SymbolicExecutor
or SymbolicExplorator
the seed format can be specified in the Config
. The default seed format is RAW
. Seed formats cannot be mixed, all the seeds of a given SymbolicExecutor
or SymbolicExplorator
must have the same format.
[84]:
from tritondse import SymbolicExplorator, Config, SeedFormat, Program
dse_using_raw_seeds = SymbolicExplorator(Config(), Program("crackme_xor"))
dse_using_composite_seeds = SymbolicExplorator(Config(seed_format=SeedFormat.COMPOSITE), Program("crackme_xor"))
How to inject arbitrary variables
The variables
field of CompositeData
is never injected by the framework. It is the user’s responsability to inject it in the appropriate places with inject_symbolic_variable_memory
or inject_symbolic_variable_register
.
Let’s see an example. Say the target application contains the following function that we want to explore:
void parse_buffer(char* buffer);
The following script will register a callback at the start of parse_buffer
and inject input into buffer
.
from tritondse import SymbolicExplorator, Config, SeedFormat, Program, CompositeData
def example_hook(se: SymbolicExecutor, pstate: ProcessState, addr: int):
# In the callback, retrieve the pointer to the buffer we want to inject
arg0 = pstate.get_argument_value(0)
# Inject the data from the seed into the buffer
# var_prefix should be the key of the variable in dictionary se.seed.content.variables
se.inject_symbolic_variable_memory(arg0, "buffer", se.seed.content.variables["buffer"])
p = Program("parser_program")
# Create a new symbolic explorator that uses COMPOSITE seeds
dse = SymbolicExplorator(Config(seed_format=SeedFormat.COMPOSITE), p)
# Add an initial seed
composite_data = CompositeData(variables={"buffer" : b"A"*128})
dse.add_input_seed(Seed(composite_data))
# Register a callback on the function whose parameter we want to inject
dse.callback_manager.register_pre_addr_callback(p.find_function_addr("parse_buffer"), example_hook)
dse.explore()