Abstract

The CodeHawk-Binary Analyzer is a powerful analysis tool for binaries of various architectures. Its core analysis is based on the mathematical theory of abstract interpretation [3, 4]. It includes its own disassemblers, currently supporting x86, arm32 (including Thumb-2), mips, and Power32 (in progress). It takes as input a (possibly stripped) binary, disassembles it, and constructs functions, control-flow graphs, and a callgraph. It then creates an over-approximating semantic translation of the functions into CHIF, the CodeHawk Internal Form, for analysis, and translates the resulting (over-approximating) invariants back into the context of the assembly code. This invariant-generation process is performed in a series of rounds, incrementally adding variables discovered in previous rounds. The final set of invariants is saved (in highly compressed form) and forms the basis for all subsequent analyses and liftings. The CodeHawk-Binary Analyzer comes with a comprehensive command-line interpreter to produce a wide variety of analysis results, including annotated assembly code, annotated control-flow graphs and callgraphs, and potential vulnerability reports. It does not, however, have a graphical user interface. For general reverse engineering tasks, a graphical user interface, such as that provided by IDA Pro, Binary Ninja, Ghidra, or angr, is often the preferred way of interacting with a binary for exploration and navigation. None of these tools, however, provide the deep analysis results that CodeHawk generates. Part of the reason for the shallower analysis in these tools is, of course, that deeper analysis often takes too much time, resulting in response times that are not acceptable for an interactive tool. With CodeHawk-Binary Analyzer plugins for these tools we hope to offer users the best of both worlds. CodeHawk analysis can be performed offline.
All analysis results are saved in full, so that they can be used for many different purposes via a comprehensive Python API that is called directly from the Python plugin code. During an interactive session with the preferred tool, these analysis results can be accessed and deployed as desired by interactively invoking plugins, which inject analysis results directly into the tool’s database and thereby augment the tool’s own analysis results. Because the plugins only need to extract data and do not perform any expensive analysis, response times are comparable to those of other actions typically performed in these tools. In this report we describe and illustrate an initial selection of Python plugins implemented for IDA Pro.
