In the DARPA AMP program, we developed, under the MRAM project, a prototype end-to-end binary patching platform and associated workflow that significantly reduces the costs of patching stripped binaries, the CodeHawk Binary Patcher (CBP). CBP's cost reduction is achieved by lifting the binary into a form that can be edited by a typical software engineer (as opposed to the rare and highly skilled reverse engineer). Concomitantly, our platform drastically decreases the likelihood of errors by formally validating translation steps of the process and by providing the developer with assurance artifacts for review. Furthermore, the platform enacts minimally invasive changes to the binary, with the ability to reason about individual expressions and instructions, maintaining the testing and certifications applied to the original binary. We have implemented a prototype binary patching and assurance graphical user-interface workflow as a plugin to the binary analysis platform Binary Ninja. The MRAM project was employed and evaluated by independent operators from the AMP TA3 performer, with the experience being described as "smooth," "thorough," and "intuitive."
Transition-Ready, Independently-Evaluated Binary Patching Workflow: We have developed an effective and efficient binary patching workflow in the context of our BinaryNinja UI plugin. The workflows enable a developer to stay completely within the UI to patch a stripped binary, leading the developer through patching the C lifting, automatically applying the patch, reviewing the assurance artifacts, and additional patch iterations (if required). This complete end-to-end workflow, including matching source and automatically introducing source symbols and types, was independently demonstrated to be revolutionary during AMP's Phase II end evaluation: we achieved the fastest time to patch, coupled with the smallest patch changeset produced. An independent team of 2 developers was able to create a correct micro-patch on a stripped binary in 20 minutes (without live guidance), the fastest time to patch of any team participating in the evaluation. The team also independently reviewed the assurance result and concluded the patch was correct. The resulting patch had a smaller changeset versus a patch created by reverse engineers on the evaluation team (10 bytes vs 57 bytes changed). This demonstrated our prototype is ready to be used by transition partners for rapid and assured binary micro-patching.
Binary Patching Without Reverse-Engineering: The CodeHawk Binary Patching system enables an analyst with only source-code development experience to patch binaries without running nor recompiling the original binary. CBP performs a validatable lifting of the binary to editable C-source code with symbols and types incorporated from the Binary Ninja type recovery system, including BSI source matching. The source code lifted by CBP looks similar to the original source code in many cases and employs natural expressions on the source types. It is straightforward for an analyst to understand the lifted code to determine how to modify the lifting to enact the patch. Then, once the analyst has enacted the patch, CBP automatically applies the changes on the binary, as defined by the well-defined semantic changes on the lifting. No recompilation is required. CBP reasons about the associations of the lifted source to the binary to automatically generate binary instructions to access values and variables efficiently, keeping changed code lean and fast. The analyst can next review the assurance artifacts. All these steps are performed in our user-friendly Binary Ninja plugin. Finally, the analyst can test the modified functionality.
Intuitive Patch Assurance Workflow for Typical Developer: Enacting a binary patch is straightforward in CBP, just modify source code. The next step is for the analyst to decide if the patched binary is correct, i.e., assuring the modifications to the binary. The CBP patch assurance artifacts and workflow do not require a PhD in program analysis to comprehend and correctly employ. After much experimentation and feedback from independent analysts on AMP, we have developed an effective analyst workflow where CBP helps to assure that (1) the patch produced on the binary correctly represents the changes enacted on the lifting and (2) the patched lifting correctly patches the vulnerability it is meant to fix. For (1), we provide assurance artifacts in our UI that enable the analyst to rapidly understand what the patch has changed in the binary (functions, control flow, etc.) via relational analyses and then, once the change has been localized, to directly compare the modified lifting (produced by the developer) with the lifting of the \emph{patched} binary (produced by our validated translation from binary to source). We have demonstrated this capability in the AMP Phase II end evaluation, with the evaluating analysts stating that `the creation of custom patches/verification was intuitive.'' For (2), we employ CodeHawk's C Analyzer on the patched lifting to demonstrate whether the vulnerability has been fixed; this applies to the large class of memory-corruption vulnerabilities.
Automatic Source Symbol and Type Incorporation: A vital synergy exists between BSI and CodeHawk. BSI is STR's TA1 performer in AMP. BSI will find source matches for functions in the binary. For high-confidence matches, the symbols and types of the source code will be automatically incorporated into CodeHawk's lifting and analysis. This capability was demonstrated during AMP's Phase II End evaluation.
Minimally-Invasive Binary Patching: When CBP performs the modifications to the original binary specified by the changes on the lifting, CBP reasons about the required changes at the expression level by comparing the abstract syntax tree (AST) of the original lifting to the modified lifting. Thus, CBP knows what has changed at the expression level and how to enact those changes on the binary. The changes it enacts are minimally invasive, meaning as few bytes are modified as required by the semantics of the changes on the lifting. Minimal changes enable precise relational analysis and have the best chance to preserve the vast majority of certification and testing that went into the original binary.
Translation Validation: An essential assurance component is the validation of the translations employed in the binary patching workflow. The analyst must be able to trust the results of a tool, and thus, a tool must demonstrate it is trustworthy. CodeHawk is currently able to create checkable proofs for its analysis results. For the AMP program, we achieved foundational steps towards a patching validation infrastructure (PVI) whose goal is to validate the translation steps of CodeHawk. When complete, PVI will produce checkable proofs demonstrating the correctness of translations between 1) binary to lifted source and 2) lifted source to binary. This is a massive leap in assurance over unverified tools that translate in a best effort approach, even when their input seems well-defined, without validation, how can one be sure a tool is bug-free?
Fully-Automated Discovery and Patching of Vulnerabilities: CodeHawk includes vulnerability discovery and patch templates that automatically find and fix important classes of vulnerabilities. Currently, CodeHawk supports string overflow vulnerabilities that occur through insecure use of C library functions.
Transition: The CodeHawk Binary Patcher is being developed further on ARPA-H's DIGIHEALS program via a subcontract with STR.
Artifacts and Links
- Github: CodeHawk Source Repos
- YouTube: CodeHawk Binary Patcher: BlackHat Tool Demo
- GitHub: Source code for MicroPatch Bench: Suite of micropatching benchmarks
Papers
- A Unified Algebraic Framework Of Program Analyses. LangSec, 2021
- IDA Pro Plugins for CodeHawk-Binary. Aarno Labs Technical Report, 2023
- Multifocal Relational Analysis for Assured Micropatching: Final Report. Aarno Labs Technical Report, 2024
- Assured Micropatching of Race Conditions in Legacy Real-time Embedded Systems. Real-Time Autonomous Systems Security, 2024
Blog Posts
Funding Source
DARPA: Assured Micropatching (AMP)
Program Dates
Start: July, 2020
End: July, 2024