Arya - Autonomous Security Agents
The goal of the Arya project is to develop
high-assurance, autonomous, and decentralized
software agents for neutralizing botnet infections
over large networks. We hope to scale to operations
involving tens of thousands of agents in deeply
nested networks with possibly long periods of agent
disconnection from the agent command-and-control.
For the project we have developed a novel blockchain
implementation based on a proof-of-burn consensus
mechanism. Agents coordinate in a decentralized
fashion via the blockchain. The blockchain enables
coordinated agent actions, such as synchronized and
distributed botnet neutralization, resource
coordination, and capabilities sharing. The
blockchain provides strong assurance guarantees such
as boundedness, replay resilience, authentication,
and partition resilience.
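The coordination mechanism can be sketched in miniature. The following is an illustrative model of proof-of-burn leader election; the agent names, token amounts, and lottery weighting are assumptions, not Arya's actual implementation.

```python
import hashlib

# Illustrative proof-of-burn sketch: an agent "burns" tokens by
# committing them to an unspendable address, and burned weight feeds a
# hash-based lottery that selects the next block proposer. All names
# and amounts here are hypothetical.

BURN_ADDRESS = "0" * 40  # provably unspendable destination

class Agent:
    def __init__(self, name, tokens):
        self.name = name
        self.tokens = tokens
        self.burned = 0

    def burn(self, amount):
        # Destroying spendable tokens is the cost that secures consensus,
        # analogous to the energy spent in proof-of-work.
        assert amount <= self.tokens
        self.tokens -= amount
        self.burned += amount

def elect_proposer(agents, seed):
    # Each agent's lottery ticket is weighted by how much it has burned,
    # so burning more raises (but does not guarantee) election odds.
    def ticket(agent):
        h = hashlib.sha256(f"{seed}:{agent.name}".encode()).hexdigest()
        return agent.burned * int(h, 16)
    return max(agents, key=ticket)

a, b = Agent("a1", 100), Agent("a2", 100)
a.burn(60)
b.burn(10)
winner = elect_proposer([a, b], seed="block-42")
```

Because election is deterministic in the shared seed, every connected agent computes the same proposer, which is what allows partitioned agents to reconcile after reconnection.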
The project is developing a formal operation
specification and logic that includes a high-level
schema for specifying properties of an operation
such as rules of engagement, targeting, pre-campaign
intelligence, and the world model of devices and
exploits. For complex decisions in the agent
network, we are developing, with the help of MIT
CSAIL, Correct-by-Construction Reinforcement
Learning that assures a learned policy adheres to a
specification and can suggest changes to the
specification required for operation success.
Finally, we are modeling the agent in a formal proof
system to prove safety and effectiveness properties.
Our prototype has successfully neutralized real botnet
infections during DARPA exercises, spawning agents to
a variety of devices, including IoT devices and routers.
Principal Investigator: Dr. Michael Gordon
Partners: MIT CSAIL
Funding Source: DARPA HACCS
Arya - Automatic Vulnerability Discovery
This project aims to generate robust,
non-disruptive exploits using unstructured
information about known (n-day) vulnerabilities. The
system automatically extracts information about
known vulnerabilities from semi-structured and
unstructured sources, including natural language
sources such as blog posts and commit messages, and
uses that information to drive guided, robust
exploit generation. The system generates hardened
exploits that ensure minimal disruption to the
vulnerable system for a variety of vulnerability
classes. To generate n-day exploits, we are
developing a new technique and system called
Directed Vulnerability Discovery Engine for
automatically generating inputs that trigger memory
safety, command injection, privilege escalation and
authentication bypass vulnerabilities. To harden the
exploit, the Exploit Generation component generates
additional predicates designed to make the exploit
more reliable in new environments, work in the
presence of modern mitigation techniques, improve
the robustness of the vulnerable application
post-exploit, and provide non-interference
guarantees with respect to existing program invariants.
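The extraction step can be illustrated with a toy sketch: mining hints (CVE identifier, crashing function, bug class) out of a commit message to seed directed vulnerability discovery. The patterns, field names, and commit message below are hypothetical, not the project's actual extractor.

```python
import re

# Illustrative sketch of mining unstructured n-day vulnerability text:
# pull hints out of a commit message to seed directed search. The
# regexes and the example message are assumptions for illustration.

COMMIT_MSG = """Fix CVE-2021-1234: heap buffer overflow in parse_header()
when the Content-Length field exceeds INT_MAX."""

def extract_hints(text):
    hints = {}
    cve = re.search(r"CVE-\d{4}-\d{4,}", text)
    func = re.search(r"\b(\w+)\(\)", text)
    bug = re.search(r"(buffer overflow|integer overflow|use.after.free)",
                    text, re.IGNORECASE)
    if cve: hints["cve"] = cve.group(0)
    if func: hints["function"] = func.group(1)  # target for directed search
    if bug: hints["class"] = bug.group(1).lower()
    return hints

hints = extract_hints(COMMIT_MSG)
# → {'cve': 'CVE-2021-1234', 'function': 'parse_header',
#    'class': 'buffer overflow'}
```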
Principal Investigator: Jeff Perkins
Partners: MIT CSAIL
Funding Source: DARPA HACCS
GRAIL - Guaranteed Robust Artificial Intelligence
The goal of GRAIL is to develop robust AI into a
principled and active field that can deliver
practically deployable systems backed by solid
theoretical foundations. This goal naturally
decomposes into two major but interacting thrusts,
specifically: (1) developing the theoretical
foundations of the field and (2) developing the
multi-faceted techniques required to build deep
learning systems that offer the strong and truly
meaningful performance and robustness guarantees
required for real-world, high-stakes deployment.
Therefore, to make robust models possible (and
practical) to deploy, it is crucial to develop more
comprehensive, efficient, and effective robust
training tools. Developing such a toolkit is one of
the primary goals of GRAIL. In particular, we aim to
generalize adversarial training (the currently dominant
approach to robust training, which boils down to
training the model on adversarial perturbations of the
training inputs) by building a framework that enables a
broad synthesis of the inputs on which our model is
trained. GRAIL will develop efficient techniques that
make full verification of input robustness a deployable
reality. Crucially, these new techniques will be
conservative: they never incorrectly classify an input
as robust when it is not.
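For concreteness, the adversarial-training baseline that GRAIL generalizes can be sketched as FGSM-style training of a toy linear classifier; the model, data, and hyperparameters are illustrative assumptions, not GRAIL's framework.

```python
import numpy as np

# Minimal sketch of adversarial training (FGSM-style) on a logistic
# regression model in NumPy. Illustrative only: GRAIL's framework
# generalizes far beyond this simple perturbation model.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w = np.zeros(2)
eps, lr = 0.1, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for _ in range(100):
    # Gradient of the loss w.r.t. the *inputs* gives the attack direction.
    p = sigmoid(X @ w)
    grad_x = np.outer(p - y, w)        # d(loss)/dx for each example
    X_adv = X + eps * np.sign(grad_x)  # FGSM: worst-case step within eps
    # Train on the perturbed inputs instead of the clean ones.
    p_adv = sigmoid(X_adv @ w)
    grad_w = X_adv.T @ (p_adv - y) / len(y)
    w -= lr * grad_w

acc = np.mean((sigmoid(X @ w) > 0.5) == (y == 1))
```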
Principal Investigator: Dr. Michael Gordon
Partners: MIT CSAIL (prime)
Funding Source: DARPA GARD
Require Security - Supply Chain Protection
Require Security is a spin-off of MIT and Aarno Labs
that provides advanced protection against supply chain
threats. Our technologies enable automated and
seamless protection of open-source JavaScript
libraries. We enable developers to understand how
untrusted libraries could interact with sensitive
data and resources. We can then automatically
protect those resources.
Our techniques are developer-first, offering fully
automated analysis, transformation, and in certain
cases even synthesis of libraries and their
interactions. The results are intuitive, allowing
easy integration with a plethora of environments—for
example, continuous integration pipelines à la
GitHub and AWS, other security products, as well as
IDEs. Our technologies outperform the current market
leader in this space at detecting and preventing
attacks: against real-world security threats, 0-day
vulnerabilities, and actively malicious code.
Funding Sources: OPS-5G and NSA In-Q-Tel.
MRAM - Multifocal Relational Analysis for Assured Micropatching
MRAM develops new techniques for
automatically patching production stripped binaries
to eliminate security vulnerabilities while provably
preserving noninterference properties between the
patch and the baseline functionality that the binary
implements. One prominent goal is to preserve the
validation effort invested in such binaries by
preserving as much of the original binary as
possible. Another goal is to develop new program
analysis techniques that infer and verify
noninterference patch correctness properties to
guarantee the absence of undesirable interactions
between the patch and the baseline functionality of
the binary. Yet another goal is to support
scenarios in which it is not possible to compile the
patch with standard compilers or other tools,
typically because source code or build information
required for compilation is not available. When
recompilation is infeasible or undesirable, current
approaches reduce developers to laborious manual
binary patching. The unfortunate
result is a reluctance or even inability to
eliminate important security vulnerabilities in
deployed binaries. The research will dramatically
reduce the time, effort, and cost required to
reliably patch security vulnerabilities in stripped
binaries.
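The noninterference property itself can be illustrated with a toy differential check; MRAM establishes such properties statically on stripped binaries, and the functions below are hypothetical stand-ins.

```python
# Illustrative sketch of a noninterference check by differential
# testing: outside the patched input region, the patched function must
# agree exactly with the baseline, preserving validated behavior.
# (MRAM proves such properties statically; this only illustrates the
# property being proved.)

def baseline(n):
    # Models original binary behavior, with a bug for negative inputs.
    return 100 // (n if n != 0 else 1)

def patched(n):
    # Patch: reject negative inputs; otherwise defer to the baseline so
    # validated behavior is preserved bit-for-bit.
    if n < 0:
        return 0
    return baseline(n)

# Noninterference check: on every unpatched input, behavior is identical.
agrees = all(baseline(n) == patched(n) for n in range(0, 100))
```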
Principal Investigator: Dr. Michael Gordon
Partners: MIT CSAIL
Funding Source: DARPA AMP
Saran - Mobile Application Interposition
Mobile devices are an important computing
platform; securing these devices, and their
applications, is a primary concern. Unfortunately,
the security controls exposed by mobile platforms do
not typically meet DoD requirements. Effectively
securing these applications requires the ability to
transparently and efficiently identify functional code
blocks and add instrumentation that satisfies DoD
requirements and policies. Retrofitting security
controls and
policies to mobile devices and their applications is
an extremely challenging problem. First, identifying
instrumentation points (functional code blocks) is
difficult when the application employs obfuscation.
Second, retrofitting instrumentation transparently
requires sophisticated techniques that hide the
side-effects of injected instrumentation from the
application, while maintaining functionality. Third,
maintaining transparency and control of
instrumentation can adversely impact application
performance.
Saran is a novel system to transparently, and
efficiently, instrument Android applications. Saran
supports transparent instrumentation of an entire
Android APK's DEX bytecode: the bytecode is lifted,
via static binary decompilation, into an intermediate
representation that facilitates program analysis and
transformation. Our DEX bytecode instrumentation
maintains transparency by intercepting and sanitizing
reflective and other introspective calls, and provides
completeness by supporting reflection. Saran enables APK
instrumentation by defining a set of common events
to instrument. Dynamic taint tracking via Saran
instrumentation adds additional instructions to the
original program to track the sensitive information
sources that influence a program value. For example,
if an introspection directive should report when
location information is written to a network
connection, Saran adds dynamic taint-tracking
instructions to every instruction that could lie on a
path from a location read to a network write.
This mechanism will provide unprecedented levels of
precision for these types of predicated
introspection directives.
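A minimal model of this taint-propagation mechanism, in illustrative Python rather than DEX bytecode (the source and sink names are assumptions):

```python
# Minimal dynamic taint-tracking sketch (illustrative; Saran operates
# on DEX bytecode, not Python). A tainted value carries the set of
# sensitive sources that influenced it, and taint propagates through
# operations on the value.

class Tainted:
    def __init__(self, value, sources=frozenset()):
        self.value = value
        self.sources = frozenset(sources)

    def __add__(self, other):
        # Propagation rule: the result is influenced by both operands.
        ov = other.value if isinstance(other, Tainted) else other
        os = other.sources if isinstance(other, Tainted) else frozenset()
        return Tainted(self.value + ov, self.sources | os)

def read_location():
    # Source: mark the value as derived from the device location.
    return Tainted(42.36, {"LOCATION"})

def network_write(x):
    # Sink: report a flow if sensitive sources reach the network.
    if isinstance(x, Tainted) and x.sources:
        return sorted(x.sources)
    return []

lat = read_location()
payload = lat + 0.001           # taint propagates through arithmetic
flows = network_write(payload)  # → ["LOCATION"]
```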
Principal Investigator: Jeff Perkins
Funding Source: SBIR
AIKIDO - Automatic Vulnerability Injection
Evaluating security mechanisms is a difficult and
manual process which more often than not relies on
synthetic benchmarks that are typically comprised of
simple test cases or a combination of known
vulnerabilities in an attempt to provide coverage
for different classes of attacks. As such, security
mechanisms adapt their techniques to guarantee good
coverage for the synthetic benchmarks but provide
few guarantees of their efficacy beyond that. To
address this problem, we are developing a new
technique and system for automatically generating
and injecting realistic vulnerabilities into
real-world applications. Our system uses targeted
symbolic execution to discover program paths that
could be used to generate vulnerabilities, such as
buffer and integer overflows. The program paths
are then modified using information from formal
methods (e.g., SMT solvers) to generate and inject
new code that is provably vulnerable: the system
can prove that the generated conditions along a
specific program path can lead to a vulnerability.
The system is developed using the LLVM Compiler
Infrastructure.
Principal Investigator: Dr. Ricardo Baratto
Funding Source: SBIR
Aria - Cross-Scale Capability Runtime Monitoring and Reconfiguration
Aria is developing an adaptive security
architecture that targets (and, in fact, leverages)
highly heterogeneous near-future computing ensembles
enabled by 5G. This security architecture is (i)
language-based, in that it leverages
programming-language techniques to achieve its
goals, (ii) zero-trust, in that it employs entities
that are mutually-distrustful, and (iii)
resource-aware, in that it is careful in how it
exploits available resources and their heterogeneity
across devices and scales. Aria is centered around
distributed object-capabilities, unforgeable but
transferable rights to perform one or more
operations on objects. Its foundation is a
capability-safe subset of an existing programming
language, specifically tailored to bootstrap the
security architecture. The subset is expressive
enough to write real programs and leverage an
existing software ecosystem, but constrained enough
to disallow ambient authority and enable powerful
analyses. A series of domain-specific languages
(DSLs) allow expressing security policies at scale,
and an associated set of analyses ensures these
policies are correct with respect to a set of trust
criteria. These DSLs compile down to a set of
runtime monitors that wrap select portions of
programs that cannot be proved correct with respect
to the trust criteria in order to effectively integrate
them into the trust base. For the capability-safe
subset, Aria's static correctness proofs avoid the
overhead of any runtime monitoring; Aria's runtime
monitors, however, allow the execution of programs
that fall outside the capability-safe subset. The
result is a collection of mutually distrustful
components that not only resemble a distributed
system but operate as one. Aria strikes a middle
ground between clean-slate and retrofitting
approaches, by defining a capability-safe subset of
an existing and widely used programming language,
progressively building an entire single-node trust
infrastructure, and culminating with a
general-purpose multi-scale trust environment. The
research will dramatically reduce the time, effort,
and cost required to enable security at scale across
devices with widely disparate size, weight, and
power characteristics.
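The object-capability idea at Aria's core can be sketched as follows, in illustrative Python rather than the capability-safe language subset; the class and method names are assumptions.

```python
# Minimal object-capability sketch (illustrative): a capability is an
# unforgeable reference granting only specific operations on an object.
# There is no ambient authority: a holder can perform only the
# operations the capability carries, but may transfer (delegate) it.

class FileObject:
    def __init__(self, contents):
        self._contents = contents

    def read(self):
        return self._contents

    def write(self, data):
        self._contents = data

def make_capability(obj, allowed):
    # The returned proxy exposes only the allowed operations. The
    # underlying object stays reachable only through these bound
    # methods, so the right cannot be forged, yet the proxy itself can
    # be handed to other components.
    class Capability:
        pass
    cap = Capability()
    for name in allowed:
        setattr(cap, name, getattr(obj, name))
    return cap

f = FileObject("secret")
read_only = make_capability(f, ["read"])
data = read_only.read()                  # permitted
can_write = hasattr(read_only, "write")  # False: right was never granted
```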
Principal Investigator: Dr. Michael Gordon
Partners: MIT CSAIL
Funding Source: OPS-5G
ClearScope
ClearScope is a robust system for precise and comprehensive provenance
tracking of information that flows through Android devices. In
contrast to previous systems, ClearScope tracks the complete path that
data takes through the device, from its initial entry into the device
through to its exit point, including applications, files, and IPC that
the data traverses along this path, as well as tracking flows in both
the Dalvik Android Runtime (ART) and code executing outside of the
runtime (native code). ClearScope can also track up to 2^32
combinations of information sources and intermediate information
traversal points. Previous systems, in contrast, can track only a
small fixed number of information sources. And the information that
ClearScope delivers has unprecedented precision, including the time of
data traversal events, the precise location in the application where
data traversal events take place, and the initial source or sources of
relevant data at the level of individual bytes. ClearScope combines
static instrumentation and dynamic monitoring to achieve tracking on
real-world applications on real devices.
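The source-combination bookkeeping can be sketched with a 32-bit label per value, which is one simple way to realize 2^32 trackable combinations; the encoding below is an assumption for illustration, not ClearScope's actual representation.

```python
# Illustrative sketch: with a 32-bit label per value, each source gets
# one bit and combining values ORs their labels, yielding 2^32
# representable source combinations. (ClearScope's encoding may differ.)

CAMERA, LOCATION, CONTACTS = 1 << 0, 1 << 1, 1 << 2

def combine(label_a, label_b):
    # A value derived from two inputs is influenced by both source sets.
    return label_a | label_b

def sources(label):
    # Decode a label back into the names of its contributing sources.
    names = {CAMERA: "CAMERA", LOCATION: "LOCATION", CONTACTS: "CONTACTS"}
    return [name for bit, name in names.items() if label & bit]

label = combine(LOCATION, CONTACTS)
tracked = sources(label)   # → ["LOCATION", "CONTACTS"]
```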
DroidSafe
The DroidSafe project develops effective program
analysis techniques and tools to uncover malicious
code in Android mobile applications. The core of
the system is a static information flow analysis
that reports the context under which sensitive
information is used. For example, "Application A
has the potential to send location information to
network address 128.6.21.6 on 'Button B' press".
The DroidSafe project invested significant time
developing a comprehensive semantic model of Android
runtime behaviors alongside the analysis to achieve
acceptable precision, accuracy, and scalability for
real-world Android applications. The combined system
has been demonstrated to be the most precise and
accurate information flow analysis for Android
applications. The analysis results can be used to
automatically check applications for security policy
violations, and the results can help a human analyst
inspect sensitive behaviors of an app, increasing
accuracy and throughput of application vetting. For
each of the six DARPA APAC engagements to date, the
DroidSafe team has been unsurpassed in malware
diagnosis accuracy with the least amount of human
analysis time.
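The kind of source-to-sink reasoning DroidSafe performs can be sketched on a toy three-statement program; this is illustrative only, as DroidSafe analyzes real Android bytecode against a full runtime model.

```python
# Minimal static information-flow sketch (illustrative). Statements are
# (definition, uses) pairs; a variable is flagged when any sensitive
# source reaches it transitively through the def-use chain.

PROGRAM = [
    ("loc", ["getLastKnownLocation"]),  # source API call (hypothetical)
    ("msg", ["loc", "prefix"]),
    ("out", ["msg"]),                   # "out" then reaches the network sink
]
SOURCES = {"getLastKnownLocation"}

def flows_to(var):
    tainted = set(SOURCES)
    for definition, uses in PROGRAM:    # one forward pass suffices here
        if any(u in tainted for u in uses):
            tainted.add(definition)
    return var in tainted

leaks = flows_to("out")   # → True: location reaches the network write
```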
Principal Investigator: Dr. Michael Gordon
Partners: MIT and Kestrel Institute
Funding Source: DARPA APAC Program
Dark Corners
Sound static program analysis promises exhaustive detection of many
classes of program errors and, ideally, verification that the program
is free of these errors. However, the critical issue is the lack of
precision that such analyses inevitably encounter when analyzing the
programs that occur in practice. Our hypothesis is that, in existing
programs, only a small percentage of the code is responsible for the
lack of precision for verification tasks. Aarno Labs is developing
program analysis tools that identify and mitigate the sources of
imprecision in large, real-world programs. We are developing analysis
technologies that identify the exact source of imprecision in the
implementation, using techniques such as logical abduction.
Furthermore, we are developing techniques to increase precision once
target code is identified, such as: (1) applying more expensive
analysis techniques to this code; (2) making manual changes to the
code and/or adding annotations or dynamic checks to reduce
complexity; (3) automatically replacing the code with similar code
from other implementations; (4) providing an alternative
implementation that is easier to analyze but can be shown to match
the existing implementation (with respect to memory safety).
Principal Investigator: Jeff Perkins
Partners: Professor Thomas Dillig
Funding Source: DARPA APAC Program
Targeted Automatic Patching
Overflow errors are an insidious source of software failures and security
vulnerabilities. The accepted wisdom is to use a multitude of tools, such as
involved software development strategies, dynamic bug finders and static
analysis tools in an attempt to eliminate as many bugs as possible. However,
experience has shown that it is very hard to achieve bug-free software. As a
result, even under the best of circumstances, buggy software is deployed and
developers face a constant and time-consuming battle of creating and releasing
patches fast enough to fix newly discovered bugs. Patches can take days if not
weeks to create, and it is not uncommon for systems to continue running
unpatched applications long after an exploit of a bug has become well-known.
As the scale of software deployments and the frequency of bug reports increase,
it is clear that patch generation needs to be automated.
We are developing an automatic system (Targeted Automatic Patching or
TAP) for finding and eliminating overflow errors. This system
leverages bug finding tools to automatically generate inputs that expose overflow errors.
TAP then uses a template-based synthesis approach to
automatically generate software patches (both source and binary
patches may be feasible). Template-based synthesis applies known
error fixing patterns (or templates) in combination with formal
methods, such as SMT solvers, to generate code that eliminates
overflow errors. The proposed synergistic coupling of bug finding
tools and automatic repair enables an unprecedented level of
verifiable automation for eliminating overflow errors.
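A template-based fix of the kind TAP synthesizes can be sketched as follows, in illustrative Python modeling 32-bit arithmetic; TAP itself emits source or binary patches.

```python
# Sketch of a template-based overflow patch (illustrative). The
# template replaces an unchecked 32-bit addition with one that detects
# the proved overflow condition and saturates instead of wrapping.

INT32_MAX = 2**31 - 1
INT32_MIN = -2**31

def unchecked_add(a, b):
    # Models the buggy 32-bit addition that can wrap around.
    r = (a + b) & 0xFFFFFFFF
    return r - 2**32 if r > INT32_MAX else r

def patched_add(a, b):
    # Template fix: guard the overflow condition identified by the
    # formal analysis and clamp to the representable range.
    r = a + b
    if r > INT32_MAX:
        return INT32_MAX
    if r < INT32_MIN:
        return INT32_MIN
    return r

wrapped = unchecked_add(INT32_MAX, 1)  # wraps to INT32_MIN
safe = patched_add(INT32_MAX, 1)       # clamps to INT32_MAX
```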
Principal Investigator: Dr. Stelios Sidiroglou-Douskos
Funding Source: DARPA MRC Program
Tunable Cyber Defensive Mechanisms
Evaluating security mechanisms is a difficult and manual
process. To effectively compare different techniques,
security analysts use synthetic benchmarks that are
typically comprised of simple test cases or a combination
of known vulnerabilities in an attempt to provide coverage
for different classes of attacks. As such, security
mechanisms adapt their techniques to guarantee good
coverage for the synthetic benchmarks but provide few
guarantees of their efficacy beyond that. To ensure
accurate evaluation and comparison of competing security
techniques on real-world programs, we need automated
techniques for injecting realistic and verifiable
vulnerabilities.
We are developing a new technique and system for
automatically generating and injecting realistic
vulnerabilities into real-world applications. Our
system uses targeted symbolic execution to discover program
paths that could be used to generate vulnerabilities (e.g.,
integer overflows). The program paths (i.e., symbolic
constraints) are then modified using information from
formal methods (e.g., using SMT solvers) to generate and
inject new code, at the source or binary level, that is
provably vulnerable (e.g., the system can prove that the
generated conditions along a specific program path can lead
to an integer overflow). The code is then obfuscated, using
previously learned bug patterns, to look like native
vulnerable code.
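The "provably vulnerable" criterion can be illustrated with a toy check, with exhaustive small-bitwidth enumeration standing in for the SMT query; the guard and constants below are hypothetical.

```python
# Illustrative sketch of "provably vulnerable" injection: a guard is
# injected along a program path, and we verify (by exhaustive 8-bit
# checking, a stand-in for the SMT proof) that some input satisfying
# the guard triggers an integer overflow.

BITS = 8
MAX = 2**(BITS - 1) - 1   # 127 for signed 8-bit values

def injected_path(x):
    # Injected code: the guard admits inputs for which x * 3 exceeds
    # the 8-bit range.
    if x > 40:              # injected guard (path condition)
        return x * 3        # potential 8-bit overflow
    return x

def exists_overflow_witness():
    # Stand-in for the SMT proof: is the guard satisfiable by an input
    # whose result exceeds the representable range?
    return any(x > 40 and x * 3 > MAX for x in range(MAX + 1))

provably_vulnerable = exists_overflow_witness()   # → True
```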
Principal Investigator: Dr. Ricardo Baratto
Funding Source: DARPA SBIR Program