How does Metalware work?

Metalware’s technology stack combines fuzzing, concolic analysis, emulation, and runtime monitoring. This hybrid approach tests firmware thoroughly, maximizing code coverage and uncovering even the most elusive bugs. In effect, the platform acts as virtual hardware and attacker in one: it tirelessly stress-tests the firmware with millions of permutations of events and inputs, guided by coverage feedback and constraint solving, to go deeper and find weaknesses that human testers or simpler tools would miss.

Register and Interrupt-level Fuzzing

Metalware fuzzes at the granularity of hardware registers and CPU interrupts, which lets it simulate hardware events directly. For example, it can feed random or edge-case values into the peripheral registers that firmware reads via MMIO (e.g., simulating a sensor that reports extreme data), or trigger interrupts at varying intervals to emulate asynchronous events. Research on firmware emulation has shown that cycling through and toggling interrupts in an emulator exposes how firmware reacts to different timings and event sequences. Metalware uses similar techniques to reach deep into interrupt-handling code, uncovering issues that only appear under specific event interleavings.
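As a rough illustration of the idea (not Metalware’s actual implementation), the sketch below answers every MMIO read from a fuzzer-supplied byte string and lets the same byte string decide when an interrupt fires. The names FuzzInput, ToyEmulator, mmio_read, and maybe_raise_irq are hypothetical, as are the register address and IRQ numbers.

```python
class FuzzInput:
    """One fuzzer-generated byte string, consumed byte by byte."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def next_byte(self):
        # Return 0 once the input is exhausted so runs stay deterministic.
        if self.pos >= len(self.data):
            return 0
        value = self.data[self.pos]
        self.pos += 1
        return value


class ToyEmulator:
    """Hypothetical stand-in for an emulator exposing MMIO and IRQ hooks."""

    def __init__(self, fuzz, irq_lines):
        self.fuzz = fuzz
        self.irq_lines = irq_lines
        self.log = []

    def mmio_read(self, addr):
        # Every peripheral register read is answered from the fuzz input.
        value = self.fuzz.next_byte()
        self.log.append(("read", addr, value))
        return value

    def maybe_raise_irq(self):
        # One input byte decides whether an interrupt fires and on which line.
        choice = self.fuzz.next_byte()
        if choice % 4 == 0 and self.irq_lines:
            irq = self.irq_lines[(choice // 4) % len(self.irq_lines)]
            self.log.append(("irq", irq))
            return irq
        return None


emu = ToyEmulator(FuzzInput(bytes([0xFF, 0x04, 0x7F])), irq_lines=[15, 37])
sensor = emu.mmio_read(0x40011004)   # firmware sees an extreme sensor value
pending = emu.maybe_raise_irq()      # the next byte selects an interrupt line
```

Because both register values and interrupt timing come from the same input, a mutation-based fuzzer can vary data and event ordering together.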

Coverage-Guided Fuzzing

Metalware uses a coverage-guided fuzzing engine similar to AFL++ or libFuzzer, tailored for firmware binaries. Coverage-guided fuzzing means the fuzzer instruments execution (via the emulator) to track which code paths a given test input exercises. Inputs that reach new code or new branches are kept and mutated further.

This approach, also known as greybox fuzzing, is essentially a feedback-driven evolutionary process: the fuzzer generates inputs, runs the firmware, sees what new coverage is achieved, and then uses that feedback to produce even better inputs.

The goal is to maximize code coverage, which statistically correlates with finding more bugs. For firmware, the “inputs” are not typical command-line or file inputs, but rather the emulated hardware interactions (bytes read from peripherals, data in memory, etc.). Metalware’s instrumentation hooks into the CPU emulation, so each time an input causes the firmware to explore a new function or branch, the fuzzer knows it. Over time, this methodically maps out the firmware’s state space much more efficiently than random testing.
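The feedback loop described above can be sketched in a few lines. This is a minimal toy under stated assumptions, not Metalware’s engine: toy_firmware is a hypothetical target that reports which branch IDs an input reached, and mutate applies a single random byte change.

```python
import random


def toy_firmware(data):
    """Hypothetical target: returns the set of branch IDs the input reached."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == 0x7E:
        covered.add("frame_start")
        if len(data) > 1 and data[1] < 0x10:
            covered.add("short_cmd")
    return covered


def mutate(data, rng):
    """Overwrite one random byte, or append one, to produce a new candidate."""
    buf = bytearray(data)
    if buf and rng.random() < 0.8:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.append(rng.randrange(256))
    return bytes(buf)


def fuzz(target, seeds, iterations, seed=0):
    rng = random.Random(seed)
    corpus = list(seeds)
    global_cov = set()
    for s in corpus:
        global_cov |= target(s)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        cov = target(candidate)
        if not cov <= global_cov:
            # The input reached something new: keep it for further mutation.
            corpus.append(candidate)
            global_cov |= cov
    return corpus, global_cov


corpus, coverage = fuzz(toy_firmware, [b"\x00\x00"], iterations=5000)
```

Only inputs that add coverage enter the corpus, so the corpus stays small while still representing every branch found so far.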

Concolic Execution

Metalware integrates concolic execution to complement fuzzing. Concolic (concrete + symbolic) execution means running the firmware on concrete inputs while tracking them as symbolic variables, then using a constraint solver to determine which inputs would drive execution down different paths. When the fuzzer encounters a path that is hard to reach via random mutations (for example, a condition like if (password == 0x5A3C12F4) that requires an exact match), the concolic module can take over. It treats the relevant input bytes symbolically and solves the constraint (here, computing the exact value 0x5A3C12F4) to generate an input that satisfies the check.

This is extremely useful for firmware, which often has checksum checks, magic numbers, or password comparisons that would block fuzzing. By automatically solving these, Metalware’s concolic execution ensures the fuzzer doesn’t get stuck and can reach deeper execution states. Essentially, the concolic engine guides the fuzzer past “hard gates” in the code by logically reasoning about the input conditions needed.
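To make the mechanism concrete, here is a deliberately tiny concolic sketch. It is an illustration under strong assumptions, not Metalware’s engine: the input is a single 32-bit word, the symbolic form is restricted to a*x + b modulo 2**32, and the hypothetical firmware_check scrambles the input before comparing it to the magic value. A production engine would hand arbitrary constraints to a full solver.

```python
MASK = 0xFFFFFFFF


class Sym:
    """A 32-bit value carrying a concrete value plus an affine symbolic
    form a*x + b (mod 2**32) over one input word x."""

    def __init__(self, concrete, a=0, b=None):
        self.concrete = concrete & MASK
        self.a = a & MASK
        self.b = (self.concrete if b is None else b) & MASK

    @staticmethod
    def input(value):
        return Sym(value, a=1, b=0)

    def __add__(self, k):
        return Sym(self.concrete + k, self.a, self.b + k)

    def __mul__(self, k):
        return Sym(self.concrete * k, self.a * k, self.b * k)


class Trace:
    """Records each equality branch met during one concrete run."""

    def __init__(self):
        self.branches = []

    def eq(self, sym, const):
        taken = sym.concrete == (const & MASK)
        self.branches.append((sym, const & MASK, taken))
        return taken


def solve_eq(sym, const):
    """Solve a*x + b == const (mod 2**32); this toy only handles odd a."""
    if sym.a % 2 == 0:
        return None
    inverse = pow(sym.a, -1, 1 << 32)
    return ((const - sym.b) * inverse) & MASK


def firmware_check(word, trace):
    """Hypothetical gate: scrambles the input, then compares to a magic value."""
    scrambled = word * 3 + 0x1234
    return "unlocked" if trace.eq(scrambled, 0x5A3C12F4) else "locked"


def concolic_unlock(seed_value):
    trace = Trace()
    if firmware_check(Sym.input(seed_value), trace) == "unlocked":
        return seed_value
    sym, const, _taken = trace.branches[-1]
    return solve_eq(sym, const)   # input that flips the last branch


password = concolic_unlock(0)
```

The run with seed 0 fails the check, but the recorded constraint is enough to compute an input that passes it on the next run, which is the step random mutation would almost never find.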

Hybrid Fuzzing

Hybrid fuzzing blends coverage-guided fuzzing with concolic execution to address the limitations of using either method alone. The coverage-guided fuzzer is good at rapidly exploring many paths via random mutations, while the symbolic executor can surgically pick apart complex conditions.

In practice, Metalware runs the fast fuzzer continuously. Whenever the fuzzer hits an impasse (e.g., its inputs keep getting “stuck” at a complex condition, or a promising branch stays unexplored), Metalware invokes the concolic solver to generate the input needed to advance. This way, the two modes feed into each other: fuzzing explores and finds interesting states, symbolic execution extends coverage beyond the limits of heuristic mutation, and the new inputs it produces are fed back into fuzzing.
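The hand-off between the two modes can be sketched as a stall counter in the fuzzing loop. Everything here is a hypothetical simplification: target is a toy firmware with a 4-byte magic gate, and solver_stub stands in for a concolic engine by simply returning bytes that pass the gate, where a real engine would derive them from the recorded path constraint.

```python
import random

MAGIC = b"\x5a\x3c\x12\xf4"


def target(data):
    """Hypothetical firmware: a 4-byte magic gate guards deeper code."""
    cov = {"entry"}
    if data[:4] == MAGIC:
        cov.add("past_gate")
        if len(data) > 4 and data[4] & 0x80:
            cov.add("deep_handler")
    return cov


def solver_stub(data):
    """Stand-in for a concolic engine: returns an input that passes the gate."""
    return MAGIC + data[4:]


def hybrid_fuzz(iterations, stall_limit=50, seed=0):
    rng = random.Random(seed)
    corpus = [b"\x00" * 5]
    coverage = {"entry"}
    stalled = 0
    solver_calls = 0
    for _ in range(iterations):
        # Fast path: cheap single-byte mutation of a corpus entry.
        buf = bytearray(rng.choice(corpus))
        buf[rng.randrange(len(buf))] = rng.randrange(256)
        candidate = bytes(buf)
        stalled += 1
        if stalled >= stall_limit:
            # Slow path: no progress for a while, so ask the solver.
            candidate = solver_stub(candidate)
            solver_calls += 1
            stalled = 0
        cov = target(candidate)
        if not cov <= coverage:
            # Solver-produced inputs join the corpus like any other input.
            corpus.append(candidate)
            coverage |= cov
            stalled = 0
    return coverage, solver_calls


coverage, solver_calls = hybrid_fuzz(iterations=2000)
```

Single-byte mutations cannot produce the 4-byte magic from the all-zero seed, so the gate is only passed once the stall counter triggers the solver; after that, ordinary mutation can explore the code behind it.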

The result is higher coverage than fuzzing or symbolic execution could achieve alone, which in turn improves bug-finding. Metalware is purpose-built for the firmware domain, making it possible to reach deep logic in embedded code (e.g., error-handling routines, seldom-invoked functionality) that traditional fuzzing can’t hit.

Boot Solver for Deep Execution Paths

One unique feature of Metalware is its ability to handle complex firmware boot sequences. Many firmware images perform hardware initialization, run self-tests, or wait for specific events at startup. Traditional emulation can stall when the firmware waits indefinitely for a hardware signal that never arrives in the emulator.

Metalware’s boot solver addresses this by using a combination of techniques: it can use concolic execution to satisfy checks (e.g., bypass a startup password prompt by solving it), and it employs heuristics to skip or simulate hardware delays. The goal is to autonomously get the firmware into a fully running state (e.g., reach the main control loop or the point where it starts handling external inputs) – essentially, automatically “re-host” the firmware.

This capability to reach deep execution paths means Metalware can fuzz parts of the code that only run after minutes of uptime or only after certain sequences – things that a naive fuzzer would never reach. Combined with interrupt fuzzing (which can simulate events that typically occur later or asynchronously), the boot solver ensures the fuzzer isn’t stuck in the early boot code forever. It significantly broadens the scope of testing to cover the entire lifecycle of firmware execution.

Dynamic Run-Time Monitoring

As the firmware runs under test, Metalware employs dynamic monitoring to catch any signs of failure or misbehavior. This includes traditional crash detection (monitoring for exceptions such as segmentation faults or, in ARM terminology, usage faults and hard faults) and extends to custom sanitizers.

Our emulator is instrumented to check each memory access, allocation, and free operation. If the firmware writes outside of an allocated buffer or to an invalid address, Metalware will detect it immediately (even if the firmware hasn’t crashed yet). In essence, it enforces security policies at runtime to turn silent memory corruptions into visible failures. For example, writing beyond the end of a buffer in memory might normally just overwrite adjacent data and not crash right away – Metalware’s instrumentation can catch that condition on the spot. This is analogous to how AddressSanitizer or Valgrind works, but implemented for the binary within the emulator.
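A simplified picture of such a check, assuming the emulator can call a hook on every allocation, free, and write: the HeapMonitor class and its hook names below are hypothetical, and for brevity the sketch treats every write as a heap write, whereas a real monitor would also account for stack, globals, and peripheral regions.

```python
class HeapViolation(Exception):
    """Raised as soon as a memory-safety policy is violated."""


class HeapMonitor:
    """Tracks live allocations and validates every write against them."""

    def __init__(self):
        self.live = {}       # base address -> size in bytes
        self.freed = set()   # base addresses that were freed

    def on_alloc(self, base, size):
        self.live[base] = size
        self.freed.discard(base)

    def on_free(self, base):
        if base in self.freed:
            raise HeapViolation(f"double free of {base:#x}")
        if base not in self.live:
            raise HeapViolation(f"invalid free of {base:#x}")
        del self.live[base]
        self.freed.add(base)

    def on_write(self, addr, length):
        for base, size in self.live.items():
            if base <= addr < base + size:
                end = base + size
                if addr + length > end:
                    over = addr + length - end
                    raise HeapViolation(
                        f"overflow by {over} byte(s) past buffer at {base:#x}")
                return
        # Covers wild pointers and writes into already-freed buffers.
        raise HeapViolation(f"write to unallocated address {addr:#x}")


monitor = HeapMonitor()
monitor.on_alloc(0x20000000, 16)
monitor.on_write(0x20000000, 16)        # fills the buffer exactly: allowed
try:
    monitor.on_write(0x20000008, 12)    # runs 4 bytes past the end
except HeapViolation as err:
    report = str(err)
```

The overflow is reported at the moment of the write, even though the firmware itself would have continued running with corrupted adjacent data.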

Additionally, Metalware monitors for hangs or stalls – if the firmware gets stuck (e.g., an infinite loop waiting for input), the platform can detect that via lack of new coverage, and adjust the execution to continue fuzzing.
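One simple way to express “no new coverage” as a hang signal is a budget of consecutive already-seen basic blocks. The function name, the budget value, and the block addresses below are illustrative assumptions, not Metalware parameters.

```python
import itertools


def run_with_hang_detection(block_trace, budget=1000):
    """Consume an iterable of executed basic-block addresses and report a
    hang if `budget` consecutive blocks execute without reaching a new one."""
    seen = set()
    since_new = 0
    for executed, addr in enumerate(block_trace, start=1):
        if addr in seen:
            since_new += 1
            if since_new >= budget:
                return ("hang", executed)
        else:
            seen.add(addr)
            since_new = 0
    return ("finished", len(seen))


# A firmware polling loop that bounces between two blocks forever.
polling_loop = itertools.cycle([0x08000100, 0x08000104])
verdict = run_with_hang_detection(polling_loop, budget=1000)
```

When the verdict is a hang, the fuzzer can abandon the run or change what the emulated hardware returns, instead of waiting on a loop that will never exit on its own.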

Timing and performance metrics are also gathered. All these runtime checks ensure that a wide array of bugs (not just outright crashes) are detected and logged. The approach yields higher-fidelity results with near-zero false positives: every reported issue is backed by an actual execution trace of the firmware misbehaving, unlike static analysis, which may flag potential bugs that never occur in practice.

Advanced Sanitization & Analysis Methods

Metalware doesn’t stop at just finding a crash; it also performs analysis to help understand and prioritize the issues found. It uses advanced sanitization methods to categorize the type of failure (for instance, detecting an overflow vs. a use-after-free vs. an assertion failure) and precisely pinpoint buffer overflow boundaries.

Additionally, Metalware performs taint analysis during fuzzing – tracking how fuzz input data moves through the firmware. This can identify if fuzz data reaches sensitive functions (like authentication or crypto routines) and might highlight potential injection points or security impact.
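As a minimal sketch of byte-level taint tracking, assuming the emulator reports where fuzz data lands and every memory copy: TaintTracker, its hook names, the addresses, and the sink name auth_compare are all hypothetical.

```python
class TaintTracker:
    """Tracks which memory addresses currently hold fuzz-derived bytes."""

    def __init__(self):
        self.tainted = set()

    def on_input(self, addr, length):
        # Fuzz data written into memory (e.g., a receive buffer) is tainted.
        self.tainted.update(range(addr, addr + length))

    def on_copy(self, dst, src, length):
        # Snapshot source taint first so overlapping copies behave correctly.
        flags = [(src + i) in self.tainted for i in range(length)]
        for i, is_tainted in enumerate(flags):
            if is_tainted:
                self.tainted.add(dst + i)
            else:
                self.tainted.discard(dst + i)

    def on_const_write(self, addr, length):
        # Overwriting with constant data clears the taint.
        self.tainted.difference_update(range(addr, addr + length))

    def check_sink(self, name, addr, length):
        # Report how many tainted bytes reach a sensitive function's argument.
        hits = sum(1 for a in range(addr, addr + length) if a in self.tainted)
        return (name, hits) if hits else None


tracker = TaintTracker()
tracker.on_input(0x20001000, 4)               # 4 fuzz bytes arrive
tracker.on_copy(0x20002000, 0x20001000, 8)    # firmware copies 8 bytes
finding = tracker.check_sink("auth_compare", 0x20002000, 8)
```

A non-empty finding means fuzzer-controlled bytes reached the sink, which is the kind of signal that helps rank a crash or code path by security impact.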

Metalware also performs crash triage: determining if multiple crashes are due to the same underlying bug or different ones (so developers aren’t overwhelmed by duplicate reports). This involves analyzing the program counter and call stack at crash time, comparing signatures of crashes, etc. All these methods are geared toward making the fuzzing results actionable: developers get not just a crash dump, but an informative report on what the issue is and possibly how to fix it.
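Deduplication by crash signature can be sketched as follows. This is a common general technique shown under assumptions, not necessarily how Metalware computes its signatures: the record fields (fault, pc, stack, input_id) and the choice of hashing the top three return addresses are illustrative.

```python
import hashlib
from collections import defaultdict


def crash_signature(fault_type, pc, call_stack, depth=3):
    """Hash the fault type, faulting PC, and the top `depth` return addresses."""
    frames = ",".join(f"{addr:#010x}" for addr in call_stack[:depth])
    key = f"{fault_type}|{pc:#010x}|{frames}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]


def bucket_crashes(crashes):
    """Group crashing inputs whose signatures match."""
    buckets = defaultdict(list)
    for crash in crashes:
        sig = crash_signature(crash["fault"], crash["pc"], crash["stack"])
        buckets[sig].append(crash["input_id"])
    return dict(buckets)


crashes = [
    {"input_id": 1, "fault": "HardFault", "pc": 0x08001234,
     "stack": [0x08000400, 0x08000200, 0x08000100, 0x08000010]},
    # Same fault site and top frames; only a deeper frame differs.
    {"input_id": 2, "fault": "HardFault", "pc": 0x08001234,
     "stack": [0x08000400, 0x08000200, 0x08000100, 0x08000020]},
    {"input_id": 3, "fault": "UsageFault", "pc": 0x08002000,
     "stack": [0x08000400]},
]
buckets = bucket_crashes(crashes)
```

Limiting the signature to the top few frames is a trade-off: a small depth merges more duplicates but can also merge distinct bugs that fault at the same site, so the depth is worth tuning per project.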