Normal view

There are new articles available, click to refresh the page.
Before yesterdayTenable TechBlog - Medium

G-3PO: A Protocol Droid for Ghidra

21 December 2022 at 14:02

(A Script that Solicits GPT-3 for Comments on Decompiled Code)

“This is the droid you’re looking for,” said Wobi Kahn Bonobi.

TL;DR

In this post, I introduce a new Ghidra script that elicits high-level explanatory comments for decompiled function code from the GPT-3 large language model. This script is called G-3PO. In the first few sections of the post, I discuss the motivation and rationale for building such a tool, in the context of existing automated tooling for software reverse engineering. I look at what many of our tools — disassemblers, decompilers, and so on — have in common, insofar as they can be thought of as automatic paraphrase or translation tools. I spend a bit of time looking at how well (or poorly) GPT-3 handles these various tasks, and then sketch out the design of this new tool.

If you want to just skip the discussion and get yourself set up with the tool, feel free to scroll down to the last section, and then work backwards from there if you like.

The Github repository for G-3PO can be found HERE.

On the Use of Automation in Reverse Engineering

At the present state of things, the domain of reverse engineering seems like a fertile site for applying machine learning techniques. ML tends to excel, after all, at problems where getting the gist of things counts, where the emphasis is on picking out patterns that might otherwise go unnoticed, and where error is either tolerable or can be corrected by other means. This kind of loose and conjectural pattern recognition is where reverse engineering begins. We start by trying to get a feel for a system, a sense of how it hangs together, and then try to tunnel down. Impressions can be deceptive, of course, but this is a field where they’re easily tested, and where suitable abstractions are both sought and mistrusted.

A still from the movie, Matrix, showing Cipher in front of monitors displaying arcane data dumps, and saying “You get used to it. I don’t even see the code anymore. All I see is blonde, brunette, redhead,” or something like that.
You get used to it.

The goal, after all, is to understand (some part of) a system better than its developers do, to piece together its specification and where the specification breaks down.

At many stages along the way, you could say that what the reverse engineer is doing is searching for ways to paraphrase what they’re looking at, or translate it from one language into another.

We might begin, for example, with an opaque binary “blob” (to use a semi-technical term for unanalyzed data) that we dumped off a router’s NAND storage. The first step might be to tease out its file format, and through a process of educated guesses and experiments, find a way to parse it. Maybe it turns out to contain a squashfs file system, containing the router’s firmware. We have various tools, like Binwalk, to help with this stage of things, which we know can’t be trusted entirely but which might provide useful hints, or even get us to the next stage.

Suppose we then unpack the firmware, mount it as a filesystem, and then explore the contents. Maybe we find an interesting-looking application binary, called something like telnetd_startup. Instead of reading it as an opaque blob of bits, we look for a way to make sense of it, usually beginning by parsing its file structure (let’s say it’s an ELF) and disassembling it — translating the binary file into a sequence, or better, a directed graph of assembly instructions. For this step we might lean on tools like objdump, rizin, IDA Pro (if we have an expense account), or, my personal favourite, Ghidra. There’s room for error here as well, and sometimes even the best tools we have will get off on the wrong foot and parse data as code, or misjudge the offset of a series of instructions and produce a garbled listing, but you get to recognize the kinds of errors that these sorts of tools are prone to, especially when dealing with unknown file formats. You learn various heuristics and rules of thumb to minimize and correct those errors. But tools that can automate the translation of a binary blob into readable assembly are nevertheless essential — to the extent that if you were faced with a binary that used an unknown instruction set, your first priority as a reverse engineer may very well be to figure out how to write at least a flawed and incomplete disassembler for it.

The disassembly listing of a binary gives us a fine grained picture of its application logic, and sometimes that’s the furthest that automated tools can take us. But it’s still a far cry from the code that its developer may have been working with — very few programs are written in assembly these days, and its easy to get lost in the weeds without a higher-level vantage point. This might be where the reverser begins the patient manual work of discovering interesting components of the binary — components where its handling user input, for example — by stepping through the binary with a debugger like GDB (perhaps with the help of an emulator, like QEMU), and then annotating the disassembly listing with comments. In doing so the reverser tries to produce a high-level paraphrase of the program.

Nowadays, however, we often have access to another set of tools called decompilers, which can at least approximately translate the dissassembly listing into something that looks like source code, typically something like C (but extended with a few pseudo types, like Ghidra’s undefined and undefined* to indicate missing information). (Other tools, static analysis frameworks like BAP or angr (or, internally, Ghidra or Binary Ninja), for example, might be used to “lift” or translate the binary to an intermediate representation more amenable to further automated analysis, but we’ll leave those aside for now.) Decompilation is a heuristically-driven and inexact art, to a significantly greater extent than disassembly. When source code (in C, for example) is compiled down to x86 or ARM machine code, there’s an irreversible loss of information, and moving back in the other direction involves a bit of guess work, guided by contextual clues and constraints. When reverse engineers work with decompilers, we take it for granted that the decompiler is probably getting at least a few things wrong. But I doubt anyone would say that they’re unhelpful. We can, and often must, go back to the disassembly listing whenever needed after all. And when something seems fishy there, we can go back to the binary’s file format, and see if something’s been parsed incorrectly.

In my day to day work, this is usually where automated analysis stops and where manual annotation and paraphrase begins. I slowly read through the decompiler’s output and try to figure out, in ordinary language, what the code is “supposed” to be doing, and what it’s actually doing. It’s a long process of conjecture and refutation, often involving the use of debuggers, emulators, and tracers to test interpretations of the code. I might probe the running or emulated binary with various inputs and observe the effects. I might even try to do this in a brute force way, at scale, “fuzzing” the binary and looking for anomalous behaviour. But a considerable amount of time is spent just adding comments to the binary in Ghidra, correcting misleading type information and coming up with informative names for the functions and variables in play (especially if the binary’s been stripped and symbols are missing). Let’s call this the process of annotation.

We might notice that many of the automated stages in the reverse engineer’s job — parsing and unpacking the firmware blob, disassembling binary executables, and then decompiling them — can at least loosely be described as processes of translation or paraphrase. And the same can be said for annotation.

This brings us back to machine learning.

Using Large Language Models as Paraphrasing Engines, in the Context of Reverse Engineering

If there’s one thing that large language models, like OpenAI’s GPT-3, have shown themselves to be especially good at, it’s paraphrase — whether it’s a matter of translating between one language and another, summarising an existing knowledge base, or rewriting a text in the style of a particular author. Once you notice this, as I did last week while flitting back and forth between a project I was working on in Ghidra and a browser tab opened to ChatGPT, it might seem natural to see how an LLM handles the kinds of “paraphrasing” involved in a typical software reverse engineering workflow.

The example I’ll be working with here, unless otherwise noted, is a function carved from a firmware binary I dumped from a Canon ImageClass MF743Cdw printer.

GPT-3 Makes a Poor Disassembler

Let’s begin with disassembly:

A screenshot showing me prompting ChatGPT with a hexdump of ARM machine code.

Disassembly seems to fall squarely outside of ChatGPT’s scope, which isn’t surprising. It was trained on “natural language” in the broad sense, after all, and not on binary dumps.

A screenshot showing ChatGPT offering a fallacious disassembly of the ARM binary snippet.
A failed attempt by ChatGPT to disassemble some ARM machine code.

The GPT-3 text-davinci-003 model does no better:

A screenshot showing the text-davinci-003 model fail to provide an accurate disassembly of the hexdumped binary provided.

This, again, would be great, if it weren’t entirely wrong. Here’s what capstone (correctly) returns for the same input:

0x44b2d4b0: cmp r2, #3
0x44b2d4b4: bls #0x44b2d564
0x44b2d4b8: ands ip, r0, #3
0x44b2d4bc: beq #0x44b2d4e4
0x44b2d4c0: ldrb r3, [r1], #1
0x44b2d4c4: cmp ip, #2
0x44b2d4c8: add r2, r2, ip
0x44b2d4cc: ldrbls ip, [r1], #1
0x44b2d4d0: strb r3, [r0], #1
0x44b2d4d4: ldrblo r3, [r1], #1
0x44b2d4d8: strbls ip, [r0], #1
0x44b2d4dc: sub r2, r2, #4
0x44b2d4e0: strblo r3, [r0], #1
0x44b2d4e4: ands r3, r1, #3
0x44b2d4e8: beq #0x44b36318
0x44b2d4ec: subs r2, r2, #4
0x44b2d4f0: blo #0x44b2d564
0x44b2d4f4: ldr ip, [r1, -r3]!
0x44b2d4f8: cmp r3, #2
0x44b2d4fc: beq #0x44b2d524
0x44b2d500: bhi #0x44b2d544
0x44b2d504: lsr r3, ip, #8
0x44b2d508: ldr ip, [r1, #4]!
0x44b2d50c: subs r2, r2, #4
0x44b2d510: orr r3, r3, ip, lsl #24
0x44b2d514: str r3, [r0], #4
0x44b2d518: bhs #0x44b2d504
0x44b2d51c: add r1, r1, #1
0x44b2d520: lsr r3, ip, #0x10
0x44b2d524: ldr ip, [r1, #4]!
0x44b2d528: subs r2, r2, #4
0x44b2d52c: orr r3, r3, ip, lsl #16
0x44b2d530: str r3, [r0], #4
0x44b2d534: bhs #0x44b2d520
0x44b2d538: add r1, r1, #2
0x44b2d53c: lsr r3, ip, #0x18
0x44b2d540: ldr ip, [r1, #4]!
0x44b2d544: subs r2, r2, #4
0x44b2d548: orr r3, r3, ip, lsl #8
0x44b2d54c: str r3, [r0], #4
0x44b2d550: bhs #0x44b2d53c
0x44b2d554: add r1, r1, #3
0x44b2d558: lsls r2, r2, #0x1f
0x44b2d55c: ldrbhs r3, [r1], #1
0x44b2d560: ldrbhs ip, [r1], #1
0x44b2d564: ldrbmi r2, [r1], #1
0x44b2d568: strbhs r3, [r0], #1
0x44b2d56c: strbhs ip, [r0], #1
0x44b2d570: strbmi r2, [r0], #1
0x44b2d574: bx lr
0x44b2d578: subs r2, r2, #0x20
0x44b2d57c: blo #0x44b2d5b0
0x44b2d580: cmp r2, #0x80
0x44b2d584: blo #0x44b2d5a0
0x44b2d588: pld [r1, #0x80]
0x44b2d58c: ldm r1!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d590: sub r2, r2, #0x20
0x44b2d594: stm r0!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d598: ldm r1!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d59c: subs r2, r2, #0x20
0x44b2d5a0: stm r0!, {r3, r4, r5, r6, r7, r8, ip, lr}
0x44b2d5a4: bhs #0x44b2d598
0x44b2d5a8: lsls ip, r2, #0x1c
0x44b2d5ac: ldmhs r1!, {r3, r4, ip, lr}
0x44b2d5b0: stmhs r0!, {r3, r4, ip, lr}
0x44b2d5b4: ldmmi r1!, {r3, r4}
0x44b2d5b8: stmmi r0!, {r3, r4}
0x44b2d5bc: lsls ip, r2, #0x1e
0x44b2d5c0: ldrhs r3, [r1], #4
0x44b2d5c4: strhs r3, [r0], #4
0x44b2d5c8: bxeq lr
0x44b2d5cc: lsls r2, r2, #0x1f
0x44b2d5d0: ldrhhs r3, [r1], #2
0x44b2d5d4: ldrbmi r2, [r1], #1
0x44b2d5d8: strhhs r3, [r0], #2
0x44b2d5dc: strbmi r2, [r0], #1
0x44b2d5e0: bx lr

GPT-3 Makes a Pretty Good Decompiler

Things look a lot brighter when we turn to decompilation.

If I first disassemble the binary using Capstone and then ask the LLM to decompile the result, I start to get surprisingly accurate and informative answers.

>>> prompt_openai_model(f"Here is some ARM assembly code:\n\n```\n{s}\n```\n\nIt can be decompiled into C code as follows:\n\n```\n",
model="text-davinci-003")

This gives us the following:

if (r2 == 3) {
if ((r0 & 3) == 0) {
while (r2 > 0) {
r3 = *r1++;
*r0++ = r3;
r2--;
}
} else {
while (r2 > 0) {
if ((r0 & 3) == 2) {
r3 = *r1++;
*r0++ = r3;
r2--;
} else {
ip = *r1++;
r2 -= 4;
if (r3 == 2) {
*r0++ = (ip << 24) | (*r1++);
} else if (r3 < 2) {
*r0++ = (ip << 16) | (*r1++);
r1++;
} else {
*r0++ = (ip << 8) | (*r1++);
r1 += 3;
}
}
}
}
} else {
while (r2 > 0) {
if (r2 >= 32) {
if (r2 >= 128) {
for (int i = 0; i < 4; i++) {
*r0++ = *r1++;
*r0++ = *r1++;
*r0++ = *r1++;
*r0++ = *r1++;
r2 -= 32;
}
} else {
*r0++ = *r1++;
*r0++ = *r1++;
*r0++ = *r1++;
*r0++ = *r1++;
r2 -= 32;
}
} else {
if (r2 >= 16) {
*r0++ = *r1++;
*r0++ = *r1++;
*r0++ = *r1++;
*r0++ = *r1++;
r2 -= 16;
} else {
if (r2 >= 8) {
*r0++ = *r1++;
*r0++ = *r1++;
r2 -= 8;
} else {
if (r2 >= 4) {
*r0++ = *r1++;
r2 -= 4;
} else {
if (r2 >= 2) {
*r0++ = *r1++;
r2 -= 2;
} else {
*r0++ = *r1++;
r2--;
}
}
}
}
}
}
}

Compare this with what Ghidra gives us:


undefined8 FUN_44b2d4b0(uint *param_1,uint *param_2,uint param_3,uint param_4)

{
uint *puVar1;
uint *puVar2;
byte *pbVar3;
uint *puVar4;
byte bVar5;
undefined2 uVar6;
uint uVar7;
uint uVar8;
uint uVar9;
uint uVar10;
uint uVar11;
uint uVar12;
byte bVar13;
uint in_r12;
uint uVar14;
uint uVar15;
uint uVar16;
bool bVar17;
bool bVar18;

if (3 < param_3) {
uVar14 = param_1 & 3;
in_r12 = uVar14;
if (uVar14 != 0) {
bVar5 = *param_2;
puVar2 = param_2 + 1;
if (uVar14 < 3) {
puVar2 = param_2 + 2;
in_r12 = *(param_2 + 1);
}
*param_1 = bVar5;
param_2 = puVar2;
if (uVar14 < 2) {
param_2 = puVar2 + 1;
bVar5 = *puVar2;
}
puVar2 = param_1 + 1;
if (uVar14 < 3) {
puVar2 = param_1 + 2;
*(param_1 + 1) = in_r12;
}
param_3 = (param_3 + uVar14) - 4;
param_1 = puVar2;
if (uVar14 < 2) {
param_1 = puVar2 + 1;
*puVar2 = bVar5;
}
}
param_4 = param_2 & 3;
if (param_4 == 0) {
uVar14 = param_3 - 0x20;
if (0x1f < param_3) {
for (; 0x7f < uVar14; uVar14 = uVar14 - 0x20) {
HintPreloadData(param_2 + 0x20);
uVar7 = *param_2;
uVar8 = param_2[1];
uVar9 = param_2[2];
uVar10 = param_2[3];
uVar11 = param_2[4];
uVar12 = param_2[5];
uVar15 = param_2[6];
uVar16 = param_2[7];
param_2 = param_2 + 8;
*param_1 = uVar7;
param_1[1] = uVar8;
param_1[2] = uVar9;
param_1[3] = uVar10;
param_1[4] = uVar11;
param_1[5] = uVar12;
param_1[6] = uVar15;
param_1[7] = uVar16;
param_1 = param_1 + 8;
}
do {
param_4 = *param_2;
uVar7 = param_2[1];
uVar8 = param_2[2];
uVar9 = param_2[3];
uVar10 = param_2[4];
uVar11 = param_2[5];
uVar12 = param_2[6];
uVar15 = param_2[7];
param_2 = param_2 + 8;
bVar17 = 0x1f < uVar14;
uVar14 = uVar14 - 0x20;
*param_1 = param_4;
param_1[1] = uVar7;
param_1[2] = uVar8;
param_1[3] = uVar9;
param_1[4] = uVar10;
param_1[5] = uVar11;
param_1[6] = uVar12;
param_1[7] = uVar15;
param_1 = param_1 + 8;
} while (bVar17);
}
if (uVar14 >> 4 & 1) {
param_4 = *param_2;
uVar7 = param_2[1];
uVar8 = param_2[2];
uVar9 = param_2[3];
param_2 = param_2 + 4;
*param_1 = param_4;
param_1[1] = uVar7;
param_1[2] = uVar8;
param_1[3] = uVar9;
param_1 = param_1 + 4;
}
if (uVar14 << 0x1c < 0) {
param_4 = *param_2;
uVar7 = param_2[1];
param_2 = param_2 + 2;
*param_1 = param_4;
param_1[1] = uVar7;
param_1 = param_1 + 2;
}
puVar1 = param_1;
puVar2 = param_2;
if (uVar14 >> 2 & 1) {
puVar2 = param_2 + 1;
param_4 = *param_2;
puVar1 = param_1 + 1;
*param_1 = param_4;
}
uVar6 = param_4;
if ((uVar14 & 3) != 0) {
bVar18 = uVar14 >> 1 & 1;
uVar14 = uVar14 << 0x1f;
bVar17 = uVar14 < 0;
puVar4 = puVar2;
if (bVar18) {
puVar4 = puVar2 + 2;
uVar6 = *puVar2;
}
puVar2 = puVar4;
if (bVar17) {
puVar2 = puVar4 + 1;
uVar14 = *puVar4;
}
puVar4 = puVar1;
if (bVar18) {
puVar4 = puVar1 + 2;
*puVar1 = uVar6;
}
puVar1 = puVar4;
if (bVar17) {
puVar1 = puVar4 + 1;
*puVar4 = uVar14;
}
return CONCAT44(puVar2,puVar1);
}
return CONCAT44(puVar2,puVar1);
}
bVar17 = 3 < param_3;
param_3 = param_3 - 4;
if (bVar17) {
param_2 = param_2 - param_4;
in_r12 = *param_2;
puVar2 = param_1;
if (param_4 == 2) {
do {
puVar1 = param_2;
param_4 = in_r12 >> 0x10;
param_2 = puVar1 + 1;
in_r12 = *param_2;
bVar17 = 3 < param_3;
param_3 = param_3 - 4;
param_4 = param_4 | in_r12 << 0x10;
param_1 = puVar2 + 1;
*puVar2 = param_4;
puVar2 = param_1;
} while (bVar17);
param_2 = puVar1 + 6;
}
else if (param_4 < 3) {
do {
puVar1 = param_2;
param_4 = in_r12 >> 8;
param_2 = puVar1 + 1;
in_r12 = *param_2;
bVar17 = 3 < param_3;
param_3 = param_3 - 4;
param_4 = param_4 | in_r12 << 0x18;
param_1 = puVar2 + 1;
*puVar2 = param_4;
puVar2 = param_1;
} while (bVar17);
param_2 = puVar1 + 5;
}
else {
do {
puVar1 = param_2;
param_4 = in_r12 >> 0x18;
param_2 = puVar1 + 1;
in_r12 = *param_2;
bVar17 = 3 < param_3;
param_3 = param_3 - 4;
param_4 = param_4 | in_r12 << 8;
param_1 = puVar2 + 1;
*puVar2 = param_4;
puVar2 = param_1;
} while (bVar17);
param_2 = puVar1 + 7;
}
}
}
bVar13 = in_r12;
bVar5 = param_4;
bVar18 = param_3 >> 1 & 1;
param_3 = param_3 << 0x1f;
bVar17 = param_3 < 0;
if (bVar18) {
pbVar3 = param_2 + 1;
bVar5 = *param_2;
param_2 = param_2 + 2;
bVar13 = *pbVar3;
}
puVar2 = param_2;
if (bVar17) {
puVar2 = param_2 + 1;
param_3 = *param_2;
}
if (bVar18) {
pbVar3 = param_1 + 1;
*param_1 = bVar5;
param_1 = param_1 + 2;
*pbVar3 = bVar13;
}
puVar1 = param_1;
if (bVar17) {
puVar1 = param_1 + 1;
*param_1 = param_3;
}
return CONCAT44(puVar2,puVar1);
}

These look, at first blush, pretty close to one another. In both cases what this function looks like is something like a compiler-optimized memcpy, implemented in such a way as to exploit whatever common alignment the source and destination pointer might have.

Now, as far as machine code goes, Ghidra’s decompiler is already quite good, and there’s no real need to put a rather opaque and heuristic LLM in its place. Where LLM-driven approximate decompilations can be quite useful is when dealing with a bytecode for which a good decompiler isn’t immediately available. Another researcher on the Tenable Zero Day team, Jimi Sebree, was able to coax ChatGPT into producing reasonably useful (if imperfect) decompilations of Lua bytecode while reversing a certain router’s LuCI front-end. This took us from something like this:

A screenshot of some Lua bytecode.
An (incomplete) snippet of disassembled Lua bytecode, decompiled by ChatGPT below.

To something like this:

module("luci.controller.admin.access_control", package.seeall)
local uci = require("luci.model.uci")
local controller = require("luci.model.controller")
local sys = require("luci.sys")
local form = require("luci.tools.form")
local debug = require("luci.tools.debug")
local client_mgmt = require("luci.model.client_mgmt")
local checktypes = require("luci.model.checktypes")
local cur = uci.cursor()
form = form.Form(cur, {"mac"})
local json = require("luci.json")
local http = require("luci.http")
local util = require("luci.util")
local ctl = controller.Controller()
local mgmt = client_mgmt.Client_mgmt()

function index()
entry({"admin", "access_control"}, call("action_index"), _("Access Control"), 3)
entry({"admin", "access_control", "details"}, call("action_details"), nil)
entry({"admin", "access_control", "details", "add"}, call("action_details_add"), nil)
entry({"admin", "access_control", "details", "add_from_pool"}, call("action_details_add_from_pool"), nil)
entry({"admin", "access_control", "details", "delete"}, call("action_details_delete"), nil)
end

function action_index()
local mac_list = mgmt:get_mac_list()
http.prepare_content("application/json")
http.write_json(mac_list)
end

function action_details()
local mac = http.formvalue("mac")
if not mac then
return
end
http.prepare_content("application/json")
http.write_json(mgmt:get_mac_details(mac))
end

As far as machine code binaries go, however, it seems that Ghidra already has decompilation well in hand. What the Ghidra framework could use, it seems, is a way of providing high-level explanations of what a particular decompiled function is doing. And this is just the sort of paraphrasing at which large language models excel.

GPT-3’s Not Bad at Commentary

So now here comes the interesting part. Let’s go back to that function we dumped from the Canon ImageClass printer at address 0x44b2d4b0, take the decompiled code that Ghidra gave us, and ask the LLM to provide it with a high-level gloss, explaining in “natural language” just what the code is doing.

>>> print(prompt_openai_model(f"""Here is some decompiled C code:```
{d}
```
Give me a high-level explanation of what this code is doing, and suggest a name for the function in question."""))
This code is performing a memory copy operation. It is copying data from one memory location to another. The code is looping through the source memory location, and copying each byte to the destination memory location. It is also taking into account any alignment issues that may arise.
A suitable name for this function could be 'memory_copy'.

This is a pretty astute bird’s-eye view. The function is, indeed, a memory copying function, optimized in such a way as to exploit any alignment the memory pointers share.

To recap, we’ve observed how the workflow of a software reverse engineer involves (among other things) a series tasks that can be loosely grouped together as operations of translation or paraphrase. These include, but are not limited to,

  1. data carving and the parsing of filesystems and files
  2. disassembly
  3. decompilation
  4. annotation and commentary

The modern-day reverse engineer is equipped with tools that can automate the first three tasks — albeit never in a foolproof fashion, and the reverser who relies entirely on their automated toolbox is no reverser at all. That the abstractions we deal in deceive us is something reverse engineers take for granted, after all, and this goes for the abstractions our tools employ no less than the abstractions our targets use.

Introducing G-3PO

What these quick and dirty experiments with an LLM suggest is that that the fourth process listed here, the paraphrase of disassembled or decompiled code into high-level commentary, can be assisted by automated tooling as well.

And this is just what the G-3PO Ghidra script does.

The output of such a tool, of course, would have to be carefully checked. Taking its soundness for granted would be a mistake, just as it would be a mistake to put too much faith in the decompiler. We should trust such a tool, backed as it is by an opaque LLM, far less than we trust decompilers, in fact. Fortunately reverse engineering is the sort of domain where we don’t need to trust much at all. It’s an essentially skeptical craft. The reverser’s well aware that every non-trivial abstraction leaks, and that complex hardware and software systems rarely behave as expected. The same healthy skepticism should always extend to our tools.

Developing the G-3PO Ghidra Script

Developing the G-3PO Ghidra script was surprisingly easy. The lion’s share of the work was just a matter of looking up various APIs and fiddling with a somewhat awkward development environment.

One of the weaknesses in Ghidra’s Python scripting support is that it’s restricted to the obsolete and unmaintained “Jython” engine, a Python 2.7 interpreter that runs on the Java Virtual Machine. One option would have been to make use of the Ghidra to Python Bridge, a supplementary Ghidra script that lets you interact with Ghidra’s Jython interpreter from the Python 3 environment of your choice, over a local socket, but since my needs were pretty spare, I didn’t want to overburden the project with extra dependencies. All I really needed from the OpenAI Python module after all was an easy way to serialise, send, receive and parse HTTP requests that conform to the OpenAI API. Ghidra’s Jython distribution doesn’t come with therequests module included, but it does provide httplib, which is almost as convenient (in an earlier draft, I overlooked httplib and resorted to calling curl via subprocess, a somewhat ugly and insecure solution):

def send_https_request(address, path, data, headers):
try:
conn = httplib.HTTPSConnection(address)
json_req_data = json.dumps(data)
conn.request("POST", path, json_req_data, headers)
response = conn.getresponse()
json_data = response.read()
conn.close()
try:
data = json.loads(json_data)
return data
except ValueError:
logging.error("Could not parse JSON response from OpenAI!")
logging.debug(json_data)
return None
except Exception as e:
logging.error("Error sending HTTPS request: {e}".format(e=e))
return None


def openai_request(prompt, temperature=0.19, max_tokens=MAXTOKENS, model=MODEL):
data = {
"model": MODEL,
"prompt": prompt,
"max_tokens": max_tokens,
"temperature": temperature
}
# The URL is "https://api.openai.com/v1/completions"
host = "api.openai.com"
path = "/v1/completions"
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer {openai_api_key}".format(openai_api_key=os.getenv("OPENAI_API_KEY")),
}
data = send_https_request(host, path, data, headers)
if data is None:
logging.error("OpenAI request failed!")
return None
logging.info("OpenAI request succeeded!")
logging.info("Response: {data}".format(data=data))
return data

This is good enough to avoid any dependency on the Python openai library.

The prompt that G-3PO sends to the LLM is pretty basic, and there’s certainly room to tweak it a little in search of better results. What I’m currently using looks like this:

prompt = """
Below is some C code that Ghidra decompiled from a binary that I'm trying to
reverse engineer.

```
{c_code}
```
Please provide a detailed explanation of what this code does, in {style},
that might be useful to a reverse engineer. Explain your reasoning as much
as possible. {extra}

Finally, suggest suitable names for this function and its parameters.
""".format(c_code=c_code, style=STYLE, extra=EXTRA)

The c_code interpolated into the prompt is taken from the output of the Ghidra decompiler, for the function the user is currently inspecting. Quite usefully, this output includes any comments, variable names, or type annotations that the user has already added to the code listing, allowing the LLM to build on the user’s work. The exception is the plate comment positioned at the beginning of the function. This is where G-3PO will place its own comments, and if there’s already a comment in that slot it will refuse to clobber it, unless that comment begins with the phrase OpenAI GPT-3 generated comment.

G-3PO in Action

Here’s how the script looks in action, commenting on the same function we looked at in the previous section, the infelicitously named FUN_44b2d4b0(), to which G-3PO suggests we should give the name CopyMemory():

A screenshot showing G-3PO providing commentary for a function in Ghidra.

Here’s what the comment says, so that you don’t have to squint:

OpenAI GPT-3 generated comment, take with a grain of salt:
This code is a function that copies data from one memory location to another. It takes four parameters: two pointers to memory locations (param_1 and param_2), and two integers (param_3 and param_4).
The function first checks if the value of param_3 is greater than 3. If it is, it checks if the first pointer (param_1) is aligned to a 4-byte boundary. If it is not, it copies the first byte from the second pointer (param_2) to the first pointer (param_1), and then checks if the second pointer (param_2) is aligned to a 4-byte boundary. If it is not, it copies the next byte from the second pointer (param_2) to the first pointer (param_1). It then subtracts the number of bytes copied from param_3 and adjusts the pointers accordingly.
The function then enters a loop, where it copies 8 bytes from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly. This loop is repeated until the value of param_3 is less than 0x20.
If the value of param_3 is less than 0x20, the function checks if the value of param_3 is greater than 3. If it is, it copies 4 bytes from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly. It then checks if the value of param_3 is greater than 1. If it is, it copies 2 bytes from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly. Finally, it checks if the value of param_3 is greater than 0. If it is, it copies 1 byte from the second pointer (param_2) to the first pointer (param_1) and adjusts the pointers accordingly.
The function then returns the adjusted pointers.
A suitable name for this function is “CopyMemory”. The variables bearing
default names can be renamed as follows:
 — param_1 -> destPtr
 — param_2 -> srcPtr
 — param_3 -> numBytes
 — param_4 -> alignment
Model: text-davinci-003, Temperature: 0.19

A close look at the code does indeed reveal something like an optimized memcpy function. It differs from a naïve memory copying loop, say

for (i=0; i<numBytes; i++) {
destPtr[i] = srcPtr[i];
}

only in seeking to copy the largest aligned chunks in each iteration, up to a chunk size of 32 bytes. In this fashion, the processor can take advantage of the ARM ldmia instruction, which lets it read a block of memory into several registers in sequence, and the stmia instruction, which lets it write from a sequence of registers into a block of memory. Here’s what the assembly code looks like, for example, where the function is copying 32-byte (i.e., 8-word) blocks of memory at once:

                                 LAB_44b3632c                                            XREF[1]:       44b36340(j)  
44b3632c 80 f0 d1 f5 pld [r1,#0x80]
44b36330 f8 51 b1 e8 ldmia r1!,{r3,r4,r5,r6,r7,r8,r12,lr}=>LAB_454918fc
44b36334 20 20 42 e2 sub r2,r2,#0x20
44b36338 80 00 52 e3 cmp r2,#0x80
44b3633c f8 51 a0 e8 stmia r0!,{r3,r4,r5,r6,r7,r8,r12,lr}=>DAT_e1a04005
44b36340 f9 ff ff 2a bcs LAB_44b3632c

A Backdoor Lockpick, Revisited

Earlier this year, I posted a detailed analysis of a vulnerable backdoor that I found on every Phicomm router I could get my hands on. At the heart of that analysis was a fine-grained reversing of the telnetd_startup binary that I found in these routers’ firmware, beginning with the version discovered on the Phicomm K2GA router, firmware version 22.6.3.20. Let’s see what G-3PO has to say about a few of the important functions in that binary, unaided by my own comments, corrections, and annotations to the decompiler output (starting from scratch, that is, with the output Ghidra provides).

A screenshot of Ghidra’s decompilation of a function from the Phicomm backdoor binary.
The unannotated Ghidra decompilation for a function in the Phicomm backdoor binary, telnetd_startup, which checks to see if either a reset request or the initial handshake has been received.

Here’s a function that the backdoor server calls to check to see if the client has supplied the magic handshake, ABCDEF1234, when the state machine is in its initial state, or if the client has requested to restart the state machine.

This is what G-3PO had to say about this fairly simple function:

G-3PO glossing a function in the Phicomm backdoor binary, telnetd_startup.
G-3PO’s commentary on one of the functions called in the Phicomm backdoor binary, telnetd_startup.

This is certainly correct, though we should note that what’s significant about this function, for the reverse engineer, has more to do with the role it plays in the state machine implementing the backdoor protocol as a whole.

In certain cases, we see that G-3PO makes an incorrect but reasonable guess — such as when it mistakes the function that generates the ephemeral keys for temporarily and permanently unlocking the backdoor for a function that sets up file system paths. The difference, however, is entirely a matter of context, and we can steer our droid — which can’t, after all, see anything outside this single block of code — in the right direction by naming a few of the other functions thatthis function calls.


/* /--------------------------------------------------------------------------------\
|OpenAI GPT-3 generated comment, take with a grain of salt: |
| |
| This code appears to be setting up two different paths, one for permanent |
|storage and one for temporary storage. The function is first setting up a buffer|
|of 88 bytes (auStack_60) and then using sprintf to create two strings, |
|acStack_460 and acStack_260, which are the paths for the permanent and temporary|
|storage, respectively. The function then calls FUN_00401f30 with the buffer as |
|an argument, followed by a call to FUN_00402b28 with the buffer, the permanent |
|path string, and the length of the permanent path string as arguments. This is |
|followed by a call to FUN_00402c28 with the buffer and a constant string as |
|arguments. The function then repeats the same steps for the temporary path |
|string. |
| This function could be named setupPaths(). The variable auStack_60 could be |
|named buffer, acStack_460 could be named permPath, and acStack_260 could be |
|named tempPath. |
| |
|Model: text-davinci-003, Temperature: 0.19 |
\--------------------------------------------------------------------------------/
/
/~\
|oo )
_\=/_
/ \
//|/.\|\\
|| \_/ ||
|| |\ /| ||
# \_ _/ #
| | |
| | |
[]|[]
| | |
/_]_[_\
*/

/* DISPLAY WARNING: Type casts are NOT being printed */

undefined4 FUN_0040182c(void)

{
size_t sVar1;
char acStack_460 [512];
char acStack_260 [512];
undefined auStack_60 [88];

memset(auStack_60,0,0x58);
sprintf(acStack_460,"%s+PERM",&DAT_00414b80);
sprintf(acStack_260,"%s+TEMP",&DAT_00414b80);
FUN_00401f30(auStack_60);
sVar1 = strlen(acStack_460);
FUN_00402b28(auStack_60,acStack_460,sVar1);
FUN_00402c28(auStack_60,&DAT_00414c20);
FUN_00401f30(auStack_60);
sVar1 = strlen(acStack_260);
FUN_00402b28(auStack_60,acStack_260,sVar1);
FUN_00402c28(auStack_60,&DAT_00414c30);
return 0;
}


/* /--------------------------------------------------------------------------------\
|OpenAI GPT-3 generated comment, take with a grain of salt: |
| |
| This code is responsible for generating two MD5 hashes from two strings. The|
|first string is a concatenation of the XORED_MSG_0014b80 variable and the string|
|"+PERM". The second string is a concatenation of the XORED_MSG_0014b80 variable |
|and the string "+TEMP". |
| The function begins by setting an array of 88 bytes to 0. This array is used|
|to store the MD5 context. It then uses sprintf to create two strings, |
|acStack_460 and acStack_260, which are the two strings mentioned above. |
| The function then calls md5_init with the MD5 context array as an argument. |
|This initializes the MD5 context. It then calls strlen to get the length of the |
|first string, acStack_460, and passes it to md5_add along with the MD5 context |
|array and the first string. This adds the first string to the MD5 context. |
| The function then calls md5_digest with the MD5 context array and the |
|address of the DAT_00414c20 variable as arguments. This generates an MD5 hash |
|from the first string and stores it in the DAT_00414c20 variable. |
| The function then calls md5_init again with the MD5 context array as an |
|argument. It then calls strlen to get the length of the second string, |
|acStack_260, and passes it to md5_add along with the MD5 context array and the |
|second string. This adds the second string to the MD5 context. |
| The function then calls md5_digest with the MD5 context array and the |
|address of the DAT_00414c30 variable as arguments. This generates an MD5 hash |
|from the second string and stores it in the DAT_00414c30 variable. |
| This function could be named generateMD5HashesFromStrings. The variables |
|could be renamed as follows: |
| - acStack_460 -> permString |
| - acStack_260 -> tempString |
| - auStack_60 -> md5Context |
| - DAT_00414c20 -> permHash |
| - DAT_00414c30 -> tempHash |
| |
|Model: text-davinci-003, Temperature: 0.19 |
\--------------------------------------------------------------------------------/
/
/~\
|oo )
_\=/_
/ \
//|/.\|\\
|| \_/ ||
|| |\ /| ||
# \_ _/ #
| | |
| | |
[]|[]
| | |
/_]_[_\
*/

/* DISPLAY WARNING: Type casts are NOT being printed */

undefined4 FUN_0040182c(void)

{
size_t sVar1;
char acStack_460 [512];
char acStack_260 [512];
undefined auStack_60 [88];

memset(auStack_60,0,0x58);
sprintf(acStack_460,"%s+PERM",&XORED_MSG_0014b80);
sprintf(acStack_260,"%s+TEMP",&XORED_MSG_0014b80);
md5_init(auStack_60);
sVar1 = strlen(acStack_460);
md5_add(auStack_60,acStack_460,sVar1);
md5_digest(auStack_60,&DAT_00414c20);
md5_init(auStack_60);
sVar1 = strlen(acStack_260);
md5_add(auStack_60,acStack_260,sVar1);
md5_digest(auStack_60,&DAT_00414c30);
return 0;
}

But how do we know that these auxiliary functions are setting up an MD5 hash? Well, we could figure it out the old fashioned way, as I did when I wrote the report on the backdoor earlier this year, by recognizing the constants used in initializing the MD5 struct, or we could just let our droid have a go at it. The droid, it turns out, does a pretty good job:


/* /--------------------------------------------------------------------------------\
|OpenAI GPT-3 generated comment, take with a grain of salt: |
| |
| This code is initializing a data structure used for the MD5 hashing |
|algorithm. The data structure is an array of 5 32-bit unsigned integers, which |
|are stored in the memory pointed to by the parameter 'param_1'. The first |
|element of the array is set to 0, the second element is set to 0x67452301, the |
|third element is set to 0, the fourth element is set to 0xefcdab89, the fifth |
|element is set to 0x98badcfe, and the sixth element is set to 0x10325476. |
| A suitable name for this function could be 'init_md5_state'. The parameter |
|'param_1' could be renamed to 'md5_state', since it is a pointer to the array of|
|integers used to store the state of the MD5 hashing algorithm. |
| |
|Model: text-davinci-003, Temperature: 0.19 |
\--------------------------------------------------------------------------------/
/
/~\
|oo )
_\=/_
/ \
//|/.\|\\
|| \_/ ||
|| |\ /| ||
# \_ _/ #
| | |
| | |
[]|[]
| | |
/_]_[_\
*/

/* DISPLAY WARNING: Type casts are NOT being printed */

void FUN_00401f30(undefined4 *param_1)

{
*param_1 = 0;
param_1[2] = 0x67452301;
param_1[1] = 0;
param_1[3] = 0xefcdab89;
param_1[4] = 0x98badcfe;
param_1[5] = 0x10325476;
return;
}

The droid provides a reasonable description of the main server loop in the backdoor binary, too:

Screenshot of G-3PO providing a comment on a decompiled function in Ghidra.
G-3PO glossing the main server loop in the Phicomm backdoor binary, telnetd_startup.

Installing and Using G-3PO

So, G-3PO is now ready for use. The only catch is that it does require an OpenAI API key, and the text completion service is unfree (as in beer, and as insofar as the model’s a black box). It is, however, reasonably cheap, and even with heavy use I haven’t spent more than the price of a cup of coffee while developing, debugging, and toying around with this tool.

To run the script:

  • get yourself an OpenAI API key
  • add the key as an environment variable by putting export OPENAI_API_KEY=whateveryourkeyhappenstobe in your ~/.profile file, or any other file that will be sourced before you launch Ghidra
  • copy or symlink c3po.py to your Ghidra scripts directory
  • add that directory in the Script Manager window
  • visit the decompiler window for a function you’d like some assistance interpreting
  • and then either run the script from the Script Manager window by selecting it and hitting the ▶️ icon, or bind it to a hotkey and strike when needed

Ideally, I’d like to provide a way for the user to twiddle the various parameters used to solicit a response from model, such as the “temperature” in the request (high temperatures — approaching 2.0 — solicit a more adventurous response, while low temperatures instruct the model to respond conservatively), all from within Ghidra. There’s bound to be a way to do this, but it seems neither the Ghidra API documentation, Google, nor even ChatGPT are offering me much help in that regard, so for now you can adjust the settings by editing the global variables declared near the beginning of the g3po.py source file:

##########################################################################################
# Script Configuration
##########################################################################################
MODEL = "text-davinci-003" # Choose which large language model we query
TEMPERATURE = 0.19 # Set higher for more adventurous comments, lower for more conservative
TIMEOUT = 600 # How many seconds should we wait for a response from OpenAI?
MAXTOKENS = 512 # The maximum number of tokens to request from OpenAI
C3POSAY = True # True if you want the cute C-3PO ASCII art, False otherwise
LANGUAGE = "English" # This can also be used as a style parameter.
EXTRA = "" # Extra text appended to the prompt.
LOGLEVEL = INFO # Adjust for more or less line noise in the console.
COMMENTWIDTH = 80 # How wide the comment, inside the little speech balloon, should be.
C3POASCII = r"""
/~\
|oo )
_\=/_
/ \
//|/.\|\\
|| \_/ ||
|| |\ /| ||
# \_ _/ #
| | |
| | |
[]|[]
| | |
/_]_[_\
"""
##########################################################################################

The LANGUAGE and EXTRA parameters provide the user with an easy way to play with the form of the LLM’s commentary. Setting style to "in the form of a sonnet", for example, gives us results like this:

A screenshot of G-3PO glossing a function in sonnet form.
G-3PO glossing the main loop function in the Phicomm backdoor binary, telnetd_startup, in the form of a sonnet.
A screenshot of G-3PO glossing a function in sonnet form.
G-3PO glossing the optimized memory copy function in the Canon printer firmware, in the form of a sonnet.

These are by no means good sonnets, but you can’t have everything.

G-3PO is open sourced and released under an MIT license. You can find the script in Tenable’s public Github repository HERE.

Happy holidays and happy hacking!


G-3PO: A Protocol Droid for Ghidra was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

A Backdoor Lockpick

Reversing Phicomm’s Backdoor Protocols

TL;DR

  1. Phicomm’s router firmware has numerous critical vulnerabilities that can be chained together by a remote, unauthenticated attacker to gain a root shell on the device.
  2. Every Phicomm router firmware since at least 2017 exposes a cryptographically locked backdoor.
  3. I’ve analysed this backdoor’s network protocol through three distinct iterations, across eleven firmware versions.
  4. And I show how the backdoor’s cryptographic lock can be “picked” to grant a root shell to an attacker.
  5. Phicomm is no more. These devices will never be patched.
  6. Not only are Phicomm devices still on the market, but their surplus is being resold by other vendors, such as Wavlink, who occasionally neglect to reflash the device and ship it with the vulnerable Phicomm firmware.

A Phicomm in Wavlink’s Clothing

In early September, 2021, a fairly ordinary and inexpensive residential router came into the Zero Day research team’s possession.

The WAVLINK AC1200, an inexpensive WiFi Router.

It was branded as a Wavlink AC1200 WiFi Router, a model that you can find on Amazon for under $30.

When I plugged in the router and attempted to navigate the browser to its administrative interface — which, according to the sticker on the bottom of the router, should have been waiting for us at 192.168.10.1 –things took an unexpected turn. The router’s DHCP server, to begin with, had assigned us an address on the 192.168.2.0/24 subnet, with 192.168.2.1 as its default gateway.

And this is what was waiting to greet me:

This doesn’t look like WAVLINK firmware…

If the Amazon reviews for the WAVLINK AC1200 are anything to go by, I wasn’t alone in this particular situation.

Quite suspicious!

With a little help from Google Translate, I set about exploring this unexpected Phicomm interface. The System Status (系统状态) page identifies the device model as K2G, hardware version A1, running firmware version 22.6.3.20.

The System Status (系统状态) page in the Phicomm firmware’s administrative web UI.

An online search for “Phicomm K2G A1” turned up a few listings for this product, which indeed bears a striking resemblance to the “WAVLINK” router we’d received from Amazon. In many cases the item was listed as “discontinued”.

This looks familiar.
A familiar looking router, with the original Phicomm branding.
Do you see the difference? (The branding is the difference.)

I take a stab at reconstructing the story of how, exactly, K2G A1 routers with Phicomm firmware made their way to the market with WAVLINK branding in the Appendix to this post, but first let’s look at a few particularly interesting vulnerabilities in this misbegotten router.

How to Get the Wifi Password

It’s never a good idea to enable remote management on a residential router, but that rarely prevents vendors from offering this feature, and there will always be users unable to resist the temptation of exposing the controls to their LAN to the Internet at large, nominally protected by a flimsy password authentication mechanism at best.

Like many other residential routers, the Phicomm K2G A1 provides this feature, and a quick perusal of Shodan shows that remote management’s been enabled on many such devices.

If the user decides to enable remote management, the UI will suggest 8181 as the default port for the administrative web interface, and 255.255.255.255 as default netmask (which will expose port 8181 to the entire WAN, which in the case of most residential networks means the Internet).

A basic Shodan search suggests that plenty of users (most of them in China) have made precisely these choices when setting up their routers.

A shodan.io search, showing some results consistent with the remote management interface on certain Phicomm routers.
A shodan.io search for “port:8181 luci”, many of whose results bear a very close resemblance to the remote-management webserver on the Phicomm K2G router.

Access to the admin panel itself requires knowledge of the password that the user chose when setting up the router. Phicomm allows the user to save several seconds and ease the burden of memory by clicking a checkbox and setting the admin password to be the same as the 2.4GHz wireless password.

The Phicomm firmware’s administrative web server exposes a number of interfaces, such as /LocalMACConfig.asp or /wirelesssetup.asp, which can be used to get and set router configuration parameters without requiring any authentication whatsoever. This is especially hazardous when remote management has been enabled, since it effectively grants administrative control of several router settings to any passer-by on the internet, and discloses some highly sensitive information.

For example, if you’re curious what devices might be connected to the router’s local area network, all you need to do is issue a request to http://10.3.3.12:8181/LocalClientList.asp?action=get (assuming 10.3.3.12 is the router’s IP address and 8181 is its remote management port):

A screenshot showing how a LAN directory can be obtained from the management webserver without authentication.
Obtaining LAN information from the Phicomm management webserver, without authentication.

Here we see the Kali and pfSense VMs I’ve connected to the Phicomm router, along with an iPad that’s spoofing its MAC address.

But suppose we’d like to connect to this LAN ourselves. If the router’s nearby, we could try to connect to one of its WiFi networks. But how do we get the password? It turns out that all you need to do is ask and the router will gladly provide it:

Screenshot showing how the WiFi passwords can be obtained without authentication.
Obtaining the WiFi passwords from the remote management service without authentication.

If the owner of that router had taken Phicomm up on its suggestion that they use the same password for both the 2.4GHz wireless network and the administrative interface, then you now have remote administrative access to the router as well.

Screenshot of the Phicomm admin panel.
Phicomm explicitly offers to set the web admin password to the 2.4GHz WiFi password.

But even if you’re not so lucky, there are a number of setting operations that the pseudo-asp endpoints enable as well.

A screenshot of the Phicomm router’s web admin UI, showing the LAN information.
The LAN information page in the administrative web UI.
A screenshot showing how to rename hosts on the target’s LAN.
You can use the unauthenticated remote management endpoint to rename hosts on the target’s LAN.
The results of this renaming attack. This is a vector for pushing potentially malicious content into the administrative web UI.

If we were feeling a little less kind, or felt that this was a network that was best avoided and decided to take matters into our own hands, we could use the same interface to ban local users from the network.

We are also able to ban users from the LAN, from the WAN, without needing any prior authentication.
What the unfortunate client sees in their browser after being banned in this way.

This type of ban only bars access to the router and the WAN, and can be easily evaded by changing the client’s MAC address.

Changing the MAC address to evade the ban.

An unbanning request for a particular MAC address can be issued by setting BlockUser parameter to 0.

[+] Requesting url http://10.3.3.12:8181//LocalMACConfig.asp?action=set&BlockUser=0&MAC=A6%3aDC%3a5C%3aF6%3a2C%3a2B&IP=unknown&DeviceRename=kali&isBind=0&ifType=0&UpMax=0&DownMax=0&_=1642459782743
{'retMACConfigresult': {'ALREADYLOGIN': 0, 'MACConfigresult': 1}}
We see that the ban depends on the MAC address of the LAN-side client. We also see that this ban can be lifted in much the same way that it was imposed, by a WAN-side machine issuing unauthenticated requests.

The library responsible for handling these .asp endpoints is the lighttpd module, mod_mobileapp.so. Of the 68 or so endpoints defined by the administrative interface, 18 can be triggered without requiring any authentication from the user. These include wirelesssetup.asp and any bearing the prefix Local:

LocalCheckClientNumber.asp
LocalCheckDetectFinish.asp
LocalCheckInetHealthStatus.asp
LocalCheckInetLinkStatus.asp
LocalCheckInetSpeedStatus.asp
LocalCheckInterfacelink.asp
LocalCheckNetworkType.asp
LocalCheckRouterPassword.asp
LocalCheckWIFI.asp
LocalCheckWanStatus.asp
LocalCheckWifiPassword.asp
LocalCheckWirelessStatus.asp
LocalClientList.asp
LocalIndex.asp
LocalMACConfig.asp
LocalNetworkSet.asp
LocalStartAutodetect.asp
wirelesssetup.asp

Escalating from an Authenticated Admin Session to a Root Shell on the Router

Suppose that you’ve managed to access the admin panel on a Phicomm K2G A1 router, thanks to the careless exposure of the admin password through the non-authenticated /wirelesssetup.asp?action=get endpoint. Obtaining a root shell on the device is now fairly straightforward, due to a command injection vulnerability in the Phicomm interface, which appears to already be fairly well-known among Phicomm router hackers. Upantool has provided a comprehensive writeup documenting this attack vector (Google translate can be helpful here, if, like me, you can’t read Chinese).

A screenshot of a post-auth command injection attack, courtesy of UpanTool.

The command injection attack is triggered by submitting the string | /usr/sbin/telnetd -l /bin/login.sh where the firmware update menu asks for a time of day at which to check for updates. The router will pass the time of day given to a shell command, which it will run with root privileges, and the pipe symbol | will instruct it to send the output of the first command to a second, which is supplied by the attacker. The injected command, /usr/sbin/telnetd -l /bin/login.sh, opens a root shell that the attacker can connect to over telnet, on port 23.

This was indeed the method I used to obtain a root shell, explore the router’s runtime environment, and download its firmware to my workstation for further analysis. (I did this the easy way, by piping each block device through gzip and over netcat to my host, and then extracting the filesystems with binwalk.)

Verification that the command injection attack documented by UpanTool works.

The first thing I wanted to do when I got there was to look at the output of netstat -tunlp to see what other services might be listening on this device.

Using netstat on the router to find which services are listening on which UDP and TCP ports.

Notice the service listening on UDP port 21210, which netstat identifies as telnetd_startup. This service provides a cryptographically locked backdoor into the router, and in the next section, we’re going to see, first, how the lock works, and second, how to pick it.

Reverse Engineering the Phicomm Backdoor

The Phicomm telnetd_startup service superficially resembles Netgear’s telnetEnable daemon, and serves a similar purpose: to allow an authorized party to activate the telnet service, which will, in turn, provide that party with a root shell on the router. What distinguishes the Phicomm backdoor is not just its elaborate challenge-and-response protocol, but that it requires that the authorized party employ a private RSA key to unlock it. This requirement, however, is not foolproof, and a critical loophole in telnetd_startup allows an attacker to “pick” the cryptographic lock without any need of the key.

Initial State

telnetd_startup begins by listening unobtrusively on UDP port 21210. Until it receives a packet containing the magic 10-byte handshake, ABCDEF1234, it will remain completely silent. Nmap will report UDP port 21210 as open|filtered, and provide no clue as to what might be listening there.

Control flow diagram of the main event loop in the telnetd_startup binary.

If the service does receive the magic handshake, it will respond with a UDP packet of its own, carrying a 16-byte buffer. An analysis of the daemon’s binary code reveals the tell-tale constants of an MD5 hash function, which would be consistent with the length of 16 bytes.

Disassembly of the block of code in telnetd_startup that initializes the hasher used to produce the product-identifying message. This hasher can be recognized as MD5 by its tell-tale constants.

void md5_init(
uint *context)
{
*context = 0;
context[2] = 0x67452301;
context[1] = 0;
context[3] = 0xefcdab89;
context[4] = 0x98badcfe;
context[5] = 0x10325476;
return;
}
Control-flow diagram of the hashing function, recognizable as MD5.
void md5_add(uint *param_1,void *param_2,uint param_3)
{
uint uVar1;
uint uVar2;
uint __n;

uVar2 = (*param_1 << 0x17) >> 0x1a;
uVar1 = param_3 * 8 + *param_1;
__n = 0x40 - uVar2;
*param_1 = uVar1;
if (uVar1 < param_3 * 8) {
param_1[1] = param_1[1] + 1;
}
param_1[1] = param_1[1] + (param_3 >> 0x1d);
if (param_3 < __n) {
__n = 0;
}
else {
memcpy((void *)((int)param_1 + uVar2 + 0x18),param_2,__n);
FUN_00402004(param_1 + 2,param_1 + 6);
while( true ) {
uVar2 = 0;
if (param_3 < __n + 0x40) break;
FUN_00402004(param_1 + 2,(int)param_2 + __n);
__n = __n + 0x40;
}
}
memcpy((void *)((int)param_1 + uVar2 + 0x18),(void *)((int)param_2 + __n),param_3 - __n);
return;
}
The block of code responsible for sending the product-identifying hash back to the client that sends the router the initiating handshake token (“ABCDEF1234”).

With a bit of help and annotation, Ghidra decompiles that code block into the following C-code:

memset(&K2_COSTDOWN__VER_3.0_at_00414ba0,0,0x80);             memcpy(&K2_COSTDOWN__VER_3.0_at_00414ba0,"K2_COSTDOWN__VER_3.0",0x14);
memset(md5,0,0x58);
md5_init(md5);
md5_add(md5,&K2_COSTDOWN__VER_3.0_at_00414ba0,0x80);
md5_digest(md5,&HASH_OF_K2_COSTDOWN_at_4149a0);
MD5_HASH_OF_K2_COSTDOWN_STRING_COPY_at_401d30 = 0;
DAT_00414b74 = 0;
DAT_00414b78 = 0;
DAT_00414b7c = 0;
memcpy(&MD5_HASH_OF_K2_COSTDOWN_STRING_COPY_at_401d30,
&HASH_OF_K2_COSTDOWN_at_4149a0,
0x10);
sendto(SKT,
&MD5_HASH_OF_K2_COSTDOWN_STRING_COPY_at_401d30,
0x10,
0,
&src_addr,
addrlen);
CHECK_STATE_004147e0 = 0;

The string that gets hashed here is "K2_COSTDOWN__VER_3.0", a product identification string, which is first copied into a zeroed-out buffer 128 bytes in length. This can easily be verified.

Verification that the product-identifying message does indeed contain an MD5 hash of a descriptive string found in the telnetd_startup binary.

After this exchange, a global variable at address 0x004147e0 is switched from its initial value of 2 to 0, and the main loop of the server enters another iteration. What we’re looking at, here, is a finite state machine, and the handshake token, "ABCDEF1234" is what sends it from the initial state into the second.

Second State

Control flow diagram of the next stage of the protocol, where the second message received from the client is “decrypted” using a hard-coded public RSA key, a random secret is generated, and then the “decrypted” message is XORed with the random secret, which is then used to generate ephemeral passwords by the set_telnet_enable_keys() function.

In the second state, shown above, in basic block graph form, and below, decompiled into C code, five important things happen after the client replies to the message containing the product-identifying hash:

S = ingest_token(payload_buffer,2);
if (S != 2) {
memset(&PAYLOAD_00414af0,0,0x80);
memcpy(&PAYLOAD_00414af0,payload_buffer,number_of_bytes_received);
S = rsa_public_decrypt_payload();
if (S != 0) break;
CHECK_STATE_004147e0 = 1;
generate_random_plaintext();
rsa_encrypt_with_public_key();
sendto(SKT,&ENCRYPTED_at_4149f0,0x80,0,&src_addr,addrlen);
xor_decrypted_payload_with_plaintext();
set_telnet_enable_keys();
goto LAB_00401e1c;
}

1. Decryption of the client’s message with a public key

The reply, which is assumed to have been encrypted with the client’s private key, is then decrypted with a public RSA key that’s been hardcoded into the binary.

It’s unclear exactly what the designers of this algorithm expect the encrypted blob to contain, and indeed there’s nothing in what follows that would really constrain its contents in any way. This step to some extent resembles the authentication request stage of the SSH public key authentication protocol. This is where the client sends the server a request containing:

  1. the username,
  2. the public key to be used, and
  3. a signature

The signature is produced by first hashing a blob of data known to both parties — the username, for example, or session ID — and then encrypting that hash with the private key that corresponds to the public key sent (2). Something similar seems to be taking place at this stage of the Phicomm backdoor protocol, except that the content of the “signature” isn’t checked in any way. There’s no username, after all, for the client to provide, and just a single valid keypair in play, which determined by the server’s own hardcoded public key. (Thanks to my colleague, Katie Sexton, for highlighting this resemblance and helping me make sense of this stage of the protocol.)

Control flow graph of the function that “decrypts” the client’s message using the hardcoded public RSA key.

Note the constant 3 passed to the OpenSSL library function, RSA_public_decrypt, which specifies that no padding is to be used. This will make our lives a significantly easier in the near future.

int rsa_public_decrypt_payload(void)
{
RSA *rsa;
BIGNUM *a;
int n;
uint digest_len;
size_t length_of_decrypted_payload;
BIGNUM *local_18 [3];
rsa = RSA_new();
local_18[0] = BN_new();
a = BN_new();
BN_set_word(a,0x10001);
BN_hex2bn(local_18, "E541A631680C453DF31591A6E29382BC5EAC969DCFDBBCEA64CB49CBE36578845C507BF5E7A6BCD724AFA70 63CA754826E8D13DBA18A2359EB54B5BE3368158824EA316A495DDC3059C478B41ABF6B388451D38F3C6650C DB4590C1208B91F688D0393241898C1F05A6D500C7066298C6BA2EF310F6DB2E7AF52829E9F858691");
rsa->e = a;
rsa->n = local_18[0];
memset(&DECRYPTED_PAYLOAD_at_4149d0,0,0x20);
n = RSA_size(rsa);
digest_len = RSA_public_decrypt(n,
&PAYLOAD_00414af0,
&DECRYPTED_PAYLOAD_at_4149d0,
rsa,
RSA_NO_PADDING);
if (digest_len < 0x101) {
length_of_decrypted_payload = strlen(&DECRYPTED_PAYLOAD_at_4149d0);
n = -(length_of_decrypted_payload < 0x101 ^ 1);
}
else {
n = -1;
}
return n;
}

Bizarrely, telnetd_startup at no point compares the result of this “decryption” with anything. It seems to rest content so long as the decryption function doesn’t outright fail, or yield a buffer of more than 256 bytes in length – which I’m not quite sure is even possible in this context, barring an undetected bug.

The n-component of the public key is stored in the binary as a hexadecimal string, and can be easily retrieved with the strings tool. The e-component is the usual 0x10001.

$ strings -n 256 usr/bin/telnetd_startup       
E541A631680C453DF31591A6E29382BC5EAC969DCFDBBCEA64CB49CBE36578845C507BF5E7A6BCD724AFA7063CA754826E8D13DBA18A2359EB54B5BE3368158824EA316A495DDC3059C478B41ABF6B388451D38F3C6650CDB4590C1208B91F688D0393241898C1F05A6D500C7066298C6BA2EF310F6DB2E7AF52829E9F858691

An interesting question to ask, here, might be this: what’s the point of this initial exchange? An initial handshake is sent to the router, the router sends back a 16-byte message that uniquely identifies the model, and the router then expects the client to reply with a message encrypted with a particular key private key. Why the handshake ("ABCDEF1234")? Why the product-identifying hash? Why not begin the interaction with the signed or “privately encrypted” message? This protocol would make sense if the client, whoever that might be, is expected to be in possession of a database that associates each product-identifying hash it might receive with its own private RSA key. If this were to be the case, then we might be looking at a particular implementation of a general backdoor protocol.

2. A random secret is generated

A random secret consisting of exactly 31 printable ASCII characters is generated. That these characters are printable will turn out to be a helpful constraint.

Control-flow graph of the function that generates a random, 31-character secret.

3. The random secret is encrypted

The random secret is then encrypted using the hardcoded public RSA key, such that the only feasible way to decrypt it will be with the corresponding private key.

int rsa_encrypt_with_public_key(void)
{
RSA *rsa;
BIGNUM *a;
int iVar1;
BIGNUM *local_18 [3];
rsa = RSA_new();
local_18[0] = BN_new();
a = BN_new();
BN_set_word(a,0x10001);
BN_hex2bn(local_18, "E541A631680C453DF31591A6E29382BC5EAC969DCFDBBCEA64CB49CBE36578845C507BF5E7A6BCD724AFA70 63CA754826E8D13DBA18A2359EB54B5BE3368158824EA316A495DDC3059C478B41ABF6B388451D38F3C6650C DB4590C1208B91F688D0393241898C1F05A6D500C7066298C6BA2EF310F6DB2E7AF52829E9F858691");
rsa->e = a;
rsa->n = local_18[0];
memset(&ENCRYPTED_at_4149f0,0,0x80);
iVar1 = RSA_size(rsa);
iVar1 = RSA_public_encrypt(iVar1,
&RANDOMLY_GENERATED_PLAINTEXT_at_4149b0,
&ENCRYPTED_at_4149f0,
rsa,
3);
return iVar1 >> 0x1f;
}

4. The random, plaintext secret is XORed with the client’s message

This seems like a particularly strange move to me, a needless twist of complexity that, far from improving the security of the system, will afford a means for completely undoing it. The “decrypted” message received from the client in step 1 of state 2 — “decrypted”, remember, with the public key — is bitwise-xored with the random secret.

Control-flow graph of the function that calculates the bitwise-XOR of the random secret and the result of “decrypting” the client’s second message.
void xor_decrypted_payload_with_plaintext(void)
{
byte *pbVar1;
byte *pbVar2;
int i;
byte *pbVar3;

i = 0;
do {
pbVar1 = &DECRYPTED_PAYLOAD_at_4149d0 + i;
pbVar2 = &RANDOMLY_GENERATED_PLAINTEXT_at_4149b0 + i;
pbVar3 = &XORED_MSG_00414b80 + i;
i = i + 1;
*pbVar3 = *pbVar1 ^ *pbVar2;
} while (i != 0x20);
return;
}

5. The resulting string is used to construct ephemeral passwords

Here’s where things truly break down. The string produced by XORing the random plaintext secret with the client’s “decrypted” message is concatenated with two hardcoded salts: "+PERM" and "+TEMP". The resulting concatenations are then hashed with the same MD5 algorithm used earlier to produce the product identifier. The resulting 16-byte hashes are then set as the ephemeral passwords that, if correctly guessed, will allow the client to unlock the backdoor.

int set_telnet_enable_keys(void)
{
size_t xor_str_len;
char xor_str_perm [512];
char xor_str_temp [512];
uint md5 [22];

sprintf(xor_str_perm,"%s+PERM",&XORED_MSG_00414b80);
sprintf(xor_str_temp,"%s+TEMP",&XORED_MSG_00414b80);
memset(md5,0,0x58);
md5_init(md5);
xor_str_len = strlen(xor_str_perm);
md5_add(md5,xor_str_perm,xor_str_len);
md5_digest(md5,&TELNET_ENABLE_PERM_at_414c20);
md5_init(md5);
xor_str_len = strlen(xor_str_temp);
md5_add(md5,xor_str_temp,xor_str_len);
md5_digest(md5,&TELNET_ENABLE_TEMP_at_0x414c30);
return 0;
}

Can you see the problem here? Think it over. We’ll come back to this in a minute.

Verifying things in the GDB

Once I had a general idea of how all the pieces fit together, I wanted to test my understanding of things by pushing a static MIPS build of gdbserver to the router, and then step through the telnetd_startup state machine with gdb-multiarch and my favourite gdb extension library, gef.

As I understood it, it seemed that telnetd_startup was expecting me, the client, to decrypt its secret message using the private RSA key that corresponds to the public key coded into the binary. Since I did not, in fact, possess that key, and since OpenSSL’s RSA implementation seemed like a tough nut to crack, I figured that I could verify my conjectures by simply cheating. I learned that if I just use the debugger to grab the random plaintext secret from the buffer at address 0x004149b0, salt it with the suffix "+TEMP", MD5-hash it, and send back the result, then I am in fact able to drive the state machine to its final destination, where system("telnetd -l /bin/login.sh") is called and the backdoor is thrown wide open. So long as I chose, for my second message, a string that I knew would be “decrypted” into a buffer of null bytes by the hardcoded public RSA key — and this is rather easy to do — I knew that that method would produce the correct ephemeral password. This gave me a pretty good indication of what we need to do in order to open the backdoor without the assistance of a debugger, and without peeking at memory that, in a realistic scenario, an attacker would have no means of seeing.

Screenshot of a debugger session (gdb-multiarch + gef), a python REPL, and a telnet session that shows how by reading the random secret directly from memory we can calculate the ephemeral password needed to initialize a telnet session. The client’s second message, in this scenario, is chosen so that the hardcoded public RSA key “decrypts” it to a buffer of null bytes.

What this proves is that all we need to do in order to open the backdoor is to either discover the private RSA key, or else guess the 31-character secret string. The odds of guessing a random string at that length are abysmal, and so, armed with the public RSA key, I focussed, at first, on rummaging around the internet for some trace of that key (in various formats) in hopes that I might find the complete key pair just lying around. A long shot, sure, but worth checking. It did not, however, pay off.

At this point I still hadn’t quite noticed the critical loophole that I mentioned earlier. It came while I was patiently sketching out the protocol diagram, shown below.

The Backdoor Protocol

Here is a complete protocol diagram of the Phicomm backdoor, as apparently intended to be used:

Picking the Backdoor’s Lock

Remember how I said, regarding step 5 of state 2, that things break down in the construction of the two ephemeral passwords? The first thing to observe here is how the XORed strings are concatenated with the two salts:

sprintf(xor_str_perm,"%s+PERM",&XORED_MSG_00414b80);
sprintf(xor_str_temp,"%s+TEMP",&XORED_MSG_00414b80);

We can expand XORED_MSG_00414b80 to make its construction a bit clearer, like so:

sprintf(xor_str_temp, 
"%s+TEMP",
xor(SECRET_PLAINTEXT,
RSA_public_decrypt(HARDCODED_PUBLIC_KEY,
ENCRYPTED_XOR_MASK)));
temp_password = MD5(xor_str_temp);

And mutatis mutandis for +PERM. Now, the format specifier %sas used by sprintf is not meant to handle just any byte arrays whatsoever. It’s meant to handle strings — null-terminated strings, to be precise. The array of bytes at &XORED_MSG_00414b80 might, in the mind of the developer, be 31 bytes long, but in the eyes of sprintf() it ends where the first null byte occurs.

If the value of the first byte of that “string” is zero (i.e, '\x00', not the ASCII numeral '0'), then %s will format it as an empty string!

If &XORED_MSG_00414b80 is treated as an empty string, then xor_str_temp and xor_str_perm are just going to be "+TEMP" and "+PERM". The random component is completely dropped! Their MD5 hashes will be entirely predictable. When that happens, this code

memset(md5,0,0x58);  
md5_init(md5);
xor_str_len = strlen(xor_str_perm);
md5_add(md5,xor_str_perm,xor_str_len);
md5_digest(md5,&TELNET_ENABLE_PERM_at_414c20);
md5_init(md5);
xor_str_len = strlen(xor_str_temp);
md5_add(md5,xor_str_temp,xor_str_len);
md5_digest(md5,&TELNET_ENABLE_TEMP_at_0x414c30);

will produce precisely these two hashes:

In [53]: salt = b"+TEMP" ; MD5.MD5Hash(salt + b'\x00' * (0x58 - len(salt))).digest().hex()
Out[53]: 'f73fbf2e90e43136f07279c745f2f9f2'
In [54]: salt = b"+PERM" ; MD5.MD5Hash(salt + b'\x00' * (0x58 - len(salt))).digest().hex()
Out[54]: 'c423a902bacd28bafd095350d66e7455'

What this means is that all we have to do to produce a situation where we can predict the two ephemeral passwords is to make it likely that

XORED_MSG_00414b80[0] == DECRYPTED_PAYLOAD_at_4149d0[0] ^ RANDOMLY_GENERATED_PLAINTEXT_at_4149b0[0] == '\x00'

This turns out to be easy.

In the absence of padding (i.e., when the padding variable is set to RSA_NO_PADDING (=3)),RSA_public_decrypt() will “successfully” transform the vast majority of 128-byte buffers into non-null buffers. Just to get a ballpark idea of the odds, here’s what I found when I used the hardcoded public RSA key provided to “decrypt” 1000 random buffers, in the Python REPL:

In [23]: D = [pub_decrypt(os.urandom(0x80), padding=None) for i in range(1000)]      
In [24]: len([x for x in D if x and any(x)]) / len(D)                                                                                                                                                
Out[24]: 0.903

Over 90% came back non-null. If the padding variable were set to RSA_PKCS1_PADDING, by contrast, we’d be entirely out of luck. Control of the plaintext would be virtually impossible:

In [85]: D = [pub_decrypt(os.urandom(0x80), padding="pkcs1") for x in range(1000)]
In [86]: len([x for x in D if x and any(x)]) / len(D)
Out[86]: 0.0

What this means is that so long as the server uses a padding-free cipher, we don’t actually need the private key in order to have some control over what RSA_public_decrypt() does with the message we send back to telnetd_startup at the beginning of State 2.

So, what kind of control are we after here? Simple: we want the first byte of the “decrypted” buffer to be printable. Why? Because the one thing we know about the random plaintext secret is that it’s composed of printable bytes, that is, bytes that fall somewhere between 0x21 and 0x7e, inclusive.

In [25]: len([x for x in D if (0x21 <= x[0]) and (x[0] < 0x7f)]) / len(D)                                                                                                                      
Out[25]: 0.372

So that winds up being true of about 37% of random 128-byte buffers.

Here’s a bit of C-code that will whip up some phony ciphertext, meeting these fairly broad specifications.

unsigned char *find_phony_ciphertext(RSA *rsa) {
unsigned char *phony_ciphertext;
unsigned char phony_plaintext[1024];
int plaintext_length;
memset(phony_plaintext, 0, 0x20);
phony_ciphertext = calloc(PHONY_CIPHERTEXT_LENGTH, sizeof(char));
do {
    random_buffer(phony_ciphertext, PHONY_CIPHERTEXT_LENGTH);
phony_ciphertext[0] || (phony_ciphertext[0] |= 1);
    plaintext_length = decrypt_with_pubkey(rsa, 
phony_ciphertext, phony_plaintext);

if ((plaintext_length < 0x101) &&
(0x21 <= phony_plaintext[0]) &&
(phony_plaintext[0] < 0x7f)) {
printf("[!] Found stage 2 payload:\n");
hexdump(phony_ciphertext, PHONY_CIPHERTEXT_LENGTH);
printf("[=] Decrypts to (%d bytes):\n", plaintext_length);
hexdump(phony_plaintext, plaintext_length);
return phony_ciphertext;
}
} while (1);
}

Once we’ve generated such a buffer, we then have a 1 in 94 (0x7f — 0x21) chance of having a message whose “decryption”, via the hardcoded RSA key, begins with the same character as the random secret plaintext. Those are astronomically better odds than trying to guess a 31-character string (94−31) or a 16-byte hash (2−128).

If we guess right, then the ephemeral password to temporarily enable telnetd will become MD5("+TEMP"), and the ephemeral password to permanently enable it will become MD5("+PERM)".

And in this fashion we can gain an unauthenticated root shell on the Phicomm router after somewhere in the ballpark of one hundred guesses.

Protocol Diagram Showing How the Backdoor Lock can be Picked

Proof of concept

To bring these findings together, I wrote a small proof-of-concept program in C that will reliably pick the lock on the Phicomm router’s backdoor and grant the user a root shell over telnet. You can see it in action below.

A screencast showing our exploit in action, successfully picking the lock on the Phicomm K2G router’s backdoor.

Picking the Lock on the K3C’s Backdoor

An advertisement for the Phicomm K3C, which sports an essentially identical backdoor.

I was curious whether Phicomm’s flagship router, the K3C, might implement the same backdoor protocol, and, if so, whether it might be vulnerable to an identical attack. These devices are still available through Phicomm’s Amazon storefront, for less than $30. So I put in an order for the device, and while I waited, set about scouring a few Chinese forums for surviving copies of the K3C’s firmware image. I was in luck! I was able to obtain firmware images for the K3C, in each of the following versions:

  • 32.1.15.93
  • 32.1.22.113
  • 32.1.26.175
  • 32.1.45.267
  • 32.1.46.268
$ find . -path "*usr/bin/telnetd_startup" -exec bash -c 'echo -e "$(grep -o "fw_ver .*" $(dirname {})/../../etc/config/system)\n\tMD5 HASH OF BINARY: $(md5sum {})\n\tPRODUCT IDENTIFIER: $(strings {} | grep VER)\n\tPUBLIC RSA KEY(S): $(strings -n 256 {})\n"' {} \;
fw_ver '32.1.15.93'
MD5 HASH OF BINARY: f53a60b140009d91b51e4f24e483e893 ./_K3C_V32.1.15.93.bin.extracted/squashfs-root/usr/bin/telnetd_startup
PRODUCT IDENTIFIER:
PUBLIC RSA KEY(S): CC232B9BB06C49EA1BDD0DE1EF9926872B3B16694AC677C8C581E1B4F59128912CBB92EB363990FAE43569778B58FA170FB1EBF3D1E88B7F6BA3DC47E59CF5F3C3064F62E504A12C5240FB85BE727316C10EFF23CB2DCE973376D0CB6158C72F6529A9012786000D820443CA44F9F445ED4ED0344AC2B1F6CC124D9ED309A519
9FC8FFBF53AECF8461DEFB98D81486A5D2DEE341F377BA16FB1218FBAE23BB1F3766732F8D382E15543FC2980208D968E7AE1AC4B48F53719F6D9964E583A0B791150B9C0C354143AE285567D8C042240CA8D7A6446E49CCAF575ACC63C55BAC8CF5B6A77DEE0580E50C2BFEB62C06ACA49E0FD0831D1BB0CB72BC9B565313C9
fw_ver '32.1.22.113'
MD5 HASH OF BINARY: d23c3c27268e2d16c721f792f8226b1d ./_K3C_V32.1.22.113.bin.extracted/squashfs-root/usr/bin/telnetd_startup
PRODUCT IDENTIFIER:
PUBLIC RSA KEY(S): CC232B9BB06C49EA1BDD0DE1EF9926872B3B16694AC677C8C581E1B4F59128912CBB92EB363990FAE43569778B58FA170FB1EBF3D1E88B7F6BA3DC47E59CF5F3C3064F62E504A12C5240FB85BE727316C10EFF23CB2DCE973376D0CB6158C72F6529A9012786000D820443CA44F9F445ED4ED0344AC2B1F6CC124D9ED309A519
fw_ver '32.1.26.175'
MD5 HASH OF BINARY: d23c3c27268e2d16c721f792f8226b1d ./_K3C_V32.1.26.175.bin.extracted/squashfs-root/usr/bin/telnetd_startup
PRODUCT IDENTIFIER:
PUBLIC RSA KEY(S): CC232B9BB06C49EA1BDD0DE1EF9926872B3B16694AC677C8C581E1B4F59128912CBB92EB363990FAE43569778B58FA170FB1EBF3D1E88B7F6BA3DC47E59CF5F3C3064F62E504A12C5240FB85BE727316C10EFF23CB2DCE973376D0CB6158C72F6529A9012786000D820443CA44F9F445ED4ED0344AC2B1F6CC124D9ED309A519
fw_ver '32.1.45.267'
MD5 HASH OF BINARY: 283b65244c4eafe8252cb3b43780a847 ./_SW_K3C_703004761_V32.1.45.267.bin.extracted/squashfs-root/usr/bin/telnetd_startup
PRODUCT IDENTIFIER: K3C_INTELALL_VER_3.0
PUBLIC RSA KEY(S): E7FFD1A1BB9834966763D1175CFBF1BA2DF53A004B62977E5B985DFFD6D43785E5BCA088A6417BAF070BCE199B043C24B03BCEB970D7E47EEBA7F59D2BE4764DD8F06DB8E0E2945C912F52CB31C56C8349B689198C4A0D88FD029CCECDDFF9C1491FFB7893C11FAD69987DBA15FF11C7F1D570963FA3825B6AE92815388B3E03
fw_ver '32.1.46.268'
MD5 HASH OF BINARY: 283b65244c4eafe8252cb3b43780a847 ./_K3C_V32.1.46.268.bin.extracted/squashfs-root/usr/bin/telnetd_startup
PRODUCT IDENTIFIER: K3C_INTELALL_VER_3.0
PUBLIC RSA KEY(S): E7FFD1A1BB9834966763D1175CFBF1BA2DF53A004B62977E5B985DFFD6D43785E5BCA088A6417BAF070BCE199B043C24B03BCEB970D7E47EEBA7F59D2BE4764DD8F06DB8E0E2945C912F52CB31C56C8349B689198C4A0D88FD029CCECDDFF9C1491FFB7893C11FAD69987DBA15FF11C7F1D570963FA3825B6AE92815388B3E03

The older versions appeared to work differently, and in one of the writeups I dug up on Baidu, I found instructions for using a tool that sounded, at first, very much like mine in order to gain a root shell over telnet, so as to upgrade the firmware to the most recent version — something no longer facilitated by the official Phicomm firmware repository, which shut its doors when the company collapsed at the beginning of 2019.

A screenshot of Jack Cruise’s post (passed through Google Translate), showing how the RoutAckProV1B2.exe tool can be used to crack the backdoor implemented in an obsolescent version of the K3C firmware. This tool, unlike ours, cannot crack the backdoor protocol used on the most recent versions of Phicomm firmware for the K2G and K3C routers.

A quick look at RoutAckProV1B2.exe suggested that it did, indeed, interact with whatever runs on UDP port 21210 (0x52da in hexadecimal, da 52 in little-endian representation).

A hex dump of RoutAckProV1B2.exe, which hints that this tool, too, interacts with a service that listens on UDP port 21210 on the router.

I wondered if I’d been scooped, for a moment, and spun up a Windows VM on the isolated network to which Phicomm K2G was connected. I downloaded the RoutAckProV1B2 tool, and monitored it with procmon.exe and Wireshark as it tried in vain to open the backdoor on the K2G. This tool wasn’t sending the handshake token, "ABCDEF1234".

A screenshot of the RoutAckProV1B2.exe tool running in a Windows VM, while being inspected by the Windows process monitor.

Instead it was sending a single 128-byte payload, five times in succession, before finally giving up.

This is the “magic packet” that the RoutAckProV1B2.exe tool uses to unlock the backdoor installed an older versions of Phicomm router firmware.
A closeup of the RoutAckProV1B2.exe tool, courtesy of Jack Cruise. The website www.right.com.cn is a Chinese-language forum for sharing technical information on a variety of routers.
Here we see the RoutAckProV1B2.exe tool unsuccessfully attempting to open the backdoor on a virtual machine running the most recent firmware I could find for the Phicomm K3C.

Versions 32.1.45 of the firmware and up, however, shared an identical build of the telnetd_startup daemon, which appeared to differ from its counterpart on the K2G router only in having been compiled to a big-endian MIPS instruction set, rather than the little-endian architecture found in the K2G. Surprisingly, this binary hadn’t been stripped of symbols, which made life just a little bit easier.

The function that set the ephemeral passwords (see above) suffered from the same programming mistake as its K2G counterpart, and was almost certainly built from the same source code.

A decompilation of the function I referred to above as “set_telnet_enable_keys()”, here seen in K3C’s build of the telnetd_startup binary. Here it’s compiled to a big-endian rather than little-endian MIPS architecture, and, unlike the K2G binary, has not been stripped of debugging symbols, which makes reverse engineering the binary somewhat easier. The algorithm is, nevertheless, identical.

All I’d need to do, then, was recover the hardcoded public RSA key from the binary and I could easily adapt my tool to pick the lock on this backdoor as well. Running strings -n 256 on the binary was all that it took.

Using strings -n 256 to grab the hardcoded public RSA key from the telnetd_startup binary in the K3C firmware (version 32.1.46.268).

strings also helped extract the product identifier. Where the Phicomm K2G build contained K2_COSTDOWN__VER_3.0, the K3C build had K3C_INTELALL_VER_3.0:

I used strings to grab the hardcoded product identifier from that binary, too.

I added this information to the table in the backdoor-lockpick tool, which associated product identifying strings with public RSA keys.

Adding the product identifier and hardcoded public RSA key to a lookup table used by my “backdoor lockpick” tool, enabling it to pick the lock on the K3C backdoor as well as the K2G one.

With a week to wait before my K3C arrived, I decided I’d make do with the tools at my disposal and emulate the K3C build of telnetd_startup in user mode with QEMU (wrapped, for the sake of portability and convenience, in a Docker container, following this method @drablyechos describes in this 2020 IOT Village talk at DEFCON, though the Docker wrapper isn’t strictly necessary).

The telnetd_startup daemon fails its preliminary search for the telnet flag in flash storage, since there’s no flash storage device to check, but it recovers from this failure gracefully and goes on to listen on UDP port 21210, just as it would if the telnet flag had been set to the disabled position in the flash device (which is, after all, the default setting).

The lockpick has no more trouble with this backdoor than it did with the one on the K2G.

A screencast showing my backdoor lockpick in action, again, this time picking the lock on the K3C’s backdoor. The K3C firmware, in this case, is being run on a virtual machine. The hardware was still in the mail.

For the sake of thoroughness, I decided to test RoutAckProV1B2.exe’s attack against my virtualized K3C, running firmware version 32.1.46.268.

Relying on Google Translate to read on-screen Chinese sometimes presents a challenge.

Google translate doing its best to help me read the log messages on RoutAckProV1B2.exe’s GUI.

Not entirely sure of what was happening here, I decided I’d better check Wireshark again. RoutAckProV1B2 was repeatedly sending 128-byte packets to my virtualized K3C server (running firmware version 32.1.46.268) on UDP port 21210, but receiving no replies. At no point did a telnet port open.

When tested against the older firmware version 32.1.26.175, however, RoutAckProV1B2.exe worked like a charm.

This seems to establish beyond any doubt that the most recent firmware versions for Phicomm’s K2G and K3C routers are using a new backdoor protocol, designed with better security but implemented with a catastrophic loophole, which permits anyone on the LAN to gain a root shell on either device.

The Phicomm K3C with International Firmware Version 33.1.25.177

Still unsure whether I’d tested the most recent versions of the Phicomm K3C firmware, or whether I’d find the same backdoor in the devices they’d built for the international market, I was eager to get my hands on a brand new K3C device. It arrived just as I was wrapping up with my K3C emulations.

I set up the router and found that the firmware running on this device bore the version 33.1.25.177, a major version bump ahead of the latest Chinese market firmware I’d tested.

The web admin interface for the international release of the K3C, running firmware version 33.1.25.177.

There was something listening on UDP port 21210, but it didn’t, at first, appear to behave like the backdoor I’d found on the Chinese market firmware I’d studied. Rather than listening silently until it received the magic handshake, ABCDEF1234, it would respond to any packet with an unpredictable, high-entropy packet containing exactly 128 bytes. I suspected this might be something like the encrypted secret that the backdoor would send to its client in Stage 2 of the protocol discussed above.

The behaviour was reminiscent of the simpler backdoor that the tool RoutAckProV1B2.exe seemed designed for, but I wasn’t able to get anywhere with that particular tool.

I figured I could make better sense of things if I could just look at the binary of whatever it was that listened on UDP port 21210 on this device, so I set to work taking it apart, in search of a UART port by which I might obtain a root shell.

I was in luck! The device not only sports a UART, but a clearly-labelled UART at that!

A clearly labelled UART at that!

So I grabbed my handy-dandy UART-to-USB serial bridge…

My handy-dandy UART-to-USB bridge.

…and set about soldering some header pins to the UART port. These devices are somewhat delicate machines, so I first tried to get as far as I could without disassembling everything and removing it from the casing. A hot air gun was helpful here.

And there we go:

UART pins ready!

The molten plastic casing was still a bit awkward to work around, however, so I did eventually end up taking things apart, and removing the unneeded upper board, which housed the RF components. Everything still worked fine.

With the UART adapter connected, I was able to obtain a serial connection using minicom, at 115200 Baud 8N1. This gave me access to a U-Boot BIOS shell after interrupting the boot process, with direct read and write access to the 1Gb F-die NAND flash storage chip (a Samsung 734 K9F1G08U0F SCB0), on which both the firmware and the bootloader are stored.

The Samsung 734 K9F1G08U0F SCB0.

If we let the boot process run its course, we’re presented with a linux login prompt. We could try to guess the password here, or take the more difficult, principled approach of first dumping the NAND and searching it for clues. Let’s do things the hard way. I adapted Valerio’s TCL expect script to hexdump the entire NAND volume, and left it running overnight.

Valerio’s U-Boot flash dumping script, adapted to work on the K3C.

I deserialized the hex back to binary with a bit of Python, and then went at it with the usual tools. The most rewarding turned out to be strings :

Digging some password hashes out of the NAND volume.

Hashcat didn’t have any trouble with this, and gave me one of the root passwords in seconds:

Returning to the login prompt while hashcat warmed up my office, I logged in with username root, password admin, and presto!

The firmware conveniently had netcat installed, and our old friend telnetd_startup was sitting right there in /usr/bin. I piped it over to my workstation, and dropped it into Ghidra.

The protocol implemented by the version of telnetd_startup in the latest international market firmware for the K3C closely resembles what we see in the Chinese market K2G 22.6.3.20 and the K3C 32.1.46.268. It differs only in omitting the initial stage. Rather than waiting for the ABCDEF1234 handshake, and then responding with a device identifying hash, it expects the initial packet to contain a message encrypted with the private RSA key that matches its hardcoded public key. It “decrypts” this message with the public key, XORs it with a randomly generated 31-character secret, and then, fatally, concatenates it with either +TEMP or +PERM using sprintf(), before hashing the result with MD5, to produce the ephemeral passwords for temporarily and permanently activating the telnet service respectively.

This all looks very familiar.
A familiar-looking xor() function in the international firmware for the K3C.
And here’s where they make their fatal mistake.

This algorithm is vulnerable to the same attack that worked against the three-stage backdoor protocol implemented in the telnetd_startup versions we’ve already looked at. All we need to do is grab the hardcoded public key and tweak our lockpick tool so that it skips the handshake/identifier stage when communicating with this particular release.

That public key, by the way, is

CC232B9BB06C49EA1BDD0DE1EF9926872B3B16694AC677C8C581E1B4F59128912CBB92EB363990FAE43569778B58FA170FB1EBF3D1E88B7F6BA3DC47E59CF5F3C3064F62E504A12C5240FB85BE727316C10EFF23CB2DCE973376D0CB6158C72F6529A9012786000D820443CA44F9F445ED4ED0344AC2B1F6CC124D9ED309A519

Remember that one.

I made the necessary adjustments to the tool, and it worked, again, like a charm!

An Exposed Private RSA Key in the K2 Router, with Firmware Version 22.5.9.163, but One that You Don’t Even Need

I mentioned, before, that another solution to this puzzle would simply be to obtain the private RSA key that matched the hardcoded public key. In the case of the K2G (the one in Wavlink’s clothing) I made some effort to search for the public key online, after converting it to various ASCII formats, just in case the pair had been left lying around somewhere. It was a long shot and didn’t pan out. But while I was exploring one of the older firmware images for Phicomm’s K2 line of routers— 22.5.9.163, dating from 2017— I noticed something interesting:

Look familiar?

It’s using the same public key we saw in the brand new international release of the Phicomm K3C. But there’s more:

That shouldn’t be there!

In firmware version 22.5.9.163 for the K2 router, Phicomm exposed the private RSA key corresponding to the hardcoded public key that they continued to deploy in their international release long after correcting the error in their domestic market firmware versions. This error didn’t go unnoticed — this key pair shows up in a strings dump of RoutAckProV1B2.exe, which attacks an earlier, simpler backdoor protocol than either of the two protocols analysed here.

The method for constructing the ephemeral passwords in the K2 22.5.9.163 differs from what we’ve seen in these later firmware versions. Instead of generating a random secret and XORing it with public-key-decrypted data received from the client prior to concatenating it with the two magic salts, this earlier release simply concatenates the client’s decrypted secret with the salts. Everything is then hashed with MD5, just as it was before, and the two passwords are set.

The md5_command() function from the telnetd_startup binary in the K2G 22.5.9.163 firmware.

Curiously, this release contains what must be a typo: instead of +PERM we have +PERP.

Now, leaked d parameter notwithstanding, it’s possible to crack open this backdoor without even using the private key. All that needs to be done is:

  1. Generate some ${phony_ciphertext} that the known public key will “decrypt” into a non-null buffer (call this the ${phony_plaintext}). It simplifies things if you also constrain things so that the phony plaintext contains no null bytes. This can be found pretty quickly through brute trial and error.
  2. Take the MD5 hash of the string ${phony_plaintext}+TEMP. Let’s call that the ${temp_password}.
  3. Send ${phony_ciphertext} to UDP port 21210 on the router.
  4. And then, quickly afterwards, send ${temp_password} to the same port.

This will open the telnet service on the K2 22.5.9.163. For a telnet service that persists after rebooting, do the same as above but substitute PERP for TEMP (this misspelling seems to be peculiar to this particular version).

A Reconstructed History of Phicomm’s Backdoor Protocols

In the course of researching this vulnerability, I’ve looked closely at eleven different firmware images. Arranged in order of build date, they are:

So, to sum things up, the history of the Phicomm backdoor looks like this:

The oldest generation I’ve found of Phicomm’s telnetd_startup protocol (shaded blue, in the tables above) is relatively simple: the server waits to receive an encrypted message, which it decrypts and hashes with two different salts. It then waits for another message, and if that message matches either of those hashes, it will either spawn the telnet service or write a flag to the flash drive to trigger the spawning of telnet on boot. This is the protocol we see in the K2 22.5.9.163, released in early 2017. That particular build made the blunder of hardcoding the private key in the binary, which defeats the purpose of asymmetric encryption. This error enabled the creation of RoutAckProV1B2.exe, a router-hacking tool which has been circulating online for several years, which uses the pilfered private key to allow any interested party to gain root access to this iteration of the backdoor. Of course, as we just saw, use of the private key isn’t even necessary to open the door. What the design overlooks — and this oversight will never be truly corrected — is that it’s not only possible but easy to generate phony ciphertext that a public RSA key will “decrypt” into predictable, phony plaintext. Doing so will permit an attacker to subvert the locking mechanism on the backdoor, and gain unauthorized entry.

Phicomm responded to this situation in an entirely insufficient fashion in the next generation of the protocol (shaded yellow, above), which we find in the firmware versions released later in 2017, including the still-for-sale international release of the K3C (analysed above). They redacted the private key from the binary, but failed to change the public key. Their next design, moreover, appears to share the assumption that it’s only by encrypting data with the private key that an attacker can predict or control the output of its public key decryption. Rather than addressing either of these errors, they just piled on further complexity: this is when they began to generate a 31-character random secret and XOR it with the public-key-decrypted data received from the client in order to generate their ephemeral passwords. This makes the backdoor slightly harder to attack, if we continue to ignore the leaked private key, but it’s ultimately just a matter of discovering some phony ciphertext that decrypts to a plaintext that begins with a printable ASCII character. This gives us a 1 in 92 chance of colliding with the first byte of the random secret, which, due to the careless use of sprintf‘s %s specifier for bytearray concatenation, will result in a completely predictable empheral password.

The next generation (mauve in the tables above) is the last I looked at, and likely the last released. Phicomm finally removed the compromised public key, and took the additional precaution of deploying a distinct public key to each router model. They also added a device-identifying handshake phase to the protocol, which makes the backdoor considerably stealthier — there’s no real way to tell that it’s listening on UDP port 21210, unless you send it the magic token ABCDEF1234. It responds to this magic token with a device-identifying hash, permitting the client to select the private key that matches the public key compiled into the service. The algorithm itself, however, shares the same security flaws as its predecessor, and is vulnerable to an essentially identical attack. This is the iteration we see in the Chinese market release of K3C 32.1.46.268, and the Chinese market K2G A1 22.6.3.20 — the firmware image that ended up on certain Wavlink-branded routers, that Wavlink neglected to flash with firmware of their own.

I’d love to conduct a more exhaustive test of various Phicomm firmware images, but they’re becomming rather difficult to find online. If you know where I might find a copy of a firmware version not mentioned here, please reach out to us at bughunters at tenable dot com.

Will these Vulnerabilities Ever Be Patched?

No.

These vulnerabilities will never be patched. Certainly not through official channels.

The Phicomm corporation is dead and gone.

After various attempts to contact Phicomm’s customer support offices in China, Germany, and California, and even reaching out to the CEO directly, I received this reply on October 10 from whatever remained of Phicomm’s American office.

Dear Sir,
Thank you for contacting Phicomm Support in Germany. Phicomm has closed all Business worldwide since 01.01.2019.
Yours sincerely
Service Team Phicomm

I’m not sure whether or not the @PHICOMM account on telegram.com is managed by the company, but if it is, things didn’t look good on that end, either.

Poor guy.

So, what exactly happened to Phicomm?

In 2015, while at the height of their economic power — with a net operating income of close to 10 billion yuan (a little over 1.5 billion USD), earning them comparisons to Huawei in the press — Phicomm, under the leadership of CEO and founder Gu Guoping, entered into a highly questionable business arrangement with the p2p lending company, Lianbi Financial. Former Project Director for Phicomm, James Soh, has posted on LinkedIn about

the sudden appearance in June 2015 of a person-to-person (P2P) financial service company called LianBi Finance that started month-long on-site promotion on company grounds. They claimed that LianBi Finance is a partner firm and there is proper agreement in place for collaboration between Shanghai Phicomm and LianBi Finance but it was never publicized. They promote financial products that has unrealistic returns. Thereafter, the tie-up between Shanghai Phicomm and LianBi Finance went further where Shanghai Phicomm home Wifi kit costing 399 RMB and up, shall be refunded by LianBi Finance for the full amount if the buyer scanned the QR code on the Wifi product box and provided personal details. People will buy more and more sets, however discovered that they cannot get the full amount back from the second set of kit they bought, instead they are offered to purchase a certain amount of financial investment products of say 5,000 RMB, and returns of 12% per month will be credited back into the buyer. This is a pyramid scheme in disguise. In addition, Mr Gu tied staff promotion and bonus in Shanghai Phicomm to how much LianBi products each person buy.
Gu Guoping, in better days than these.

Peer to Peer (P2P) lending is a high-risk financial instrument that often offers investors — that is, lenders —astonishingly high rates of return, and which has been criticized for being a Ponzi scheme with extra steps. It would eventually become known that Gu “effectively also owned and controlled LianBi.” 2016 saw the beginnings of the Chinese government’s crackdown on P2P lending platforms, in a campaign that would reach its summit in 2018. LianBi Financial was filed that year, under suspicion of “illegally absorbing public deposits.” In 2021, the police raided LianBi’s offices and arrested Gu Guoping.

Police raiding the LianBi Financial headquarters.

A public hearing was held against Gu on February 4, that year, and on December 8, 2021,

Gu Guoping was sentenced to life imprisonment for the crime of fundraising fraud, deprived of political rights for life, and confiscated all personal property. Nong Jin, Chen Yu, Zhu Jun, Wang Jingjing, and Zhang Jimin were sentenced to fixed-term imprisonment ranging from 15 to 10 years for the crime of fund-raising fraud, as well as confiscation of personal property of RMB 5 million to 600,000.
Gu Guoping, together with a few of his associates, at a public hearing in the Shanghai №1 Intermediate People’s Court, on February 4, 2021. The yellow sign says “defendant”.

And this, in a nutshell, is why we can expect no patches from Phicomm for the vulnerabilities discussed in this post.

So, what about Wavlink?

This part of the story is still a little unclear, but it seems to me that what happened was this: sometime between May, 2018, when they released their last batch of routers, and January 2019, when they closed down business worldwide, Phicomm liquidated their remaining stock of routers, selling the surplus K2Gs to the Winstars corporation. Winstars then outfitted these devices with the branding of their subsidiary, Wavlink, and distributed them through Amazon, which is how a Phicomm router in Wavlink clothing eventually arrived on my desk.

After hitting a wall with Phicomm, I reached out to Wavlink to report these vulnerabilities I’d found on what was, in a sense, their hardware. I imagined that they’d be interested to hear that they had been shipping out devices with Phicomm’s firmware. They replied that they had “released related patches last year or the beginning of this year,” but gave no indication as to how the customer might be able to upgrade to those patches if they were among those whose Wavlink-branded routers were running Phicomm firmware.

If removing the backdoor is your chief concern, then it’s far from given that re-flashing your router with Wavlink firmware would put you on any firmer ground. Wavlink, in fact, has its own history of installing backdoors. And shoddy or not, at least Phicomm made an effort to lock their backdoors. If you’re interested in reading more about Wavlink’s own backdoors, I recommend you read James Clee’s excellent writeup.

What Should I Do With my Phicomm Router?

There no longer exists an official avenue to update the firmware on any Phicomm router. The company collapsed entirely well before we discovered these zero days.

An intrepid user can, however, at their own risk, leverage one or more of the vulnerabilities documented above to re-flash their router with an open-source firmware like OpenWRT, which now supports several Phicomm models. There’s considerable risk of bricking your device in the process, and it isn’t for the faint of heart, but it’s quite probably the surest way to rid your router of the vulnerabilities analysed here.

Other creative solutions, available to the adventurous, might include using the backdoor to modify the firmware by hand —by disabling the telnetd_startup daemon, say. The user might also attempt to simply restrict access to UDP port 21210 by means of a firewall rule.

Remote management should be disabled immediately, if nothing else.

Disclosure Timeline

  • Tuesday, October 5, 2021: Phicomm customer support contacted to report vulnerabilities
  • Sunday, October 10, 2021: Phicomm’s German office replies to inform us that Phicomm “has closed all business worldwide since 01.01.2019.”
  • Thursday, October 7, 2021: Wavlink notified that several of their “AC1200” routers have shipped with vulnerable Phicomm firmware
  • Friday, October 8, 2021: Wavlink responds to request further details
  • Friday, October 29, 2021: Wavlink provided with requested details
  • Monday, December 6, 2021: Reminder sent to Wavlink after receiving no response

A Backdoor Lockpick was originally published in Tenable TechBlog on Medium, where people are continuing the conversation by highlighting and responding to this story.

❌
❌