On the 28th of February, I sent a short summary to the lkrg-users mailing list (https://www.openwall.com/lists/lkrg-users/2020/02/28/1) regarding the recent Linux kernel XFRM use-after-free exploit dropped by Vitaly Nikolenko. I believe it is worth reading, so I’ve decided to reference it on my blog as well:
Hey,
Vitaly Nikolenko published an exploit for Linux kernel XFRM use-after-free. His tweet with more details can be found here:
I’ve tested his exploit under the latest version of LKRG (from the repo) and it correctly detects and kills it:
[Fri Feb 28 10:04:24 2020] [p_lkrg] Loading LKRG…
[Fri Feb 28 10:04:24 2020] Freezing user space processes … (elapsed 0.008 seconds) done.
[Fri Feb 28 10:04:24 2020] OOM killer disabled.
[Fri Feb 28 10:04:24 2020] [p_lkrg] Verifying 21 potential UMH paths for whitelisting…
[Fri Feb 28 10:04:24 2020] [p_lkrg] 6 UMH paths were whitelisted…
[Fri Feb 28 10:04:25 2020] [p_lkrg] [kretprobe] register_kretprobe() for failed! [err=-22]
[Fri Feb 28 10:04:25 2020] [p_lkrg] ERROR: Can't hook ovl_create_or_link function :(
[Fri Feb 28 10:04:25 2020] [p_lkrg] LKRG initialized successfully!
[Fri Feb 28 10:04:25 2020] OOM killer enabled.
[Fri Feb 28 10:04:25 2020] Restarting tasks … done.
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] New modification: type[JUMP_LABEL_JMP]!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] Updating kernel core .text section hash!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] New modification: type[JUMP_LABEL_JMP]!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] Updating kernel core .text section hash!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] New modification: type[JUMP_LABEL_JMP]!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] Updating kernel core .text section hash!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] New modification: type[JUMP_LABEL_JMP]!
[Fri Feb 28 10:04:42 2020] [p_lkrg] [JUMP_LABEL] Updating kernel core .text section hash!
[Fri Feb 28 10:06:49 2020] [p_lkrg] process[67342 | lucky0] has different user_namespace!
[Fri Feb 28 10:06:49 2020] [p_lkrg] process[67342 | lucky0] has different user_namespace!
[Fri Feb 28 10:06:49 2020] [p_lkrg] Trying to kill process[lucky0 | 67342]!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] Trying to kill process[lucky0 | 81090]!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] Trying to kill process[lucky0 | 81090]!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] Trying to kill process[lucky0 | 81090]!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] process[81090 | lucky0] has different user_namespace!
[Fri Feb 28 10:08:32 2020] [p_lkrg] Trying to kill process[lucky0 | 81090]!
The latest LKRG detects the user_namespace corruption, which in a way proves that our namespace-escape detection logic works. When I ran the same test with the LKRG code base reverted to the commit just before namespace corruption detection was introduced, LKRG still detected the exploit via the standard method:
[Fri Feb 28 10:34:28 2020] [p_lkrg] process[17599 | lucky0] has different SUID! 1000 vs 0
[Fri Feb 28 10:34:28 2020] [p_lkrg] process[17599 | lucky0] has different GID! 1000 vs 0
[Fri Feb 28 10:34:28 2020] [p_lkrg] process[17599 | lucky0] has different SUID! 1000 vs 0
[Fri Feb 28 10:34:28 2020] [p_lkrg] process[17599 | lucky0] has different GID! 1000 vs 0
[Fri Feb 28 10:34:28 2020] [p_lkrg] Trying to kill process[lucky0 | 17599]!
…
[Fri Feb 28 10:35:02 2020] [p_lkrg] process[22293 | lucky0] has different SUID! 1000 vs 0
[Fri Feb 28 10:35:02 2020] [p_lkrg] process[22293 | lucky0] has different GID! 1000 vs 0
[Fri Feb 28 10:35:02 2020] [p_lkrg] process[22293 | lucky0] has different SUID! 1000 vs 0
[Fri Feb 28 10:35:02 2020] [p_lkrg] process[22293 | lucky0] has different GID! 1000 vs 0
[Fri Feb 28 10:35:02 2020] [p_lkrg] Trying to kill process[lucky0 | 22293]!
This is an interesting case. Vitaly published only a compiled binary of his exploit (not the source code). This means that adapting his exploit to play a cat-and-mouse game with LKRG is not an easy task. It is possible to reverse-engineer and modify the exploit binary; however, that is more work.
I’ve recently spent some time looking at the ‘exec_id’ counter. Historically, the Linux kernel has had 2 independent security problems related to that code: CVE-2009-1337 and CVE-2012-0056.
Until 2012, the ‘self_exec_id’ field (among others) was used to enforce permission checks for the /proc/pid/{mem/maps/…} interface. However, it was done poorly, and a serious security problem was reported, known as “Mempodipper” (CVE-2012-0056). Since that patch, ‘self_exec_id’ is no longer tracked there; instead, the kernel looks at the process’ VM at the time of the open().
In 2009, Oleg Nesterov discovered that the Linux kernel had incorrect logic for resetting ->exit_signal. As a result, a malicious user could bypass it by exec’ing a setuid application before exiting (->exit_signal wouldn’t be reset to SIGCHLD). CVE-2009-1337 was assigned to track this issue.
The logic responsible for handling ->exit_signal has been changed a few times, and the current logic has been locked down since Linux kernel 3.3.5. However, it is not fully robust, and it is still possible for a malicious user to bypass it. Essentially, it’s possible to send arbitrary signals to a privileged (suidroot) parent process.
CVE MITRE described the problem pretty accurately:
A signal access-control issue was discovered in the Linux kernel before 5.6.5, aka CID-7395ea4e65c2. Because exec_id in include/linux/sched.h is only 32 bits, an integer overflow can interfere with a do_notify_parent protection mechanism. A child process can send an arbitrary signal to a parent process in a different security domain. Exploitation limitations include the amount of elapsed time before an integer overflow occurs, and the lack of scenarios where signals to a parent process present a substantial operational threat.
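The wraparound at the heart of that description can be illustrated with a tiny simulation (a hypothetical Python model, not kernel code; `bump` and the snapshot variables are my names, standing in for the per-exec counter increment and the parent/child exec_id bookkeeping):

```python
# Hypothetical sketch of how a 32-bit exec_id can wrap around. The exit-signal
# protection compares the child's recorded parent_exec_id snapshot against the
# parent's current self_exec_id; a wrap makes a stale snapshot match again.

MASK32 = 0xFFFFFFFF

def bump(exec_id, times):
    """Simulate 'times' execve() calls on a 32-bit counter."""
    return (exec_id + times) & MASK32

parent_exec_id_snapshot = 5      # recorded when the child was forked
parent_self_exec_id = 5

# Parent execs a privileged binary once: IDs diverge, the protection works.
parent_self_exec_id = bump(parent_self_exec_id, 1)
assert parent_self_exec_id != parent_exec_id_snapshot

# After exactly 2**32 - 1 more execs the 32-bit counter wraps around, the
# stale snapshot matches again, and the check is defeated.
parent_self_exec_id = bump(parent_self_exec_id, 2**32 - 1)
print(parent_self_exec_id == parent_exec_id_snapshot)  # True
```

The practical limitation mentioned by MITRE follows directly from this model: the attacker needs on the order of 2^32 exec calls before the counter wraps.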
What is interesting, the story of insufficient restriction of exit signals might not be over yet:
How did this pass review and get backported to stable kernels? https://t.co/WhBrqUZhrw (Hint: case of right hand not knowing what the left is doing, involving a recent security fix)
I would like to draw attention to the following Openwall tweet:
Juho Junnila's Master's Thesis "Effectiveness of Linux Rootkit Detection Tools" shows our LKRG as by far the most effective kernel rootkit detector (of those tested), even though that wasn't our primary focus: https://t.co/pz0r502dK6 h/t @Adam_pi3
We’ve just announced a new version of LKRG – 0.8! It includes an enormous amount of changes – in fact, so many that we’re not trying to document all of them this time (although they can be seen in the git commits), but rather focus on high-level aspects. I encourage you to read the full announcement here:
By the way, among other things, we have added support for Raspberry Pi 3 & 4, better scalability, performance, and tradeoffs, the notion of profiles, new documentation, @Phoronix benchmarks, and more.
The short story of 1 Linux Kernel Use-After-Free bug and 2 CVEs (CVE-2020-14356 and CVE-2020-25220)
Name: Linux kernel Cgroup BPF Use-After-Free Author: Adam Zabrocki ([email protected]) Date: May 27, 2020
First things first – short history:
In 2019, Tejun Heo discovered a race condition affecting the lifetime of cgroup_bpf which could result in a double-free and other memory corruptions. This bug was fixed in kernel 5.3. More information about the problem and the patch can be found here:
Roman Gushchin then discovered another problem with the newly fixed code which could lead to a use-after-free vulnerability. His report and fix can be found here:
During the discussion on the fix, Alexei Starovoitov pointed out that walking through the cgroup hierarchy without holding cgroup_mutex might be dangerous:
Unfortunately, there is another Use-After-Free bug related to the Cgroup BPF release logic.
The “new” bug – details (a lot of details ;-)):
During LKRG development and testing, one of my VMs was generating a kernel crash during the shutdown procedure. This specific machine had the newest kernel at the time (5.7.x), and I had compiled it with full debug information as well as the SLAB debugging feature. When I analyzed the crash, it had nothing to do with LKRG. Later, I confirmed that kernels without LKRG always hit that issue as well:
It points to the freed object. However, the kernel still keeps eBPF rules attached to the socket under cgroups. When a process (sshd) dies (a do_exit() call) and cleanup is executed, all of its sockets are closed. If such a socket has “pending” packets, the following code path is executed:
However, there is nothing wrong with this logic and path. The real problem is that the cgroup disappeared while it still had active clients. How is that even possible? Just before the crash, I can see the following entry in the kernel logs:
I tested the same scenario a few times and got the following results:
percpu ref (cgroup_bpf_release_fn) <= 0 (-70581) after switching to atomic
percpu ref (cgroup_bpf_release_fn) <= 0 (-18829) after switching to atomic
percpu ref (cgroup_bpf_release_fn) <= 0 (-29849) after switching to atomic
Let’s look at this function:
/**
 * cgroup_bpf_release_fn() - callback used to schedule releasing
 *	of bpf cgroup data
 * @ref: percpu ref counter structure
 */
static void cgroup_bpf_release_fn(struct percpu_ref *ref)
{
	struct cgroup *cgrp = container_of(ref, struct cgroup, bpf.refcnt);

	INIT_WORK(&cgrp->bpf.release_work, cgroup_bpf_release);
	queue_work(system_wq, &cgrp->bpf.release_work);
}
So that’s the callback used to release bpf cgroup data. It sounds like it is being called while there could still be active sockets attached to such a cgroup:
/**
 * cgroup_bpf_release() - put references of all bpf programs and
 *	release all cgroup bpf data
 * @work: work structure embedded into the cgroup to modify
 */
static void cgroup_bpf_release(struct work_struct *work)
{
	struct cgroup *p, *cgrp = container_of(work, struct cgroup,
					       bpf.release_work);
	struct bpf_prog_array *old_array;
	unsigned int type;

	mutex_lock(&cgroup_mutex);

	for (type = 0; type < ARRAY_SIZE(cgrp->bpf.progs); type++) {
		struct list_head *progs = &cgrp->bpf.progs[type];
		struct bpf_prog_list *pl, *tmp;

		list_for_each_entry_safe(pl, tmp, progs, node) {
			list_del(&pl->node);
			if (pl->prog)
				bpf_prog_put(pl->prog);
			if (pl->link)
				bpf_cgroup_link_auto_detach(pl->link);
			bpf_cgroup_storages_unlink(pl->storage);
			bpf_cgroup_storages_free(pl->storage);
			kfree(pl);
			static_branch_dec(&cgroup_bpf_enabled_key);
		}
		old_array = rcu_dereference_protected(
				cgrp->bpf.effective[type],
				lockdep_is_held(&cgroup_mutex));
		bpf_prog_array_free(old_array);
	}

	mutex_unlock(&cgroup_mutex);

	for (p = cgroup_parent(cgrp); p; p = cgroup_parent(p))
		cgroup_bpf_put(p);

	percpu_ref_exit(&cgrp->bpf.refcnt);

	cgroup_put(cgrp);
}
So if a cgroup dies, all the potential clients are auto-detached. However, they might not be aware of that situation. When is cgroup_bpf_release_fn() executed?
/**
 * cgroup_bpf_inherit() - inherit effective programs from parent
 * @cgrp: the cgroup to modify
 */
int cgroup_bpf_inherit(struct cgroup *cgrp)
{
...
	ret = percpu_ref_init(&cgrp->bpf.refcnt, cgroup_bpf_release_fn, 0,
			      GFP_KERNEL);
...
}
The release callback is automatically executed when cgrp->bpf.refcnt drops to zero. However, in the warning logs printed before the kernel crashed, we saw that this reference counter went below zero – the cgroup had already been freed.
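To make the counter semantics concrete, here is a toy model (plain Python; it deliberately ignores the real per-CPU implementation details of percpu_ref and only models get/put and the release callback):

```python
# Toy model (NOT the kernel implementation) of a percpu_ref-style counter:
# the release callback fires when the count reaches zero, and any further,
# unbalanced put() drives the count negative, as in the warnings above.

class ToyPercpuRef:
    def __init__(self, release_fn):
        self.count = 1            # initial reference held by the owner
        self.release_fn = release_fn
        self.released = False

    def get(self):
        self.count += 1

    def put(self):
        self.count -= 1
        if self.count == 0 and not self.released:
            self.released = True
            self.release_fn()     # object teardown is scheduled here

released = []
ref = ToyPercpuRef(lambda: released.append("cgroup_bpf_release_fn"))
ref.get()   # a client takes a reference
ref.put()   # the client releases it
ref.put()   # the owner drops the last reference -> release callback fires
print(released)        # ['cgroup_bpf_release_fn']

# More puts than gets push the counter below zero (compare the
# "percpu ref ... <= 0" warnings in the logs above):
ref.put()
print(ref.count)       # -1
```

In the crash logs the counter was tens of thousands below zero, i.e. massively more puts than gets had been executed against an already-released object.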
Originally, I thought that the problem might be related to the code walking through the cgroup hierarchy without holding cgroup_mutex, which was pointed out by Alexei. I’ve prepared the patch and recompiled the kernel:
Interestingly, without this patch I was able to generate this kernel crash every time I rebooted the machine (100% repro). After applying the patch, the crash ratio dropped to around 30%. However, I was still able to hit the same code path and generate a kernel dump. The patch indeed helps, but it does not seem to be the root cause, since I can still hit the crash (just much less often).
I stepped back and looked again at where the bug is. The corrupted pointer (struct cgroup *) comes from this line:
cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data);
This code is related to CONFIG_SOCK_CGROUP_DATA. The Linux source has an interesting comment about it in the “cgroup-defs.h” file:
/*
* sock_cgroup_data is embedded at sock->sk_cgrp_data and contains
* per-socket cgroup information except for memcg association.
*
* On legacy hierarchies, net_prio and net_cls controllers directly set
* attributes on each sock which can then be tested by the network layer.
* On the default hierarchy, each sock is associated with the cgroup it was
* created in and the networking layer can match the cgroup directly.
*
* To avoid carrying all three cgroup related fields separately in sock,
* sock_cgroup_data overloads (prioidx, classid) and the cgroup pointer.
* On boot, sock_cgroup_data records the cgroup that the sock was created
* in so that cgroup2 matches can be made; however, once either net_prio or
* net_cls starts being used, the area is overriden to carry prioidx and/or
* classid. The two modes are distinguished by whether the lowest bit is
* set. Clear bit indicates cgroup pointer while set bit prioidx and
* classid.
*
* While userland may start using net_prio or net_cls at any time, once
* either is used, cgroup2 matching no longer works. There is no reason to
* mix the two and this is in line with how legacy and v2 compatibility is
* handled. On mode switch, cgroup references which are already being
* pointed to by socks may be leaked. While this can be remedied by adding
* synchronization around sock_cgroup_data, given that the number of leaked
* cgroups is bound and highly unlikely to be high, this seems to be the
* better trade-off.
*/
and later:
/*
* There's a theoretical window where the following accessors race with
* updaters and return part of the previous pointer as the prioidx or
* classid. Such races are short-lived and the result isn't critical.
*/
This means that sock_cgroup_data “carries” the information about whether net_prio or net_cls has started being used; in that case, sock_cgroup_data overloads the (prioidx, classid) pair and the cgroup pointer. From our crash we can extract this information:
The described socket keeps the “sk_cgrp_data” pointer with the information that it is “attached” to cgroup2. However, that cgroup2 has been destroyed. Now we have all the information needed to solve the mystery of this bug:
Process creates a socket and both of them are inside some cgroup v2 (non-root)
cgroup BPF is cgroup2 only
At some point net_prio or net_cls is being used:
this operation is disabling cgroup2 socket matching
now, all related sockets should be converted to use net_prio, and sk_cgrp_data should be updated
The socket is cloned, but not the reference to the cgroup (ref: point 1)
this essentially moves the socket to the new cgroup
All tasks in the old cgroup (ref: point 1) must die and when this happens, this cgroup dies as well
When the original process starts to “use” the socket, it may attempt to access a cgroup which is already “dead”. This essentially generates a use-after-free condition
in my specific case, process was killed or invoked exit()
during the execution of do_exit() function, all file descriptors and all sockets are being closed
one of the sockets still points to the previously destroyed cgroup2 BPF (OpenSSH might install BPF)
__cgroup_bpf_run_filter_skb runs attached BPF and we have Use-After-Free
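The whole sequence above can be replayed with a toy refcount model (hypothetical Python, not kernel code; the point is that the clone path takes no reference, so the cgroup’s count reaches zero while a raw pointer to it survives):

```python
# Hypothetical replay of the scenario above: sk_clone_lock() copies the raw
# cgroup pointer without taking a reference, so the cgroup is freed while
# the cloned socket still points to it -> use-after-free.

class ToyCgroup:
    def __init__(self):
        self.refcnt = 1          # reference held by the cgroup hierarchy
        self.freed = False

    def get(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.freed = True    # models kfree() of the cgroup

cgrp = ToyCgroup()

sock = {"cgroup": cgrp}; cgrp.get()  # socket created inside the cgroup
# ... net_prio/net_cls activated: cgroup2 socket matching gets disabled ...
cloned = {"cgroup": cgrp}            # BUG: clone copies the pointer, no get()
sock["cgroup"].put()                 # original socket is closed
cgrp.put()                           # last task in the cgroup dies

print(cgrp.freed)                    # True: the cgroup is gone...
print(cloned["cgroup"] is cgrp)      # ...but the clone still points to it
```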
To confirm that scenario, I’ve modified some of the Linux kernel sources:
Function cgroup_sk_alloc_disable():
I’ve added dump_stack();
Function cgroup_bpf_release():
I’ve moved the mutex to guard the code responsible for walking through the cgroup hierarchy
I’ve managed to reproduce this bug again and this is what I can see in the logs:
...
[ 72.061197] kmem.limit_in_bytes is deprecated and will be removed. Please report your usecase to [email protected] if you depend on this functionality.
[ 72.121572] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[ 72.121574] CPU: 0 PID: 6958 Comm: kubelet Kdump: loaded Not tainted 5.7.0-g6 #32
[ 72.121574] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 72.121575] Call Trace:
[ 72.121580] dump_stack+0x50/0x70
[ 72.121582] cgroup_sk_alloc_disable.cold+0x11/0x25
^^^^^^^^^^^^^^^^^^^^^^^
[ 72.121584] net_prio_attach+0x22/0xa0
^^^^^^^^^^^^^^^
[ 72.121586] cgroup_migrate_execute+0x371/0x430
[ 72.121587] cgroup_attach_task+0x132/0x1f0
[ 72.121588] __cgroup1_procs_write.constprop.0+0xff/0x140
^^^^^^^^^^^^^^^^^^^^^^
[ 72.121590] kernfs_fop_write+0xc9/0x1a0
[ 72.121592] vfs_write+0xb1/0x1a0
[ 72.121593] ksys_write+0x5a/0xd0
[ 72.121595] do_syscall_64+0x47/0x190
[ 72.121596] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 72.121598] RIP: 0033:0x48abdb
[ 72.121599] Code: ff e9 69 ff ff ff cc cc cc cc cc cc cc cc cc e8 7b 68 fb ff 48 8b 7c 24 10 48 8b 74 24 18 48 8b 54 24 20 48 8b 44 24 08 0f 05 <48> 3d 01 f0 ff ff 76 20 48 c7 44 24 28 ff ff ff ff 48 c7 44 24 30
[ 72.121600] RSP: 002b:000000c00110f778 EFLAGS: 00000212 ORIG_RAX: 0000000000000001
[ 72.121601] RAX: ffffffffffffffda RBX: 000000c000060000 RCX: 000000000048abdb
[ 72.121601] RDX: 0000000000000004 RSI: 000000c00110f930 RDI: 000000000000001e
[ 72.121601] RBP: 000000c00110f7c8 R08: 000000c00110f901 R09: 0000000000000004
[ 72.121602] R10: 000000c0011a39a0 R11: 0000000000000212 R12: 000000000000019b
[ 72.121602] R13: 000000000000019a R14: 0000000000000200 R15: 0000000000000000
As we can see, net_prio is being activated and cgroup2 socket matching is being disabled. Next:
I decided to report this issue to the Linux kernel security mailing list in mid-July 2020. Roman Gushchin replied to my report and suggested that I verify whether I could still reproduce the issue with commit ad0f75e5f57c (“cgroup: fix cgroup_sk_alloc() for sk_clone_lock()”) applied. This commit had been merged into the Linux kernel git source tree just a few days before my report. I carefully verified it, and indeed it fixed the problem. However, commit ad0f75e5f57c is not fully complete, and a follow-up fix, 14b032b8f8fc (“cgroup: Fix sock_cgroup_data on big-endian.”), should be applied as well.
After this conversation, Greg KH decided to backport Roman’s patches to the LTS kernels. In the meantime, I decided to apply (through Red Hat) for a CVE number to track this issue:
CVE-2020-14356 was allocated to track this issue
For unknown reasons, this bug was classified as a NULL pointer dereference
Red Hat correctly acknowledged this issue as a use-after-free, and in their own description and Bugzilla entry they specify:
However, in CVE MITRE portal we can see a very inaccurate description:
“A flaw null pointer dereference in the Linux kernel cgroupv2 subsystem in versions before 5.7.10 was found in the way when reboot the system. A local user could use this flaw to crash the system or escalate their privileges on the system.” https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-14356
People have started to hit this Use-After-Free bug in the form of NULL pointer dereference “kernel panic”.
Additionally, the entire description of the bug is wrong. I’ve raised that concern to the CVE MITRE but the invalid description is still there. There is also a small Twitter discussion about that here: https://twitter.com/Adam_pi3/status/1296212546043740160
Second CVE – CVE-2020-25220:
During analysis of this bug, I contacted Brad Spengler. When the patch for this issue was backported to LTS kernels, Brad noticed that it conflicted with his pre-existing backport, and that the upstream backport looked incorrect. I was surprised since I had reviewed the original commit for mainline kernel (5.7) and it was fine. Having this in mind, I decided to carefully review the backported patch: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-4.14.y&id=82fd2138a5ffd7e0d4320cdb669e115ee976a26e
and it really looks incorrect. Part of the original fix is the following code:
+void cgroup_sk_clone(struct sock_cgroup_data *skcd)
+{
+	if (skcd->val) {
+		if (skcd->no_refcnt)
+			return;
+		/*
+		 * We might be cloning a socket which is left in an empty
+		 * cgroup and the cgroup might have already been rmdir'd.
+		 * Don't use cgroup_get_live().
+		 */
+		cgroup_get(sock_cgroup_ptr(skcd));
+		cgroup_bpf_get(sock_cgroup_ptr(skcd));
+	}
+}
However, backported patch has the following logic:
+void cgroup_sk_clone(struct sock_cgroup_data *skcd)
+{
+	/* Socket clone path */
+	if (skcd->val) {
+		/*
+		 * We might be cloning a socket which is left in an empty
+		 * cgroup and the cgroup might have already been rmdir'd.
+		 * Don't use cgroup_get_live().
+		 */
+		cgroup_get(sock_cgroup_ptr(skcd));
+	}
+}
There is a missing check:
+ if (skcd->no_refcnt)
+ return;
which could result in a reference counter bug and, in the end, a use-after-free again. It looks like the backported patch for the stable kernels was still buggy.
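A generic sketch of why the missing check matters (illustrative Python; `patched` and `no_refcnt` are just flags modeling the two code paths, not real kernel APIs – the exact failure mode in the kernel depends on how the free path pairs with the clone path):

```python
# Generic sketch (not kernel code): if the clone path ignores the no_refcnt
# flag while the free path honors it, get() and put() calls no longer pair
# up and the reference counter drifts on every clone/free cycle.

def sk_clone(counter, no_refcnt, patched):
    if patched and no_refcnt:
        return counter          # complete fix: skip the get()
    return counter + 1          # incomplete backport: unconditional get()

def sk_free(counter, no_refcnt):
    if no_refcnt:
        return counter          # the free path always honors the flag
    return counter - 1

# Balanced lifecycle with the complete fix:
c = sk_clone(1, no_refcnt=True, patched=True)
c = sk_free(c, no_refcnt=True)
print(c)   # 1 - counter unchanged, as expected

# With the incomplete backport the counter drifts once per clone/free pair:
c = sk_clone(1, no_refcnt=True, patched=False)
c = sk_free(c, no_refcnt=True)
print(c)   # 2 - one unbalanced reference per cycle
```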
I contacted Red Hat again, and they started to provide correct patches for their own kernels. However, the LTS kernels were still buggy. I also asked for a separate CVE to be assigned for that issue, but Red Hat suggested that I do it myself.
After that, I went on vacation and forgot about this issue. Recently, I decided to apply for a CVE to track the “bad patch” issue, and CVE-2020-25220 was allocated. It is worth pointing out that at some point someone from Huawei realized that the patch was wrong, and LTS got a correct fix as well:
During the last Patch Tuesday (13th of October 2020), Microsoft fixed a very interesting (and sexy) vulnerability: CVE-2020-16898 – Windows TCP/IP Remote Code Execution Vulnerability (link). Microsoft’s description of the vulnerability:
“A remote code execution vulnerability exists when the Windows TCP/IP stack improperly handles ICMPv6 Router Advertisement packets. An attacker who successfully exploited this vulnerability could gain the ability to execute code on the target server or client. To exploit this vulnerability, an attacker would have to send specially crafted ICMPv6 Router Advertisement packets to a remote Windows computer. The update addresses the vulnerability by correcting how the Windows TCP/IP stack handles ICMPv6 Router Advertisement packets.”
This vulnerability is so important that I’ve decided to write a Proof-of-Concept for it. At the time of my work, there weren’t any public exploits for it. I’ve spent a significant amount of time analyzing all the caveats necessary for triggering the bug. Even now, the publicly available information doesn’t provide sufficient details for triggering it. That’s why I’ve decided to summarize my experience. First, a short summary:
This bug can ONLY be exploited when the source address is a link-local IPv6 address. This requirement limits the potential targets!
The entire payload must be a valid IPv6 packet. If you screw up the headers too much, your packet will be rejected before triggering the bug.
During packet-size validation, every “Length” defined in the optional headers must match the packet size.
This vulnerability allows you to smuggle an extra “header”. This header is not validated and includes a “Length” field. After the bug is triggered, this field will nevertheless be checked against the packet size.
The Windows NDIS API which can trigger the bug has a very annoying optimization (from the exploitation perspective). To bypass it, you need to use fragmentation! Otherwise, you can trigger the bug, but it won’t result in memory corruption!
Collecting information about the vulnerability
At first, I wanted to learn more about the bug. The only extra information I could find was the write-ups describing detection logic. It is quite a funny twist of fate that information on how to protect against the attack turned out to be helpful in exploitation. Write-ups:
“While we ignore all Options that aren’t RDNSS, for Option Type = 25 (RDNSS), we check to see if the Length (second byte in the Option) is an even number. If it is, we flag it. If not, we continue. Since the Length is counted in increments of 8 bytes, we multiply the Length by 8 and jump ahead that many bytes to get to the start of the next Option (subtracting 1 to account for the length byte we’ve already consumed).”
OK, what have we learned from it? Quite a lot:
We need to send an RDNSS packet
The problem is an even number in the Length field
The function responsible for parsing the packet will reference the last 8 bytes of the RDNSS payload as the next header
That’s more than enough to start poking around. First, we need to generate a valid RDNSS packet.
RDNSS
The Recursive DNS Server Option (RDNSS) is one of the sub-options of the Router Advertisement (RA) message. RA messages are sent via ICMPv6. Let’s look at the documentation for RDNSS (https://tools.ietf.org/html/rfc5006):
5.1. Recursive DNS Server Option

The RDNSS option contains one or more IPv6 addresses of recursive DNS servers. All of the addresses share the same lifetime value. If it is desirable to have different lifetime values, multiple RDNSS options can be used. Figure 1 shows the format of the RDNSS option.
Length 8-bit unsigned integer. The length of the option
(including the Type and Length fields) is in units of
8 octets. The minimum value is 3 if one IPv6 address
is contained in the option. Every additional RDNSS
address increases the length by 2. The Length field
is used by the receiver to determine the number of
IPv6 addresses in the option.
This essentially means that Length must always be an odd number as long as there is any payload. OK, let’s create an RDNSS packet. How? I’m using scapy, since it’s the easiest and fastest way to create any packet we want. It is very simple:
Hm… OK. We never hit Ipv6pUpdateRDNSS, but we did hit Ipv6pHandleRouterAdvertisement. This means that our packet is fine. Why the hell did we not end up in Ipv6pUpdateRDNSS?
These values match the IPv6 address which I used as the source address:
v6_src = "fd12:db80:b052:0:5d7b:5d64:7c08:f5e5"
It is compared with the byte 0xFE. By looking here, we can learn that:
fe80::/10 — Addresses in the link-local prefix are only valid and unique on a single link (comparable to the auto-configuration addresses 169.254.0.0/16 of IPv4).
OK, so it is looking for the link-local prefix. Another interesting check occurs when we fail the previous one:
fffff804`4847279b e8f497f8ff call tcpip!IN6_IS_ADDR_LOOPBACK (fffff804`483fbf94)
fffff804`484727a0 84c0 test al,al
fffff804`484727a2 0f85567df4ff jne tcpip!Ipv6pHandleRouterAdvertisement+0x166 (fffff804`483ba4fe)
fffff804`484727a8 4180f8fe cmp r8b,0FEh
fffff804`484727ac 7515 jne tcpip!Ipv6pHandleRouterAdvertisement+0xb842b (fffff804`484727c3)
It checks whether we are coming from loopback, and next we are validated again for being link-local. I modified the packet to use a link-local address and…
Essentially, it subtracts 1 from the Length field and the result is divided by 2. This follows the documentation logic and can be summarized as:
tmp = (Length - 1) / 2
This logic generates the same result for an odd and an even number:
(8 - 1) / 2 => 3
(7 - 1) / 2 => 3
There is nothing wrong with that by itself. However, this also “defines” how long the packet is. Since IPv6 addresses are 16 bytes long, by providing an even number, the last 8 bytes of the payload will be used as the beginning of the next header. We can see that in Wireshark as well:
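The arithmetic can be sketched as follows (Python; the 8-bytes-per-unit and 16-bytes-per-address constants come from the RFC quoted above, while the consumption model of the handler is my assumption based on the write-up):

```python
# Sketch of the RDNSS parser arithmetic described above. Assumption: the
# handler consumes 8 option-header bytes plus 16 bytes per parsed address,
# while a conforming sender would cover exactly 8 * Length bytes.

def rdnss_addr_count(length_field):
    # Both the odd (valid) and even (malicious) Length give the same count:
    return (length_field - 1) // 2

def bytes_consumed(length_field):
    return 8 + 16 * rdnss_addr_count(length_field)

valid, evil = 7, 8             # 3 addresses either way
print(rdnss_addr_count(valid), rdnss_addr_count(evil))   # 3 3

# With an odd Length the handler consumes the whole option; with an even
# Length it stops 8 bytes short, and those trailing 8 bytes are later
# parsed as the start of the next (attacker-controlled) option header:
print(8 * valid - bytes_consumed(valid))   # 0
print(8 * evil - bytes_consumed(evil))     # 8
```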
That’s pretty interesting. However, what do we do with it? What next header should we fake? Why does this matter at all? Well… it took me some time to figure this out. To be honest, I wrote a simple fuzzer to find out.
The first byte encodes the “type” of the header. I made a test and generated the next header to be exactly the same as the “buggy” RDNSS one. I kept hitting the breakpoint for tcpip!Ipv6pUpdateRDNSS, but tcpip!Ipv6pHandleRouterAdvertisement was hit only once. I fired up IDA Pro and started to analyze what was going on and which logic was being executed. After some reverse engineering, I realized that we have 2 loops in the code:
The first loop goes through all the headers and does some basic validation (sanity of the Length field, etc.)
The second loop doesn’t do any more validation but parses the packet.
As long as there are more “optional headers” in the buffer, we stay in the loop. That’s a very good primitive! Anyway, I still didn’t know which headers should be used, so to find out I brute-forced all the “optional header” types via the triggered bug and found that the second loop cares only about:
Type 3 (Prefix Information)
Type 24 (Route Information)
Type 25 (RDNSS)
Type 31 (DNS Search List Option)
I’ve analyzed Type 24 logic since it was much “smaller / shorter” than Type 3.
Stack overflow
OK. Let’s try to generate the malicious RDNSS packet “faking” Route Information as a next one:
where rax holds the size of the packet and r15 keeps track of how much data has been consumed. In this specific case we have:
rax = 0x48
r15 = 0x40
This is exactly an 8-byte difference, because we used an even Length. To bypass it, I placed another header just after the last one. However, I was still hitting the same problem. It took me some time to figure out how to play with the packet layout, but I finally managed to do so.
Problem 4 – size again!
Finally, I found the correct packet layout and could reach the code responsible for handling the Route Information header. Except I did not. Here is why: after returning from the RDNSS handler, I ended up here:
ecx keeps the “Length” of the “fake header”. However, [r14+18h] points to the size of the data left in the packet. I set Length to the maximum (0xFF), which is multiplied by 8 (2040 == 0x7f8). However, there are only 0xb8 bytes left, so I failed another size validation!
To fix it, I decreased the size of the “fake header” and at the same time attached more data to the packet. That worked!
Problem 5 – NdisGetDataBuffer() and fragmentation
I had finally found all the puzzle pieces needed to trigger the bug. Or so I thought… I ended up executing the following code, responsible for handling the Route Information message:
It tries to read “Length” bytes from the packet to fetch the entire header. However, Length is fake and not validated; in my test case it had the value 0x100. The destination address points to a stack buffer representing the Route Information header, and it is a very small buffer. So we should have a classic stack overflow, but inside the NdisGetDataBuffer function I ended up executing this:
fffff804`4820e10c 8b7910 mov edi,dword ptr [rcx+10h]
fffff804`4820e10f 8b4328 mov eax,dword ptr [rbx+28h]
fffff804`4820e112 8bf2 mov esi,edx
fffff804`4820e114 488d0c3e lea rcx,[rsi+rdi]
fffff804`4820e118 483bc8 cmp rcx,rax
fffff804`4820e11b 773e ja fffff804`4820e15b
fffff804`4820e11d f6430a05 test byte ptr [rbx+0Ah],5 ds:002b:ffffcb83`086a4c7a=0c
fffff804`4820e121 0f84813f0400 je fffff804`482520a8
fffff804`4820e127 488b4318 mov rax,qword ptr [rbx+18h]
fffff804`4820e12b 4885c0 test rax,rax
fffff804`4820e12e 742b je fffff804`4820e15b
fffff804`4820e130 8b4c2470 mov ecx,dword ptr [rsp+70h]
fffff804`4820e134 8d55ff lea edx,[rbp-1]
fffff804`4820e137 4803c7 add rax,rdi
fffff804`4820e13a 4823d0 and rdx,rax
fffff804`4820e13d 483bd1 cmp rdx,rcx
fffff804`4820e140 7519 jne fffff804`4820e15b
fffff804`4820e142 488b5c2450 mov rbx,qword ptr [rsp+50h]
fffff804`4820e147 488b6c2458 mov rbp,qword ptr [rsp+58h]
fffff804`4820e14c 488b742460 mov rsi,qword ptr [rsp+60h]
fffff804`4820e151 4883c430 add rsp,30h
fffff804`4820e155 415f pop r15
fffff804`4820e157 415e pop r14
fffff804`4820e159 5f pop rdi
fffff804`4820e15a c3 ret
fffff804`4820e15b 4d85f6 test r14,r14
In the first ‘cmp‘ instruction, the rcx register holds the requested size. The rax register holds some huge number, and because of that I could never jump out of that logic. As a result of that call, I was getting an address different from the local stack address, and none of the overflow happened. I didn’t know what was going on… So, I started to read the documentation of this function, and here is the magic:
“If the requested data in the buffer is contiguous, the return value is a pointer to a location that NDIS provides. If the data is not contiguous, NDIS uses the Storage parameter as follows: If the Storage parameter is non-NULL, NDIS copies the data to the buffer at Storage. The return value is the pointer passed to the Storage parameter. If the Storage parameter is NULL, the return value is NULL.”
Here we go… Our big packet is kept somewhere in NDIS, and a pointer to that data is returned instead of it being copied to the local buffer on the stack. I started to Google whether anyone had already hit that problem and… of course, yes. Looking at this link:
The short story of broken KRETPROBES and OPTIMIZER in Linux Kernel.
During the LKRG development process I’ve found that:
KRETPROBES have been broken since kernel 5.8 (fixed in an upcoming kernel)
The OPTIMIZER has not been doing a sufficient job since kernel 5.5
First things first – KPROBES and FTRACE:
The Linux kernel provides 2 amazing frameworks for hooking – K*PROBES and FTRACE. K*PROBES is the older, classic one – introduced in 2.6.9 (October 2004). FTRACE is a newer interface and might have smaller overhead compared to K*PROBES. I’m using the word “K*PROBES” because various types of K*PROBES have been available in the kernel, including JPROBES, KRETPROBES and classic KPROBES. K*PROBES essentially enables the possibility to dynamically break into any kernel routine. What are the differences between the various K*PROBES?
KPROBES – can be placed on virtually any instruction in the kernel
JPROBES – were implemented using KPROBES. The main idea behind JPROBES was to employ a simple mirroring principle to allow seamless access to the probed function’s arguments. However, JPROBES were deprecated in 2017. More information can be found here: https://lwn.net/Articles/735667/
KRETPROBES – sometimes called “return probes”; they also use KPROBES under-the-hood. KRETPROBES allow you to easily execute your own routine on the entry and return paths of the hooked function. However, KRETPROBES can’t be placed on arbitrary instructions.
When a KPROBE is registered, it makes a copy of the probed instruction and replaces the first byte(s) of the probed instruction with a breakpoint instruction (e.g., int3 on i386 and x86_64).
FTRACE is newer compared to K*PROBES and was initially introduced in kernel 2.6.27, which was released on October 9, 2008. FTRACE works completely differently: the main idea is based on instrumenting every compiled function (injecting a “long-NOP” instruction – GCC’s “-pg” option). When FTRACE is registered on a specific function, that “long-NOP” is replaced with a JUMP instruction pointing to trampoline code. Such a trampoline can then execute any pre-registered user-defined hook.
A few words about Linux Kernel Runtime Guard (LKRG)
In short, LKRG performs runtime integrity checking of the Linux kernel (similar to Microsoft’s PatchGuard technology) and detection of various exploits against the kernel. LKRG attempts to post-detect and promptly respond to unauthorized modifications to the running Linux kernel (system integrity) or to corruption of task integrity such as credentials (user/group IDs), SECCOMP/sandbox rules, namespaces, and more. To implement such functionality, LKRG must place various hooks in the kernel. KRETPROBES are used to fulfill that requirement.
LKRG’s KPROBE on FTRACE instrumented functions
A careful reader might ask an interesting question: what happens if a function is instrumented by FTRACE (has the “long-NOP” injected) and someone registers K*PROBES on it? Does a dynamically registered FTRACE “overwrite” K*PROBES installed on that function, and vice versa?
Well, this is a very common situation from LKRG’s perspective, since it places KRETPROBES on many syscalls. The Linux kernel uses a special type of K*PROBES in such a case, called “FTRACE-based KPROBES”. Essentially, such a special KPROBE uses the FTRACE infrastructure and has very little to do with KPROBES itself. That’s interesting because it is also subject to FTRACE rules, e.g., if you disable the FTRACE infrastructure, such a special KPROBE won’t work either.
OPTIMIZER
Linux kernel developers went one step further and aggressively “optimize” all K*PROBES to use FTRACE instead. The main reason behind that is performance – FTRACE has smaller overhead. If for any reason such a KPROBE can’t be optimized, the classic old-school KPROBES infrastructure is used.
When you analyze all the KRETPROBES placed by LKRG, you will realize that on modern kernels all of them are converted to some type of FTRACE.
LKRG reports False Positives
After such a long introduction, we can finally move on to the topic of this article. Vitaly Chikunov from ALT Linux reported that when he ran an FTRACE stress tester, LKRG reported corruption of the .text section:
I spent a few weeks (a month+) on making LKRG detect and accept authorized third-party modifications to the kernel’s code placed via FTRACE. When I finally finished that work, I realized that I additionally needed to protect the global FTRACE knob (sysctl kernel.ftrace_enabled), which allows root to completely disable FTRACE on a running system. Otherwise, LKRG’s hooks might be unknowingly disabled, which not only disables its protections (kind of OK under a threat model where we trust host root), but may also lead to false positives (as without the hooks LKRG wouldn’t know which modifications are legitimate). I added that functionality, and everything was working fine…

… until kernel 5.9. This completely surprised me. I had not seen any significant changes in the FTRACE logic between 5.8.x and 5.9.x. I spent some time on it and finally realized that my protection of the global FTRACE knob had stopped working on the latest kernels (since 5.9). However, this code was not changed between kernel 5.8.x and 5.9.x. What’s the mystery?
if (user_mode(regs)) {
RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
} else {
/*
* We might have interrupted pretty much anything. In
* fact, if we're a machine check, we can even interrupt
* NMI processing. We don't want in_nmi() to return true,
* but we need to notify RCU.
*/
rcu_nmi_enter();
}
preempt_disable();
/*
* idtentry_enter_user() uses static_branch_{,un}likely() and therefore
* can trigger INT3, hence poke_int3_handler() must be done
* before. If the entry came from kernel mode, then use nmi_enter()
* because the INT3 could have been hit in any context including
* NMI.
*/
if (user_mode(regs)) {
idtentry_enter_user(regs);
instrumentation_begin();
do_int3_user(regs);
instrumentation_end();
idtentry_exit_user(regs);
} else {
nmi_enter();
instrumentation_begin();
trace_hardirqs_off_finish();
if (!do_int3(regs))
die("int3", regs, 0);
if (regs->flags & X86_EFLAGS_IF)
trace_hardirqs_on_prepare();
instrumentation_end();
nmi_exit();
}
The root of this unlucky change comes from this commit:
and this is what we currently have in all kernels since 5.8. Essentially, KRETPROBES have not been working since these commits. We have the following logic:
asm_exc_int3() -> exc_int3():
|
----------------|
|
v
...
nmi_enter();
...
if (!do_int3(regs))
|
-----|
|
v
do_int3() -> kprobe_int3_handler():
|
----------------|
|
v
...
if (!p->pre_handler || !p->pre_handler(p, regs))
|
-------------------------|
|
v
...
pre_handler_kretprobe():
...
if (unlikely(in_nmi())) {
rp->nmissed++;
return 0;
}
Essentially, exc_int3() calls nmi_enter(), and pre_handler_kretprobe(), before invoking any registered KRETPROBE handler, verifies that it is not in NMI context via the in_nmi() call.
I’ve reported this issue to the maintainers and it was addressed and correctly fixed. These patches are going to be backported to the stable tree (and hopefully to LTS kernels as well):
However, coming back to the original problem with LKRG… I didn’t see any issues with kernel 5.8.x but with 5.9.x. It’s interesting because KRETPROBES were broken in 5.8.x as well. So what’s going on?
As I mentioned at the beginning of the article, K*PROBES are aggressively optimized and converted to FTRACE. In kernel 5.8.x, LKRG’s hook was correctly optimized and didn’t use KRETPROBES at all. That’s why I didn’t see any problems with this version. However, for some reason, such optimization was not possible in kernel 5.9.x. This results in placing a classic non-optimized KRETPROBE, which we know is broken.
Second problem – the OPTIMIZER isn’t doing a sufficient job anymore.
I didn’t see any changes in the sources regarding the OPTIMIZER, nor in the hooked function itself. However, when I looked at the generated vmlinux binary, I saw that GCC had generated padding at the end of the hooked function using the INT3 opcode:
...
ffffffff8130528b: 41 bd f0 ff ff ff mov $0xfffffff0,%r13d
ffffffff81305291: e9 fe fe ff ff jmpq ffffffff81305194
ffffffff81305296: cc int3
ffffffff81305297: cc int3
ffffffff81305298: cc int3
ffffffff81305299: cc int3
ffffffff8130529a: cc int3
ffffffff8130529b: cc int3
ffffffff8130529c: cc int3
ffffffff8130529d: cc int3
ffffffff8130529e: cc int3
ffffffff8130529f: cc int3
Such padding didn’t exist in this function in images generated for older kernels. Nevertheless, such padding is pretty common.
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index b06d6e1188deb..3a1a819da1376 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -144,7 +144,7 @@ SECTIONS
*(.text.__x86.indirect_thunk)
__indirect_thunk_end = .;
#endif
- } :text = 0x9090
+ } :text =0xcccc
/* End of text section, which should occupy whole number of pages */
_etext = .;
It looks like INT3 is now the default padding used by the linker.
I brought up that problem with the Linux kernel developers (KPROBES owners), and Masami Hiramatsu prepared an appropriate patch which fixes the problem:
It was a few months before I joined Microsoft as a Security Software Engineer in 2012 when I sent them a report with an interesting bug/vulnerability in all versions of Microsoft Windows, including Windows 7 (the latest version at that time). It was an issue in the implementation of the TCP/IP stack allowing attackers to carry out a blind TCP/IP hijacking attack. During my discussion with MSRC (Microsoft Security Response Center), they acknowledged the bug exists, but they had their doubts about the impact of the issue, claiming “it is very difficult and very unreliable” to exploit. Therefore, they were not going to address it in the current OSes. However, they would fix it in the upcoming OS which was going to be released soon (Windows 8).
I didn’t agree with MSRC’s evaluation. In 2008 I had developed a fully working PoC which would automatically find all the necessary primitives (client’s port, SQN and ACK) to perform a blind TCP/IP hijacking attack. This tool was exploiting exactly the same weaknesses in the TCP/IP stack which I had reported. That being said, Microsoft informed me that if I shared my tool (I didn’t want to do it), they would reconsider their decision. However, for now, no CVE would be allocated, and this problem was supposed to be addressed in Windows 8.
In the following months I started my work as an FTE (Full Time Employee) for Microsoft, and I verified that this problem was fixed in Windows 8. Over the years, I completely forgot about it. Nevertheless, when I left Microsoft, I was doing some cleanup on my old laptop and found my old tool. I copied it from the laptop and decided to revisit it once I had a bit more time. Eventually I found that time and thought that my tool deserved a release and a proper description.
What is TCP/IP hijacking?
Most likely the majority of readers are aware of what this is. For those who aren’t, I encourage you to read one of the many great articles about it which you can find on the internet these days.
It might be worth to mention that probably the most famous blind TCP/IP hijacking attack was done by Kevin Mitnick against the computers of Tsutomu Shimomura at the San Diego Supercomputer Center on Christmas Day, 1994.
This is a VERY old-school technique which nobody expects to be alive in 2021… Yet, it’s still possible to perform TCP/IP session hijacking today without attacking the PRNG responsible for generating the initial TCP sequence numbers (ISN).
What is the impact of TCP/IP hijacking nowadays?
(Un)fortunately, it is not as catastrophic as it used to be. The main reason is that the majority of modern protocols implement encryption. Sure, it’s overwhelmingly bad if an attacker can hijack any established TCP/IP session. However, if the upper-layer protocols properly implement encryption, attackers are limited in what they can do with it – unless they have the ability to correctly generate encrypted messages.
That being said, we still have widely deployed protocols which do not encrypt the traffic, e.g., FTP, SMTP, HTTP, DNS, IMAP, and more. Thankfully, protocols like Telnet or Rlogin can (hopefully?) be seen only in a museum.
Where is the bug?
TL;DR: In the implementation of TCP/IP stack for Windows 7, IP_ID is a global counter.
Details:
The tool which I developed in 2008 implemented a known attack described by ‘lkm’ (there is a typo in the magazine and the real nickname of the author is ‘klm’) in Phrack 64, which can be read here:
This is an amazing article (research) and I encourage everyone to carefully study all the details.
Back in 2007 (and 2008), this attack could be executed successfully against many then-modern OSes, including Windows 2K/XP and FreeBSD 4. I gave a live presentation of this attack against Windows XP at a local conference in Poland (SysDay 2009).
Before we move to the details of how to perform the described attack, it is useful to recall in more detail how TCP handles communication. Quoting the phrack paper:
Each of the two hosts involved in the connection computes a 32bits SEQ number randomly at the establishment of the connection. This initial SEQ number is called the ISN. Then, each time an host sends some packet with N bytes of data, it adds N to the SEQ number.
The sender put his current SEQ in the SEQ field of each outgoing TCP packet. The ACK field is filled with the next expected SEQ number from the other host. Each host will maintain his own next sequence number (called SND.NEXT), and next expected SEQ number from the other host (called RCV.NEXT. (…) TCP implements a flow control mechanism by defining the concept of “window”. Each host has a TCP window size (which is dynamic, specific to each TCP connection, and announced in TCP packets), that we will call RCV.WND. At any given time, a host will accept bytes with sequence number between RCV.NXT and (RCV.NXT+RCV.WND-1). This mechanism ensures that at any time, there can be no more than RCV.WND bytes “in transit” to the host.
In short, in order to execute a TCP/IP hijacking attack, we must know:
Client IP
Server IP (usually known)
Client port
Server port (usually known)
Sequence number of the client
Sequence number of the server
OK, but what does it have to do with the IP ID?
In 1998(!), Salvatore Sanfilippo (aka antirez) posted to the Bugtraq mailing list a description of a new port-scanning technique, known today as the “Idle scan”. The original post can be found here:
In short, if IP_ID is implemented as a global counter (which is the case, e.g., in Windows 7), it is simply incremented with each sent IP packet. By “probing” the victim’s IP_ID we know how many packets have been sent between each “probe”. Such “probing” can be performed by sending the victim any packet which results in a reply to the attacker. ‘lkm’ suggests using an ICMP packet, but it can be any packet with an IP header:
[===================================================================]
attacker Host
--[PING]->
<-[PING REPLY, IP_ID=1000]--
... wait a little ...
--[PING]->
<-[PING REPLY, IP_ID=1010]--
<attacker> Uh oh, the Host sent 9 IP packets between my pings.
[===================================================================]
This essentially creates a form of “covert channel” which a remote attacker can exploit to “discover” all the information necessary to execute the TCP/IP hijacking attack. How? Let’s quote the original phrack article:
Discovering client’s port
Assuming we already know the client/server IP, and the server port, there’s a well known method to test if a given port is the correct client port. In order to do this, we can send a TCP packet with the SYN flag set to server-IP:server-port, from client-IP:guessed-client-port (we need to be able to send spoofed IP packets for this technique to work).
When the attacker guesses the valid client port, the server replies to the real client (not the attacker) with an ACK. If the port was incorrect, the server replies to the real client with SYN+ACK. The real client didn’t start a new connection, so it replies to the server with RST.
So, all we have to do to test if a guessed client-port is the correct one is:
– Send a PING to the client, note the IP ID
– Send our spoofed SYN packet
– Resend a PING to the client, note the new IP ID
– Compare the two IP IDs to determine if the guessed port was correct.
Finding the server’s SND.NEXT
This is the essential part, and the best I can do is to quote (again) the phrack article:
Whenever a host receive a TCP packet with the good source/destination ports, but an incorrect seq and/or ack, it sends back a simple ACK with the correct SEQ/ACK numbers. Before we investigate this matter, let’s define exactly what is a correct seq/ack combination, as defined by the RFC793 [2]:
A correct SEQ is a SEQ which is between the RCV.NEXT and (RCV.NEXT+RCV.WND-1) of the host receiving the packet. Typically, the RCV.WND is a fairly large number (several dozens of kilobytes at last).
A correct ACK is an ACK which corresponds to a sequence number of something the host receiving the ACK has already sent. That is, the ACK field of the packet received by an host must be lower or equal than the host’s own current SND.SEQ, otherwise the ACK is invalid (you can’t acknowledge data that were never sent!).
It is important to node that the sequence number space is “circular”. For exemple, the condition used by the receiving host to check the ACK validity is not simply the unsigned comparison “ACK <= receiver’s SND.NEXT”, but the signed comparison “(ACK – receiver’s SND.NEXT) <= 0”.
Now, let’s return to our original problem: we want to guess server’s SND.NEXT. We know that if we send a wrong SEQ or ACK to the client from the server, the client will send back an ACK, while if we guess right, the client will send nothing. As for the client-port detection, this may be tested with the IP ID.
If we look at the ACK checking formula, we note that if we pick randomly two ACK values, let’s call them ack1 and ack2, such as |ack1-ack2| = 2^31, then exactly one of them will be valid. For example, let ack1=0 and ack2=2^31. If the real ACK is between 1 and 2^31 then the ack2 will be an acceptable ack. If the real ACK is 0, or is between (2^32 – 1) and (2^31 + 1), then, the ack1 will be acceptable.
Taking this into consideration, we can more easily scan the sequence number space to find the server’s SND.NEXT. Each guess will involve the sending of two packets, each with its SEQ field set to the guessed server’s SND.NEXT. The first packet (resp. second packet) will have his ACK field set to ack1 (resp. ack2), so that we are sure that if the guessed’s SND.NEXT is correct, at least one of the two packet will be accepted.
The sequence number space is way bigger than the client-port space, but two facts make this scan easier:
First, when the client receive our packet, it replies immediately. There’s not a problem with latency between client and server like in the client-port scan. Thus, the time between the two IP ID probes can be very small, speeding up our scanning and reducing greatly the odds that the client will have IP traffic between our probes and mess with our detection.
Secondly, it’s not necessary to test all the possible sequence numbers, because of the receiver’s window. In fact, we need only to do approx. (2^32 / client’s RCV.WND) guesses at worst (this fact has already been mentionned in [6]). Of course, we don’t know the client’s RCV.WND. We can take a wild guess of RCV.WND=64K, perform the scan (trying each SEQ multiple of 64K). Then, if we didn’t find anything, wen can try all SEQs such as seq = 32K + i64K for all i. Then, all SEQ such as seq=16k + i32k, and so on… narrowing the window, while avoiding to re-test already tried SEQs. On a typical “modern” connection, this scan usually takes less than 15 minutes with our tool.
With the server’s SND.NEXT known, and a method to work around our ignorance of the ACK, we may hijack the connection in the way “server -> client”. This is not bad, but not terribly useful, we’d prefer to be able to send data from the client to the server, to make the client execute a command, etc… In order to do this, we need to find the client’s SND.NEXT.
And here is a small, weird difference in Windows 7. The described scenario works perfectly against Windows XP, but I encountered different behavior on Windows 7. Using the two edge-case ACK values to satisfy the ACK formula doesn’t really change anything, and I get exactly the same results (just on Windows 7) by always using one of the edge values for the ACK. Originally, I thought that my implementation of the attack was not working against Windows 7. However, after some tests and tuning it turned out that’s not the case. I’m not sure why, or what I’m missing but, in the end, you can send fewer packets (half as many) and speed up the overall attack.
Finding the client’s SND.NEXT
Quote:
What we can do to find the client’s SND.NEXT ? Obviously we can’t use the same method as for the server’s SND.NEXT, because the server’s OS is probably not vunerable to this attack, and besides, the heavy network traffic on the server would render the IP ID analysis infeasible.
However, we know the server’s SND.NEXT. We also know that the client’s SND.NEXT is used for checking the ACK fields of client’s incoming packets. So we can send packets from the server to the client with SEQ field set to server’s SND.NEXT, pick an ACK, and determine (again with IP ID) if our ACK was acceptable.
If we detect that our ACK was acceptable, that means that (guessed_ACK – SND.NEXT) <= 0. Otherwise, it means.. well, you guessed it, that (guessed_ACK – SND_NEXT) > 0.
Using this knowledge, we can find the exact SND_NEXT in at most 32 tries by doing a binary search (a slightly modified one, because the sequence space is circular).
Now, at last we have all the required informations and we can perform the session hijacking from either client or server.
(Un)fortunately, here Windows 7 is different as well. This is connected to the differences in the previous stage, in how it handles the correctness of the ACK. Regardless of the guessed_ACK value ((guessed_ACK - SND.NEXT) <= 0 or (guessed_ACK - SND_NEXT) > 0), Windows 7 won’t send any packet back to the server. Essentially, we are blind here, and we can’t do the same amazingly effective ‘binary search’ to find the correct ACK. However, we are not completely lost. We can always brute-force the ACK if we have the correct SQN. Again, we don’t need to verify every possible value of the ACK; we can still use the same trick with the TCP window size. Nevertheless, to be more effective and not miss the correct ACK bracket, I chose to use 0x3FF as the window-size value. Essentially, we flood the server with spoofed packets containing our payload for injection, with the correct SQN and a guessed ACK. This operation takes around 5 minutes and is effective. Nevertheless, if for any reason our payload is not injected, a smaller TCP window size (e.g., 0xFF) should be chosen.
Important notes
This type of attack is not limited to any specific OS, but rather leverages the “covert channel” created by implementing IP_ID as a global counter. In short, any OS which is vulnerable to the “Idle scan” is also vulnerable to the old-school blind TCP/IP hijacking attack.
We need to be able to send spoofed IP packets to execute this attack.
Our attack relies on “scanning” and constant “poking” of IP_ID:
Any latency between the victim and the server affects this logic.
If the victim’s machine is overloaded (heavy or slow traffic), it obviously affects the attack. Taking appropriate measurements of the victim’s networking performance might be necessary for correct tuning of the attack.
Proof-of-Concept
Originally, I implemented lkm’s attack in 2008 and tested it against Windows XP. When I ran the old compiled binary on a modern system, everything worked fine. However, when I took the original sources and recompiled them in a modern Linux environment, my tool stopped working(!). The new binary was not able to find the client’s port, nor the SQN. However, the old binary still worked perfectly fine. What was happening was a riddle to me. The output of the strace tool gave me some clues:
cmsg_len and msg_controllen have different values. However, I didn’t modify the source code, so how is that possible? Some GCC/glibc changes broke the functionality of sending the spoofed packet. I found the answer here:
I needed to rewrite the spoofing function, using a different API, to make it functional again in a modern Linux environment. I wonder how many non-offensive tools were broken by this change.
Windows 7
I’ve tested this tool against a fully updated Windows 7. Surprisingly, rewriting the PoC was not the most difficult task… setting up a fully updated Windows 7 is much more problematic. Many updates break the update channel/service(!) itself, and you need to fix it manually. Usually, this means manually downloading a specific KB and installing it in “safe mode”. That can “unlock” the update service so you can continue your work. In the end it took me around 2-3 days to get a fully updated Windows 7, and it looks like this:
192.168.1.132 – attacker’s IP address
192.168.1.238 – victim’s Windows 7 machine IP address
192.168.1.169 – FTP server running on Linux. I’ve tested ProFTPd and vsFTPd servers running under a git ToT kernel (5.11+)
This tool does not do appropriate per-victim “tuning”, which could significantly speed up the attack. However, in my specific case, the full attack – finding the client’s port, finding the server’s SQN and finding the client’s SQN – took about 45 minutes.
I found old logs from attacking Windows XP (~2009), and the entire attack took almost an hour:
…::: -=[ [d]evil_pi3 TCP/IP Blind Spoofer by Adam ‘pi3’ Zabrocki ]=- :::…
[+] Trying to find client port
[+] Found port => 49456!
[+] Veryfing… OK!
[+] Second level of verifcation
[+] Found port => 49456!
[+] Veryfing… OK!
[!!] Port is found (49456)! Let’s go further…
[+] Trying to find server’s window SQN
[+] Found server’s window SQN => 1874825280, with ACK => 758086748 with seq_offset => 65535
[+] Rechecking…
[+] Found server’s window SQN => 1874825280, with ACK => 758086748 with seq_offset => 65535
[!!] SQN => 1874825280, with seq_offset => 65535
[+] Trying to find server’s real SQN
[+] Found server’s real SQN => 1874825279 => seq_offset 32767
[+] Found server’s real SQN => 1874825277 => seq_offset 16383
[+] Found server’s real SQN => 1874825275 => seq_offset 8191
[+] Found server’s real SQN => 1874825273 => seq_offset 4095
[+] Found server’s real SQN => 1874823224 => seq_offset 2047
[+] Found server’s real SQN => 1874822199 => seq_offset 1023
[+] Found server’s real SQN => 1874821686 => seq_offset 511
[+] Found server’s real SQN => 1874821684 => seq_offset 255
[+] Found server’s real SQN => 1874821555 => seq_offset 127
[+] Found server’s real SQN => 1874821553 => seq_offset 63
[+] Found server’s real SQN => 1874821520 => seq_offset 31
[+] Found server’s real SQN => 1874821518 => seq_offset 15
[+] Found server’s real SQN => 1874821509 => seq_offset 7
[+] Found server’s real SQN => 1874821507 => seq_offset 3
[+] Found server’s real SQN => 1874821505 => seq_offset 1
[+] Found server’s real SQN => 1874821505 => seq_offset 1
[+] Rechecking…
[+] Found server’s real SQN => 1874821505 => seq_offset 1
[+] Found server’s real SQN => 1874821505 => seq_offset 1
[!!] Real server’s SQN => 1874821505
[+] Finish! check whether command was injected (should be :))
[!] Next SQN [1874822706]
real	56m38.321s
user	0m8.955s
sys	0m29.181s
pi3-darkstar z_new #
Some more notes:
Sometimes you can see the tool spinning around the same value when trying to find the “server’s real SQN”. If you see the number 1 next to the number in the parentheses, kill the attack, copy the calculated SQN (the one around which the tool was spinning) and pass it as the SQN start parameter (-M). It should fix that edge case.
Sometimes you can encounter the problem that scanning with a 64KB window size ‘overjumps’ the appropriate SQN bracket. You might want to reduce the window size. However, the tool should change the window size automatically if it finishes scanning the full SQN range with the current window size without finding the correct value. Nevertheless, that takes time. You might want to start scanning with a smaller window size (but that implies a longer attack).
By default, the tool sends an ICMP message to the victim’s machine to read the IP_ID. However, I’ve implemented functionality to read that field from any IP packet: it sends a standard SYN packet and waits for the reply to extract the IP_ID. Please provide an appropriate TCP port via the appropriate parameter (-P).
Modern operating systems (like Windows 10) usually implement IP_ID as a “local” counter per session. If you monitor the IP_ID of a specific session, you can see it is simply incremented with each sent packet. However, each session has an independent IP_ID base.
During LKRG development and testing I’ve found 7 Linux kernel bugs; 4 of them have CVE numbers (however, 1 CVE number covers 2 bugs):
CVE-2021-3411 - Linux kernel: broken KRETPROBES and OPTIMIZER
CVE-2020-27825 - Linux kernel: Use-After-Free in the ftrace ring buffer
resizing logic due to a race condition
CVE-2020-25220 - Linux kernel Use-After-Free in backported patch for
CVE-2020-14356 (affected kernels: 4.9.x before 4.9.233,
4.14.x before 4.14.194, and 4.19.x before 4.19.140)
CVE-2020-14356 - Linux kernel Use-After-Free in cgroup BPF component
(affected kernels: since 4.5+ up to 5.7.10)
I’ve also found 2 other issues related to the ftrace UAF bug (CVE-2020-27825):
A deadlock issue which was not really addressed – the devs said they would take a look, but there have not been many updates on that.
A problem in the code related to the hwlatd kernel thread – it synchronizes incorrectly with its launcher/killer. You can trigger a WARN in the kernel all the time.
Incompatibility of the KPROBE optimizer with the latest changes in the linker.
Additionally, I’ve also found a bug with the kernel signal handling in dying process:
CVE-2020-12826 – Linux kernel prior to 5.6.5 does not sufficiently restrict exit signals
However, I don’t remember if I found it during my LKRG-related work, so I’m not counting it here (otherwise it would be a total of 8 bugs, 5 of them with CVEs).
Those are pretty bad stats… However, it might be an interesting story to tell during the announcement of the new LKRG version. It could also be an interesting talk for a conference.
This year I’m going to present some amazing research on:
BlackHat 2021 – “Safeguarding UEFI Ecosystem: Firmware Supply Chain is Hard(coded)” – together with Alex Tereshkin and Alex Matrosov. Abstract: click here
DefCon 29 – “Glitching RISC-V chips: MTVEC corruption for hardening ISA” – together with Alex Matrosov. Abstract: click here
Both of them are really unusual and interesting topics!
If anyone is going to be in Las Vegas during BlackHat and/or DefCon this year and would like to grab a beer, just let me know!
At first, I didn’t plan to write an article about the problems with bug bounty programs. This was supposed to be a standard technical blogpost describing an interesting bug in the Linux Kernel i915 driver allowing for linear Out-Of-Bound read and write access (CVE-2023-28410). Moreover, I’m not even into bug bounty programs, mostly because I don’t need to be, since I consider myself lucky enough to have a satisfying, stable and well-paid job. That being said, in my spare time, apart from developing and maintaining the Linux Kernel Runtime Guard (LKRG) project, I still like doing vulnerability research and exploit development, not only for my employer, and from time to time it’s good to update your resume with new CVE numbers. Before I started to have a stable income, bug bounties didn’t exist, and quality vulnerability research mostly paid the bills via brokers (let’s leave aside the moral questions arising from this). However, nowadays we have bug bounty programs…
Over the last decade (a bit longer, actually), bug bounty programs have gained a lot of deserved traction. There are security researchers who rely on bug bounties as their primary(!) source of income. Such cases are irrefutable proof of the success of bug bounty programs. However, before the industry ended up where it is now, it went through a long and interesting route.
A little bit of history:
On November 5, 1993, Scott Chasin created the bugtraq mailing list in response to the problems with CERT; there, security researchers could publish vulnerabilities regardless of vendor response, as part of the full disclosure movement. On July 9, 2002, Len Rose created Full Disclosure – a “lightly moderated” security mailing list – because many people felt that the bugtraq mailing list had “changed for the worse”. However, not everyone agreed with the “Full Disclosure” philosophy, and the 2 major alternatives can be summarized as Non-Disclosure and Coordinated Disclosure (also known as “Responsible Disclosure”). Non-Disclosure is generally favored by blackhat hackers and, later, by commercial exploit vendors (including brokers).
Coordinated (responsible) vulnerability disclosure is a policy under which researchers agree to report vulnerabilities to a coordinating authority, which then reports it to the vendor, tracks the fixes and mitigations, and coordinates the disclosure of information with stakeholders including the public. In some cases, the coordinating authority is the vendor. However, some researchers started to raise concerns about how vendors handle the reported vulnerabilities (and treat the researchers as enemies) while others started to have expectations of compensation for their reports (bug bounty-like programs).
“No more free bugs”
In March 2009, at the CanSecWest conference, Alex Sotirov, Dino Dai Zovi and Charlie Miller announced the “No More Free Bugs” campaign, spreading the word that, in most cases, they would no longer be providing vendors with free vulnerability notices. Regardless of how effective it was (not really: https://twitter.com/0xcharlie/status/1480593646756024323), it certainly made industry news and opened a broader debate about the problem. It is worth mentioning that in the old days, when security researchers got a free t-shirt for their work, they could consider themselves lucky. The real rewards came in fame and job offers from third parties (if at all).
The reason I’m mentioning all these “past problems” is because they are the basis of how and why we have bug bounty programs today in some form or other. Moreover, certainly we would NOT want to find the industry in a similar shape as in the old days.
Bug bounty programs – A new wonderful world!
Let’s ask ChatGPT the following question:
“What is the history of bug bounty programs?”
“Bug bounty programs have their origins in the early days of the internet, with companies such as Netscape and Mozilla offering rewards for the discovery and reporting of security vulnerabilities in their software. The concept of a “bug bounty” was formalized by the United States Department of Defense in the early 2000s with the launch of the Vulnerability Reward Program (VRP). Since then, many technology companies and organizations have implemented similar programs, offering rewards to security researchers for finding and reporting vulnerabilities in their systems. These programs have become an important tool in the fight against cybercrime, as they incentivize individuals to find and report vulnerabilities before they can be exploited by malicious actors.”
Pretty accurate! While the market for vulnerabilities has developed, vulnerability commercialization (or “bug bounties”) remains an important tool allowing developers to discover and resolve bugs before the general public becomes aware of them, preventing incidents of widespread abuse and data breaches.
However, some people might ask: “Why do companies bother creating a bug bounty program?”. It’s a fair question – what’s the point (from the companies’ perspective) of paying money to “random” people for finding bugs in their products? Bug bounties benefit not only the researchers. In fact, if a company’s security is mature enough and its product development is security-oriented (which is usually not the case), they can actually bring a lot of benefits, including:
Cost-effective vulnerability management – bug bounties can be more cost-effective than hiring a third-party security firm to conduct penetration testing or vulnerability assessments, which can be expensive (especially for very complicated and mature products). Additionally, with a bug bounty program, companies can expand their testing and vulnerability research coverage, as they can have many security researchers with different levels of expertise, experience, and skills testing their products and systems. This can help the company to find vulnerabilities that might have been missed by their internal team.
Brand reputation – by having a bug bounty program, companies can show that they care about security and are willing to invest in it. It can also help to improve the company’s reputation in the security industry.
“Protect” the brand – by opening a bug bounty program, companies can encourage researchers to report vulnerabilities directly to them, rather than publicizing them or selling them to malicious actors. This can help to mitigate various security risks.
“Advertisement” to potential customers – bug bounties show that companies take security seriously and are actively working to identify and address vulnerabilities. This helps companies build trust with their customers, partners, and other stakeholders.
Nevertheless, it should be noted that having a bug bounty program does not replace the need for secure SDLC, regular security testing and vulnerability management practices, and more. It’s an additional layer of security that can complement the existing security measures in place.
“Bug bounties are broken”
As we can see, bug bounties are very successful because both parties – researchers and companies – benefit from them. However, that being said, in recent years more and more security researchers started to loudly complain about some specific bounty programs. A few years ago, given the success of the bug bounties, such unfavorable opinions were marginal. Of course, they have always been there but certainly they were not as visible. Whenever a new company joined the revolution of opening a bug bounty, they were praised (especially by the media) for being mature and understanding the importance of security problems instead of pretending that security problems don’t affect them. However, has any media article really analyzed how such programs work in detail? Are there really no traps hidden for researchers in the conditions of the program?
Unfortunately, it seems as if every month social media (twitter?) discusses cases of, to put it mildly, controversial activities of some companies around their bug bounty programs. Is the golden age of bug bounty over? Perhaps little has changed other than more researchers starting to rely on the bug bounties as a primary (or significant) source of their income. Naturally, problems with the bug bounties would significantly affect their life and cause them to raise their concerns more broadly/boldly.
Bug bounties (as well as any type of vulnerability reporting program) always involved risk for the researchers. Some of the main reasons (again, thanks to ChatGPT! :)) include:
Insufficient rewards – some researchers may feel that the rewards offered by bug bounty programs are insufficient, either in terms of the monetary value or the recognition they receive. This can be especially true for researchers who discover high-severity vulnerabilities, which may be more difficult or time-consuming to find.
Slow or unresponsive communication – researchers may be frustrated by slow or unresponsive communication from bug bounty program coordinators, especially when it comes to triaging and addressing reported vulnerabilities.
Lack of transparency – researchers may feel that there is a lack of transparency in how bug bounty programs are run and how rewards are determined. They may also feel that there is a lack of communication about which vulnerabilities are being fixed and when.
Unclear scope – researchers may feel that the scope of the bug bounty program is not clearly defined, making it difficult for them to know which vulnerabilities are in scope and which are not.
Duplicate reports – researchers may feel frustrated when they report a vulnerability and then find out that it has already been reported by someone else. This can be especially frustrating if they are not rewarded for their work.
Rejecting a report – researchers may feel that their report is rejected without clear explanation, or that their report is not considered as a vulnerability.
No public recognition – some researchers may want to be publicly recognized for their work and not just rewarded with money.
As you can imagine, every single case from that list is happening to some extent. However, the fair question to ask is “what’s the main reason behind the possibilities of such problems?”.
“Imbalance of Power”
The fundamental problem with bug bounties is that they are not only created by the companies/vendors, but entirely and arbitrarily defined by them, and security researchers are at their mercy, without any recourse (because to whom can you appeal? to those who created, defined and decided what to do with the reported work?). We have a problem with power imbalance. In an ideal world, we would have equitable terms that both parties agree on – what is to be expected under what circumstances – before a security researcher even starts the work. Bug bounties are simply not structured this way. Moreover, there’s a lot of false and misleading advertising, e.g., all of these articles showing million-dollar bug hunters, giving the impression that YOU too can become rich doing bug bounties. A lot of young and motivated people, sometimes living in completely different countries than the company, have few options to fight for their rights when they are wronged. They might not know their legal rights, or lack the experience and capital to do so. However, some people in such a situation try to fight back by putting pressure on the company via social media, and we can see that from time to time, e.g., on twitter.
Let’s look at some of the random examples of the problems with bug bounty programs that researchers had to deal with (you can find multiple ones):
1) “Rejecting a report”, “Slow or unresponsive communication”, “Unclear scope”, “Lack of transparency”, “No public recognition”, and after appeal “Insufficient rewards” from the researcher’s perspective:
“Windows 10 RCE: The exploit is in the link” (December 7th, 2021)
New blog post: Windows 10 RCE via an argument injection in the ms-officecmd URI handler. While our RCE vector (MS Teams) has been fixed, the argument injection still persists.https://t.co/dDodI8QREi
"Microsoft Bug Bounty Program's (MSRC) response was poor: Initially, they misjudged and dismissed the issue entirely. After our appeal, the issue was classified as "Critical, RCE", but only 10% of the bounty advertised for its classification was awarded ($5k vs $50k). The patch they came up with after 5 months failed to properly address the underlying argument injection (which is currently also still present on Windows 11)"
In that specific case, the researchers “hit” 6(!) out of 7 general potential problems with bug bounties (almost a jackpot! :)). It is fascinating and educational to read the entire section on their communication with Microsoft (highly recommended). What is worth pointing out is that:
"Someone at MS must agree with us, because the report earned us 180 Researcher Recognition Program points, a number which we could only explain as 60 base points with the 3X bonus multiplier for 'eligible attack scenarios' applied."
2) “Unclear scope” which resulted in “Insufficient rewards” from the researcher’s perspective
1) As long as it is a virtual machine escape vulnerability that is out of scope recognized by MSRC, it is not considered as a virtual machine escape vulnerability, even if this component is the default configuration on wsl2 and has the ability to escape from a virtual machine. https://t.co/C4Ksn3BCqUpic.twitter.com/yne3LB724k
A researcher found a bug in the DirectX virtualization feature which allowed him to escape the VM boundary. However, DirectX is not a part of Hyper-V itself but a separate feature which can be used by Hyper-V to provide certain functionality. DirectX is off by default unless certain features opt to turn it on. However, the researcher argued that this specific feature is on by default on WSL2 as well as some cloud providers which provide DirectX virtualization (for some scenarios).
That’s an interesting case from the scoping perspective because, at the end of the day, the researcher developed a PoC and proved it could escape the VM boundary. Regardless of whether the problem is in a core Hyper-V component or not, such bugs are valuable on the black market, and one of the brokers suggested contacting them next time instead of the bug bounty program:
It's unfortunate, if it helps, next time contract me of you are interested to sell it to a third party and get paid for your efforts
3) “Rejecting a report” and “Unclear scope”
A researcher found a few instances of Microsoft Exchange servers whose IPs/domains belong to Microsoft and which were vulnerable to the ProxyShell attack. ProxyShell is an attack chain that exploits three known vulnerabilities in Microsoft Exchange: CVE-2021-34473, CVE-2021-34523 and CVE-2021-31207. By exploiting these vulnerabilities, attackers can perform remote code execution.
The researcher reported these issues to MSRC, and after about 1 month Microsoft fixed all the issues and sent the researcher negative responses, claiming that all the issues were out of scope and/or of only moderate severity. The researcher argued that by combining all these issues you would get RCE in the end, and RCE is certainly not a moderate issue (especially given that Microsoft fixed the reported problems).
4) “Duplicate reports” and “Slow or unresponsive communication”
A researcher submitted multiple high-impact vulnerabilities, and months later the vendor rejected all the bugs, arguing that they had known about them because they had discovered exactly the same bugs internally prior to the submission. However, the vendor didn’t and couldn’t provide any evidence (no CVEs) that they had really found them internally. Moreover, the vendor took a long time to respond, which raises questions about (and undermines the credibility of) the claimed internal findings.
5) “Insufficient rewards”
Decided to publish the Lexmark printer exploit + writeup + tools instead of sell it for peanuts. 0day at the time of writing: https://t.co/YptEXw3CjJ — enjoy!
According to the researcher, Lexmark was not notified before the zero-day's release for two reasons. First, Geissler wished to highlight how the Pwn2Own contest is "broken" in some regards, as shown when low monetary rewards are offered for "something with a potentially big impact" such as an exploit chain that can compromise over 100 printer models. Furthermore, he said that official disclosure processes are often long-winded and arduous. "In my experience, patching efforts by the vendor are greatly accelerated by publishing turnkey solutions in the public domain without any heads up whatsoever," Geissler noted. "Lexmark might reconsider partnering with similar competitions in the future and opt to launch their own vulnerability bounty/reward program."
“Linux Kernel i915 Linear Out-Of-Bound read and write access”
In November 2021, during analysis of the i915 driver, I found a linear Out-Of-Bound (OOB) read and write access causing a memory corruption and/or memory leak bug. The vulnerability exists in the ‘vm_access‘ function, typically used for debugging processes. This function is needed only for VM_IO | VM_PFNMAP VMAs:
static int
vm_access(struct vm_area_struct *area, unsigned long addr,
          void *buf, int len, int write)
{
        …
        if (write) {
[1]             memcpy(vaddr + addr, buf, len);
                __i915_gem_object_flush_map(obj, addr, len);
        } else {
[2]             memcpy(buf, vaddr + addr, len);
        }
        …
}
The ‘len‘ parameter is never verified and is used directly as an argument to the memcpy() function. At line [1], memcpy() writes user-controlled data to the “pinned” page, causing a potential memory corruption / overflow vulnerability. At line [2], memcpy() reads memory (data) from the “pinned” page into an area visible to the user, causing a potential memory leak problem. The full and detailed description of the bug with all the analysis (and PoC) can be found here: http://site.pi3.com.pl/adv/CVE-2023-28410_i915.txt
I identified 3 different interfaces which could be used to trigger the bug. I shared my finding with my friend Jared (https://twitter.com/jsc29a) and we started working on a PoC. We quickly developed a very reliable PoC crashing the kernel.
At that point I usually decide either to develop a fully weaponized exploit or to contact the vendor ASAP to report and fix the issue. However, just out of curiosity, I wanted to know where Intel’s GPU (with the i915 driver) is enabled by default, and I realized that it is used in multiple interesting places, including:
Google Chromebook / Chromium
Most of the business laptops
Power-efficient laptops
Any device using Intel CPU with integrated GPUs
but hold on… doesn’t the first item on that list (Chromebook) have a bug bounty program? Maybe it’s a good time to try bug bounties myself? How does it even work? Everyone praises Google and their security, right?
“… In addition to the Chrome bug classes recognized by the program, we are interested in reports that demonstrate vulnerabilities in Chrome OS’ hardware, firmware, and OS components. … Sandbox escapes Additional components are in scope for sandbox escapes reports on Chrome OS. These include: … Obtaining code execution in kernel context. …”
OK, looks like it should be in scope for their bug bounty. Let’s try that path and see how it goes. I didn’t have much time to spend on this bug anymore, but I slightly modified the PoC to leak something (not just crash the kernel). It was relatively easy, so I opened the bug and waited to see where it would go: https://bugs.chromium.org/p/chromium/issues/detail?id=1293640
Then on Saturday (Feb 5th, 2022) I had some extra free time and decided to see how easy it would be to weaponize the PoC, and I realized something not so great. When I developed my PoC leaking something interesting, I did it on my debugging machine with a special configuration, including KASAN and all of that. To my surprise, when I re-ran my PoC on a “fresh and clean” VM, I was constantly crashing the kernel and did not leak anything. Well… that started bothering me, so I spent some time analyzing what was going on. After heavy work and a technical discussion with spender (kudos to grsecurity), I realized that when you enable PFN remapping (for debugging some of my peripherals) with KASAN, the VM_NO_GUARD flag is set. When KASAN is NOT enabled, unfortunately, vmap() and vmap_pfn() place a GUARD page at the top of the mapping, in exactly the same way as vmalloc() adds a GUARD page at the top of each allocation. Because of that, my PoC was working very well on my debugging kernel but not on a “fresh” VM. Well… I thought I should update the bug which I opened, to be fair and transparent, and I shared all the information (including updating the technical details in the advisory and adding a snippet of the relevant kernel code): https://bugs.chromium.org/p/chromium/issues/detail?id=1293640#c3
At that point I kind of expected that the Chromebook configuration doesn’t enable KASAN in production (why would it? :)), so even though the bug is great by itself, it will only allow crashing the kernel (no code-exec, unless KASAN is enabled ;-)). However, the bug should be fixed anyway (even if it is just a kernel DoS), and under non-standard configurations (with the VM_NO_GUARD flag) it is exploitable. Google summarized the bug (based on what I shared) as:
“Thanks for the detailed report! Has this also been reported against upstream Linux and/or to the driver maintainers?
Here is a summary according to my understanding:
i915 GPU buffers can be accessed via vm_remote_access (intended for debugging buffers not residing in the userspace process’ address space)
The function that performs the memory access is lacking a length check, allowing userspace to attempt OOB read/writes past the buffer end
The standard kernel configuration has guard pages in place, which causes an access violation in the kernel, not an actual OOB access
Assuming the above is correct, this is a denial of service situation (since we reliably crash accessing the guard page), thus Severity-Low. We still want this fixed, but waiting for upstream to fix and then consuming the fix via stable branches is sufficient.”
Additionally, they confirmed that the PoC crashes all the kernels. I didn’t really agree with the statement “(…) causes an access violation in the kernel, not an actual OOB access” because it is the other way around: it IS an actual OOB access, and because of that it causes the access violation. However, I decided not to be picky and I ignored it. I confirmed their statement, Google assigned “low” severity to it and informed me that they would forward all the information to Intel PSIRT… Sure. Now the “fun” part starts…
I reported the issue to Google on February 3rd, 2022
Google reported the issue to Intel on February 8th, 2022
Google went silent for 58 days
On 7th of April 2022, I asked if there were any updates
Silence for another 5-6 days
On April 12th, 2022, I was doing some LKRG development for the latest vanilla kernel, and during a git sync I realized that the i915 kernel file where I found the bug had been updated. Hm… it is “suspicious” to start seeing such “random” updates to that file while not getting any updates from Google for almost 2 months. Let’s see what’s going on… and what I found completely shocked me:
In the commit message, they used the wrong email address: “Suggested-by: Adam Zabrocki [email protected]“. I have not worked for Microsoft since 2019, and I reported this issue from a different email address (gmail.com, not microsoft.com).
In summary, Google stayed silent for 58 days and didn’t reply to my messages, and in the meantime Intel had fixed the bug more than a month prior, suggesting they found it themselves. However, as a consolation prize, they credited me on the patch, using my old Microsoft address (even though, as I mentioned, I haven’t worked for Microsoft since 2019). So far, bug bounty programs are doing great!
It looks like I hit the following problems so far:
“Slow or unresponsive communication” – in fact, I would elevate it to no communication at all
“Lack of transparency” – the bug ended up being fixed and no one gave any updates for almost 2 months
“No public recognition” – the fixes suggest that the bug was found internally by Intel
I wrote about all my findings to Google, after which they finally replied, saying that they didn’t contact Intel nor did Intel contact them. Google had taken on the responsibility of coordinating the bug reporting/fixing process but didn’t follow up on what was going on for 2 months.
On top of the listed potential problems with Bug Bounty programs, a “middle-man” like Google might deserve their own list of unique problems. However, at the same time, the described problems fall into the brackets of “Slow or unresponsive communication” and “Lack of transparency”.
Going back to the main story, Google added me to the mail thread and went silent again. In the meantime, I started talking with Intel directly, and on April 12th they promised to get back to me by the end of the week after finding out what was going on. I reminded them about it on April 14th, and they were still silent. Then I poked them again on the 18th because, of course, I didn’t hear back from them at the end of the week (“Slow or unresponsive communication” and “Lack of transparency”). Then I got a reply which was kind of… weird?
“I appreciate your patience with this. The issue has received a positive triage and was reproducible. We are currently working with the product team to settle on a severity score and a push to mitigation.(…)”
This reply didn’t make sense because, by the time of that conversation, Intel had already created a patch for this issue (on March 3, 2022, 6:04 a.m.) and merged it into the official Linux kernel on March 14th, 2022. In short, the bug had already been fixed and pushed to the Linux git repo, making it essentially public for the previous 40+ days. Despite this, no CVE had been assigned and no security bulletin had been published. However, Intel was claiming that they were still triaging the issue…
I also didn’t get why I had to start resolving issues which I didn’t create. This bug was officially reported to Google, and I would expect it to be Google’s responsibility to handle the communication and fixes correctly. I asked Google about that, and the reply I got was a long corporate statement which, in short, said something like “It’s Intel’s problem, not ours. We are OK, the severity is low, so what’s your problem?” https://bugs.chromium.org/p/chromium/issues/detail?id=1293640#c22
Well… I do understand, and I fully agree, that it is the responsibility of the component’s owner (Intel in this case) to handle any fixes. However, I would question whether this kind of communication is an industry standard, as there was no follow-up with the researcher (February – April) and no follow-up with the vendor throughout. When I asked for updates, I received no reply until I discovered by myself that the bug had already been fixed, in a shady way, at least a month earlier.
Google’s approach might fall into something equivalent to the “Unclear scope” bracket of problems, but unique to a “middle-man”: it is actually Google’s responsibility, as a coordinator, to handle the bugs reported to them.
Additionally, I got a pretty interesting statement:
“We are all humans running and crewing security programs trying to triage our own discoveries as well as those from external reporters and trying to run decent VDP programs. All the folks I work with in Chrome/Chrome OS greatly appreciate the contributions of the external researchers – such as yourself- are trying their best to get bugs fixed and want to do the right thing. There are smart people and solid processes to stay on top of all this, but at the end of the day mistakes are going to happen – bugs may get mistriaged, reports may get merged in the wrong direction giving someone else the acknowledgement for a finding, someone may not always respond in a timely manner. But 100% of the time, if you reach out, one of us will respond, hear you out and work with you to make things as right as possible. From what I can see in the comments above and in the email thread, each time you reached out to mnissler@, he responded fairly quickly. He could only do so much to get you the information you wanted because we are on the outside of that process. He did go and attempt to coordinate with Intel as best as possible and to eliminate the need for the middle-person to run comms, added you on the email thread so you could directly coordinate with the vendor/Intel.”
Behind all these sweet words, what is really written is “well, shit happens, deal with it, not our problem”. Moreover, there is a lack of any explanation of what really happened here (a serious “Lack of transparency”). In fact, there is nothing in what she wrote that I would disagree with, excluding “always respond in a timely manner” (please look at the dates on comments 14, 15, 16 and 17). Almost 2 months of no reply or follow-up is certainly not a “timely manner” response (“Slow or unresponsive communication”). What is most crazy is that, from Google’s perspective, everything was done as it should be, including communication. I’m surprised by it, and I respectfully disagree. If during this time (February – April 2022) anyone had followed up with the vendor and sent even a single email asking for updates, this mess could have been avoided.
How has it ended?
Intel continued “triaging the bug” until October 2022(!). They kept postponing the release date from October to November, then to February 2023… and then to May 2023 (remember that the bug was reported in February 2022, silently fixed as an “internal finding” in March 2022, and originally found by me in November 2021 :)).
My experience with bug bounties was much worse than I expected – especially Google’s attitude of staying silent forever until things went horribly/obviously wrong. Then they tried to put the responsibility for everything on Intel and convince me that it was not their problem (even though the bug was reported to them) and that they were handling the communication very well(!). As a researcher, what can you do? Nothing. Intel screwed up badly, but they tried their best to fix what they had messed up (and what was still possible to fix at that point). Contrary to Google’s attitude, Intel didn’t blame anyone, didn’t try to convince me that everything had been handled well, and were not dismissive.
It is worth mentioning that Intel officially apologized for how this case was handled: “(…) We would like to apologize for the way this case has been handled. We understand the back and forth with us has been frustrating and a long process. (…)“. However, nothing like that happened on Google’s part.
If this bug had really been valuable from the exploitability perspective, it looks like the best option would still be to dust off the old broker contacts (leaving the moral questions aside). They never failed this badly (at least in my experience), even though they are not ideal either.
The decision of whether or not to write this article matured in me for quite a long time, for many different reasons. However, when I came to the conclusion that it is worth describing the generally unspoken problems associated with bug bounty programs (including my specific case), the following people contributed to its final form (random order):
I hope that we as a security community can start some conversation on how the described unspoken problems of bug bounties can be addressed. “Imbalance of Power” is a real problem for the researchers and it should be changed.
Blind TCP/IP hijacking is still alive! After 13 years, the full blind TCP/IP hijacking bug in Windows 7/XP/2K/9x (and not only) finally got CVE-2023-34367 allocated (thanks to MITRE). Interestingly, the Pwnie Awards nomination for this research and the published write-up + PoC didn’t help to get it sooner.
I am proud to share that I was elected Vice Chair of the RISC-V J-extension (so-called J-ext) Task Group! My last 3-4 years of work on the RISC-V architecture have been quite fruitful:
I found a critical bug in the RISC-V architecture itself (not in the implementation of a specific processor), which got CVE-2021-1104. Among other things, I talked about this vulnerability at the DefCon 29 conference in 2021