Zero Day Initiative - Blog

CVE-2021-2429: A Heap-based Buffer Overflow Bug in the MySQL InnoDB memcached Plugin

2 September 2021 at 16:05

In April 2021, the ZDI received a submission of a vulnerability in the MySQL database. It turned out to be a heap-based buffer overflow bug in the InnoDB memcached plugin. It was submitted to the program by an anonymous researcher.

The vulnerability affects MySQL versions 8.0.25 and prior. It can be triggered remotely and without authentication. Attackers can leverage this vulnerability to execute arbitrary code on the MySQL database server. Oracle patched it in July 2021 and assigned it CVE-2021-2429, while ZDI’s identifier is ZDI-2021-889.

The Vulnerability

The following analysis is based on the source code of MySQL Community Server version 8.0.25. The bug is in the memcached GET command, which is used for retrieving data from a table. For performance, the GET command supports fetching multiple key-value pairs in a single memcached query. Here is an example:
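As an illustration of the multi-key form, a GET in the memcached text protocol is a single line: the word get, the keys separated by spaces, and a terminating CRLF. The sketch below only builds such a request. The helper name build_get_command and the sample keys are hypothetical and do not come from the MySQL source.

```python
MAX_KEY_LENGTH = 250  # key length limit of the memcached text protocol


def build_get_command(keys):
    """Build a memcached text-protocol GET request for one or more keys."""
    if not keys:
        raise ValueError("at least one key is required")
    for key in keys:
        if not key or len(key) > MAX_KEY_LENGTH:
            raise ValueError(f"invalid key length: {key!r}")
        # Spaces separate keys and CRLF ends the command, so keys must not
        # contain whitespace or control characters.
        if any(ch.isspace() or ord(ch) < 0x20 or ord(ch) == 0x7F for ch in key):
            raise ValueError(f"invalid character in key: {key!r}")
    return ("get " + " ".join(keys) + "\r\n").encode("ascii")


request = build_get_command(["key1", "key2", "key3"])
print(request)  # b'get key1 key2 key3\r\n'
```

The server answers each key that exists with its own VALUE block, which is why one query can fetch several key-value pairs.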

The keys specified in the GET command are tokenized by process_get_command() and then handled one by one in innodb_get() .

If a key in the GET command has the form @@containers.name, then the variable report_table_switch will have been set to true, satisfying the branch at (1). The memcpy at (3) copies table_name to the row_buf buffer. Before performing the memcpy, the code at (2) validates that there is still enough space in row_buf. However, this validation is performed with an assert() only. Since assert is a macro that produces code only in debug builds but not in release builds, this leads to a buffer overflow that can be reached when running a release build.
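The effect of an assert-only check can be modeled in a few lines. This is a toy model in Python, not the MySQL code: the capacity ROW_BUF_SIZE, the function names, and the exact form of the condition are assumptions, and a growing bytearray stands in for the heap write past the end of row_buf.

```python
ROW_BUF_SIZE = 32  # hypothetical capacity, far smaller than the real buffer


def append_table_name(row_buf, used, table_name, debug_build):
    """Model of the copy: append table_name at offset `used`.

    The length check runs only when debug_build is True, mirroring an
    assert() that is compiled out of release builds.
    """
    if debug_build and used + len(table_name) >= ROW_BUF_SIZE:
        raise AssertionError("row_buf would overflow")
    # In C this is a memcpy into a fixed-size heap buffer. A bytearray grows
    # instead, so growth past ROW_BUF_SIZE represents the overflow.
    row_buf[used:used + len(table_name)] = table_name
    return used + len(table_name)


def simulate(key_count, debug_build):
    """Return how many bytes end up beyond ROW_BUF_SIZE."""
    row_buf = bytearray()
    used = 0
    for _ in range(key_count):
        used = append_table_name(row_buf, used, b"test/demo_test", debug_build)
    return max(0, used - ROW_BUF_SIZE)


print(simulate(3, debug_build=False))  # 10 bytes past the end, no error raised
```

With debug_build=True the third append raises instead of writing, which is the only build flavor in which the original check has any effect.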

The Trigger

The InnoDB memcached plugin is not enabled by default; MySQL must be built from source with -DWITH_INNODB_MEMCACHED=ON. By default, the memcached daemon listens on TCP and UDP port 11211. The payload is a single GET command, as seen in the example below.

    get @@aaa @@aaa @@aaa ...

@@aaa is one of the default rows in the innodb_memcache database.

Each @@aaa is replaced with the table name test/demo_test at (5) within the innodb_get() function shown above. The resulting overflow content has the form test/demo_testtest/demo_testtest/demo_test.... The length of the overflow is controllable by the attacker. After sending the payload, the heap overflow is triggered in the mysqld process. The call stack is shown below.
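Because every key contributes one copy of the table name, the overflow length is simple arithmetic. The sketch below works it out under stated assumptions: buffer_capacity and already_used are hypothetical parameters (the article does not give the size of row_buf), and any separators or terminators the real code may write are ignored.

```python
TABLE_NAME = b"test/demo_test"  # replacement text named in the article (14 bytes)


def overflow_length(key_count, buffer_capacity, already_used=0):
    """Bytes written past the end of a buffer of buffer_capacity bytes."""
    written = already_used + key_count * len(TABLE_NAME)
    return max(0, written - buffer_capacity)


def keys_needed_to_overflow(buffer_capacity, already_used=0):
    """Smallest number of keys whose table names no longer fit."""
    free = max(0, buffer_capacity - already_used)
    return free // len(TABLE_NAME) + 1


print(len(TABLE_NAME))              # 14
print(keys_needed_to_overflow(28))  # 3
print(overflow_length(3, 28))       # 14
```

Each extra key extends the overflow by another 14 bytes, which is what makes the length attacker-controlled.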

The Patch

The vulnerability was fixed in version 8.0.26. The patch is straightforward. It explicitly checks the length before copying.
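The idea behind such a fix can be sketched as a check that runs in every build and refuses the copy when the data does not fit. This is a Python model, not the actual patch: the function name and the choice to report failure through a return value are assumptions.

```python
def copy_if_fits(row_buf, used, data):
    """Copy data into row_buf at offset `used` only if it fits.

    Returns (new_used, ok). The check is ordinary control flow, so it is
    present in release builds as well as debug builds.
    """
    if used + len(data) > len(row_buf):
        return used, False  # reject instead of writing out of bounds
    row_buf[used:used + len(data)] = data
    return used + len(data), True


buf = bytearray(28)
used, ok = copy_if_fits(buf, 0, b"test/demo_test")     # fits: used == 14
used, ok = copy_if_fits(buf, used, b"test/demo_test")  # fits exactly: used == 28
used, ok = copy_if_fits(buf, used, b"test/demo_test")  # rejected: ok is False
print(used, ok, len(buf))  # 28 False 28
```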


Conclusion

Although the InnoDB memcached plugin is not enabled by default, it is nonetheless wise to apply the patch as soon as possible. It would not surprise me to see a reliable full exploit in the near future.

You can find me on Twitter @_wmliang_, and follow the team for the latest in exploit techniques and security patches.


MindShaRE: When MySQL Cluster Encounters Taint Analysis

10 February 2022 at 16:51

Recently, the ZDI received multiple submissions of vulnerabilities in MySQL Cluster. MySQL Cluster is a clustering solution providing linear scalability and high availability for the MySQL database management system. The common attack vector identified in these reports is the open port for the cluster management node and data nodes. Attackers can utilize the protocol and interact with nodes without authentication.

After investigating these submissions, I realized that the code is very buggy, and the pattern of the vulnerabilities is simple. However, the codebase is too large for a manual review. Therefore, the question becomes, “Is it possible to identify all low-hanging-fruit bugs automatically and quickly?” Fuzzing works, but it depends on coverage and cannot precisely focus on a specific type of bug. Taint analysis is probably the more suitable answer to this question.

Two tools are chosen for taint analysis: Clang Static Analyzer and CodeQL. Although they have their own pros and cons, both can lead to positive results. This blog looks at both methods and shows how they can be used for taint analysis against this and other programs.

The Target

Here is an example of the kind of low-hanging-fruit bug we are looking for:

The Qmgr::execCM_REGREF function is a registered signal of the QMGR NDB kernel block. These registered signals can be invoked remotely. The signal->getDataPtr() at (1) returns a pointer to a buffer that contains untrusted input from the network. TaddNodeno at (2) is therefore a controlled 32-bit integer from the network, and it is subsequently used as an argument at (3). Finally, at (4) within BitmaskImpl::set, it is used as an array index. Since no validation has been performed on this value, this potentially produces an out-of-bounds (OOB) write.
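To see why an unchecked index matters, consider a bitmask stored as an array of 32-bit words. The model below assumes the word is selected with n >> 5 and the bit with n & 31, a common layout for such helpers; the function names are hypothetical, and Python raises where C would silently write out of bounds.

```python
WORD_BITS = 32


def word_index(n):
    """Index of the 32-bit word that holds bit n."""
    return n >> 5


def is_out_of_bounds(n, size_words):
    """True if setting bit n would touch a word outside the array."""
    return word_index(n) >= size_words


def bitmask_set_checked(data, n):
    """Set bit n in data, validating the untrusted index first."""
    if n >= len(data) * WORD_BITS:
        raise IndexError(f"bit {n} is outside a {len(data)}-word bitmask")
    data[word_index(n)] |= 1 << (n & 31)


mask = [0, 0]                               # room for bits 0..63
bitmask_set_checked(mask, 33)               # sets bit 1 of word 1
print(mask)                                 # [0, 2]
print(is_out_of_bounds(0xFFFFFFFF, 2))      # True
```

A 32-bit value taken from the network can select a word index up to 0x7FFFFFF, far beyond a bitmask sized for the cluster's node count.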

MySQL Cluster registers around 1,400 signals (the number of calls to addRecSignal()) and around 6,500 accesses on untrusted input (the number of calls to getDataPtr() plus the number of calls to getDataPtrSend() plus the number of direct accesses of theData). Although the example above is not very challenging, the manual review is still very time consuming due to the required scale. It's time to introduce taint analysis.
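Counts like these can be approximated with a textual search over the source tree. The sketch below is one way to do it and is not necessarily how the figures above were obtained: the regular expressions, the file suffixes, and the function names are assumptions, and a plain text match also counts declarations and comments.

```python
import re
from pathlib import Path

PATTERNS = {
    "signals": re.compile(r"\baddRecSignal\s*\("),
    "untrusted_accesses": re.compile(r"\bgetDataPtr(?:Send)?\s*\(|\btheData\b"),
}


def count_in_text(text):
    """Count pattern matches in one piece of source text."""
    return {name: len(pattern.findall(text)) for name, pattern in PATTERNS.items()}


def count_in_tree(root, suffixes=(".cpp", ".hpp")):
    """Sum the counts over all matching files below root."""
    totals = dict.fromkeys(PATTERNS, 0)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            counts = count_in_text(path.read_text(errors="ignore"))
            for name, value in counts.items():
                totals[name] += value
    return totals


sample = "addRecSignal(GSN_X, &Qmgr::execX);\nsignal->getDataPtr();\nsignal->theData[0];"
print(count_in_text(sample))  # {'signals': 1, 'untrusted_accesses': 2}
```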

There are four common terms used in taint analysis: SOURCE, SINK, PROPAGATION, and SANITIZER. SOURCE refers to where data originates. In the example above, signal->getDataPtr() at (1) is the SOURCE. SINK refers to where data ends. In the example above, the access of the array index at (4) is a SINK. PROPAGATION refers to how the data flows. In our example, the assignment at (2) and the argument copy at (3) are considered PROPAGATIONs. SANITIZER indicates where data is either sanitized or validated. There is no SANITIZER in the above example, and that is the root cause of the bug. The task of taint analysis is to look for a flow from SOURCE to SINK that does not pass through a SANITIZER during PROPAGATION. By defining suitable SOURCE, SINK, PROPAGATION, and SANITIZER values, taint analysis should return the types of bugs we seek.
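The four terms can be made concrete with a toy tracker over a straight-line list of statements. This is a minimal sketch for intuition only, far simpler than what Clang Static Analyzer or CodeQL do: the statement encoding and the function name are invented for this example, and there is no control flow or inter-procedural analysis.

```python
def find_tainted_sinks(statements):
    """Return (index, variable) for every SINK reached by unsanitized taint.

    Statement forms:
      ("source", var)        var receives untrusted data      (SOURCE)
      ("assign", dst, src)   dst = src                        (PROPAGATION)
      ("sanitize", var)      var has been validated           (SANITIZER)
      ("sink", var)          var is used dangerously          (SINK)
    """
    tainted = set()
    findings = []
    for index, stmt in enumerate(statements):
        kind = stmt[0]
        if kind == "source":
            tainted.add(stmt[1])
        elif kind == "assign":
            dst, src = stmt[1], stmt[2]
            if src in tainted:
                tainted.add(dst)
            else:
                tainted.discard(dst)  # overwritten with clean data
        elif kind == "sanitize":
            tainted.discard(stmt[1])
        elif kind == "sink":
            if stmt[1] in tainted:
                findings.append((index, stmt[1]))
        else:
            raise ValueError(f"unknown statement kind: {kind!r}")
    return findings


buggy = [
    ("source", "signal_data"),                # (1) data from the network
    ("assign", "TaddNodeno", "signal_data"),  # (2) assignment
    ("assign", "n", "TaddNodeno"),            # (3) argument copy
    ("sink", "n"),                            # (4) array index
]
print(find_tainted_sinks(buggy))  # [(3, 'n')]
```

Inserting ("sanitize", "TaddNodeno") before the argument copy makes the finding disappear, which is the role a bounds check plays in real code.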

The Process

Two taint analysis tools were used to search for these types of bugs: Clang Static Analyzer and CodeQL. The scanning scope of source code is limited to storage/ndb/src/kernel/ only. We will restrict our search to low-hanging-fruit, which we define as two bug types only: (1) buffer overflows in memcpy-like functions and (2) array index OOB accesses. The version of MySQL Cluster we are using for our examples in this blog is 8.0.25.

Clang Static Analyzer

Clang Static Analyzer (CSA) has a checker, GenericTaintChecker, which provides the taint analysis feature. By default, it has a set of pre-defined SOURCE and PROPAGATION values. The default SINK will check some dangerous APIs and arguments, such as format strings, command injection, the buffer size in memcpy-like functions, and so forth. GenericTaintChecker also shares tainted information with ArrayBoundCheckerV2 in order to recognize that the use of a value as an array index is a SINK. Users can also customize some simple SOURCE, SINK, PROPAGATION, and SANITIZER values by providing a config file. If the config file cannot satisfy your requirements, such as for more complicated semantics, you may have to write a new CSA checker in C++.

To use CSA for taint analysis, we first must let the checker know about our SOURCE at (1). The config file cannot define an access to a variable, and writing a new checker would be an unwise expense of effort. Instead, I modified the code base to be analyzed so that all accesses of untrusted input are replaced with something recognized as a pre-defined SOURCE. Some examples are shown below:

Another untrusted SOURCE is the return of SegmentedSectionPtr in the getSection() function.

The default SINK missed some functions, which can easily be added to the config file as below:

Then, we can scan the project and get the reports using the following commands:

The Makefile2 file specified the target scanning directory:

The reports can be viewed in a browser by running the scan-view command to start up a local web server.

There is some duplication in the output, where multiple reports flag the same line of code. Also, we are interested only in reports that show taints reaching memcpy-like functions and array indexes. In the end, I found approximately 100 interesting reports.
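The triage step described above amounts to grouping reports by source location and keeping only the two bug classes of interest. A minimal sketch follows, assuming each report has already been parsed into a dictionary; the field names file, line, and kind and the kind labels are hypothetical and are not part of the scan-build output format.

```python
def deduplicate_reports(reports):
    """Keep the first report for each (file, line) pair, preserving order."""
    seen = set()
    unique = []
    for report in reports:
        key = (report["file"], report["line"])
        if key not in seen:
            seen.add(key)
            unique.append(report)
    return unique


def keep_interesting(reports, kinds=("memcpy", "array_index")):
    """Keep only reports whose taint reaches a sink of an interesting kind."""
    return [report for report in reports if report["kind"] in kinds]


reports = [
    {"file": "Qmgr.cpp", "line": 100, "kind": "array_index"},
    {"file": "Qmgr.cpp", "line": 100, "kind": "array_index"},  # duplicate
    {"file": "Dbdih.cpp", "line": 42, "kind": "format_string"},
    {"file": "Dbtc.cpp", "line": 7, "kind": "memcpy"},
]
print(len(keep_interesting(deduplicate_reports(reports))))  # 2
```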


CodeQL

CodeQL also supports taint analysis. It refers to it as taint tracking. There are no pre-defined SOURCEs or SINKs. Users define these with the QL language. We defined our SOURCE and SINK as follows:

Once defined, we can scan the project with the following command line:

A quick cross-check of the scan results against the results from the Clang Static Analyzer above showed that CodeQL was missing some bugs. After some investigation, the root cause was that PROPAGATION through some structure field accesses was not being recognized. A similar situation is discussed here, and I enhanced the PROPAGATION as follows:

The generated report became longer, and the number of reported bugs was excessive. After reviewing the report, I found that in some cases, validation was already being performed by ptrCheckGuard(), arrGuard(), or other bounds checking. A SANITIZER can help reduce the number. Bounds checking is assumed whenever there is an if statement that includes >, <, >=, or <=. Before applying this modification, make sure that your SANITIZER does not accidentally drop real bugs.
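The comparison-operator heuristic is easy to model outside of QL. The sketch below is a textual approximation in Python, not the CodeQL predicate used for the scan: the regular expression and the function name are assumptions, and it only inspects the text of an if condition.

```python
import re

# A relational operator that is not part of "->", "<<", ">>", "<<=" or ">>=".
RELATIONAL = re.compile(r"(?<![<>\-])(?:<=|>=|<|>)(?![<>])")


def looks_like_bounds_check(condition):
    """Heuristic: treat any relational comparison as a sanitizing check."""
    return RELATIONAL.search(condition) is not None


print(looks_like_bounds_check("TaddNodeno < MAX_NODES"))  # True
print(looks_like_bounds_check("ptr->len >> 2"))           # False
print(looks_like_bounds_check("len < 0x7fffffff"))        # True
```

The last call shows the risk: the comparison is accepted as a SANITIZER even though such a loose bound would not protect a small array, so a real bug behind it would be dropped from the report.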

We then scanned the project again. The scanning can also be done in Visual Studio Code with the CodeQL extension. However, some complicated queries are too slow to process and may fail due to memory exhaustion.

Some reports are duplicated at the same line of code. After de-duplicating, the number of interesting reports is around 320. However, the number of distinct issues is lower, since some of the remaining reports are still similar or even identical.

Here is the source of the final CodeQL query.

The Results

After reviewing all the reports, I wrote a proof of concept (PoC) manually to confirm each bug. CSA found 28 bugs: 18 array index OOB vulnerabilities and 10 overflows in memcpy-like functions. CodeQL found 34 bugs: 21 array index OOB bugs and 13 overflows in memcpy-like functions. Using these methods, we discovered 37 unique bugs, 25 of which were found by both tools. Only nine of these 37 bugs overlapped with ZDI submissions, which means 28 are new to us. These numbers show not only that taint analysis is useful in this scenario, but also that MySQL Cluster has quite a few bugs still to be discovered.
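The totals are consistent with inclusion-exclusion over the two result sets. The short calculation below uses only the figures reported above; the function and key names are made up for the example.

```python
def summarize(csa_oob, csa_memcpy, codeql_oob, codeql_memcpy, found_by_both, known):
    """Combine per-tool counts into unique, tool-exclusive, and new totals."""
    csa_total = csa_oob + csa_memcpy
    codeql_total = codeql_oob + codeql_memcpy
    # |A union B| = |A| + |B| - |A intersect B|
    unique = csa_total + codeql_total - found_by_both
    return {
        "csa_total": csa_total,
        "codeql_total": codeql_total,
        "unique": unique,
        "only_csa": csa_total - found_by_both,
        "only_codeql": codeql_total - found_by_both,
        "new": unique - known,
    }


print(summarize(18, 10, 21, 13, found_by_both=25, known=9))
# {'csa_total': 28, 'codeql_total': 34, 'unique': 37,
#  'only_csa': 3, 'only_codeql': 9, 'new': 28}
```

Three bugs were reported only by CSA and nine only by CodeQL, so each tool contributed findings the other missed.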

Each of these two tools has its pros and cons. By using both Clang Static Analyzer and CodeQL, we can learn from their different feedback and improve the output from each tool. These variances can be compared to the concept of differential testing. The taint propagation provided by either tool has its deficiencies, but can still yield useful results.

The power of these tools could be extended further by adding additional bug classes in SINK. I recommend applying taint analysis to loop counters and pointer dereferences.


Conclusion

Due to the large codebase and simple bug patterns present in MySQL Cluster, taint analysis was quite useful. It identified low-hanging-fruit bugs automatically and quickly. Each tool discussed has its own pros and cons, and we recommend trying more than one tool to get the best results. The differences between their outputs serve as feedback that can be used to improve the overall results. Furthermore, we cannot blindly trust the output of either tool and must verify each report carefully.

Also, thanks to my colleague @RenoRobertr, who provided feedback and several contributions to this work. He will publish a write-up of additional work on MySQL Cluster soon with his advanced Binary Ninja skills. That blog should be available in a couple of days.

We are looking forward to seeing more submissions of this type in the future. Until then, you can find me on Twitter @_wmliang_, and follow the team for the latest in exploit techniques and security patches.

