NCC Group Research

Ghidra nanoMIPS ISA module

7 May 2024 at 17:11

Introduction

In late 2023 and early 2024, the NCC Group Hardware and Embedded Systems practice undertook an engagement to reverse engineer baseband firmware on several smartphones. This included MediaTek 5G baseband firmware based on the nanoMIPS architecture. While we were aware of some nanoMIPS modules for Ghidra having been developed in private, there was no publicly available reliable option for us to use at the time, which led us to develop our own nanoMIPS disassembler and decompiler module for Ghidra.

In the interest of time, we focused on implementing the features and instructions that we encountered on actual baseband firmware, and left complex P-Code instruction emulation unimplemented where it was not yet needed. Though the module is a work in progress, it still decompiles the majority of the baseband firmware we’ve analyzed. Combined with debug symbol information included with some MediaTek firmware, it has been very helpful in the reverse engineering process.

Here we will demonstrate how to load a MediaTek baseband firmware into Ghidra for analysis with our nanoMIPS ISA module.

Target firmware

For an example firmware to analyze, we looked up phones likely to include a MediaTek SoC with 5G support. Some relatively recent Motorola models were good candidates. (These devices were not part of our client engagement.)

We found many Android firmware images on https://mirrors.lolinet.com/firmware/lenomola/, including an image for the Motorola Moto Edge 2022, codename Tesla: https://mirrors.lolinet.com/firmware/lenomola/tesla/official/. This model is based on a MediaTek Dimensity 1050 (MT6879) SoC.

There are some carrier-specific variations of the firmware. We’ll randomly choose XT2205-1_TESLA_TMO_12_S2ST32.71-118-4-2-6_subsidy-TMO_UNI_RSU_QCOM_regulatory-DEFAULT_cid50_R1_CFC.xml.zip.

Extracting nanoMIPS firmware

The actual nanoMIPS firmware is in the md1img.img file from the Zip package.

To extract the content of the md1img file we also wrote some Kaitai structure definitions with simple Python wrapper scripts to run the structure parsing and output different sections to individual files. The ksy Kaitai definitions can also be used to interactively explore these files with the Kaitai IDE.

Running md1_extract.py with an --outdir option will extract the files contained within md1img.img:

$ ./md1_extract.py ../XT2205-1_TESLA_TMO_12_S2STS32.71-118-4-2-6-3_subsidy-TMO_UNI_RSU_QCOM_regulatory-DEFAULT_cid50_CFC/md1img.img --outdir ./md1img_out/
extracting files to: ./md1img_out
md1rom: addr=0x00000000, size=43084864
        extracted to 000_md1rom
cert1md: addr=0x12345678, size=1781
        extracted to 001_cert1md
cert2: addr=0x12345678, size=988
        extracted to 002_cert2
md1drdi: addr=0x00000000, size=12289536
        extracted to 003_md1drdi
cert1md: addr=0x12345678, size=1781
        extracted to 004_cert1md
cert2: addr=0x12345678, size=988
        extracted to 005_cert2
md1dsp: addr=0x00000000, size=6776460
        extracted to 006_md1dsp
cert1md: addr=0x12345678, size=1781
        extracted to 007_cert1md
cert2: addr=0x12345678, size=988
        extracted to 008_cert2
md1_filter: addr=0xffffffff, size=300
        extracted to 009_md1_filter
md1_filter_PLS_PS_ONLY: addr=0xffffffff, size=300
        extracted to 010_md1_filter_PLS_PS_ONLY
md1_filter_1_Moderate: addr=0xffffffff, size=300
        extracted to 011_md1_filter_1_Moderate
md1_filter_2_Standard: addr=0xffffffff, size=300
        extracted to 012_md1_filter_2_Standard
md1_filter_3_Slim: addr=0xffffffff, size=300
        extracted to 013_md1_filter_3_Slim
md1_filter_4_UltraSlim: addr=0xffffffff, size=300
        extracted to 014_md1_filter_4_UltraSlim
md1_filter_LowPowerMonitor: addr=0xffffffff, size=300
        extracted to 015_md1_filter_LowPowerMonitor
md1_emfilter: addr=0xffffffff, size=2252
        extracted to 016_md1_emfilter
md1_dbginfodsp: addr=0xffffffff, size=1635062
        extracted to 017_md1_dbginfodsp
md1_dbginfo: addr=0xffffffff, size=1332720
        extracted to 018_md1_dbginfo
md1_mddbmeta: addr=0xffffffff, size=899538
        extracted to 019_md1_mddbmeta
md1_mddbmetaodb: addr=0xffffffff, size=562654
        extracted to 020_md1_mddbmetaodb
md1_mddb: addr=0xffffffff, size=12280622
        extracted to 021_md1_mddb
md1_mdmlayout: addr=0xffffffff, size=8341403
        extracted to 022_md1_mdmlayout
md1_file_map: addr=0xffffffff, size=889
        extracted to 023_md1_file_map
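
The listing above suggests that md1img.img is a sequence of named sections, each with a load address and a size. As a rough illustration of how such a container can be walked, here is a minimal Python sketch. It is not the Kaitai definition used by md1_extract.py: the magic values, the field order, and the alignment handling are assumptions based on the commonly described MediaTek partition header layout, and they should be checked against a real image before relying on them.

```python
import struct
from typing import Iterator, NamedTuple

# ASSUMED layout (not taken from the article): magic, data size, 32-byte name,
# load address, mode, extension magic, header size, header version,
# image type, list-end flag, alignment.
MAGIC = 0x58881688       # assumed primary magic
EXT_MAGIC = 0x58891689   # assumed extension magic
FIXED = struct.Struct("<II32sIIIIIIII")


class Section(NamedTuple):
    name: str
    addr: int
    data: bytes


def iter_sections(image: bytes) -> Iterator[Section]:
    """Yield sections until the data runs out or a header does not match."""
    offset = 0
    while offset + FIXED.size <= len(image):
        (magic, dsize, raw_name, maddr, _mode, ext_magic,
         hdr_size, _hdr_ver, _img_type, _list_end, align) = FIXED.unpack_from(image, offset)
        if magic != MAGIC or ext_magic != EXT_MAGIC or hdr_size < FIXED.size:
            break
        name = raw_name.split(b"\x00", 1)[0].decode("ascii", "replace")
        start = offset + hdr_size
        yield Section(name, maddr, image[start:start + dsize])
        # The next header is assumed to start at the next aligned offset.
        align = align or 1
        offset = (start + dsize + align - 1) // align * align


def describe(image: bytes) -> list:
    """Format sections in the same style as the extraction log above."""
    return [f"{s.name}: addr=0x{s.addr:08x}, size={len(s.data)}"
            for s in iter_sections(image)]
```

Calling describe() on the bytes of md1img.img would print one line per section if the assumed layout holds; if the very first header does not match, the result is simply an empty list.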

The most relevant files are:

  • md1rom is the nanoMIPS firmware image
  • md1_file_map provides slightly more context on the md1_dbginfo file: its original filename is DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
  • md1_dbginfo is an XZ compressed binary file containing debug information for md1rom, including symbols

Extracting debug symbols

md1_dbginfo is another binary file format containing symbols and filenames with associated addresses. We’ll rename it and decompress it based on the filename from md1_file_map:

$ cp 018_md1_dbginfo DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
$ unxz DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31.xz
$ hexdump -C DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31 | head
00000000  43 41 54 49 43 54 4e 52  01 00 00 00 98 34 56 00  |CATICTNR.....4V.|
00000010  43 41 54 49 01 00 00 00  00 00 00 00 4e 52 31 36  |CATI........NR16|
00000020  2e 52 32 2e 4d 54 36 38  37 39 2e 54 43 32 2e 50  |.R2.MT6879.TC2.P|
00000030  52 31 2e 53 50 00 4d 54  36 38 37 39 5f 53 30 30  |R1.SP.MT6879_S00|
00000040  00 4d 54 36 38 37 39 5f  4e 52 31 36 2e 54 43 32  |.MT6879_NR16.TC2|
00000050  2e 50 52 31 2e 53 50 2e  56 31 37 2e 50 33 38 2e  |.PR1.SP.V17.P38.|
00000060  30 33 2e 32 34 2e 30 33  52 00 32 30 32 33 2f 30  |03.24.03R.2023/0|
00000070  35 2f 31 39 20 32 32 3a  33 31 00 73 00 00 00 2b  |5/19 22:31.s...+|
00000080  ed 53 00 49 4e 54 5f 56  65 63 74 6f 72 73 00 4c  |.S.INT_Vectors.L|
00000090  08 00 00 54 08 00 00 62  72 6f 6d 5f 65 78 74 5f  |...T...brom_ext_|
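
To make the dump above a little more concrete, the following Python sketch decodes the bytes shown: it checks the leading CATICTNR magic, reads the four NUL-terminated strings that start at offset 0x1c, and decodes the first symbol record. The offsets are simply read off the dump, and the interpretation of the two 32-bit little-endian values after the symbol name is an assumption (the first matches the address that mtk_dbg_extract.py prints for INT_Vectors; the meaning of the second is a guess). This is not the authors' Kaitai definition.

```python
import struct

# Bytes copied from the hexdump above (offsets 0x00 to 0x9f).
DUMP = bytes.fromhex("""
43 41 54 49 43 54 4e 52  01 00 00 00 98 34 56 00
43 41 54 49 01 00 00 00  00 00 00 00 4e 52 31 36
2e 52 32 2e 4d 54 36 38  37 39 2e 54 43 32 2e 50
52 31 2e 53 50 00 4d 54  36 38 37 39 5f 53 30 30
00 4d 54 36 38 37 39 5f  4e 52 31 36 2e 54 43 32
2e 50 52 31 2e 53 50 2e  56 31 37 2e 50 33 38 2e
30 33 2e 32 34 2e 30 33  52 00 32 30 32 33 2f 30
35 2f 31 39 20 32 32 3a  33 31 00 73 00 00 00 2b
ed 53 00 49 4e 54 5f 56  65 63 74 6f 72 73 00 4c
08 00 00 54 08 00 00 62  72 6f 6d 5f 65 78 74 5f
""")

MAGIC = b"CATICTNR"
STRINGS_OFFSET = 0x1C        # first header string, read off the dump
FIRST_SYMBOL_OFFSET = 0x83   # where "INT_Vectors" begins in the dump


def read_cstring(buf: bytes, offset: int):
    """Return the NUL-terminated ASCII string at offset and the next offset."""
    end = buf.index(b"\x00", offset)
    return buf[offset:end].decode("ascii"), end + 1


if not DUMP.startswith(MAGIC):
    raise ValueError("unexpected magic")

# Four consecutive strings: build identifiers and a timestamp.
pos = STRINGS_OFFSET
header_strings = []
for _ in range(4):
    text, pos = read_cstring(DUMP, pos)
    header_strings.append(text)

# First symbol record: name, then two little-endian 32-bit values (assumed).
name, pos = read_cstring(DUMP, FIRST_SYMBOL_OFFSET)
value_a, value_b = struct.unpack_from("<II", DUMP, pos)

print(header_strings)
print(f"{name} 0x{value_a:08x} (second value: 0x{value_b:08x})")
```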

To extract information from the debug info file, we made another Kaitai definition and wrapper script that extracts symbols and outputs them in a text format compatible with Ghidra’s ImportSymbolsScript.py script:

$ ./mtk_dbg_extract.py md1img_out/DbgInfo_NR16.R2.MT6879.TC2.PR1.SP_LENOVO_S0MP1_K6879V1_64_MT6879_NR16_TC2_PR1_SP_V17_P38_03_24_03R_2023_05_19_22_31 | tee dbg_symbols.txt
INT_Vectors 0x0000084c l
brom_ext_main 0x00000860 l
INT_SetPLL_Gen98 0x00000866 l
PLL_Set_CLK_To_26M 0x000009a2 l
PLL_MD_Pll_Init 0x000009da l
INT_SetPLL 0x000009dc l
INT_Initialize_Phase1 0x027b5c80 l
INT_Initialize_Phase2 0x027b617c l
init_cm 0x027b6384 l
init_cm_wt 0x027b641e l
...

(Currently the script is set to only output label definitions rather than function definitions, as it was unknown if all of the symbols were for functions.)
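
Because the output is plain text with one "name address type" entry per line, it is easy to post-process before importing. The sketch below parses that text and rewrites selected entries from label (l) to function (f) for cases where you already know a symbol is a function. The helper names are hypothetical and are not part of mtk_dbg_extract.py; the choice of which symbols to promote is up to the analyst.

```python
from typing import Iterable, List, NamedTuple


class Symbol(NamedTuple):
    name: str
    address: int
    kind: str  # "l" for label, "f" for function


def parse_symbols(text: str) -> List[Symbol]:
    """Parse 'name address kind' lines, skipping anything malformed."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # blank lines, "..." placeholders, etc.
        name, addr, kind = parts
        symbols.append(Symbol(name, int(addr, 16), kind))
    return symbols


def promote_to_functions(symbols: Iterable[Symbol], names: Iterable[str]) -> List[str]:
    """Return output lines, marking the named symbols as functions."""
    wanted = set(names)
    lines = []
    for sym in symbols:
        kind = "f" if sym.name in wanted else sym.kind
        lines.append(f"{sym.name} 0x{sym.address:08x} {kind}")
    return lines


sample = """\
INT_Vectors 0x0000084c l
brom_ext_main 0x00000860 l
INT_SetPLL_Gen98 0x00000866 l
...
"""
for line in promote_to_functions(parse_symbols(sample), ["brom_ext_main"]):
    print(line)
```

Keeping the default as labels and promoting only known functions avoids forcing Ghidra to create functions at addresses that may turn out to be data.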

Loading nanoMIPS firmware into Ghidra

Install the extension

First, we’ll have to install the nanoMIPS module for Ghidra. In the main Ghidra window, go to “File > Install Extensions”, click the “Add Extension” plus button, and select the module Zip file (e.g., ghidra_11.0.3_PUBLIC_20240424_nanomips.zip). Then restart Ghidra.

Initial loading

Load md1rom as a raw binary image. Select 000_md1rom from the md1img.img extract directory and keep “Raw Binary” as the format. For Language, click the “Browse” ellipsis and find the little endian 32-bit nanoMIPS option (nanomips:LE:32:default) using the filter, then click OK.

We’ll load the image at offset 0 so no further options are necessary. Click OK again to load the raw binary.

When Ghidra asks if you want to do an initial auto-analysis, select No. We have to set up a mirrored memory address space at 0x90000000 first.

Memory mapping

Open the “Memory Map” window and click plus for “Add Memory Block”.

We’ll name the new block “mirror”, set the starting address to ram:90000000, the length to match the length of the base image “ram” block (0x2916c40), permissions to read and execute, and the “Block Type” to “Byte Mapped” with a source address of 0 and mapping ratio of 1:1.

Also change the permissions for the original “ram” block to just read and execute. Save the memory map changes and close the “Memory Map” window.

Note that this memory map is incomplete; it’s just the minimal setup required to get disassembly working.
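
The effect of the 1:1 byte-mapped block can be summarized with a little arithmetic: an address in the mirror corresponds to the same offset in the base image. The Python sketch below illustrates this outside of Ghidra, using the md1rom size from the extraction log and the mirror base from this section. The function name is hypothetical, and the sketch covers only this one mirror, not the full memory map of the device.

```python
ROM_SIZE = 43084864        # md1rom size reported by md1_extract.py
MIRROR_BASE = 0x90000000   # start of the "mirror" block

# The block length entered in the Memory Map dialog is the same value in hex.
assert ROM_SIZE == 0x2916C40


def mirror_to_rom_offset(address: int) -> int:
    """Translate an address in the mirror to an offset into md1rom."""
    offset = address - MIRROR_BASE
    if not 0 <= offset < ROM_SIZE:
        raise ValueError(f"0x{address:08x} is outside the mirror block")
    return offset


# Example: the first byte of the mirror is the first byte of the image.
print(hex(mirror_to_rom_offset(0x90000000)))
```

A helper like this can be handy when cross-referencing addresses from the debug symbols with raw offsets in 000_md1rom, for example in a hex editor.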

Debug symbols

Next, we’ll load up the debug symbols. Open the Script Manager window and search for ImportSymbolsScript.py. Run the script and select the text file generated by mtk_dbg_extract.py earlier (dbg_symbols.txt). This will create a bunch of labels, most of them in the mirrored address space.

Disassembly

Now we can begin disassembly. There is a jump instruction at address 0 that will get us started, so just select the byte at address 0 and press “d” or right-click and choose “Disassemble”. Thanks to the debug symbols, you may notice this instruction jumps to the INT_Initialize_Phase1 function.

Flow-based disassembly will now start to discover a bunch of code. The initial disassembly can take several minutes to complete.

Then we can run the normal auto-analysis with “Analysis > Auto Analyze…”. This should also discover more code and spend several minutes in disassembly and decompilation. We’ve found that the “Non-Returning Functions” analyzer creates many false positives with the default configuration in these firmware images, which disrupts the code flow, so we recommend disabling it for initial analysis.

The one-shot “Decompiler Parameter ID” analyzer is a good option to run next for better detection of function input types.

Conclusion

Although the module is still a work in progress, the results are already quite usable for analysis and allowed us to reverse engineer some critical features in baseband processors.

The nanoMIPS Ghidra module and MediaTek binary file unpackers can be found on our GitHub at:
