NVISO Labs

Intercept Flutter traffic on iOS and Android (HTTP/HTTPS/Dio Pinning)

18 August 2022 at 15:54

Some time ago I wrote some articles on how to Man-In-The-Middle Flutter on iOS, Android (ARM) and Android (ARM64). Those posts were quite popular and I often went back to copy those scripts myself.

Last week, however, we received a Flutter application where the script wouldn’t work anymore. As we had the source code, it was easy to figure out that the application was using the dio package to perform SSL Pinning.

While it would be possible to remove the pinning logic and recompile the app, it’s much nicer if we can just disable it at runtime, so that we don’t have to recompile ourselves. The result of this post is a Frida script that works both on Android and iOS, and disables the full TLS verification including the pinning logic.

The test app

As usual, we’ll create a test app to validate our script. I’ve created a basic Flutter app similar to the previous posts which has three buttons: HTTP, HTTPS and HTTPS (Pinned).

The app can be found on the GitHub page and an APK and IPA build are available. The Dio pinning logic is pretty straightforward:

ByteData data = await rootBundle.load('raw/certificate.crt');
Dio dio = Dio();
(dio.httpClientAdapter as DefaultHttpClientAdapter).onHttpClientCreate  = (client) {
  SecurityContext sc = new SecurityContext();
  sc.setTrustedCertificatesBytes(data.buffer.asUint8List());
  HttpClient httpClient = new HttpClient(context: sc);
  return httpClient;
};

try {
  Response response = await dio.get("https://www.nviso.eu/?dio");
  _status = "HTTPS: SUCCESS (" + response.headers.value("date")! + ")" ;
} catch (e) {
  print("Request via DIO failed");
  print("Exception: $e");
  _status = "DIO: ERROR";
}

The new approach

Originally, we hooked the ssl_crypto_x509_session_verify_cert_chain function, which can currently be found at line 361 of ssl_x509.cc. This method is responsible for validating the certificate chain, so if this method returns true, the certificate chain must be valid and the connection is accepted.

When performing a MitM on the test app on Android ARM64, the following error is printed in logcat:

3540  3585 I flutter : Request via DIO failed
3540  3585 I flutter : Exception: DioError [DioErrorType.other]: HandshakeException: Handshake error in client (OS Error: 
3540  3585 I flutter : 	CERTIFICATE_VERIFY_FAILED: self signed certificate in certificate chain(handshake.cc:393))
3540  3585 I flutter : Source stack:
3540  3585 I flutter : #0      DioMixin.fetch (package:dio/src/dio_mixin.dart:488)
3540  3585 I flutter : #1      DioMixin.request (package:dio/src/dio_mixin.dart:483)
3540  3585 I flutter : #2      DioMixin.get (package:dio/src/dio_mixin.dart:61)
3540  3585 I flutter : #3      _MyHomePageState.callPinnedHTTPS (package:flutter_pinning_demo/main.dart:124)
3540  3585 I flutter : <asynchronous suspension>
3540  3585 I flutter : HandshakeException: Handshake error in client (OS Error: 
3540  3585 I flutter : 	CERTIFICATE_VERIFY_FAILED: self signed certificate in certificate chain(handshake.cc:393))

Flutter gives us some nice information: there’s a self-signed certificate in the certificate chain, which it doesn’t like.

The original MitM script hooks session_verify_cert_chain, and for some reason the hooks were never triggered. The session_verify_cert_chain method is called from ssl_verify_peer_cert on line 386 and the error that is shown above results from OPENSSL_PUT_ERROR on line 393:

uint8_t alert = SSL_AD_CERTIFICATE_UNKNOWN;
  enum ssl_verify_result_t ret;
  if (hs->config->custom_verify_callback != nullptr) {
    ret = hs->config->custom_verify_callback(ssl, &alert);
    switch (ret) {
      case ssl_verify_ok:
        hs->new_session->verify_result = X509_V_OK;
        break;
      case ssl_verify_invalid:
        // If |SSL_VERIFY_NONE|, the error is non-fatal, but we keep the result.
        if (hs->config->verify_mode == SSL_VERIFY_NONE) {
          ERR_clear_error();
          ret = ssl_verify_ok;
        }
        hs->new_session->verify_result = X509_V_ERR_APPLICATION_VERIFICATION;
        break;
      case ssl_verify_retry:
        break;
    }
  } else {
    ret = ssl->ctx->x509_method->session_verify_cert_chain(
              hs->new_session.get(), hs, &alert)
              ? ssl_verify_ok
              : ssl_verify_invalid;
  }

  if (ret == ssl_verify_invalid) {
    OPENSSL_PUT_ERROR(SSL, SSL_R_CERTIFICATE_VERIFY_FAILED);
    ssl_send_alert(ssl, SSL3_AL_FATAL, alert);
  }

The most likely code path is that a custom_verify_callback is registered, so the check on line 368 evaluates to true, and the callback executed on line 369 returns ssl_verify_invalid. Execution then reaches line 392, where ret equals ssl_verify_invalid, so the error is recorded and the alert is sent.

The easiest approach would be to hook the ssl_verify_peer_cert function and modify the return value to be ssl_verify_ok, which is 0. By hooking this earlier method, both the default SSL validation and any custom validation are disabled. Unfortunately, ssl_verify_peer_cert calls ssl_send_alert before it returns, so by the time we could modify the return value, the error has already been triggered and it is too late.

Fortunately, we can just throw out the entire function and replace it with a return 0 statement:

function hook_ssl_verify_peer_cert(address)
{
    Interceptor.replace(address, new NativeCallback((pathPtr, flags) => {
        console.log("[+] Certificate validation disabled");
        return 0;
    }, 'int', ['pointer', 'int']));
}
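To make the difference between patching the return value and replacing the whole function a bit more tangible, here is a small Python toy model of the decision logic from the handshake.cc excerpt. It is only a sketch: the function signature, the side_effects list and the numeric values for ssl_verify_invalid and ssl_verify_retry (assumed to follow the enum order) are my own simplifications, not BoringSSL code.

```python
# Toy model of ssl_verify_peer_cert; names mirror the C++ excerpt,
# but the signature and the side_effects list are illustrative assumptions.
SSL_VERIFY_OK, SSL_VERIFY_INVALID, SSL_VERIFY_RETRY = 0, 1, 2
SSL_VERIFY_NONE, SSL_VERIFY_PEER = 0, 1


def ssl_verify_peer_cert(custom_verify_callback, verify_mode, chain_is_valid, side_effects):
    """Return the verification result and record error/alert side effects."""
    if custom_verify_callback is not None:
        ret = custom_verify_callback()
        # With SSL_VERIFY_NONE an invalid result is downgraded to OK
        if ret == SSL_VERIFY_INVALID and verify_mode == SSL_VERIFY_NONE:
            ret = SSL_VERIFY_OK
    else:
        ret = SSL_VERIFY_OK if chain_is_valid else SSL_VERIFY_INVALID

    if ret == SSL_VERIFY_INVALID:
        # These happen inside the function, before it returns
        side_effects.append("OPENSSL_PUT_ERROR(CERTIFICATE_VERIFY_FAILED)")
        side_effects.append("ssl_send_alert(fatal)")
    return ret


def replaced_ssl_verify_peer_cert(*_args):
    """Model of the Interceptor.replace approach: nothing runs, 0 is returned."""
    return SSL_VERIFY_OK


# Case 1: only overwrite the return value after the original function ran
effects_hooked = []
original_ret = ssl_verify_peer_cert(
    lambda: SSL_VERIFY_INVALID, SSL_VERIFY_PEER, False, effects_hooked
)
forced_ret = SSL_VERIFY_OK  # what a return-value hook would hand to the caller
print("return-value hook:", forced_ret, effects_hooked)

# Case 2: replace the function entirely
effects_replaced = []
replaced_ret = replaced_ssl_verify_peer_cert(None, SSL_VERIFY_PEER, False, effects_replaced)
print("full replacement: ", replaced_ret, effects_replaced)
```

In the first case the caller sees 0, but the error and the fatal alert have already been produced. In the second case no side effects are recorded at all, which is the behaviour we want from the Frida hook.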

The only thing that’s left is finding the actual location of the ssl_verify_peer_cert function.

Finding the offsets

Manually

The approach which was explained in the previous blogposts can be followed to identify the ssl_verify_peer_cert function:

  • Find references to the string “x509.cc” and compare them to x509.cc to find session_verify_cert_chain
  • Find references to the method you identified in order to identify ssl_verify_peer_cert

Both x509.cc and handshake.cc use the OPENSSL_PUT_ERROR macro which swaps in the file name and line number, which you can use to identify the correct functions.

By pattern matching

Alternatively, we can use Frida’s pattern matching engine to search for functions that look very similar to the function from the demo app. The first bytes of a function are typically very stable, as long as the number of local variables and function arguments doesn’t change. Still, different compilers may generate different assembly code (e.g. usage of different registers or optimisations) so we do need to have some wildcards in our pattern.

After downloading and creating multiple Flutter apps with different Flutter versions, I came to the following list:

iOS ARM64: FF 83 01 D1 FA 67 01 A9 F8 5F 02 A9 F6 57 03 A9 F4 4F 04 A9 FD 7B 05 A9 FD 43 01 91 F? 03 00 AA 1? 00 40 F9 ?8 1A 40 F9 15 ?5 4? F9 B5 00 00 B4
Android ARM64: F? 0F 1C F8 F? 5? 01 A9 F? 5? 02 A9 F? ?? 03 A9 ?? ?? ?? ?? 68 1A 40 F9
Android ARM: 2D E9 FE 43 D0 F8 00 80 81 46 D8 F8 18 00 D0 F8 ?? 71

These patterns should only result in one hit in the libFlutter library and all match to the start of the ssl_verify_peer_cert function.

The final script

Putting all of this together gives the following script. It’s one script that can be used on Android ARM, Android ARM64 and iOS ARM64.

Check GitHub for the latest version

The script below may have been updated on the GitHub repo.

var TLSValidationDisabled = false;
var secondRun = false;
if (Java.available) {
    console.log("[+] Java environment detected");
    Java.perform(hookSystemLoadLibrary);
    disableTLSValidationAndroid();
    setTimeout(disableTLSValidationAndroid, 1000);
} else if (ObjC.available) {
    console.log("[+] iOS environment detected");
    disableTLSValidationiOS();
    setTimeout(disableTLSValidationiOS, 1000);
}

function hookSystemLoadLibrary() {
    const System = Java.use('java.lang.System');
    const Runtime = Java.use('java.lang.Runtime');
    const SystemLoad_2 = System.loadLibrary.overload('java.lang.String');
    const VMStack = Java.use('dalvik.system.VMStack');

    SystemLoad_2.implementation = function(library) {
        try {
            const loaded = Runtime.getRuntime().loadLibrary0(VMStack.getCallingClassLoader(), library);
            if (library === 'flutter') {
                console.log("[+] libflutter.so loaded");
                disableTLSValidationAndroid();
            }
            return loaded;
        } catch (ex) {
            console.log(ex);
        }
    };
}

function disableTLSValidationiOS() {
    if (TLSValidationDisabled) return;

    var m = Process.findModuleByName("Flutter");

    // If there is no loaded Flutter module, the setTimeout may trigger a second time, but after that we give up
    if (m === null) {
        if (secondRun) console.log("[!] Flutter module not found.");
        secondRun = true;
        return;
    }

    var patterns = {
        "arm64": [
            "FF 83 01 D1 FA 67 01 A9 F8 5F 02 A9 F6 57 03 A9 F4 4F 04 A9 FD 7B 05 A9 FD 43 01 91 F? 03 00 AA 1? 00 40 F9 ?8 1A 40 F9 15 ?5 4? F9 B5 00 00 B4 "
        ],
    };
    findAndPatch(m, patterns[Process.arch], 0);

}

function disableTLSValidationAndroid() {
    if (TLSValidationDisabled) return;

    var m = Process.findModuleByName("libflutter.so");

    // The System.loadLibrary doesn't always trigger, or sometimes the library isn't fully loaded yet, so this is a backup
    if (m === null) {
        if (secondRun) console.log("[!] Flutter module not found.");
        secondRun = true;
        return;
    }

    var patterns = {
        "arm64": [
            "F? 0F 1C F8 F? 5? 01 A9 F? 5? 02 A9 F? ?? 03 A9 ?? ?? ?? ?? 68 1A 40 F9",
        ],
        "arm": [
            "2D E9 FE 43 D0 F8 00 80 81 46 D8 F8 18 00 D0 F8 ?? 71"
        ]
    };
    findAndPatch(m, patterns[Process.arch], Process.arch == "arm" ? 1 : 0);
}

function findAndPatch(m, patterns, thumb) {
    console.log("[+] Flutter library found");
    var ranges = m.enumerateRanges('r-x');
    ranges.forEach(range => {
        patterns.forEach(pattern => {
            Memory.scan(range.base, range.size, pattern, {
                onMatch: function(address, size) {
                    console.log('[+] ssl_verify_peer_cert found at offset: 0x' + (address - m.base).toString(16));
                    TLSValidationDisabled = true;
                    hook_ssl_verify_peer_cert(address.add(thumb));
                }
            });
        });
    });

    if (!TLSValidationDisabled) {
        if (secondRun)
            console.log('[!] ssl_verify_peer_cert not found. Please open an issue at https://github.com/NVISOsecurity/disable-flutter-tls-verification/issues');
        else
            console.log('[!] ssl_verify_peer_cert not found. Trying again...');
    }
    secondRun = true;
}

function hook_ssl_verify_peer_cert(address) {
    Interceptor.replace(address, new NativeCallback((pathPtr, flags) => {
        return 0;
    }, 'int', ['pointer', 'int']));
}

About the author

Jeroen Beckers

Jeroen Beckers is a mobile security expert working in the NVISO Software Security Assessment team. He is a SANS instructor and SANS lead author of the SEC575 course. Jeroen is also a co-author of OWASP Mobile Security Testing Guide (MSTG) and the OWASP Mobile Application Security Verification Standard (MASVS). He loves to both program and reverse engineer stuff.

Deobfuscating Android ARM64 strings with Ghidra: Emulating, Patching, and Automating

15 January 2024 at 08:00

In a recent engagement I had to deal with some custom encrypted strings inside an Android ARM64 app. I had a lot of fun reversing the app and in the process I learned a few cool new techniques which are discussed in this writeup.

This is mostly a beginner guide which explains step-by-step how you can tackle a problem like this. Feel free to try it out yourself, or just jump to the parts that interest you.

In this tutorial-like blogpost, we will:

  • Create a small test app that decrypts some strings in-memory
  • Use the Ghidra Emulator to figure out the decrypted value. This will require some manual intervention
  • Automate the decryption using Python

While I learned these techniques analyzing an Android app, they can of course be used on any ARM64 binary, and the general techniques work for any architecture.

Creating a test app

Let’s start with creating a small test app that decrypts some strings using a basic XOR algorithm. It’s always good to isolate the problem so that you can focus on solving it without other potential issues getting in the way. The code snippet below contains three encrypted strings, and a xorString function that takes a string and a key and performs the XOR operation to obtain the actual string. Additionally, there is a status integer for each string to indicate if the string has already been decrypted. The status integer is atomic, so that if multiple threads are using the same string, they won’t interfere with each other while decrypting the string. Using atomic status flags isn’t actually necessary in this small example, since we only have one thread, but it is what the original app was using, and it is very common to see this kind of approach.

#include <stdio.h>
#include <stdatomic.h>
#include <string.h>
#include <stdlib.h>

void xorString(char *str, char *key, int size, _Atomic int *status) {
    // Check and update status atomically
    int expected = 1;
    if (atomic_compare_exchange_strong(status, &expected, 0)) {
        // Perform XOR operation if the string is encrypted
        for (int i = 0; i < size; i++) {
            str[i] ^= key[i % 4];
        }
    }
}

char string1[] = {0x70,0xea,0xc7,0xd4,0x57,0xaf,0xfc,0xd7,0x4a,0xe3,0xcf,0x00};
_Atomic int status1 = 1; // 1 for encrypted
char string2[] = {0xce,0xc6,0x40,0x93,0xaf,0xf7,0x51,0x9f,0xfd,0xca,0x44,0x88,0xe6,0xdc,0x5a,0x00};
_Atomic int status2 = 1; // 1 for encrypted
char string3[] = {0x45,0xf6,0x8d,0x57,0x32,0xcf,0x80,0x4b,0x7a,0xf0,0x97,0x00};
_Atomic int status3 = 1; // 1 for encrypted

char key1[4] = {0x38, 0x8f, 0xab, 0xb8};
char key2[4] = {0x8f, 0xb3, 0x34, 0xfc};
char key3[4] = {0x12, 0x9f, 0xf9, 0x3f};

int main() {

    xorString(string1, key1, strlen(string1), &status1);
    xorString(string2, key2, strlen(string2), &status2);
    xorString(string3, key3, strlen(string3), &status3);

    printf("String 1: %s\n", string1);
    printf("String 2: %s\n", string2);
    printf("String 3: %s\n", string3);

    return 0;
}

Note: The code above definitely still has race conditions, as one thread could be reading out the string before it is completely decrypted. However, I didn’t want to make the example more complex and this example has all the necessary ingredients to examine some interesting Ghidra functionality.

In order to compile it, let’s use the dockcross project, which allows us to very easily crosscompile via a docker instance:

docker run --rm dockcross/android-arm64 > ./dockcross-android-arm64
chmod +x dockcross-android-arm64
./dockcross-android-arm64 bash -c '$CC main.c -o main'
file main
# main: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /system/bin/linker64, not stripped

This will result in a non-stripped ARM aarch64 binary. In a real scenario, the binary would most likely be stripped, but for this exercise, that’s not needed. The binary can be pushed to our Android device and it will print the decrypted strings when running it:

adb push main /data/local/tmp/
adb shell /data/local/tmp/main

String 1: Hello World
String 2: Auto Decryption
String 3: With Python
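As a quick sanity check that doesn’t need a device, the same XOR logic can be reproduced in a few lines of Python. The byte values below are copied from the C source above (without the trailing null bytes); xor_decrypt is just a helper name chosen for this sketch.

```python
def xor_decrypt(data, key):
    """XOR every byte with the repeating key, like xorString does."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


string1 = bytes([0x70, 0xea, 0xc7, 0xd4, 0x57, 0xaf, 0xfc, 0xd7, 0x4a, 0xe3, 0xcf])
string2 = bytes([0xce, 0xc6, 0x40, 0x93, 0xaf, 0xf7, 0x51, 0x9f,
                 0xfd, 0xca, 0x44, 0x88, 0xe6, 0xdc, 0x5a])
string3 = bytes([0x45, 0xf6, 0x8d, 0x57, 0x32, 0xcf, 0x80, 0x4b, 0x7a, 0xf0, 0x97])

key1 = bytes([0x38, 0x8f, 0xab, 0xb8])
key2 = bytes([0x8f, 0xb3, 0x34, 0xfc])
key3 = bytes([0x12, 0x9f, 0xf9, 0x3f])

for data, key in ((string1, key1), (string2, key2), (string3, key3)):
    print(xor_decrypt(data, key).decode("ascii"))
# Hello World
# Auto Decryption
# With Python
```

Note that the C program calls strlen on the still-encrypted buffers. That only works because none of the encrypted bytes happens to be 0x00 before the terminator.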

Great! Let’s open this in Ghidra and get started. Create a new project, import the main binary and perform a quick auto analysis. This will give you the main Ghidra listing, and the main function is decompiled very cleanly:

Main ghidra listing

Looking at string1, string2 and string3, Ghidra doesn’t identify any interesting strings, which makes sense due to the fact that they are obfuscated:

Obfuscated strings

In the main listing, we can see three invocations to the xorString function:

xorString calls

The arguments are kept in x0, x1, x2 and x3. This is of course pretty standard for aarch64, though it’s not uncommon to see other calling conventions due to optimizations or obfuscation.

In this example, the xorString function is quite straightforward, but let’s just imagine it’s a bit more complex and we can’t immediately figure out how it works based on the listing or decompiled code. One way to figure out the decrypted string is to attach a debugger and put a breakpoint right after the function call. However, any Android app that has custom string encryption most likely has some kind of Runtime Application Self-Protection (RASP), which means a debugger (or Frida) will immediately be detected. So rather than trying to get that up and running, let’s use Ghidra’s emulator.

Tracing in Ghidra

Go to the main Ghidra project window and drag the main binary onto the emulator:

Start emulator

This will open the Ghidra emulator. We want to emulate the xorString function, which means we have to properly initialize all the registers. The first function call starts at 0x101860, so make sure that line is selected, and click the button to start a new trace:

Start trace

Before letting the trace continue, add a breakpoint to line 0x101890 which is right after the first call to xorString. You can add a breakpoint by selecting the line and pressing k or by right-mouse clicking all the way on the left (where the trace arrow is) and choosing ‘Toggle Breakpoint’. Leave the default options and click OK.

Set breakpoint

Finally, click the green Resume button at the top or tap F5 to start the actual trace. After starting the trace, not much will actually happen. The emulator will continue executing until the PC (as indicated in the Registers window) is trying to execute 0x104018 which is not a valid instruction address. So what happened? We can restart the trace by selecting line 0x101860 and clicking the Emulator button. Apparently, the emulator goes into the strlen function to determine the length of the obfuscated string. This strlen function is imported from an external library, and so the Emulator doesn’t have access to it.

There are at least two ways to get around this: manual intervention, or creating custom SLEIGH code. Let’s take a look at both, starting with manual intervention.

Manually skipping the strlen function

Start a new trace and continue until you’ve reached the call to strlen (0x10186c). Next, click the ‘Skip instruction’ button to jump to the next line without actually executing the instruction:

Skip instruction

Of course, since we skipped the strlen function, the correct value is not in x0. We can patch this manually by opening the Registers window and filtering on x. The current value of x0 is 0x103c00 (the location of the string) and we need to replace this with the length. The string1 variable is a null-terminated string (otherwise strlen wouldn’t work) so we can take a look at the memory location (0x103c00) and count the number of characters. We can also label it as a C-style string by right-mouse clicking > Data > TerminatedCString:

Assign data type

You can now hover over the ds to see that the length of the string is 12 (0xc), but we have to subtract one for the null byte so the length is 11. Back in our Registers window, we can now change the value of x0 to 11. Before you modify the register, click the button at the top of the registers window to enable editing and then double click the value of the x0 register and update it to 11:

Edit registers

Emulating strlen

As an alternative, we can create some custom SLEIGH code that is run instead of the strlen function. First, close all the current traces via the Threads window. Next, start a new trace at 0x101860 and then add a breakpoint on the strlen call at 0x10186c. In the breakpoints window, right-mouse click on the breakpoint and choose ‘Set Injection (Emulator)’:

Adding breakpoint injection

As the injection, we’ll use the following code:

# Initialize counter variable
x8=0;
# Top of our for-loop
<loop>
# If we read a null-byte, we know the length
if (*:1 (x0+x8) == 0) goto <exit>;
# Increase the counter
x8 = x8+1;
# Jump back to the top
goto <loop>;
<exit>
# Assign counter to x0
x0=x8;
# Don't execute the current line in the listing
emu_skip_decoded();

Normally we would have to allocate some space on the stack to store the old value of x8, but since we know that x8 will be overwritten in the line after our strlen call, we know it’s free to use. This code is very similar to the code from the official Ghidra documentation, just adapted for AARCH64. You can also continue reading the official documentation to figure out how to make this SLEIGH injection work for any call to strlen rather than just this single occurrence.
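If SLEIGH syntax is new to you, it may help to see what the injection computes in a more familiar language. The Python sketch below mirrors the loop one-to-one, with a dictionary standing in for emulator memory. The emulated_strlen helper and the dictionary-based memory model are assumptions made for illustration; only the bytes of string1 and its address 0x103c00 come from the example.

```python
def emulated_strlen(memory, x0):
    """Mirror of the SLEIGH injection: count bytes from x0 until a null byte."""
    x8 = 0                          # x8 = 0;
    while memory[x0 + x8] != 0:     # if (*:1 (x0+x8) == 0) goto <exit>;
        x8 += 1                     # x8 = x8 + 1;
    return x8                       # x0 = x8;


# Encrypted string1, including its null terminator, placed at 0x103c00
STRING1_ADDR = 0x103c00
string1 = [0x70, 0xea, 0xc7, 0xd4, 0x57, 0xaf, 0xfc, 0xd7, 0x4a, 0xe3, 0xcf, 0x00]
memory = {STRING1_ADDR + i: b for i, b in enumerate(string1)}

print(emulated_strlen(memory, STRING1_ADDR))  # 11
```

The result matches the value we entered by hand in the previous section: 12 bytes of data minus the null terminator.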

Before stepping through the trace, choose Debugger > Configure Emulator > Invalidate Emulator Cache, just to make sure the Emulator will pick up on our custom SLEIGH code. Finally, we can step through the trace and this time it will skip over the strlen call and store the length of the string in x0:

Success!

Continuing with the trace

Continue with single steps until you get to line 0x10178c. When stepping over this stlxr instruction, Ghidra throws an error:

Sleigh userop 'ExclusiveMonitorPass' is not in the library ghidra.pcode.exec.ComposedPcodeUseropLibrary@73117252
ghidra.pcode.exec.PcodeExecutionException: Sleigh userop 'ExclusiveMonitorPass' is not in the library ghidra.pcode.exec.ComposedPcodeUseropLibrary@73117252
    at ghidra.pcode.exec.PcodeExecutor.step(PcodeExecutor.java:275)
    at ghidra.pcode.exec.PcodeExecutor.finish(PcodeExecutor.java:178)
    at ghidra.pcode.exec.PcodeExecutor.execute(PcodeExecutor.java:160)
    at ghidra.pcode.exec.PcodeExecutor.execute(PcodeExecutor.java:135)
    at ghidra.pcode.emu.DefaultPcodeThread.executeInstruction(DefaultPcodeThread.java:586)
    at ghidra.pcode.emu.DefaultPcodeThread.stepInstruction(DefaultPcodeThread.java:417)
    at ghidra.trace.model.time.schedule.Stepper$Enum$1.tick(Stepper.java:25)
    at ghidra.trace.model.time.schedule.TickStep.execute(TickStep.java:74)
    at ghidra.trace.model.time.schedule.Step.execute(Step.java:182)
    at ghidra.trace.model.time.schedule.Sequence.execute(Sequence.java:392)
    at ghidra.trace.model.time.schedule.TraceSchedule.finish(TraceSchedule.java:400)
    at ghidra.app.plugin.core.debug.service.emulation.DebuggerEmulationServicePlugin.doEmulateFromCached(DebuggerEmulationServicePlugin.java:722)
    at ghidra.app.plugin.core.debug.service.emulation.DebuggerEmulationServicePlugin.doEmulate(DebuggerEmulationServicePlugin.java:770)
    at ghidra.app.plugin.core.debug.service.emulation.DebuggerEmulationServicePlugin$EmulateTask.compute(DebuggerEmulationServicePlugin.java:261)
    at ghidra.app.plugin.core.debug.service.emulation.DebuggerEmulationServicePlugin$EmulateTask.compute(DebuggerEmulationServicePlugin.java:251)
    at ghidra.app.plugin.core.debug.service.emulation.DebuggerEmulationServicePlugin$AbstractEmulateTask.run(DebuggerEmulationServicePlugin.java:238)
    at ghidra.util.task.Task.monitoredRun(Task.java:134)
    at ghidra.util.task.TaskRunner.lambda$startTaskThread$0(TaskRunner.java:106)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
    at java.base/java.lang.Thread.run(Thread.java:1589)
Caused by: ghidra.pcode.exec.SleighLinkException: Sleigh userop 'ExclusiveMonitorPass' is not in the library ghidra.pcode.exec.ComposedPcodeUseropLibrary@73117252
    at ghidra.pcode.exec.PcodeExecutor.onMissingUseropDef(PcodeExecutor.java:578)
    at ghidra.pcode.emu.DefaultPcodeThread$PcodeThreadExecutor.onMissingUseropDef(DefaultPcodeThread.java:205)
    at ghidra.pcode.exec.PcodeExecutor.executeCallother(PcodeExecutor.java:562)
    at ghidra.pcode.exec.PcodeExecutor.stepOp(PcodeExecutor.java:249)
    at ghidra.pcode.emu.DefaultPcodeThread$PcodeThreadExecutor.stepOp(DefaultPcodeThread.java:182)
    at ghidra.pcode.exec.PcodeExecutor.step(PcodeExecutor.java:268)
    ... 20 more

---------------------------------------------------
Build Date: 2023-Sep-28 1301 EDT
Ghidra Version: 10.4
Java Home: /usr/lib/jvm/java-19-openjdk-amd64
JVM Version: Private Build 19.0.2
OS: Linux 5.15.0-76-generic amd64

Apparently the call to ExclusiveMonitorPass isn’t implemented for this Emulator so it doesn’t know what to do. The ExclusiveMonitorPass is there because of the atomic status flag which makes sure that different threads don’t interfere with each other while decrypting the string. We could simulate the call again with some custom SLEIGH code, but since our emulation is single-threaded anyway, let’s patch the code to remove the call altogether.

Patching the decryption function

Currently, the decompiled code looks like this:

void xorString(long param_1,long param_2,int param_3,int *param_4)
{
  int iVar1;
  char cVar2;
  bool bVar3;
  int local_30;

  do {
    iVar1 = *param_4;
    if (iVar1 != 1) break;
    cVar2 = '\x01';
    bVar3 = (bool)ExclusiveMonitorPass(param_4,0x10);
    if (bVar3) {
      *param_4 = 0;
      cVar2 = ExclusiveMonitorsStatus();
    }
  } while (cVar2 != '\0');
  if (iVar1 == 1) {
    for (local_30 = 0; local_30 < param_3; local_30 = local_30 + 1) {
      *(byte *)(param_1 + local_30) =
           *(byte *)(param_1 + local_30) ^ *(byte *)(param_2 + local_30 % 4);
    }
  }
  return;
}

The general flow is described in this ARMv8-A Synchronization primitives document which explains how ExclusiveMonitors work. While the decompiler understands the special exclusive stlxr and ldaxr instructions, the emulator does not. In the snippet below, I’ve renamed (L) and retyped (CTRL+L) the variables, and added some constants and comments to make it a bit clearer:

void xorString(char *p_string,char *p_key,int p_length,int *p_status)
{
  int counter;
  bool wasAbleToStore;
  bool monitorIsReady;
  int status;
  int ENCRYPTED = 1;
  int DECRYPTED = 0;

  do {
    status = *p_status;
    // If the string isn't encrypted, no need to do more work
    if (status != ENCRYPTED) break;
    wasAbleToStore = true;

    // Check whether the given address is part of the Exclusive Monitor of the current PE
    monitorIsReady = (bool)ExclusiveMonitorPass(p_status,0x10);
    if (monitorIsReady) {
      // Try to store the value. This will also lift the exclusion in case the write was successful
      *p_status = DECRYPTED;
      // If the store was unsuccessful, wasAbleToStore will become false
      wasAbleToStore = (bool)ExclusiveMonitorsStatus();
    }
  } while (wasAbleToStore != false);

  // Only decrypt if status == ENCRYPTED
  if (status == ENCRYPTED) {
    for (counter = 0; counter < p_length; counter = counter + 1) {
      p_string[counter] = p_string[counter] ^ p_key[counter % 4];
    }
  }
  return;
}

The code will figure out if the string still needs to be decrypted and if so, try to update the status. If this status update succeeds, the thread will continue on to the decryption algorithm and start XORing the different characters. Since we are only using a single thread in the emulator, let’s just patch out the thread-specific logic while making as few modifications as possible.
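The Python sketch below models that load-exclusive/store-exclusive loop to show why it can safely be collapsed when only one thread is involved. It is a heavily simplified model and not a faithful description of the ARM exclusive monitor: the ExclusiveMonitor class, the claim_string helpers, the interruptions parameter and the STATUS_ADDR value are all made up for this example.

```python
ENCRYPTED, DECRYPTED = 1, 0
STATUS_ADDR = 0x1000  # hypothetical address of a status flag


class ExclusiveMonitor:
    """Tiny model of one core's exclusive monitor."""

    def __init__(self):
        self.marked_addr = None

    def load_exclusive(self, memory, addr):          # models ldaxr
        self.marked_addr = addr
        return memory[addr]

    def store_exclusive(self, memory, addr, value):  # models stlxr
        if self.marked_addr != addr:
            return 1                                 # non-zero: store failed
        memory[addr] = value
        self.marked_addr = None
        return 0                                     # zero: store succeeded

    def clear(self):                                 # something else touched the address
        self.marked_addr = None


def claim_string(memory, addr, monitor, interruptions=()):
    """Return (should_decrypt, attempts), following the original loop."""
    pending = list(interruptions)
    attempts = 0
    while True:
        attempts += 1
        status = monitor.load_exclusive(memory, addr)
        if status != ENCRYPTED:
            return False, attempts
        if pending:                                  # simulate another thread running here
            pending.pop(0)(memory, monitor)
        if monitor.store_exclusive(memory, addr, DECRYPTED) == 0:
            return True, attempts


def claim_string_patched(memory, addr):
    """Single-threaded equivalent: plain load, compare and store."""
    if memory[addr] != ENCRYPTED:
        return False
    memory[addr] = DECRYPTED
    return True


def other_thread_wins(memory, monitor):
    """Another thread claims the string between our load and store."""
    memory[STATUS_ADDR] = DECRYPTED
    monitor.clear()


print(claim_string({STATUS_ADDR: ENCRYPTED}, STATUS_ADDR, ExclusiveMonitor()))
print(claim_string({STATUS_ADDR: ENCRYPTED}, STATUS_ADDR, ExclusiveMonitor(),
                   [lambda mem, mon: mon.clear()]))
print(claim_string({STATUS_ADDR: ENCRYPTED}, STATUS_ADDR, ExclusiveMonitor(),
                   [other_thread_wins]))
print(claim_string_patched({STATUS_ADDR: ENCRYPTED}, STATUS_ADDR))
```

Without interruptions the loop body runs exactly once and behaves like claim_string_patched. The retry only matters when something else touches the status flag between the load and the store, which can’t happen in our single-threaded emulation.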

If we look at the listing, the instruction that triggered the error is stlxr:

stlxr instruction

The stlxr instruction is the exclusive version of the non-exclusive str instruction which simply stores a value at a certain position. Let’s modify this instruction to str w12, [x11] by right-mouse clicking, choosing ‘Patch Instruction’ and entering the new instruction:

Patching stlxr

In the next line (0x101790), the w10 register is checked and we jump back to the top of the block if it’s not equal to zero (cbnz). Since the patched str instruction no longer writes a status result to w10, this check no longer makes sense. If we nop-out the cbnz instruction, the flow simply continues as if the store was successful. So right-mouse click > Patch Instruction and choose the nop command:

Nopping cbnz

If we now look at the decompilation view, the code is much more straightforward:

void xorString(char *p_string,char *p_key,int p_length,int *p_status)
{
  int counter;
  bool wasAbleToStore;

  if (*p_status == ENCRYPTED) {
    *p_status = 0;
    for (counter = 0; counter < p_length; counter = counter + 1) {
      p_string[counter] = p_string[counter] ^ p_key[counter % 4];
    }
  }
  return;
}

Let’s trace through the first invocation again. Start a new trace, add the custom SLEIGH injection breakpoint and put a breakpoint at line 0x101890. Once the second breakpoint hits, examine address 0x103c00 in the Dynamic listing window. The string has successfully been decrypted and we can convert it into a normal C string using right-mouse click > Data > TerminatedCString. Note that the normal listing still has the obfuscated string, since it is not automatically updated based on the emulator result.

Decrypted strings

Now that we know that the string ‘Hello World’ is located at address 0x103c00, we can label it appropriately in the normal listing. Select the symbol name string1 and press L:

Assign label

The new label will automatically be used throughout the listing and the decompilation view:

Updated decompilation

The same technique can be used to decode the other two strings, but let’s just automate everything with some Python to speed things up.

Automating with Python

Automating Ghidra can be a bit tricky due to a few reasons:

  • We can choose between Python2 or Java, but no Python3
  • Not too many code examples
  • Documentation is limited

To solve the first problem, we could install Ghidrathon or Ghidra bridge, but for simplicity, let’s just stick to importing the Python 3 style print function and using Python 2.

As for the actual code, the easiest solution by far (at least currently) is to use ChatGPT to generate it. It does a pretty good job and can quickly give you the necessary API calls for some prototyping.

There are a few different approaches we could take:

  • Create a script that we can trigger after selecting a specific call to xorString. We could even select all the relevant lines in the listing for each call so that the script knows exactly where to get the input from.
  • Find all references to the xorString function and try to find the correct input values automatically.

Let’s try the second approach for maximum convenience. This will allow us to run the script once and hopefully identify all obfuscated strings. The requirement is of course that Ghidra has identified all the correct cross-references to the xorString function.

The general structure is as follows:

  1. Retrieve all references to the chosen decryption function
  2. Check each reference to make sure it’s a function call
  3. Go up from the function call until we find correct values for all parameters (x0, x1, x2, x3)
  4. Extract correct values from memory (e.g. collect the byte array at x0)
  5. Use a custom python function to decrypt
  6. Assign the decrypted string as the label of the encrypted byte array in the listing

The most difficult part is definitely step 3 and will be very specific to your application. In the test application, it’s not too difficult. We actually only need x0 (string) and x1 (key) since we can calculate the length of the string ourselves and we don’t really need the status variable. x0 and x1 are defined across a few different statements, but we can actually make use of Ghidra’s calculations.

In the image below, we can see that at line 0x1018d4, Ghidra knows that x0 refers to string3, and at line 0x1018e0, Ghidra knows that x1 refers to key3. So let’s use that knowledge in our script: working backwards from the call, we search for the first instruction where Ghidra has a resolved value for x0, and do the same for x1.

Interesting registers

One useful trick here is to select the line that has the information you want, and right-mouse click > Instruction Info.

Automatic address calculation

We can see that the value we are looking for (0x103c24: string3) can be accessed via the Address property of Operand-0. So we can scan the code looking for the first occurrence of x0 as Operand-0 and then extract the address.
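To make the backwards scan concrete outside of Ghidra, here is a minimal sketch in plain Python. The Instr tuple is a hypothetical stand-in for a Ghidra instruction and only models the two properties we care about. The resolved addresses and the call address are taken from the walkthrough; the other instructions in the listing are made up for illustration.

```python
from collections import namedtuple

# Hypothetical stand-in for a Ghidra instruction. Only the two properties the
# scan needs are modelled: the register in operand 0 and the address Ghidra
# resolved for that operand (None when nothing was resolved).
Instr = namedtuple("Instr", ["address", "operand0_register", "operand0_address"])


def find_resolved_address(listing, call_index, register, max_steps=20):
    """Walk backwards from the call and return the first resolved address
    found for the given register, or None if the scan gives up."""
    index = call_index - 1
    steps = 0
    while index >= 0 and steps < max_steps:
        instr = listing[index]
        if instr.operand0_register == register and instr.operand0_address is not None:
            return instr.operand0_address
        index -= 1
        steps += 1
    return None


# Illustrative listing: only 0x1018d4, 0x1018e0 and the call at 0x1018ec are
# taken from the walkthrough, the rest is invented filler.
listing = [
    Instr(0x1018D0, "x0", None),      # page address only, nothing resolved yet
    Instr(0x1018D4, "x0", 0x103C24),  # x0 -> string3
    Instr(0x1018D8, "x8", None),
    Instr(0x1018DC, "x1", None),
    Instr(0x1018E0, "x1", 0x103C3C),  # x1 -> key3
    Instr(0x1018E4, "x2", None),
    Instr(0x1018EC, None, None),      # bl xorString
]

call_index = len(listing) - 1
x0 = find_resolved_address(listing, call_index, "x0")
x1 = find_resolved_address(listing, call_index, "x1")
print(hex(x0), hex(x1))
```

The step limit plays the same role as the 20-line safeguard in the Ghidra script: if a register is never resolved close to the call, the scan gives up and returns None, and the caller has to handle that case.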

The full script to resolve all the strings is given below. There might be better or faster ways to do this, and the script can definitely fail for multiple reasons, but as a PoC it works very well. For each decrypted string, the label is updated and the data is tagged as a char array that includes the terminating null byte.

from __future__ import print_function
import os
import jarray
from ghidra.program.model.data import TerminatedStringDataType
from ghidra.program.model.mem import MemoryAccessException
from ghidra.program.model.symbol import SymbolTable, SourceType
from ghidra.program.model.data import CharDataType, ArrayDataType

global toAddr, getReferencesTo, getInstructionAt, currentProgram

def getCurrentProgram():
    return currentProgram

program = getCurrentProgram().getListing()
memory = getCurrentProgram().getMemory()

def main():
    decryptFunction = getState().getCurrentLocation().getAddress()
    functionStart = getStartOfFunction(decryptFunction)
    if decryptFunction != functionStart:
        print("Chosen instruction is inside of a function. Using first instruction of function instead")
        decryptFunction = functionStart

    print("Decrypt function: " + str(decryptFunction))

    # Obtain all references to the chosen function
    xrefs = getReferencesTo(decryptFunction)

    for xref in xrefs:

        # Find the caller, which is an address    
        caller = xref.getFromAddress()

        # Get the instruction at that address
        inst = getInstructionAt(caller)

        if inst:

            mnemonic = inst.getMnemonicString()
            # Interested in function calls
            if mnemonic == "bl":
                # Find x0, x1, x2 and x3
                x0 = getValue("x0", inst)
                x1 = getValue("x1", inst)
                x2 = getStringLength(x0)
                x3 = getValue("x3", inst)
                print("Found call at", caller,"Decoding with arguments: ", x0, x1, x2, x3);

                encryptedString = getMemoryBytes(x0, x2)
                status = getMemoryBytes(x3, 1)
                key = getMemoryBytes(x1, 4);
                decryptedValue = str(xorDecrypt(encryptedString, key))

                print("Decryption: ", decryptedValue, "\n")
                assignPrimaryLabel(x0, "s_" + toCamelCase(decryptedValue))
                # Include the terminating \x00, so x2 + 1
                tagAsCharArray(x0, x2 + 1)

def assignPrimaryLabel(address, label_name):
    try:
        # Get the current program's symbol table
        symbolTable = getCurrentProgram().getSymbolTable()

        symbol = symbolTable.getPrimarySymbol(address)

        if symbol:
            symbol.setName(label_name, SourceType.USER_DEFINED)
        else:
            symbol = symbolTable.createLabel(address, label_name, SourceType.USER_DEFINED)
            symbol.setPrimary()

    except Exception as e:
        print("Error assigning label:", e)

def toCamelCase(input_string):
    words = input_string.split()
    # Capitalize the first letter of each word except the first one
    camelCaseString = words[0].lower() + ''.join(word.capitalize() for word in words[1:])
    return camelCaseString

def getValue(registerName, inst):
    # A safeguard to only go back 20 lines max
    c = 0
    while True:
        inst = inst.getPrevious()

        register = inst.getRegister(0)

        if register and register.getName() == registerName:
            primRef = inst.getPrimaryReference(0)
            if primRef:
                return primRef.getToAddress()
        c += 1
        if c > 20:
            return None

def assignString(addr, name):
    existingData = program.getDataContaining(addr)

    if not existingData or not isinstance(existingData.getDataType(), TerminatedStringDataType):
        program.clearCodeUnits(addr, addr, False)
        program.createData(addr, TerminatedStringDataType())

def tagAsCharArray(address, length):
    dataManager = getCurrentProgram().getListing()
    charDataType = CharDataType()  # Define the char data type
    charArrayDataType = ArrayDataType(charDataType, length, charDataType.getLength())  # Create an array of chars

    try:
        endAddress = address.add(length)
        dataManager.clearCodeUnits(address, endAddress, False)

        # Apply the char array data type at the given address
        dataManager.createData(address, charArrayDataType)
    except Exception as e:
        print("Error creating char array at address:", e)

def getStringLength(address):
    # print("String length of ", address)
    length = 0
    while True:
        # Read a single byte
        byteValue = memory.getByte(address)

        # Check if the byte is the null terminator
        if byteValue == 0:
            break

        # Move to the next byte
        address = address.add(1)
        length += 1
    return length

def getStartOfFunction(address):
    return program.getFunctionContaining(address).getEntryPoint()

def xorDecrypt(encodedString, key):
    result = bytearray()
    key_length = len(key)
    for i in range(len(encodedString)):
        result.append(encodedString[i] ^ key[i % key_length])
    return result

def getMemoryBytes(address, length):
    try:
        # Create a byte array to hold the memory contents
        byte_array = jarray.zeros(length, 'b')

        # Read memory into the byte array
        if memory.getBytes(address, byte_array) != length:
            print("Warning: Could not read the expected number of bytes.")

        return byte_array
    except MemoryAccessException as e:
        print("Memory access error:", e)
        return bytearray()
    except Exception as e:
        print("An error occurred:", e)
        return bytearray()

main();

Note: It’s not possible to use a bytearray as the second argument for memory.getBytes (see this Ghidra issue). Using a python bytearray will result in an empty bytearray.
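A related detail: the jarray buffer is a Java byte[], so in Jython its elements come back as signed values (-128 to 127). XORing two such values works as long as both have the same sign, but if only one of them has its top bit set the result is negative, and appending that to a bytearray should fail. The sketch below is plain Python (no Ghidra needed) and shows a slightly more defensive variant that masks both operands first. The to_signed_bytes helper, the key and the plaintext are made up for the demo.

```python
def to_signed_bytes(data):
    """Mimic what a Java byte[] hands back: values in the range -128..127."""
    return [b - 256 if b > 127 else b for b in bytearray(data)]


def xor_decrypt(encoded, key):
    """XOR with a repeating key, masking each operand to 0..255 first."""
    result = bytearray()
    for i, value in enumerate(encoded):
        result.append((value & 0xFF) ^ (key[i % len(key)] & 0xFF))
    return result


# Made-up key and plaintext, used only to demonstrate the round trip.
key = to_signed_bytes(b"\x13\x37\xc0\xde")
plaintext = b"Hello World"
encrypted = to_signed_bytes(xor_decrypt(bytearray(plaintext), key))

decrypted = xor_decrypt(encrypted, key)
print(decrypted.decode("ascii"))
```

For the test binary the unmasked version is good enough, since the decrypted output is plain ASCII. The mask only matters when a decrypted byte is 0x80 or higher.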

To run this script, open the Script Manager (Window > Script Manager) and click ‘Create New Script’ in the top right. Choose Python and give it a name (e.g. xorStringDecrypter). Paste the content of the script, make sure you select a line of code somewhere inside the xorString function and finally click Run. You should see the following output:

xorStringDecrypter.py> Running...
Decrypt function: 00101754
Found call at 0010188c Decoding with arguments:  00103c00 00103c34 11 00103c0c
Decryption:  Hello World 

Found call at 001018bc Decoding with arguments:  00103c10 00103c38 15 00103c20
Decryption:  Auto Decryption 

Found call at 001018ec Decoding with arguments:  00103c24 00103c3c 11 00103c30
Decryption:  With Python 

xorStringDecrypter.py> Finished!

The decompiler output is automatically updated, and the data is tagged correctly:

Conclusion

There are many ways to solve the different problems listed in this mini-tutorial, and it’s always good to have multiple techniques in your toolbelt. Automation can help tremendously, but you’ll often have to write custom scripts that work for your specific problem.

Finally, if you know of some interesting additional techniques, or maybe faster ways to do something, leave a comment!

Jeroen Beckers

Jeroen Beckers is a mobile security expert working in the NVISO Software Security Assessment team. He is a SANS instructor and SANS lead author of the SEC575 course. Jeroen is also a co-author of OWASP Mobile Security Testing Guide (MSTG) and the OWASP Mobile Application Security Verification Standard (MASVS). He loves to both program and reverse engineer stuff.

Unpacking Flutter hives

13 March 2024 at 08:00

Intro

When analyzing the security of mobile applications, it’s important to verify that all data is stored securely (see OWASP MASVS-STORAGE-1). A recent engagement involved a Flutter app that uses the Isar/Hive framework to store data. The engagement was unfortunately black-box, so we did not have access to any of the source code. This makes the assessment considerably more difficult, as Flutter is hard to decompile and tools like Doldrums or reFlutter only work for very specific (and old) versions. Frida can be used (see e.g. Intercepting Flutter traffic)

The files we extracted from the app were encrypted and we needed to figure out what kind of data was stored. For example, storing the password of the user (even if it’s encrypted) would be an issue, as the password can for example be extracted using a device backup.

In order to figure out how the data is encrypted, we needed to analyze the Hive framework and find some way to extract that data in cleartext. Hive is a “Lightweight and blazing fast key-value database written in pure Dart”, which means we can’t easily monitor what is stored inside the databases using Frida. There also isn’t a publicly available Hive viewer that we could find, and there’s probably a good reason for that, as we will see.

The goal of this blogpost is to obtain the content of an encrypted Hive without having access to the source code. This means we will:

  • Create a Flutter test app to get some useful Hives
  • Understand the internals of the Hive framework
  • Create a generic Hive reader that works on encrypted Hives containing custom objects
  • Obtain the password of the encrypted Hive
  • (Bonus) Recover deleted items

Let’s start!

Isar / Hive

Hive is a key-value framework built on top of Isar, which is a no-sql library for Flutter applications. It is possible to store all the simple Dart types, but also more complex types like a List or Map, or even custom objects via custom TypeAdapters. The project is currently in a transition phase to v4 so the focus is on v2.2.3, which is what the target application was most likely using.

While Hive is the name of the framework, what we are actually interested in are boxes. Boxes are the actual files that are stored on the system and each box contains one or more data frames. A data frame simply holds one key-value pair.

Each box is either plaintext or encrypted. The encryption is based on AES-256, which means you need a 256-bit key to open an encrypted box. The storage of this key is not the responsibility of Hive, and the documentation suggests storing your key using the flutter_secure_storage plugin. This is interesting, as the flutter_secure_storage plugin uses the system credential storage of the device to store data, so we can potentially intercept the key with Frida when it is being retrieved.

Keys are not encrypted!
One very important thing to realize is that an encrypted box is not actually fully encrypted. For each key-value pair that is stored, only the value is stored encrypted, while the key is stored in plaintext. This is mentioned in the documentation, but it’s easy to miss it. Now this is generally not a big deal, except of course if sensitive data is being used as the key (e.g. a user ID).

Creating a small test app

Let’s create a small Flutter application that uses Hive and saves some data into a box. The code for this was mostly generated by poking ChatGPT so that we could spend our time reverse-engineering (and fighting Xcode). For simplicity’s sake, I’m deploying to macOS so that the boxes are stored directly on the system and we can easily analyze them.

void main() async {
  WidgetsFlutterBinding.ensureInitialized();
  final dir = await getApplicationDocumentsDirectory();
  Hive.init(dir.path);
  await createBox();
  runApp(MaterialApp(home:MyApp()));
}

Future<void> createBox() async {
  // Storing examples of each supported datatype
  var box = await Hive.openBox('basicBox');
  box.put('myInt', 123); // int
  box.put('myDouble', 123.456); // double
  box.put(0x22, true); // bool
  box.put('myString', 'Hello Hive'); // String
  box.put('myBytes', Uint8List.fromList([68, 97, 114, 116])); // List<int>
  box.put('myList', [1, 2, 3]); // List<dynamic>
  box.put('myMap', {'name': 'Hive', 'isCool': true}); // Map<dynamic, dynamic>
  box.put('myDateTime', DateTime.now()); // DateTime
}
Dart

And our dependencies:

dependencies:
  flutter:
    sdk: flutter

  hive: ^2.2.3
  hive_flutter: ^1.1.0
  isar_flutter_libs: ^3.1.0+1
  path_provider: ^2.1.2
  file_picker: ^6.1.1
  path: ^1.9.0

dev_dependencies:
  flutter_test:
    sdk: flutter

  hive_generator: ^1.1.0 
  build_runner: ^2.0.1
YAML

When we run the application, it creates a new .hive file, which is in fact a box containing all of our key-value pairs:

(Terminal screenshots created using carbon.now.sh)

Hive internals

A box contains multiple frames, and each frame is responsible for indicating how long it is. There is no global index of the frame offsets, which means that we can’t jump directly to a specific frame: we have to parse the frames one by one from the start of the file. Each frame consists of a key (either a string or an index) and a value, which can be any default or custom type:

The Integer value (123) is stored in Float64 format so it looks a bit weird.

If the key is a String (frames 1 and 2), it has type 0x01, followed by the length and then the actual ASCII value. If the key is an int (frame 3), the encoding is slightly different. The type is 0x00 and the key is encoded as a uInt32. The key will be an int if you specify an int as the key (e.g. myBox.put(0x22, true)) or if you use the autoIncrement feature (myBox.add("test")).
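As a rough illustration of this layout, the sketch below walks a box frame by frame in plain Python and pulls out the keys. It assumes the structure described here plus the little-endian frame length and trailing 4-byte CRC used by the recovery code later in this post. The values are left as opaque bytes and the CRC is not verified, so this is a sketch for poking at files rather than a full reader. build_frame is a hypothetical helper that only exists to create test data.

```python
import struct


def build_frame(key, value):
    """Assemble a test frame: length, key, value, then a placeholder CRC.
    The CRC is zeroed because this sketch does not verify it."""
    if isinstance(key, int):
        key_bytes = b"\x00" + struct.pack("<I", key)
    else:
        encoded = key.encode("ascii")
        key_bytes = b"\x01" + struct.pack("B", len(encoded)) + encoded
    body = key_bytes + value + b"\x00\x00\x00\x00"
    # The frame length counts its own 4 bytes as well
    return struct.pack("<I", len(body) + 4) + body


def parse_frames(data):
    """Walk a box frame by frame and return (key, value_bytes) tuples."""
    frames = []
    offset = 0
    while offset + 4 <= len(data):
        (frame_length,) = struct.unpack_from("<I", data, offset)
        end = offset + frame_length
        if frame_length < 8 or end > len(data):
            break  # truncated or corrupt frame, stop parsing
        pos = offset + 4
        if data[pos:pos + 1] == b"\x01":
            # String key: type, length byte, ASCII characters
            key_length = struct.unpack_from("B", data, pos + 1)[0]
            key = data[pos + 2:pos + 2 + key_length].decode("ascii")
            pos += 2 + key_length
        else:
            # Int key: type, uint32
            (key,) = struct.unpack_from("<I", data, pos + 1)
            pos += 5
        frames.append((key, data[pos:end - 4]))
        offset = end
    return frames


# Two frames with placeholder value bytes
box = build_frame("myInt", b"\xAA") + build_frame(0x22, b"\xBB\xCC")
for key, value in parse_frames(box):
    print(repr(key), len(value))
```

Because there is no index, the only way to reach the second frame is to read the length of the first one and skip over it, which is exactly what the loop does.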

If we run the application a second time, Hive will open the box from the filesystem (based on the name of the box) and load all the current values. When the put instructions are executed again, Hive doesn’t overwrite the frame belonging to the given key (as that would require the entire file to be shifted based on the new lengths, a very intensive operation), but rather it appends a new frame with the new value. As a result, running the code twice will double the size of the box. When the box is read, all frames are parsed sequentially and all key-value pairs simply overwrite any previously loaded information.

Deleting data
Even if you delete a value using .delete("key"), this simply appends a new delete frame. A delete frame is a frame with an empty value, indicating that the value has been deleted. The previous data is however not deleted from the box.

It is possible to optimize the box using the compact function, or Hive may do this automatically at some point based on the maximum file size of the box, which can be configured when opening the box. This feature is documented, but only under the advanced section. As a result, there is a very good chance that older values are still available in a box, even if they were deleted.

For example, let’s take the following box:

var emptyBox = await Hive.openBox("emptyBox");
emptyBox.put("mySecret", "Don't tell anyone");
emptyBox.delete("mySecret");
Dart

The created box still contains the secret if you look at the binary content:

The delete frame is simply a frame with a key and no value.
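The append-only behaviour is easy to model. The following sketch (plain Python, with frames simplified to key/value tuples and None standing in for an empty value) replays frames in file order the way a box is loaded, and compares what the API would show with what is still sitting in the file. The counter key is an extra example added for illustration.

```python
def replay(frames):
    """Apply frames in file order: later frames overwrite earlier ones and
    a frame without a value deletes the key."""
    current = {}
    for key, value in frames:
        if value is None:
            current.pop(key, None)
        else:
            current[key] = value
    return current


frames = [
    ("mySecret", "Don't tell anyone"),
    ("mySecret", None),   # delete frame: key, no value
    ("counter", 1),
    ("counter", 2),       # overwrites the previous frame when loaded
]

visible = replay(frames)
still_on_disk = [value for _, value in frames if value is not None]
print(visible)
print(still_on_disk)
```

The secret and the old counter value never show up in the replayed result, but both are still present in the list of frames until the box is compacted.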

Custom types

In addition to storing normal Dart types in a box, it is possible to store custom types as long as Hive knows how to serialize/deserialize them. Let’s look at a quick example with a custom Bee class:

import 'package:hive/hive.dart';

part 'BeeModel.g.dart';

@HiveType(typeId: 1)
class Bee extends HiveObject{
  @HiveField(0)
  final String name;
  @HiveField(1)
  final int age;
  Bee({
    required this.name, 
    required this.age,
  });
}
Dart

The Bee class extends HiveObject and defines two properties: a String name and an int age. It’s not technically necessary to extend the HiveObject class, but it makes things easier. We’ve also added annotations so that we can use the hive_generator package to automatically generate a serializer by running dart run build_runner build:

This will generate a new file called BeeModel.g.dart containing the BeeAdapter class, which takes care of serializing/deserializing:

// GENERATED CODE - DO NOT MODIFY BY HAND

part of 'BeeModel.dart';

// **************************************************************************
// TypeAdapterGenerator
// **************************************************************************

class BeeAdapter extends TypeAdapter<Bee> {
  @override
  final int typeId = 1;

  @override
  Bee read(BinaryReader reader) {
    final numOfFields = reader.readByte();
    final fields = <int, dynamic>{
      for (int i = 0; i < numOfFields; i++) reader.readByte(): reader.read(),
    };
    return Bee(
      name: fields[0] as String,
      age: fields[1] as int,
    );
  }

  @override
  void write(BinaryWriter writer, Bee obj) {
    writer
      ..writeByte(2)
      ..writeByte(0)
      ..write(obj.name)
      ..writeByte(1)
      ..write(obj.age);
  }

  @override
  int get hashCode => typeId.hashCode;

  @override
  bool operator ==(Object other) =>
      identical(this, other) ||
      other is BeeAdapter &&
          runtimeType == other.runtimeType &&
          typeId == other.typeId;
}
Dart

We can see that the serialization format is pretty straightforward: it first writes the number of fields (2), followed by one index-value pair per field, where each index corresponds to a HiveField annotation. The write() function is the function that is used during a normal put() operation, so the fields will have the same structure as seen earlier.

Finally, to be able to use this new type, the Adapter needs to be registered:

  Hive.registerAdapter(BeeAdapter());
  var beeBox = await Hive.openBox("beeBox");
  beeBox.put("myBee", Bee(name: "Barry", age: 1));
Dart

After running this code, the beeBox is generated, containing one frame:

Decoding unknown types

Let’s now assume that we have access to a box with unknown types. We can still load it, as long as we can figure out a suitable deserializer. It’s not a far stretch to assume that the developer has used the automatically generated adapter, so let’s focus on that. If they haven’t, you’ll have to dive into Ghidra and start disassembling the Hive deserialization, or make some educated guesses based on the hexdump.

We can take the BeeAdapter as a starting point, but rather than creating Bee objects, let’s create a generic List object in which we can store all the deserialized values. Luckily a List can contain any type of data in Dart, so we don’t have to worry about the actual types of the different fields. Additionally, we want to make the typeId dynamic since we want to register all of the possible custom typeIds.

The following GenericAdapter does exactly that:

import 'package:hive/hive.dart';

class GenericAdapter extends TypeAdapter<List> {
  @override
  final int typeId;

  GenericAdapter(this.typeId);

  @override
  List read(BinaryReader reader) {
    final numOfFields = reader.readByte();
    var list = List<dynamic>.filled(numOfFields, null, growable: true);

    for (var i = 0; i < numOfFields; i++) {
      list[reader.readByte()] = reader.read();
    }
    return list;
  }

  @override
  int get hashCode => typeId.hashCode;
  
  @override
  void write(BinaryWriter writer, List obj) {
    // No write needed
  }
}
Dart

We can then register this GenericAdapter for all the available custom typeIds (0 to 223) and read the beeBox we created earlier without needing the Bee or BeeAdapter class:

for (var i = 0; i <= 223; i++)
{
   Hive.registerAdapter(GenericAdapter(i));
}
var beeBox = await Hive.openBox("beeBox");
List myBee = beeBox.get("myBee");
print(myBee.toString()); // prints [Barry, 1]
Dart
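To show the field layout on its own, here is a small Python sketch of the same read logic. It works on a list of already-decoded tokens instead of Hive’s binary value encoding, so read_fields and the token lists are purely illustrative. One deliberate difference is that it collects the fields in a dict keyed by field index rather than a fixed-size list.

```python
def read_fields(tokens):
    """Read `count, (index, value) * count` from a sequence of decoded tokens
    and return the values keyed by field index."""
    stream = iter(tokens)
    num_fields = next(stream)
    fields = {}
    for _ in range(num_fields):
        index = next(stream)
        fields[index] = next(stream)
    return fields


# The sequence BeeAdapter.write() produces for a Bee named Barry with age 1,
# with the values shown already decoded.
bee = read_fields([2, 0, "Barry", 1, 1])
print(bee)

# Hypothetical object whose HiveField indices are 0 and 5
sparse = read_fields([2, 0, "x", 5, "y"])
print(sparse)
```

The second example is the reason for the dict: if a developer skipped HiveField indices, a list sized by the number of fields would be too short for the highest index. If you run into that with the Dart GenericAdapter, switching its return type to a Map is probably the easiest fix.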

Encrypted hives

As mentioned earlier, it’s possible to encrypt boxes. Let’s see how this changes the internals of the box:

final encryptionKey = Hive.generateSecureKey();
final encryptedBox = await Hive.openBox('encryptedBox', 
                          encryptionCipher: HiveAesCipher(encryptionKey));
encryptedBox.put("myString", "Hello World");
encryptedBox.close();
Dart

The code above generates the box below:

As explained earlier, the key is not encrypted, but the value is. The encryption covers all the bytes between the KEY and the CRC code. There is no special format for indicating an encrypted value, but Hive knows that it needs to decrypt the data due to the encryptionCipher parameter while opening the box. When the frames are read, the value is decrypted and parsed according to the normal deserialization logic. This means that we can use our GenericAdapter for encrypted boxes too, as long as we have the password.

Obtaining the password

This is potentially the trickiest part, as it can be very easy or very difficult. In general, there are a few different options:

  1. Intercept the password when it is loaded from storage
  2. Intercept the password when the box is opened
  3. Extract the password from storage

The first option is only possible if the password is actually stored somewhere (rather than being hardcoded). In the official Hive documentation, the developer recommends using the flutter_secure_storage plugin, which will use either the Keystore (Android) or Keychain (iOS).

On Android, we can hook the Java code to intercept the password when it is loaded from the encrypted shared preferences. For example, there is the FlutterSecureStorage.read function which returns the value for a given key. By default, Flutter optimizes the application in release mode, which means we can’t hook FlutterSecureStorage.read by name because the class and method names will have been renamed. It takes a little bit of effort to find the correct method, but the hook is straightforward:

Java.perform(() => {
    // Replace with correct class and method
    let a = Java.use("c0.a");
    a["l"].implementation = function (str) {
        console.log(`a.l is called: str=${str}`);
        let result = this["l"](str);
        console.log(`a.l result=${result}`);
        return result;
    };
});
JavaScript

Running this with Frida will print the base64 encoded password:

On iOS, the flutter_secure_storage plugin has moved to Swift, so intercepting the call is not straightforward. We do know, however, that the flutter_secure_storage plugin uses the Keychain, and it does so without any additional encryption. This means we can obtain the password by dumping the keychain with objection’s ios keychain dump command:

In case these options don’t work, you’ll probably want to dive into Ghidra and start reverse-engineering the app.
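Whichever route produced the value, it is worth checking that it really is a Hive key before trying to open a box with it. The sketch below (plain Python, with a placeholder value rather than a real capture) decodes the base64 string and checks that it is the 32 bytes an AES-256 key needs. It also maps the URL-safe alphabet back to the standard one, in case the app used a URL-safe encoder to store the key.

```python
import base64


def decode_hive_key(encoded):
    """Decode a base64 string and check it has the 32 bytes AES-256 needs."""
    # Accept both alphabets by mapping URL-safe characters to standard ones
    normalized = encoded.replace("-", "+").replace("_", "/")
    key = base64.b64decode(normalized)
    if len(key) != 32:
        raise ValueError("expected 32 bytes, got %d" % len(key))
    return key


# Placeholder for the demo: the bytes 0..31, base64-encoded
captured = base64.b64encode(bytes(bytearray(range(32)))).decode("ascii")

key = decode_hive_key(captured)
print(len(key) * 8, "bit key")
```

If the decoded value is not 32 bytes long, you have most likely intercepted a different secure storage entry and should keep looking.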

Recovering deleted items

We now have the password and a generic parser, so we can extract the items from the Hive. Unfortunately, if we use the normal API, we will only see the latest version of each item, or nothing at all in case there are delete frames. We could modify the Hive source code to notify us whenever a Frame is loaded (and there is actually some unreachable debugging code available that does just that), but it would be nicer to have a solution that doesn’t require a custom version of the library.

The way that Hive makes sure that only the latest version of an item is available is by adding each frame to a dictionary based on the frame’s key. Newer frames automatically overwrite older frames, so only the final value is kept. To make sure values don’t get overwritten, let’s just make sure that each frame key is unique by changing it if the key has already been used. Similarly, if we rename delete frames, they will not overwrite the old value either.

When we rename the key of a frame, we need to update the size of the frame and update the CRC32 checksum at the end so that Hive can still load the modified box. The following code copies a given box to a temporary location and updates all the frames to have unique names. It uses the Crc32 class which was copied from the source of the Hive framework so that we can be sure the logic is consistent:

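import 'dart:io';
import 'dart:math';
import 'dart:typed_data';

import 'package:hive/hive.dart';

// Crc32 is the class copied from the Hive source, as described above.
Future<File> recoverHive(String originalFile, HiveAesCipher? cipher) async {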
  var filePath = await copyFileToTemp(originalFile);
  var file = File(filePath);
  var bytes = await file.readAsBytes();
  int offset = 0;
  var allFrames = BytesBuilder();
  var keyNames = <String, int>{};
  var keyInts = [];

  while (offset < bytes.length) {
    var frameLength = ByteData.sublistView(bytes, offset, offset + 4)
                              .getUint32(0, Endian.little);
    var keyOffset = offset + 4; // Skip frame length
    var endOffset = offset + frameLength;
    if (bytes.length > keyOffset + 2) {

      Uint8List newKey;
      int frameResize;
      int keyLength;
      if(bytes[keyOffset] == 0x01){
        // Key is String
        keyLength = bytes[keyOffset + 1];
        var keyBytes = bytes.sublist(keyOffset + 2, keyOffset + 2 + keyLength);
        var keyName = String.fromCharCodes(keyBytes);

         if (keyNames.containsKey(keyName)) {
            keyNames[keyName] = keyNames[keyName]! + 1;
            keyName = "${keyName}_${keyNames[keyName]}";
          } else {
            keyNames[keyName] = 1;
          }
          var modifiedKeyBytes = Uint8List.fromList(keyName.codeUnits);
          var modifiedKeyLength = modifiedKeyBytes.length;

          // get bytes for TYPE + LENGTH + VALUE
          var bb = BytesBuilder();
          bb.addByte(0x01);
          bb.addByte(modifiedKeyLength);
          bb.add(modifiedKeyBytes);
          newKey = bb.toBytes();
          frameResize = modifiedKeyLength - keyLength;
          keyLength += 2; // add the length of the type
      }
      else{
        // Key is int
        keyLength = 5; // type + uint32
        var keyIndexOffset = keyOffset + 0x01;
        var keyInt = ByteData.sublistView(bytes, keyIndexOffset, keyIndexOffset + 4)
                              .getUint32(0, Endian.little);

        while(keyInts.contains(keyInt)){
          keyInt += 1;
        }
        keyInts.add(keyInt);

        var index = ByteData(4)..setUint32(0, keyInt, Endian.little);
        
        // get bytes for TYPE + index
        var bb = BytesBuilder();
        bb.addByte(0x00);
        bb.add(index.buffer.asUint8List());
        newKey = bb.toBytes();
        frameResize = 0;
      }

      // If there is no value, it's a delete frame, so we don't add it again
      if(frameLength == keyLength + 8){ // 4 bytes CRC, 4 bytes frame length
        offset = endOffset;
        print("Dropping delete frame for " + newKey.toString());
        continue;
      }
      
      // Calculate new length of frame
      frameLength += frameResize;

      // Create a new frame bytes builder
      var frameBytes = BytesBuilder();
      
      // Prepare the frame length in ByteData and add it to the frame
      var frameLengthData = ByteData(4)..setUint32(0, frameLength, Endian.little);
      frameBytes.add(frameLengthData.buffer.asUint8List());

      // Add the new key
      frameBytes.add(newKey);
      
      // Add the rest of the frame after the original key. Don't include the CRC
      frameBytes.add(bytes.sublist(keyOffset + keyLength, endOffset-4));

      // Compute CRC using Hive's Crc32 class
      var newCrc = Crc32.compute(
        frameBytes.toBytes(),
        offset: 0,
        length: frameLength - 4,
        crc: cipher?.calculateKeyCrc() ?? 0,
      );

      // Write Crc code
      var newCrcBytes = Uint8List(4)..buffer.asByteData()
                                            .setUint32(0, newCrc, Endian.little);
      frameBytes.add(newCrcBytes);

      // Update the overall frames with the modified frame
      allFrames.add(frameBytes.toBytes());
    }

    offset = endOffset; // Move to the next frame
  }

  var reconstructedBytes = allFrames.takeBytes();

  try {
    await file.writeAsBytes(reconstructedBytes);
    print('Bytes successfully written to temporary file: ${file.path}');
  } catch (e) {
    print('Failed to write bytes to temporary file: $e');
  }
  return file;
}
Future<String> copyFileToTemp(String sourcePath) async {
  var sourceFile = File(sourcePath);
  // Generate a random subfolder name
  var rng = Random();
  var tempSubfolderName = "temp_${rng.nextInt(10000)}"; // Random subfolder name
  var tempDir = Directory.systemTemp.createTempSync(tempSubfolderName);
  
  // Create a File instance for the destination file in the new subfolder
  var tempFile = File('${tempDir.path}/${sourceFile.uri.pathSegments.last}');

  try {
    await sourceFile.copy(tempFile.path);
    print('File copied successfully to temporary directory: ${tempFile.path}');
  } catch (e) {
    print('Failed to copy file to temporary directory: $e');
  }
  return tempFile.path;
}
Dart
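The renaming rule is the heart of recoverHive, so here it is in isolation as a small Python sketch. It follows the same logic as the Dart code (string keys get a counter suffix from their second occurrence onwards, int keys are bumped until they are unused), but the function name and the example keys are made up for illustration.

```python
def uniquify_keys(keys):
    """Rename repeated keys so that no frame overwrites an earlier one.
    Strings get a _N suffix, ints are bumped to the next unused value."""
    seen_names = {}
    seen_ints = set()
    renamed = []
    for key in keys:
        if isinstance(key, int):
            while key in seen_ints:
                key += 1
            seen_ints.add(key)
        elif key in seen_names:
            seen_names[key] += 1
            key = "%s_%d" % (key, seen_names[key])
        else:
            seen_names[key] = 1
        renamed.append(key)
    return renamed


keys = ["anotherString", "test", "anotherString", 0, 0, "anotherString"]
print(uniquify_keys(keys))
```

Two limitations are worth keeping in mind. A generated name such as anotherString_2 could clash with a real key of the same name, and a bumped int key can take the slot of a real key that appears later, which then gets bumped itself. For a forensic viewer that is acceptable, since every value still ends up visible, but the displayed keys should not be trusted blindly. A renamed string key also changes the frame size by the difference in key length, which is why the Dart code adjusts frameLength before recomputing the CRC.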

Putting it all together

Now that we can recover deleted items, read encrypted boxes and view custom objects, let’s put it all together. The target box is created as follows:

final ultimateBox = await Hive.openBox('ultimateBox', 
                                    encryptionCipher: HiveAesCipher(hiveKey));
ultimateBox.add(123);
ultimateBox.add(456);
ultimateBox.deleteAt(1);
ultimateBox.put("myString", "Hello World");
ultimateBox.put("anotherString", "String2");
ultimateBox.add("Something");
ultimateBox.delete("myString");
ultimateBox.add(Bee(age: 12, name: "Barry"));
ultimateBox.put("test", 99999);
ultimateBox.put("anotherString", 200);
Dart

Reading is straightforward: we only need to specify the key and register the GenericAdapter:

// Register the GenericAdapter for all available typeIds
for (var i = 0; i <= 223; i++)
{
  Hive.registerAdapter(GenericAdapter(i));
}
// Decode password and open box
var passwordBytes = base64.decode(password);
var encryptionCipher = HiveAesCipher(passwordBytes);
box = await Hive.openBox<dynamic>(boxName, path: directory, 
                                  encryptionCipher: encryptionCipher);
Dart

Finally, we can create a small UI around this functionality so that we can easily view all of the frames. In the screenshot below, we can see that none of the data is deleted, and old values (anotherString > String2) are still visible. The source code for this app can be found here.

Conclusion

It’s always faster to use an available library than create a solution yourself, but for security-critical applications it’s very important to fully understand the libraries you’re using. As we saw above, the Hive framework:

  • Keeps old values in the box until it is compacted
  • Only encrypts values, not keys

In this case, the documentation is clear on both facts, so it’s not really a security vulnerability. However, developers should be aware of the correct way to use the Hive framework in case any type of sensitive information is stored.

Finally, the fact that we don’t have access to the source code doesn’t stop us from identifying weaknesses, it just takes more time to reverse engineer the application/frameworks and develop custom tooling.

Jeroen Beckers

