🔒
There are new articles available, click to refresh the page.
Before yesterdayNCC Group Research

Detecting DNS implants: Old kitten, new tricks – A Saitama Case Study 

11 August 2022 at 15:20

Max Groot & Ruud van Luijk

TL;DR

A recently uncovered malware sample dubbed ‘Saitama’ was uncovered by security firm Malwarebytes in a weaponized document, possibly targeted towards the Jordan government. This Saitama implant uses DNS as its sole Command and Control channel and utilizes long sleep times and (sub)domain randomization to evade detection. As no server-side implementation was available for this implant, our detection engineers had very little to go on to verify whether their detection would trigger on such a communication channel. This blog documents the development of a Saitama server-side implementation, as well as several approaches taken by Fox-IT / NCC Group’s Research and Intelligence Fusion Team (RIFT) to be able to detect DNS-tunnelling implants such as Saitama.

Introduction

For its Managed Detection and Response (MDR) offering, Fox-IT is continuously building and testing detection coverage for the latest threats. Such detection efforts vary across all tactics, techniques, and procedures (TTP’s) of adversaries, an important one being Command and Control (C2). Detection of Command and Control involves catching attackers based on the communication between the implants on victim machines and the adversary infrastructure.  

In May 2022, security firm Malwarebytes published a two1-part2 blog about a malware sample that utilizes DNS as its sole channel for C2 communication. This sample, dubbed ‘Saitama’, sets up a C2 channel that tries to be stealthy using randomization and long sleep times. These features make the traffic difficult to detect even though the implant does not use DNS-over-HTTPS (DoH) to encrypt its DNS queries.  

Although DNS tunnelling remains a relatively rare technique for C2 communication, it should not be ignored completely. While focusing on Indicators of Compromise (IOC’s) can be useful for retroactive hunting, robust detection in real-time is preferable. To assess and tune existing coverage, a more detailed understanding of the inner workings of the implant is required. This blog will use the Saitama implant to illustrate how malicious DNS tunnels can be set-up in a variety of ways, and how this variety affects the detection engineering process.  

To assist defensive researchers, this blogpost comes with the publication of a server-side implementation of Saitama. This can be used to control the implant in a lab environment. Moreover, ‘on the wire’ recordings of the implant that were generated using said implementation are also shared as PCAP and Zeek logs. This blog also details multiple approaches towards detecting the implant’s traffic, using a Suricata signature and behavioural detection. 

Reconstructing the Saitama traffic 

The behaviour of the Saitama implant from the perspective of the victim machine has already been documented elsewhere3. However, to generate a full recording of the implant’s behaviour, a C2 server is necessary that properly controls and instructs the implant. Of course, the source code of the C2 server used by the actual developer of the implant is not available. 

If you aim to detect the malware in real-time, detection efforts should focus on the way traffic is generated by the implant, rather than the specific domains that the traffic is sent to. We strongly believe in the “PCAP or it didn’t happen” philosophy. Thus, instead of relying on assumptions while building detection, we built the server-side component of Saitama to be able to generate a PCAP. 

The server-side implementation of Saitama can be found on the Fox-IT GitHub page. Be aware that this implementation is a Proof-of-Concept. We do not intend on fully weaponizing the implant “for the greater good”, and have thus provided resources to the point where we believe detection engineers and blue teamers have everything they need to assess their defences against the techniques used by Saitama.  

Let’s do the twist

The usage of DNS as the channel for C2 communication has a few upsides and quite some major downsides from an attacker’s perspective. While it is true that in many environments DNS is relatively unrestricted, the protocol itself is not designed to transfer large volumes of data. Moreover, the caching of DNS queries forces the implant to make sure that every DNS query sent is unique, to guarantee the DNS query reaches the C2 server.  

For this, the Saitama implant relies on continuously shuffling the character set used to construct DNS queries. While this shuffle makes it near-impossible for two consecutive DNS queries to be the same, it does require the server and client to be perfectly in sync for them to both shuffle their character sets in the same way.  

On startup, the Saitama implant generates a random number between 0 and 46655 and assigns this to a counter variable. Using a shared secret key (“haruto” for the variant discussed here) and a shared initial character set (“razupgnv2w01eos4t38h7yqidxmkljc6b9f5”), the client encodes this counter and sends it over DNS to the C2 server. This counter is then used as the seed for a Pseudo-Random Number Generator (PRNG). Saitama uses the Mersenne Twister to generate a pseudo-random number upon every ‘twist’. 

Function used by Saitama client to convert an integer into an encoded string

To encode this counter, the implant relies on a function named ‘_IntToString’. This function receives an integer and a ‘base string’, which for the first DNS query is the same initial, shared character set as identified in the previous paragraph. Until the input number is equal or lower than zero, the function uses the input number to choose a character from the base string and prepends that to the variable ‘str’ which will be returned as the function output. At the end of each loop iteration, the input number is divided by the length of the baseString parameter, thus bringing the value down. 

To determine the initial seed, the server has to ‘invert’ this function to convert the encoded string back into its original number. However, information gets lost during the client-side conversion because this conversion rounds down without any decimals. The server tries to invert this conversion by using simple multiplication. Therefore, the server might calculate a number that does not equal the seed sent by the client and thus must verify whether the inversion function calculated the correct seed. If this is not the case, the server literately tries higher numbers until the correct seed is found.   

Once this hurdle is taken, the rest of the server-side implementation is trivial. The client appends its current counter value to every DNS query sent to the server. This counter is used as the seed for the PRNG. This PRNG is used to shuffle the initial character set into a new one, which is then used to encode the data that the client sends to the server. Thus, when both server and client use the same seed (the counter variable) to generate random numbers for the shuffling of the character set, they both arrive at the exact same character set. This allows the server and implant to communicate in the same ‘language’. The server then simply substitutes the characters from the shuffled alphabet back into the ‘base’ alphabet to derive what data was sent by the client.  

Server-side implementation to arrive at the same shuffled alphabet as the client

Twist, Sleep, Send, Repeat

Many C2 frameworks allow attackers to manually set the minimum and maximum sleep times for their implants. While low sleep times allow attackers to more quickly execute commands and receive outputs, higher sleep times generate less noise in the victim network. Detection often relies on thresholds, where suspicious behaviour will only trigger an alert when it happens multiple times in a certain period.  

The Saitama implant uses hardcoded sleep values. During active communication (such as when it returns command output back to the server), the minimum sleep time is 40 seconds while the maximum sleep time is 80 seconds. On every DNS query sent, the client will pick a random value between 40 and 80 seconds. Moreover, the DNS query is not sent to the same domain every time but is distributed across three domains. On every request, one of these domains is randomly chosen. The implant has no functionality to alter these sleep times at runtime, nor does it possess an option to ‘skip’ the sleeping step altogether.  

Sleep configuration of the implant. The integers represent sleep times in milliseconds

These sleep times and distribution of communication hinder detection efforts, as they allow the implant to further ‘blend in’ with legitimate network traffic. While the traffic itself appears anything but benign to the trained eye, the sleep times and distribution bury the ‘needle’ that is this implant’s traffic very deep in the haystack of the overall network traffic.  

For attackers, choosing values for the sleep time is a balancing act between keeping the implant stealthy while keeping it usable. Considering Saitama’s sleep times and keeping in mind that every individual DNS query only transmits 15 bytes of output data, the usability of the implant is quite low. Although the implant can compress its output using zlib deflation, communication between server and client still takes a lot of time. For example, the standard output of the ‘whoami /priv’ command, which once zlib deflated is 663 bytes, takes more than an hour to transmit from victim machine to a C2 server. 

Transmission between server implementation and the implant

The implant does contain a set of hardcoded commands that can be triggered using only one command code, rather than sending the command in its entirety from the server to the client. However, there is no way of knowing whether these hardcoded commands are even used by attackers or are left in the implant as a means of misdirection to hinder attribution. Moreover, the output from these hardcoded commands still has to be sent back to the C2 server, with the same delays as any other sent command. 

Detection

Detecting DNS tunnelling has been the subject of research for a long time, as this technique can be implemented in a multitude of different ways. In addition, the complications of the communication channel force attackers to make more noise, as they must send a lot of data over a channel that is not designed for that purpose. While ‘idle’ implants can be hard to detect due to little communication occurring over the wire, any DNS implant will have to make more noise once it starts receiving commands and sending command outputs. These communication ‘bursts’ is where DNS tunnelling can most reliably be detected. In this section we give examples of how to detect Saitama and a few well-known tools used by actual adversaries.  

Signature-based 

Where possible we aim to write signature-based detection, as this provides a solid base and quick tool attribution. The randomization used by the Saitama implant as outlined previously makes signature-based detection challenging in this case, but not impossible. When actively communicating command output, the Saitama implant generates a high number of randomized DNS queries. This randomization does follow a specific pattern that we believe can be generalized in the following Suricata rule: 

alert dns $HOME_NET any -> any 53 (msg:"FOX-SRT - Trojan - Possible Saitama Exfil Pattern Observed"; flow:stateless; content:"|00 01 00 00 00 00 00 00|"; byte_test:1,>=,0x1c,0,relative; fast_pattern; byte_test:1,<=,0x1f,0,relative; dns_query; content:"."; content:"."; distance:1; content:!"."; distance:1; pcre:"/^(?=[0-9]+[a-z]\|[a-z]+[0-9])[a-z0-9]{28,31}\.[^.]+\.[a-z]+$/"; threshold:type both, track by_src, count 50, seconds 3600; classtype:trojan-activity; priority:2; reference:url, https://github.com/fox-it/saitama-server; metadata:ids suricata; sid:21004170; rev:1;) 

This signature may seem a bit complex, but if we dissect this into separate parts it is intuitive given the previous parts. 

Content Match  Explanation 
00 01 00 00 00 00 00 00  DNS query header. This match is mostly used to place the pointer at the correct position for the byte_test content matches. 
byte_test:1,>=,0x1c,0,relative;  Next byte should be at least decimal 25. This byte signifies the length of the coming subdomain 
byte_test:1,<=,0x1f,0,relative;  The same byte as the previous one should be at most 31. 
dns_query; content:”.”; content:”.”; distance:1; content:!”.”;  DNS query should contain precisely two ‘.’ characters 
pcre:”/^(?=[0-9][a-z]|[a-z][0-9])[a-z0-9] {28,31} 
\.[^.]\.[a-z]$/”; 
Subdomain in DNS query should contain at least one number and one letter, and no other types of characters.
threshold:type both, track by_src, count 50, seconds 3600  Only trigger if there are more than 50 queries in the last 3600 seconds. And only trigger once per 3600 seconds. 
Table one: Content matches for Suricata IDS rule

 
The choice for 28-31 characters is based on the structure of DNS queries containing output. First, one byte is dedicated to the ‘send and receive’ command code. Then follows the encoded ID of the implant, which can take between 1 and 3 bytes. Then, 2 bytes are dedicated to the byte index of the output data. Followed by 20 bytes of base-32 encoded output. Lastly the current value for the ‘counter’ variable will be sent. As this number can range between 0 and 46656, this takes between 1 and 5 bytes. 

Behaviour-based 

The randomization that makes it difficult to create signatures is also to the defender’s advantage: most benign DNS queries are far from random. As seen in the table below, each hack tool outlined has at least one subdomain that has an encrypted or encoded part. While initially one might opt for measuring entropy to approximate randomness, said technique is less reliable when the input string is short. The usage of N-grams, an approach we have previously written about4, is better suited.  

Hacktool  Example 
DNScat2  35bc006955018b0021636f6d6d616e642073657373696f6e00.domain.tld5 
Weasel  pj7gatv3j2iz-dvyverpewpnnu–ykuct3gtbqoop2smr3mkxqt4.ab.abdc.domain.tld 
Anchor  ueajx6snh6xick6iagmhvmbndj.domain.tld6 
Cobalt Strike  Api.abcdefgh0.123456.dns.example.com or   post. 4c6f72656d20697073756d20646f6c6f722073697420616d65742073756e74207175697320756c6c616d636f20616420646f6c6f7220616c69717569702073756e7420636f6d6d6f646f20656975736d6f642070726.c123456.dns.example.com 
Sliver  3eHUMj4LUA4HacKK2yuXew6ko1n45LnxZoeZDeJacUMT8ybuFciQ63AxVtjbmHD.fAh5MYs44zH8pWTugjdEQfrKNPeiN9SSXm7pFT5qvY43eJ9T4NyxFFPyuyMRDpx.GhAwhzJCgVsTn6w5C4aH8BeRjTrrvhq.domain.tld 
Saitama  6wcrrrry9i8t5b8fyfjrrlz9iw9arpcl.domain.tld 
Table two: Example DNS queries for various toolings that support DNS tunnelling

Unfortunately, the detection of randomness in DNS queries is by itself not a solid enough indicator to detect DNS tunnels without yielding large numbers of false positives. However, a second limitation of DNS tunnelling is that a DNS query can only carry a limited number of bytes. To be an effective C2 channel an attacker needs to be able to send multiple commands and receive corresponding output, resulting in (slow) bursts of multiple queries.  

This is where the second step for behaviour-based detection comes in: plainly counting the number of unique queries that have been classified as ‘randomized’. The specifics of these bursts differ slightly between tools, but in general, there is no or little idle time between two queries. Saitama is an exception in this case. It uses a uniformly distributed sleep between 40 and 80 seconds between two queries, meaning that on average there is a one-minute delay. This expected sleep of 60 seconds is an intuitive start to determine the threshold. If we aggregate over an hour, we expect 60 queries distributed over 3 domains. However, this is the mean value and in 50% of the cases there are less than 60 queries in an hour.  

To be sure we detect this, regardless of random sleeps, we can use the fact that the sum of uniform random observations approximates a normal distribution. With this distribution we can calculate the number of queries that result in an acceptable probability. Looking at the distribution, that would be 53. We use 50 in our signature and other rules to incorporate possible packet loss and other unexpected factors. Note that this number varies between tools and is therefore not a set-in-stone threshold. Different thresholds for different tools may be used to balance False Positives and False Negatives. 

In summary, combining detection for random-appearing DNS queries with a minimum threshold of random-like DNS queries per hour, can be a useful approach for the detection of DNS tunnelling. We found in our testing that there can still be some false positives, for example caused by antivirus solutions. Therefore, a last step is creating a small allow list for domains that have been verified to be benign. 

While more sophisticated detection methods may be available, we believe this method is still powerful (at least powerful enough to catch this malware) and more importantly, easy to use on different platforms such as Network Sensors or SIEMs and on diverse types of logs. 

Conclusion

When new malware arises, it is paramount to verify existing detection efforts to ensure they properly trigger on the newly encountered threat. While Indicators of Compromise can be used to retroactively hunt for possible infections, we prefer the detection of threats in (near-)real-time. This blog has outlined how we developed a server-side implementation of the implant to create a proper recording of the implant’s behaviour. This can subsequently be used for detection engineering purposes. 

Strong randomization, such as observed in the Saitama implant, significantly hinders signature-based detection. We detect the threat by detecting its evasive method, in this case randomization. Legitimate DNS traffic rarely consists of random-appearing subdomains, and to see this occurring in large bursts to previously unseen domains is even more unlikely to be benign.  

Resources

With the sharing of the server-side implementation and recordings of Saitama traffic, we hope that others can test their defensive solutions.  

The server-side implementation of Saitama can be found on the Fox-IT GitHub.  

This repository also contains an example PCAP & Zeek logs of traffic generated by the Saitama implant. The repository also features a replay script that can be used to parse executed commands & command output out of a PCAP. 

References

[1] https://blog.malwarebytes.com/threat-intelligence/2022/05/apt34-targets-jordan-government-using-new-saitama-backdoor/ 
[2] https://blog.malwarebytes.com/threat-intelligence/2022/05/how-the-saitama-backdoor-uses-dns-tunnelling/ 
[3] https://x-junior.github.io/malware%20analysis/2022/06/24/Apt34.html
[4] https://blog.fox-it.com/2019/06/11/using-anomaly-detection-to-find-malicious-domains/   

Wheel of Fortune Outcome Prediction – Taking the Luck out of Gambling

16 August 2022 at 19:50

Authored by: Jesús Miguel Calderón Marín

Introduction

Two years ago I carried out research into online casino games specifically focusing on roulette. As a result, I composed a detailed guide with information on classification of online roulette, potential vulnerabilities and the ways to detect them[1].

Although this guideline was particularly well-received by the security community, I felt that it was too theoretical and lacked a real-world example of a vulnerable casino game.
With this, I decided to carry out research on a real casino game in search of new vulnerabilities and exploit techniques. In case of success, I planned to share the results with the affected vendor[2] and afterwards with the community.

While I was looking for a target I had a look on a particular variant of the casino game ‘Wheel of fortune’. The wheel is spun manually by a croupier and not by any automated system. That caught my eye and made me think about the randomness of the winning numbers. Typically, pseudo random number generators (PRNGs) are one of the main targets when it comes to game security assessments. However, there is no a PRNG in this case. Apparently, the randomness relies on the number of times the croupier spins the wheel, which, in turn, depends on their arm strength among other properties. The question that immediately came to my mind was whether a croupier is a good ‘PRNG’?

Summary

IMPORTANT NOTE. For security reasons and in order to keep confidential the identity of the vendor and game affected, some data has been redacted or omitted and the name of the game was changed to a generic one (Big Six). In addition, screenshots of the real wheel and croupiers have been substituted by similar images specially created for this purpose.

Bix Six is a casino game based on Wheel of Fortune game. Briefly, it is a big vertical wheel where the player bets on the number it will stop on [3].

According to this security analysis, the outcome of the Big Six game is predictable enough in order that the house edge could be overcome and consequently a profit could be made in the long run. Generally speaking, croupiers unconsciously tend to spin the wheel a specific number of times hence the dispersion of the number of spins is too small. Consequently, some positions of the wheel had higher chances of winning and a player could benefit by betting on these positions.

Table of Contents

The rules

The wheel is comprised of 54 segments. The possible outcomes on the wheel are 1, 2, 5, 10, 20, 40, multiplier 1 (M1) and multiplier 2 (M2).

Figure 1 – Bix Six Wheel

Players bet on a number they think the wheel will land on and then the croupier spins the wheel. The bets must be placed within the table limits, which are shown on the screen. The colour around the countdown indicates when players can place bets (green), when betting time is nearly over (amber) and when no further bets can be placed for the current round (no countdown).

Figure 2 – Phases of the betting round

It is worth mentioning that the croupier starts spinning the wheel before the betting time is over and continues doing it for several seconds once the betting time is over and the betting panel is no longer available.

Some spins after, the winning number is determined and pay-outs are made on winning bets.

Figure 3 – Winning segment indicated by the leather pointer at the top of the wheel

Odds and pay-outs

The wheel can stop on the numbers 1, 2, 5, 10, 20 and 40. The pay-out of each segment is a bet multiplied by its number plus the stake. For example, if a player bets 15 pounds on number 10 and this turns out to be the winning number, the player is paid 165 pounds (15 x 10 + 15).

The segments M1 and M2 are multiplier segments, which makes the game more interesting. If the wheel stops on any of them, new bets are not accepted, and the wheel is spun again. However, any wins on the next spin are multiplied by [*REDACTED*] or [*REDACTED*], according to the multiplier the wheel stops on in the original spin. If the wheel stops on two or more multipliers, the final win is increased by as many times as all the multipliers before indicate.

The table below shows the number of stops, pay-out and house edge for each possible outcome:

Table 1 – Odds and pay-outs

Wheel tracker

A script was developed to record the behaviour and the outcome of the Bix Six online game. The obtained data included inter alia, initial speeds of the wheel, croupiers and winning numbers.

7,278 hands were recorded in April 2021 and subsequently analysed. The figures below show some of those hands.

Figure 4 – Tracked hands

Among most relevant data for analysis, the following is included:

  • winningNumber – the winning number displayed on the wheel as an outcome of every hand (1, 2, 5, 10, 20, 40, M1, o M2).
Figure 5 – Winning Number ‘2’
  • AbsolutePosition – unique number to identify unambiguously every segment. E.g. the yellow segment has the absolute position 0. This does not vary from hand to hand unlike relative positions (see the definition below).
Figure 6 – Segments’ identifiers (absolute positions 0, 18, 30, 43)
  • winningAbsolutePosition – the absolute position of the segment of the winning number. The following picture shows the winning number 40, which has the absolute position 0.
Figure 7 – Winning Absolute Position ‘0’
  • direction – the direction in which the wheel is spun. The value assigned to it is whether ‘CLOCKWISE’ or ‘ANTICLOCKWISE’.
  • positions_run – identical to the number of the wheel spins multiplied by 54 (the number of segments the wheel is divided into). For instance, if the wheel spins 1.5 times, the value of this variable will be 81 (1.5 * 54).
  • HAND_TIME (Initial position) – The moment in the video when the hand starts (e.g. 35.2 seconds from the beginning of the recording). This coincides with the instant before the betting panel is disabled and no longer available until the next game (specifically 0.5 seconds before). The position of the wheel at this moment will be referred to as the initial position from now on.
Figure 8 – Initial position – instant before the betting panel is disabled
Figure 9 – Instant when the betting panel is disabled
  • Relative positions – unique numbers to identify the segments of the wheel which are assigned at the initial position beginning from the segment on the top (position 1), followed by the next segment (position 2), etc. The next segment is on its left if the direction is clockwise or on its right if the direction is anticlockwise.
Figure 10 – Relative positions assigned at the initial position
  • winningRelativePosition – defines the relative position of the segment containing the winning number. It can be calculated using the following formula: round(positions_run % 54, 0) + 1. E.g. in the figures below, the blue segment that is in the relative position number 10 is the winning one. Therefore, the winning relative position is 10 for this hand.
Figure 11 – Initial position – Relative positions
Figure 12 – Final position – Winning relative position ‘10’

Wheel behaviour analysis

The values of the variables ‘winningAbsolutePosition’, ‘winningNumber’ and ‘winningRelativePosition’ have been analysed to establish the fact that they are random. In order to do this, the chi-square test of independence have been used to ascertain whether the difference between the analysed numbers distribution and the expected distribution is attributed to good luck or, on the contrary, to the lack of randomness, which could be eventually exploited by a malicious player. Should any further information about the method be required, the reference added to this document could be consulted [1][2].

Variables winningNumber and winningAbsolutePosition

The variables ‘winningNumber’ and ‘winningAbsolutePosition’ have successfully passed the test. Particularly, in case of ‘winningNumber’ the chi-squared statistic was 4.48. The critical value for the distribution chi-square with 7 degrees of freedom and the level of significance of 1% is 18.47[3]. As the critical value is significantly higher than chi-squared statistic (4.48), it is impossible to state that winning numbers are not random.

Similarly, the statistics for ‘winningAbsolutePosition’ was 32.18, which is much less than the critical value 79.84 (53 degrees of freedom and a level of significance of 1%). This implies that it cannot be stated that there is difference in size of the segments or the wheel is biased.

Variable winningRelativePosition

However, as for the parameter ‘winningRelativePosition’, it is notable that some positions win more frequently than others do, which could make it possible for a player to overcome the house edge and benefit from it. According to the collected data, the chi-squared statistic is 90.75 and exceeds the critical value 79.84 (53 degrees of freedom and a level of significance of 1%). In addition, the p-value (probability of obtaining test results at least as extreme as the results actually observed) [4] is 0.095%. These results suggest that the parameter ‘winningRelativePosition’ is far from being random.

Table 2 – Chi Squared – winningRelativePosition

The table below shows that p-value is even lower in winning relative positions for hands with clockwise direction, particularly 0.00000014%.

Table 3 – Chi Squared – winningRelativePosition – Clockwise

Simultaneous confidence intervals [5][6] were calculated for this last sample to ultimately know the maximum and minimum potential benefit which a player would be able to gain. In order to work this out, the Wilson score method was used with a confidence level of 90%.

It was estimated that a player has a probability of 2.15% of winning betting on the position 29 in the worst of cases. This probability considerably exceeds the expected value (1.851%) and implies a significant advantage for the player.

Table 4 -Confidence Intervals (Wilson method – 90% confidence) for winning relative positions – CLOCKWISE

Exploiting lack of randomness on winning relative positions

In order to exploit the lack of randomness of winning relative positions, betting strategies have to be designed. The following two sections include betting strategies designed for clockwise and anticlockwise games, and the analysis of their efficiency in comparison with other strategies.

Betting Strategies

A very simple winning betting strategy consists in betting on number 40 if the segment (there is only one segment with number 40) is in the relative position 29 and the wheel direction is clockwise. The following shows an example of how this strategy works. 

The image below shows the initial position of the wheel (this coincides with the instant before the betting panel is disabled and no longer available until the next game). Number 40 is in the relative position 8 but not in the relative position 29. Therefore, this game would be ignored, and no bets should be made.

Figure 13 – Initial position – Segment 40 is on the relative position 8

In the following initial position, number 40 is in the relative position 29. Therefore, a bet should be made on number 40.

Figure 14-  Initial position – Segment 40 is on the relative position 29

It is worth mentioning that the bets would need to be made in an automated way using a script because such tasks as identifying the number positioned in a specific relative position, and making (or not making) a bet within 0.5 seconds, are not possible to do manually.

According to the simultaneous confidence intervals calculated previously, the probability of winning would be between 2.15% and 3.01% (without taking into account the M2 and M1 segments), which considerably exceeds the expected value (1/54 = 0.0185 = 1.85%).

Taking into account the aforementioned probabilities and assuming that:

  • the wheel stops on the segments ‘M1’ and ‘M2’ with probabilities 1.9% and 1.4% in the worst of the cases, and 2.71% and 2.12% in the best of the cases.
  • all the segments have equal probability of winning if previously the wheel stopped on ‘M2’ or ‘M1’
  • the size of the bet is always 1€ and the winning quantity limit is 500.000€

it was estimated that the player could obtain a return on betting that would range from 0.56% to 41.80% using this strategy. For instance, a player would win a minimum of 5.6€ and a maximum of 418€ per every 1000€ bet, with approximately 90% of confidence.

Notably, this strategy might require long time to obtain a ‘worthy’ benefit as most of the games are discarded because bets are only placed when number 40 is in the relative position 29.

As a proof of concept, a more complex betting strategy was designed based on the estimated probabilities and expected ROI. It will be referred to as ‘MY BETTTING STRATEGY’ from now on.

Table 5 – My betting strategy

Depending on the direction (CLOCKWISE and ANTICLOCKWISE), the strategies are different.

The columns ‘BEST NUMBER TO BET ON’ contain the numbers which the player should bet on and the columns ‘RELATIVE POSITION OF NUMBER 40’ indicate the relative position of number 40.

For example, if the wheel is spinning clockwise and the relative position of the segment 40 is 7 (see the image below), the player should not bet on any number.

Figure 15 – Initial position – Segment 40 is on the relative position 7

However, if the wheel is spinning clockwise and the relative position of the segment 40 is 39, the player should bet on number 10 according to the strategy (see the following image and table).

Figure 16 – Initial position – Segment 40 is on the relative position 39
Table 6 – Excerpt from the ‘My betting strategy’ table

Analysing the effectiveness of betting strategies

A computer simulation of a fictitious player following ‘MY BETTING STRATEGY’ described in the previous section was run using the sample of 7.278 games (Figure 4 – Tracked hands).

For the simulation, it was assumed that:

  • all the segments have equal probability of winning if previously the wheel stopped on ‘M2’ or ‘M1’
  • as the winning numbers after the wheel stopping on ‘M2’ and ‘M1’ were not tracked by the script, the expected ROI was returned when the ‘winning segment’ was either ‘M2’ or ‘M1’. For instance, if following the strategy one euro is bet is on number 10 and the wheel stops on the segment ‘M2’, the total balance will be increased by [*REDACTED*] as this quantity is the expected ROI over the long run.
  • the size of the bet is always 1€ and the winning quantity limit is 500.000€

 The following table shows the results:

Table 7 – ROI of ‘My betting strategy’

Noticed that not all the games were played. E.g. for the ‘CLOCKWISE’ direction, 1,102 out of 3,646 games were played, which means that 2,544 were ignored, as they were not profitable according to the strategy.

The balance shows the winnings (positive in both cases) and the column ‘ROI’ indicates the average money per played hand, which the player made. In other words, ROI = 100 * ‘BALANCE’ / ‘GAMES PLAYED’.

In order to determine the effectiveness of the betting strategy, the probability of obtaining a return greater than or equal to the returns obtained was worked out. Specifically, a bootstrap[7] analysis was performed to estimate the distribution of returns for the following losing strategies:

  • RANDOM strategy consists in betting on any number (1, 2, 5, 10, 20 or 40) randomly.
  • ALWAYS 10 strategy consists in always betting on number 10. This is a very interesting strategy to compare with ‘MY BETTING STRATEGY’, as number 10 has the lowest house edge among all the numbers, [*REDACTED*]% (see Odds and pay-outs). Therefore, ‘ALWAYS 10’ strategy is supposed to be the best strategy as it allows minimising the losses per hand.

It is worth mentioning that Monte Carlo[8] analysis was performed as well, which yielded very similar results.

The following table shows the results of the analysis:

Table 8 – Results of the comparison between the strategies ‘RANDOM’, ‘ALWAYS 10’ and ‘MY BETTING STRATEGY’

As it can be observed, the probability of obtaining a return greater than or equal to ‘MY BETTING STRATEGY’ with the ‘random’ and ‘always 10’ betting strategies (across 1,102 and 303 games respectively) is less than 1%. This result suggests that the high effectiveness of ‘MY BETTING STRATEGY’ is far from being by luck.

The following graphs visually illustrate the effectiveness of ‘MY BETTING STRATEGY’ for the CLOCKWISE direction in comparison with the ‘random’ and ‘always 10’ betting strategies. A thousand games were simulated.

Figure 17 – MY BETTING STRATEGY vs ALWAYS 10 strategy
Figure 18 – MY BETTING STRATEGY strategy vs RANDOM strategy

It is noteworthy that better strategies could be worked out. However, they were not explored as exploiting the lack of randomness in an efficient way was not the aim of this analysis but highlighting the fact that the house edge could be overcome.

Other Considerations

It is worth mentioning that no intrusive tests were conducted during this research. Additionally, it was not necessary to make any bets to detect or proof the potential vulnerability described in this document. The interaction with the game was limited to record videos of the wheel, which were analysed afterwards.

Other online games were found to be similar to Bix Six. Therefore, these games might be vulnerable as well.

Recommendations

It is recommended to make the necessary changes to the game in order to generate random winning relative positions. This way, it will not be possible to overcome the house edge and make profit in the long run.

The best and safest solution (probably, the most expensive to implement as well) is to replace the croupiers by hardware that randomly generates the outcome and spins the wheel with the necessary and exact strength to show the previously determined number as the winning number.

Other solution might consist in increasing the difference between the minimum and maximum number of wheel spins. According to the observations, the croupiers currently spin the wheel approximately between 2.7 times (150 segments) and 4.7 times (258 segments). This means a difference of only two wheel spins (4.7 – 2.7 = 2). Additionally, it was observed that the croupiers unconsciously tend to spin the wheel a specific number of times. Particularly, a number between 3.56 and 3.62 times (192.5 – 195.5) as can be seen in the following histogram:

Picture 19 – Number of spins (expressed in number of segments)

Apparently, the fact that this distribution is bell-shaped is the reason why the winning positions are not random enough. Therefore, increasing the difference between the maximum and minimum number of wheel spins will help to flatten the curve and, consequently, to obtain more random winning numbers.

To illustrate this solution, a simulation of 7,200 wheel spins, whose numbers of segments run ranged from 147.5 to 511 (4 wheel spins difference instead of 2), was conducted. Its histogram can be seen in the image below:

Picture 20 – Number of spins (expressed in number of segments)

A chi-square test was conducted, and the p-value obtained was 98.6%. This result conforms well with a fair game and the deviation from expectations is well with the normal range.

Table 9 – Chi Squared – Winning numbers

Alternatively, winnings of players could be monitored and analysed statistically in real time. If a player’s winnings were unlikely to be by chance at a particular time, their accounts could be blocked temporarily and further investigation could be undertaken. Additionally, suspicious betting patterns could be monitored as well. For example, a player betting only on specific numbers (40 and 20) sporadically could be an indicative of a player trying to exploit this issue.

References

[1] Online Casino Roulette – A guideline for penetration testers and security researchers: https://research.nccgroup.com/2020/09/18/online-casino-roulette-a-guideline-for-penetration-testers-and-security-researchers/

[2] NCC Group Vulnerability Disclosure Policy: https://research.nccgroup.com/wp-content/uploads/2021/03/Disclosure-Policy.pdf

[3] Big Six – Wizard of odds: https://wizardofodds.com/games/big-six/

[4] Chi-squared distribution: https://en.wikipedia.org/wiki/Chi-squared_distribution

[5] Goodness of fit: https://en.wikipedia.org/wiki/Goodness_of_fit

[6] Chi Square Distribution Table for Degrees of Freedom 1-100: https://www.easycalculation.com/statistics/chisquare-table.php

[7] P-value – Wikipedia: https://en.wikipedia.org/wiki/P-value

[8] Confidence interval: https://en.wikipedia.org/wiki/Confidence_interval

[9] MultinomCI – Confidence Intervals for Multinomial Proportions: https://rdrr.io/cran/DescTools/man/MultinomCI.html

[10] Bootstrapping – https://en.wikipedia.org/wiki/Bootstrapping_(statistics)

[11] Monte Carlo – https://en.wikipedia.org/wiki/Monte_Carlo_method

  • There are no more articles
❌