✇de engineering

Exploring the process of virtual memory address translation and structure of a page table entry.

By: Mr. Rc

We learned about the fundamentals of virtual memory management in the last post, as well as two Windows API functions that allow us to allocate virtual memory (VirtualAlloc) and free it (VirtualFree).
In this post, we’ll continue our exploration of virtual memory management in Windows by learning how a virtual memory address is translated to a physical address, the structure of a page table in memory (explained later), what information it contains, how we can use Windows API functions to query that information, and some other internals of virtual memory in Windows.

Translation of virtual memory address

When a virtual memory address gets translated, it goes through several translation layers. At each layer, part of the address selects an entry in a table (which can be thought of as a structure), and that entry points to the next table. This repeats until the address is finally translated into an address in actual physical memory (RAM). The translation itself is done by the Memory Management Unit (MMU) of the CPU, and the tables are managed by the Memory Manager (a component of the Windows OS). On x64 Windows, there are four tables that do this job, namely:

  • Page Map Level 4 (PML4)
  • Page Directory Pointer Table (PDPT)
  • Page Directory Table (PDT)
  • Page Table (PT)

Each of these tables contains entries that point to the start of the next paging structure, and each paging structure has 512 entries. The value inside an entry that locates the next structure is called a Page Frame Number (PFN), and the entries themselves are called PxE, where x is the name of the table and E means entry. So entries inside the PML4 are called PML4E (x = ML4), entries inside Page Tables are called PTE, and so on.
This can be visually understood by looking at this diagram:

Virtual Memory Translation on x64 Windows
This image is probably more confusing than what you read before looking at it, but let me explain so you can feel cool and get some dopamine hits.
This is simply the translation process of a virtual memory address. At the top, you can see the 48 bits (0 to 47) split into four groups of 9 bits plus one group of 12 bits (explained later). Since we already know that x64 systems only use 48 bits for addressing, this makes sense: 4 * 9 + 12 = 48. So the top part of this fancy-looking image is just showing you how those bits are distributed.
Below them are the tables that I just talked about. You can see how each of them points to another table, guided by the information inside the virtual memory address, until the address is finally translated to a physical memory address.

You might now have a guess of where this is going and how the address translation takes place. The bits of a virtual memory address are divided into parts, and each part tells the MMU where to look for the entry in the next table, until it finds the physical page through the entry in the Page Table.

Now, let’s look into this distribution of bits and understand how it works.
The first division, bits 39 to 47, is an index into the Page Map Level 4 paging structure (the address of this structure is stored in a special register, which will be described in depth in a later post). The entry at that index contains a PFN that tells the MMU where the PDPT is. Similarly, bits 30 to 38 tell the MMU the index of the entry inside the PDPT that points to the next paging structure, and this process continues until we reach the Page Table.
Once the translation has found the entry inside the Page Table, which points to a physical page in RAM, the remaining 12 bits (0 to 11) are used as an offset to a specific byte in that physical page to get the exact data that was requested.

Understanding the structure of a PTE

Each Page Table entry has some status and protection bits set, which store information about the page itself. These entries tell the MMU how the page should be handled and what its current status is.
This is what an x64 PTE looks like:

A Page Table entry on x64 Windows
As you can see, there are multiple bits (some are grouped, others are not), and each of them holds some information about the page itself or its status. Let us go through them one by one so we have a clear understanding of a Page Table entry’s structure.

Hardware bits vs. Software bits in Page Table Entries

Before talking about the bits themselves, let us understand the types of bits that are inside a PTE.
Hardware bits: Hardware bits are the bits that the MMU actually takes into consideration while translating a virtual address into a physical address.
Software and Reserved bits: These are the bits that are totally ignored by the MMU and are used by the Memory Manager to manage pages. If you look at the diagram, you will find that bits 9 to 11 are marked as Software bits, which means they are used by the Memory Manager.

Understanding the bits

Valid bit: The bit at the 0th index is the Valid bit, which tells the MMU that the page this entry describes actually exists somewhere in physical RAM and is not paged out (explained in part one of this blog). This bit is useful because, as we know, Windows uses demand paging: a process may have allocated pages that it is not using, and the Memory Manager may page those unused pages out from memory to disk. This bit helps the Memory Manager keep track of which pages are resident and which are paged out.

Write bit: The bit at the 1st index is the Write bit, which tells the MMU whether the page is writable or not. When this bit is clear (set to 0), the page is read-only, and when this bit is set, we are allowed to write to that page. You can relate this to the last blog post: we used the flProtect argument of the VirtualAlloc function to specify the memory protection we wanted while allocating a page, and if we use any protection that allows writing to the page, this bit will be set to 1.

Owner bit: The bit at the 2nd index is the Owner bit, which tells the MMU whether the page can be accessed from user mode or whether its access is limited to kernel mode. If this bit is set in the PTE of a page, that page is accessible from user mode, and if it’s not set, that page is only accessible in kernel mode.

Write Through bit: The bit at the 3rd index is the Write Through bit which tells the MMU to enable write through on the page. Write through is a storage method in which data is written into the cache and the corresponding main memory location at the same time. The cached data allows for fast retrieval on demand, while the same data in main memory ensures that nothing will get lost if a crash, power failure, or other system disruption occurs.

Cache Disabled bit: The bit at the 4th index is the Cache Disabled bit which tells the MMU that this page should not be cached.

Accessed bit: The bit at the 5th index is the Accessed bit which tells the MMU that this page has been accessed at least once after being mapped.

Dirty bit: The bit at the 6th index is the Dirty bit which tells the MMU that this page has been written to (there has been a write operation on this page).

Large bit: The bit at the 7th index is the Large bit which tells the MMU that this page is a large page and it maps to a page that is larger than 4KB.

Global bit: The bit at the 8th index is the Global bit, which tells the MMU that the translation for this page should not be flushed from the Translation Lookaside Buffer (a cache of recently used translations) when the address space is switched.

Copy-on-write bit (Software): The bit at the 9th index is the Copy-on-write bit, which is a Software bit and has a special purpose. When a thread tries to write to a page that is read-only (has the write bit set to 0), a memory-management exception occurs. The Memory Manager’s fault handler then checks whether the Copy-on-write bit is set. If it is, the handler makes a copy of that page and gives the thread access to the copy, which has write access enabled. The thread can now write to that data, but those writes won’t affect the original page, which still doesn’t have the write bit set. However, if a thread tries to write to a read-only page and this bit is not set, an access violation exception is raised.

Prototype bit (Software): The bit at the 10th index is the Prototype bit, which is also a Software bit and is used to mark a page as a “Prototype”. This is a fairly complex concept, and to better understand it, you can check the resources section.

Write bit (Software): The bit at the 11th index is the Write bit, which is the last Software bit in an x64 PTE, and it has a quite unique usage. This may feel strange after everything you have learned, but when a page is allocated, whether it is supposed to be writable or not, the Memory Manager initially sets the hardware write bit to 0. This means that no page is writable at the time of initialization, and the way the Memory Manager actually knows whether a page is writable is the 11th bit (the Software Write bit). Since the hardware write bit is 0, the first time a thread tries to write to a page, a memory-management exception occurs and the Memory Manager checks whether bit 11 (the Software Write bit) is set. If it is, the Memory Manager knows that this page is actually writable, so it sets the Dirty bit and the hardware Write bit to 1, updates some other memory management information, dismisses the exception, and the write operation then happens normally. This happens only on the first write to a page, as the hardware write bit stays set to 1 afterwards.
The reason it is implemented this way is related to multiprocessor systems and can be understood better by reading the “Address translation” section of Windows Internals, Part 1, 7th edition.

PFN: The 36 bits from the 12th index to the 47th index are the page frame number that we talked about earlier.

Reserved: The bits from the 48th index to the 62nd index are completely ignored by the MMU and only used by the Memory Manager for special purposes.

NX bit: The last and 63rd bit in a pte is the NX bit. NX stands for “no-execute” and it tells the MMU whether this page can be executed or not.

Now that you know how a virtual memory address is translated, how a hardware PTE is structured and what information it stores, it’s time to learn about another Windows API function, which allows us to query information about a page.

GetLastError


Before we start, I would like to introduce you to a function from the Windows API: GetLastError. It returns the error code of the last error that occurred on the calling thread, and we can find out more about an error code by looking it up in the error code list on MSDN: System Error Codes - Win32 apps
We will be using this function in the code examples to see if there are any errors in our code.

1. VirtualQuery


This function is used to query the information of a virtual memory region (page).

Function signature

This is the syntax for VirtualQuery function:

SIZE_T VirtualQuery(
  LPCVOID                   lpAddress,
  PMEMORY_BASIC_INFORMATION lpBuffer,
  SIZE_T                    dwLength
);

Arguments

The function’s return type is SIZE_T, which is an unsigned integer type as wide as a pointer (64 bits on x64).

lpAddress: You might already know the use of this argument if you have read part one of this blog. It’s the base address of the virtual memory region that we want to query, for example the address returned by VirtualAlloc.

lpBuffer: This argument is a pointer to a struct. The name of this struct is _MEMORY_BASIC_INFORMATION, and it is defined in winnt.h. Here is what it looks like:

typedef struct _MEMORY_BASIC_INFORMATION {
  PVOID  BaseAddress;
  PVOID  AllocationBase;
  DWORD  AllocationProtect;
  WORD   PartitionId;
  SIZE_T RegionSize;
  DWORD  State;
  DWORD  Protect;
  DWORD  Type;
} MEMORY_BASIC_INFORMATION, *PMEMORY_BASIC_INFORMATION;

I’ll explain its members later.

dwLength: This argument is the size, in bytes, of the struct from the last argument.

Return value

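The function returns the number of bytes it wrote into the struct that we passed in. If it fails, it returns zero, and the error code can be retrieved with GetLastError. The actual information about the region is delivered through the struct.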

Examples

Now that we have learned enough about the function, let’s take a look at some examples and see it in action.

Example #1

We are going to make a program that gives us information about a memory region that we’ll allocate using the functions we learned about in the last blog post. Let me show you the code first, then I will explain it:

#include <Windows.h>
#include <stdio.h>

int main()
{
    MEMORY_BASIC_INFORMATION info; 
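    SIZE_T ret; // VirtualQuery returns the number of bytes written to info.
    int *vm = VirtualAlloc(NULL, 8, MEM_COMMIT, PAGE_READONLY); // 8 byte request, rounded up to one page.
    if (vm == NULL) // allocation failed.
    {
        printf("VirtualAlloc failed with error code %lu\n", GetLastError());
        return 1;
    }
    ret = VirtualQuery(vm, &info, sizeof(info));
    if (!ret) // error checking.
    {
        printf("VirtualQuery failed\n");
        printf("The error code for the last error was %lu", GetLastError());
        return 1;
    }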

    switch (info.AllocationProtect)
    {
        case PAGE_EXECUTE_READ:
            printf("Protection type : EXECUTE + READ\n");
            break;
        case PAGE_READWRITE:
            printf("Protection type : READ + WRITE\n");
            break;
        case PAGE_READONLY:
            printf("Protection type : READ\n");
            break;
        default:
            printf("Not found");
            break;
    }

    switch (info.State)
    {
        case MEM_COMMIT:
            printf("Region State : Committed");
            break;
        case MEM_FREE:
            printf("Region State : Free");
            break;
        case MEM_RESERVE:
            printf("Region State : Reserve");
            break;
        default:
            break;
    }
    VirtualFree(vm, 0, MEM_RELEASE); // free the allocated memory (dwSize must be 0 with MEM_RELEASE).
    return 0;
}

I have used Windows.h instead of any other header file because Windows.h contains almost everything that we need for Windows API programming.
Let’s now understand the code.
First, we declare a struct of type MEMORY_BASIC_INFORMATION, which is the struct that we talked about. Then we request eight bytes of read-only committed virtual memory (the system rounds this up to a whole page).
After that, we use the VirtualQuery function to get information about that memory region.
We give it the address of the allocated memory region as the first parameter, then the address of the info struct that will hold all the data returned by the function, and then the size of our info struct.
Then we check whether the function failed. If it did, the error code can be found by using the GetLastError function.
Next, we have a switch-case clause that checks the value of the AllocationProtect member of our info struct. This is the protection the region was given when it was first allocated.
The constants used for comparison in the switch-case clause are defined in the Windows.h header file that we included.
We then check the value of the State member of our info struct in a second switch-case clause, which compares the state of the allocated virtual memory region, and print information according to the matching case. One thing to note is that a plain switch cannot match every possible protection value or memory state. I tried to cover more of them and was unsuccessful, so I have only used the types that compare cleanly.
Then we just free the allocated memory.

Results #1

Here’s the output that I get after running the example:

$ ./vquery-example
Protection type : READ
Region State : Committed

The results are as expected: we hardcoded the page protection to read-only and the page state to committed, and the function reports exactly that.

Example #2

This example will be quite fun. Here, I ask the user which page state and page protection they want for the page, then use VirtualQuery to query the information of the allocated page and print it so it can be verified against the user’s input. Here’s the code for it:

#include <Windows.h>
#include <stdio.h>
#include <stdlib.h> // for exit()

int main()
{
    MEMORY_BASIC_INFORMATION info;
    SIZE_T ret;         // number of bytes VirtualQuery wrote to info

    char state;         // used for input
    char protection;    // used for input
    DWORD MEM_STATE;
    DWORD MEM_PROTECTION;

    printf("Choose the page state you want to use: \n");
    printf("1. MEM_COMMIT\n");
    printf("2. MEM_RESERVE\n");
    scanf("%c", &state);
    getchar();

    switch (state)      // checking user input.
    {
    case '1':
        MEM_STATE = MEM_COMMIT;  
        break;
    case '2':
        MEM_STATE = MEM_RESERVE;        
        break;
    default:
        printf("Invalid choice!");
        exit(-1);
    }
    
    printf("Choose the page protection you want to use: \n");
    printf("1. PAGE_READONLY\n");
    printf("2. PAGE_READWRITE\n");
    printf("3. PAGE_EXECUTE_READ\n");
    scanf("%c", &protection);

    switch (protection) 
    {
    case '1':
        MEM_PROTECTION = PAGE_READONLY;        
        break;
    case '2':
        MEM_PROTECTION = PAGE_READWRITE;        
        break;
    case '3':
        MEM_PROTECTION = PAGE_EXECUTE_READ;        
        break;
    default:
        printf("Invalid choice!");
        exit(-1);
    }

    // allocating memory.
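    int *vm = VirtualAlloc(NULL, 8, MEM_STATE, MEM_PROTECTION);
    if (vm == NULL) // allocation failed.
    {
        printf("VirtualAlloc failed with error code %lu\n", GetLastError());
        return 1;
    }
    // print the address as a decimal number.
    printf("Address of memory returned by VirtualAlloc is %llu\n", (unsigned long long)(ULONG_PTR)vm);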

    //querying data about that memory.  
    ret = VirtualQuery(vm, &info, sizeof(info));
    
    // error checking.
    if (!ret)
    {
        printf("VirtualQuery failed\n");
        printf("The error code for the last error was %lu", GetLastError());
        return 1;
    }

    printf("Protection type : ");
    
    switch (info.AllocationProtect) // comparing protection.
    {
        case PAGE_EXECUTE_READ:
            printf("EXECUTE + READ\n");
            break;
        case PAGE_READWRITE:
            printf("READ + WRITE\n");
            break;
        case PAGE_READONLY:
            printf("READ ONLY\n");
            break;
        case PAGE_GUARD:
            printf("Guard Page\n");
            break;
        default:
            printf("%x\n", info.AllocationProtect);
            break;
    }

    printf("Region State : ");
    switch (info.State) // comparing state.
    {
        case MEM_COMMIT:
            printf("Committed");
            break;
        case MEM_FREE:
            printf("Free");
            break;
        case MEM_RESERVE:
            printf("Reserve");
            break;
        default:
            printf("Unknown");
            break;
    }

    VirtualFree(vm, 0, MEM_RELEASE); // release the whole region (dwSize must be 0 with MEM_RELEASE).
    return 0;
}

Most of the code is similar to the code from the last example, but there are some major changes.

First, we ask the user to choose which page state they want to allocate and store their input in a character variable, state. We then compare that variable in a switch-case clause to find out which page state the user asked for, and set an integer variable, MEM_STATE, to the matching constant. We do the same for page protection, using the protection character variable for input and MEM_PROTECTION for storing the constant.
Next, we allocate memory using those variables (MEM_STATE and MEM_PROTECTION) as parameters for VirtualAlloc. Then we take the address returned by VirtualAlloc, query information about it with VirtualQuery, compare the result with the possible constants, and print its state and protection.

Result #2

Here’s the output of the program:

Choose the page state you want to use: 
1. MEM_COMMIT 
2. MEM_RESERVE
1
Choose the page protection you want to use: 
1. PAGE_READONLY
2. PAGE_READWRITE
3. PAGE_EXECUTE_READ
2
Address of memory returned by VirtualAlloc is 131072
Protection type : READ + WRITE
Region State : Committed

Cool! It works as expected.

Summary

In this post, we have learned about a lot of complex things related to Windows virtual memory management. We learned about the four paging structures that are used during the translation of a virtual memory address, and about the translation process itself. Then we learned about the structure of a Page Table Entry. Finally, we learned how to get the error code of the last error using the GetLastError function, how to use the VirtualQuery function to query information about a virtual memory region, and we made two small programs to see that in action. I hope you enjoyed the blog post and learned something new!
Thank you for reading!

Resources

Windows API - Exploring Virtual Memory and the Virtual Memory Management API.

By: Mr. Rc

If you have ever explored Windows Internals, or just the internal workings of an operating system or a computer, you must have come across the terms “Virtual Memory” or “Paging” somewhere, because these are some of the most important concepts of an operating system, and they are the concepts we are going to explore in this blog post. Of course, I won’t be able to cover these concepts completely, but I’ll try to give you a basic understanding of every concept I talk about, and I will also link to resources that explain each concept in detail in the resources section.

Virtual Memory

64-bit systems are capable of having addresses up to 64 bits in size, but they only use 48 of those bits for addressing. Why?
Because if all 64 bits were used, the system would be able to address up to 16 exabytes (EB) of memory (1 EB = 1,000,000 terabytes). No system today can use or needs that much memory, and supporting it would only make the translation hardware and the memory manager’s bookkeeping more expensive. This is why only 48 bits of addressing are used. 48 bits can address up to 256 TB of memory, which is still a lot.
We generally use the term “memory” (in context of computers) to refer to the RAM or some data stored in the RAM but behind the scenes, there is a lot going on that actually makes memory a thing and one of the many component behind this is virtual memory.
If you are a bit familiar with pointers or assembly, you might already know what a memory address looks like. It’s like this:

0xFFFFDEADC0DE

Yeah, like that. This is an example of a virtual memory address (or simply a virtual address). Virtual memory addresses don’t point directly to a place in the physical RAM installed on your computer; they are references that get mapped to it. The addresses that point to memory in physical RAM are called physical memory addresses. Virtual memory addresses are translated (converted) into physical memory addresses by the combined work of the CPU and the Memory Manager (which is a part of Windows itself).

Paging

Implementation of virtual memory by the Memory Management Unit is known as paging. Windows uses two types of paging, known as Disk Paging and Demand Paging with clustering.
Virtual memory and physical memory are both divided into 4KB chunks (regions/parts). These chunks are called pages (virtual memory chunks) and page frames (physical memory chunks). There are also large pages and huge pages, but I won’t talk about them in this blog post.
In disk paging, whenever there is not enough physical memory (RAM) left, the memory manager (explained later) moves pages from RAM to special files called page files. This process of moving data from RAM to disk is called paging out memory, or swapping. Moving pages from RAM to page files frees up RAM, and this free memory can be used by new processes for their work. But what happens to the paged out memory?
Whenever some code (an instruction) tries to access data that is not in physical memory but is paged out, the MMU generates a page fault, which is then handled by the OS. The OS takes that page from the disk, moves it back into physical memory and restarts the instruction that wanted to access that memory. With clustering, instead of bringing back only the page that the fault requested, the memory manager also brings in the pages surrounding it.
In demand paging, whenever a process tries to allocate memory, the memory manager doesn’t really allocate any memory, but it still returns a pointer to some memory which is not yet allocated. It gets allocated only when it is accessed. Memory is not allocated -> Process accesses the non existent memory so page fault happens -> Windows allocates the memory and allows you to use it. This method is used because programs may allocate memory that they will never access or use, and keeping such pages in memory would only waste it. Demand paging lets the system avoid spending memory on pages that are never used.
Each 64 bit process on Windows is allowed to use 256 TB of virtual memory addresses but this memory is divided into different sized regions, some of which is used by the system and some of it is allowed to be used by a process. Here is a diagram of the division:

Page states

A page can be in one of three states: free, reserved, or committed.

Memory Manager in Windows

All the management of the virtual memory and virtual addresses is done by the Memory Manager, which is a part of the Windows executive (kernel component). Here are the specific tasks of the memory manager:

  • Telling the MMU how to translate a virtual memory address to a physical memory address.
  • Performing paging.
  • Allocation, Reservation, Freeing of virtual memory.
  • Handling page faults.
  • Managing page files.
  • Providing a userland API for allocation, reservation and freeing of virtual memory.

Memory-Mapped files

A memory-mapped file is a special region in virtual memory that contains the contents of a file. This allows processes to treat the contents of a file like a normal region in memory.
There are two types of memory-mapped files in Windows:

  • Persisted memory-mapped files: These are associated (connected) with an actual file on the disk. After the last process has finished working with the memory-mapped file, its contents are written back to the original file it was associated with.
  • Non-Persisted memory-mapped files: These are not associated with any file on the disk and are mostly used for inter-process communication (IPC). After the last process has finished working with the memory-mapped file, its content is lost.

Page sharing

There are pages that are shared with different processes and these pages are called shared pages. Shared pages are mostly used to share DLLs that most processes on Windows require which saves RAM as the system doesn’t have to allocate same DLLs for each process, an example of this is kernel32.dll. Shared pages are essentially just shared memory-mapped pages which are associated with DLLs or some other shareable data.

The Virtual Memory Management API

This API is provided by the memory manager of Windows. This API allows us to allocate, free, reserve and secure virtual memory pages. All the memory related functions in the Windows API reside under the memoryapi.h header file. In this particular post, we will see the VirtualAlloc and VirtualFree functions in depth.

1. VirtualAlloc

The VirtualAlloc function allows us to allocate private memory regions (blocks) and manage them. Managing these regions means reserving them, committing them and changing their states (described later). The memory regions allocated by this function are called “private memory regions” because they are only accessible (available) to the process that allocates them. Memory regions allocated with this function are initialised to 0 by default.

Function signature

This is the function signature of this function:

LPVOID VirtualAlloc(
  LPVOID lpAddress,
  SIZE_T dwSize,
  DWORD  flAllocationType,
  DWORD  flProtect
);

Arguments

The return type of this function is LPVOID, which is basically a pointer to a void object. LPVOID is defined as typedef void* LPVOID in Windef.h. In simple words, LPVOID is an alias for void *. LP in LPVOID stands for long pointer.

lpAddress: This argument is used to specify the starting address of the memory region (page) to allocate. This address can come from the return value of a previous call to this function, or it can be an arbitrary address, but if that address range cannot be used for the request, the call fails rather than being moved somewhere else. If we don’t know where to allocate memory (for example, because we have not called this function previously), we can simply specify NULL and the system will decide where to allocate the memory. If the address specified is from a memory region that is inaccessible, or if it’s an invalid address to allocate memory from, the function will fail with the ERROR_INVALID_ADDRESS error.

dwSize: This argument is used to specify the size, in bytes, of the memory region that we want to allocate. If the lpAddress argument was specified as NULL, this value will be rounded up to the next page boundary.

flAllocationType: This argument is used to specify which type of memory allocation we need to use. Here are some valid types as defined in the Microsoft documentation:

Valid types for flAllocationType

If you are confused about the hex values which are written after every value, they are basically the real value of the constants (i.e. MEM_COMMIT, MEM_RESERVE, etc). For example, if we use MEM_COMMIT, then it will be converted to 0x00001000 and same with all other values.

What does committing memory actually mean?

In the table of types and definitions, I have described MEM_COMMIT (which is used to commit virtual memory) terribly, so let me explain what committing memory actually means in a better way.
When you commit a region of memory using VirtualAlloc, due to the use of demand paging, the memory manager doesn’t actually back the region with physical memory right away. When you first access an address in the region returned by VirtualAlloc, it causes a page fault, which triggers a series of events, and eventually the system allocates a physical page and serves it to you. So, until the memory is accessed, no physical page is assigned to it. There’s just a guarantee by the memory manager that the memory exists and that you can use it whenever you want.

The types which are used rarely can be found here.

flProtect: This argument is used to specify the memory protection that we want to use for the memory region that we are allocating.
These are the supported parameters:

Some memory protection constants

These are only the most used memory protection constants, the full list can be found here.

Return value

If the function succeeds, it will return the starting address of the memory region that was modified or allocated. If the function fails, it will return NULL.

2. VirtualFree

This function is basically used to free the virtual memory that was allocated using VirtualAlloc.

Function signature

This is the syntax of VirtualFree:

BOOL VirtualFree(
  LPVOID lpAddress,
  SIZE_T dwSize,
  DWORD  dwFreeType
);

As you can see, the return type of this function is BOOL, which means that it returns either a nonzero value (success) or zero (failure).

Arguments

lpAddress: As we know, this argument is used to specify the starting address of the memory region (page) which we want to modify (free in this case), but unlike the first time, we cannot specify NULL as an argument because obviously, the function cannot free a memory region whose address it doesn’t know.

dwSize: We also know about this argument; it is used to pass the size in bytes of the memory region which we want to modify. Here, we will use it to specify the size of the memory region that we want to free.

dwFreeType: This argument is used to specify the type which we want to use to free the memory. It may be a bit confusing to you but looking at these types and their definition will clear your confusion:
virtualfree() free types
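
One way to picture the two free types is as transitions between page states. The sketch below is only a toy model with made-up names (page_state, toy_decommit, toy_release), not anything from the Windows headers: MEM_DECOMMIT takes committed pages back to the reserved state, and MEM_RELEASE returns the whole region to the free state.

```c
#include <assert.h>

/* Toy model of the states a page can be in; all names are hypothetical. */
typedef enum { STATE_FREE, STATE_RESERVED, STATE_COMMITTED } page_state;

/* Models VirtualFree(..., MEM_DECOMMIT): committed pages fall back to
   reserved; pages in any other state are left as they are. */
page_state toy_decommit(page_state s)
{
    return (s == STATE_COMMITTED) ? STATE_RESERVED : s;
}

/* Models VirtualFree(..., MEM_RELEASE): a reserved or committed region
   becomes free again. Releasing a free region is treated as a caller bug. */
page_state toy_release(page_state s)
{
    assert(s != STATE_FREE);
    return STATE_FREE;
}
```

In this model, decommitting and then releasing a committed page walks it through STATE_RESERVED and ends at STATE_FREE.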

Return value

If the function does its job successfully, it returns a nonzero value. If the function fails, it will return a zero (0).

Examples

Now that we have gone through all the explanation, it’s time to write some code and clear up any doubts.

Example #1

Let’s start with an example of VirtualAlloc. We will write some code which will commit 8 bytes of virtual memory.
First we’ll start by including the needed headers:

#include <stdio.h>
#include <windows.h>

Now, we’ll define a main function that will use the VirtualAlloc function to commit 8 bytes of virtual memory, which will be rounded up to 4KB because memory is handed out in whole pages and one page is the smallest amount that can hold 8 bytes. We will specify the lpAddress argument as NULL, so that the system will determine from where to allocate the memory. Here is what the code looks like:

#include <stdio.h>
#include <windows.h> // pulls in memoryapi.h, which declares VirtualAlloc and VirtualFree

int main(){
    int *pointer_to_memory = VirtualAlloc(NULL, 8, MEM_COMMIT, PAGE_READWRITE); // commit 4KB of virtual memory (8 bytes is rounded up to 4KB) with read write permissions
    printf("%p\n", (void *)pointer_to_memory); // print the pointer to the start of the region.
    return 0;
}

Do you think something is missing from the code?
It’s the VirtualFree function. Whenever we allocate any kind of memory, we have to free it so that it can be used by other processes on the system.

Now it’s time to implement the VirtualFree function, so here it is:

#include <stdio.h>
#include <windows.h> // pulls in memoryapi.h, which declares VirtualAlloc and VirtualFree

int main(){
    int *pointer_to_memory = VirtualAlloc(NULL, 8, MEM_COMMIT, PAGE_READWRITE); // commit 8 bytes (rounded up to one 4KB page) of virtual memory with read write permissions.
    printf("The base address of allocated memory is: %p\n", (void *)pointer_to_memory); // print the pointer to the start of the region.
    VirtualFree(pointer_to_memory, 8, MEM_DECOMMIT); // decommit the memory region.
    return 0;
}

Up to this point, the working of the code should be clear to you, but if it’s not, here’s the line-by-line explanation of the code.
First, there’s a variable which points to the memory address returned by VirtualAlloc. We have passed four parameters to the VirtualAlloc function.
The first parameter is NULL. By passing NULL, we are telling the function that the starting point of the memory region should be decided by the system.
The second parameter is the size in bytes of the memory region that we want to allocate, which is 8 bytes.
The third parameter is the allocation type. We have specified that we want to commit the memory. After we commit a memory region, it is available for our use, but it’s not actually allocated until we access it for the first time.
The last parameter is PAGE_READWRITE, which says that we want the memory region to be readable and writeable.
Then we print the virtual memory address returned by the VirtualAlloc function.
In the end, we decommit the memory region that we allocated by using the VirtualFree function.
The first parameter is the base address of the memory region that we allocated.
The second parameter is the size of the memory region in bytes. We specified 8 while allocating it, so we specify 8 while deallocating it.
Then we have specified the type of deallocation. As we are using MEM_DECOMMIT, the memory region stays reserved after it gets decommitted, which means that nothing else will be able to use that address range until you call VirtualFree again with MEM_RELEASE to release the region.
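
The rounding from 8 bytes up to one page, mentioned in the explanation above, is plain arithmetic. Here is a small sketch of it, assuming the 4KB page size used throughout this post; round_up_to_page is a made-up helper name, not a Windows function:

```c
#include <assert.h>
#include <stddef.h>

/* Rounds `size` up to the next multiple of `page_size`.
   `page_size` must be a power of two (4096 for the 4KB pages in this post). */
size_t round_up_to_page(size_t size, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    /* Adding page_size - 1 pushes any partial page over the next boundary,
       and the mask then clears the low bits to land exactly on it. */
    return (size + page_size - 1) & ~(page_size - 1);
}
```

round_up_to_page(8, 4096) gives 4096, which is why an 8-byte request ends up as one whole page, while a 4097-byte request would need two pages (8192 bytes).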

Results #1

As we are almost done with everything, let’s compile and run the code. I suggest you write the code yourself and see the result. This is the result that I got after I ran it (the address you get will most likely be different):

$ ./vmem-example.exe
The base address of allocated memory is: 61fe18

Cool, right?
We have just used the VirtualAlloc function to allocate 8 bytes of virtual memory and we freed it by ourselves. Now let’s add some data to the allocated virtual memory and print it.

Example #2

Now let’s save some data inside the virtual memory that we allocated:

#include <stdio.h>
#include <string.h> // declares memmove
#include <windows.h> // pulls in memoryapi.h, which declares VirtualAlloc and VirtualFree

int main(){
    char *pointer_to_memory = VirtualAlloc(NULL, 8, MEM_COMMIT, PAGE_READWRITE); // commit 8 bytes (rounded up to one 4KB page) of virtual memory with read write permissions.
    printf("The base address of allocated memory is: %p\n", (void *)pointer_to_memory); // print the pointer to the start of the region.
    memmove(pointer_to_memory, (const void*)"1337", 5); // move the "1337" string and its null byte into the allocated memory.
    printf("The data which is stored in the memory is %s\n", pointer_to_memory); // print the data from the memory.
    VirtualFree(pointer_to_memory, 8, MEM_DECOMMIT); // decommit the memory region.
    return 0;
}

The memmove function is used to move data from one place to another. The first argument to this function is the destination memory address where you want to move the data, the second argument is the data that will be moved, and the third and last argument is the size of the data, which in this case is 5 (length of the string + null byte). Here, we have copied “1337” into our virtually allocated memory. If you’re confused about the type conversion, memmove takes its second argument as a const void*. C converts a char array to that type on its own, so the cast is optional and is written out here only to make the conversion visible.

Results #2

Let’s compile and run the code. This is the output that we’ll get:

$ ./vmem-example.exe
The base address of allocated memory is: 61fe18
The data which is stored in the memory is 1337

Looks even cooler :D!

Summary

We learned a lot about virtual memory in this post. We first looked at how it is basically “virtual” memory which points to “physical” memory. Then we learned about paging on Windows and the different paging schemes that the Windows memory manager uses, and we got to know that a page is basically a memory region of 4KB. Finally, we had a look at two memory management functions which let us allocate virtual memory and free it. I hope you enjoyed the blog and it wasn’t boring. Any suggestions and constructive criticism are welcome!
Thank you for reading!

Resources

Understanding the booting process of a computer and trying to write own operating system.

By: Mr. Rc

In this post, I am going to teach you how you can write your own Operating System. It won’t be a fully-fledged Operating System (like the one you are using right now to read this post), but it will be a part of an Operating System that is able to boot, and it will give you a brief, if not full, understanding of the booting process of an Operating System. If you want to take this post seriously, I suggest you take notes, as there is a lot of information packed into this single post and it can be hard to grasp all at once.
If you find something difficult to understand from my explanation, you can always check the resources section to get a link to some alternative explanation of that topic.
I will start this post by introducing you to some important components of the booting process of a computer.

Firmware

Unless you live under a rock, you have probably heard the term “Firmware” several times. If you haven’t, let me introduce you to what firmware is.
The most well known examples of firmware are the Basic Input/Output System (BIOS) and the Unified Extensible Firmware Interface (UEFI).
The term itself is made up of two fancy words - FIRM softWARE. The word “FIRM” means “something that doesn’t change or something that is not likely to change”, and I know you are a smart person and you know what software is. The word is nice and all, but you are here to learn about the cool technical stuff, so let me explain the technical part of it. The firmware is stored as instructions or data inside non-volatile memory devices (devices whose contents survive a power cycle), and it is the first thing that the CPU runs after the computer is powered on. Everything that we are learning in this blog post is specific to the BIOS firmware type. Most modern systems boot through UEFI rather than BIOS; however, that doesn’t mean the knowledge in this article is of no use, as the concepts of BIOS are simpler to understand and still relevant to learn.

In order to understand the importance and the uses of firmware, you need to understand the boot process (“boot” refers to “Bootstrap”) of a computer.

The boot process

The booting process is something like this:

  • Computer is powered on.
  • The Central Processing Unit (CPU) runs the firmware from a specific Read-Only Memory (ROM) chip on your motherboard. Which ROM the CPU reads the firmware from depends on the CPU your system has.
  • The firmware detects several (but not all) hardware components connected to the system, such as network interfaces, keyboards, mice, and so on, and does some error checking (also known as Power-On Self Test or POST) before activating them.
  • The firmware doesn’t know the properties and details of the Operating System that is about to run on the system, so it transfers control to the Operating System and lets it do its own setup. To do that, it searches through the available/connected storage devices or network interfaces in a pre-defined order (this order is known as the “boot device sequence” or “boot order”) and attempts to find a bootable disk. A bootable disk is a disk whose first sector (a subdivision of a HDD which can hold 512 bytes of user-accessible data) ends with the magic number 0xAA55. This magic number is also called the “boot signature”. It is stored in little-endian byte order: counting from 0, the byte at offset 510 should be 0x55 and the byte at offset 511 should be 0xAA. This first sector is called the Master Boot Record (MBR) and the program stored inside it is called the MBR bootloader or simply the bootloader. Remember that this bootloader is a part of the Operating System, so technically, this is the part of the process where we actually boot into the Operating System. This whole process is done after the firmware calls the interrupt 0x19 (more about this later).
  • After the firmware has found the bootloader, it loads it at the address 0x7c00 in the RAM and hands control over to it.
  • Now, the bootloader can do whatever it is programmed to do. It may print a nihilist quote and tell you that your life has no meaning, or it may just do nothing if it is programmed that way. Jokes aside, while it can be programmed to do anything, its main job is to perform the tasks that set up the environment for loading the next part of the OS (the kernel). After performing some tasks like the initialisation of some registers, tables and so on, it reads the kernel from the disk, loads it somewhere in the RAM and hands control over to it.
  • Now, the kernel has control over the system. Just like a bootloader, there are no pre-defined tasks for a kernel. Whatever it does depends entirely upon what it has been programmed to do. For example, the Linux and Windows kernels are entirely different, and what they do is entirely different too, but they both eventually start the User Interface and let the user take control of the system. If you find this complex, here’s an example - just like everyone in your company does different stuff after they wake up (they may drink a cup of chai, go for a walk or do anything they want) but their end goal is to reach the office on time and start working, a kernel too has the end goal of successfully loading the easy-to-use User Interface part of the OS for the user. Note that this is not the only work of the kernel in the OS; the kernel is an essential part of an OS and still has a lot to do after it has served you the nice UI.
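
The “is this disk bootable?” check from the steps above can be sketched in a few lines of C. This is only an illustration of the idea, not how any particular firmware is written, and has_boot_signature is a made-up name. Counting offsets from 0, the last two bytes of the 512-byte sector must be 0x55 followed by 0xAA, which matches the hexdump shown later in this post:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Returns 1 if `sector` holds at least one full sector whose last two
   bytes are the boot signature (0x55 at offset 510, 0xAA at offset 511). */
int has_boot_signature(const uint8_t *sector, size_t len)
{
    assert(sector != NULL);
    if (len < SECTOR_SIZE) {
        return 0; /* too short to be a boot sector */
    }
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

A sector full of zeros fails this check, and the same sector passes once bytes 510 and 511 are set to 0x55 and 0xAA.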

Environment setup

Before diving in, you should have nasm and qemu installed. You probably don’t have either of them yet, so go ahead and install them. Both are available for Windows and Linux.

On Linux (Debian/Ubuntu), nasm and qemu can be installed with a single command:

$ sudo apt install nasm; sudo apt install qemu-system-x86

Writing our bootloader

Writing a complete kernel from scratch, followed by our own user interface, software, compiler, etc., would be too much to fit in a single blog post and too much for you to take in at once. So instead of writing the whole OS, we will only write a bootloader. It is still well worth writing, as you will learn a lot of new things related to bootloaders and Operating Systems.

For now, we will start by writing an endless loop which is not pointless (unlike your life). It will be a label that does nothing more than jump to itself (looping endlessly).

loop:
    jmp loop 


Here’s how you assemble it:

$ nasm bootsector.asm -f bin -o bootloader.bin

The -f flag specifies the format which is bin (binary) in our case, and the -o flag is used to name the file in which we want our output to be saved.

hexdump of bootloader.bin:

00000000: ebfe                                     ..

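The machine code (the hex representation) of this instruction is eb fe: eb is the opcode of a short jump and fe is the displacement -2, which sends execution back to the start of this same 2-byte instruction. That is an infinite loop, which is exactly what we wanted.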
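
If you are curious where the fe comes from, here is a small sketch of how a short jump could be encoded by hand. The function name encode_short_jmp is made up for this example. The displacement is counted from the end of the 2-byte jump instruction, so jumping to your own start means going back by 2:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encodes a short jump (opcode 0xEB followed by a signed 8-bit displacement)
   placed at address `from` and targeting address `to`.
   Returns 1 on success, or 0 if the target is too far away for a short jump. */
int encode_short_jmp(uint16_t from, uint16_t to, uint8_t out[2])
{
    assert(out != NULL);
    /* The displacement is relative to the instruction that follows the jump. */
    int32_t rel = (int32_t)to - ((int32_t)from + 2);
    if (rel < -128 || rel > 127) {
        return 0;
    }
    out[0] = 0xEB;
    out[1] = (uint8_t)(int8_t)rel; /* -2 becomes 0xFE in two's complement */
    return 1;
}
```

Encoding a jump from address 0 to address 0 produces exactly the two bytes from the hexdump, eb and fe.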

Adding some data to our bootloader

Now that we are done with our endless loop, we will continue to write some more instructions to our bootloader and will eventually make it bootable.

We will first start by writing some data to our bootloader, here’s how you do it:

loop:
    jmp loop

db 0x10

hexdump:

00000000: ebfe 10                                  ...

The db (define byte) pseudo-instruction is used to put a byte “literally” in the executable, that’s why you can see 10 (the hex value 0x10) stored in the executable.

Making our bootloader bootable

The first thing we need to do in order to make this an actual bootable device is to add the magic bytes at the end of our bootloader’s sector (the last two bytes, at offsets 510 and 511 counting from 0), so that the firmware knows that this is a bootable device. This is how we do it:

loop:
    jmp loop						; endless loop
db 0x10  						; pointless data
db "You didn't chose to exist." 			; makes sense?

times 0x1fe-($-$$) db 0					; explained later. 0x1fe = 510 in decimal.
dw 0xaa55 						; the magic number.

The instruction times 0x1fe-($-$$) db 0 may look scary but it’s really easy to understand.
The instruction can be broken into two instructions: times 0x1fe-($-$$) and db 0. Let me explain the first one to you then you will be able to make sense of the second one too.

The times instruction

The times instruction tells the assembler (nasm in this case) to produce multiple (n) copies of a specified instruction. In order to understand this more clearly, let’s look at the syntax of times instruction:

times <n> <instruction> <operand> ...		; n = number of times.

One thing you should know is the number of operands depends on the instruction being used. Here’s a simpler use case example of the times instruction:

times 10 db '1337' 

Here, 10 is n, db is the instruction and '1337' is an operand. This instruction will tell the assembler to make 10 copies of the instruction db '1337'.
Here’s the hexdump of the code:

00000000  31 33 33 37 31 33 33 37  31 33 33 37 31 33 33 37  |1337133713371337|
00000010  31 33 33 37 31 33 33 37  31 33 33 37 31 33 33 37  |1337133713371337|
00000020  31 33 33 37 31 33 33 37                           |13371337|
00000028

As expected, we can notice the string '1337' repeated 10 times. It worked just fine.


Now, let’s move to the original instruction and try to understand the subtraction it’s doing.
Let’s start with the subtraction inside the brackets, ($-$$). The $ operator in assembly (nasm) denotes the address of the current instruction, and the $$ operator denotes the address of the first instruction (the beginning of the current section), which in this case is the address of the definition of the endless loop. Once the firmware has loaded the bootloader, that address is 0x7C00, but the exact base doesn’t matter here because it cancels out in the subtraction.
It’s basically this:

addr_of_current_instruction - addr_of_first_instruction_0x7c00 

This subtraction returns the number of bytes from the start of the program to the current line, which is just the size of the program so far, and that size is subtracted from 0x1fe (510 in decimal). Why are we doing this subtraction?
We are doing it to get the number of bytes that aren’t used, so that we can fill them with zeros (db 0). The magic bytes then land exactly at offsets 510 and 511, the last two bytes of the 512-byte sector.
It can be understood like this:

0x1fe - (addr_of_current_instruction - addr_of_first_instruction_0x7c00) ; returns the no. of unused bytes.

This value is passed to the times instruction as n, and it already has the instruction (db) and operand (0), so it tells the assembler to fill the bytes that aren’t used with 0, up to (but not including) offset 510.
So, it will finally look like this:

times 0x1fe - (addr_of_current_instruction - addr_of_first_instruction_0x7c00) db 0
; times 0x1fe-($-$$) db 0
; fills the unused bytes with 0
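
The same padding arithmetic can be written out in C. This is only a sketch of what the assembler ends up producing, with made-up names (padding_needed, build_boot_sector); it is not how nasm works internally:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE      512
#define SIGNATURE_OFFSET 0x1fe /* 510: where the two magic bytes start */

/* Number of zero bytes needed after `code_size` bytes of code and data,
   the C equivalent of 0x1fe-($-$$). */
size_t padding_needed(size_t code_size)
{
    assert(code_size <= SIGNATURE_OFFSET); /* the code must leave room for the signature */
    return SIGNATURE_OFFSET - code_size;
}

/* Lays out a full boot sector: code, zero padding, then the signature. */
void build_boot_sector(const uint8_t *code, size_t code_size,
                       uint8_t sector[SECTOR_SIZE])
{
    size_t pad = padding_needed(code_size);
    memcpy(sector, code, code_size);    /* the program itself */
    memset(sector + code_size, 0, pad); /* times ... db 0 */
    sector[SIGNATURE_OFFSET]     = 0x55; /* dw 0xaa55, low byte first */
    sector[SIGNATURE_OFFSET + 1] = 0xAA;
}
```

For the 29 bytes of code and data in the listing above (2 for the jmp, 1 for db 0x10 and 26 for the string), padding_needed(29) gives 481 bytes of padding.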

The only thing that is left is to actually put the magic number in the bootloader. It is done by using the dw 0xaa55 instruction (dw is the same as db, but dw defines a 2-byte word while db defines a single byte).
Now that we are done with understanding the bootloader, let’s assemble it and look at the hexdump to actually see the result.

00000000: ebfe 1059 6f75 2064 6964 6e27 7420 6368  ...You didn't ch
00000010: 6f73 6520 746f 2065 7869 7374 2e00 0000  ose to exist....
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000100: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000110: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000120: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000140: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000150: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000160: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000170: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000001f0: 0000 0000 0000 0000 0000 0000 0000 55aa  ..............U.

As expected, we have filled the unused bytes with zeros and the last two bytes with the magic number (the bytes appear as 55 aa because x86 is little-endian and stores the low byte of the word 0xaa55 first). Now our bootloader is actually a bootloader and ready to work.
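
The swapped byte order can be reproduced in a few lines of C. This is a small sketch with made-up helper names (store_le16, load_le16) that writes a 16-bit word the way a little-endian machine lays it out in memory, and reads it back:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stores `value` at `dst` in little-endian order: low byte first. */
void store_le16(uint8_t *dst, uint16_t value)
{
    assert(dst != NULL);
    dst[0] = (uint8_t)(value & 0xFF); /* low byte, 0x55 for 0xAA55 */
    dst[1] = (uint8_t)(value >> 8);   /* high byte, 0xAA for 0xAA55 */
}

/* Reads a little-endian 16-bit word back from two bytes. */
uint16_t load_le16(const uint8_t *src)
{
    assert(src != NULL);
    return (uint16_t)(src[0] | (src[1] << 8));
}
```

Storing 0xAA55 this way leaves 0x55 in the first byte and 0xAA in the second, which is exactly the 55aa at the end of the hexdump.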

Booting into our bootloader

To boot into it, make sure you have assembled your bootloader code with nasm.

Run this command:

qemu-system-x86_64 bootloader.bin

After you run this, you will see a qemu window which shows some initialization text and is otherwise blank. That means your bootloader works perfectly: we just programmed it to loop, so it is doing just that. Here’s what the window looks like:

QEMU screenshot 1.


The final code

We are finally at almost the end of the blog post, and we will now add the final features to our bootloader. These features are not going to be anything fancy; we are only going to make it display the text that we type.
Here’s the code for it:

[org 0x7c00]

mov bp, 0xffff
mov sp, bp

call set_video_mode
call get_char_input

jmp $

set_video_mode:
	mov ah, 0x00
	mov al, 0x03
	int 0x10
	ret

get_char_input:
	xor ah, ah 		; same as mov ah, 0x00
	int 0x16

	mov ah, 0x0e
	int 0x10

	jmp get_char_input

times 0x1fe-($-$$) db 0
dw 0xaa55

The org directive

The difference between an instruction and a directive is that an instruction is directly translated to something the CPU can execute, whereas a directive is something the assembler interprets and uses while assembling; it does not produce any machine code.
The first line may look a bit complex because, unlike other instructions, it has brackets around it, but there’s nothing to worry about. You can just forget about the brackets and focus on the actual directive, which is org 0x7C00. Here’s the explanation:
As we know, bootloaders get loaded at the memory address 0x7C00, but the assembler doesn’t know this. That is why we use the org directive to tell the assembler to assume that the address of the beginning of our code (the base address) is <operand>, which is 0x7C00 in this case. Once the assembler knows the base address of the program, every address that it uses while assembling the code will be relative to the base address that we have defined. For example, if we do not use this directive, the assembler will assume the base address to be 0x00 and the address of every function and instruction will be calculated like this:

0x00+relative_addr_of_function
; base_addr + relative_addr_of_function
; base_addr + relative_addr_of_instruction

and these addresses won’t work at the runtime of our bootloader, as it will not be loaded at that address. That is why we need to use the org directive.
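
To make the effect of the base address concrete, here is a tiny sketch of the calculation. The helper name label_address is made up, and the offset 0x0d used in the note below is just a hypothetical example value, not something read from the assembler’s output:

```c
#include <assert.h>
#include <stdint.h>

/* Address the assembler assigns to a label that sits `offset` bytes into
   the file, given the base address declared with org (0 when org is absent). */
uint16_t label_address(uint16_t org, uint16_t offset)
{
    assert((uint32_t)org + offset <= 0xFFFF); /* must still fit in 16 bits */
    return (uint16_t)(org + offset);
}
```

A label that sits 0x0d bytes into the file gets the address 0x000d without org, but 0x7c0d with org 0x7c00, and only the second one matches where the code really is in memory after the firmware has loaded it.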
Visual comparison of effects of using and not using the org directive:

code without org directive
code with org directive

Setting up the registers.

The next thing we do is set the correct values for the registers.
First we set the bp (base pointer) register to the address 0xffff and then copy it to sp (stack pointer). Hold up! Why this address?
In order to understand this, we first need to look at the memory layout of the system during the booting process. Here is what it looks like:

Memory layout of the system while booting.

As you can see, the address that we are setting the base pointer to lies in the free memory that starts at 0x7e00, right after the 512 bytes where our bootloader is loaded (0x7c00 to 0x7dff), and ends before the other section of memory which starts at 0x9cf00. We have set it to 0xffff because if we had set it anywhere else (in some non-free memory), then it could overwrite the other data around it, as the stack increases its size whenever data is pushed onto it. Note that the address 0xffff is arbitrary and you can use any address from the free space. Just make sure that the address you choose is not too close to the boundaries of other regions in memory, because when you push data onto your stack, it expands (the stack grows downwards) and may overwrite the data inside those other regions.
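
“The stack grows downwards” simply means that every push makes sp smaller. In 16-bit mode each push stores one 2-byte word, so sp drops by 2 every time. Here is a tiny sketch of that bookkeeping; sp_after_pushes is a made-up helper that only tracks the register value and does not touch any memory:

```c
#include <assert.h>
#include <stdint.h>

/* Value of the 16-bit stack pointer after `count` word-sized pushes.
   Each push subtracts 2, and the result wraps around like the real register. */
uint16_t sp_after_pushes(uint16_t sp, unsigned count)
{
    assert(count < 0x8000u); /* more pushes than this would lap the whole 64KB */
    return (uint16_t)(sp - 2u * count);
}
```

Starting from 0xffff, one push leaves sp at 0xfffd and three pushes leave it at 0xfff9, so the stack keeps moving towards lower addresses, in the direction of our bootloader.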

Interrupts.

The next line of code after the register setup is a call instruction which calls the function set_video_mode. Here’s the code of the function:

set_video_mode:
	mov al, 0x03
	mov ah, 0x00
	int 0x10
	ret

The first two lines are pretty basic: they just move the constants 0x03 and 0x00 into the al and ah registers. Then we have a new instruction, the int instruction. The int instruction is used to generate a software interrupt. So, what is an interrupt?
Interrupts allow the CPU to temporarily halt (stop) what it is doing and run some other, higher-priority instructions before returning to the original task. An interrupt can be raised either by a software instruction (e.g. int 0x10) or by some hardware device that requires high-priority action (e.g. to read some incoming data from a network device).
Each interrupt has a different number assigned to it, which is an index into the Interrupt Vector Table (IVT). The IVT is basically a table that maps each interrupt number to a vector (a memory address or pointer) which points to an Interrupt Service Routine (ISR). ISRs are initialised by the firmware, and they are basically machine code that differs for each interrupt; each one is a sort of long switch-case statement that runs different code for different arguments. You can think of the IVT as a simple hash table (dictionary) in which each index holds the memory address of a function. Here’s an example:

IVT = {
	1: 0x0...,
	2: 0x0...,
	3: 0x0...,
	4: 0x0...,
	5: 0x0...,
	6: 0x0...,
	7: 0x0...
	...
}
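
In real mode, the IVT starts at address 0 and each entry takes 4 bytes: a 2-byte offset followed by a 2-byte segment. Here is a sketch of how an entry could be looked up in a copy of that memory. The function name ivt_handler_address and the buffer are made up for this example, and the buffer has to hold at least the first 1024 bytes of memory:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Looks up interrupt `vector` in a copy of the first 1KB of real-mode memory
   and returns the linear address of its handler (segment * 16 + offset). */
uint32_t ivt_handler_address(const uint8_t *low_memory, uint8_t vector)
{
    assert(low_memory != NULL);
    uint32_t entry = (uint32_t)vector * 4u; /* 4 bytes per IVT entry */
    /* Both fields are stored little-endian: low byte first. */
    uint16_t offset  = (uint16_t)(low_memory[entry]     | (low_memory[entry + 1] << 8));
    uint16_t segment = (uint16_t)(low_memory[entry + 2] | (low_memory[entry + 3] << 8));
    return ((uint32_t)segment << 4) + offset;
}
```

So the entry for int 0x10 lives at address 0x40 (0x10 * 4). If that entry held, say, offset 0x1234 and segment 0xf000 (values invented for illustration), the handler would be at the linear address 0xf1234.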

The most popular interrupt

If you have ever debugged a program, you might already know what a breakpoint is: it’s simply you asking the debugger to stop the program at some point while it’s running, and the debugger does its job. But how do debuggers even make the program stop while it’s running?
They use interrupt 3, which is specially made for debuggers to stop a running process.

int 3

How do they use this interrupt to pause a program?
Debuggers replace the first byte of the instruction at which the program should stop with the opcode of int 3, which is just the one-byte opcode cc.
Here’s an example:

int-3-instruction-usage

As int 3 has just a single-byte opcode, it makes the task very fast and easy for debuggers. When the int 3 instruction is executed, its index is looked up in the IVT, then its ISR is located and starts running. The ISR then finds the process which needs to get paused, pauses it and notifies the debugger that the process has been stopped. Once the debugger gets this notification, it allows you to inspect the memory and the registers of the process which is being debugged. In order to let the paused process continue, the debugger replaces the cc opcode with the original byte which it had overwritten, and the program continues from the place where it was stopped. Example:

int-3-instruction-reversed

I hope this section helped you understand the real world usage and implementation of a software interrupt, and now you also know how a debugger makes the breakpoint a thing.
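
The byte swapping described above can be sketched on a plain array of bytes. This is a toy model and not a real debugger: the names breakpoint, set_breakpoint and clear_breakpoint are made up, and a real debugger would patch the memory of another process instead of a local buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC

/* Remembers where a breakpoint was placed and which byte it overwrote. */
typedef struct {
    size_t  offset;
    uint8_t saved_byte;
} breakpoint;

/* Saves the original byte at `offset` and replaces it with int 3. */
breakpoint set_breakpoint(uint8_t *code, size_t offset)
{
    assert(code != NULL);
    breakpoint bp = { offset, code[offset] };
    code[offset] = INT3_OPCODE;
    return bp;
}

/* Puts the original byte back so that the program can continue. */
void clear_breakpoint(uint8_t *code, breakpoint bp)
{
    assert(code != NULL);
    code[bp.offset] = bp.saved_byte;
}
```

Setting a breakpoint on a jmp instruction (eb fe) turns its first byte into cc, and clearing it restores the eb, just like in the two pictures above.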

int 0x10

Now that you have a good understanding of interrupts and have also seen a real world example of one, let’s understand the usage of the interrupt that is present in the set_video_mode function, the interrupt 0x10. The interrupt 0x10 provides video/screen related functions. In order to select between the different functions, we set the ah and al registers to different values. These are some of the values to which the ah register can be set:

  • AH=0x00: Video mode.
  • AX=0x1003: Blinking mode.
  • AH=0x13: Write string.
  • AH=0x03: Get cursor position.
  • AH=0x0e: Write Character in TTY Mode.

set_video_mode:
	mov ah, 0x00
	mov al, 0x03
	int 0x10
	ret

Explanation: The first mov instruction sets the value of the ah register to 0x00, which asks the ISR of interrupt 0x10 to set the video mode to the mode specified in the al register. These are some of the supported video modes, with their values for the al register:

  • AL=0x00 - text mode. 40x25. 16 colors.
  • AL=0x03 - text mode. 80x25. 16 colors.
  • AL=0x13 - graphical mode. 40x25. 256 colors. 320x200 pixels.

So, both registers combined are basically asking the ISR of interrupt 0x10 to set the video mode of the screen to text mode with a size of 80x25 and 16 colors, and that is the only purpose of this function.
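
If the relation between ah, al and ax is unclear: ah and al are simply the high and low bytes of the 16-bit ax register. Here is a small sketch of that relation, with made-up helper names (make_ax, split_ax):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Joins the two 8-bit halves into the 16-bit AX value: AH is the high byte. */
uint16_t make_ax(uint8_t ah, uint8_t al)
{
    return (uint16_t)(((uint16_t)ah << 8) | al);
}

/* Splits a 16-bit AX value back into its AH (high) and AL (low) halves. */
void split_ax(uint16_t ax, uint8_t *ah, uint8_t *al)
{
    assert(ah != NULL && al != NULL);
    *ah = (uint8_t)(ax >> 8);
    *al = (uint8_t)(ax & 0xFF);
}
```

So ah = 0x00 with al = 0x03 is the same as ax = 0x0003, and the AX=0x1003 entry in the list above means ah = 0x10 and al = 0x03.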

int 0x16.

The other function we are left with is get_char_input. In this function, we have another interrupt, which is interrupt 0x16.
The interrupt 0x16 is used for basic keyboard related functions. These are some of the values that can be set in the ah register to use different keyboard functions:

  • AH = 0x00 - Read key press.
  • AH = 0x01 - Get state of the keyboard buffer.
  • AH = 0x02 - Get the State of the keyboard.
  • AH = 0x03 - Establish repetition factor.
  • AH = 0x05 - Simulate a keystroke
  • AH = 0x0A - Get the ID of the keyboard.

Implementation of interrupts into something useful

get_char_input:
	xor ah, ah		; same as mov ah, 0x00
	int 0x16

	mov ah, 0x0e
	int 0x10

	jmp get_char_input

The first thing done in the function’s code is xoring the ah register with itself, which has the same effect as mov ah, 0x00. Xoring a register with itself is a common idiom for zeroing it and is believed to be faster and less CPU expensive, so I used it.
After setting ah to zero, it calls the interrupt 0x16, whose ISR then reads a keystroke from the keyboard and stores its character in the al register.
After that, it sets the ah register to 0x0e and calls our good old interrupt 0x10, but this time it is not setting the video mode, as the ah register is not set to 0x00. If you read the functions of the interrupt 0x10 again, you will find that ah = 0x0e asks its ISR to “write a character in tty mode”, which basically means “write a character to the screen”. The character which this ISR prints is taken from the al register. So, these two interrupts together read a character from the keyboard (using interrupt 0x16) and print it onto the screen (using interrupt 0x10).
After printing the character, the function simply jumps back to its own start (an infinite loop) to continue what it’s doing forever until it’s manually stopped.

Our bootloader in action

The final thing we are left with is to see our bootloader in action, so let’s do it. Assemble the code:

$ nasm bootsector.asm -f bin -o bootloader.bin

Run it with qemu:

qemu-system-x86_64 bootloader.bin

Now, you should have a blank window of qemu. You can now type anything and it’ll display it on the screen, and that is all there is to it.

final-bootloader-screenshot

Summary

We started this blog post by understanding the boot process of a computer, then we learnt some new assembly instructions, and then we learnt what interrupts are and how they work. After that, we learnt how debuggers implement breakpoints using interrupts, and lastly we learnt how the interrupt 0x10 and interrupt 0x16 can be used to read characters from the keyboard and print them on the screen.

Author notes

This post really took a lot of my time, effort and understanding of different aspects of an Operating System. I tried my best to explain everything, and I hope that you learnt many new things throughout this blog post.
If you find this fascinating and you want to build your own fully-fledged Operating System, then you can continue learning OS dev. To make your lazy life easier, I have linked to different places where you can learn OS dev in the resources section.

Resources
