Playing with Gothic Virtual File System (VDFS)

Introduction

When I was playing Gothic around 2010, I became curious about how the game engine actually works. As a starting point I picked, somewhat arbitrarily, the vdfs32g.dll library, which provides a “virtual file system” for Gothic’s engine, ZenGin. I also wanted to inspect it to find out why the game sometimes crashed inside the library’s code. As far as I remember, I didn’t find any serious malfunction there, but I decided to play around with it a little anyway.

Most of the game content was packaged into large archive files such as Texture.vdf and Sound.vdf. There was also a special .mod extension reserved for community-made modifications (exactly the same file format). The vdfs32g.dll library made it possible to mount these archives and access the files inside them through a C API that closely resembled the standard file API (e.g. vdf_fopen to open a file, vdf_fread to read its contents).

So for instance, when the engine did something like:

char buffer[16];
FILE *fp = vdf_fopen("textures/some_file.jpg", "rb");
vdf_fread(buffer, 16, 1, fp);

Then the library would open the Texture.vdf file and read its “table of contents” (however it is implemented) to determine the offset at which textures/some_file.jpg is stored in the archive. Finally, the appropriate read into the target buffer was performed.

Such a layer of abstraction makes game distribution easier (you ship 10 big files instead of 300000 small ones, which is convenient as long as updates are infrequent and each covers a large part of the game) and also lets the library enforce a custom file caching policy internally. Moreover, modders may ship “overlay” archives which override some of the original files; this happens at the VDFS level and makes absolutely no difference to the engine. For instance, if textures/some_file.jpg exists in both Texture.vdf and my_super_modification.mod, the file will be read from the latter because of its higher priority.
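
The overlay behaviour can be sketched as a priority-ordered lookup. This is only an illustration of the idea, not how vdfs32g.dll actually implements it: the Archive struct, the resolve function and the numeric priorities are all hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of a mounted archive: a name, a priority and a
// "table of contents" mapping virtual paths to offsets inside the archive.
struct Archive {
    std::string name;
    int priority;                      // higher value wins
    std::map<std::string, long> toc;   // virtual path -> offset in archive
};

// Returns the archive that should serve `path`, or nullptr if no mounted
// archive contains it. Among the archives that contain the path, the one
// with the highest priority is chosen.
const Archive* resolve(const std::vector<Archive>& mounted,
                       const std::string& path) {
    const Archive* best = nullptr;
    for (const Archive& a : mounted) {
        if (a.toc.count(path) == 0) continue;
        if (best == nullptr || a.priority > best->priority) best = &a;
    }
    return best;
}
```

With Texture.vdf mounted at priority 1 and my_super_modification.mod at priority 2, a lookup of a path present in both would return the .mod archive, while a path present only in Texture.vdf would still be served from there.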

I wanted to play with the library a bit using a technique called DLL spoofing (this tutorial was authored by Gynvael Coldwind). I decided to implement a very simple file obfuscator (“encryptor”): the files are stored “encrypted” inside the actual archive, so tools designed for VDF archive manipulation extract only the encrypted (useless) data. The engine, however, still reads them properly, because the behaviour of the vdf_fread function is modified to “decrypt” files on the fly.

VDF Obfuscation

The very first thing to do is to take the original vdfs32g.dll library and examine its exports using the impdef tool:

LIBRARY     VDFS32G.DLL

DESCRIPTION 'DFS Checker/Maker V2.7lfg dll (C) 1994-2002 Peter Sabath / TRIACOM Software GmbH for G'

EXPORTS
    GetFileInfo                                    @19  ; GetFileInfo
    vdf_GetOption                                  @21  ; vdf_GetOption
    vdf_changedir                                  @9   ; vdf_changedir
    vdf_exitall                                    @17  ; vdf_exitall
    vdf_fclose                                     @2   ; vdf_fclose
    vdf_fdirexists                                 @8   ; vdf_fdirexists
    vdf_fexists                                    @7   ; vdf_fexists
    vdf_ffilesize                                  @18  ; vdf_ffilesize
    vdf_findclose                                  @13  ; vdf_findclose
    vdf_findnext                                   @12  ; vdf_findnext
    vdf_findopen                                   @11  ; vdf_findopen
    vdf_fopen                                      @1   ; vdf_fopen
    vdf_fread                                      @3   ; vdf_fread
    vdf_fseek                                      @4   ; vdf_fseek
    vdf_fseekrel                                   @5   ; vdf_fseekrel
    vdf_ftell                                      @6   ; vdf_ftell
    vdf_getdir                                     @10  ; vdf_getdir
    vdf_getlasterror                               @15  ; vdf_getlasterror
    vdf_initall                                    @16  ; vdf_initall
    vdf_searchfile                                 @14  ; vdf_searchfile
    vdf_setOption                                  @20  ; vdf_setOption

We now have our original vdfs32g.def file with the list of the library’s exports. We are going to rename the original library to OrgVdfs32g.dll and introduce an additional decorator/wrapper-like DLL under the original name vdfs32g.dll. Our wrapper DLL must have exactly the same exports (both names and ordinals) so as not to confuse the game engine. The wrapper will essentially intercept the function calls made by the engine and call the corresponding functions in the original OrgVdfs32g.dll library. However, in the wrapper we have full control over what happens before and after the call to the original library. The same, in pseudocode:

int vdf_something(int some_arg) {
    // here we can execute arbitrary code, e.g. change the initial argument `some_arg`
    // call the original library
    int ret = original_vdf_something(some_arg);
    // also here we may inject some logic, e.g. changing the return value
    return ret;
}

What’s good about .def files is that we can pass them directly to the compiler, essentially telling it “take my code and build a DLL compatible with the definition in the provided .def”.

By default, given the above definition, the compiler would look for the symbol vdf_fopen in our object files and export it under the name vdf_fopen. We can override that by using the = character, for instance:

    vdf_fopen=hook_vdf_fopen                       @1

would mean that “physically” in our library, there is some hook_vdf_fopen which we would like to export, but the exported name seen from the outside has to be vdf_fopen. @1 means that we want to export the function under ordinal 1 (imports may be performed “by name” or “by ordinal”).

Moreover, if we write:

    vdf_changedir=OrgVdfs32g.vdf_changedir         @9

then it means that we want to export the name vdf_changedir from our library, but the actual implementation lives in the OrgVdfs32g library under the name vdf_changedir. Such a trick (known as export forwarding) is perfectly valid. What we are going to do is implement wrappers for the functions we want to wrap and redirect all the others straight to the original library. The resulting vdfs32g.def is:

LIBRARY     VDFS32G.DLL

DESCRIPTION 'DFS Checker/Maker V2.7lfg dll (C) 1994-2002 Peter Sabath / TRIACOM Software GmbH for G'

EXPORTS
; --- not wrapped, redirected straight into the original library ---
    GetFileInfo=OrgVdfs32g.GetFileInfo             @19  ; GetFileInfo
    vdf_GetOption=OrgVdfs32g.vdf_GetOption         @21  ; vdf_GetOption
    vdf_changedir=OrgVdfs32g.vdf_changedir         @9   ; vdf_changedir
    vdf_exitall=OrgVdfs32g.vdf_exitall             @17  ; vdf_exitall
;   vdf_fclose=OrgVdfs32g.vdf_fclose               @2   ; vdf_fclose
    vdf_fdirexists=OrgVdfs32g.vdf_fdirexists       @8   ; vdf_fdirexists
    vdf_fexists=OrgVdfs32g.vdf_fexists             @7   ; vdf_fexists
    vdf_ffilesize=OrgVdfs32g.vdf_ffilesize         @18  ; vdf_ffilesize
    vdf_findclose=OrgVdfs32g.vdf_findclose         @13  ; vdf_findclose
    vdf_findnext=OrgVdfs32g.vdf_findnext           @12  ; vdf_findnext
    vdf_findopen=OrgVdfs32g.vdf_findopen           @11  ; vdf_findopen
;   vdf_fopen=OrgVdfs32g.vdf_fopen                 @1   ; vdf_fopen
;   vdf_fread=OrgVdfs32g.vdf_fread                 @3   ; vdf_fread
;   vdf_fseek=OrgVdfs32g.vdf_fseek                 @4   ; vdf_fseek
    vdf_fseekrel=OrgVdfs32g.vdf_fseekrel           @5   ; vdf_fseekrel
;   vdf_ftell=OrgVdfs32g.vdf_ftell                 @6   ; vdf_ftell
    vdf_getdir=OrgVdfs32g.vdf_getdir               @10  ; vdf_getdir
    vdf_getlasterror=OrgVdfs32g.vdf_getlasterror   @15  ; vdf_getlasterror
    vdf_initall=OrgVdfs32g.vdf_initall             @16  ; vdf_initall
    vdf_searchfile=OrgVdfs32g.vdf_searchfile       @14  ; vdf_searchfile
    vdf_setOption=OrgVdfs32g.vdf_setOption         @20  ; vdf_setOption
; --- wrapped functions which will be redirected to our additional library ---
    vdf_fopen=hook_vdf_fopen                       @1
    vdf_fclose=hook_vdf_fclose                     @2
    vdf_fread=hook_vdf_fread                       @3
    vdf_fseek=hook_vdf_fseek                       @4
    vdf_ftell=hook_vdf_ftell                       @6

Up to now, our wrapper library doesn’t depend on the original library, so we may change the implementation of some functions, but we are not able to refer to the original ones. Thus, we need to import them from OrgVdfs32g.dll:

IMPORTS
    vdf_fopen=OrgVdfs32g.vdf_fopen
    vdf_fclose=OrgVdfs32g.vdf_fclose
    vdf_fread=OrgVdfs32g.vdf_fread
    vdf_ftell=OrgVdfs32g.vdf_ftell
    vdf_fseek=OrgVdfs32g.vdf_fseek

At this point, the situation is as follows:

  • The original library is renamed to OrgVdfs32g.dll.
  • We will soon add a new library under the name vdfs32g.dll.
  • Our library will export all the symbols listed above (GetFileInfo, vdf_GetOption, vdf_changedir, …), so it will be fully compatible in terms of its interface.
  • The game engine will call our code when referring to the vdf_fopen, vdf_fclose, vdf_fread, vdf_fseek and vdf_ftell functions. The rest will go directly to the original implementation.
  • For clarity, we are keeping the original functions under their original names and the “decorators” are called hook_<orig_function_name>.

Wrapper DLL prototypes

What is left is the implementation. The crucial thing is to declare the function prototypes correctly. These may be determined by disassembling the library (or something that uses it), seeing how many values are passed, and guessing the names of the arguments, their types, etc.

(TODO some examination with IDA?)

The result is:

#define DLL_IMPORT __declspec(dllimport)
#define DLL_EXPORT __declspec(dllexport)

extern "C" {
    int DLL_IMPORT vdf_fopen(char* name, int mode);
    int DLL_IMPORT vdf_fclose(int handle);
    long DLL_IMPORT vdf_fread(int handle, char* buffer, long len);
    long DLL_IMPORT vdf_ftell(int handle);
    int DLL_IMPORT vdf_fseek(int handle, long offset);

    int DLL_EXPORT hook_vdf_fopen(char* name, int mode);
    int DLL_EXPORT hook_vdf_fseek(int handle, long offset);
    int DLL_EXPORT hook_vdf_fclose(int handle);
    long DLL_EXPORT hook_vdf_fread(int handle, char* buffer, long len);
    long DLL_EXPORT hook_vdf_ftell(int handle);
}

On the assembly level it’s (un)fortunately all about the interpretation (one may replace int by char[4] in the above code, in most cases it’s still going to be valid).

Now, everything will be fairly easy. We just need to implement our “decorators”. It’s a good idea to stub them in order to see if everything works correctly.

int DLL_EXPORT hook_vdf_fopen(char* name, int mode) {
    return vdf_fopen(name, mode);
}

int DLL_EXPORT hook_vdf_fseek(int handle, long offset) {
    return vdf_fseek(handle, offset);
}

int DLL_EXPORT hook_vdf_fclose(int handle) {
    return vdf_fclose(handle);
}

long DLL_EXPORT hook_vdf_fread(int handle, char* buffer, long len) {
    return vdf_fread(handle, buffer, len);
}

long DLL_EXPORT hook_vdf_ftell(int handle) {
    return vdf_ftell(handle);
}

Right now our library is completely transparent. We may compile it using the following commands:

# generate an import library for the original (renamed) DLL
impdef orgVdfs32g.def orgVdfs32g.dll
dlltool -d orgVdfs32g.def -l liborgvdfs32g.a
# compile our C++ spoof code
gcc -O2 spoof.cpp -c -lshlwapi -Wall
# link spoof with its dependencies and with the original library
dllwrap -def fwdVdfs32g.def -o vdfs32g.dll spoof.o -lshlwapi -lstdc++ liborgvdfs32g.a -static

Actual obfuscator

Our obfuscator is going to “encrypt” files contained in VDF archives with a simple repeating-key XOR cipher. The “encryption” keys will be generated straight from the base file names.

It may happen that not all files are obfuscated, so the obfuscated ones have to be marked somehow. Let’s say that we prepend a special header to each obfuscated file, consisting of the magic 0xC0FFEE24 (4 bytes) followed by the SHA-1 digest of the valid “encryption” key used for that file (20 bytes). The digest is used for verification purposes.
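
To make the layout concrete, here is a minimal sketch of how such a header could be detected in a raw buffer. The constant names mirror the ones used in the hook code (MAGIC, MAGIC_SIZE, KEY_LEN), but their definitions here, HEADER_SIZE and the has_obfuscation_header helper are my assumptions. The magic is compared in the machine’s native byte order, the same way the hook reads it into a uint32_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumed definitions; the post uses these names but does not show them.
const std::uint32_t MAGIC = 0xC0FFEE24u;
const std::size_t MAGIC_SIZE = sizeof(MAGIC);          // 4 bytes
const std::size_t KEY_LEN = 20;                        // size of a SHA-1 digest
const std::size_t HEADER_SIZE = MAGIC_SIZE + KEY_LEN;  // 24 bytes in total

// Hypothetical helper: true if `data` is long enough to hold the whole
// header and starts with the magic (native byte order).
bool has_obfuscation_header(const unsigned char* data, std::size_t size) {
    if (size < HEADER_SIZE) return false;
    std::uint32_t magic;
    std::memcpy(&magic, data, MAGIC_SIZE);
    return magic == MAGIC;
}
```

The actual file contents start at offset HEADER_SIZE, which is why every position the engine sees has to be shifted by 24 bytes for obfuscated files.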

Let’s start with the actual implementation. We modify the hook_vdf_fopen function so that once a new file is opened, it reads the first 4 bytes to check whether the file contents start with the magic 0xC0FFEE24. If so, the file is obfuscated and we need to deobfuscate (decrypt) the buffers returned from vdf_fread every time a read occurs. Before doing that, we check the verification digest to see whether the generated key is correct.

int DLL_EXPORT hook_vdf_fopen(char* name, int mode) {
    int handle = vdf_fopen(name, mode);

    if (handle >= 0) {
        // check for encrypted file header
        uint32_t magic;
        if (vdf_fread(handle, (char*)&magic, MAGIC_SIZE) == MAGIC_SIZE
                && magic == MAGIC) {
            // file contains magic prefix, let's generate a "decryption" key
            // and check if it's correct
            std::string base_name = strip_name(name);
            key_t key = new unsigned char[KEY_LEN];
            sha1::calc(base_name.c_str(), base_name.size(), key);
            if (!check_key(handle, key)) {
                std::string error_text = "GVC: KEY FAIL:";
                error_text += name;
                FatalAppExit(0, error_text.c_str());
                return -1;
            }
            // if so, then link the file handle returned from vdf_fopen with the proper key
            vdf_keys.insert(std::pair<int, key_t>(handle, key));
        } else {
            // file is not prefixed, so probably it's not obfuscated
            // then just rewind to the beginning of the file
            vdf_fseek(handle, 0);
        }
    }

    return handle;
}

The encryption key has a length of 20 bytes and is generated simply by computing SHA-1 of the name of the file being opened (after stripping the path down to the base name, so the key is the same for ./foo.txt and foo.txt, etc.).
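
The strip_name helper is not shown in this post. A minimal stand-in could look like the sketch below; base_name_of is a hypothetical name, and the real helper may behave differently (for instance with respect to letter case).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for strip_name: returns everything after the last
// '/' or '\\' in `path`, or the whole string if there is no separator.
std::string base_name_of(const std::string& path) {
    const std::string::size_type pos = path.find_last_of("/\\");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}
```

Both ./foo.txt and foo.txt reduce to foo.txt here, so both would hash to the same key.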

Right, now we need to implement hook_vdf_fread. If the engine wants to read something from the descriptor which is contained inside vdf_keys map, then we must deobfuscate the buffer before passing it back:

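long DLL_EXPORT hook_vdf_fread(int handle, char* buffer, long len) {
    long result = vdf_fread(handle, buffer, len);
    if (result > 0 && vdf_keys.find(handle) != vdf_keys.end()) {
        // use the number of bytes actually read: a short read near the end
        // of the file would otherwise skew the offset and the key position
        long offset = vdf_ftell(handle)-result-MAGIC_SIZE-KEY_LEN;
        crypt_buffer(buffer, result, vdf_keys[handle], offset);
    }
    return result;
}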

Where crypt_buffer is actually a trivial xor-loop:

void crypt_buffer(char* buffer, unsigned int len,
                  key_t key, unsigned int initpos) {
    unsigned int pos = initpos % KEY_LEN;
    for(unsigned int i = 0; i < len; i++) {
        *buffer ^= key[pos];
        buffer++;
        pos++;
        if (pos >= KEY_LEN) {
            pos = 0;
        }
    }
}
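
Because XOR is its own inverse, the same routine both obfuscates and deobfuscates, and deriving the key position from the byte’s offset in the file lets independent reads line up with the key. The sketch below checks exactly that; xor_range is a std::vector-based restatement of crypt_buffer written only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Same idea as crypt_buffer: XOR `len` bytes starting at index `start`
// with the repeating key, where the key position is derived from each
// byte's absolute offset in the (header-less) file contents.
void xor_range(Bytes& data, std::size_t start, std::size_t len,
               const Bytes& key) {
    assert(!key.empty() && start + len <= data.size());
    for (std::size_t i = start; i < start + len; ++i) {
        data[i] ^= key[i % key.size()];
    }
}
```

Obfuscating a buffer in one pass and then deobfuscating it in two separate chunks (as two consecutive vdf_fread calls would) yields the original bytes again, as long as each chunk starts from its true offset.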

It would have been possible to use something stronger here (AES?), but my assumption was that anybody able to reverse-engineer my obfuscator would not be stopped by a stronger cipher anyway, so there was no practical reason to do anything more advanced.

We also need to take into account that the vdf_ftell and vdf_fseek functions would work with wrong positions for obfuscated files, as their contents are prefixed with 4+20 bytes. Thus, when dealing with a descriptor pointing to an obfuscated file, we need to subtract the header size from the value reported by vdf_ftell (and, symmetrically, add it to the offset passed to vdf_fseek):

long DLL_EXPORT hook_vdf_ftell(int handle) {
    long result = vdf_ftell(handle);

    if (vdf_keys.find(handle) != vdf_keys.end()) {
        result -= MAGIC_SIZE+KEY_LEN;
    }

    return result;
}
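
Only the vdf_ftell side is shown here. A matching hook_vdf_fseek would presumably shift the requested position in the opposite direction. The sketch below isolates just that arithmetic; to_physical and to_logical are hypothetical helper names, and HEADER_SIZE assumes the 4+20 byte header described earlier.

```cpp
#include <cassert>

const long HEADER_SIZE = 4 + 20;  // magic + SHA-1 digest

// Position the engine asks for -> position in the stored (prefixed) file.
long to_physical(long logical, bool obfuscated) {
    return obfuscated ? logical + HEADER_SIZE : logical;
}

// Position in the stored file -> position the engine should see.
long to_logical(long physical, bool obfuscated) {
    return obfuscated ? physical - HEADER_SIZE : physical;
}
```

Inside the hook this would amount to something like vdf_fseek(handle, to_physical(offset, is_obfuscated)), where is_obfuscated is the result of the vdf_keys lookup.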

Finally, the proper cleanup is required once the descriptor is no longer valid.

int DLL_EXPORT hook_vdf_fclose(int handle) {
    if (vdf_keys.find(handle) != vdf_keys.end()) {
        delete[] vdf_keys[handle];
        vdf_keys.erase(handle);
    }

    return vdf_fclose(handle);
}
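
As a variation on the cleanup above (not what my code does), the keys could be stored by value so that the map owns the bytes and no manual delete[] is needed. The names Key, owned_keys, remember_key and forget_key are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

typedef std::vector<unsigned char> Key;

// Hypothetical replacement for vdf_keys: the map owns the key bytes.
std::map<int, Key> owned_keys;

// Attach a key to a handle (called from the fopen hook in this variation).
void remember_key(int handle, const Key& key) { owned_keys[handle] = key; }

// Drop the key for a handle; returns true if the handle had one attached,
// i.e. if it pointed to an obfuscated file.
bool forget_key(int handle) { return owned_keys.erase(handle) > 0; }
```

The close hook would then shrink to calling forget_key(handle) followed by vdf_fclose(handle).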

Here we go, the only thing left is to compile the library.

In order to actually obfuscate the VDF archives, we may extract their contents using the tools provided by the game’s creators (in the mod SDK) and then write a program which applies our trivial cipher. A simple tool which reads a file’s contents, encrypts them with the crypt_buffer function and prepends the appropriate header would be enough.
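
A minimal sketch of the core of such a tool, under stated assumptions: the key and its verification digest are passed in ready-made (the SHA-1 computation is omitted here), the magic is written in native byte order, and obfuscate is a hypothetical name rather than a function from my repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

typedef std::vector<unsigned char> Bytes;

const std::uint32_t MAGIC = 0xC0FFEE24u;

// Builds the obfuscated file image: the magic, then the verification
// digest, then the contents XOR-ed with the repeating key.
Bytes obfuscate(const Bytes& plain, const Bytes& key, const Bytes& digest) {
    assert(!key.empty());
    Bytes out(sizeof(MAGIC));
    std::memcpy(out.data(), &MAGIC, sizeof(MAGIC));
    out.insert(out.end(), digest.begin(), digest.end());
    for (std::size_t i = 0; i < plain.size(); ++i) {
        out.push_back(static_cast<unsigned char>(plain[i] ^ key[i % key.size()]));
    }
    return out;
}
```

A real tool would wrap this in file reading and writing and then repack the results into a .vdf or .mod archive with the SDK tools.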

Actual strength of the solution

The approach presented here was later used by some Gothic modders to obfuscate beta pre-releases of their mods, so that buggy beta builds would not leak prematurely and disappoint the mods’ fans. Such a thing, combined with some hardware ID checks, could actually do the job, provided that nobody among the testers is capable of basic reverse engineering.

The quotation marks around the earlier uses of the word “encryption” are not accidental: the proposed solution has nothing to do with real-world cryptography (but it is very fast, which is the cool thing about it). Also, be aware that such an obfuscator can only stop “civilians”, as it can be broken by anybody who knows assembly and is willing to devote an hour (less? more?) to cracking it.

Even when I was writing the code in 2010, I was aware of some of the issues with the “strength” of such an obfuscator. Let’s mention them for completeness:

Problem 1: XOR loop encryption can be trivially cracked without knowing anything about the encryption key, provided that somebody just knows that the algorithm is:

// m[n] - plain text message of length n
// k[N] - encryption key of length N
// c[n] - ciphertext

for (int i = 0; i < n; i++) {
    c[i] = m[i] ^ k[i % N];
}

If we know both the ciphertext c[n] and the corresponding plaintext m[n] (a known-plaintext attack), we can perform the following algorithm:

  1. Suppose that the key length is some arbitrary N.
  2. Substitute i = 0 into the following set of equations and solve it for k[i] (\(\oplus\) means xor):
\[\begin{cases} c_{i} = m_{i} \oplus k_{i} \\ c_{N+i} = m_{N+i} \oplus k_{i} \\ c_{2N+i} = m_{2N+i} \oplus k_{i} \end{cases}\]
  3. If there is no solution, then try again with a different N.
  4. If there is a single solution, then solve with i = 1, 2, ..., N-1.
  5. We’ve got the key and its length.
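
The procedure above can be sketched in a few lines. This is an illustration written for this post, with recover_key as a hypothetical name; it returns the shortest key that explains the data, which may be shorter than the real one if the key happens to be periodic or the known plaintext is very short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Known-plaintext attack on a repeating-key XOR cipher. Tries key lengths
// N = 1, 2, ... and returns the first key for which every position j
// satisfies c[j] ^ m[j] == k[j % N]. Returns an empty vector on bad input.
Bytes recover_key(const Bytes& plain, const Bytes& cipher) {
    if (plain.size() != cipher.size() || plain.empty()) return Bytes();
    const std::size_t n = plain.size();
    for (std::size_t N = 1; N <= n; ++N) {
        // Candidate key: the first N equations solved for k[0..N-1].
        Bytes key(N);
        for (std::size_t i = 0; i < N; ++i) {
            key[i] = static_cast<unsigned char>(plain[i] ^ cipher[i]);
        }
        // The remaining equations must agree with the candidate.
        bool consistent = true;
        for (std::size_t j = N; j < n && consistent; ++j) {
            consistent = (plain[j] ^ cipher[j]) == key[j % N];
        }
        if (consistent) return key;
    }
    return Bytes();
}
```

With a 20-byte key, a few dozen bytes of known plaintext (a file format signature plus a predictable header, for example) are typically enough for this to pin down the whole key.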

Problem 2: Cracking is not even necessary: the encryption key is just the SHA-1 digest of the base file name, so attackers may simply generate the keys themselves.

Problem 3: Knowing the key is also not necessary, because one may write a simple program which would import vdfs32g.dll and simply ask to read all the files. This could be solved by introducing some additional checks inside our library to ensure that it was loaded by the game engine, not the external program.

Problem 3a: If we have such checks, nothing stops the attacker from performing DLL spoofing of our DLL spoofed library (wut?) and grabbing all the files, so we also need to check against that.

Problem 3b: If we have such checks, somebody may still reverse-engineer the library in order to remove the instructions corresponding to these checks.

Problem 4: No matter what we do, there is still a possibility to grab the encryption keys/raw file contents just from the program’s memory, so we need a serious packer/DRM-like solution/wtf in order to make it harder.

Summary

Whatever a computer can execute, a human can always reverse-engineer with greater or smaller effort. Even so, it’s fairly easy to invent something trivial and have fun implementing it (gaining experience along the way). Next time I will show something more advanced, also related to Gothic.

The source code is available on GitHub.

This post is licensed under CC BY 4.0 by the author.