Wednesday, April 8, 2015

Anonymous git, among other anon things

There's a pdf over on defcon about de-anonymizing various attempts at anonymity. It would be intensely interesting to implement a service as described in the paper.

Also, a way to anonymously submit public domain code would be very interesting as well.

Tuesday, April 7, 2015

Metric Lumber Company

Need metric-based lumber companies to help move us onto metric.

Monday, April 6, 2015

Imperative vs Declarative

Source

Let's generalize and say that there are two ways in which we can write code: imperative and declarative.
We could define the difference as follows:
  • Imperative programming: telling the "machine" how to do something, and as a result what you want to happen will happen.
  • Declarative programming: telling the "machine"[1] what you would like to happen, and letting the computer figure out how to do it.
[1] Computer/database/programming language/etc.

Examples of imperative and declarative code


Taking a simple example, let's say we wish to double all the numbers in an array.
We could do this in an imperative style like so:
var numbers = [1,2,3,4,5]
var doubled = []

for(var i = 0; i < numbers.length; i++) {
  var newNumber = numbers[i] * 2
  doubled.push(newNumber)
}

console.log(doubled) //=> [2,4,6,8,10]
We explicitly iterate over the length of the array, pull each element out of the array, double it, and add the doubled value to the new array, mutating the doubled array at each step until we are done.
A more declarative approach might use the Array.map function and look like:
var numbers = [1,2,3,4,5]
 
var doubled = numbers.map(function(n) {
  return n * 2
})
console.log(doubled) //=> [2,4,6,8,10]
map creates a new array from an existing array, where each element in the new array is created by passing the elements of the original array into the function passed to map (function(n) { return n*2 } in this case).
What the map function does is abstract away the process of explicitly iterating over the array, and lets us focus on what we want to happen. Note that the function we pass to map is pure; it doesn't have any side effects (change any external state), it just takes in a number and returns the number doubled.
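To make concrete what map is hiding from us, here is a hand-rolled sketch of a map-like helper (purely illustrative; in practice you would use the built-in Array.prototype.map):

```javascript
// A hand-rolled map: all the iteration the built-in version abstracts away.
function myMap(array, fn) {
  var result = []
  for (var i = 0; i < array.length; i++) {
    result.push(fn(array[i]))
  }
  return result
}

console.log(myMap([1,2,3,4,5], function(n) { return n * 2 })) //=> [2,4,6,8,10]
```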
There are other common declarative abstractions for lists that are available in languages with a functional bent. For example, to add up all the items in a list imperatively we could do this:
var numbers = [1,2,3,4,5]
var total = 0

for(var i = 0; i < numbers.length; i++) {
  total += numbers[i]
}
console.log(total) //=> 15
Or we could do it declaratively, using the reduce function:
var numbers = [1,2,3,4,5]

var total = numbers.reduce(function(sum, n) {
  return sum + n
});
console.log(total) //=> 15
reduce boils a list down into a single value by repeatedly applying the given function to the items of the array. On each invocation, the first argument (sum in this case) is the accumulated result so far (since we pass no initial value, it starts out as the first element), and the second (n) is the current element. So at each step we add n to sum and return it, leaving us with the sum of the entire array at the end.
Again, reduce abstracts over the how and deals with the iteration and state management side of things for us, giving us a generic way of collapsing a list to a single value. All we have to do is specify what we are looking for.
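One detail worth knowing: reduce also accepts an optional initial value for the accumulator as a second argument. When it is omitted, as above, the first element is used as the starting sum. Passing it explicitly makes the intent clearer and handles empty arrays:

```javascript
var numbers = [1,2,3,4,5]

// 0 is the explicit starting value for sum
var total = numbers.reduce(function(sum, n) {
  return sum + n
}, 0)

console.log(total) //=> 15
console.log([].reduce(function(sum, n) { return sum + n }, 0)) //=> 0
```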

 

Strange?


If you have not seen map or reduce before, this will feel and look strange at first, I guarantee it. As programmers we are very used to specifying how things should happen. "Iterate over this list", "if this then that", "update this variable with this new value". Why should you have to learn this slightly bizarre looking abstraction when you already know how to tell the machine how to do things?
In many situations imperative code is fine. When we write business logic we usually have to write mostly imperative code, as there will not exist a more generic abstraction over our business domain.
But if we take the time to learn (or build!) declarative abstractions we can take dramatic and powerful shortcuts when we write code. Firstly, we can usually write less of it, which is a quick win. But we also get to think and operate at a higher level, up in the clouds of what we want to happen, and not down in the dirt of how it should happen.

SQL


You may not realise it, but one place where you have already used declarative abstractions effectively is in SQL.
You can think of SQL as a declarative query language for working with sets of data. Would you write an entire application in SQL? Probably not. But for working with sets of related data it is incredibly powerful.
Take a query like:
SELECT * FROM dogs
INNER JOIN owners
ON dogs.owner_id = owners.id
Imagine trying to write the logic for this yourself imperatively:
//dogs = [{name: 'Fido', owner_id: 1}, {...}, ... ]
//owners = [{id: 1, name: 'Bob'}, {...}, ...]

var dogsWithOwners = []
var dog, owner

for(var di=0; di < dogs.length; di++) {
  dog = dogs[di]

  for(var oi=0; oi < owners.length; oi++) {
    owner = owners[oi]
    if (owner && dog.owner_id == owner.id) {
      dogsWithOwners.push({
        dog: dog,
        owner: owner
      })
    }
  }
}
Yuck! Now, I'm not saying that SQL is always easy to understand, or necessarily obvious when you first see it, but it's a lot clearer than that mess.
But it's not just shorter and easier to read, SQL gives us plenty of other benefits. Because we have abstracted over the how we can focus on the what and let the database optimise the how for us.
If we were to use it, our imperative example would be slow because we would have to iterate over the full list of owners for every dog in the list.
But in the SQL example we can let the database deal with how to get the correct results. If it makes sense to use an index (providing we've set one up) the database can do so, resulting in a large performance gain. If it ran the same query a second ago it might serve the result from a cache almost instantly. By letting go of the how, we gain a whole host of benefits, with the computer doing the hard work and little cognitive overhead on our part.
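To get a feel for what the database's index buys us, here is a hypothetical sketch of the same join in JavaScript: build a lookup table of owners by id once, then make a single pass over the dogs.

```javascript
var dogs   = [{name: 'Fido', owner_id: 1}, {name: 'Rex', owner_id: 2}]
var owners = [{id: 1, name: 'Bob'}, {id: 2, name: 'Alice'}]

// Build an "index" on owners.id once: O(owners)
var ownersById = {}
owners.forEach(function(owner) {
  ownersById[owner.id] = owner
})

// Join with a single pass over dogs: O(dogs) instead of O(dogs * owners)
var dogsWithOwners = dogs.map(function(dog) {
  return { dog: dog, owner: ownersById[dog.owner_id] }
})

console.log(dogsWithOwners[0].owner.name) //=> Bob
```

This is roughly the trade a database index makes for us: some up-front work and memory in exchange for not rescanning the owners table for every dog.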

 

d3.js


Another place where declarative approaches are really powerful is in user interfaces, graphics and animations.
Coding user interfaces is hard work. Because we want to build nice, dynamic user interactions, we typically end up with a lot of state management, and a lot of generic how code that could be abstracted away but frequently isn't.
A great example of a declarative abstraction is d3.js. D3 is a library that helps you build interactive and animated visualisations of data using JavaScript and (typically) SVG.
The first time (and fifth time, and possibly even the tenth time) you see or try and write d3 code your head will hurt. Like SQL, d3 is an incredibly powerful abstraction over visualising data that deals with almost all of the how for you, and lets you just say what you want to happen.
Here's an example (I recommend viewing the demo for some context). This is a d3 visualization that draws a circle for each object in the data array. To demonstrate what's going on we add a circle every second.
The interesting bit of code is:
//var data = [{x: 5, y: 10}, {x: 20, y: 5}]

var circles = svg.selectAll('circle')
                    .data(data)

circles.enter().append('circle')
           .attr('cx', function(d) { return d.x })
           .attr('cy', function(d) { return d.y })
           .attr('r', 0)
        .transition().duration(500)
          .attr('r', 5)
It's not essential to understand exactly what's going on here (it will take a while to get your head around regardless), but the gist of it is this:
First we make a selection object of all the svg circles in the visualisation (initially there will be none). Then we bind some data to the selection (our data array).
D3 keeps track of which data point is bound to which circle in the diagram. So initially we have two datapoints, but no circles; we can then use the .enter() method to get the datapoints which have "entered". For those points, we say we would like a circle added to the diagram, centered on the x and y values of the datapoint, with an initial radius of 0 but transitioned over half a second to a radius of 5.
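The bookkeeping d3 does here can be modelled in plain JavaScript. This is a deliberately simplified, hypothetical sketch of the data join, keyed by array index: the "enter" set is just the datapoints that don't yet have a circle.

```javascript
// existingCount: how many circles are already in the document
function enterSet(data, existingCount) {
  return data.slice(existingCount) // datapoints with no circle yet
}

var data = [{x: 5, y: 10}, {x: 20, y: 5}]
console.log(enterSet(data, 0).length) //=> 2 (both points need circles drawn)
console.log(enterSet(data, 2).length) //=> 0 (everything is already drawn)
```

Real d3 selections also track "update" and "exit" sets and can key datapoints by a function rather than by index, but the principle is the same: you declare what the picture should be, and the join works out what has changed.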

 

So why is this interesting?


Look through the code again and ask yourself whether we are describing what we want our visualisation to look like, or how to draw it. You'll see that there is almost no how code at all. We are just describing, at quite a high level, what we want:
I want this data drawn as circles, centered on the point specified in the data, and if there are any new circles you should add them and animate their radius.
This is awesome, we haven't written a single loop, there is no state management here. Coding graphics is often hard, confusing and ugly, but here d3 has abstracted away most of the crap and left us to just specify what we want.
Now, is d3.js easy to understand? Nope, it definitely takes a while to learn. And most of that learning is in giving up your desire to specify how things should happen and instead learning how to specify what you want.
Initially this is hard work, but after a few hours something magical happens - you become really, really productive. By abstracting away the how d3.js really lets you focus on what you want to see, which frankly is the only thing you should care about when designing something like a visualisation. It frees you from the fiddly details of the how and lets you interact with the problem at a much higher level, opening up the possibilities for creativity.

 

Finally


Declarative programming allows us to describe what we want, and let the underlying software/computer/etc deal with how it should happen.
In many areas, as we have seen, this can lead to some real improvements in how we write code, not just in terms of fewer lines of code, or (potentially) performance, but by writing code at a higher level of abstraction we can focus much more on what we want, which ultimately is all we should really care about as problem solvers.
The problem is, as programmers we are very used to describing the how. It makes us feel good and comfortable - powerful, even - to be able to control what is happening, and not leave it to some magic process we can't see or understand.
Sometimes it's okay to hold on to the how. If we need to fine-tune code for high performance we might need to control the how in detail ourselves. And for business logic, where there isn't anything that a generic declarative library could abstract over, we're left writing imperative code.
But frequently we can, and I'd argue should, look for declarative approaches to writing code, and if we can't find them, we should be building them. Will it be hard at first? Yes, almost certainly! But as we've seen with SQL and D3.js the long term benefits can be huge!

Monday, March 16, 2015

Reducing reliance on the mouse

The point of this project is to reduce reliance on the mouse, as I believe it leads to wrist fatigue and other maladies and is just plain slow. Using a Thinkpad T430, I intend to make controlling the laptop as utterly keyboard-centric as possible. The T430 only comes with control, windows, and alt keys, but they are distinct keys, such that control_l is different from control_r, and the same goes for alt. The only concession I'm willing to make in this keyboard-centric approach is the web browser; simply using the emacs web browser 'eww' or links/lynx doesn't cut it for me. Conkeror is very useful at reducing mouse usage in browsers, but still requires some. Plus I still need to use chrome and firefox for development, and I like to use their dev tools, which require more mouse usage. Firefox, though, has the great firemacs add-on, which provides emacs bindings for common movements. Chrome stands out as the biggest mouse user.

For what mouse usage I still have to perform, I have essentially three options: an external mouse, the touchpad, and the trackpoint provided on the T430. An external mouse is by far the worst option, requiring moving my hand completely off of the keyboard. Using the touchpad for mouse movements is better but still requires too much reconfiguring of my hands to be acceptable. The trackpoint requires the least amount of adjusting of my hands. The touchpad can still serve a purpose though, as I still need some sort of scroll wheel, so I'll investigate converting it into one large scroll wheel.

Controllers

The software to control my laptop boils down to the operating system, a window manager, a terminal, and a text editor (aside from the aforementioned browsers). My preferences for these are (Ubuntu/OpenSUSE) Linux, Xmonad, tmux and emacs. What I'm looking for is a way to control these controllers from the keyboard. In particular, I want to be able to easily control iterating through 'windows' in emacs (default is Control-x o), 'windows' in xmonad (default Meta/Alt-tab) and 'panes' in tmux (default Control-b o). Another goal is to stay as close to default configurations as possible. I don't like being completely lost if I'm on a different machine somewhere in production that doesn't have my configuration scripts. On the other hand, being lost on a different laptop is something I'll have to tolerate due to the key reconfiguration.

Conflicts

From the very start, we have conflicts between the modifier keys for emacs, xmonad, and tmux. Who has priority? Emacs makes extensive use of Control and Meta/Alt, and is my primary operating space, so it takes priority. This leaves the Windows key available, but I have two more programs that need modifier keys. Since my next operating space is the window manager, xmonad gets the Windows key. We'll have to figure out something for tmux later.

What space iteration key combo to use?

Meta/Alt-tab is a common key chord amongst window managers. For me, it requires moving my pinky, which inevitably moves the rest of my hand as well, with my index finger ending up over the 'd' key on a qwerty keyboard. I like emacs' Control-x o, as the 'o' key is comfortable and doesn't move my left hand, but I don't like having to hit three keys to switch window spaces. I can easily change this to just Control-o in .emacs, but Control-o defaults to 'open-line', which I occasionally use. Meta/Alt-o, on the other hand, enters emacs' font system, which, while powerful, is not something I use at all. This also lines up with tmux's Control-b o to 'select-pane'. So some-kind-of-modifier-o is what I'll use.

Control, Super/Win, Alt

On my Thinkpad T430, I have a set of modifier keys on the left of the spacebar (fn, ctrl_l, windows_l, alt_l) and a set on the right of the spacebar (alt_r, PrtSc, ctrl_r). The Fn key is rather important in that it controls various other aspects of the laptop such as screen brightness and toggling the wireless modem. This is not a key I will reconfigure. PrtSc is located in a valuable space, so I should remap it as a second Windows key, for symmetry.

The ctrl_[l|r] keys are extensively used in emacs, but the location of both ctrl keys is not acceptable to me, as they require too much pinky distension. So I plan to make the common swap into the valuable capslock location. The Windows key works well as the modifier for xmonad, which leaves tmux without a modifier key. ctrl_l is suitable for controlling tmux, as interacting with tmux is something I do much less frequently. Considering that tmux typically uses Control-b, a chord in itself, as its modifier, what I would like is for ctrl_[l|r] to be the single key needed for tmux. So I'll have to find some way of making ctrl_l a modifier key for tmux.

The Plan

What I would like then is as follows:
  • replace capslock with control
  • replace ctrl_l with a custom tmux modifier
  • make the windows key the modifier key for xmonad
  • make Alt-o the 'window' iteration key for emacs
  • make Windows-o the 'window' iteration key for xmonad
  • make ctrl_l-o the 'pane' iteration key for tmux
  • duplicate the functionality on the right of the space bar (ie, convert PrtSc to a windows key, ctrl_r becomes a second tmux modifier key)
  • make the touchpad one large scrollbar

Key reconfiguring

Here is where things get really, really convoluted and confusing. Configuring keys in Linux is done with Xkb and setxkbmap. Xmodmap predates Xkb and is deprecated, so we'll skip that. Researching documentation on the internet turned up Extending xkb, x.org's The X Keyboard Extension, and the Archlinux Wiki.

I first need to figure out what keys I have currently. To do this, I have to dump the current layout with the following:
xkbcomp $DISPLAY xkb.dump
This dumps the layout to the file xkb.dump, which I can then inspect. Doing so reveals that I want to edit the keys CAPS, LCTL, PRSC and RCTL. According to the x.org documentation, doing this requires a rules file and a symbol file. The rules file references the symbol file in the 'options' field in order to enable the customizations specified in the symbol file. The two examples I will use for these files are the evdev rule set located in /usr/share/X11/xkb/rules/evdev and the ctrl symbol file located at /usr/share/X11/xkb/symbols/ctrl.

The rule file references various entries in the symbol file via the !option field which specifies all the options such as
ctrl:nocaps = +ctrl(nocaps)
which points to the 'nocaps' entry in the 'symbols/ctrl' file.

Once things are ready to be saved, there is unfortunately no way to keep them in a home directory. Custom rules files must exist in /usr/share/X11/xkb/rules/ and symbol files must exist in /usr/share/X11/xkb/symbols/. For ease of use, I created my files in ~/.xkb/ and created symlinks in symbols/ and rules/ pointing to those files. Once those custom files are set up, the rule file can be specified with setxkbmap. Changes in symbol files won't take effect until the cached versions are removed with rm /var/lib/xkb/*.xkm. Then if our symbol file is /usr/share/X11/xkb/symbols/test, with symbols defined as
partial modifier_keys
xkb_symbols "nocaps_nolrctrl" {
    replace key <caps>    { [ Control_L, Control_L ] };
    replace key <lctl>    { [ NoSymbol ] };
    replace key <rctl>    { [ NoSymbol ] };
    modifier_map  Control { <caps> };
};
our option string to use would be 'test:nocaps_nolrctrl'. Test it with setxkbmap -option 'test:nocaps_nolrctrl'. This is also what is needed in the rule file.
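Putting the workflow together, the edit-test loop looks roughly like this (a sketch: the symlink step assumes files kept in ~/.xkb as described above, and the commands need a live X session plus root for the system paths):

```
# Dump the current layout to discover key names (<caps>, <lctl>, <rctl>, <prsc>)
xkbcomp $DISPLAY xkb.dump

# Symlink the custom symbol file kept in ~/.xkb into the system directory
ln -s ~/.xkb/test /usr/share/X11/xkb/symbols/test

# Clear the compiled-keymap cache so symbol-file changes take effect
rm /var/lib/xkb/*.xkm

# Try the option directly, without touching the rules file
setxkbmap -option 'test:nocaps_nolrctrl'
```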

Ultimately, I've ended up with a symbol file ~/.xkb/window_ctrl that looks like

// Eliminate CapsLock, making it another Ctrl.
partial modifier_keys
xkb_symbols "nocaps_nolrctrl" {
    replace key <caps>  { [ Control_L, Control_L ] };
    replace key <lctl>  { [ NoSymbol ] };
    replace key <rctl>  { [ NoSymbol ] };
    modifier_map  Control { <caps> };
};

// Replace PrtSc with a second Super key
partial modifier_keys
xkb_symbols "prtsc_super" {
    replace key <prsc>  { [ Super_R, Super_R ] };
    modifier_map  Mod4 { <prsc>, <lwin> };
};

// Have [L|R]CTL generate an fkey
partial modifier_keys
xkb_symbols "ctl_fkey" {
    replace key <lctl>  { [ F12 ] };
    replace key <rctl>  { [ F12 ] };
};

For my rules file, I simply copied the existing evdev and evdev.lst files to ~/.xkb/evdev.mouseless and ~/.xkb/evdev.mouseless.lst, then added to the end of the '! option = symbols' section of evdev.mouseless:

  window_ctrl:ctl_fkey        = +window_ctrl(ctl_fkey)
  window_ctrl:nocaps_nolrctrl = +window_ctrl(nocaps_nolrctrl)
  window_ctrl:prtsc_super     = +window_ctrl(prtsc_super)

Then I set my rule file to run with setxkbmap -rules evdev.mouseless -option "window_ctrl:nocaps_nolrctrl,window_ctrl:prtsc_super,window_ctrl:ctl_fkey".

Xmonad config

Configuring xmonad is relatively straightforward. Create an xmonad.hs file in ~/.xmonad starting from a template config, and update myModMask to mod4Mask (mod4 being the equivalent of the Super/Windows key).

Emacs config

Configuring emacs is just as straightforward. Add

;; Setup move to previous window to mimic tmux bindings
(global-unset-key "\M-o")
(global-set-key "\M-o" 'other-window)
;; Move to next window
(global-set-key "\M-O" (lambda () (interactive) (other-window -1)))

to .emacs.

Tmux config

Configuring tmux simply requires setting the prefix with:
set-option -g prefix F12
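For completeness, a fuller ~/.tmux.conf sketch for this setup (note that 'o' is already tmux's default binding for select-pane; it is shown here only to make the parallel with emacs and xmonad explicit):

```
# F12 (generated by the remapped ctrl keys) becomes the prefix
unbind-key C-b
set-option -g prefix F12

# prefix + o cycles panes (tmux default)
bind-key o select-pane -t :.+
```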

Touchpad as one large scrollwheel

In Ubuntu, synaptics handles all touchpad manipulations. Using synclient, I messed around with setting RightEdge to various values to indicate where the scroll edge of the touchpad should end. Ultimately I found a value of 1751 covered all of the pad without causing errors. This command needs to be placed in a startup script such as .xinitrc. My own personal setup is xmonad with bits of gnome thrown in for handling things such as the panel bar and various other notifications. I'll take advantage of this and gnome's autostart system to place the synclient command in an autostart file at .config/autostart/synclient.desktop:

[Desktop Entry]
Name=synclient
GenericName=synclient
Comment=Change trackpad to scrollpad
Exec=synclient RightEdge=1751
Terminal=true
Type=Application
StartupNotify=false

It all works... but with only partial success

All good, except for a number of issues. The fragility of my approach is a significant one: my customizations won't benefit from any changes to evdev upstream. I'd like to figure out how to get the keyboard configurations working in xorg.conf.d/, but this will have to be a later investigation.

Another issue is that tmux doesn't respond to hyper and no longer responds to fn keys f13-f20 (it instead sees f13 as super-f1, and so on for f14 etc). So I'm stuck having to use a visible key, one that affects other programs (in particular firefox/chrome), for controlling tmux. A future investigation will be finding a way to send mod4-f12, but that's not straightforward either.

Yet another issue is the intimidatingly complicated nature of getting this all to work. There is supposed to be a simpler way of setting up .conf files in an /etc/X11/xorg.conf.d/ directory, but none of my experiments got this to work successfully.

Finally, tmux doesn't register held-down keys. Holding down my custom lctrl doesn't continue working after entering another key such as lctrl-o; I have to let go of lctrl and hit o again, otherwise it simply outputs 'o'. Emacs and xmonad work as expected, so I suppose this could be a terminal issue of sorts that I will have to sort out at some later time.
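One avenue worth investigating for this: tmux supports repeatable bindings via bind-key -r, which lets the bound key be pressed again within repeat-time milliseconds without re-entering the prefix. A hypothetical fragment (untested for this setup):

```
# -r makes o repeatable within repeat-time ms of the prefix
bind-key -r o select-pane -t :.+
set-option -g repeat-time 600
```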

Thursday, March 12, 2015

Doom 3 source

Source

On November 23, 2011 id Software maintained the tradition and released the source code of their previous engine. This time it was the turn of idTech4, which powered Prey, Quake 4 and of course Doom 3. Within hours the GitHub repository was forked more than 400 times and people started to look at the game's internal mechanisms and port the engine to other platforms. I also jumped on it and promptly completed the Mac OS X Intel version, which John Carmack kindly advertised.


In terms of clarity and comments this is the best code release from id Software since the Doom iPhone codebase (which is more recent and hence better commented). I highly recommend that everybody read, build and experiment with it.


Here are my notes regarding what I understood. As usual I have cleaned them up: I hope it will save someone a few hours and I also hope it will motivate some of us to read more code and become better programmers.

Part 1: Overview
Part 2: Dmap 
Part 3: Renderer
Part 4: Profiling 
Part 5: Scripting
Part 6: Interviews (including Q&A with John Carmack) 

From notes to articles...

I have noticed that I am using more and more drawings and less and less text in order to explain codebases. So far I have used gliffy to draw, but this tool has some frustrating limitations (such as lack of an alpha channel). I am thinking of authoring a tool specialized in drawing for 3D engines using SVG and JavaScript. I wonder if something like this already exists? Anyway, back to the code...

Background

Getting our hands on the source code of such a groundbreaking engine is exciting. Upon release in 2004 Doom III set new visual and audio standards for real-time engines, the most notable being "Unified Lighting and Shadows". For the first time the technology allowed artists to express themselves on a Hollywood scale. Even 8 years later the first encounter with the HellKnight in Delta-Labs-4 still looks insanely great:

First contact

The source code is now distributed via Github which is a good thing since the FTP server from id Software was almost always down or overloaded.
The original release from TTimo compiles well with Visual Studio 2010 Professional. Unfortunately Visual Studio 2010 "Express" lacks MFC and hence cannot be used. This was disappointing upon release but some people have since removed the dependencies.

    Windows 7:
    ==========

    git clone https://github.com/TTimo/doom3.gpl.git

For code reading and exploring I prefer to use XCode 4.0 on Mac OS X: the search speed from Spotlight, the variable highlights and the "Command-Click" to reach a definition make the experience superior to Visual Studio. The XCode project was broken upon release but it was easy to fix with a few steps, and there is now a Github repository by "badsector" which works well on Mac OS X Lion.

    MacOS X:
    ========

    git clone https://github.com/badsector/Doom3-for-MacOSX-

Note: It seems "variable highlights" and "Control-Click" are also available in Visual Studio 2010 after installing the Visual Studio 2010 Productivity Power Tools. I cannot understand why this is not part of the vanilla install.

Both codebases are now in the best state possible: one click away from an executable!
  • Download the code.
  • Hit F8 / Command-B.
  • Run!

Trivia: In order to run the game you will need the base folder containing the Doom 3 assets. Since I did not want to waste time extracting them from the Doom 3 CDs and updating them, I downloaded the Steam version. It seems the id Software team did the same, since the Visual Studio project released still contains "+set fs_basepath C:\Program Files (x86)\Steam\steamapps\common\doom 3" in the debug settings!

Trivia: The engine was developed with Visual Studio .NET (source). But the code does not feature a single line of C#, and the version released requires Visual Studio 2010 Professional in order to compile.

Trivia: The id Software team seems to be fans of the Matrix franchise: Quake III's working title was "Trinity" and Doom III's working title was "Neo". This explains why you will find all of the source code in the neo subfolder.

Architecture

The solution is divided in projects that reflect the overall architecture of the engine:
Project    | Windows build  | Mac OS X build | Observations
-----------|----------------|----------------|-------------
Game       | gamex86.dll    | gamex86.so     | Doom3 gameplay
Game-d3xp  | gamex86.dll    | gamex86.so     | Doom3 eXPansion (Resurrection) gameplay
MayaImport | MayaImport.dll | -              | Part of the asset creation toolchain: loaded at runtime in order to open Maya files and import monsters, camera paths and maps
Doom3      | Doom3.exe      | Doom3.app      | Doom 3 engine
TypeInfo   | TypeInfo.exe   | -              | In-house RTTI helper: generates GameTypeInfo.h, a map of all the Doom3 class types with each member size. This allows memory debugging via the TypeInfo class
CurlLib    | CurlLib.lib    | -              | HTTP client used to download files (statically linked against gamex86.dll and doom3.exe)
idLib      | idLib.lib      | idLib.a        | id Software library: parser, lexer, dictionary... (statically linked against gamex86.dll and doom3.exe)


Like every engine since idTech2 we find one closed-source binary (Doom3.exe) and one open-source dynamic library (gamex86.dll):


Most of the codebase has been accessible since October 2004 via the Doom3 SDK: Only the Doom3 executable source code was missing. Modders were able to build idlib.a and gamex86.dll but the core of the engine was still closed source.

Note: The engine does not use the Standard C++ Library: all containers (map, linked list...) are re-implemented, but libc is extensively used.

Note: In the Game module each class extends idClass. This allows the engine to perform in-house RTTI and also to instantiate classes by classname.

Trivia: If you look at the drawing you will see that a few essential frameworks (such as the filesystem) are in the Doom3.exe project. This is a problem since gamex86.dll needs to load assets as well. Those subsystems are dynamically loaded by gamex86.dll from doom3.exe (this is what the arrow represents in the drawing). If we use a PE explorer on the DLL we can see that gamex86.dll exports one method: GetGameAPI:



Things work exactly the way Quake2 loaded the renderer and the game DLLs: by exchanging object pointers.

When Doom3.exe starts up, it:
  • Loads the DLL into its process memory space via LoadLibrary.
  • Gets the address of GetGameAPI in the DLL using win32's GetProcAddress.
  • Calls GetGameAPI:

    gameExport_t * GetGameAPI( gameImport_t *import );

At the end of the "handshake", Doom3.exe has a pointer to a idGame object and Game.dll has a pointer to a gameImport_t object containing additional references to all missing subsystems such as idFileSystem.

Gamex86's view on Doom 3 executable objects:
        
        
        typedef struct {
            
            int                         version;               // API version
            idSys *                     sys;                   // non-portable system services
            idCommon *                  common;                // common
            idCmdSystem *               cmdSystem;             // console command system
            idCVarSystem *              cvarSystem;            // console variable system
            idFileSystem *              fileSystem;            // file system
            idNetworkSystem *           networkSystem;         // network system
            idRenderSystem *            renderSystem;          // render system
            idSoundSystem *             soundSystem;           // sound system
            idRenderModelManager *      renderModelManager;    // render model manager
            idUserInterfaceManager *    uiManager;             // user interface manager
            idDeclManager *             declManager;           // declaration manager
            idAASFileManager *          AASFileManager;        // AAS file manager
            idCollisionModelManager *   collisionModelManager; // collision model manager
            
        } gameImport_t;
        
        

Doom 3's view on Game/Mod objects:
    typedef struct 
    {

        int            version;     // API version
        idGame *       game;        // interface to run the game
        idGameEdit *   gameEdit;    // interface for in-game editing

    } gameExport_t;

    

Note: A great resource for understanding each subsystem better is the Doom3 SDK documentation page: it seems to have been written by someone with a deep understanding of the code in 2004 (so probably a member of the development team).

The Code

Before digging, some stats from cloc:
        
        
     ./cloc-1.56.pl neo
        
     2180 text files.
     2002 unique files.                                          
     626 files ignored.
            
     http://cloc.sourceforge.net v 1.56  T=19.0 s (77.9 files/s, 47576.6 lines/s)
        
     -------------------------------------------------------------------------------
     Language                     files          blank        comment           code
     -------------------------------------------------------------------------------
     C++                            517          87078         113107         366433
     C/C++ Header                   617          29833          27176         111105
     C                              171          11408          15566          53540
     Bourne Shell                    29           5399           6516          39966
     make                            43           1196            874           9121
     m4                              10           1079            232           9025
     HTML                            55            391             76           4142
     Objective C++                    6            709            656           2606
     Perl                            10            523            411           2380
     yacc                             1             95             97            912
     Python                          10            108            182            895
     Objective C                      1            145             20            768
     DOS Batch                        5              0              0             61
     Teamcenter def                   4              3              0             51
     Lisp                             1              5             20             25
     awk                              1              2              1             17
     -------------------------------------------------------------------------------
     SUM:                          1481         137974         164934         601047
     -------------------------------------------------------------------------------
            
        
        

The number of lines of code is usually not a good metric for anything, but here it can be very helpful in assessing the effort required to comprehend the engine. At 601,047 lines of code, the engine is roughly twice as "difficult" to understand as Quake III. A few stats on the number of lines of code across the history of id Software engines:
     # Lines of code      Doom    idTech1    idTech2    idTech3    idTech4
     Engine             39,079    143,855    135,788    239,398    601,032
     Tools                 341     11,155     28,140    128,417          -
     Total              39,420    155,010    163,928    367,815    601,032

Note : The huge increase in tools for idTech3 comes from the lcc codebase (the C compiler used to generate QVM bytecode).
Note : No tools are accounted for Doom3 since they are integrated to the engine codebase.

From a high level here are a few fun facts:
  • For the first time in id Software history the code is C++ instead of C. John Carmack elaborated on this during our Q&A.
  • Abstraction and polymorphism are used a lot across the code. But a nice trick avoids the vtable performance hit on some objects.
  • All assets are stored in human readable text form. No more binaries. The code is making extensive usage of lexer/parser. John Carmack elaborated on this during our Q&A.
  • Templates are used in low level utility classes (mainly idLib) but are never seen in the upper levels so they won't make your eyes bleed the way Google's V8 source code does.
  • In terms of code commenting it is the second best codebase from id Software; the only one better is Doom iPhone, probably because it is more recent than Doom3. 30% comments is still outstanding: it is rare to find a project this well commented! In some parts of the code (see the dmap page) there are actually more comments than statements.
  • OOP encapsulation makes the code clean and easy to read.
  • The days of low level assembly optimizations are gone. A few tricks such as idMath::InvSqrt and spatial locality optimizations remain, but most of the code simply uses the tools when they are available (GPU shaders, OpenGL VBO, SIMD, Altivec, SMP, L2 optimizations (R_AddModelSurfaces per-model processing)...).
It is also interesting to take a look at the idTech4 Coding Standard (mirror) defined by John Carmack (I particularly appreciated the comments about const placement).

Unrolling the loop

Here is the main loop unrolled with the most important parts of the engine: 
        
        
    idCommonLocal    commonLocal;                   // OS Specialized object 
    idCommon *       common = &commonLocal;         // Interface pointer (since Init is OS dependent it is an abstract method)
        
    int WINAPI WinMain( HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow ) 
    {
            
        
        Sys_SetPhysicalWorkMemory( 192 << 20, 1024 << 20 );   //Min = 201,326,592  Max = 1,073,741,824
        Sys_CreateConsole();
            
        // Since the engine is multi-threaded mutexes are initialized here: One mutex per "critical" (concurrent execution) section of code.
        for (int i = 0; i < MAX_CRITICAL_SECTIONS; i++ ) { 
            InitializeCriticalSection( &win32.criticalSections[i] );
        }
            
        common->Init( 0, NULL, lpCmdLine );              // Assess how much VRAM is available (not done via OpenGL but OS call)
            
        Sys_StartAsyncThread(){                          // The next loop runs in a separate thread.
            while ( 1 ){
                usleep( 16666 );                         // Run at 60Hz
                common->Async();                         // Do the job
                Sys_TriggerEvent( TRIGGER_EVENT_ONE );   // Unlock other thread waiting for inputs
                pthread_testcancel();                    // Check if we have been cancelled by the main thread (on shutdown).
            }
        }
        
        Sys_ShowConsole();
            
        while( 1 ){
            Win_Frame();                                 // Show or hide the console
            common->Frame(){
                session->Frame()                         // Game logic
                {
                    for (int i = 0 ; i < gameTicsToRun ; i++ ) 
                        RunGameTic(){
                            game->RunFrame( &cmd );      // From this point execution jumps in the GameX86.dll address space.
                              for( ent = activeEntities.Next(); ent != NULL; ent = ent->activeNode.Next() ) 
                                ent->GetPhysics()->UpdateTime( time );  // let entities think
                        }
                }
                
                session->UpdateScreen( false ); // normal, in-sequence screen update
                {
                    renderSystem->BeginFrame
                        idGame::Draw            // Renderer front-end. Doesn't actually communicate with the GPU !!
                    renderSystem->EndFrame
                        R_IssueRenderCommands   // Renderer back-end. Issue GPU optimized commands to the GPU.
                }
            }
        }
    }        
        
        

For more details here is the fully unrolled loop that I used as a map while reading the code.
It is a standard main loop for an id Software engine, except for Sys_StartAsyncThread, which indicates that Doom3 is multi-threaded. The goal of this thread is to handle the time-critical functions that the engine doesn't want limited to the frame rate:
  • Sound mixing.
  • User input generation.

Trivia : idTech4 high level objects are all abstract classes with virtual methods. This would normally involve a performance hit since each virtual method address would have to be looked up in a vtable before being called at runtime. But there is a "trick" to avoid that. All objects are instantiated statically as follows:
        
        
    idCommonLocal    commonLocal;                   // Implementation
    idCommon *       common = &commonLocal;         // Pointer for gamex86.dll
        
        
Since an object allocated statically in the data segment has a known type, the compiler can optimize away the vtable lookup when commonLocal methods are called. The interface pointer is used during the handshake so doom3.exe can exchange object references with gamex86.dll, but in that case the vtable cost is not optimized away.

Trivia : Having read most engines from id Software I find it remarkable that some method names have NEVER changed since the Doom 1 engine: the method responsible for pumping mouse and joystick inputs is still called IN_frame().

Renderer

Two important parts:
  • Since Doom3 uses a portal system, the preprocessing tool dmap is a complete departure from the traditional BSP builder. I reviewed it in depth on a dedicated page.



  • The runtime renderer has a very interesting architecture since it is broken into two parts, a frontend and a backend: more on the dedicated page.

Profiling

I used Xcode's Instruments to check where the CPU cycles were going. The results and analysis are here.

Scripting and Virtual Machine

In every idTech product the VM and the scripting language have changed completely from the previous version... and they did it again: details are here.

Interviews

While reading the code, several novelties puzzled me, so I wrote to John Carmack and he was nice enough to reply with in-depth explanations about:
  • C++.
  • Renderer broken in two pieces.
  • Text-based assets.
  • Interpreted bytecode.
I also compiled all videos and press interviews about idTech4. It is all in the interviews page.