Written by: Tuesday, March 15, 2011
The new version of the DigitalRune Helper Library (DigitalRune.dll, which is part of DigitalRune Physics Bundle) has not 1 but 2 profiler classes. The “simple profiler” class was introduced in the previous blog post. In this blog post, we talk about the second profiler, which is a “hierarchical profiler”. It was inspired by the article Real-Time Hierarchical Profiling (Game Programming Gems 3).
Do we really need another profiler class? – Yes, really! This class has slightly different goals. The main goal is to measure the timing of nested method calls. This is extremely helpful in many applications – in particular if you need to optimize your game loop. An important difference between the simple profiler and the hierarchical profiler is that the hierarchical profiler does not work across multiple threads. That’s why we still need the simple profiler. The simple profiler can be used to collect values and to profile code in multi-threaded scenarios.
The HierarchicalProfiler class can be found in the namespace DigitalRune.Diagnostics. Here is a class diagram of the relevant classes.
Unlike the Profiler class, the HierarchicalProfiler is not a static class. You need to create a new instance of the HierarchicalProfiler, and this instance must not be shared across threads.
Time is measured with HierarchicalProfiler.Start() and Stop(). This profiler allows nested time measurement – even for recursive calls! If Start(“Foo”) is called twice, without calling Stop() the profiler treats this as a recursive method call. If Start(“Foo0”) and then Start(“Foo1”) is called, the profiler assumes that Foo1 is a method called by Foo0. Thus, it builds a tree of profiler data. The root of the tree is HierarchicalProfiler.Root. Each node in the tree is a HierarchicalProfilerNode and has other child nodes.
The method HierarchicalProfiler.NewFrame() must be called when a new frame of the game starts (= at the beginning of the game’s main loop, or at beginning of Game.Update() in XNA).
I guess it gets clearer with an example…
The following example “simulates” a game. Methods called Update() and Draw() are called in a loop. The Update method calls other methods to compute Physics and AI.
// The compilation symbol "DIGITALRUNE_PROFILE" must // be defined to activate profiling.#define DIGITALRUNE_PROFILE using System;using System.Threading;using DigitalRune.Diagnostics; namespace ProfilingTest { class Program { // The profiler instance. private static HierarchicalProfiler _profiler = new HierarchicalProfiler("MyProfiler"); static void Main(string[] args) { // Start profiling. _profiler.Reset(); // This represents the main-loop of a game. for (int i = 0; i < 10; i++) { // NewFrame() must be called when a new frame of the game begins. _profiler.NewFrame(); Update(); Draw(); } // Write the profiler data to the console. We start at the root node and include // up to 5 child levels. Console.WriteLine(_profiler.Dump(_profiler.Root, 5)); } private static void Update() { _profiler.Start("Update"); Physics(); AI(); AI(); AI(); Thread.Sleep(1); _profiler.Stop(); } private static void Physics() { _profiler.Start("Physics"); Thread.Sleep(6); _profiler.Stop(); } private static void AI() { _profiler.Start("AI"); Thread.Sleep(3); _profiler.Stop(); } private static void Draw() { _profiler.Start("Draw"); Thread.Sleep(4); _profiler.Stop(); } }}
Here is the console output:
MyProfiler.Root 20 frames 20.286ms/frame 405.721ms total Update 79.8% 16.198ms/frame 1 calls/frame (15.99ms/16.198ms/18ms) Physics 29.6% 6.003ms/frame 1 calls/frame (5.367ms/6.003ms/6.356ms) AI 45.1% 9.145ms/frame 3 calls/frame (2.389ms/3.048ms/5.107ms) Other 5.2% 20.977ms 1.049ms/frame Draw 19.6% 3.979ms/frame 1 calls/frame (3.514ms/3.979ms/4.683ms) Other 1.2% 4.919ms 245.97µs/frame
The dump uses a compacted format. The percentage values show the time of a node relative to the root of the dump. The values in round brackets are Minimum/Average/Maximum times. 'Other' represents the time of a node that was not measured by a child node. (If you don’t like the output format, you can traverse the tree manually, starting at the Root node, and collect the data that is relevant in your particular case.)
The output tells us a few things:
0 comment(s) so far...
A collection of the most useful blog articles can be found here:
Article Collection (on Documentation page)
DigitalRune is a trademark of Garstenauer Information Technology OG.
Garstenauer Information Technology OG Weingartenstrasse 35, 4452 Ternberg Austria (EUROPE) office@digitalrune.com