Debugger data model, Javascript & x64 exception handling
Introduction
The main goal of today's post is to show a bit more of what is now possible with the latest Windbg (currently branded "WinDbg Preview" in the Microsoft store) and the time travel debugging tools that Microsoft released a few months ago. When these finally got released, a bit after cppcon2017 this year, I expected a massive pick-up from the security / reverse-engineering industry with a bunch of posts, tools, scripts, etc. To my surprise, this has not happened yet so I have waited patiently for my vacation to write a little something about it myself. So, here goes!
Obviously, one of the most noticeable change in this debugger is the new UI.. but this is not something we will talk about. The second big improvement is .. a decent scripting engine! Until recently, I always had to use pyKD to write automation scripts. This has worked fairly well for years, but I’m glad to move away from it and embrace the new extension model provided by Windbg & Javascript (yes, you read this right). One of the biggest pain point I’ve to deal with with pyKD (aside from the installation process!) is that you had to evaluate many commands and then parse their outputs to extract the bits and pieces you needed. Thankfully, the new debugger data model solves this (or part of this anyway). The third new change is the integration of the time travel debugging (TTD) features discussed in this presentation: Time Travel Debugging: Root Causing Bugs in Commercial Scale Software .
The goal of this post is to leverage all the nifty stuff we will learn to enumerate x64 try/except handlers in Javascript.
So grab yourself a cup of fine coffee and read on :).
- Introduction
- The debugger data model
- Scripting the model in Javascript
- Javascript integers and Int64
- Accessing CPU registers
- Reading memory
- Executing / evaluating commands
- Setting breakpoints
- TTD
- Extend the data model
- Misc
- Try and play with host.* API from the command window
- How to load an extension script
- How to run an imperative script
- Is the Javascript engine only available in Windbg Preview?
- How to debug a script?
- How to do Nat-Vis style visualizer in Javascript?
- Get a value out of a typed object
- Evaluate expressions
- How to access global from modules
- x64 exception handling vs Javascript
- EOF
The debugger data model
Overview
What is being called the debugger data model is a hierarchy of objects (methods, properties, values) that are accessible both directly from the debugger's command window and through a Javascript API. The debugger exposes a bunch of information that it is responsible: thread related information, register values, stack trace information, etc. As an extension writer, you can go and expose your feature through the node of your choosing in the hierarchy. Once it is plugged in into the model, it is available for consumption by another script, or through the debugger's command window.
![model.png](https://doar-e.github.io/images/debugger_data_model__javascript___x64_exception_handling/model.png)
First query
Say you would like to find what modules the current @rip
is pointing into, you can easily express this through a query using LINQ operators and the data model:
0:001> dx @$curprocess.Modules.Where(p => @rip >= p.BaseAddress && @rip < (p.BaseAddress+p.Size))
@$curprocess.Modules.Where(p => @rip >= p.BaseAddress && @rip < (p.BaseAddress+p.Size))
[0x8] : C:\WINDOWS\SYSTEM32\ntdll.dll
..and you can even check all the information related to this module by clicking on the DML [0x8]
link:
0:001> dx -r1 @$curprocess.Modules.Where(p => @rip >= p.BaseAddress && @rip < (p.BaseAddress+p.Size))[8]
@$curprocess.Modules.Where(p => @rip >= p.BaseAddress && @rip < (p.BaseAddress+p.Size))[8] : C:\WINDOWS\SYSTEM32\ntdll.dll
BaseAddress : 0x7ffc985a0000
Name : C:\WINDOWS\SYSTEM32\ntdll.dll
Size : 0x1db000
In the previous two samples, there are several interesting points to highlight:
1) dx
is the operator to access the data model which is not available through the ??
/ ?
operators
2) @$name
is how you access a variable that you have defined during a debugging session. The debugger itself defines several variables right off the bat just to make querying the model easier: @$curprocess
is equivalent to host.currentProcess
in Javascript, @cursession
is host.currentSession
, and @$curthread
is host.currentThread
. You can also define custom variables yourself, for example:
0:001> dx @$doare = "Diary of a reverse-engineer"
@$doare = "Diary of a reverse-engineer" : Diary of a reverse-engineer
Length : 0x1b
0:001> dx "Hello, " + @$doare
"Hello, " + @$doare : Hello, Diary of a reverse-engineer
Length : 0x22
0:001> ?? @$doare
Bad register error at '@$doare'
0:001> ? @$doare
Bad register error at '@$doare'
3) To query all the nodes in the @$curprocess
hierarchy (if you want to wander through the data model you can just use dx Debugger
and click through the DML links):
0:001> dx @$curprocess
@$curprocess : cmd.exe [Switch To]
Name : cmd.exe
Id : 0x874
Threads
Modules
Environment
You can also check Debugger.State.DebuggerVariables
where you can see the definitions for the variables we just mentioned:
0:001> dx Debugger.State.DebuggerVariables
Debugger.State.DebuggerVariables
cursession : Live user mode: <Local>
curprocess : cmd.exe [Switch To]
curthread : ntdll!DbgUiRemoteBreakin (00007ffc`98675320) [Switch To]
scripts
scriptContents : [object Object]
vars
curstack
curframe : ntdll!DbgBreakPoint [Switch To]
0:001> dx Debugger.State.DebuggerVariables.vars
Debugger.State.DebuggerVariables.vars
doare : Diary of a reverse-engineer
4) Last but not least, most of (all?) the iterable objects can be queried through LINQ-style operators. If you’ve never used these it can be a bit weird at the beginning but at some point it will click and then it is just goodness.
Here is the list of the currently available operators on iterable objects in the data model:
Aggregate [Aggregate(AggregateMethod) | Aggregate(InitialSeed, AggregateMethod) | Aggregate(InitialSeed, AggregateMethod, ResultSelectorMethod) - LINQ equivalent method which iterates through the items in the given collection, running the aggregate method on each one and storing the returned result as the current aggregate value. Once the collection has been exhausted, the final accumulated value is returned. An optional result selector method can be specified which transforms the final accumulator value before returning it.]
All [All(PredicateMethod) - LINQ equivalent method which returns whether all elements in the collection match a given predicate]
AllNonError [AllNonError(PredicateMethod) - LINQ equivalent method which returns whether all elements in the collection match a given predicate. Errors are ignored if all non-error results match the predicate.]
Any [Any(PredicateMethod) - LINQ equivalent method which returns whether any element in the collection matches a given predicate]
Average [Average([ProjectionMethod]) - LINQ equivalent method which finds the average of all values in the enumeration. An optional projection method can be specified that transforms each value before the average is computed.]
Concat [Concat(InnerCollection) - LINQ equivalent method which returns all elements from both collections, including duplicates.]
Contains [Contains(Object, [ComparatorMethod]) - LINQ equivalent method which searches for the given element in the sequence using default comparator rules. An optional comparator method can be provided that will be called each time the element is compared against an entry in the sequence.]
Count [Count() - LINQ equivalent method which returns the number of objects in the collection]
Distinct [Distinct([ComparatorMethod]) - LINQ equivalent method which returns all distinct objects from the given collection, using default comparison rules. An optional comparator method can be provided to be called each time objects in the collection must be compared.]
Except [Except(InnerCollection, [ComparatorMethod]) - LINQ equivalent method which returns all distinct objects in the given collection that are NOT found in the inner collection. An optional comparator method can also be specified.]
First [First([PredicateMethod]) - LINQ equivalent method which returns the first element in the collection or the first which matches an optional predicate]
FirstNonError [FirstNonError([PredicateMethod]) - LINQ equivalent method which returns the first element in the collection or the first which matches an optional predicate. Any errors encountered are ignored if a valid element is found.]
Flatten [Flatten([KeyProjectorMethod]) - Method which flattens a tree of collections (or a tree of keys that project to collections via an optional projector method) into a single collection]
GroupBy [GroupBy(KeyProjectorMethod, [KeyComparatorMethod]) - LINQ equivalent method which groups the collection by unique keys defined via a key projector and optional key comparator]
Intersect [Intersect(InnerCollection, [ComparatorMethod]) - LINQ equivalent method which returns all distinct objects in the given collection that are also found in the inner collection. An optional comparator method can also be specified.]
Join [Join(InnerCollection, Outer key selector method, Inner key selector method, Result selector method, [ComparatorMethod]) - LINQ equivalent method which projects a key for each element of the outer collection and each element of the inner collection using the methods provided. If the projected keys from both these elements match, then the result selector method is called with both those values and its output is returned to the user. An optional comparator method can also be specified.]
Last [Last([PredicateMethod]) - LINQ equivalent method which returns the last element in the collection or the last which matches an optional predicate]
LastNonError [LastNonError([PredicateMethod]) - LINQ equivalent method which returns the last element in the collection or the last which matches an optional predicate. Any errors are ignored.]
Max [Max([ProjectionMethod]) - LINQ equivalent method which returns the maximum element using standard comparison rules. An optional projection method can be specified to project the elements of a sequence before comparing them with each other.]
Min [Min([ProjectionMethod]) - LINQ equivalent method which returns the minimum element using standard comparison rules. An optional projection method can be specified to project the elements of a sequence before comparing them with each other.]
OrderBy [OrderBy(KeyProjectorMethod, [KeyComparatorMethod]) - LINQ equivalent method which orders the collection via a key projector and optional key comparator in ascending order]
OrderByDescending [OrderByDescending(KeyProjectorMethod, [KeyComparatorMethod]) - LINQ equivalent method which orders the collection via a key projector and optional key comparator in descending order]
Reverse [Reverse() - LINQ equivalent method which returns the reverse of the supplied enumeration.]
Select [Select(ProjectionMethod) - LINQ equivalent method which projects the collection to a new collection via calling a projection method on every element]
SequenceEqual [SequenceEqual(InnerCollection, [ComparatorMethod]) - LINQ equivalent method which goes through the outer and inner collections and makes sure that they are equal (incl. sequence length). An optional comparator can be specified.]
Single [Single([PredicateMethod]) - LINQ equivalent method which returns the only element in a list, or, if a predicate was specified, the only element that satisfies the predicate. If there are multiple elements that match the criteria, an error is returned.]
Skip [Skip(Count) - LINQ equivalent method which skips the specified number of elements in the collection and returns all the rest.]
SkipWhile [SkipWhile(PredicateMethod) - LINQ equivalent method which runs the predicate for each element and skips it as long as it keeps returning true. Once the predicate fails, the rest of the collection is returned.]
Sum [Sum([ProjectionMethod]) - LINQ equivalent method which sums all the elements in the collection. Can optionally specify a projector method to transform the elements before summation occurs.]
Take [Take(Count) - LINQ equivalent method which takes the specified number of elements from the collection.]
TakeWhile [TakeWhile(PredicateMethod) - LINQ equivalent method which runs the predicate for each element and returns it only if the result is successful. Once the predicate fails, no more elements will be taken.]
Union [Union(InnerCollection, [ComparatorMethod]) - LINQ equivalent method which returns all distinct objects from the given and inner collection. An optional comparator method can also be specified.]
Where [Where(FilterMethod) - LINQ equivalent method which filters elements in the collection according to when a filter method returns true for a given element]
Now you may be wondering if the model is available with every possible configuration of Windbg? By configuration I mean that you can use the debugger live in user-mode attached to a process, offline looking at a crash-dump of a process, live in kernel-mode, offline looking at a system crash-dump, or off-line looking at a TTD trace.
And yes, the model is accessible with all the previous configurations, and this is awesome. This allows you to, overall, write very generic scripts as long as the information you are mining / exposing is not tied to a specific configuration.
Scripting the model in Javascript
As we described a bit earlier, you can now access programmatically everything that is exposed through the model via Javascript. No more eval
or string parsing to extract the information you want, just go find the node exposing what you are after. If this node doesn’t exist, add your own to expose the information you want :)
Javascript integers and Int64
The first thing you need to be aware with Javascript is the fact that integers are encoded in C doubles.. which means your integers are stored in 53 bits. This is definitely a problem as most of the data we deal with are 64 bit integers. In order to address this problem, Windbg exposes a native type to Javascript that is able to store 64 bit integers. The type is called Int64
and most (all?) information available in the data model is through Int64
instances. This type exposes various methods to perform arithmetic and binary operations (if you use the native operators, the Int64
gets converted back to an integer and throws if data is lost during this conversion; cf Auto-conversion). It takes a bit of time to get used to this, but feels natural pretty quickly. Note that the Frida framework exposes a very similar type to address the same issue, which means it will be even easier for you if you have played with Frida in the past!
You can construct an Int64
directly using a native Javascript integers (so at most 53 bits long as described above), or you can use the host.parseInt64
method that takes a string as input. The other very important method you are going to need is Int64.compareTo
which returns 1
if the instance is bigger than the argument, 0
if equal and -1
if smaller. The below script shows a summary of the points we touched on:
// Int64.js
"use strict";
let logln = function (e) {
host.diagnostics.debugLog(e + '\n');
}
function invokeScript() {
let a = host.Int64(1337);
let aplusone = a + 1;
// 53a
logln(aplusone.toString(16));
let b = host.parseInt64('0xdeadbeefbaadc0de', 16);
let bplusone = b.add(1);
// 0xdeadbeefbaadc0df
logln(bplusone.toString(16));
let bplusonenothrow = b.convertToNumber() + 1;
// 16045690984229355000
logln(bplusonenothrow);
try {
let bplusonethrow = b + 1;
} catch(e) {
// Error: 64 bit value loses precision on conversion to number
logln(e);
}
// 1
logln(a.compareTo(1));
// 0
logln(a.compareTo(1337));
// -1
logln(a.compareTo(1338));
}
For more information I would recommend looking at this page JavaScript Debugger Scripting.
Accessing CPU registers
Registers are accessible in the host.currentThread.Registers
object. You can access the classical GPRs in the User
node, but you can also access the xmm/ymm registers via SIMD
and Floating Point
nodes. As you may have guessed, the registers are all instances of the Int64
object we just talked about.
Reading memory
You can read raw memory via the host.memory.readMemoryValues
function. It allows you to read memory as an array of items whose size you can specify. You can also use host.memory.readString
and host.memory.readWideString
for reading (narrow/wide) strings directly from memory.
//readmemory.js
"use strict";
let logln = function (e) {
host.diagnostics.debugLog(e + '\n');
}
function read_u64(addr) {
return host.memory.readMemoryValues(addr, 1, 8)[0];
}
function invokeScript() {
let Regs = host.currentThread.Registers.User;
let a = read_u64(Regs.rsp);
logln(a.toString(16));
let WideStr = host.currentProcess.Environment.EnvironmentBlock.ProcessParameters.ImagePathName.Buffer;
logln(host.memory.readWideString(WideStr));
let WideStrAddress = WideStr.address;
logln(host.memory.readWideString(WideStrAddress));
}
Executing / evaluating commands
Even though a bunch of data is accessible programmatically via the data model, not everything is exposed today in the model. For example, you cannot access the same amount of information that kp
shows you with the Frame
model object. Specifically, the addresses of the frames or the saved return addresses are not currently available in the object unfortunately :-( As a result, being able to evaluate commands can still be important.
The API call ExecuteCommand
evaluates a command and returns the output of the command as a string:
//eval.js
"use strict";
let logln = function (e) {
host.diagnostics.debugLog(e + '\n');
}
function invokeScript() {
let Control = host.namespace.Debugger.Utility.Control;
for(let Line of Control.ExecuteCommand('kp')) {
logln('Line: ' + Line);
}
}
There is at least one pitfall with this function to be aware of: the API executes until it completes. So, if you use ExecuteCommand
to execute let's say gc
the call will return only when you encounter any sort of break. If you don't encounter any break, the call will never end.
Setting breakpoints
Settings breakpoints is basically handled by three different APIs: SetBreakpointAtSourceLocation
, SetBreakpointAtOffset
, and SetBreakpointForReadWrite
. The names are pretty self-explanatory so I will not spend much time describing them. Unfortunately, as far as I can tell there is no easy way to bind a breakpoint to a Javascript function that could handle it when it is hit. The objects returned by these APIs have a Command
field you can use to trigger a command when the breakpoint fires, as opposed to a function invocation. In essence, it is pretty much the same than when you do bp foo "command"
.
Hopefully these APIs will become more powerful and more suited for scripting in future versions with the possibility of invoking a Javascript function when triggered, that would pass an object to the function describing why and where the breakpoint triggered, etc.
Here is a simple example:
//breakpoint.js
"use strict";
let logln = function (e) {
host.diagnostics.debugLog(e + '\n');
}
function handle_bp() {
let Regs = host.currentThread.Registers.User;
let Args = [ Regs.rcx, Regs.rdx, Regs.r8 ];
let ArgsS = Args.map(c => c.toString(16));
let HeapHandle = ArgsS[0];
let Flags = ArgsS[1];
let Size = ArgsS[2];
logln('RtlAllocateHeap: HeapHandle: ' + HeapHandle + ', Flags: ' + Flags + ', Size: ' + Size);
}
function invokeScript() {
let Control = host.namespace.Debugger.Utility.Control;
let Regs = host.currentThread.Registers.User;
let CurrentProcess = host.currentProcess;
let BreakpointAlreadySet = CurrentProcess.Debug.Breakpoints.Any(
c => c.OffsetExpression == 'ntdll!RtlAllocateHeap+0x0'
);
if(BreakpointAlreadySet == false) {
let Bp = Control.SetBreakpointAtOffset('RtlAllocateHeap', 0, 'ntdll');
Bp.Command = '.echo doare; dx @$scriptContents.handle_bp(); gc';
} else {
logln('Breakpoint already set.');
}
logln('Press "g" to run the target.');
// let Lines = Control.ExecuteCommand('gc');
// for(let Line of Lines) {
// logln('Line: ' + Line);
// }
}
This gives:
0:000>
Press "g" to run the target.
0:000> g-
doare
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x140000, Size: 0x82
@$scriptContents.handle_bp()
doare
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x140000, Size: 0x9a
@$scriptContents.handle_bp()
doare
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x140000, Size: 0x40
@$scriptContents.handle_bp()
doare
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x140000, Size: 0x38
@$scriptContents.handle_bp()
doare
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x0, Size: 0x48
@$scriptContents.handle_bp()
...
Now, I find this interface not well suited for scenarios where you need to have a breakpoint that just dumps stuff and keep going, but hopefully in the future this will improve. Let's say you have a function and you’re interested in dumping its arguments/state every time it gets called. If you attempt to do this with the above code, every time the breakpoint is hit the debugger will execute your callback and stop. At this point you have to tell it to keep executing. (Also, feel free to uncomment the last lines of the script to see what happens if you ExecuteCommand('gc')
:-)).
One way I found around this limitation is to use evaluation and the bp
command:
//breakpoint2.js
"use strict";
let logln = function (e) {
host.diagnostics.debugLog(e + '\n');
}
function handle_bp() {
let Regs = host.currentThread.Registers.User;
let Args = [Regs.rcx, Regs.rdx, Regs.r8];
let ArgsS = Args.map(c => c.toString(16));
let HeapHandle = ArgsS[0];
let Flags = ArgsS[1];
let Size = ArgsS[2];
logln('RtlAllocateHeap: HeapHandle: ' + HeapHandle + ', Flags: ' + Flags + ', Size: ' + Size);
if(Args[2].compareTo(0x100) > 0) {
// stop execution if the allocation size is bigger than 0x100
return true;
}
// keep the execution going if it's a small size
return false;
}
function invokeScript() {
let Control = host.namespace.Debugger.Utility.Control;
let Regs = host.currentThread.Registers.User;
let CurrentProcess = host.currentProcess;
let HeapAlloc = host.getModuleSymbolAddress('ntdll', 'RtlAllocateHeap');
let BreakpointAlreadySet = CurrentProcess.Debug.Breakpoints.Any(
c => c.Address == HeapAlloc
);
if(BreakpointAlreadySet == false) {
logln('RltAllocateHeap @ ' + HeapAlloc.toString(16));
Control.ExecuteCommand('bp /w "@$scriptContents.handle_bp()" ' + HeapAlloc.toString(16));
} else {
logln('Breakpoint already set.');
}
logln('Press "g" to run the target.');
}
Which gives this output:
0:000>
RltAllocateHeap @ 0x7fffc07587a0
Press "g" to run the target.
0:000> g
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x0, Size: 0x48
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x140000, Size: 0x38
...
RtlAllocateHeap: HeapHandle: 0x21b5dcd0000, Flags: 0x140000, Size: 0x34a
Breakpoint 0 hit
Time Travel Position: 2A51:314
ntdll!RtlAllocateHeap:
00007fff`c07587a0 48895c2408 mov qword ptr [rsp+8],rbx ss:000000b8`7f39e9a0=000000b87f39e9b0
Of course, yet another way of approaching this problem would be to wrap the script invocation into the command of a breakpoint like this:
bp ntdll!RtlAllocateHeap ".scriptrun c:\foo\script.js"
TTD
For those who are not familiar with Microsoft’s "Time Travel Debugging" toolset, in a nutshell it allows you to record the execution of a process. Once the recording is done, you end up with a trace file written to disk that you can load into the debugger to replay what you just recorded -- a bit like a camera / VCR. If you want to learn more about it, I would highly recommend checking out this presentation: Time Travel Debugging: root causing bugs in commercial scale software.
Even though I won’t cover how recording and replaying a TTD trace in this article, I just wanted to show you in this part how powerful such features can be once coupled with the data model. As you have probably realized by now, the data model is all about extensibility: you can access specific TTD features via the model when you have a trace loaded in the debugger. This section tries to describe them.
TTD.Calls
The first feature I wanted to talked about is TTD.Calls
. This API goes through an entire execution trace and finds every unique point in the trace where an API has been called.
0:000> dx -v @$cursession.TTD
@$cursession.TTD : [object Object]
Calls [Returns call information from the trace for the specified set of symbols: TTD.Calls("module!symbol1", "module!symbol2", ...)]
For each of those points, you have an object describing the call: time travel position (that you can travel to: see TimeStart
and TimeEnd
below), parameters (leveraging symbols if you have any to know how many parameters the API expects), return value, the thread id, etc.
Here is what it looks like:
0:000> dx -r1 @$cursession.TTD.Calls("ntdll!RtlAllocateHeap").Count()
@$cursession.TTD.Calls("ntdll!RtlAllocateHeap").Count() : 0x267
0:000> dx @$cursession.TTD.Calls("ntdll!RtlAllocateHeap").First()
@$cursession.TTD.Calls("ntdll!RtlAllocateHeap").First()
EventType : Call
ThreadId : 0x1004
UniqueThreadId : 0x6
TimeStart : 12C1:265 [Time Travel]
TimeEnd : 12DE:DC [Time Travel]
Function : ntdll!RtlAllocateHeap
FunctionAddress : 0x7fffc07587a0
ReturnAddress : 0x7fffbdcd9cc1
ReturnValue : 0x21b5df71980
Parameters
0:000> dx -r1 @$cursession.TTD.Calls("ntdll!RtlAllocateHeap").First().Parameters
@$cursession.TTD.Calls("ntdll!RtlAllocateHeap").First().Parameters
[0x0] : 0x21b5df70000
[0x1] : 0x8
[0x2] : 0x2d8
[0x3] : 0x57
Obviously, the collection returned by TTD.Calls
can be queried via the same LINQ-like operators we mentioned earlier which is awesome. As an example, asking the following question has never been easier: "How many times did the allocator fail to allocate memory?":
0:000> dx @$Calls=@$cursession.TTD.Calls("ntdll!RtlAllocateHeap").Where(c => c.ReturnValue == 0)
@$Calls=@$cursession.TTD.Calls("ntdll!RtlAllocateHeap").Where(c => c.ReturnValue == 0)
0:000> dx @$Calls.Count()
@$Calls.Count() : 0x0
Note that because the API has been designed in a way that abstracts away ABI-specific details, you can have your query / code working on both x86 & x64 seamlessly. Another important point is that this is much faster than setting a breakpoint manually and running the trace forward to collect this information yourself.
TTD.Memory
The other very powerful feature that was announced fairly recently in version 1.1712.15003 is TTD.Memory
. A bit like TTD.Calls
, this feature lets you go and find every memory accesses that happened in an execution trace on a specific memory range. And again, it returns to the user a nice object that has all the information you could be potentially interested in (time travel positions, access type, the instruction pointer address, the address of the memory accessed, etc.):
0:000> dx @$Accesses[0]
@$Accesses[0]
EventType : MemoryAccess
ThreadId : 0x15e8
UniqueThreadId : 0x3
TimeStart : F44:2 [Time Travel]
TimeEnd : F44:2 [Time Travel]
AccessType : Write
IP : 0x7fffc07649bf
Address : 0xb87f67fa70
Size : 0x4
Value : 0x0
Here is how you would go and ask it to find out every piece of code that write-accessed (read and execute are also other valid type of access you can query for and combine) the TEB region of the current thread:
0:001> ? @$teb
Evaluate expression: 792409825280 = 000000b8`7f4e6000
0:001> ?? sizeof(_TEB)
unsigned int64 0x1838
0:001> dx @$Accesses=@$cursession.TTD.Memory(0x000000b8`7f4e6000, 0x000000b8`7f4e6000+0x1838, "w")
@$Accesses=@$cursession.TTD.Memory(0x000000b8`7f4e6000, 0x000000b8`7f4e6000+0x1838, "w")
0:001> dx @$Accesses[0]
@$Accesses[0]
EventType : MemoryAccess
ThreadId : 0x15e8
UniqueThreadId : 0x3
TimeStart : F79:1B [Time Travel]
TimeEnd : F79:1B [Time Travel]
AccessType : Write
IP : 0x7fffc0761bd0
Address : 0xb87f4e7710
Size : 0x10
Value : 0x0
The other beauty of it is that you can travel to the position ID and find out what happened:
0:001> !tt F79:1B
Setting position: F79:1B
(1cfc.15e8): Break instruction exception - code 80000003 (first/second chance not available)
Time Travel Position: F79:1B
ntdll!TppWorkCallbackPrologRelease+0x100:
00007fff`c0761bd0 f30f7f8010170000 movdqu xmmword ptr [rax+1710h],xmm0 ds:000000b8`7f4e7710=00000000000000000000000000000000
0:001> dt _TEB ActivityId
ntdll!_TEB
+0x1710 ActivityId : _GUID
In the above example, you can see that the TppWorkCallbackPrologRelease
function is zeroing the ActivityId
GUID of the current TEB - magical.
TTD.Utility.GetHeapAddress
The two previous features were mostly building blocks; this utility consumes the TTD.Calls
API in order to show the lifetime of a heap chunk in a trace session. What does that mean exactly? Well, the utility looks for every heap related operation that happened on a chunk (start address, size) and show them to you.
This is extremely useful when debugging or root-causing issues, and here is what it looks like on a dummy trace:
0:000> dx -g @$cursession.TTD.Utility.GetHeapAddress(0x21b5dce40a0)
========================================================================================================================================
= = Action = Heap = Address = Size = Flags = (+) TimeStart = (+) TimeEnd = Result =
========================================================================================================================================
= [0x59] : [object Object] - Alloc - 0x21b5dcd0000 - 0x21b5dce4030 - 0xaa - 0x8 - ED:7D7 - EF:7D - =
= [0x6b] : [object Object] - Alloc - 0x21b5dcd0000 - 0x21b5dce40a0 - 0xaa - 0x8 - 105:D9 - 107:7D - =
= [0x6c] : [object Object] - Free - 0x21b5dcd0000 - 0x21b5dce40a0 - - 0x0 - 107:8D - 109:1D - 0x1 =
= [0x276] : [object Object] - Alloc - 0x21b5dcd0000 - 0x21b5dce4030 - 0x98 - 0x0 - E59:3A7 - E5A:8E - =
========================================================================================================================================
The attentive reader has probably noticed something maybe unexpected with entries 0x59 and entries 0x276 where we are seeing two different allocations of the same chunk without any free in between. The answer to this question lies in the way the GetHeapAddress
function is implemented (check out the TTD\Analyzers\HeapAnalysis.js file) - it basically looks for every heap related operation and only shows you the ones where address + size
is a range containing the argument you passed. In this example we gave the function the address 0x21b5dce40a0
, 0x59 is an allocation and 0x21b5dce40a0
is in the range 0x21b5dce4030 + 0xAA
so we display it. Now, a free does not know the size of the chunk, the only thing it knows is the base pointer. In this case if we have a free of 0x21b5dce4030
the utility function would just not display it to us which explains how we can have two heap chunks allocated without a free in the following time frame: ED:7D7, E59:3A7
.
We can even go ahead and prove this by finding the free by running the below command:
0:000> dx -g @$cursession.TTD.Utility.GetHeapAddress(0x21b5dce4030).Where(p => p.Address == 0x21b5dce4030)
========================================================================================================================================
= = Action = Heap = Address = Size = Flags = (+) TimeStart = (+) TimeEnd = Result =
========================================================================================================================================
= [0x61] : [object Object] - Alloc - 0x21b5dcd0000 - 0x21b5dce4030 - 0xaa - 0x8 - ED:7D7 - EF:7D - =
= [0x64] : [object Object] - Free - 0x21b5dcd0000 - 0x21b5dce4030 - - 0x0 - EF:247 - F1:1D - 0x1 =
= [0x276] : [object Object] - Alloc - 0x21b5dcd0000 - 0x21b5dce4030 - 0x98 - 0x0 - E59:3A7 - E5A:8E - =
========================================================================================================================================
As expected, the entry 0x64 is our free operation and it also happens in between the two allocation operations we were seeing earlier - solved.
Pretty neat uh?
It is nice enough to ask the utility for a specific heap address, but it would also be super nice if we had access to the whole heap activity that has happened during the session and that is what TTD.Data.Heap
gives you:
0:000> dx @$HeapOps=@$cursession.TTD.Data.Heap()
...
0:000> dx @$HeapOps.Count()
@$HeapOps.Count() : 0x414
0:000> dx @$HeapOps[137]
@$HeapOps[137] : [object Object]
Action : Free
Heap : 0x21b5dcd0000
Address : 0x21b5dcee790
Flags : 0x0
Result : 0x1
TimeStart : 13A1:184 [Time Travel]
TimeEnd : 13A2:27 [Time Travel]
And of course do not forget that all these collections are queryable. We can easily find out what are all the other heap operations that are not alloc
or free
with the below query:
0:000> dx @$NoFreeAlloc=@$HeapOps.Where(c => c.Action != "Free" && c.Action != "Alloc")
...
0:000> dx -g @$NoFreeAlloc
============================================================================================================
= = Action = Heap = Result = (+) TimeStart = (+) TimeEnd =
============================================================================================================
= [0x382] : [object Object] - Lock - 0x21b5dcd0000 - 0xb87f4e3001 - 1ADE:602 - 1ADF:14 =
= [0x386] : [object Object] - Unlock - 0x21b5dcd0000 - 0xb87f4e3001 - 1AE0:64 - 1AE1:13 =
= [0x38d] : [object Object] - Lock - 0x21b5dcd0000 - 0xb87f4e3001 - 1B38:661 - 1B39:14 =
= [0x391] : [object Object] - Unlock - 0x21b5dcd0000 - 0xb87f4e3001 - 1B3A:64 - 1B3B:13 =
= [0x397] : [object Object] - Lock - 0x21b5dcd0000 - 0xb87f4e3001 - 1BF0:5F4 - 1BF1:14 =
= [0x399] : [object Object] - Unlock - 0x21b5dcd0000 - 0xb87f4e3001 - 1BF1:335 - 1C1E:13 =
...
Extend the data model
After consuming all the various features available in the data model, I am sure you guys are wondering how you can go and add your own node and extend it. In order to do this, you can use the API host.namedModelParent
.
class host.namedModelParent
An object representing a modification of the object model of the debugger.
This links together a JavaScript class (or prototype) with a data model.
The JavaScript class (or prototype) becomes a parent data model (e.g.: similar to a prototype)
to the data model registered under the supplied name.
An instance of this object can be returned in the array of records returned from
the initializeScript method.
Let's say we would like to add a node that is associated with a Process
called DiaryOfAReverseEngineer
which has the following properties:
- DiaryOfAReverseEngineer
- Foo - string
- Bar - string
- Add - function
- Sub
- SubBar - string
- SubFoo - string
Step 1: Attach a node to the Process
model
Using host.namedModelParent
you get the opportunity to link a Javascript class to the model of your choice. The other thing to understand is that this feature is made to be used by extension (as opposed to imperative) scripts.
Extension and imperative scripts are basically the same but they have different entry points: extensions use initializeScript
(the command .scriptload
invokes this entry point) and imperative scripts use invokeScript
(the command .scriptrun
invokes both the initializeScript
and invokeScript
). The small difference is that in an extension script you are expected to return an array of registration objects if you want to modify the data model, which is exactly what we want to do.
Anyway, let's attach a node called DiaryOfAReverseEngineer
to the Process
model:
//extendmodel_1.js
"use strict";
class ProcessModelParent {
get DiaryOfAReverseEngineer() {
return 'hello from ' + this.Name;
}
}
function initializeScript() {
return [new host.namedModelParent(
ProcessModelParent,
'Debugger.Models.Process'
)];
}
Once loaded you can go ahead and check that the node has been added:
0:000> dx @$curprocess
@$curprocess : PING.EXE [Switch To]
Name : PING.EXE
Id : 0x1cfc
Threads
Modules
Environment
TTD
DiaryOfAReverseEngineer : hello from PING.EXE
One important thing to be aware of in the previous example is that the this
pointer is effectively an instance of the data model you attached to. In our case it is an instance of the Process
model and as a result you can access every property available on this node, like its Name
for example.
Step 2: Add the first level to the node
What we want to do now is to have our top node exposing two string properties and one function (we’ll deal with Sub
later). This is done by creating a new Javascript class that represents this level, and we can return an instance of this said class in the DiaryOfReverseEngineer
property. Simple enough uh?
//extendmodel_2.js
"use strict";
class DiaryOfAReverseEngineer {
constructor(Process) {
this.process = Process;
}
get Foo() {
return 'Foo from ' + this.process.Name;
}
get Bar() {
return 'Bar from ' + this.process.Name;
}
Add(a, b) {
return a + b;
}
}
class ProcessModelParent {
get DiaryOfAReverseEngineer() {
return new DiaryOfAReverseEngineer(this);
}
}
function initializeScript() {
return [new host.namedModelParent(
ProcessModelParent,
'Debugger.Models.Process'
)];
}
Which gives:
0:000> dx @$curprocess
@$curprocess : PING.EXE [Switch To]
Name : PING.EXE
Id : 0x1cfc
Threads
Modules
Environment
TTD
DiaryOfAReverseEngineer : [object Object]
0:000> dx @$curprocess.DiaryOfAReverseEngineer
@$curprocess.DiaryOfAReverseEngineer : [object Object]
process : PING.EXE [Switch To]
Foo : Foo from PING.EXE
Bar : Bar from PING.EXE
From the previous dumps there are at least two things we can do better:
1) The DiaryOfAReverseEngineer
node has a string representation of [object Object]
which is not great. In order to fix that we can just define our own toString
method and return what we want.
2) When displaying the DiaryOfAReverseEngineer
node, it displays the instance properties process
where we keep a copy of the Process
model we attached to. Now, this might be something you want to hide to the user as it has nothing to do with whatever this node is supposed to be about. To solve that, we just have to prefix the field with __
.
(If you are wondering why we do not see the method Add
you can force dx
to display it with the -v
flag.)
After fixing the two above points, here is what we have:
// extendmodel_2_1.js
"use strict";
class DiaryOfAReverseEngineer {
constructor(Process) {
this.__process = process;
}
get Foo() {
return 'Foo from ' + this.__process.Name;
}
get Bar() {
return 'Bar from ' + this.__process.Name;
}
Add(a, b) {
return a + b;
}
toString() {
return 'Diary of a reverse-engineer';
}
}
class ProcessModelParent {
get DiaryOfAReverseEngineer() {
return new DiaryOfAReverseEngineer(this);
}
}
function initializeScript() {
return [new host.namedModelParent(
ProcessModelParent,
'Debugger.Models.Process'
)];
}
And now if we display the Process
model:
0:000> dx @$curprocess
@$curprocess : PING.EXE [Switch To]
Name : PING.EXE
Id : 0x1cfc
Threads
Modules
Environment
TTD
DiaryOfAReverseEngineer : Diary of a reverse-engineer
0:000> dx @$curprocess.DiaryOfAReverseEngineer
@$curprocess.DiaryOfAReverseEngineer : Diary of a reverse-engineer
Foo : Foo from PING.EXE
Bar : Bar from PING.EXE
0:000> dx @$curprocess.DiaryOfAReverseEngineer.Add(1, 2)
@$curprocess.DiaryOfAReverseEngineer.Add(1, 2) : 0x3
Step 3: Adding another level and an iterable class
At this stage, I am pretty sure that you guys are starting to get the hang of it. In order to add a new level, you can just define yet another class, define a property in the DiaryOfAReverseEngineer
class and return an instance of it. And that's basically it.
The last concept I wanted to touch on before moving on is how to add the iterable
property on one of your data model classes. Let's say you have a class called Attribute
that stores a key and a value, and let's also say you have another class called Attributes
that is an Attribute
store. The thing is, you might have noticed that one class instance usually corresponds to a node with its own properties in the data model view. This is not great for our Attributes
class as it is basically an array of Attribute
objects, meaning that we will have two copies of everything..
If you want to have the debugger be able to iterate on your instance you can define a *[Symbol.iterator]()
method like this:
// Attributes iterable
class Attribute {
constructor(Process, Name, Value) {
this.__process = Process;
this.Name = Name;
this.Value = Value;
}
toString() {
let S = 'Process: ' + this.__process.Name + ', ';
S += 'Name: ' + this.Name + ', ';
S += 'Value: ' + this.Value;
return S;
}
}
class Attributes {
constructor() {
this.__attrs = [];
}
push(Attr) {
this.__attrs.push(Attr);
}
*[Symbol.iterator]() {
for (let Attr of this.__attrs) {
yield Attr;
}
}
toString() {
return 'Attributes';
}
}
Now if we put it all together we have:
// extendmodel.js
"use strict";
class Attribute {
constructor(Process, Name, Value) {
this.__process = Process;
this.Name = Name;
this.Value = Value;
}
toString() {
let S = 'Process: ' + this.__process.Name + ', ';
S += 'Name: ' + this.Name + ', ';
S += 'Value: ' + this.Value;
return S;
}
}
class Attributes {
constructor() {
this.__attrs = [];
}
push(Attr) {
this.__attrs.push(Attr);
}
*[Symbol.iterator]() {
for (let Attr of this.__attrs) {
yield Attr;
}
}
toString() {
return 'Attributes';
}
}
class Sub {
constructor(Process) {
this.__process = Process;
}
get SubFoo() {
return 'SubFoo from ' + this.__process.Name;
}
get SubBar() {
return 'SubBar from ' + this.__process.Name;
}
get Attributes() {
let Attrs = new Attributes();
Attrs.push(new Attribute(this.__process, 'attr0', 'value0'));
Attrs.push(new Attribute(this.__process, 'attr1', 'value0'));
return Attrs;
}
toString() {
return 'Sub module';
}
}
class DiaryOfAReverseEngineer {
constructor(Process) {
this.__process = Process;
}
get Foo() {