Debugger Internals

Introduction

There are some bugs with the Debugger in Pharo (and Squeak). This part of the system is quite complex and this page aims to document some of the important classes and interactions. As we learn more, hopefully we an fix it!

Highlighting

So the entries of interest for highlighting are

                        
                            
                                1:
2:
3:
4:
5:

                                Debugger>>contentsSelection
Debugger>>pcRange
CompiledMethod>>debuggerMap
DebuggerMethodMap class>>forMethod:
DebuggerMethodMap>>rangeForPC:contextIsActiveContext:
                            

                    

Then you see the DebuggerMethodMap>>forMethod:methodNode: takes both a CompiledMethod and its #methodNode. CompiledMethod>>methodNode invokes the Parser to get the AbstractSyntaxTree from method source, and if it ever fails ends up by trying to decompile the byteCodes.

This is the easy part. Now we to deal with #abstractPCForConcretePC: and #abstractSourceMap.

By reading CompiledMethod>>abstractPCForConcretePC: you should quickly understand that a concrete PC is a byte offset of current byteCode (the offsets displayed in the byteCode view) while the abstractPC is just the rank of current byteCode in the list of byteCodes instructions composing the CompiledMethod. This is just because "byteCodes" may spread on several bytes beside their name... This will use InstructionStream and InstructionClient which are just an iterator and a sort of visitor on byteCode instructions. So this is not really interesting.

The more interesting part is #abstractSourceMap There is a first step to obtain CompiledMethod>>rawSourceRangesAndMethodDo: This is the most important part. The rest is again a mapping from concretePC (instruction byte offset) to abstractPC (instruction rank). And some build of a dictionary mapping instruction rank (abstractPC) -> selected range.

Note that the last trick seems to use a regenerated CompiledMethod (theMethodToScan) rather than the original CompiledMethod. There is no assertion whether these two are equivalent or not. A priori, they should, unless the Compiler changed since last compilation or if its behaviour is affected by some Preferences... Would we introduce some customizable Compiler optimizations that this could become a problem (We would then add to map decompiled AST to source code AST, probably with guesses, unless the CompiledMethod would contain debugger friendly hints...) We will consider this is not a problem by now.

So let's now concentrate on rawSourceRangesAndMethodDo: The nice thing is that you now can just debug this

                        
                                1:
2:

                                (ClosureTests>>#testToDoOutsideTemp) methodNode
rawSourceRangesAndMethodDo: [:rs :mth | ]

Also involves

                        
                            
                                1:
2:
3:
4:
5:

                                Parser
Encoder
MethodNode
EncoderForV3PlusClosures
BlockAnalyzer
                            

                    

A few more notes:

- Encoder>>rawSourceRanges answer a mapping AST Node -> sourceRange

Gnerally, I would expect sourceRange to be an Interval, but this has

to be confirmed.

Theses ranges are constructed at source code Parse time (see senders

of #noteSourceRange:forNode:) - The program counters are assigned to AST nodes at byte code #generate time - for inlined macros, like #to:do: in our case, this pc is after the initialize and test statements.

in CompiledMethod>>#rawSourceRangesAndMethodDo: I just evaluated this:

                        
                                1:
2:

                                methNode encoder rawSourceRanges collect: [:ival |
       sourceText copyFrom: ival first to: ival last]

And what I see is that the original to:do: message (before macros inlining) has the right range

                        
                            
                                1:
2:
3:
4:
5:
6:
7:
8:

                                {1
       to: 5
       do: [:index |
               temp := index.
               collection
                       add: [temp]]}->a Text for 'to: 5 do: [ :index |
               temp := index.
               collection add: [ temp ] ]'
                            

                    

This node is the original #to:do: and has pc=42

There is also {a LeafNode}->a Text for '[ :index |

temp := index. collection add: temp ]'

which has a nil pc so it won't be taken into account in the source map.

But one of the messages produced by the inlining has this curious range: {index <= 5}->a Text for 'to: 5 do: [ :index |

temp := index. c'

This node has pc = 40, so it will be selected before the correct one above has a chance to be. {index <= 5} is the test statement produced by inlining, and this seems to be the incorrect highlighting we see. Thus we know we have to concentrate on macro inlining. This happens in MessageNode>>transformToDo: We'll see this.

In the interim, I played with eliminating the nodes having unitilialized pc Let us try to evaluate this snippet in debugger's inspector:

                        
                            
                                1:
2:
3:
4:
5:

                                (methNode encoder rawSourceRanges keys select: [:e | e pc > 0])
collect: [:node |
       | ival |
       ival := methNode encoder rawSourceRanges at: node.
       node -> (sourceText copyFrom: ival first to: ival last)]
                            

                    

Oh god, a bug in the debugger: DoItIn: homeContext

^ ((homeContext namedTempAt: 2) encoder rawSourceRanges keys

select: [:e | e pc > 0]) collect: [:node |

| ival | ival := (node namedTempAt: 2) encoder rawSourceRanges at: node. node

-> (sourceText copyFrom: ival first to: ival last)]

(node namedTempAt: 2) does not mean a thing... A confusion occurred between the homeContext (DoItIn: method argument) and node (the block argument) Nevermind... forget about it.

Now let's just concentrate on MessageNode>>transformToDo: We see this code:

                        
                            
                                1:
2:
3:
4:
5:
6:

                                test := MessageNode new
    receiver: blockVar
    selector: (increment key > 0 ifTrue: [#<=] ifFalse: [#>=])
    arguments: (Array with: limit)
    precedence: precedence from: encoder
    sourceRange: (myRange first to: blockRange first).
                            

                    

So the intention seems to select 'to: 5 do: '

But we see this:

                        
                            
                                1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:

                                BlockNode>>noteSourceRangeStart:end:encoder:
       "Note two source ranges for this node.  One is for the debugger
        and is of the last expression, the result of the block.  One is for
        source analysis and is for the entire block."
       encoder
               noteSourceRange: (start to: end)
               forNode: self closureCreationNode.
       startOfLastStatement
               ifNil:
                       [encoder
                               noteSourceRange: (start to: end)
                               forNode: self]
               ifNotNil:
                       [encoder
                               noteSourceRange: (startOfLastStatement to: end - 1)
                               forNode: self]
                            

                    

So it seems intentional to select only the last instruction of the block for the debugger. We'd better not change this without prior asking Eliot. But obviously, this is not what is expected by the #transformToDo:

And by changing last line of this piece of code in #transformToDo:

                        
                            
                                1:
2:
3:
4:
5:
6:
7:

                                test := MessageNode new
    receiver: blockVar
    selector: (increment key > 0 ifTrue:
[#<=] ifFalse: [#>=])
    arguments: (Array with: limit)
    precedence: precedence from: encoder
    sourceRange: (myRange first to: blockRange last).
                            

                    

I get a "correct" selection...

-- Nicolas --

Hide

No comments

Wiki

Wiki

Debugger Internals

Introduction

Highlighting