Debugging an embedded, statically linked interpreter

Does the Python debugger require any special magic inside the interpreter to signal that it can be debugged? I.e., does the debugger look for Python33.dll in the modules list, etc.? I ask because my application embeds the interpreter statically, so there
is no Python33.exe or Python33.dll, etc.; just my app. I do build the interpreter with debug symbols for my debug build, so I can actually step into the native Py* function calls, but I haven't yet discovered the recipe that will allow me to hit breakpoints
in my Python script. How does the debugger figure out that my process is Python-debuggable (or how can I get it to broadcast that)?

If you're using mixed-mode Python/C++ debugging, then that requires the interpreter to be linked dynamically - it specifically looks for python??.dll or python??_d.dll to find the functions that it needs to intercept for that to work.

If you're using the regular pure Python debugging, then it depends on how you attach. Attaching via Debug -> Attach to Process will also require python??.dll to be loaded. However, if you use remote debugging (ptvsd),
then it doesn't matter how your interpreter is loaded - or whether it is even CPython at all - provided that it can load and run the debugger Python code.
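
For reference, enabling remote debugging from inside an embedded interpreter could look roughly like the sketch below. It assumes the ptvsd 2.x/3.x-style call where enable_attach takes a shared secret as its first argument (later ptvsd releases changed that signature), and the helper name, the default port, and the module_name parameter are only illustrative. The function returns False rather than failing if the debugger package was not bundled with the app.

```python
import importlib


def enable_remote_debugging(secret, address=("0.0.0.0", 5678),
                            wait=False, module_name="ptvsd"):
    """Try to start the ptvsd debug server inside this interpreter.

    Returns True if the server was started, False if the debugger package
    could not be imported (for example, it was not bundled with the app).
    """
    try:
        debugger = importlib.import_module(module_name)
    except ImportError:
        return False

    # Assumption: ptvsd 2.x/3.x-style call, where the first argument is a
    # shared secret. Later ptvsd releases changed this signature.
    debugger.enable_attach(secret, address=address)
    if wait:
        # Block until Visual Studio attaches, so early breakpoints are hit.
        debugger.wait_for_attach()
    return True
```

The host application would run this once, early in its startup script, and Visual Studio would then connect through the remote attach transport instead of looking for python??.dll in the process.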

I wish that wasn't the case though; I'd really rather not require the DLL, and it is going to take a bit more effort to get ptvsd to work in my scenario. For a bit more background, I have customized my interpreter to run inside a Windows Store app package,
stripping out functionality that was not WACK-friendly. Since it is part of a winmd DLL, intended to be used by other projects, I wanted to get rid of the additional DLL dependency, since it gets complicated with non-C++ projects that aren't props file friendly
(i.e., I need to copy the ARM DLL when that target builds, and the same for Win32, debug, release, etc.).

The problem with not having a DLL is that it becomes much harder to find all the interpreter guts. Mixed-mode could hypothetically still do it because it, for the most part, uses symbols to find things (but occasionally it also uses DLL exports)
- so adding a configurable parameter telling it where to look would be possible. For regular attach, though, it uses the export table only (and hence doesn't need symbols) - and there simply wouldn't be the export table in your case.

Ah, interesting. It would be really useful to have a switch like you describe for mixed-mode. I would actually love to be able to F5 my project into mixed Native/Python or Native/Managed/Python debugging. Currently in all cases I will have to launch without
the debugger, and attach after the fact. Thanks again for the insight. I'll have to rethink my approach potentially.

We actually almost have everything that you'd need for this. In the Python project properties, on the Debug tab, you can specify your own binary under Interpreter Path - you could make this your C++ project output. It will then be launched on F5, paused,
and attached to.

The only problem there is that we assume that it is actually an interpreter, and so the command line that we pass to it will include the startup .py file name. Unfortunately, while you can remove the startup file name on the General tab in project properties,
we won't let you start the project if it's not set. It seems like disabling this check when a custom interpreter path is enabled would do the trick.

I spent some time and experimented with that idea after you suggested it. I built a win32 console app that consumed my Python lib (modified slightly, since it is not an app container) and imported and used a py module. Then I made a Python project, and
tried to launch the console app as the interpreter; but it refused to launch, saying that it wasn't a valid interpreter.

This sounds strange. I can successfully set Interpreter Path in project properties to, say, C:\Windows\System32\notepad.exe, and it launches (with the startup file open, since that was passed as an argument) when I press F5; with the "Enable native code debugging" box checked, it also attaches to it. If we can launch & debug
Notepad like that, it would seem that any other binary should also work.

Can you make sure that Launch mode is set to "Standard Python launcher" in your project properties? If that is not it, can you describe your setup in more detail? (A screenshot of the project properties would probably be most descriptive here.)

The output from "Tools -> Diagnostic Info..." will also include some details about the currently open project, so you could email that to ptvshelp@microsoft.com (with a reference to this discussion so we know where it came from) as well.

Prompted by your response, I dug deeper, and it turned out to be a silly mistake on my part, though it could arguably be a bug. I obtained the path to my built interpreter executable using Explorer's "Copy as Path", which sticks quotes around
the path. I pasted it in as is, since usually quotes are okay in URIs. Removing the quotes allowed it to launch my program.
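
For anyone scripting this kind of setup, the quote problem can be handled by normalizing a pasted path before using it. The helper below is only a sketch with a made-up name and a hypothetical example path; it is not something PTVS itself does.

```python
def clean_pasted_path(text):
    """Strip whitespace and one pair of surrounding double quotes."""
    path = text.strip()
    # Explorer's "Copy as Path" wraps the whole path in double quotes.
    if len(path) >= 2 and path[0] == path[-1] == '"':
        path = path[1:-1]
    return path


# Hypothetical path, as it would look when pasted from Explorer.
pasted = '"C:\\Builds\\Debug\\MyHost.exe"'
interpreter_path = clean_pasted_path(pasted)
```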

That said, it still didn't give me the mixed Python debugging experience. So I temporarily switched my Python build over to be a DLL project instead of a static lib, and got a different problem. However (and this was exciting), I tried instead with my test
Windows Store app, linking in my custom Python build, now a DLL. Since I can't launch the Python mixed debugger from the native project properties, I executed the AppX, used Attach to Process, and saw that it did recognize that there was a Python environment
inside the process! Using the mixed debugger, I was able to inspect PyObject* variables using the [Python] node. Really cool. What didn't work though, was that it wasn't able to find (or associate) my py module with the one in the process; so I couldn't step
through Python code.

But I'm starting to wrap my mind around how this works now, and I think I can see why there is a limitation here. I'm guessing for performance reasons, the debugger looks for the Python DLL to avoid searching every module for the right symbols. I think a useful
addition would be the ability to specify additional assembly names (possibly in the options).

Yes, it specifically searches for python??.dll or python??_d.dll (and even then only for supported versions, so 27, 33 or 34 right now), and only then requests the symbols for that particular module.

If you want to experiment with tweaking that, you can try building your own version of PTVS with changes. The build instructions are really straightforward, and for this you'd only need VS and VS SDK to build the core projects (and ignore all the HPC etc stuff that drags in other dependencies).
Here is the code that detects whether a particular loaded module is a Python runtime DLL - it should be fairly trivial to add your .exe name to that list. If that works, I can help you work this into a full-fledged feature where the module name is defined
as a property in the project file, and then we'll merge it into the product.
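
To make the detection rule concrete, here is a rough Python sketch of the name check described above, extended with the kind of configurable module list being proposed. This is an illustration only, not the PTVS code (which is not written in Python); the function name, the extra_modules parameter, and the example host name are all hypothetical.

```python
import re

# Versions named in this thread as supported: 2.7, 3.3 and 3.4.
SUPPORTED_VERSIONS = {(2, 7), (3, 3), (3, 4)}

# Matches python??.dll and python??_d.dll, ignoring case.
_RUNTIME_DLL = re.compile(r"^python(\d)(\d)(_d)?\.dll$", re.IGNORECASE)


def python_version_for_module(module_name, extra_modules=None):
    """Return (major, minor) if the module hosts a Python runtime, else None.

    extra_modules is a hypothetical mapping of additional module file
    names (e.g. a host .exe with a static interpreter) to their version.
    """
    extra = {name.lower(): ver for name, ver in (extra_modules or {}).items()}
    if module_name.lower() in extra:
        return extra[module_name.lower()]

    match = _RUNTIME_DLL.match(module_name)
    if match is None:
        return None
    version = (int(match.group(1)), int(match.group(2)))
    return version if version in SUPPORTED_VERSIONS else None
```

With no extra entries, a statically linked host such as MyHost.exe is never recognized; passing extra_modules={"MyHost.exe": (3, 3)} is the equivalent of adding the .exe name to the list.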

We haven't tried AppX debugging with this, but it's probably not really any different. The reason why the code might not be showing up properly is because the files in the project being debugged have paths different from what it has at runtime (because it's
deployed to its container before being launched). Still, I did some work specifically to handle such a mismatch, for remoting purposes, so I would be interested in pursuing this further. Can you check what the full paths of the modules in the debugger Modules
window (not shown by default, but you can enable it in Debug -> Windows) look like?

I thought it was interesting that the slashes are not normalized; but I'm guessing this isn't the problem. Is it possible the debugger doesn't have permission to load the file out of the app container? Still, it is odd that I can open the file from the project
directory and set a breakpoint that appears to be active (it lights up when the debugger is running, and seems to indicate it is linked up where it needs to be). Additionally, the debugger actually seems to stop at the breakpoint, but is unable to open the
file--I get the "can't find sources/disassembly" page that normally shows up in native debugging when PDBs, etc, are not available. I tried the browse link from there also, and it seemed to do nothing useful.

As for the experiment, I need to dig a little deeper, but adding code to make GetPythonLanguageVersion return a valid version when it hits my module makes some, but not all the plumbing work. I can see that it does find symbols and information (interesting
was the static "initialized" variable that I saw it pull successfully out of pythonrun.obj), but the watches window doesn't give me the [Python] node yet. I'll let you know as I find more.

The debugger runs inside VS, so it would have the same access as VS itself. So you can try opening the file from that path in VS editor to see if it has access. I just tried and, indeed, accessing the apps required me to elevate...

Having said that, this path doesn't look like it's coming from the app container to me - it doesn't start with C:\Program Files\WindowsApps, which is where the individual containers are. I wonder if that is actually the problem. Could it be that you end up deploying .pyc files that are compiled with the development paths, but then run from the app container?
I'm not quite sure what the implication of such a mismatch would be, but things might break.
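
The reason compiled files can carry development paths is that a code object records the filename it was compiled with, and that value survives serialization. The sketch below demonstrates this with a hypothetical development path; whether the path gets rewritten at load time depends on how the embedding host loads the bytecode, which is an open question here.

```python
import marshal

# Hypothetical development-machine path used at compile time.
dev_path = r"C:\dev\project\mymodule.py"
source = "def greet():\n    return 'hello'\n"

# Compile as if on the development machine, then serialize the code
# object with marshal, the same format used for the body of a .pyc file.
code = compile(source, dev_path, "exec")
payload = marshal.dumps(code)

# "Deploy": load the bytes back without any reference to the new location.
restored = marshal.loads(payload)
namespace = {}
exec(restored, namespace)

# Functions defined by the module still report the compile-time path,
# which is what a debugger sees when it asks where the code came from.
recorded = namespace["greet"].__code__.co_filename
print(recorded)
```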

It might also be interesting to look at ModuleManager.FindDocuments - that's the bit that matches filenames (e.g. the one in breakpoint vs the one in the loaded module). It does some filesystem walking there in case the project and the runtime are not off the
same directory, to match files regardless. This should cover your case, but placing a breakpoint there and stepping through might provide some clues as to what is going wrong.
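
As a rough illustration of that kind of matching (not the actual ModuleManager.FindDocuments logic, which I am not reproducing here), one simple approach is to normalize slashes and case and then prefer the project file that shares the longest run of trailing path components with the runtime path. All names and paths below are hypothetical.

```python
import ntpath


def _components(path):
    # normpath unifies mixed slashes; normcase lower-cases, since
    # Windows paths compare case-insensitively.
    normalized = ntpath.normcase(ntpath.normpath(path))
    return [part for part in normalized.split("\\") if part]


def match_score(path_a, path_b):
    """Count how many trailing path components the two paths share."""
    score = 0
    for a, b in zip(reversed(_components(path_a)),
                    reversed(_components(path_b))):
        if a != b:
            break
        score += 1
    return score


def find_document(runtime_path, project_files):
    """Pick the project file that best matches a path seen at runtime."""
    best, best_score = None, 0
    for candidate in project_files:
        score = match_score(runtime_path, candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best
```

Given a runtime path like C:\Deploy\AppX/scripts\mod.py and project files under D:\src\proj, this picks D:\src\proj\scripts\mod.py over D:\src\proj\other\mod.py, and the mixed slashes make no difference.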

For debugging the statically linked interpreter, the Modules window also serves as a helper indicator that the Python debugger considers itself fully initialized. It won't fill the list with loaded .py files until such time that it can report itself as a distinct
runtime to VS, and it won't do so until everything is in place. So if you don't get any Python modules in that list, it means that it's still waiting for something to complete. NativeModuleInstanceLoadedNotification.Handle has some code that handles loading
of the main Python interpreter module (which should be your .exe in this case), and the injected debugger helper DLL. Setting the breakpoints on both might help figure out what's going on there.

As far as the path is concerned, when debugging Windows Store Apps they are run out of the AppX folder in the project directory. I don't recall whether this is true for Ctrl+F5 in VS, but it seems unlikely they'd be deployed to an All Users location (LocalAppData
is more likely).