Introduction

I've seen many people ask about utilities to view the tree of #included files in their source code. I think I've seen one of these before, either here at CodeProject or maybe at CodeGuru, but I can't seem to find it. Since I really enjoy text processing, I took it upon myself to write my own. What I've come up with is a fast, multi-threaded #include tree generator.

As you can see in the screenshot above, there are a few fields to fill in before you can generate the tree. I think it's pretty self-explanatory, but I will explain the fields here just in case some think it's not.

Search in

Type in a list of directories you want to search. You can enter multiple directories at one time by separating them with semicolons. Or, optionally, use the browse button ("...") to pick a directory from the standard Shell Browser. This combo box will store the recently searched directories, so you can also select a directory from the drop-down list if you like.

File mask

Type in a list of file masks you want to search for. You can enter multiple file masks at one time by separating them with semicolons. This combo box will store the recently used file masks, so you can also select a mask from the drop-down list if you like.
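To make the mask syntax concrete, here is a minimal sketch of how a semicolon-separated mask list could be applied to a file name. It is an illustration in standard C++, not the Include Finder's own code; the names SplitMaskList, MatchesMask and MatchesAnyMask are hypothetical, and the matching rules ('*' for any run of characters, '?' for exactly one, case-insensitive) are an assumption about what a typical Windows mask means.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: splits "a;b;c" into {"a", "b", "c"}, skipping empty entries.
std::vector<std::string> SplitMaskList(const std::string& list) {
    std::vector<std::string> parts;
    std::string current;
    for (char ch : list) {
        if (ch == ';') {
            if (!current.empty()) parts.push_back(current);
            current.clear();
        } else {
            current += ch;
        }
    }
    if (!current.empty()) parts.push_back(current);
    return parts;
}

static bool SameCharNoCase(char a, char b) {
    return std::tolower(static_cast<unsigned char>(a)) ==
           std::tolower(static_cast<unsigned char>(b));
}

// Assumed mask rules: '*' matches any run of characters (including none),
// '?' matches exactly one character, everything else matches case-insensitively.
bool MatchesMask(const std::string& name, const std::string& mask) {
    std::size_t n = 0, m = 0;
    std::size_t starMask = std::string::npos;  // position of the last '*' seen in mask
    std::size_t starName = 0;                  // name position that '*' is currently covering up to
    while (n < name.size()) {
        if (m < mask.size() && mask[m] == '*') {
            starMask = m++;
            starName = n;
        } else if (m < mask.size() &&
                   (mask[m] == '?' || SameCharNoCase(mask[m], name[n]))) {
            ++n;
            ++m;
        } else if (starMask != std::string::npos) {
            // Mismatch: let the last '*' swallow one more character and retry.
            m = starMask + 1;
            n = ++starName;
        } else {
            return false;
        }
    }
    while (m < mask.size() && mask[m] == '*') ++m;  // trailing stars match nothing
    return m == mask.size();
}

// True if the name matches at least one mask in a semicolon-separated list.
bool MatchesAnyMask(const std::string& name, const std::string& maskList) {
    for (const std::string& mask : SplitMaskList(maskList)) {
        if (MatchesMask(name, mask)) return true;
    }
    return false;
}
```

With these rules, a list such as *.cpp;*.c selects source files only. Note that this sketch treats *.* literally (the name must contain a dot), whereas the Windows file-finding functions treat *.* as "every file".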

Includes

Type in a list of directories where external header files reside. You can enter multiple include directories at one time by separating them with semicolons. It's also possible to enter an environment variable in this field, which the software will expand at runtime. To do this, simply prefix the variable with a '%' character (e.g. %INCLUDE).
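As an illustration of the expansion described here, the following sketch splits the field on semicolons and replaces each entry that starts with '%' by the directories stored in that environment variable. It is a minimal sketch under stated assumptions, not the program's actual implementation: ExpandIncludeField, SplitOnSemicolons and the EnvLookup callback are hypothetical names, a trailing '%' is tolerated, and an unknown variable is assumed to contribute nothing.

```cpp
#include <cstdlib>
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper: splits on ';' and drops empty entries.
std::vector<std::string> SplitOnSemicolons(const std::string& list) {
    std::vector<std::string> parts;
    std::string current;
    for (char ch : list) {
        if (ch == ';') {
            if (!current.empty()) parts.push_back(current);
            current.clear();
        } else {
            current += ch;
        }
    }
    if (!current.empty()) parts.push_back(current);
    return parts;
}

// The lookup is injectable so the expansion can be exercised without touching
// the real process environment; by default it reads it through std::getenv.
using EnvLookup = std::function<const char*(const char*)>;

const char* LookupProcessEnv(const char* name) { return std::getenv(name); }

// Expands a field such as "C:\sdk\inc;%INCLUDE" into a flat directory list.
std::vector<std::string> ExpandIncludeField(const std::string& field,
                                            const EnvLookup& lookup = LookupProcessEnv) {
    std::vector<std::string> dirs;
    for (const std::string& entry : SplitOnSemicolons(field)) {
        if (entry[0] != '%') {          // plain directory, keep as written
            dirs.push_back(entry);
            continue;
        }
        std::string name = entry.substr(1);
        if (!name.empty() && name.back() == '%') name.pop_back();  // accept %NAME% too
        const char* value = lookup(name.c_str());
        if (value == nullptr) continue; // assumption: unknown variables are skipped
        for (const std::string& dir : SplitOnSemicolons(value)) {
            dirs.push_back(dir);        // the variable may itself hold a list
        }
    }
    return dirs;
}
```

Keeping the variable's directories at the position where the variable was written preserves the search order the user typed.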

Recurse subdirectories

Set this check box if you want the program to recurse through subdirectories of the directories you entered.

When searching a workspace or project, you don't have to input as much information as when searching a directory, but you still have to load a project or workspace, specify a file mask, and select a configuration. The software will read settings related to preprocessing, such as additional include directories, and preprocessor definitions. Also, the VC6 project reader will read the registry for the VC6 include directories, and the VC7 project reader will read the registry for the VC7 include directories, which means if you have VC6 or VC7 configured already, you don't have to configure this program as well.

Note: The workspace and project loading isn't perfect yet, but I have tested it in a variety of scenarios, and it works well for me. The biggest remaining problem is interpreting the project settings accurately, and there are still issues with that.

Workspace

Type in the path to the workspace or project you want to load. Or, optionally, use the browse button ("...") to pick a workspace from the standard File Open dialog box. This combo box will store the recently searched workspaces, so you can also select a recent entry from the drop-down list if you like.

Load workspace

Click this button to load the selected workspace or project, so you can select a configuration. If you don't load the workspace before you click Start, the Include Finder will use the first configuration found in each project.

File mask

Type in a list of file masks you want to search for. You can enter multiple file masks at one time by separating them with semicolons. This combo box will store the recently used file masks, so you can also select a mask from the drop-down list if you like.

Configuration

Select a project configuration. The configuration you select will determine which preprocessor include directories are used, and also which preprocessor definitions are used.

Other options

Parse preprocessor macros

The program can now parse preprocessor macros, and this option turns that feature on or off. Since the parsing may not be 100% accurate (with respect to the compiler), I leave the choice to you.

Once you've entered some valid data, click Start and the program will start the search. The searching is done in a separate thread, so you will see the button's text turn to "Stop" once you click Start. To stop the search, simply click Stop and the thread will quit, turning the Stop button back into the Start button.
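The Start/Stop behaviour can be sketched with a worker thread and a shared stop flag. This is an illustration in standard C++ rather than the MFC code in the download; SearchWorker and its members are hypothetical names. The worker checks the flag between files, so a stop request takes effect once the current file is finished.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical worker: processes a list of files on a background thread and
// checks a stop flag between files.
class SearchWorker {
public:
    ~SearchWorker() { Stop(); }

    // Starts a new search; onFile is called on the worker thread for each file.
    void Start(std::vector<std::string> files,
               std::function<void(const std::string&)> onFile) {
        Stop();  // make sure a previous search has finished
        m_stop = false;
        m_thread = std::thread(
            [this, files = std::move(files), onFile = std::move(onFile)] {
                for (const std::string& file : files) {
                    if (m_stop.load()) break;  // the user asked us to stop
                    onFile(file);
                }
            });
    }

    // Asks the worker to stop after the file it is currently processing.
    void RequestStop() { m_stop = true; }

    // Blocks until the worker has finished.
    void Wait() {
        if (m_thread.joinable()) m_thread.join();
    }

    // What a Stop button would do: request the stop, then wait for it.
    void Stop() {
        RequestStop();
        Wait();
    }

private:
    std::atomic<bool> m_stop{false};
    std::thread m_thread;
};
```

A flag that is only checked between files cannot interrupt a parser that is stuck inside one file; for that, the flag would also have to be polled inside the parsing loop.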

The output is shown in a standard Win32 tree control, using MFC's CTreeCtrl. As each source file is completed, the program populates the tree with the source file and all the files #included by it. You will notice that there are a few icons used in the tree that denote properties of the item. Here's an explanation of them:

This icon is used to denote a regular file.

This icon is used to denote that the instance of the file is not the first time it was included by the current source file (either directly or indirectly).

This icon is used to denote that the file could not be found along the specified include paths.

This icon is used to denote that the file has been recursively included.
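The four icon states can be thought of as a classification made while the tree is being built. The sketch below assumes a pre-computed map from each file to the files it directly includes; IncludeMap, NodeKind, BuildNode and BuildTree are hypothetical names, and the decision not to expand a file that was already shown is an assumption of this sketch, not a statement about the program.

```cpp
#include <algorithm>
#include <map>
#include <set>
#include <string>
#include <vector>

// One value per icon described above.
enum class NodeKind { Regular, AlreadyIncluded, NotFound, Recursive };

struct Node {
    std::string file;
    NodeKind kind;
    std::vector<Node> children;
};

// Hypothetical input: file name -> files it #includes directly.
// A file that has no entry in the map is treated as "could not be found".
using IncludeMap = std::map<std::string, std::vector<std::string>>;

Node BuildNode(const std::string& file, const IncludeMap& includes,
               std::set<std::string>& seen, std::vector<std::string>& stack) {
    Node node{file, NodeKind::Regular, {}};
    auto it = includes.find(file);
    if (it == includes.end()) {
        node.kind = NodeKind::NotFound;
        return node;
    }
    // Already on the current include chain: the file includes itself, directly or indirectly.
    if (std::find(stack.begin(), stack.end(), file) != stack.end()) {
        node.kind = NodeKind::Recursive;
        return node;
    }
    // Seen earlier for this source file: mark it, and (by assumption) do not expand it again.
    if (!seen.insert(file).second) {
        node.kind = NodeKind::AlreadyIncluded;
        return node;
    }
    stack.push_back(file);
    for (const std::string& child : it->second) {
        node.children.push_back(BuildNode(child, includes, seen, stack));
    }
    stack.pop_back();
    return node;
}

// Builds the tree for one source file.
Node BuildTree(const std::string& source, const IncludeMap& includes) {
    std::set<std::string> seen;
    std::vector<std::string> stack;
    return BuildNode(source, includes, seen, stack);
}
```

The recursion check comes before the "already included" check on purpose: a file on the current chain has necessarily been seen before, so the order decides which of the two icons it gets.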

To search the tree for a particular file, select the Find... menu item, or press F3 to search for the next occurrence of a string.

If you click on the Save toolbar icon, or select the Save Results... menu item, you will be prompted to enter a file name to save to. The results are saved in XML format like the sample shown below.

Sample XML output (viewed with IE)

Once you have saved an XML file, you can click the Open toolbar icon, or select the Open... menu item to select a file to reload. The XML tree will be reloaded into the Include Finder.

About the code

Note: At my work we have a coding standard which is a form of Hungarian notation. I've also used this standard in this project. It's what I've been using for the past few years, and I like it, but that's not the point of the article or the code, so please don't flame me for it.

I've tried to keep the core code of the program as separated from the interface as possible, so that it will be easy for someone to reuse the interesting portions. Since the code has grown substantially in size since the first version, I decided to omit descriptions of the classes. If you are interested in using the code, you can find a description of what's in each file in the headers at the top of them.

Acknowledgements

I'd like to thank the following people:

Neville Franks for sending me his .dsp/.dsw reading code, which helped make the parsing of them more robust.

CParser::FindFullPath had a bug which made it find the incorrect file in certain cases.

CLexer incorrectly handled certain characters that could be found in binary files.

CLexer incorrectly handled files which had a strange combination of \r\n, which caused a crash.

July 30th, 2003 (bug fixes and some changes to run on NT 4 without IE shell update)

Fixed a bug where FindFullPath would incorrectly handle two files with the same name but in different paths. - Thanks to Oliver Wahl.

Fixed a bug where bracketed include directories from project settings weren't read. - Thanks to EvanKeats.

Removed the use of SHGetSpecialFolderPath and replaced it with SHGetSpecialFolderLocation/SHGetPathFromIDList. - Thanks to brownfox and Hemal Shah.

Disclaimer and copyright

Although great care has gone into developing this software, it is provided without any guarantee of reliability, accuracy of information, or correctness of operation. I am not responsible for any damages that may occur as a result of using this software. Use this software entirely at your own risk. Copyright 2003, Chris Richardson.

Final notes

Any feedback or bug reports will be greatly appreciated. If you do or do not like the article, the code, or the program I've provided, please tell me why.

Thanks for taking the time to read my article. I hope you enjoyed it, and I hope you find the Include Finder useful.

(1) two instances where the return value of _tcschr is assigned to TCHAR pointer (must be TCHAR const *)

(2) two for's that break due to the now standard-compliant scope rules

(3) above line (the vector iterators are no longer pointers) but this fixes both problems:

I appreciate the help, thanks again. I hadn't noticed the break statement either.

If you are interested in the code base, I could talk to Chris M. about making you a co-author so you could upload your changes. Otherwise if you'd like to send them to me I will make sure to attribute the fixes to you, in the article and code.

Like I said before though, the code is pretty rough, and it has been a few years since I've had time to work on it.

Do you mind if I ask what you are thinking about adding? Only out of curiosity; you are of course welcome to do whatever you'd like with the code.

If you are interested in the code base, I could talk to Chris M. about making you a co-author so you could upload your changes

I was going to suggest that if I get anything useful done I think that's better than a separate article.

I am thinking of the following:

Finding include paths (how come a.cpp depends on b.h?)

Inverting the tree (i.e. starting at a file and seeing which files it gets included into)

Some better control over the include directories, like not scanning PSDK headers (and it looks like VC8 include directories need to be supported, too)

Better command line support (so it can be added to Extras/Tools conveniently)
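The inverted tree from this list can be sketched as a reverse lookup over the include relation. This is an illustration only, assuming the direct includes of every file are already known; IncludeMap, ReverseMap, InvertIncludes and AllDependents are hypothetical names, not part of the existing code base.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical input: file name -> files it #includes directly.
using IncludeMap = std::map<std::string, std::vector<std::string>>;
// Inverted relation: file name -> files that #include it directly.
using ReverseMap = std::map<std::string, std::set<std::string>>;

ReverseMap InvertIncludes(const IncludeMap& includes) {
    ReverseMap includedBy;
    for (const auto& entry : includes) {
        for (const std::string& header : entry.second) {
            includedBy[header].insert(entry.first);
        }
    }
    return includedBy;
}

// Every file that pulls in `file`, directly or through other headers.
std::set<std::string> AllDependents(const std::string& file,
                                    const ReverseMap& includedBy) {
    std::set<std::string> result;
    std::vector<std::string> pending{file};
    while (!pending.empty()) {
        std::string current = pending.back();
        pending.pop_back();
        auto it = includedBy.find(current);
        if (it == includedBy.end()) continue;
        for (const std::string& parent : it->second) {
            // Each file is queued once, so include cycles cannot loop forever.
            if (result.insert(parent).second) pending.push_back(parent);
        }
    }
    return result;
}
```

The size of the set returned for a header is also a rough measure of how "hot" that header is, since it counts the files that have to be rebuilt when it changes.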

VC6 also has a problem accessing the headers during a build while you are scanning - it looks like you are doing everything right (_tfopen opens with "share deny none" at least with the VC8 runtime, so this might be a VC6 problem)

Other things I am now thinking of: some metrics (how "hot" is a file, lines of code), using a checkbox to enable/disable (comment/uncomment) #include statements, so one can go looking for unused ones, stripping a base folder from paths (if they are local to it) ....

But these are just ideas, I am currently just checking if I can get along with the code base and I don't know how much time and energy I can put into it.

The company I work for has a Visual Studio Addin that shows include graphs inside Visual Studio itself. They update automatically as you type and are very configurable. See my sig for more details. It's commercial but cheap, and comes with a 14-day free trial.

I recently made a fix and added an enhancement to IncludeFinder that I wanted to share.

The fix has to do with the search order of include directories: "standard" include directories should be searched last, not first, and the latest version has this backwards. This only matters if you're including a file with the same name as one of the "standard" include files, in which case the wrong file gets picked up. (In my case, it was IMessage.h).
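A minimal sketch of the search order argued for here: the including file's own directory first (for quoted includes only), then the project's additional include directories, and the "standard" directories last. It simplifies what the compiler really does and is not the attached patch; SearchPaths, JoinPath, ResolveInclude and the injected FileExists callback are hypothetical names.

```cpp
#include <functional>
#include <string>
#include <vector>

struct SearchPaths {
    std::vector<std::string> projectDirs;   // e.g. additional include directories from the project
    std::vector<std::string> standardDirs;  // e.g. the compiler's own include directories
};

// Injected so the lookup logic can be exercised without touching the disk.
using FileExists = std::function<bool(const std::string&)>;

std::string JoinPath(const std::string& dir, const std::string& name) {
    if (dir.empty()) return name;
    char last = dir.back();
    return (last == '/' || last == '\\') ? dir + name : dir + "/" + name;
}

// Returns the first existing candidate, or an empty string if none is found.
// `name` is kept exactly as written in the #include, including any relative
// part such as "sub/file.h", so that part is never dropped from the lookup.
std::string ResolveInclude(const std::string& name, bool isBracketed,
                           const std::string& includingDir,
                           const SearchPaths& paths, const FileExists& exists) {
    std::vector<std::string> order;
    if (!isBracketed) order.push_back(includingDir);  // "file.h": look next to the includer first
    order.insert(order.end(), paths.projectDirs.begin(), paths.projectDirs.end());
    order.insert(order.end(), paths.standardDirs.begin(), paths.standardDirs.end());
    for (const std::string& dir : order) {
        std::string candidate = JoinPath(dir, name);
        if (exists(candidate)) return candidate;
    }
    return std::string();
}
```

Because the standard directories sit at the end of the list, a project header that happens to share its name with a standard header is the one that gets picked up.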

The enhancement supports the use of environment variables to specify include directories in project files. Visual Studio supports this, but IncludeFinder does not without the attached patch.

Hi, I'm using 'doxygen' with 'graphviz' to document my projects and it's doing a great job! It would be very nice to have a UML diagram of classes from a Visual C++ project. Could you please tell me if you could do that? Thanks in advance.

Hi, Thank you for writing this! There've been times in the past when this would have been very useful.

One possible enhancement I could suggest would be to allow a workspace to be passed in on the command line. I saw that there is a standard treatment of CCommandLineInfo in the InitInstance but I think it currently treats the input file as a results document. If the workspace can be input in this way then I believe one can add a verb to the Windows file association application and hence be able to right click on a workspace from a File Explorer and launch the IF Hierarchy viewer that way! Just an idea, you probably have left this code behind now!

But people never start one of these without asking for something, do they? So ...

Sometimes you have a conflict with something in a header being included five levels deep. Often there's some preprocessor symbol you could define or undefine that would skip over the problem definition. (Like NOSYSMETRICS etc. in Windows.h, or even a multi-include guard like _WINDOWS_) Tracing the various #ifdef (and #ifndef, #if defined(), #else, etc.) directives can be tedious and error-prone.

Would it be possible to track and report the preprocessor state that was necessary for each file to get included when it was?
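One way the requested report could be approached is to keep a stack of the enclosing conditional directives while scanning a file and to attach a copy of that stack to every #include. The sketch below is purely an illustration of the idea and not something the Include Finder does today: it is line-based, does not evaluate the conditions, ignores comments and line continuations, and expects whitespace after the directive name. All names in it are hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

struct IncludeRecord {
    std::string file;                     // as written, with its quotes or brackets
    std::vector<std::string> conditions;  // enclosing conditions, outermost first
};

// One nesting level of #if / #elif / #else.
struct Level {
    std::vector<std::string> rejected;  // earlier branches, all of which must be false
    std::string current;                // condition of the active branch; empty inside #else
};

std::string Trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    return s.substr(b, s.find_last_not_of(ws) - b + 1);
}

std::string Describe(const Level& level) {
    std::string text;
    for (const std::string& r : level.rejected) {
        if (!text.empty()) text += " && ";
        text += "!(" + r + ")";
    }
    if (!level.current.empty()) {
        if (!text.empty()) text += " && ";
        text += level.current;
    }
    return text;
}

std::vector<IncludeRecord> CollectIncludeConditions(const std::string& source) {
    std::vector<IncludeRecord> records;
    std::vector<Level> stack;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        line = Trim(line);
        if (line.empty() || line[0] != '#') continue;
        std::string rest = Trim(line.substr(1));
        std::size_t split = rest.find_first_of(" \t");
        std::string directive = rest.substr(0, split);
        std::string arg = (split == std::string::npos) ? "" : Trim(rest.substr(split));

        if (directive == "ifdef") {
            stack.push_back({{}, "defined(" + arg + ")"});
        } else if (directive == "ifndef") {
            stack.push_back({{}, "!defined(" + arg + ")"});
        } else if (directive == "if") {
            stack.push_back({{}, arg});
        } else if ((directive == "elif" || directive == "else") && !stack.empty()) {
            // The branch we are leaving must have been false to get here.
            stack.back().rejected.push_back(stack.back().current);
            stack.back().current = (directive == "elif") ? arg : "";
        } else if (directive == "endif" && !stack.empty()) {
            stack.pop_back();
        } else if (directive == "include") {
            IncludeRecord record{arg, {}};
            for (const Level& level : stack) record.conditions.push_back(Describe(level));
            records.push_back(record);
        }
    }
    return records;
}
```

Reading the recorded conditions from outermost to innermost shows which symbols would have to be defined or undefined for a particular #include to be skipped.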

(I'd love to look at the sources and add this myself, but I know I won't have time in the foreseeable future, so I offer the idea here in case somebody else has time.)

If you were to select the "Directory" tab, and under "File mask" enter "*.*", the screen begins to fill up and about a second later, the entire screen disappears. To prevent this from happening, the user has to uncheck the "Recurse subdirectories" checkbox.

The suggestion is, "If this has to be done in order to prevent the screen from disappearing, why not disable that checkbox if the user were to enter "*.*" in the "File mask" window?"

Another suggestion: Since this is a handy programming tool, why not convert it to be a "plug in" and insert a toolbar icon where the user can simply click on the icon and have the information appear on the screen.

Thanks for the feedback. What did you enter for the search directory? And by the screen disappearing, do you mean the tree control? It shouldn't have to limit the recursive search if *.* is used; I need to fix whatever bug you found.

WREY wrote: Another suggestion: Since this is a handy programming tool, why not convert it to be a "plug in" and insert a toolbar icon where the user can simply click on the icon and have the information appear on the screen.

This is in my "To Do" section of the article. Unfortunately, I don't have the time right now. Hopefully some day soon.

Yeah, I've gotten quite a bit behind on this project, as I have been extremely busy with work and family the last year or so. I'd like to fix all the issues people have reported, but in the meantime, if you'll tell me which issues will prevent you from using it, I can let you know if I will have time to fix them.

This worked first time, and is most handy. I thought of these things that may enhance it more...

1. To include the icon explanations in the about box, or as a pop-up tip in the list area. You would not need to add any more documentation, it is so easy to use!

2. To pre-fill the file mask with some standard/suggested entries.

3. To offer a feature to list the files by .H first (rather than by source file). Thus the top level tree lists all of the include files.

This could be more useful if we could specify additional directories to search while loading the workspace. As I have directories of 3rd party SDKs specified globally in the compiler include settings and not via project configuration /I, it is unable to find those files and gives a '?'.

I have downloaded your program and code and gave it a try. I have the impression (based on what I saw from the project on which I tried it...) that you're searching the path in the wrong order.

Explanation : my project is a clean MFC project, using pre-compiled headers, which are accessed from the (not so extravagant!) stdafx.h include file.

In all cases, the shown stdafx.h is the one from VS98\VC98\MFC\SRC directory and not the one from the project. So I'm thinking (I haven't dug into the source code) that you forget to search FIRST in the project's directory before the include's path.

Besides this, it would be interesting to have some kind of reports (HTML format would be great!):

- Missing searched files

- Included file names sorted by include count (descending) for a given source file (each count should be 1)

- Included file names sorted by include count (descending) for a project (each count should be less than the number of source files, and ideally 1)

It does not work on Windows NT. When I tried to run the exe, it popped up the following error message: "The procedure entry point SHGetSpecialFolderPathA could not be located in the dynamic link library SHELL32.dll". Please let me know what the problem is.

This problem is due to a corrupt installation of your IE. This error generally occurs when your system doesn't have IE 4.1 or greater. Uninstall your current IE version and download IE 5.5 or higher. While running the IE setup file, use IE6Setup.exe /C:"ie6wzd /e:IE4Shell_NTx86 /I:Y".

Your tool is very useful; however, my main reason for needing it is to figure out why on earth MyFile.cpp is compiled when TotallyUnrelatedFile.h is changed. It would be nice if the tree could automatically find me every path from MyFile.cpp down to TotallyUnrelatedFile.h, as this would allow me to remove the offending include.
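The request above amounts to enumerating every include chain between two files. Here is a minimal sketch of that search, assuming the direct includes of each file have already been collected; it is not an existing feature, and IncludeMap, Path, Walk and FindIncludePaths are hypothetical names.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical input: file name -> files it #includes directly.
using IncludeMap = std::map<std::string, std::vector<std::string>>;
using Path = std::vector<std::string>;

// Depth-first walk that records the current chain whenever the target is reached.
void Walk(const std::string& file, const std::string& target,
          const IncludeMap& includes, Path& current, std::vector<Path>& found) {
    // Skip files already on the current chain so include cycles terminate.
    if (std::find(current.begin(), current.end(), file) != current.end()) return;
    current.push_back(file);
    if (file == target) {
        found.push_back(current);
    } else {
        auto it = includes.find(file);
        if (it != includes.end()) {
            for (const std::string& child : it->second) {
                Walk(child, target, includes, current, found);
            }
        }
    }
    current.pop_back();
}

// Returns every include chain that leads from `source` to `target`.
std::vector<Path> FindIncludePaths(const std::string& source,
                                   const std::string& target,
                                   const IncludeMap& includes) {
    std::vector<Path> found;
    Path current;
    Walk(source, target, includes, current, found);
    return found;
}
```

Each returned chain names the #include that would have to be removed or replaced to cut that particular dependency. The number of chains can grow quickly in heavily cross-included code, so a real implementation would probably want a limit.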

FindFullPath incorrectly handles two include files with the same name (but in different paths). In the code path for !p_bIsBracketed the relative part ( a_szExtra ) is completely ignored. So if you happen to have an include file with the same name in your local directory it incorrectly picks it up, instead of looking in the relative destination. I just added the two _tcscat lines.

Simple-minded as I am, I loaded a workspace, entered *.* as the file spec and hit Start. When the app comes to the first .ico file which is included in the .dsp, it locks up forever. The problem is in the CLexer::Lex function, where it reads the char and ungets it because it's not an alpha, digit or underscore. So it runs in circles trying to CParser::Eat that byte.
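The lockup described here is the classic case of a lexer loop that can go around without consuming any input. A common way to rule it out is to make sure every iteration advances by at least one byte, whatever that byte is. The following is a generic sketch of that rule, not the actual CLexer code or the fix that was applied; Tokenize, Token and TokenKind are hypothetical names.

```cpp
#include <cctype>
#include <string>
#include <vector>

enum class TokenKind { Identifier, Other };

struct Token {
    TokenKind kind;
    std::string text;
};

bool IsIdentChar(unsigned char c) { return std::isalnum(c) || c == '_'; }

// Every pass through the loop advances `pos` by at least one byte, so the
// function terminates on any input, including binary files such as icons.
std::vector<Token> Tokenize(const std::string& data) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < data.size()) {
        std::size_t start = pos;
        unsigned char c = static_cast<unsigned char>(data[pos]);
        if (IsIdentChar(c)) {
            while (pos < data.size() &&
                   IsIdentChar(static_cast<unsigned char>(data[pos]))) {
                ++pos;
            }
            tokens.push_back({TokenKind::Identifier, data.substr(start, pos - start)});
        } else {
            ++pos;  // consume the byte even if we do not know what it is
            if (!std::isspace(c)) {
                tokens.push_back({TokenKind::Other, std::string(1, static_cast<char>(c))});
            }
        }
    }
    return tokens;
}
```

The casts to unsigned char matter for binary input: passing a negative char value to the <cctype> functions is undefined behaviour.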

[edit]... and hitting stop finally puts the nail in the coffin and kills the app totally... The include finder thread has no means of terminating when it is stuck in a file.[/edit]


Thanks for the information. I've fixed the problem, and updated the article and the downloads to reflect this fix, as well as a couple of others. If you see any other bugs, feel more than free to send them to me, and I'll keep the source updated with the fixes.

Andreas Saurwein wrote: [edit]... and hitting stop finally puts the nail in the coffin and kills the app totally... The include finder thread has no means of terminating when it is stuck in a file.[/edit]

I'm going to take a look at changing this. Even though I've fixed the current lockup, there may be more in the future, and I would like to handle them better.

When working on an empty include file, the open method fails to create a file mapping and returns an error, which aborts parsing of the parent (should it?). Here is my fixed version of CMMFStream::Open, which checks the file size before creating a mapping. It now keeps on working after processing an empty include file.
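The idea behind this fix can be shown with a portable sketch; it is not the poster's CMMFStream::Open. On Windows, CreateFileMapping fails for a zero-length file, so the size has to be checked first and an empty file reported as a valid file with no content rather than as an error. Here an ordinary read into a buffer stands in for the memory mapping, and OpenResult, FileBuffer and OpenForParsing are hypothetical names.

```cpp
#include <fstream>
#include <string>

enum class OpenResult { Ok, Empty, Failed };

struct FileBuffer {
    std::string data;  // stands in for the mapped view of the file
};

// Checks the size before doing any "mapping" work, so that an empty header is
// reported as Empty (nothing to parse) instead of as a failure.
OpenResult OpenForParsing(const std::string& path, FileBuffer& out) {
    out.data.clear();
    std::ifstream in(path, std::ios::binary | std::ios::ate);  // open positioned at the end
    if (!in) return OpenResult::Failed;

    std::streamoff size = in.tellg();
    if (size < 0) return OpenResult::Failed;
    if (size == 0) return OpenResult::Empty;  // valid file, no includes inside

    out.data.resize(static_cast<std::size_t>(size));
    in.seekg(0);
    in.read(&out.data[0], size);
    return in ? OpenResult::Ok : OpenResult::Failed;
}
```

With a separate Empty result, the caller can add the file to the tree as a leaf and carry on with the parent instead of aborting.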

Sorry to hear that it's crashing, and no, it's not your fault (it's a bug in my program). It would be a great help to me if you could send the line of code it's processing when it fails (the line of code in your cpp file). Here's how you can get the file name and line number, to determine where the parser currently is:

1. CMMFStream has a member c_szFileName, which is the name of the file that's currently being processed.

2. Double click on the callstack entry that says "CLexer::Lex(CToken * 0x00f1fcf4) line 101 + 23 bytes", then take a look at the c_ulLine member of CLexer. This is the current line number in the file that's being processed.

Now, you can open the source file and go to the line that was being processed. If you could send me this line, or a mocked up line which has the same form, I'll try to locate the problem. If you can't send it, that's alright, I'll try and have a look anyways (but don't know how far I'll be able to get).

Hey, I just wanted to let you know I found the bug that was causing your crash (well, I found one that led me to the same callstack). Thanks for the email today, it helped me a lot. I've updated the article and the downloads so they will reflect this fix (as well as a couple of others).

After a considerable amount of swearing I finally got your excellent tool working. It is, apparently, strictly forbidden to enter "*.h" among the file masks. If you do, the result is: nothing. Maybe a little warning about this would be appropriate? Or maybe it wasn't intended for stupid morons like me that fail to realise that all the header files will be included automatically...