Let me quote the very first book I read about C++, Jesse Liberty's "Teach yourself C++ in 21 days": "If you are writing a function that needs to create memory and then pass it back to the calling function, consider changing your interface. Have the calling function allocate the memory and then pass it into your function by reference. This moves all memory management out of your program and back to the function that is prepared to delete it."

Why did I not follow this great advice? Firstly, CEnum was designed to resemble the C# Directory class (in C# the garbage collector takes care of memory management, so the caller does not have to care about memory issues). Secondly, I wanted a class that would be easy to use. Remember, CEnum's job is simply to create a list of file names.

So, here is how things work in CEnum:

Constructor allocates two lists

After enumeration process ends, pointers to two lists are returned to calling function

Pointer(s) to list(s) are deleted in destructor of CEnum

This means that your enumeration list(s) will live only as long as the CEnum object that created them! You cannot keep using the pointer after that object is destroyed, for example by handing it to code that runs later. If you need to do that, either create a copy of the list or comment out some (or all) of the delete statements in the destructor.
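To make the lifetime rule concrete, here is a minimal sketch. OwningEnumerator is a hypothetical stand-in for CEnum: its constructor and its placeholder results are assumptions made for illustration, and only the GetFiles() name is taken from the class itself. It shows why a copy is needed when the list has to outlive the object that produced it.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for CEnum: it owns its result list and
// frees it in the destructor, as described above.
class OwningEnumerator {
public:
    OwningEnumerator() : files_(new std::list<std::string>) {
        // Placeholder results, not a real directory scan.
        files_->push_back("a.txt");
        files_->push_back("b.txt");
    }
    ~OwningEnumerator() { delete files_; }

    // Copying would lead to a double delete, so it is disabled.
    OwningEnumerator(const OwningEnumerator&) = delete;
    OwningEnumerator& operator=(const OwningEnumerator&) = delete;

    // The returned pointer is valid only while this object is alive.
    std::list<std::string>* GetFiles() const { return files_; }

private:
    std::list<std::string>* files_;
};

// Copy the results so that they outlive the enumerator.
std::list<std::string> CollectFiles() {
    OwningEnumerator e;
    std::list<std::string> copy(*e.GetFiles());  // deep copy of the list
    return copy;  // e is destroyed here, the copy stays valid
}
```

Returning e.GetFiles() directly from CollectFiles() would hand the caller a dangling pointer, which is exactly the situation the paragraph above warns about.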

Is this a bad design? It just might be, but simplicity of use was my primary concern when I designed this class. I wanted to be able to enumerate a directory in a single line (see example above), do something with those files (e.g. display them on the screen), and then forget about the CEnum object and all the lists it allocated. If you want to do it 'properly', CEnum can easily be adapted to use lists allocated by the client application: all that is needed is to add one more constructor and to comment out a few lines in the destructor.

This is the list of CEnum's public member variables through which user can set desired options:

bRecursive
Description: if true, subdirectories will be enumerated too.
Default value: false

bFullPath
Description: if true, files will be enumerated using the file's full path, otherwise the list will contain file names only.
Default value: false

bNoCaseDirs
Description: if true, case will be ignored when searching directory (and only directory) names.
Default value: false

bNoCaseFiles
Description: if true, case will be ignored when searching file (and only file) names.
Default value: false

sIncPatternFiles
Description: matching pattern for files you wish to include in your search. Wildcards * and ? are supported. If you have more than one search pattern, separate them with a semicolon.
Default value: empty string
Examples:

"*.mp3;*.mp4"

"*.mp?" (matches everything the first example matches, plus any other extension made of .mp and one more character)

"*.mp3;iron maid*;latest*"

Note that in case of Include patterns, empty string means "enumerate all", i.e. everything is included !!

sExcPatternFiles
Description: matching pattern for files you wish to exclude from your search. Wildcards * and ? are supported. If you have more than one search pattern, separate them with a semicolon.
Default value: empty string
Examples:

"*.mp3;*.mp4"

"*.mp?" (matches everything the first example matches, plus any other extension made of .mp and one more character)

"*.mp3;iron maid*;latest*"

Note that in case of Exclude patterns, empty string means "exclude none", i.e. nothing is excluded! Also, in case of conflict, the Exclude pattern takes precedence over the Include pattern.
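The include/exclude rules above can be sketched as follows. This is not CEnum's code: SplitPatterns, MatchesAny, IsSelected and the Matcher type are hypothetical names, and ExactMatch is a deliberately trivial stand-in for the real wildcard comparison, so that the sketch isolates the three rules (empty include means everything, empty exclude means nothing, exclude wins).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Any function that decides whether a name matches a single pattern.
using Matcher = bool (*)(const std::string& pattern, const std::string& name);

// Trivial stand-in for a wildcard matcher (assumption, for illustration only).
bool ExactMatch(const std::string& pattern, const std::string& name) {
    return pattern == name;
}

// Split "a;b;c" on semicolons, skipping empty entries.
std::vector<std::string> SplitPatterns(const std::string& patterns) {
    std::vector<std::string> out;
    std::size_t start = 0;
    while (start <= patterns.size()) {
        std::size_t end = patterns.find(';', start);
        if (end == std::string::npos) end = patterns.size();
        if (end > start) out.push_back(patterns.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

bool MatchesAny(const std::vector<std::string>& patterns,
                const std::string& name, Matcher match) {
    for (const std::string& p : patterns) {
        if (match(p, name)) return true;
    }
    return false;
}

// Decide whether a name ends up in the enumeration.
bool IsSelected(const std::string& name, const std::string& include,
                const std::string& exclude, Matcher match) {
    const std::vector<std::string> inc = SplitPatterns(include);
    const std::vector<std::string> exc = SplitPatterns(exclude);
    if (MatchesAny(exc, name, match)) return false;      // exclude wins on conflict
    return inc.empty() || MatchesAny(inc, name, match);  // empty include = all
}
```

With both pattern strings empty, every name is selected; a name listed in both strings is rejected because the exclude check runs first.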

Since CEnum was written using STL, the only two lists created during the execution are two lists returned by GetDirs() and GetFiles() functions. All four MFC containers (two CStringArrays and two CStringLists) are created only when you call conversion functions (GetFilesAs... and GetDirsAs...). In fact, all MFC related stuff is hidden behind preprocessor directives and is by default inactive (i.e. it does not compile). If you need this functionality, then just uncomment //#define MFC line and recompile.

A wildcard compare function is another thing that is often needed but not easily found. Two good examples are the work of Jack Handy here on CodeProject and the work of Alessandro Felice Cantatone. Both are great examples, but each has its shortcomings. Jack's function is simple and fast, but it doesn't let you ignore case, because tolower and UNICODE don't mix well. Alessandro's function was designed for IBM OS/2 (not to mention his holier-than-thou attitude, and don't even get me started on some of the restrictions he made. What has AI got to do with string comparing?)

If ? is found in pattern string, chars match, and function moves to the next char in both pattern string and search string.

If * is found in pattern string, function moves to the next char in pattern string only. Exits if there are no more chars in pattern string, or saves the record of current position (one char after '*') and the record of next char in search string.

When two chars need to be compared regardless of case, run-time library _tcsicmp is used:

if strings (temp strings that contain just one character) are equal, function moves to the next char in both pattern string and search string.

if strings differ, function will return false if there was no '*' character in pattern string up to this point. Otherwise it will go back to position of last '*' character and advance by one char in search string.
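The steps above can be sketched roughly like this. It is not the CompareStrings source: WildcardMatch and SameChar are hypothetical names, the sketch works on narrow std::string, and std::tolower is used as an ASCII-only stand-in for _tcsicmp, so it says nothing about UNICODE behaviour.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Compare two characters, optionally ignoring case.
// ASCII-only stand-in for the _tcsicmp call described above.
static bool SameChar(char a, char b, bool ignoreCase) {
    if (!ignoreCase) return a == b;
    return std::tolower(static_cast<unsigned char>(a)) ==
           std::tolower(static_cast<unsigned char>(b));
}

bool WildcardMatch(const std::string& pattern, const std::string& text,
                   bool ignoreCase) {
    const std::size_t npos = std::string::npos;
    std::size_t p = 0, t = 0;
    std::size_t starP = npos;  // pattern position one char after the last '*'
    std::size_t starT = 0;     // text position to retry from after a mismatch

    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starP = ++p;                           // move on in the pattern only
            if (p == pattern.size()) return true;  // trailing '*' matches the rest
            starT = t;                             // remember where to retry
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' ||
                    SameChar(pattern[p], text[t], ignoreCase))) {
            ++p;                                   // chars match: advance both
            ++t;
        } else if (starP != npos) {
            p = starP;                             // roll back to the last '*'
            t = ++starT;                           // and advance one char in text
        } else {
            return false;                          // mismatch with no '*' so far
        }
    }
    // Text is exhausted: only trailing '*' characters may remain in the pattern.
    while (p < pattern.size() && pattern[p] == '*') ++p;
    return p == pattern.size();
}
```

For example, WildcardMatch("*.MP?", "song.mp3", true) succeeds, while the same call with ignoreCase set to false fails.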

Regarding the use of _tcsicmp: There was no special reason why I have chosen to use this function, other than the fact that it was the only run-time library routine that passed all my tests for UNICODE string comparing regardless of case.

Note: My tests were limited to European-character sets (stuff like accented, Central European and Nordic characters). For anything beyond that, you will have to test for yourself.

If you would like to use Wildcard compare function in some other project, and you don't need to ignore case, then you can greatly speed things up by using something like this:
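As a rough idea of what such a shortcut may look like, here is a case-sensitive sketch that compares characters directly instead of building temporary strings for a run-time library call. The function name and the raw-pointer interface are assumptions, not code taken from CEnum.

```cpp
// Case-sensitive wildcard match on null-terminated wide strings.
// Characters are compared directly, so no temporary strings are needed.
bool WildcardMatchCaseSensitive(const wchar_t* pattern, const wchar_t* text) {
    const wchar_t* starP = nullptr;  // pattern position after the last '*'
    const wchar_t* starT = nullptr;  // text position to resume from

    while (*text) {
        if (*pattern == L'*') {
            starP = ++pattern;
            if (!*pattern) return true;  // trailing '*' matches everything
            starT = text;
        } else if (*pattern == L'?' || *pattern == *text) {
            ++pattern;
            ++text;
        } else if (starP) {
            pattern = starP;             // roll back to the last '*'
            text = ++starT;
        } else {
            return false;
        }
    }
    while (*pattern == L'*') ++pattern;  // skip trailing stars
    return *pattern == L'\0';
}
```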

CEnum used in demo project has one minor difference. Because I added Wildcard testing functionality in the demo project, CompareStrings function is declared as public and static. Otherwise it is a private non-static method of CEnum.

Testing enumeration

This part is very simple. In the OnEnumeration() function, in a mere 20 lines of code, a directory is enumerated and its content is added to a CListCtrl.

Testing wildcard comparing functionality

The demo uses the test.txt test file, to which you can add your own test cases. The format of test.txt is very simple. The first string is the wildcard string, the second string is the search string, and the last string specifies whether the first two strings match. Comment lines start with the '#' character.

CEnum is file-centric, meaning it was designed to search for files more so than to search for directories. For example:

If you are searching only for files or only for directories, the enumeration will work just the way you intended.

But, if you apply filters (exclude or include) for both files and directories, then you will enumerate only those files that reside in directories that match the filter applied for directories.

Depending on how you look at things, the latter case might be seen as a limitation, because you can't perform independent search for files and directories. You can't get a list of all files in all subdirectories, and at the same time get a list of directories that match certain search criteria. In this case, the only thing you can do is to run CEnum twice, once for files and once for directories.

Another thing that user needs to be aware of is the sorting functionality. After enumerating each directory, CEnum calls STL list's sort() method. This method sorts files in alphabetic order, which may be different sorting order than in your file browser (e.g. Windows Explorer).
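A small sketch of what this ordering means in practice. It assumes the names are held in a std::list of std::string (the element type CEnum actually uses is not stated here) and SortedNames is a hypothetical helper.

```cpp
#include <list>
#include <string>

// Sort names the way std::list::sort() does by default:
// operator< on std::string, i.e. character by character, by character code.
std::list<std::string> SortedNames(std::list<std::string> names) {
    names.sort();
    return names;
}
```

Given "file10.txt", "file2.txt", "Zebra.txt" and "apple.txt", this yields Zebra.txt, apple.txt, file10.txt, file2.txt: uppercase letters sort before lowercase ones, and "10" sorts before "2" because the digits are compared one at a time. A file browser that ignores case or sorts numbers naturally will show a different order.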

Finally, be aware that CEnum was designed for globbing (wildcard search) only. It can't search for files based on size, date, file content or file attributes.

There is no limitation on use of this class. You can use it (as whole or just parts of it) with or without author's permission in any sort of project regardless of license issues, in both commercial and open source projects. Though, a thank you e-mail would be nice. :-)

That's it folks, a simple class that lets you enumerate files without understanding how ::FindFirstFile and ::FindNextFile API works, or even how CEnum internally works. I hope CEnum will speed things up for you when working with files as much as it did for me.

I'd love to get some feedback from you, be it constructive criticism, request for additional features or bug reports. Don't hesitate to post your comments or send them in e-mail. I am especially interested if you know of a similar class or project. In case you do, just let me know where I can find it.

Comments and Discussions

With the current implementation, a pattern "a*bc" does not match a string "aabdbc". The CompareStrings function finds the first 'b' in "aabdbc", then compares the next symbols ('c' in the pattern and 'd' in the string) and returns false. Instead it MUST roll back to the last asterisk in the pattern and then try to find "bc" in the remaining part of the string, "dbc".

Hi, thanks a lot for your remark. I completely forgot about reoccurring substrings. Funny thing is, in the two years or so that I've been using this class in my own projects, there never was a problem. I guess all the searches were simple ones, like looking for file extensions.

Anyway, here is the new version. There is just one problem though: at the moment I am working 18h/day (weekends included), and this was designed at 2 AM last night. Under such circumstances it is quite possible that the new version is incomplete somehow. If you (or anyone else) would be so kind as to post here a bunch of compare tests, especially the tricky ones, that would help me a lot. Even if they work in my current implementation, those tests can help me later if I redesign the algorithm. The demo program provided here is an excellent tool for quick testing, and adding more tests can only help make the algorithm more robust.

Next week I will go on vacation and then I will find some time to update article body and project code.

One more time, thank you a lot for your excellent observation and great tip to go back to last '*' in Pattern string. I've tried few different approaches, and rolling back was the way to go.

IMHO, the new solution looks too complicated. I thought about possible implementations and, IMHO, recursion is required here. You see, if we have a pattern similar to "ab*cd*ef*gh", then each '*' in this pattern may need a rollback. Try using such a pattern for the string "ab c1 cd e1 ef g1 gh". I have an implementation of such a comparing function which uses recursion, but I am not sure it is optimized enough. It may be too slow, I'm not sure. You can find this function below.

There is also another, IMHO, "weak point" in your sources: a lot of dynamic memory allocations. I really don't think they are necessary. You see, std::list itself uses dynamic memory allocation, so I see no reason to use operator new to create std::list objects. And, by the way, I did not find where the memory allocated by the function Tokenize() is freed. IMHO, it would be better to use std::list as an argument than as a return value. E.g.
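A sketch of what this suggestion might look like. The signature below is hypothetical and is not the Tokenize() from the CEnum sources: the caller owns the list and passes it in by reference, so nothing is allocated with operator new and nothing needs to be deleted.

```cpp
#include <list>
#include <sstream>
#include <string>

// Split 'input' on 'delimiter' and append the non-empty tokens to 'tokens'.
// The list belongs to the caller, so there is no pointer to free afterwards.
void Tokenize(const std::string& input, char delimiter,
              std::list<std::string>& tokens) {
    std::istringstream stream(input);
    std::string token;
    while (std::getline(stream, token, delimiter)) {
        if (!token.empty()) tokens.push_back(token);
    }
}
```

A caller would declare a std::list<std::string> on the stack, call Tokenize("*.mp3;iron maid*;latest*", ';', list), and let the list clean itself up when it goes out of scope.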

Thanks, I will look into it. I was trying to avoid recursion, I'll see if it can be done without recursion.

Regarding dynamic allocations of container objects, I don't know what to say. I'll think about your suggestion. I like pointers precisely because they allow me to use them as a return value from a function. Passing arguments by reference is fine, but for me code is more readable with a return value from a function. And the performance penalty is trivial. I will think about it, but it was not an error, I did it deliberately. In fact, if I remember correctly, the first design of CEnum, written some two years ago, was written using arguments.

You see, I was under the "influence" of C#. C#'s Directory class gave me the idea to write a C++ class that could enumerate a directory with a single function call. And since there are no pointers in C#, I used function arguments. Then I redesigned it into what it is today.

Another, perhaps more important thing is that I currently work for a company that uses some "exotic" hardware and even more "exotic" real-time operating system (RTOS) that allows me to create ONLY pointers as global objects. So, I cannot have a global object that was not allocated with operator new. Therefore, for me it is perfectly normal to have pointers everywhere, even if they are not necessary.

And thanks for taking the time to look into the code. I will update the destructor to delete the pointers created by the Tokenize function. (Should I decide to go with pointers, and I just might not.)