binary tree token problem


Hi there... I've got a question about tokenizing strings...

I'm attempting to write a program that reads a sentence from the user, tokenizes it, and then hands the words to a binary search tree. I am using 'strtok' to tokenize the sentence, so the original is stored as a char array. I can print the individual tokens using a pointer, and everything works great until I hand the tokens to the tree's insert function. It appears that the program only hands off the first character of each word instead of the whole token. I get warnings that the individual first letters are dups (which is a check in the insertNode function template), so I think the function call to insertNode might be the culprit, although I can't figure out any other way to hand the tokens off (I tried several).

Here is the pertinent code (I omitted the header files because I suspect the problem is in the function call ):

The assignment suggested using the strtok function, so I assumed that this would be the easiest way to tokenize the sentence. I checked many of the previous binary tree threads here but could not find anything that seemed similar to this particular problem. I'm probably missing something obvious, but I can't seem to figure out why the whole word is not getting passed to the function...

>strtok() is just soooooooooo old-school.
What's wrong with old-school?

>Don't intermix C and C++ code -- you'll scare people
Not the people that matter.

>and one day your poor compiler will freak out.
Doubtful.

>Here is the pertinent code
That's not all of the relevant code. If you suspect insertNode (I do too), you need to post the class definition as well as the code for insertNode. When you have a run-time bug, it helps to post enough code for us to be able to run so that we can see what you're seeing.

>charTree.insertNode( *tokenPtr );
Depending on how you've implemented insertNode, this may or may not introduce a painful dependency. tokenPtr is just that, a pointer. If sentence ever changes, the data in your tree could change as well.
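To make the dependency concrete, here is a minimal sketch. It assumes the tree ends up storing the raw pointers that strtok returns; tokenPointers and pointsInto are hypothetical helpers written for this illustration and are not part of the original program.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: collects the raw pointers that strtok returns.
// Every pointer refers to memory inside `buffer`; nothing is copied.
std::vector<char*> tokenPointers(char* buffer) {
    std::vector<char*> tokens;
    for (char* tok = std::strtok(buffer, " "); tok != nullptr;
         tok = std::strtok(nullptr, " ")) {
        tokens.push_back(tok);
    }
    return tokens;
}

// Returns true when p points inside [buffer, buffer + size).
// Only meaningful when p and buffer refer to the same array.
bool pointsInto(const char* p, const char* buffer, std::size_t size) {
    return p >= buffer && p < buffer + size;
}
```

Each pointer collected this way refers to a spot inside the sentence buffer, so overwriting the buffer later (for example, by reading a second line into it) silently changes what the stored tokens say.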

The program is designed to take in just one sentence, tokenize it, and pass each token to the insertNode function as it is produced, for insertion into the tree...

I used:

Code:

charTree.insertNode( *tokenPtr );

...because I knew that 'strtok' alters the original string as it tokenizes, so I wanted to ensure I passed each word to the function before the next tokenization pass... I'm a little shaky on pointers, but I assumed '*tokenPtr' would refer to the token itself (as opposed to the address '&tokenPtr')... Apparently, it is only passing the first letter because I created a 'char' tree (I assume?), which only expects a single char as input....

In the 'cout' line below it, the entire token prints to screen without difficulty so I'm sure the pointer can reference the whole word... This is making my head hurt a little! LOL!

Duh, I wasn't even close to paying attention. The problem is both the definition of your tree object and the call to insertNode. You define the template as having char as the type:

Code:

Tree< char > charTree;

Pointer to char would be better for what you're trying to do:

Code:

Tree< char* > charTree;

Ideally you would use a string object instead of pointers. As for the call, you pass *tokenPtr to insertNode, but that dereferences the pointer and gives you a char (the first character in the string). That's your problem. Add a * to your object definition and remove it from your call to insertNode, and things should work a little better.
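A small sketch of the difference, using two hypothetical overloads named describe as stand-ins for an insertNode that takes a char and one that takes a pointer to char (neither is from the original program):

```cpp
#include <cassert>
#include <string>

// Stand-in for insertNode on a Tree< char >: receives a single character.
std::string describe(char c) {
    return std::string("char: ") + c;
}

// Stand-in for insertNode on a Tree< char* >: receives the whole C string.
std::string describe(const char* s) {
    assert(s != nullptr);  // a null token would mean strtok is finished
    return std::string("string: ") + s;
}
```

With tokenPtr pointing at "hello", describe( *tokenPtr ) selects the char overload and sees only 'h', while describe( tokenPtr ) selects the pointer overload and sees the full word.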

But now you've got that dependency issue I was talking about. In your constructor this kills you when you use pointers:

Code:

data( d ),

This copies the pointer, not the data that the pointer points to. This is where string objects really shine: with C-style strings you have to make a deep copy of the string yourself, whereas the string class does it for you.
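As a rough illustration of shallow versus deep copying, consider two node types. PointerNode and StringNode are hypothetical names; the real TreeNode class is not shown in this thread.

```cpp
#include <cstring>
#include <string>

// Shallow copy: the initializer data( d ) stores only the address.
struct PointerNode {
    const char* data;
    explicit PointerNode(const char* d) : data(d) {}
};

// Deep copy: std::string allocates its own storage and copies the characters.
struct StringNode {
    std::string data;
    explicit StringNode(const char* d) : data(d) {}
};
```

If both nodes are built from the same buffer and the buffer is then overwritten, the PointerNode reflects the new contents while the StringNode still holds the original word.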

Prelude - The sentence input is a one shot deal for this program (the tokenization process changes what is in sentence, however, right?)...

I changed the tree object definition:

Code:

Tree< char > charTree;

to:

Code:

Tree< char* > charTree;

...and the call to insertNode...

Code:

charTree.insertNode( *tokenPtr );

to:

Code:

charTree.insertNode( tokenPtr );

With these changes the program now inserts the words into the tree, and the inorder and postorder traversals seem to work fine...

On the downside, the preorder traversal seems to have stopped working... It now comes out exactly in the same order as the inorder traversal... The original setup I used seemed to work OK (with the first letters)... Is this the dependency problem you spoke of?

Is there any easy way to convert this to work with strings? I don't know of a command similar to strtok that works with strings... How would you tokenize the sentence string?

BTW - I really appreciate your help... I think I'm a little over my head here! I feel like I understand most of this but when something unexpected like this happens, I get totally confused, especially when pointers are involved...

>the tokenization process changes what is in sentence, however, right?
Yes, null characters are inserted to split the string up into multiple strings.
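A quick way to see this is to count the null characters in the buffer before and after tokenizing. This is only a sketch; countNulls and tokenizeInPlace are made-up helper names, and a single space is assumed as the delimiter.

```cpp
#include <cstddef>
#include <cstring>

// Counts the '\0' bytes among the first `size` bytes of `buffer`.
std::size_t countNulls(const char* buffer, std::size_t size) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (buffer[i] == '\0') ++count;
    }
    return count;
}

// Runs strtok over `buffer` until it is exhausted and returns the token count.
// strtok overwrites the delimiter that ends each token with '\0'.
std::size_t tokenizeInPlace(char* buffer) {
    std::size_t tokens = 0;
    for (char* tok = std::strtok(buffer, " "); tok != nullptr;
         tok = std::strtok(nullptr, " ")) {
        ++tokens;
    }
    return tokens;
}
```

For a buffer holding "one two three", the count goes from one null (the terminator) to three, and printing the buffer afterwards shows only "one".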

>the preorder traversal seems to have stopped working...
Well technically it never did work in the first place. The problem is that you're comparing pointers, not the strings that they point to. Maybe something like this instead?
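The listing that originally followed is not reproduced here, so what follows is only a sketch of the idea. Assuming insertNode orders nodes with operator< (the thread does not show it), a Tree< char* > ends up ordering by address. strtok hands back tokens at increasing addresses within sentence, so every new node goes to the right, the tree degenerates into a chain, and preorder and inorder visit the nodes in the same sequence. The hypothetical TokenTree class below reproduces that and shows that comparing with strcmp restores a real ordering.

```cpp
#include <cstring>
#include <functional>
#include <memory>
#include <string>

// Hypothetical stand-in for the thread's Tree template, which is not shown.
// The ordering is passed in so both comparisons can be tried on the same tokens.
class TokenTree {
public:
    typedef bool (*Less)(const char*, const char*);

    explicit TokenTree(Less less) : less_(less) {}

    void insert(const char* token) { insertAt(root_, token); }

    // Each traversal returns the tokens, every one followed by a space.
    std::string preorder() const { std::string out; walk(root_.get(), out, true); return out; }
    std::string inorder() const { std::string out; walk(root_.get(), out, false); return out; }

private:
    struct Node {
        const char* data;  // pointer only; the characters are not copied
        std::unique_ptr<Node> left, right;
        explicit Node(const char* d) : data(d) {}
    };

    void insertAt(std::unique_ptr<Node>& node, const char* token) {
        if (!node) {
            node.reset(new Node(token));
        } else if (less_(token, node->data)) {
            insertAt(node->left, token);
        } else if (less_(node->data, token)) {
            insertAt(node->right, token);
        }  // equal under less_: treated as a duplicate and ignored
    }

    void walk(const Node* node, std::string& out, bool visitFirst) const {
        if (!node) return;
        if (visitFirst) { out += node->data; out += ' '; }
        walk(node->left.get(), out, visitFirst);
        if (!visitFirst) { out += node->data; out += ' '; }
        walk(node->right.get(), out, visitFirst);
    }

    Less less_;
    std::unique_ptr<Node> root_;
};

// Orders by where the token lives in memory (what comparing char* values does).
bool lessByAddress(const char* a, const char* b) {
    return std::less<const char*>()(a, b);
}

// Orders by the characters the pointers refer to.
bool lessByContent(const char* a, const char* b) {
    return std::strcmp(a, b) < 0;
}
```

Feeding the tokens of "pear apple zebra mango" into a TokenTree built with lessByAddress gives identical preorder and inorder output, both in input order. The same tokens in a tree built with lessByContent come out alphabetically in the inorder traversal.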

Will adding 'strcmp' in the template still allow its use for other types of variables (i.e. - int, double, etc.)? I was shying away from messing with the templates themselves to avoid making it string-specific...

Just curious...

BTW - I am looking up the 'stringstream' posts to see if I can convert the program without giving myself an aneurysm!

>I was shying away from messing with the templates themselves to avoid making it string-specific...
I got that impression, but if you want to use C-style strings, you're SOL. Either use std::strings, because they overload the comparison operators, or specialize your template for C-strings. If you don't like templates, you won't like specializations. Let's say something like this:
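The listing from the original post is not reproduced here. One way such a specialization could look, as a sketch only: the comparison is pulled out into a small policy template named NodeLess (a hypothetical name, not from the thread), with the general version using operator< and the C-string versions using strcmp.

```cpp
#include <cstring>
#include <string>

// General case: works for int, double, std::string, and anything else
// that provides operator<.
template <typename T>
struct NodeLess {
    bool operator()(const T& a, const T& b) const { return a < b; }
};

// Full specialization for C strings: compare the characters, not the addresses.
template <>
struct NodeLess<const char*> {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Same again for non-const char*, which is what Tree< char* > would use.
template <>
struct NodeLess<char*> {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};
```

A tree template that calls NodeLess< NODETYPE >()( a, b ) instead of a < b would stay generic: int and double nodes keep their usual ordering, and only the char* cases are routed through strcmp. Here NODETYPE is just a placeholder for whatever the template parameter is called.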

The stringstream looks like it will do exactly what I want (I found a reference to the FAQs in another thread)... I've been through a bunch of posts and I know I'm missing something stupid... The program prompts for the sentence and when I hit 'enter', the program just stops!

Since I am using a Deitel-provided Student Introductory Edition, I did not know I could use the service packs on the MS site (yes, I am a dope!). I manually corrected the 'String' file and voila! Works fine...
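For anyone following along, the string-based approach can be sketched roughly like this. It is not the code from the assignment; tokenize is a hypothetical helper, and it assumes words are separated by whitespace.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a sentence on whitespace. operator>> skips leading whitespace and
// reads one word at a time, and each word is copied into its own std::string,
// so the results do not depend on the original sentence afterwards.
std::vector<std::string> tokenize(const std::string& sentence) {
    std::istringstream stream(sentence);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word) {
        words.push_back(word);
    }
    return words;
}
```

Each word could then be handed to a Tree< std::string >, where the comparison operators already compare contents, so no strcmp or template specialization is needed.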

Thank you jafet, JaWiB and Prelude (also hk_mp5kpdw) for putting up with my ignorance... Your string suggestion works much better... I wish I had gone with it sooner (my assignment was 1/2 hr. late)... Oh well, at least I understand the concept better now... Hopefully, one of these days, I'll be able to help people out the way you all do... I really enjoy programming and I'm sure I'll be playing around with C++ long after this class is over...