Deleting Duplicate Proxies

I’m pulling my hair out lol
I have some spare time this weekend and I want to write a program ...

You could read in every proxy and store it in a linked list. Create two pointers: one pointing to the proxy currently being checked for duplicates, and another that moves through the rest of the list. When the second pointer reaches the end, update the first pointer to the next proxy and repeat.
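Here is a minimal sketch of that two-pointer idea, assuming the proxies are already stored in a std::list<std::string> and using iterators in place of raw pointers. The function name removeDuplicates is made up for illustration; it is not from the original program.

```cpp
#include <iterator>
#include <list>
#include <string>

// Hypothetical helper: keeps the first occurrence of each proxy and
// erases any later duplicates from the list.
void removeDuplicates(std::list<std::string>& proxies) {
    // 'current' is the proxy being checked for duplicates.
    for (auto current = proxies.begin(); current != proxies.end(); ++current) {
        // 'scan' moves through the rest of the list.
        auto scan = std::next(current);
        while (scan != proxies.end()) {
            if (*scan == *current) {
                // erase() returns an iterator to the element after the erased one.
                scan = proxies.erase(scan);
            } else {
                ++scan;
            }
        }
    }
}
```

This compares every proxy against every later one, so the work grows with the square of the list size. That is usually fine for a few thousand lines but gets slow for very large lists.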

That is, loop while it is not the end of the file. I am sure someone here will tell you why it is not always the best idea to use eof() to control a loop, but I have never had any problems with it!

>> Instead of inFile in your while loop you could do something like: while( ! inFile.eof() )
>> I am sure someone here will tell you why it is not always the best idea

In this case, making that change means that if there is an error while reading, it won't be picked up and the loop will never terminate. In fact, neither solution is the best solution.

The read from the file stream should be used as the loop control, or the stream should be checked after the read and before the word is used: while (inFile >> buffer).

There are no cases that I know of where that is a worse solution, and in this case it is better. If you use eof() to control the loop (or you use the file stream like the original code does), there is a real chance that the last word will be deleted incorrectly from your file. The loop will run one too many times and so the last word will still be in the buffer variable when the read fails due to eof. The code will think the read succeeded, and delete that word since it will match the proxy.
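To make that concrete, here is a small sketch of a loop controlled by the read itself. The function name readWords is made up, and it takes any std::istream so the same code would work with an ifstream or a string stream.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every whitespace-separated word from 'in'.
std::vector<std::string> readWords(std::istream& in) {
    std::vector<std::string> words;
    std::string buffer;
    // The extraction is the loop condition, so the body only runs
    // when a word was actually read into 'buffer'.
    while (in >> buffer) {
        words.push_back(buffer);
    }
    return words;
}
```

Because the body never runs after a failed read, the last word is processed exactly once, and a read error ends the loop instead of spinning forever.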

C++ has a wonderful container called a string that makes managing text much easier.

Do you care if your proxy strings are sorted or not? If not, I'd read the data from the file into a set<string> container. A set does not allow duplicate data to be stored in it. If you attempt to insert an item that already exists in the set, nothing happens: the insert call fails (a soft failure, not a program-ending hard failure). The set container also automatically sorts its data.
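A rough sketch of the set approach, assuming one proxy per whitespace-separated token. The name readUnique is invented for the example.

```cpp
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: reads proxies from 'in' and returns them with
// duplicates dropped. The set keeps its contents sorted.
std::set<std::string> readUnique(std::istream& in) {
    std::set<std::string> unique;
    std::string proxy;
    while (in >> proxy) {
        // insert() returns a pair; its .second member is false when the
        // value was already present, and the set is left unchanged.
        unique.insert(proxy);
    }
    return unique;
}
```

Writing the result back out is then just a loop over the set, which visits the proxies in sorted order.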

As far as I know, ifstream objects don't have any output-related functionality: no write or seekp member functions, no stream insertion operator (<<). So I don't know what the point of that is. It should be an fstream object instead of an ifstream object if you intend to do both input and output using that stream.
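For illustration only, this is roughly what reading and writing through one fstream could look like. The function name and the idea of appending a line are assumptions for the example, not something taken from the original program.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: reads the first word of an existing file, then
// appends 'extra' as a new line using the same stream object.
std::string readFirstThenAppend(const std::string& path, const std::string& extra) {
    // in | out opens an existing file for both reading and writing.
    std::fstream file(path, std::ios::in | std::ios::out);
    std::string first;
    file >> first;
    file.clear();                  // reset eof/fail flags before switching to output
    file.seekp(0, std::ios::end);  // a seek is needed between reading and writing
    file << extra << '\n';
    return first;
}
```

Note that opening with in | out fails if the file does not exist, so real code would want to check the stream after opening it.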

Code:

system("PAUSE");

Try to avoid that if possible. A call (or two) to cin.get() should be all you need, and you also wouldn't need the <stdlib.h>/<cstdlib> header in that case.

I put this little project to the side until I finished my last semester. Here is the finished duplicate proxy checker. I had never heard of a “set<string> container” before. Could you give a little more info on this?

I am trying to write an "http" proxy checker now. I know very little about network programming. I am interested in simply knowing if the proxy is an http proxy for web surfing. I couldn't care less about SOCKS 4 or 5 proxies. I want to test each proxy; if it is good, I will push it onto a stack, and at the end of testing all the good proxies will be written to the file. The only compiler I use is Dev-C++.