Efficient way to get a string containing at most 3000 chars

I am trying to read a file into a string and encrypt it. I don't want to use getline (since some files do not contain any delimiter such as '\n'), and reading and encrypting character by character is too slow, so I would like to read a block of characters into a string and then encrypt that. I read 3000 characters each time from the file; my code is below...
Can someone teach me a more efficient way to do this job?

Thank you so much!

//======================================================
// The code works, but I would like to know if there is a more efficient way to do this job... Thanks
//======================================================
const int AMOUNT = 3000;
......
ifstream ifile;
string filename = "c:\\test.txt";
ifile.open(filename.c_str());
if (!ifile.is_open())
    return;

FYI:
You could use a std::string object directly with the read member function.
However, this method is not safe, and certainly not portable.
If you're familiar with your std::string implementation, and portability is not an issue, this is an alternative, but ill-advised, method.

>>There is buffering at a low level.
Modern OSes do low-level buffering, but this is outside the scope of the C++ language.
Since your comment referred to "available in C++", the OS would not be considered.
In any case, the asker hasn't stated what OS is being used, so we're not even sure there is any OS low-level buffering.

>>And we don't need a high level buffering as
>>1. data is read only (no write)
>>2. data is read once only

High-level buffering is for both reads and writes.
In the example code you posted, there's a read inside a loop, which means you're making multiple reads.
The high-level buffering optimizes the reads by buffering, so that the number of hits on the physical drive is reduced.

Even if this buffering did not occur, the {open, read} function set would not be faster, and would more than likely be less efficient.
Moreover, your buffering would be OS-dependent, and therefore your code's performance would be OS-dependent.

Well... since I didn't agree [I think most of the buffering optimisation is done at the system level], I made a simple program [see below]:
- sys_doit() function doing it my way (open, read)
- cpp_doit() function doing it your way (ifstream open/read)

Compiled with g++ 3.3.3 under Linux SuSE 8.2

g++ -O2 test.cpp -o test

The test file (test.dat) is 190 MB. Results:

sys-1 took 396ms
cpp-1 took 474ms
sys-2 took 394ms
cpp-2 took 469ms

The system calls are faster than the C++ ifstream machinery. I'm not surprised, as ifstream is based on system calls at the low level, and since the file is read only once, C++ cannot do better than the system. You will notice that the second run is a bit faster, as some tables are already in memory. [If you try this at home, run the program twice.]

For it to be an accurate test, you should have compared the {open, read} set with the {fopen, fread} set.
Or compared fstream with iostream.

Since you're mixing an object with C-style functions, you cannot accurately determine whether the difference is caused by the class implementation or by the buffering.

You need to compare apples with apples, not apples with oranges.

Furthermore, you would have to do a lot more work to remove optimizations that would also skew your test.
It's very easy to get misleading results if you're not experienced in setting up the environment for a proper performance test. And I don't even consider myself experienced enough to run a good performance test without a peer review.

*** Linux
First call:
Using system calls, open and read : time 398 ms
Using higher calls, fopen, fread : time 479 ms
Using C++ calls, ifstream class : time 464 ms
Second call:
Using system calls, open and read : time 396 ms
Using higher calls, fopen, fread : time 474 ms
Using C++ calls, ifstream class : time 466 ms

*** Windows
First call:
Using system calls, open and read : time 393 ms
Using higher calls, fopen, fread : time 762 ms
Using C++ calls, ifstream class : time 926 ms
Second call:
Using system calls, open and read : time 389 ms
Using higher calls, fopen, fread : time 761 ms
Using C++ calls, ifstream class : time 925 ms

Are we mixing apples and oranges? In order to prove that on Linux and Win XP the simple system calls open and read are faster than both fopen/fread and the ifstream class (for reading), the program had to incorporate both C and C++ code... So am I mixing C-style functions and a class? OK, just provide a small cpp_doit function with no C-style objects (well, if the trick is to use a vector for str, it won't be faster :) ...

Come on, the three functions each have a 1-2 line loop and are very simple, each calling one API... It shows clearly that on Linux and XP, the code proposed above using fread/ifstream is slower than the one using the system calls. What kind of buffering do you expect when the file is read only once?! And if the file were read n times, the OS takes care of keeping the pages in memory as long as the file is not modified; ifstream or fread won't do better.
Why are they slower? Simply because open/read process the request with no extra checking.

Still not a good test.
I commend you for your effort, but as I stated before, it takes some experience to conduct a good, valid performance test.

First of all, you're still testing ifstream, which is not relevant in the manner you're testing it. So I'll ignore that altogether (apples and oranges).

The fopen test is not using the same code logic as the open test, so that also makes your test invalid.
In the following lines, you have an extra function call in the fopen test, which of course makes it less efficient than the open version:
while (!feof(fp))
{
    bytesRead = fread(buffer, 3000, 1, fp);
}

It should look more like the following:
while ((l = fread(buf, 1, MAXLEN-1, fp)) > 0)
{
}
so as to match its counterpart.

You'll notice that in my previous comment I stated that the above code is a MORE valid test.

This code, however, would still not pass as 100% valid test code, because it doesn't have enough logic to rule out all possible compiler optimizations, which could spoil the test.

The above code runs on Windows, but could be easily modified for UNIX/Linux.

When I ran it, the difference in speed fluctuated 2% up and down.

Meaning, sometimes sys_doit was faster by up to 2%, and sometimes f_doit was faster by up to 2%.
In other words, there was no practically measurable difference.
I ran my test on a Windows 2000 machine.

Running a performance test is not a layman's game.
As in a scientific experiment, you want to keep everything else the same as much as possible, and the only variable should be what you want to measure.

Anything else that differs in the test can be a factor in throwing the test off, and therefore make it invalid.

1. Of course the functions *only* read the file, and did not copy the data (for instance), because what we want to test is the speed of read, not that of strcpy! If the compiler had optimized the code away, it would simply have skipped the API calls, which it obviously didn't (otherwise the times would be nearly zero). The test is "read the data", not "read the data and do lengthy operations afterwards", since then, of course, all times would tend towards the same value...
(BTW, as stated at the start, lines may not end with '\n', so the files are likely not always text... by using x[len]=0 and strcpy you will lose some data :)

2. OK, fread does not need the feof(). Result after that optimization: still a 16% difference!
Using system calls, open and read : time 397 ms
Using higher calls, fopen, fread : time 472 ms
[ new line: while ((bytesRead = fread( buffer, MAXLEN, 1, fp )) > 0) /*nothing*/; ]

4. ifstream: this is the code you gave to the author! Do you mean your code was not the fastest?
I said "open/read will be faster", you said "No it won't." - and the tests show that for pure reading of the file, open/read is faster.
"Yeah, but you use ifstream" - yes, I use the one you gave in your solution!
Give me another cpp_doit() function, more efficient, and we'll see if it is faster than open/read!
Are we mixing oranges and apples? We're simply doing the same operation! ifstream is more powerful? Maybe, but - again - for simple operations like reading a file, the system open and read will be faster.

>> ifstream: this is the code you gave to the author! do you mean your code was not the faster?

I never said ifstream would be faster, nor did I claim anything I posted to be the fastest.

I'm only referring to your claim that open/read is faster than the fopen/fread function set.

>>Are we mixing oranges and apples?
Yes, you are. As I've stated, I've never claimed ifstream to be faster than using the C functions open/read.
What I did say is that fopen/fread would be faster than open/read. I then went on to say that ifstream has buffering, but I left it at that.
The speed of ifstream really depends on the implementation, so it would be difficult to make a claim about its efficiency.

This is the last I have to say about ifstream, since I don't know any other way to make it clear, that it's not part of the debate.

As for your latest results, I'm not sure how you could have gotten them. But since you haven't posted the modified code, you could still have things in your code that are throwing things off.
In my test, there's no difference between the two. However, if the OS did not have a cache, fopen/fread would be faster.

>> ifstream depends on the implementation
OK, so can you provide another implementation, a faster one? What does ifstream do at the low level? It calls the open/read system functions.
It depends on the implementation, and on both Linux and Windows it's slower :)

Remember, you said initially that fread/fopen and ifstream/ofstream do buffering, and open/read do not.
Well, as we saw before, open and read do buffering too, through the OS.
Since MS-DOS, have you seen an OS in use that does no buffering?
Even MS-DOS got a driver to do buffering later on.
What kind of buffering would be done for fread that is not done for read? Why would the compiler take care of buffering when it is typically an OS job!?

You said, "I tested and 'read' is only 2% faster" - well, at least it's faster :) but let's see the algorithm:

>> speed is OS-dependent because of OS buffering
Sure, and APIs are compiler-dependent... Well, open/read is faster on Linux and XP...
What do you think fread does at the low level? It calls 'read'... That's why on Unix, for instance, open and read are in the section-2 (system call) family of the manual, while fopen and fread are in the section-3 (C library function) family. The "3" family calls the "2" family. Sure, "3" is more sophisticated in terms of programmer convenience, and it could implement special buffering etc., but in our case - again, multiple sequential reads of a file read once - the system calls are faster.

The modified code? It is the same, except that fread no longer uses feof(); only f_doit() changed:

>>You said, "I tested and 'read' is only 2% faster" - well, at least it's faster :) but let's see the algorithm:
No. That's not what I said. You seem to be reading what you want to read.
I said there was a 2% difference up and down. That means sometimes fread was faster than read, and sometimes read was faster than fread.

>>This is the same, besides the fact that fread does not use eof() anymore, just the f_doit() changed:
So you're saying that with the above code changes, you're still getting a 16% difference?
If so, the fopen/fread function set must have very poor optimization in your library.

Of course, if you run this in DOS-only mode, you'll notice that fread is much faster than read.

Yes, there is still a 16% difference, *without* doing the lengthy copy operation, of course :)
Anybody can try the code I provided using a good compiler...
Speaking of which, I used g++ 3.3.3, one of the best libraries around.

It's one of the best libraries for portability, but it's NOT a good library for optimization, and it's a poor choice for a performance test.

That's why you're getting such bad results.
If you used a compiler that is built for optimization, you wouldn't be getting those results.

>>Anybody can try the code I provided using a good compiler...

A good *OPTIMIZING* compiler will give you results closer to what I posted, and will certainly not give you a 16% difference.

The GNU compiler is a really good compiler for portability. In fact, it's the best portable compiler available.
However, the portability comes at a price, and that price is poorer optimization and performance.

Therefore, it certainly would not be the compiler of choice for making a case on any C/C++ performance issue.

Well... I thought that the power of thousands of people working on open source would produce better software, especially for theoretical matters like compilers...
We have to keep something in mind, though: it's not because a compiler gives the same timing for read and fread that it is better; it could be that it is simply bad at 'read' :)
