I tried different ways to load file contents into structs but ran into trouble, simply because the structs contain strings, which makes them variable in size. The file is supposed to hold several instances, each of which needs to be loaded into a struct.

I am thinking that the file header contains the number of structs in the file. The struct header can contain the length of the string.

Would you be so kind as to show me how I can read a chunk of data from a file and cast that into a struct that has a string in it? (btw are std::strings bad for this?)

If you want to read and write the whole struct in one line you can use a char array instead of std::string. Another way is to handle each member in the struct separately. You can write a std::string to file by first writing the length of the string and then the string data.
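A minimal sketch of the member-by-member approach, assuming a hypothetical Widget struct (the names Widget, write_string and read_string are mine, not from the original post) and assuming the file is read on the same platform that wrote it, since the length is stored in native byte order:

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Write a 32-bit length followed by the raw characters (no NUL terminator).
void write_string(std::ostream& out, const std::string& s) {
    assert(s.size() <= UINT32_MAX);
    const std::uint32_t len = static_cast<std::uint32_t>(s.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(s.data(), len);
}

// Read the length back, then read exactly that many characters.
bool read_string(std::istream& in, std::string& s) {
    std::uint32_t len = 0;
    if (!in.read(reinterpret_cast<char*>(&len), sizeof len)) return false;
    s.resize(len);
    if (len > 0 && !in.read(&s[0], len)) return false;
    return true;
}

// Hypothetical record: one fixed-size member plus one string.
struct Widget {
    std::int32_t id;
    std::string name;
};

// Each member is handled separately; the struct is never written as a block.
void write_widget(std::ostream& out, const Widget& w) {
    out.write(reinterpret_cast<const char*>(&w.id), sizeof w.id);
    write_string(out, w.name);
}

bool read_widget(std::istream& in, Widget& w) {
    if (!in.read(reinterpret_cast<char*>(&w.id), sizeof w.id)) return false;
    return read_string(in, w.name);
}
```

With a real file you would open the std::ofstream / std::ifstream with std::ios::binary. A production reader should probably also sanity-check the length before resizing.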

That is the current layout of the data in the file. str[length] is a NUL character.

"bar" could be used to identify what type of struct we are reading?

Yes. Sometimes you can omit such identifiers, as the format of the file only allows certain structures in certain positions.

Note that if you are explicitly writing the length, the NUL terminator can be omitted. Care must be taken in this case to correctly convert the not-NUL-terminated character array into a std::string. There are constructor overloads that take a character pointer and a length, or the assign() member function could be used.
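For illustration, here are the two conversions mentioned above, written as small helper functions (the helper names are hypothetical). Both take an explicit length, so the buffer does not need a NUL terminator:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Build a std::string from a buffer that is NOT NUL-terminated,
// using the (pointer, count) constructor overload.
std::string from_buffer(const char* data, std::size_t length) {
    assert(data != nullptr);
    return std::string(data, length);
}

// Same result with assign(), reusing an existing string object.
void assign_from_buffer(std::string& out, const char* data, std::size_t length) {
    assert(data != nullptr);
    out.assign(data, length);
}
```

Passing the bare pointer to the single-argument constructor instead would scan for a NUL that is not there, which is the mistake to avoid.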

Are you seriously suggesting a solution by abusing memory like that where you tell the compiler and user you have a one-character array for the string, and then storing the actual string content way outside the array and the object? That in itself is undefined, and then your code doesn't even consider the fact that you're not aligning consecutive structures properly to ensure that their members are aligned.

You cannot use your objects by themselves; they are only good for storing them as pointers in an array given your code to load them. Your code will blow up as soon as you try to treat an object as a value. What you propose is nothing more than a pointer and a dynamic sized string, but instead of having a safe implementation of the pointer, you're way into the realm of undefined behavior.

Are you seriously suggesting a solution by abusing memory like that where you tell the compiler and user you have a one-character array for the string, and then storing the actual string content way outside the array and the object?

Actually, it's a perfectly normal C idiom that was formalized in C99 with flexible array members. Even the Windows headers use it. Ex: the SYMBOL_INFO structure in dbghelp.h. I'm not personally a big fan of this technique, but it's not uncommon.

I'm not saying it's great C++ code, but the struct hack is something you can reasonably expect your C++ compiler to handle without incident seeing that it is a C idiom that occurs in headers that C++ compilers are regularly expected to digest in APIs commonly used from C++ code. On non-x86 platforms alignment could be a deal breaker for this particular code, but the DirectX structures pretty much lock it in as it is. I wouldn't use it myself, but I would expect it to work.

If you are serialising and deserialising data then this method is considerably faster than any standard "C++ Way" of doing things, as it allows for fast block loading of data with variable length members. Compressing XML to a binary format is one use, as is the serialisation of assets.

Certainly with loading the ability to simply dump something into memory and then fix up pointers/counts internally is going to be faster than loading a bit, reserving some memory 'somewhere else' (string/vector), loading some more into that, returning to your last bit, loading some more and so on. Lower fragmentation, better on the cache and centralised data which is easier to inspect in a memory dump are all things which can be useful.

Would I reach for this as my first solution? Probably not, but I would certainly consider it if the access pattern I expected meant that this was the optimal solution to the problem.

I tried different ways to load file contents into structs but ran into trouble, simply because the structs contain strings, which makes them variable in size. The file is supposed to hold several instances, each of which needs to be loaded into a struct.

I am thinking that the file header contains the number of structs in the file. The struct header can contain the length of the string.

Would you be so kind as to show me how I can read a chunk of data from a file and cast that into a struct that has a string in it? (btw are std::strings bad for this?)

I would order that differently if I were you, as you are very likely to waste space with the string in front of an aligned data type. So start with your aligned data types, or at least with the fixed-length data types which you know will put you on a 16 byte boundary. This will save you both file size and runtime memory.
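To illustrate the ordering point, here is a small sketch with two hypothetical layouts holding the same members. Exact sizes are implementation-defined, so the only thing the code asserts is that putting the most strictly aligned member first does not make the struct bigger:

```cpp
#include <cassert>
#include <cstddef>

// Small members wrapped around a double: the compiler typically inserts
// padding after 'tag' and again after 'flag'.
struct Interleaved {
    char   tag;
    double value;
    char   flag;
};

// Most strictly aligned member first: the two chars sit together and
// share a single run of tail padding.
struct Ordered {
    double value;
    char   tag;
    char   flag;
};

static_assert(sizeof(Ordered) <= sizeof(Interleaved),
              "reordering should never grow the struct");
```

On a typical 64-bit desktop compiler I would expect 16 bytes versus 24, but treat those numbers as an assumption and check sizeof on your own target.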

The usual way of using in-place memory offsets:
1) int offset = int((char*)(&object+1) - (char*)(&object.name)); //offset address is now between bar and baz?
2) object->name = ((char*)&object->name) + int(object->name); //did you mean offset?

1) Offset is the distance from the start of the 'name' field to the end of the structure. The string data itself is written to the file after the structure, so the offset tells you how far forward in the file to jump in order to find the string.
2) Upon deserialisation, 'name' actually contains the above offset value, not a pointer. The offset is relative to the address of the 'name' field, so the address of 'name' is added to the integer value of 'name', resulting in a pointer to the string data.
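The two steps can be sketched roughly as follows. This is not the original poster's code: the Record struct and function names are hypothetical, and instead of overwriting the offset with a pointer in place, the sketch copies the header out with memcpy and computes the pointer on request, which keeps it clear of alignment and aliasing problems:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical on-disk header. 'name_offset' is the distance in bytes from
// the address of the field itself to the start of the string data.
struct Record {
    std::int32_t id;
    std::int32_t name_offset;
};

// Step 1: the string is stored directly after the structure, so the offset
// is the distance from the 'name_offset' field to the end of the structure.
std::vector<char> serialise(std::int32_t id, const std::string& name) {
    Record r;
    r.id = id;
    r.name_offset = static_cast<std::int32_t>(
        sizeof(Record) - offsetof(Record, name_offset));
    std::vector<char> blob(sizeof(Record) + name.size() + 1, '\0');
    std::memcpy(blob.data(), &r, sizeof r);
    std::memcpy(blob.data() + sizeof r, name.data(), name.size());
    return blob;  // the final byte stays '\0', so the string is terminated
}

// Step 2: address of the field plus the stored offset gives the string.
const char* resolve_name(const std::vector<char>& blob) {
    assert(blob.size() > sizeof(Record));
    Record r;
    std::memcpy(&r, blob.data(), sizeof r);
    const char* field = blob.data() + offsetof(Record, name_offset);
    return field + r.name_offset;
}
```

The blob here stands in for the bytes read from the file in one block; nothing is parsed beyond adding one integer to one address.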

Are you seriously suggesting a solution by abusing memory like that where you tell the compiler and user you have a one-character array for the string, and then storing the actual string content way outside the array and the object?

FWIW, I would also recommend that category of suggestions -- my above example is a similar technique. In my opinion, these in-place memory techniques are far superior to "C++ style" serialisation techniques. For example, in our game engine, deserialising a file that contains hundreds of data structures is a nop; once the file is read from disk into memory by the OS, it's already usable without any parsing or decoding of its contents (instead of the above pointer patching on-load, I'd do it on-use by using offset templates instead of pointers in my structures).
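I can't show the engine's actual code, but an "offset template" of the kind described might look something like this guess. Offset<T>, Packed and make_packed are hypothetical names, and the string is kept in a fixed array inside the same struct only to keep the example small and well-behaved:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Self-relative offset: stores the distance in bytes from this object to
// its target and resolves it on use, so nothing is patched at load time.
template <typename T>
struct Offset {
    std::int32_t bytes;
    const T* get() const {
        return reinterpret_cast<const T*>(
            reinterpret_cast<const char*>(this) + bytes);
    }
};

// Hypothetical structure whose 'name' refers to data stored after it.
struct Packed {
    std::int32_t id;
    Offset<char> name;
    char         text[16];
};

Packed make_packed(std::int32_t id, const char* s) {
    Packed p{};  // zero-initialised, so 'text' is always NUL-terminated
    p.id = id;
    p.name.bytes = static_cast<std::int32_t>(
        offsetof(Packed, text) - offsetof(Packed, name));
    std::strncpy(p.text, s, sizeof p.text - 1);
    return p;
}
```

Because the offset is relative to the field's own address, a byte-for-byte copy of the structure is still valid wherever it ends up, which is what makes the "load is a nop" claim plausible.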

If you need random access to structs within the file, use a fixed char array. Yes, it will waste a bit of space, but you can access any struct directly by its position, which is much faster than having to loop through possibly all structs to find the one you are looking for (O(n)).
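A rough sketch of the fixed-size variant, assuming a hypothetical FixedRecord with a 28-character name field and a file that contains nothing but records (with a file header you would add its size to the seek position):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>

// Every record occupies exactly sizeof(FixedRecord) bytes in the file.
struct FixedRecord {
    std::int32_t id;
    char         name[28];  // longer names are truncated
};

void write_record(std::ostream& out, std::int32_t id, const char* name) {
    assert(name != nullptr);
    FixedRecord r{};  // zero-filled, so the name is always NUL-terminated
    r.id = id;
    std::strncpy(r.name, name, sizeof r.name - 1);
    out.write(reinterpret_cast<const char*>(&r), sizeof r);
}

// Jump straight to record 'index' without reading the ones before it.
bool read_record_at(std::istream& in, std::size_t index, FixedRecord& r) {
    in.seekg(static_cast<std::streamoff>(index * sizeof(FixedRecord)),
             std::ios::beg);
    return static_cast<bool>(in.read(reinterpret_cast<char*>(&r), sizeof r));
}
```

The whole struct is written in one call here, which is only reasonable because it contains no pointers or std::string members.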

If you are always going to read/write all structs at once (no random access), you can have a null terminated char array or a string of the exact size of the string.

In that case I could write a more sophisticated header which says where in the file the different structs are, like a table of contents. But for now it will be a file which describes the whole GUI, so everything is loaded. For a different GUI another file is loaded.

Are you seriously suggesting a solution by abusing memory like that where you tell the compiler and user you have a one-character array for the string, and then storing the actual string content way outside the array and the object?

Yes. I am seriously suggesting that. I have a feeling you've seriously misunderstood what the code is doing.

That in itself is undefined,

No, it's very well defined. (Hint: Read the ISO standard on C strings)

and then your code doesn't even consider the fact that you're not aligning consecutive structures properly to ensure that their members are aligned.

Correct. But adding code to handle that is trivial.

You cannot use your objects by themselves; they are only good for storing them as pointers in an array given your code to load them.

Correct. And that is bad because?

Your code will blow up as soon as you try to treat an object as a value.

That, sir, is impossible. The compiler would inform you it can't be instanced long before that could possibly happen. I'm not *that* dumb ;)

What you propose is nothing more than a pointer and a dynamic sized string, but instead of having a safe implementation of the pointer, you're way into the realm of undefined behavior.

It is a safe implementation, and it is fully defined as per the ISO standard. You might not like it, but that's not something relevant to its validity.

As for "What you propose is nothing more than a pointer and a dynamic sized string", well, look again. You may notice that there is no char pointer - which is the entire point of doing it in the first place! It is the most compact (and efficient) way of loading string data from a file stream.

I'd also gently point out that it's used all over the place, here are a couple of samples from the Win32 SDK:

And now you know about the technique, you'll probably notice it in most middleware libs too ;)

It's not the only technique that exists (there are other, equally nasty looking, but valid methods), but for the scenario above, you're going to struggle to find anything that can match it for performance, and the memory usage of the loaded asset.