I have written a program that records protocol messages between an application and a hardware device, matching each application request with the corresponding hardware response. The goal is that I can later remove the hardware, connect a 'replay' application to the main application, wait for an application request, and reply with a matched copy of the requisite hardware reply message.

This works fine for a small interaction session. My problem now is that I need to be able to use the replay over a very long session. With my current implementation, the replay program eventually uses up all available memory on my computer and crashes.

So I need some sort of lookahead, rather than parsing the whole session in one go.

3 Answers

(Just re-iterating so that you know that I know what you are saying)
It sounds like you are effectively creating an in-memory structure of the various requests and responses. With a long session, that structure grows until it takes more memory than you have available.

Instead of trying to keep the entire thing in memory, would it be feasible to move to a disk-based system? For example, you could use Berkeley DB on disk. It has C and C++ bindings that you could use to tie it into your current application.

With this approach, you could hash the request (last I recall, Berkeley DB likes simple keys) and store/retrieve based on the hash. That way you don't keep the entire database (or list) in memory, but instead do fast indexed lookups on disk.
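A minimal sketch of that hash-keyed lookup idea. Here a `std::unordered_map` stands in for the on-disk Berkeley DB table so the example is self-contained; with Berkeley DB you would replace the map operations with `db->put()`/`db->get()` calls using the hash as the key. The `ReplayStore` name and string-based request/response types are assumptions.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hash each request down to a simple fixed-size key, as Berkeley DB
// prefers, then store/look up the recorded response by that key.
using RequestKey = std::size_t;

RequestKey make_key(const std::string& raw_request) {
    return std::hash<std::string>{}(raw_request);  // simple key for the DB
}

class ReplayStore {
public:
    void record(const std::string& request, const std::string& response) {
        table_[make_key(request)] = response;      // db->put() in Berkeley DB
    }
    // Returns nullptr when no matching request was recorded.
    const std::string* lookup(const std::string& request) const {
        auto it = table_.find(make_key(request));  // db->get() in Berkeley DB
        return it == table_.end() ? nullptr : &it->second;
    }
private:
    // Stand-in for the disk-backed Berkeley DB table.
    std::unordered_map<RequestKey, std::string> table_;
};
```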

Large interaction data is usually serialised to a file. Write the data to a CSV file, or write it to a database and read it back from there. Journal the data once it crosses a certain size limit, and save regularly after a span of time. Persisting it this way will keep the replay program from exhausting memory.

First, know that std::list is a bad choice for this kind of thing, in particular for reading. The reason is that std::list elements are not guaranteed to be contiguous, which makes moving from element to element much slower because of cache misses (and the lack of a predictable memory access pattern).

If I were you, the first thing I would try would be to replace the std::list with a std::vector, calling reserve before you start pushing elements back, to allocate plenty of memory up front and avoid reallocations.
If you use C++11, use emplace_back() instead of push_back() to construct elements in place and avoid unnecessary copies.

A vector has contiguous memory, which guarantees fast access when reading the session, and it is more memory efficient because elements sit next to each other: there is no memory fragmentation.
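The reserve/emplace_back advice above can be sketched as follows; the `Record` type and the expected-count parameter are assumptions, since the original post doesn't show its data structures.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative record type: one matched request/response pair.
struct Record {
    std::string request;
    std::string response;
    Record(std::string req, std::string resp)
        : request(std::move(req)), response(std::move(resp)) {}
};

std::vector<Record> load_session(std::size_t expected_count) {
    std::vector<Record> session;
    session.reserve(expected_count);   // one big allocation up front,
                                       // instead of many reallocations
    // ... for each recorded message pair in the capture:
    session.emplace_back("GET_STATUS", "STATUS_OK");  // constructed in place,
                                                      // no temporary copy
    return session;
}
```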

If that is not enough, consider using something like SQLite. It doesn't replace in-memory access, but you could use a pair of std::arrays of your records as a double buffer, into which you put the next batch of records to read, extracted progressively from the SQLite file. To write, just do inserts. Both reading and writing can be done in a separate thread while your application continues running or processes the records it is replaying.

The reason this setup might help is that the size of the session would no longer (or barely) impact the application that plays it: only part of the whole session would be in memory at a time, so memory use would not grow at runtime and things would stay fast. You would then be limited only by disk access speed, which is why I suggest the (concurrent) double-buffered reading of the records. In any case, SQLite is designed to be very fast.
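A minimal sketch of that double-buffer idea. To keep it self-contained, a callback stands in for "SELECT the next N records" from SQLite, and the refill happens synchronously; in the real design the refill of the inactive buffer would run on a separate thread, overlapping the disk I/O with replay. `BATCH`, the class name, and the fetch signature are all assumptions.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

constexpr std::size_t BATCH = 4;  // records per buffer (illustrative size)

class DoubleBuffer {
public:
    // Fetch fills a buffer with up to BATCH records and returns how many
    // it wrote; 0 means the session is finished.  In the real setup this
    // would run a SQLite query for the next batch.
    using Fetch = std::function<std::size_t(std::array<std::string, BATCH>&)>;

    explicit DoubleBuffer(Fetch fetch) : fetch_(std::move(fetch)) {
        count_ = fetch_(buffers_[active_]);          // prime the first batch
    }

    // Returns the next record, or nullptr at end of session.  When the
    // active buffer is exhausted, swap to the other one and refill.
    const std::string* next() {
        if (pos_ == count_) {
            active_ ^= 1;                            // swap buffers
            count_ = fetch_(buffers_[active_]);      // refill (would be the
            pos_ = 0;                                // background thread's job)
            if (count_ == 0) return nullptr;
        }
        return &buffers_[active_][pos_++];
    }

private:
    std::array<std::array<std::string, BATCH>, 2> buffers_;
    Fetch fetch_;
    std::size_t active_ = 0, pos_ = 0, count_ = 0;
};
```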