I know that execve discards any existing dynamically allocated memory. My task is to write a C program that launches a binary and then communicates with it through a shared buffer. The key here is efficiency: the buffer is relatively small, so I want to avoid RPC and syscalls (such as shmat) as much as possible.

Previously, I'd been creating the buffer in a C program, then using clone (with the CLONE_VM flag set) followed by an exec call to the binary. Obviously this didn't work, as exec replaces the process image.

I'm not worried about how efficient setup is; my goal is to have the most efficient communication possible once setup is complete.

Unless anyone has a way to execute a binary within the same address space so that they can share malloced areas of memory, I am going to use shmget and shmat (attaching the shared memory from within the executed binary).

Since you are already paying the expense of fork and exec, I don't think you need to worry about the efficiency of shmat. I think you'll have plenty of other (much larger) bottlenecks to chase down.
– jxh, Aug 2 '12 at 23:19

Thanks. I'm not worried about the cost of fork and exec (the startup cost). What worries me is the cost of reading from and writing to the buffer, and ideally I'd like to minimise context switches. That is why I want to run both in the same address space, and why I used clone initially so that they kept sharing as much as possible, but the exec call ruins all that.
– user1018513, Aug 2 '12 at 23:23

Do you expect to continuously call shmat? I assumed there would be one shmat per exec. Size your shared memory to the queueing depth you need up front once, and use the memory like a ring buffer.
– jxh, Aug 2 '12 at 23:25

No, I only expect to call shmat once. My concern is the context switch from the C program to the binary every time there is data to be read or written. One thing I am unclear about: I did clone(launch_prog, child_stack + max_memory, CLONE_FILES | CLONE_VM | CLONE_IO | CLONE_FS, cmd); followed by an exec call. Is the binary still sharing much of the parent program's context? (In other words, how expensive would it be to context switch between the two, which is what I'm worried about?)
– user1018513, Aug 2 '12 at 23:28

If you have two threads running on the same CPU, you are going to have a context switch. There is no avoiding that. If they are running on different CPUs, you may be able to avoid the context switch on a message pass, but unless they are the only processes running on the system, you are going to have a context switch eventually. I believe your invocation of clone mimics vfork, but it all gets trashed after exec. Are you sure you don't want to use threads instead?
– jxh, Aug 2 '12 at 23:32

1 Answer

The problem statement in your question is not well formed. Since shmat will only be called once per exec, its cost is amortized across the lifetime of the process. Since you state in your comments that you have to exec a different program (it is unclear why), threads are out.

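You are afraid that using shared memory attached with shmat to pass a message incurs a greater penalty than memory shared by some other means. This fear is largely unfounded. Once attached, you would use the shared memory just like any other dynamically allocated memory: reads and writes are plain loads and stores, with no syscall involved. The one caveat is that the segment may be mapped at a different address in each process attached to it, so anything stored inside it should refer to other locations by offset rather than by raw pointer.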
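To illustrate the address caveat, here is a minimal sketch. It attaches one System V segment twice within a single process, which stands in for two separate processes since the two mappings get different base addresses. The names `struct msg`, `struct region`, `msg_at`, and `shm_offset_demo` are hypothetical and exist only for this example; the real message layout would come from your own code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical fixed-size message. */
struct msg { int id; char text[28]; };

/* Hypothetical segment layout: links are byte offsets from the segment base. */
struct region {
    size_t first_off;        /* offset of the first valid message */
    struct msg slots[4];
};

/* Turn an offset into a pointer that is valid for this particular mapping. */
static struct msg *msg_at(void *base, size_t off) {
    return (struct msg *)((char *)base + off);
}

/* Writes through one mapping, reads through another.
   Returns the id that was read, or -1 on error. */
int shm_offset_demo(void) {
    int shmid = shmget(IPC_PRIVATE, sizeof(struct region), IPC_CREAT | 0600);
    if (shmid == -1) return -1;

    void *a = shmat(shmid, NULL, 0);   /* "producer" view */
    void *b = shmat(shmid, NULL, 0);   /* "consumer" view, different address */
    shmctl(shmid, IPC_RMID, NULL);     /* destroy segment after last detach */
    if (a == (void *)-1 || b == (void *)-1) {
        if (a != (void *)-1) shmdt(a);
        if (b != (void *)-1) shmdt(b);
        return -1;
    }
    assert(a != b);                    /* same memory, two base addresses */

    /* Producer side: ordinary stores, and an offset instead of a pointer. */
    struct region *ra = a;
    ra->slots[2].id = 42;
    strcpy(ra->slots[2].text, "hello");
    ra->first_off = offsetof(struct region, slots) + 2 * sizeof(struct msg);

    /* Consumer side: resolve the offset against its own base address. */
    struct region *rb = b;
    int id = msg_at(b, rb->first_off)->id;

    shmdt(a);
    shmdt(b);
    return id;
}
```

Had `first_off` been a raw `struct msg *` written through the first mapping, it would only be meaningful to a reader whose segment happens to sit at the same address.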

You do not explicitly state your requirements or the parameters of the problem, but in your comments you state you want to pass a message between two processes without incurring a context switch. This is possible by having the consumer spin wait if the message queue is empty. However, this is only fruitful if the consumer and producer are running on different processors.
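A spin-waiting queue of this kind can be sketched roughly as below, as a single-producer, single-consumer ring using C11 atomics. This is an illustration under assumptions, not a tuned implementation: `struct msg`, `struct ring`, `RING_SLOTS`, and the function names are all hypothetical, and the demo uses an anonymous shared `mmap` plus `fork` purely for brevity. The same ring could live in memory obtained from shmat.

```c
#define _DEFAULT_SOURCE
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define RING_SLOTS 64u            /* hypothetical depth; must be a power of two */

struct msg { int value; };        /* hypothetical fixed-size payload */

struct ring {
    _Atomic unsigned head;        /* next slot to write; stored only by producer */
    _Atomic unsigned tail;        /* next slot to read; stored only by consumer */
    struct msg slots[RING_SLOTS];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
int ring_try_push(struct ring *r, struct msg m) {
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SLOTS) return -1;
    r->slots[head % RING_SLOTS] = m;
    /* Release: the slot contents become visible before the new head. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}

/* Consumer side: busy-waits, without any syscall, until a message arrives. */
struct msg ring_pop_spin(struct ring *r) {
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    while (atomic_load_explicit(&r->head, memory_order_acquire) == tail)
        ;                         /* spin while the queue is empty */
    struct msg m = r->slots[tail % RING_SLOTS];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return m;
}

/* Forks a producer that sends 1..n; the parent consumes them.
   Returns the sum of the received values, or -1 on error. */
long ring_demo(int n) {
    struct ring *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED) return -1;
    atomic_init(&r->head, 0);
    atomic_init(&r->tail, 0);

    pid_t pid = fork();
    if (pid == -1) { munmap(r, sizeof *r); return -1; }
    if (pid == 0) {               /* child: producer */
        for (int i = 1; i <= n; i++) {
            struct msg m = { i };
            while (ring_try_push(r, m) != 0)
                ;                 /* spin while the queue is full */
        }
        _exit(0);
    }

    long sum = 0;                 /* parent: consumer */
    for (int i = 0; i < n; i++)
        sum += ring_pop_spin(r).value;

    waitpid(pid, NULL, 0);
    munmap(r, sizeof *r);
    return sum;
}
```

When both processes share a single CPU, the spinning side simply burns its time slice until the scheduler preempts it, which is why the approach only pays off with the producer and consumer on different processors.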

In any case, I would consider all these issues late-stage optimizations. Focus first on correctly delivering the messages. Then find out where the bottlenecks are if the performance is not at an acceptable level.

The binary is provided to me and can only be in that format, hence the need to use exec (unless there is another way to execute a binary whilst keeping the same address space). My requirements are to guarantee very high throughput reading and writing data between the producer/consumer (hence minimising the overhead of context switching, or the need to do a syscall for reading and writing). The data in the buffer is of fixed size and is always in the same format (C structure). So ideally I would just have a "shared pointer" but my worry is with running the two in different address spaces.
– user1018513, Aug 3 '12 at 0:29

@user1018513: That explains why you need exec, but there is the big problem of how you are going to convince that third party executable to attach the shared memory you are creating to read its messages. You may be forced to use a pipe or a socketpair and hook into the executable's stdin/stdout.
– jxh, Aug 3 '12 at 0:33
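The socketpair fallback mentioned in the last comment could look roughly like the sketch below. The function name `talk_to_child` is hypothetical, and any program that reads stdin and writes stdout (for example `cat`) can stand in for the third-party executable when trying it out. `MSG_NOSIGNAL` is used so that a child that exits early produces an error return rather than SIGPIPE; this assumes Linux.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs `prog` with its stdin and stdout connected to one end of a
   socketpair, sends `msg`, then collects the reply into `out`
   (NUL-terminated). Returns the number of bytes received, or -1. */
long talk_to_child(const char *prog, const char *msg, char *out, size_t cap) {
    int sv[2];
    if (cap == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) { close(sv[0]); close(sv[1]); return -1; }

    if (pid == 0) {                       /* child: becomes the binary */
        close(sv[0]);
        dup2(sv[1], STDIN_FILENO);        /* binary reads our messages */
        dup2(sv[1], STDOUT_FILENO);       /* binary's output comes back */
        if (sv[1] > STDOUT_FILENO) close(sv[1]);
        execlp(prog, prog, (char *)NULL);
        _exit(127);                       /* reached only if exec failed */
    }

    close(sv[1]);                         /* parent keeps sv[0] only */

    size_t len = strlen(msg), sent = 0;
    while (sent < len) {
        ssize_t w = send(sv[0], msg + sent, len - sent, MSG_NOSIGNAL);
        if (w <= 0) break;
        sent += (size_t)w;
    }
    shutdown(sv[0], SHUT_WR);             /* child sees EOF on stdin */

    size_t got = 0;
    ssize_t n;
    while (got < cap - 1 &&
           (n = read(sv[0], out + got, cap - 1 - got)) > 0)
        got += (size_t)n;
    out[got] = '\0';

    close(sv[0]);
    waitpid(pid, NULL, 0);
    return (long)got;
}
```

Unlike the shared-memory approaches, every message here goes through the kernel, so each read and write is a syscall. The benefit is that the executable needs no knowledge of the shared buffer at all.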