I have a python program that needs to generate several guids and hand them back with some other data to a client over the network. It may be hit with a lot of requests in a short time period and I would like the latency to be as low as reasonably possible.

Ideally, rather than generating new guids on the fly as the client waits for a response, I would rather be bulk-generating a list of guids in the background that is continually replenished so that I always have pre-generated ones ready to hand out.

I am using the uuid module in Python on Linux. I understand that this uses the uuidd daemon to get UUIDs. Does uuidd already take care of pre-generating UUIDs so that it always has some ready? From the documentation it appears that it does not.

Is there some setting in Python or with uuidd to get it to do this automatically? Is there a more elegant approach than manually creating a background thread in my program that maintains a list of UUIDs?

3 Answers

Are you certain that the uuid module would in fact be too slow to handle the requests you expect in a timely manner? I would be very surprised if UUID generation accounted for a bottleneck in your application.

I would first build the application to simply use the uuid module; then, if you find that this module is in fact slowing things down, investigate a way to keep a pre-generated list of UUIDs around.
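Measuring first is cheap. A minimal sketch using timeit to check how long UUID generation actually takes (absolute numbers will vary by machine and Python version):

```python
import timeit
import uuid

# Time 10,000 uuid1() calls before assuming generation is the
# bottleneck.  On typical hardware each call is on the order of
# microseconds, which is usually dwarfed by network latency.
n = 10_000
seconds = timeit.timeit(uuid.uuid1, number=n)
print(f'{n} uuid1() calls: {seconds:.3f}s ({seconds / n * 1e6:.1f} us each)')
```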

+1! Absolutely! This whole question SCREAMS "premature optimization". Even if popping UUIDs from a set is 100x faster than generating them on the fly, there is little point in adding this kind of complexity if generation only accounts for 2% of the overall program. What happens if you multithread this code? Now you have to synchronize access between the UUID-generating thread and the multiple UUID-consuming threads. This is an interesting idea, but PLEASE hold off on pursuing it until you get the program running, and only then discover that UUID generation is a performance-limiting item.
– Paul McGuire, Apr 9 '10 at 3:13
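If profiling ever does justify the pre-generation idea, the synchronization concern raised above is easy to sidestep: queue.Queue handles the producer/consumer locking. The UUIDPool class below is a hypothetical sketch of that design, not code from any answer here:

```python
import queue
import threading
import uuid

class UUIDPool:
    """Keep a bounded pool of pre-generated UUIDs topped up in the background.

    Hypothetical sketch: queue.Queue provides the locking, so no manual
    synchronization is needed between the filler thread and consumers.
    """

    def __init__(self, size=1000):
        self._pool = queue.Queue(maxsize=size)
        threading.Thread(target=self._fill, daemon=True).start()

    def _fill(self):
        while True:
            # put() blocks once the queue is full, so the filler thread
            # sleeps until a consumer drains some UUIDs.
            self._pool.put(uuid.uuid1())

    def get(self):
        # Fall back to generating inline if the pool happens to be empty.
        try:
            return self._pool.get_nowait()
        except queue.Empty:
            return uuid.uuid1()

pool = UUIDPool(size=100)
print(pool.get())
```

The fallback in get() also answers the duplicate-GUID worry: every UUID handed out still comes from uuid1(), whether it was pre-generated or not.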

+1. Imagine the disaster that would ensue if something went wrong, and you gave the same GUID twice.
– Chris Thornton, Apr 9 '10 at 20:22

Now try timer = timeit.Timer('uuids.pop()', 'import uuid; uuids = [str(uuid.uuid1()) for _ in xrange(10000)]') and you will see that it is over 100 times faster.
– rjuiaa1, Apr 8 '10 at 23:07
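For reference, that comparison in modern Python 3 (xrange is gone; note that list.pop() with no argument removes from the end, which is O(1)); the exact ratio will vary by machine:

```python
import timeit

n = 10_000

# Time n fresh uuid1() calls.
gen = timeit.timeit('uuid.uuid1()', setup='import uuid', number=n)

# Time n pops from a pre-built list of n UUID strings.
pop = timeit.timeit(
    'uuids.pop()',
    setup='import uuid; uuids = [str(uuid.uuid1()) for _ in range(10_000)]',
    number=n,
)

print(f'generate: {gen:.4f}s  pop: {pop:.4f}s  ~{gen / pop:.0f}x faster')
```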

@rjuiaa, I found popping from a set was faster than popping from a list which is why I used set in my answer
– John La Rooy, Apr 9 '10 at 0:39

I'm surprised that sets are faster here than lists. What about on the creation side? Sets first have to guarantee uniqueness, which is sort of a waste in this case, since we can be fairly certain that all the UUIDs will be unique. Is the underlying list data storage the cause? Perhaps a deque might address some of the issues with list storage, without incurring the unnecessary uniqueness checking penalty of using a set.
– Paul McGuire, Apr 9 '10 at 3:28
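The relative costs of the three containers are easy to check directly. A sketch timing removal from a list, a set, and a deque (the container names and sizes here are illustrative; absolute numbers vary by machine):

```python
import timeit

n = 10_000
setup = ('import uuid; from collections import deque; '
         'data = [str(uuid.uuid1()) for _ in range(10_000)]')

# Each statement runs exactly n times against an n-element container.
list_t = timeit.timeit('lst.pop()', setup=setup + '; lst = list(data)', number=n)
set_t = timeit.timeit('s.pop()', setup=setup + '; s = set(data)', number=n)
deque_t = timeit.timeit('d.popleft()', setup=setup + '; d = deque(data)', number=n)

print(f'list.pop: {list_t:.4f}s  set.pop: {set_t:.4f}s  '
      f'deque.popleft: {deque_t:.4f}s')
```

Note that the uniqueness check a set performs happens on insertion (hashing), not on pop; all three removals here are O(1), so any differences are constant-factor ones.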

Is there no need to synchronize access to the pool across threads? Or are add and pop guaranteed threadsafe?
– Paul McGuire, Apr 9 '10 at 3:35

Also, your while True forever loop contains some invariants that get recalculated each time. Every attribute access incurs a lookup, and in this algorithm none of them change from loop to loop. Before starting the loop, assign locals for self.uuid_pool, self.uuid_pool.add and uuid.uuid1 - even assigning len to a local will run faster, since accessing a local is cheaper than a global. Is this micro-optimization? Of course it is. But if you go with the idea that UUID generation is something expensive, why leave out these easy optimization steps?
– Paul McGuire, Apr 9 '10 at 3:35
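The name-binding trick described above is measurable with timeit. In this sketch, pre-binding uuid.uuid1 to a single name saves one attribute lookup per call; the saving is real but tiny next to the cost of uuid1() itself, which is rather the point of the surrounding discussion:

```python
import timeit

# Resolves the name 'uuid' and then its 'uuid1' attribute on every call.
global_lookup = timeit.timeit(
    'for _ in range(1000): uuid.uuid1()',
    setup='import uuid',
    number=100,
)

# Same loop with uuid.uuid1 pre-bound to a single name in the setup.
pre_bound = timeit.timeit(
    'for _ in range(1000): new_uuid()',
    setup='import uuid; new_uuid = uuid.uuid1',
    number=100,
)

print(f'attribute lookup: {global_lookup:.4f}s  pre-bound: {pre_bound:.4f}s')
```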

@Paul McGuire, yes, add and pop are threadsafe. The part that needs to be optimised is accessing the UUIDs; I assume there is ample time to fill the pool between bursts. Having a local like get_uid = uuid_pool.uuid_pool.pop might be a good idea though.
– John La Rooy, Apr 9 '10 at 7:52