Q: What types of clients and commands will be used to automatically test my middleware implementation?

A: We run automatic tests to reduce the complexity of grading so many reports. There are no hidden unit tests and no performance bounds that must be reached. The tests check basic things: that the system actually runs when connected to memtier clients and memcached servers; that it processes the requests as required (SET, GET, multi-GET); and that it follows the constraints in the project specification regarding message sizes, control commands, etc. We perform these tests because, in the past, we have received incomplete submissions, some of which did not work at all. This made it next to impossible to understand the results presented in the report, since we can only work on the assumption that the submitted project follows the specification. We do not have the capacity to manually check the code of every student so, instead, we do a basic sanity check to ensure that the submitted system is the platform used for running the experiments.

Although we do not do it for all projects, if your performance numbers reflect system behavior we cannot understand, we may try to reproduce your results by running the same workloads you claim to have run and checking whether we get approximately the same results. If we cannot reproduce your results, we will call you for a meeting in person to clarify the situation.

Finally, all submitted code will be checked against this year's and last year's projects. Any plagiarism will be reported and will lead to a failing grade.

Q: Can I aggregate my statistics outside of the middleware (after termination of the experiment)?

A: Yes, you can record the data for each thread separately and then combine it with scripts after the experiment has terminated. In general, you have considerable freedom in how you implement this; for instance, you could also introduce a helper thread to collect the statistics. What matters is that each thread collects the required statistics and that you then combine these results either in the middleware (online) or afterwards with scripts (offline).
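
As a rough illustration of combining per-thread statistics, here is a minimal sketch. It is written in Python purely for illustration; the `ThreadStats` layout and the function names are hypothetical and not part of the project specification.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ThreadStats:
    """Counters that one worker thread keeps for itself (hypothetical layout)."""
    requests: int = 0
    total_service_time_ms: float = 0.0

    def record(self, service_time_ms: float) -> None:
        # Called by the owning thread only, so no locking is needed here.
        self.requests += 1
        self.total_service_time_ms += service_time_ms


def combine(per_thread: Iterable[ThreadStats]) -> ThreadStats:
    """Merge per-thread counters into one total (usable online or offline)."""
    total = ThreadStats()
    for stats in per_thread:
        total.requests += stats.requests
        total.total_service_time_ms += stats.total_service_time_ms
    return total


def mean_service_time_ms(stats: ThreadStats) -> float:
    """Mean service time, or 0.0 if no request was recorded."""
    if stats.requests == 0:
        return 0.0
    return stats.total_service_time_ms / stats.requests
```

Keeping sums and counts per thread, rather than per-thread averages, means the combined mean is weighted correctly even when threads handled different numbers of requests.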

Q: Why do I have to collect/aggregate statistics over a 5-second window?

A: The 5-second window is an upper limit suggested by us to reduce the overhead of logging (writing to a file) or the amount of data you collect in memory. If your implementation can deal with smaller windows and thereby collect more data points without introducing a visible overhead, feel free to do so.
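
One possible way to aggregate samples into fixed windows is sketched below in Python. The sample format (timestamp in seconds, response time in milliseconds) and the function name `aggregate_windows` are assumptions made for this example only.

```python
from collections import defaultdict

WINDOW_SECONDS = 5.0  # the suggested upper limit; smaller windows are allowed


def aggregate_windows(samples, window=WINDOW_SECONDS):
    """Group (timestamp_s, response_time_ms) samples into fixed windows.

    Returns {window_index: (request_count, mean_response_time_ms)}.
    """
    buckets = defaultdict(lambda: [0, 0.0])  # index -> [count, sum of times]
    for timestamp_s, response_time_ms in samples:
        index = int(timestamp_s // window)
        buckets[index][0] += 1
        buckets[index][1] += response_time_ms
    return {
        index: (count, total / count)
        for index, (count, total) in sorted(buckets.items())
    }
```

Only one count and one sum per window need to be kept in memory or written to the log, which is where the reduction in overhead comes from. Dividing a window's count by the window length gives the throughput for that window.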

Q: Should I measure service times separately for each request type? And for each thread?

A: Ideally, you measure the service time separately for each request type (SET, GET, multi-GET). As we have shown in the exercise session, averaging over different request types can yield a value that is never (or only rarely) actually observed in practice. For building the M/M/m model, it is necessary to have the service time (service rate) for each thread separately.
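
The effect of averaging over request types can be seen in a small Python sketch. The numbers are invented for illustration and are not taken from the exercise session or from any real measurement.

```python
from statistics import mean

# Hypothetical service times in milliseconds, invented for illustration.
service_times_ms = {
    "GET": [1.0] * 90,   # 90 fast requests
    "SET": [11.0] * 10,  # 10 slow requests
}

# Mean service time per request type.
per_type_mean = {kind: mean(values) for kind, values in service_times_ms.items()}

# Mean over all requests, ignoring the request type.
all_values = [v for values in service_times_ms.values() for v in values]
overall_mean = mean(all_values)

# Service rate per request type in requests per second (1000 ms / mean ms).
per_type_rate = {kind: 1000.0 / m for kind, m in per_type_mean.items()}
```

With these numbers, the per-type means are 1 ms (GET) and 11 ms (SET), while the overall mean is 2 ms, a value that no single request in the sample actually took.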

Q: Why don't you just tell me exactly what I need to measure?

A: There are different ways to measure the same things depending on how the system has been constructed. It is important to look at the whole report and the questions we are asking in it because that provides very accurate information on what is needed. For instance, regarding getting service rate data for each worker thread separately, the report requires you to demonstrate that the middleware balances the load across all memcached servers. This is difficult to accomplish if you do not have information on what each worker thread is doing. We strongly recommend going over the report template, making a list of all the data required, and then implementing the instrumentation accordingly.

Q: Should I implement a network queue?

A: The network queue is, from a functional point of view, not strictly necessary, as worker threads could get the requests directly from the network interface. However, if there is no network queue where the requests are placed before being processed, it becomes difficult to do many of the measurements needed to understand the performance of the system. For instance, it will not be possible to measure the time a message waits to be processed or the average number of requests waiting in the queue at any given time. Without such information, the experimental analysis and definitely the analytical treatment become next to impossible, and the results will be very poor. This is why we require that the middleware includes a network queue.
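
To make the measurement argument concrete, here is a minimal Python sketch of a queue that timestamps each request on arrival so that waiting time and queue length can be observed. The `TimedQueue` class and its method names are hypothetical; this is one possible design under those assumptions, not the required one.

```python
import queue
import time


class TimedQueue:
    """A thread-safe FIFO queue that remembers when each request was enqueued."""

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, request):
        # Store the arrival time next to the request (monotonic clock).
        self._queue.put((time.monotonic(), request))

    def get(self):
        """Block until a request is available; return (request, waiting_time_s)."""
        enqueued_at, request = self._queue.get()
        waiting_time_s = time.monotonic() - enqueued_at
        return request, waiting_time_s

    def length(self):
        # Approximate number of waiting requests; can be sampled periodically.
        return self._queue.qsize()
```

A network thread would call `put` and each worker thread would call `get`, recording the returned waiting time in its own statistics. Sampling `length()` at regular intervals gives an estimate of the average queue length.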

Q: What do I put into the network queue?

A: You decide the type of the objects you place on the network queue. Each design decision has implications in terms of the performance you are going to observe because, depending on this decision, the way messages are actually processed varies. The design decision will also have implications in terms of what you measure inside the middleware as overall response time, service time, waiting time, etc.

Q: Why don't you just tell me exactly what I need to do instead of pointing to the several choices available?

A: The course is about making design choices, building a system, and then explaining, experimentally as well as analytically, why the system behaves the way it does. How you implement the network queue, where you actually do the message parsing, what objects you give to the worker threads, etc. will influence how the system behaves. We expect you to make a design choice and then explain the consequences of that design in terms of the performance observed. To make such choices, you should think carefully about the overall system behavior and the overheads that each operation will cause and, very importantly, look at what we are asking you to do in the complete report. Some of the questions about whether a network queue is needed can be easily answered from what we ask you to measure since, as pointed out above, without the queue such measurements become very difficult.

Q: In case in Section 4.2 I have a different number of clients for each worker thread scenario that produce the maximum throughput, how should I show this?

A: In general, throughout the report, it is a good idea to be as explicit as possible about your setup and measurement points. This helps us interpret your results, and it lets you quickly verify and compare results within and across sections. In this particular case, you can include the number of clients either as part of the explanation of the table or as an additional row in the table.