Commit Message

[Issue]
When we offer a customer support service and a problem happens
in a customer's system, we try to understand the problem by
comparing what the customer reports with message logs of the
customer's system.
In this case, we often need to know when the problem happens.
But, currently, there is no timestamp in qemu's error messages.
Therefore, we may not be able to understand the problem based on
error messages.
[Solution]
This patch adds a timestamp to qemu's error message logged by
error_report().
A logic calculating a time is copied from libvirt, src/util/virtime.c.
[TODO]
Other functions below are used to output error messages in qemu.
- qerror_report() is called in many source codes.
- fprintf() is called in vl.c.
A timestamp should be added to these messages as well.
Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
---
util/qemu-error.c | 90 +++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 90 insertions(+), 0 deletions(-)
-- 1.7.1

Comments

Hello Seiji-san,
On 02/01/13 15:53, Seiji Aguchi wrote:
> A logic calculating a time is copied from libvirt, src/util/virtime.c.
> +static void error_print_timestamp(void)
Are gmtime() + strftime() unsuitable for some reason?
Also, since the timestamp is ultimately printed with millisecond
resolution, clock_gettime(CLOCK_REALTIME) could be replaced with the
more portable gettimeofday().
Thanks,
Laszlo

On 02/01/2013 11:28 AM, Laszlo Ersek wrote:
> Hello Seiji-san,> > On 02/01/13 15:53, Seiji Aguchi wrote:> >> A logic calculating a time is copied from libvirt, src/util/virtime.c.> >> +static void error_print_timestamp(void)> > Are gmtime() + strftime() unsuitable for some reason?
They are not async-signal safe, so they are not usable in between a
fork() and exec*(). Libvirt avoids them because it DOES log information
including timestamps in between fork/exec (and prior to writing
libvirt's virtime.c, we DID hit cases where libvirt would deadlock a
child process due to the non-safe use of a more naive timestamp
generator), but I don't know if qemu suffers from the same restriction
of when it has anything worth logging.

> > Are gmtime() + strftime() unsuitable for some reason?
>
> They are not async-signal safe, so they are not usable in between a
> fork() and exec*(). Libvirt avoids them because it DOES log information including timestamps in between fork/exec (and prior to
> writing libvirt's virtime.c, we DID hit cases where libvirt would deadlock a child process due to the non-safe use of a more naive
> timestamp generator), but I don't know if qemu suffers from the same restriction of when it has anything worth logging.
In my understanding, gmtime() + strftime() is usable in error_report() because it seems to be called
For just a option check.
>[TODO]
>Other functions below are used to output error messages in qemu.
>- qerror_report() is called in many source codes.
>- fprintf() is called in vl.c.
On the other hand, qerror_report() and fprintf () seems to be called in signal handers.
So, in those cases, some signal-safe functions should be used.
Seiji

Seiji Aguchi <seiji.aguchi@hds.com> writes:
>> > Are gmtime() + strftime() unsuitable for some reason?>> >> They are not async-signal safe, so they are not usable in between a>> fork() and exec*(). Libvirt avoids them because it DOES log>> information including timestamps in between fork/exec (and prior to>> writing libvirt's virtime.c, we DID hit cases where libvirt would>> deadlock a child process due to the non-safe use of a more naive>> timestamp generator), but I don't know if qemu suffers from the same>> restriction of when it has anything worth logging.>> In my understanding, gmtime() + strftime() is usable in error_report()> because it seems to be called> For just a option check.
error_report() is not bound to a particular task. It simply reports
either to the current monitor or to stderr. It's certainly not usuable
in a signal handler, or similar signal-unsafe context.
>>[TODO]>>Other functions below are used to output error messages in qemu.> >- qerror_report() is called in many source codes.> >- fprintf() is called in vl.c.>> On the other hand, qerror_report() and fprintf () seems to be called> in signal handers.
fprintf() in a signal handler? Where?
> So, in those cases, some signal-safe functions should be used.

Seiji Aguchi <seiji.aguchi@hds.com> writes:
> [Issue]> When we offer a customer support service and a problem happens > in a customer's system, we try to understand the problem by > comparing what the customer reports with message logs of the > customer's system.>> In this case, we often need to know when the problem happens.>> But, currently, there is no timestamp in qemu's error messages.> Therefore, we may not be able to understand the problem based on> error messages.>> [Solution]> This patch adds a timestamp to qemu's error message logged by> error_report().>> A logic calculating a time is copied from libvirt, src/util/virtime.c.
Do we really want to add timestamps to error messages unconditionally?
I don't doubt it's useful in your scenario, but most of the time it's
just annoying clutter.
> [TODO]> Other functions below are used to output error messages in qemu.> - qerror_report() is called in many source codes.
Not a problem, it ends up in qerror_print(), which calls error_report().
> - fprintf() is called in vl.c.
All over the place, not just vl.c. Also monitor_printf().
> A timestamp should be added to these messages as well.
[...]

On Sat, Feb 02, 2013 at 08:31:45AM +0100, Markus Armbruster wrote:
> Seiji Aguchi <seiji.aguchi@hds.com> writes:> > > [Issue]> > When we offer a customer support service and a problem happens > > in a customer's system, we try to understand the problem by > > comparing what the customer reports with message logs of the > > customer's system.> >> > In this case, we often need to know when the problem happens.> >> > But, currently, there is no timestamp in qemu's error messages.> > Therefore, we may not be able to understand the problem based on> > error messages.> >> > [Solution]> > This patch adds a timestamp to qemu's error message logged by> > error_report().> >> > A logic calculating a time is copied from libvirt, src/util/virtime.c.> > Do we really want to add timestamps to error messages unconditionally?> I don't doubt it's useful in your scenario, but most of the time it's> just annoying clutter.
Agreed, I think it should be an option. Also remember that management
tools can pass a pipe as stderr when starting QEMU, and then they can
add their preferred timestamping/log formatting outside of QEMU.
Stefan

Stefan Hajnoczi <stefanha@gmail.com> writes:
> On Sat, Feb 02, 2013 at 08:31:45AM +0100, Markus Armbruster wrote:>> Seiji Aguchi <seiji.aguchi@hds.com> writes:>> >> > [Issue]>> > When we offer a customer support service and a problem happens >> > in a customer's system, we try to understand the problem by >> > comparing what the customer reports with message logs of the >> > customer's system.>> >>> > In this case, we often need to know when the problem happens.>> >>> > But, currently, there is no timestamp in qemu's error messages.>> > Therefore, we may not be able to understand the problem based on>> > error messages.>> >>> > [Solution]>> > This patch adds a timestamp to qemu's error message logged by>> > error_report().>> >>> > A logic calculating a time is copied from libvirt, src/util/virtime.c.>> >> Do we really want to add timestamps to error messages unconditionally?>> I don't doubt it's useful in your scenario, but most of the time it's>> just annoying clutter.>> Agreed, I think it should be an option. Also remember that management> tools can pass a pipe as stderr when starting QEMU, and then they can> add their preferred timestamping/log formatting outside of QEMU.
That's actually how I'd solve the problem.

> > Do we really want to add timestamps to error messages unconditionally?> > I don't doubt it's useful in your scenario, but most of the time it's> > just annoying clutter.> > Agreed, I think it should be an option.
OK. I will add the timestamp as an option.
> Also remember that management tools can pass a pipe as stderr when starting QEMU, and> then they can add their preferred timestamping/log formatting outside of QEMU.
To get an exact timestamp, qemu should supply it to the messages.
Also, there are some customers who use qemu in a small environment.
In that case, they don't use the management like vdsm.
Seiji

> > [TODO]> > Other functions below are used to output error messages in qemu.> > - qerror_report() is called in many source codes.> > Not a problem, it ends up in qerror_print(), which calls error_report().
Thanks.
> > > - fprintf() is called in vl.c.> > All over the place, not just vl.c. Also monitor_printf().
I will take a look at the code in monitor_printf() and fprintf() in other places.
> > On the other hand, qerror_report() and fprintf () seems to be called> > in signal handers.> > fprintf() in a signal handler? Where?
Sorry, I misunderstood the code.
In signal handers, qemu just set a value like shutdown_signal and shutdown_pid or so.
Seiji

On Mon, Feb 04, 2013 at 03:58:29PM +0000, Seiji Aguchi wrote:
> > Also remember that management tools can pass a pipe as stderr when starting QEMU, and> > then they can add their preferred timestamping/log formatting outside of QEMU.> > To get an exact timestamp, qemu should supply it to the messages.> Also, there are some customers who use qemu in a small environment.> In that case, they don't use the management like vdsm.
Fair enough, I think an option doesn't hurt anyone.
Stefan

On 02/01/13 15:53, Seiji Aguchi wrote:
> /*> * Print an error message to current monitor if we have one, else to stderr.> * Format arguments like sprintf(). The result should not contain> @@ -207,6 +296,7 @@ void error_report(const char *fmt, ...)> {> va_list ap;> > + error_print_timestamp();> error_print_loc();> va_start(ap, fmt);> error_vprintf(fmt, ap);> -- 1.7.1
Side note: strictly in theory, this would result in two vfprintf()
calls. The message log would remain "record oriented", but (again, in
theory) another thread *might* get interleaved and mess up our format
with a parallel call to error_report() (or more deeply, to fprintf()).
Importantly I'm not talking about "corrupting data"; stdio streams are
automatically locked by the fprintf() family. The thing (theoretically,
possibly) corrupted would be our record-oriented message format, by
interleaved printfs.
(a) I'm not sure if this is possible at all in qemu.
(b) Anyway, there are two ways to fix it:
(b1) In error_report(), lock the stream across the two printfs with
flockfile(). Probably overkill, and in case we're printing to the
monitor, wasteful/useless. Or,
(b2) Format the full message (including the timestamp) into a buffer
(sprintf, vsprintf(), or their glib wrappers with automatic allocation,
if any) and print it with a single error_printf("%s", buf).
Anyway I absolutely do not insist on this, so sorry for the noise.
Thanks
Laszlo

Il 04/04/2013 16:54, Laszlo Ersek ha scritto:
> Side note: strictly in theory, this would result in two vfprintf()> calls. The message log would remain "record oriented", but (again, in> theory) another thread *might* get interleaved and mess up our format> with a parallel call to error_report() (or more deeply, to fprintf()).> > Importantly I'm not talking about "corrupting data"; stdio streams are> automatically locked by the fprintf() family. The thing (theoretically,> possibly) corrupted would be our record-oriented message format, by> interleaved printfs.> > (a) I'm not sure if this is possible at all in qemu.
It would be one more thing that is protected by the big QEMU lock.
Regarding your other comment, please use gmtime_r, not gmtime.
Paolo
> (b) Anyway, there are two ways to fix it:> > (b1) In error_report(), lock the stream across the two printfs with> flockfile(). Probably overkill, and in case we're printing to the> monitor, wasteful/useless. Or,
> (b2) Format the full message (including the timestamp) into a buffer> (sprintf, vsprintf(), or their glib wrappers with automatic allocation,> if any) and print it with a single error_printf("%s", buf).> > Anyway I absolutely do not insist on this, so sorry for the noise.