Comments

This patch contains the major APIs in the library.
Important APIs:
1 QBroker. This structure is used to retrieve errors; every thread must
create one first. Later, thread-related stuff may be added to it.
2 QBlockState. It stands for a block image object.
3 QBlockInfoImageStatic. Now it is not folded with location and format.
4 ABI was kept with reserved members.
structure, because it would cause caller more codes to find one member.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
---
libqblock/libqblock.c | 859 +++++++++++++++++++++++++++++++++++++++++++++++++
libqblock/libqblock.h | 251 ++++++++++++++
2 files changed, 1110 insertions(+), 0 deletions(-)
create mode 100644 libqblock/libqblock.c
create mode 100644 libqblock/libqblock.h

On 03/09/2012 11:18, Wenchao Xia wrote:
> 1 QBroker. These structure was used to retrieve errors, every thread must
> create one first, Later maybe thread related staff could be added into it.
Can you use GError instead?
> 3 QBlockInfoImageStatic. Now it is not folded with location and format.
What does "Static" mean?
Paolo

On 03/09/2012 15:56, Eric Blake wrote:
> Exactly how does the *pnum argument work? This interface looks like it
> isn't fully thought out yet. Either I want to know if a chunk of
> sectors is allocated (I supply start and length of sectors to check),
> regardless of how many sectors beyond that point are also allocated
> (pnum makes no sense);
pnum makes sense if the [start, start+length) range includes both
allocated and unallocated sectors.
> or I want to know how many sectors are allocated
> from a given point (I supply start, and the function returns length, so
> nb_sectors makes no sense).
This operation could be O(number of blocks in disk) worst case, so it
makes sense to provide nb_sectors as an upper bound. nb_sectors is
typically dictated by the size of your buffer.
That said, QEMU's internal bdrv_is_allocated function does have one not
entirely appealing property: the block at start + *pnum might have the
same state as the block at start + *pnum - 1, even if *pnum < length.
We may want to work around this in libqblock, but we could also simply
document it.
Paolo
> Either way, I think you are supplying too
> many parameters for how I envision checking for allocated sectors.
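
The calling convention discussed in this message can be sketched as a caller loop. This is only an illustration, not libqblock or QEMU code: toy_is_allocated is a hypothetical stand-in that follows the same convention as bdrv_is_allocated (start sector, upper bound nb_sectors, result length in *pnum) over an in-memory map, and it deliberately stops at an assumed 4-sector chunk boundary so that two adjacent results can carry the same state. The caller merges such results itself.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SECTORS 16
#define TOY_CHUNK   4   /* hypothetical: a query never crosses a 4-sector chunk */

/* Hypothetical allocation map: 1 = allocated, 0 = unallocated. */
static const int toy_map[TOY_SECTORS] = {
    1, 1, 1, 1,  1, 1, 0, 0,  0, 0, 0, 0,  1, 1, 1, 1
};

/*
 * Returns the state of sector_num and stores in *pnum how many sectors, at
 * most nb_sectors, are known to share that state. It makes no statement
 * about sector_num + *pnum and later.
 */
static int toy_is_allocated(int64_t sector_num, int nb_sectors, int *pnum)
{
    int state, n = 0;
    int64_t chunk_end;

    assert(sector_num >= 0 && nb_sectors >= 0 && pnum != 0);
    if (sector_num >= TOY_SECTORS) {
        *pnum = 0;              /* beyond the end of the image */
        return 0;
    }
    state = toy_map[sector_num];
    chunk_end = (sector_num / TOY_CHUNK + 1) * TOY_CHUNK;
    while (n < nb_sectors && sector_num + n < chunk_end &&
           toy_map[sector_num + n] == state) {
        n++;
    }
    *pnum = n;
    return state;
}

/*
 * Count maximal same-state extents in [start, start + length). Consecutive
 * results with an unchanged state are merged by the caller.
 */
static int toy_count_extents(int64_t start, int length)
{
    int extents = 0, prev = -1;

    while (length > 0) {
        int pnum;
        int state = toy_is_allocated(start, length, &pnum);

        if (pnum == 0) {
            break;              /* nothing more to report */
        }
        if (state != prev) {
            extents++;
            prev = state;
        }
        start += pnum;
        length -= pnum;
    }
    return extents;
}
```

Here nb_sectors acts purely as an upper bound on the work done per call, and a short *pnum does not by itself mean that the state changes at start + *pnum.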

On 2012-9-3 21:18, Paolo Bonzini wrote:
> On 03/09/2012 11:18, Wenchao Xia wrote:
>> 1 QBroker. These structure was used to retrieve errors, every thread must
>> create one first, Later maybe thread related staff could be added into it.
>
> Can you use GError instead?
>
I read through the GError documentation; GError is defined as follows:
struct GError {
    GQuark domain;
    gint code;
    gchar *message;
};
I am worried about the message member: I guess the program would be
aborted on OOM, which I was trying to avoid, so I used char err_msg[1024]
in my code to keep things simpler.
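
A fixed-size message buffer of this kind could look roughly as follows. This is a sketch under assumptions: only the char err_msg[1024] idea comes from the message above, while the struct name ToyBroker, the err_ret field, and the toy_set_error helper are made up for illustration and are not the library's API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define TOY_ERR_MSG_LEN 1024

/* Hypothetical error holder; the storage lives inside the struct. */
struct ToyBroker {
    int  err_ret;                     /* negative error code, 0 if none */
    char err_msg[TOY_ERR_MSG_LEN];    /* no allocation needed on the error path */
};

/* Record an error without allocating; overlong messages are truncated. */
static void toy_set_error(struct ToyBroker *b, int code, const char *fmt, ...)
{
    va_list ap;

    assert(b != NULL && fmt != NULL);
    b->err_ret = code;
    va_start(ap, fmt);
    vsnprintf(b->err_msg, sizeof(b->err_msg), fmt, ap);
    va_end(ap);
}
```

The trade-off is that every broker carries 1 KiB of message storage, in exchange for an error path that cannot itself fail on allocation.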
>> 3 QBlockInfoImageStatic. Now it is not folded with location and format.
>
> What does "Static" mean?
>
It is about sorting the information into the following kinds:
1) static. Values defined at creation or modification time,
mostly settings, which are not changed automatically by I/O.
2) dynamic. Information that changes with I/O and other
operations, such as allocated_size and snapshots.
3) statistics.
Only the static kind is provided now, so I added the _static suffix.
> Paolo
>

Thank you for the careful review of my code; I will make a note of
the typos you mentioned.
> On 09/03/2012 03:18 AM, Wenchao Xia wrote:
>> This patch contains the major APIs in the library.
>> Important APIs:
>> 1 QBroker. These structure was used to retrieve errors, every thread must
>> create one first, Later maybe thread related staff could be added into it.
>> 2 QBlockState. It stands for an block image object.
>> 3 QBlockInfoImageStatic. Now it is not folded with location and format.
>> 4 ABI was kept with reserved members.
>>
>> structure, because it would cause caller more codes to find one member.
>
> Commit message snafu?
>
A wrong paste, sorry.
>> +/**
>> + * libqblock_init: Initialize the library
>> + */
>> +void libqblock_init(void);
>
> Is this function safe to call more than once? Even tighter, is it safe
> to call this function simultaneously from multiple threads?
>
No, it should be called only once, and no other thread should call
it again; I will document that. As for the multi-threaded use case, the
qemu block layer cannot support that now; I will fix that later.
>> +
>> +/**
>> + * qb_broker_new: allocate a new broker
>> + *
>> + * Broker is used to pass operation to libqblock, and got feed back from it.
>
> s/got feed back/get feedback/
>
>> + *
>> + * Returns 0 on success, negative value on fail.
>
> Is there any particular interpretation to this negative value, such as
> negative errno constant, or always -1?
>
Yes, negative values are defined with macros in the header file.
>> +
>> +/**
>> + * qb_state_new: allocate a new QBloctState struct
>
> s/Bloct/Block/
>
>> + *
>> + * Following qblock action were based on this struct
>
> Didn't read well. Did you mean:
>
> Subsequent qblock actions will use this struct
>
Yes.
>> + *
>> + * Returns 0 if succeed, negative value on fail.
>
> Again, is there any particular meaning to which negative value is returned?
>
>> +
>> +/**
>> + * qb_open: open a block object.
>> + *
>> + * return 0 on success, negative on fail.
>> + *
>> + * @broker: operation broker.
>> + * @qbs: pointer to struct QBlockState.
>> + * @loc: location options for open, how to find the image.
>> + * @fmt: format options, how to extract the data, only valid member now is
>> + fmt->fmt_type, set NULL if you want auto discovery the format.
>
> set to NULL if you want to auto-discover the format
>
> Maybe also add a warning about the inherent security risks of attempting
> format auto-discovery (any raw image must NOT be probed, as the raw
> image can emulate any other format and cause qemu to chase down chains
> where it should not).
>
It seems qemu-img can correctly determine that an image is raw by
probing. Do you mean giving a warning that the image is probably in
some format that qemu does not support, such as a VirtualBox image?
>> + * @flag: behavior control flags.
>
> What flags are currently defined?
>
The flags are defined in the header file, such as LIBQBLOCK_O_RDWR;
I will document them.
>> + */
>> +int qb_open(struct QBroker *broker,
>> + struct QBlockState *qbs,
>> + struct QBlockOptionLoc *loc,
>> + struct QBlockOptionFormat *fmt,
>> + int flag);
>> +
>> +/**
>> + * qb_close: close a block object.
>> + *
>> + * qb_flush is automaticlly done inside.
>
> s/automaticlly/automatically/
>
>> +/**
>> + * qb_create: create a block image or object.
>> + *
>> + * Note: Create operation would not open the image automatically.
>> + *
>> + * return negative on fail, 0 on success.
>> + *
>> + * @broker: operation broker.
>> + * @qbs: pointer to struct QBlockState.
>> + * @loc: location options for open, how to find the image.
>> + * @fmt: format options, how to extract the data.
>> + * @flag: behavior control flags.
>
> Again, what flags are defined?
>
>> +
>> +/* sync access */
>> +/**
>> + * qb_read: block sync read.
>> + *
>> + * return negative on fail, 0 on success.
>
> Shouldn't this instead return the number of successfully read bytes, to
> allow for short reads if offset exceeds end-of-file? If so, should it
> return ssize_t instead of int?
>
I had this idea before too; let me check whether the qemu block layer
can provide this functionality.
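
The short-read convention suggested in the review can be sketched like this. It is a toy over an in-memory buffer, assuming POSIX ssize_t and off_t; struct ToyImage and toy_read are hypothetical names and say nothing about what the qemu block layer can actually provide.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t, off_t (POSIX) */

/* Hypothetical in-memory "image" standing in for a block object. */
struct ToyImage {
    const unsigned char *data;
    size_t size;
};

/*
 * Returns the number of bytes actually copied: fewer than len when the
 * request runs past the end of the image, 0 at or beyond the end, and -1
 * on invalid arguments. buf is non-const because the function writes to it.
 */
static ssize_t toy_read(const struct ToyImage *img, void *buf,
                        size_t len, off_t offset)
{
    size_t avail;

    if (img == NULL || buf == NULL || offset < 0) {
        return -1;
    }
    assert(img->data != NULL || img->size == 0);
    if ((unsigned long long)offset >= img->size) {
        return 0;                       /* nothing left to read */
    }
    avail = img->size - (size_t)offset;
    if (len > avail) {
        len = avail;                    /* short read at end of image */
    }
    memcpy(buf, img->data + offset, len);
    return (ssize_t)len;
}
```

A caller that needs exactly len bytes loops until the running total reaches len or the function returns 0 or a negative value.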
>> + *
>> + * @broker: operation broker.
>> + * @qbs: pointer to struct QBlockState.
>> + * @buf: buffer that receive the content.
>
> s/receive/receives/
>
>> + * @len: length to read.
>
> Is there a magic length for reading the entire file in one go?
>
No. If there were, where would I put the result?
>> + * @offset: offset in the block data.
>> + */
>> +int qb_read(struct QBroker *broker,
>> + struct QBlockState *qbs,
>> + const void *buf,
>> + size_t len,
>> + off_t offset);
>
> You do realize that size_t and off_t are not necessarily the same width;
> but I'm okay with limiting to size_t reads.
>
>> +/**
>> + * qb_write: block sync write.
>> + *
>> + * return negative on fail, 0 on success.
>
> Again, this should probably return number of successfully written bytes,
> as an ssize_t.
>
>> + *
>> + * @broker: operation broker.
>> + * @qbs: pointer to struct QBlockState.
>> + * @buf: buffer that receive the content.
>
> s/receive/supplies/
>
>> +/* advance image APIs */
>> +/**
>> + * qb_is_allocated: check if following sectors was allocated on the image.
>> + *
>> + * return negative on fail, 0 or 1 on success. 0 means unallocated, 1 means
>> + *allocated.
>
> Formatting is off.
>
coming later.
>> + *
>> + * @broker: operation broker.
>> + * @qbs: pointer to struct QBlockState.
>> + * @sector_num: start sector number. If 'sector_num' is beyond the end of the
>> + *disk image the return value is 0 and 'pnum' is set to 0.
>> + * @nb_sectors: how many sector would be checked, it is the max value 'pnum'
>> + *should be set to. If nb_sectors goes beyond the end of the disk image it
>> + *will be clamped.
>> + * @pnum: pointer to receive how many sectors are allocated or unallocated.
>
> This interface requires the user to know how big a sector is. Would it
> be any more convenient to the user to pass offsets, rather than sector
> numbers; and/or have a function for converting between offsets and
> sector numbers?
>
OK, I need a function returning how big a sector is in an image.
>> + */
>> +int qb_is_allocated(struct QBroker *broker,
>> + struct QBlockState *qbs,
>> + int64_t sector_num,
>> + int nb_sectors,
>
> Shouldn't nb_sectors be size_t?
>
>> + int *pnum);
>
> Exactly how does the *pnum argument work? This interface looks like it
> isn't fully thought out yet. Either I want to know if a chunk of
> sectors is allocated (I supply start and length of sectors to check),
> regardless of how many sectors beyond that point are also allocated
> (pnum makes no sense); or I want to know how many sectors are allocated
> from a given point (I supply start, and the function returns length, so
> nb_sectors makes no sense). Either way, I think you are supplying too
> many parameters for how I envision checking for allocated sectors.
>

On 04/09/2012 05:15, Wenchao Xia wrote:
>>
>> Can you use GError instead?
>>
> read through the GError doc, GError is defined as following:
> struct GError {
> GQuark domain;
> gint code;
> gchar *message;
> };
> I am worried about the message member, I guess program would be
> aborted if OOM, which I was tring to avoid, so I used char err_msg[1024]
> in my code, and make things simpler.
That's true. On the other hand, and IMHO, not aborting in the library
code is a non-goal as long as the rest of the block layer still does.
>>> 3 QBlockInfoImageStatic. Now it is not folded with location and
>>> format.
>>
>> What does "Static" mean?
>>
> It is about sorting the information into following kinds:
> 1) static. It is values that defined at creating time/modifying time,
> mostly some settings, and it would not be automatically changed in I/O.
> 2) dynamic. Some information that would changes in I/O and other
> operations, such as allocated_size, snapshots.
> 3) statistics.
> Now only static one is provided, so I added _static suffix.
Makes sense, thanks for the clarification. Perhaps QBlockStaticInfo is
a shorter and simpler name?
Paolo

On 2012-9-3 22:05, Paolo Bonzini wrote:
> On 03/09/2012 15:56, Eric Blake wrote:
>> Exactly how does the *pnum argument work? This interface looks like it
>> isn't fully thought out yet. Either I want to know if a chunk of
>> sectors is allocated (I supply start and length of sectors to check),
>> regardless of how many sectors beyond that point are also allocated
>> (pnum makes no sense);
>
> pnum makes sense if the [start, start+length) range includes both
> allocated and unallocated sectors.
>
>> or I want to know how many sectors are allocated
>> from a given point (I supply start, and the function returns length, so
>> nb_sectors makes no sense).
>
About using byte offsets instead of sectors: I think sectors are better,
because the allocation status is tracked per sector; all bytes in a
sector have the same status.
> This operation could be O(number of blocks in disk) worst case, so it
> makes sense to provide nb_sectors as an upper bound. nb_sectors is
> typically dictated by the size of your buffer.
>
> That said, QEMU's internal bdrv_is_allocated function does have one not
> entirely appealing property: the block at start + *pnum might have the
> same state as the block at start + *pnum - 1, even if *pnum < length.
> We may want to work around this in libqblock, but we could also simply
> document it.
>
> Paolo
>
int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
                      int nb_sectors, int *pnum)
Will the issue happen when nb_sectors > *pnum? If so, it seems to be a bug,
because the caller is asking for the allocation status of a range of
sectors, and *pnum would not reflect the real status.
>> Either way, I think you are supplying too
>> many parameters for how I envision checking for allocated sectors.
>
Yes, it is a bit confusing. How about:
int qb_check_allocate_status(struct QBroker *broker,
                             struct QBlockState *qbs,
                             offset sector_start,
                             size_t sector_number,
                             size_t *pnum,
                             int *status)
The user passes sector_start and sector_number to ask for a check of this
range; the following parameters receive the status, and the return value
indicates an error.
>

On 04/09/2012 09:05, Wenchao Xia wrote:
>> That said, QEMU's internal bdrv_is_allocated function does have one not
>> entirely appealing property: the block at start + *pnum might have the
>> same state as the block at start + *pnum - 1, even if *pnum < length.
>> We may want to work around this in libqblock, but we could also simply
>> document it.
>>
>> Paolo
>>
> int bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
> int nb_sectors,int *pnum)
> will the issue happen when nb_sectors > *pnum? if so it seems a bug,
> because caller is asking a range of sectors's allocation status, and
> *pnum did not reflect the real status.
Actually it does.
bdrv_is_allocated says it didn't find out anything about sector start +
*pnum and later.
>>> Either way, I think you are supplying too
>>> many parameters for how I envision checking for allocated sectors.
>>
> yes, it is a bit confusing, how about:
>
> int qb_check_allocate_status(struct QBroker *broker,
> struct QBlockState *qbs,
> offset sector_start,
> size_t sector_number,
> size_t *pnum,
> int *status)
> user input sector_start and sector_number to ask check it in this range,
> following parameter receive the status, return indicate exception.
That's ok too.
But do not use size_t for sectors. Testing, or reading, or writing 2 TB
with a single call is more than enough.
Paolo
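
To make the size argument concrete, here is a small arithmetic sketch. It assumes 512-byte sectors, which is an assumption about the image and not something this message states; the helper names are hypothetical. Under that assumption, a signed 32-bit sector count covers just under 1 TiB per call and an unsigned 32-bit count just under 2 TiB.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SECTOR_SIZE 512   /* assumption: 512-byte sectors */

/* Bytes covered by a 32-bit sector count; the product needs 64 bits. */
static uint64_t toy_sectors_to_bytes(uint32_t nb_sectors)
{
    return (uint64_t)nb_sectors * TOY_SECTOR_SIZE;
}

/* Convert a sector-aligned, non-negative byte offset to a sector number. */
static int64_t toy_offset_to_sector(int64_t offset)
{
    assert(offset >= 0 && offset % TOY_SECTOR_SIZE == 0);
    return offset / TOY_SECTOR_SIZE;
}
```

With these helpers, toy_sectors_to_bytes(INT32_MAX) is 2^40 - 512 bytes and toy_sectors_to_bytes(UINT32_MAX) is 2^41 - 512 bytes, so a 32-bit count is already ample for a single request.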

On 2012-9-4 14:50, Paolo Bonzini wrote:
> On 04/09/2012 05:15, Wenchao Xia wrote:
>>>
>>> Can you use GError instead?
>>>
>> read through the GError doc, GError is defined as following:
>> struct GError {
>> GQuark domain;
>> gint code;
>> gchar *message;
>> };
>> I am worried about the message member, I guess program would be
>> aborted if OOM, which I was tring to avoid, so I used char err_msg[1024]
>> in my code, and make things simpler.
>
> That's true. On the other hand, and IMHO, not aborting in the library
> code is a non-goal as long as the rest of the block layer still does.
>
This is a hard problem for me; do you have any suggestions about the OOM
issue? Using more of GLib, such as GError and the GSource main event
loop, would make this issue harder to solve later.
Is there an alternative library as robust as GLib that reports OOM
instead of exiting on Linux?
>>>> 3 QBlockInfoImageStatic. Now it is not folded with location and
>>>> format.
>>>
>>> What does "Static" mean?
>>>
>> It is about sorting the information into following kinds:
>> 1) static. It is values that defined at creating time/modifying time,
>> mostly some settings, and it would not be automatically changed in I/O.
>> 2) dynamic. Some information that would changes in I/O and other
>> operations, such as allocated_size, snapshots.
>> 3) statistics.
>> Now only static one is provided, so I added _static suffix.
>
> Makes sense, thanks for the clarification. Perhaps QBlockStaticInfo is
> a shorter and simpler name?
>
OK.
> Paolo
>

On 09/04/2012 12:42 AM, Wenchao Xia wrote:
>>> +/**
>>> + * libqblock_init: Initialize the library
>>> + */
>>> +void libqblock_init(void);
>>
>> Is this function safe to call more than once? Even tighter, is it safe
>> to call this function simultaneously from multiple threads?
>>
> No, it should be only called once, any other thread should not call
> it again, will document it. About the multiple thread user case, qemu
> block layer can't support that now, will fix that later.
What a shame. That makes libraries much harder to use. It is much
nicer to design a library where the initialization is idempotent and
thread-safe, to be called from multiple threads. Consider:
app links against liba and libb;
liba links against libqb
libb links against libqb
How am I supposed to write liba and libb to guarantee only one single
race-free call to libqblock_init, unless libqblock_init() is idempotent?
Also, should there be a counterpart function for tearing down the
resources used by the library when it is no longer needed? If so, then
that implies reference counting - each call to init atomically increases
the refcount, and the library frees resources only when the refcount
atomically goes back to 0.
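
A reference-counted init/teardown pair along these lines might look as follows. This is a sketch, not the library's code: the toy_ names are hypothetical, and it protects the counter with a pthread mutex rather than atomic operations, which gives the same "first call sets up, last call tears down" behavior in a simpler form.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static int toy_refcount;     /* outstanding toy_lib_init() calls */
static int toy_setup_runs;   /* how often the real setup ran (illustration only) */

/* Safe to call from any thread, any number of times. */
static void toy_lib_init(void)
{
    pthread_mutex_lock(&toy_lock);
    if (toy_refcount++ == 0) {
        toy_setup_runs++;    /* one-time global setup would go here */
    }
    pthread_mutex_unlock(&toy_lock);
}

/* Each call undoes one toy_lib_init(); the last one releases resources. */
static void toy_lib_cleanup(void)
{
    pthread_mutex_lock(&toy_lock);
    assert(toy_refcount > 0);
    if (--toy_refcount == 0) {
        /* global teardown would go here */
    }
    pthread_mutex_unlock(&toy_lock);
}
```

With this shape, liba and libb can each call toy_lib_init() and toy_lib_cleanup() without knowing about each other.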
>>> + * @fmt: format options, how to extract the data, only valid member
>>> now is
>>> + fmt->fmt_type, set NULL if you want auto discovery the format.
>>
>> set to NULL if you want to auto-discover the format
>>
>> Maybe also add a warning about the inherent security risks of attempting
>> format auto-discovery (any raw image must NOT be probed, as the raw
>> image can emulate any other format and cause qemu to chase down chains
>> where it should not).
>>
> it seems qemu-img could find out that an image is raw correctly by
> probing, do you mean give a warning saying that this image is probably
> some formats that qemu do not supported, such as virtual box's image?
No, you got it backwards. For all non-raw images, qemu can correctly
probe the image. But for raw images, the guest may have set enough
information in the image to make a probe _think_ that the image is
non-raw, and therefore cause qemu to misbehave. That is, the security
hole is choosing to probe a raw image, because the probe will not always
successfully return raw.
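
The failure mode can be illustrated with a toy probe. This is a deliberately simplified sketch and not QEMU's probing code: toy_probe only checks for the qcow2 magic bytes 'Q' 'F' 'I' 0xfb at the start of the image, and the ToyFormat enum and function names are hypothetical. A raw image whose guest has written those bytes to its first sector is then reported as qcow2, which is why an explicit format from the caller should win over probing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum ToyFormat { TOY_FMT_RAW, TOY_FMT_QCOW2 };

/* Toy probe: recognizes only the qcow2 magic; anything else is "raw". */
static enum ToyFormat toy_probe(const unsigned char *first_sector, size_t len)
{
    static const unsigned char qcow2_magic[4] = { 'Q', 'F', 'I', 0xfb };

    assert(first_sector != NULL);
    if (len >= sizeof(qcow2_magic) &&
        memcmp(first_sector, qcow2_magic, sizeof(qcow2_magic)) == 0) {
        return TOY_FMT_QCOW2;
    }
    return TOY_FMT_RAW;
}

/* Trust the caller's explicit format; probe only when none was given. */
static enum ToyFormat toy_pick_format(const enum ToyFormat *explicit_fmt,
                                      const unsigned char *first_sector,
                                      size_t len)
{
    return explicit_fmt ? *explicit_fmt : toy_probe(first_sector, len);
}
```

Passing NULL as explicit_fmt corresponds to the auto-discovery case that the documentation should warn about.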

On 04/09/2012 13:35, Eric Blake wrote:
>> > No, it should be only called once, any other thread should not call
>> > it again, will document it. About the multiple thread user case, qemu
>> > block layer can't support that now, will fix that later.
> What a shame. That makes libraries much harder to use. It is much
> nicer to design a library where the initialization is idempotent and
> thread-safe, to be called from multiple threads. Consider:
>
> app links against liba and libb;
> liba links against libqb
> libb links against libqb
>
> How am I supposed to write liba and libb to guarantee only one single
> race-free call to libqblock_init, unless libqblock_init() is idempotent?
I agree, libqblock_init should use pthread_once (or we can add QemuOnce
to qemu-thread-*.c).
Paolo
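
A minimal sketch of the pthread_once approach, assuming POSIX threads are available. The toy_ names and the run counter are hypothetical and exist only to show that the setup body runs a single time no matter how often, or from how many threads, the init function is called.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t toy_once = PTHREAD_ONCE_INIT;
static int toy_init_runs;   /* counts how often the setup actually ran */

static void toy_do_init(void)
{
    toy_init_runs++;        /* one-time global setup would go here */
}

/* Idempotent and thread-safe: pthread_once runs toy_do_init at most once. */
static void toy_once_init(void)
{
    int rc = pthread_once(&toy_once, toy_do_init);

    assert(rc == 0);
    (void)rc;               /* keeps the build quiet when NDEBUG is defined */
}
```

Unlike a reference-counted scheme, this variant has no teardown counterpart; it only answers the "safe to call more than once" question.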

> On 04/09/2012 05:15, Wenchao Xia wrote:
>>>
>>> Can you use GError instead?
>>>
>> read through the GError doc, GError is defined as following:
>> struct GError {
>> GQuark domain;
>> gint code;
>> gchar *message;
>> };
>> I am worried about the message member, I guess program would be
>> aborted if OOM, which I was tring to avoid, so I used char err_msg[1024]
>> in my code, and make things simpler.
>
> That's true. On the other hand, and IMHO, not aborting in the library
> code is a non-goal as long as the rest of the block layer still does.
>
About GError: after a look at its documentation, I think it provides
capabilities similar to my implementation, with no key feature added.
Considering the memory issue, I would like to drop GError for now.
>>>> 3 QBlockInfoImageStatic. Now it is not folded with location and
>>>> format.
>>>
>>> What does "Static" mean?
>>>
>> It is about sorting the information into following kinds:
>> 1) static. It is values that defined at creating time/modifying time,
>> mostly some settings, and it would not be automatically changed in I/O.
>> 2) dynamic. Some information that would changes in I/O and other
>> operations, such as allocated_size, snapshots.
>> 3) statistics.
>> Now only static one is provided, so I added _static suffix.
>
> Makes sense, thanks for the clarification. Perhaps QBlockStaticInfo is
> a shorter and simpler name?
>
> Paolo
>