Mailing List Archive

[PATCH] Segments can span multiple clusters with tap:qcow
Hey,
In blktap's qcow we need to split up read/write requests if the requests
span multiple clusters. However, with our MAX_AIO_REQUESTS define we
assume that there is only ever a single aio request per tapdisk request,
and under heavy i/o we can run out of room, causing us to cancel
requests.

The attached patch dynamically allocates (based on cluster_bits) the
various io request queues the driver maintains.
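
Concretely, the splitting that drives the sizing looks something like
this (a sketch only -- submit_aio() and the trimmed-down state struct
are stand-ins for the real driver code, not the patch itself):

	#include <stddef.h>
	#include <stdint.h>

	/* Stand-in for the driver state; the real qcow state has much more. */
	struct tdqcow_state { size_t cluster_size; };

	/* Hypothetical aio submission hook; each call consumes one aio slot. */
	void submit_aio(struct tdqcow_state *s, uint64_t off, char *buf, size_t len);

	/* A page-sized segment can straddle cluster boundaries, so it fans
	 * out into one aio request per cluster it touches. */
	static void submit_segment(struct tdqcow_state *s, uint64_t off,
	                           char *buf, size_t len)
	{
		while (len) {
			size_t in_cluster = s->cluster_size - (off % s->cluster_size);
			size_t chunk = len < in_cluster ? len : in_cluster;

			submit_aio(s, off, buf, chunk);

			off += chunk;
			buf += chunk;
			len -= chunk;
		}
	}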

Signed-off-by: Mark McLoughlin <markmc@redhat.com>

Cheers,
Mark.
Re: [PATCH] Segments can span multiple clusters with tap:qcow
On 25/4/07 21:41, "Mark McLoughlin" <markmc@redhat.com> wrote:

> In blktap's qcow we need to split up read/write requests if the requests
> span multiple clusters. However, with our MAX_AIO_REQUESTS define we
> assume that there is only ever a single aio request per tapdisk request,
> and under heavy i/o we can run out of room, causing us to cancel
> requests.
>
> The attached patch dynamically allocates (based on cluster_bits) the
> various io request queues the driver maintains.

The current code allocates aio-request info for every segment in a request
ring (MAX_AIO_REQUESTS == BLK_RING_SIZE * MAX_SEGMENTS_PER_REQUEST). This
patch seems to take into account that each segment (part-of-page) can itself
be split into clusters, hence the page_size/cluster_size calculation, but
shouldn't this be multiplied by the existing MAX_AIO_REQUESTS? Otherwise you
provide only enough aio requests for one segment at a time, rather than a
request ring's worth of segments?
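
For concreteness (illustrative numbers, assuming the usual 32-entry
blkif ring, 11 segments per request, and 4K pages over the default 4K
clusters):

	/*
	 *  clusters per segment:  4096/4096 + 1   =   2
	 *  patch as posted:       max_aio_reqs    =   2
	 *  a ring's worth:        2 * 32 * 11     = 704
	 */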

-- Keir


Re: [PATCH] Segments can span multiple clusters with tap:qcow
Hi Keir,

On Thu, 2007-04-26 at 10:09 +0100, Keir Fraser wrote:
> On 25/4/07 21:41, "Mark McLoughlin" <markmc@redhat.com> wrote:
>
> > In blktap's qcow we need to split up read/write requests if the requests
> > span multiple clusters. However, with our MAX_AIO_REQUESTS define we
> > assume that there is only ever a single aio request per tapdisk request,
> > and under heavy i/o we can run out of room, causing us to cancel
> > requests.
> >
> > The attached patch dynamically allocates (based on cluster_bits) the
> > various io request queues the driver maintains.
>
> The current code allocates aio-request info for every segment in a request
> ring (MAX_AIO_REQUESTS == BLK_RING_SIZE * MAX_SEGMENTS_PER_REQUEST). This
> patch seems to take into account that each segment (part-of-page) can itself
> be split into clusters, hence the page_size/cluster_size calculation, but
> shouldn't this be multiplied by the existing MAX_AIO_REQUESTS? Otherwise you
> provide only enough aio requests for one segment at a time, rather than a
> request ring's worth of segments?

Absolutely, well spotted. I fixed that typo after testing, but
obviously forgot to run "quilt refresh" before sending ...

Fixed version attached.

Thanks,
Mark.
Re: [PATCH] Segments can span multiple clusters with tap:qcow
On 26/4/07 10:18, "Mark McLoughlin" <markmc@redhat.com> wrote:

>> The current code allocates aio-request info for every segment in a request
>> ring (MAX_AIO_REQUESTS == BLK_RING_SIZE * MAX_SEGMENTS_PER_REQUEST). This
>> patch seems to take into account that each segment (part-of-page) can itself
>> be split into clusters, hence the page_size/cluster_size calculation, but
>> shouldn't this be multiplied by the existing MAX_AIO_REQUESTS? Otherwise you
>> provide only enough aio requests for one segment at a time, rather than a
>> request ring's worth of segments?
>
> Absolutely, well spotted. I fixed that typo after testing, but
> obviously forgot to run "quilt refresh" before sending ...
>
> Fixed version attached.

This one doesn't build (free_aio_state: line 164: structure has no member
named 'private'). Perhaps free_aio_state() should take a 'struct
disk_driver' rather than a 'struct td_state'?
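
Presumably something along these lines (my guess at the shape of the
fix, not the actual code):

	-static void free_aio_state(struct td_state *s)
	+static void free_aio_state(struct disk_driver *dd)
	 {
	-	struct tdqcow_state *qs = (struct tdqcow_state *)s->private;
	+	struct tdqcow_state *qs = (struct tdqcow_state *)dd->private;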

-- Keir


Re: [PATCH] Segments can span multiple clusters with tap:qcow
On Thu, 2007-04-26 at 11:00 +0100, Keir Fraser wrote:
> On 26/4/07 10:18, "Mark McLoughlin" <markmc@redhat.com> wrote:
>
> >> The current code allocates aio-request info for every segment in a request
> >> ring (MAX_AIO_REQUESTS == BLK_RING_SIZE * MAX_SEGMENTS_PER_REQUEST). This
> >> patch seems to take into account that each segment (part-of-page) can itself
> >> be split into clusters, hence the page_size/cluster_size calculation, but
> >> shouldn't this be multiplied by the existing MAX_AIO_REQUESTS? Otherwise you
> >> provide only enough aio requests for one segment at a time, rather than a
> >> request ring's worth of segments?
> >
> > Absolutely, well spotted. I fixed that typo after testing, but
> > obviously forgot to run "quilt refresh" before sending ...
> >
> > Fixed version attached.
>
> This one doesn't build (free_aio_state: line 164: structure has no member
> named 'private'). Perhaps free_aio_state() should take a 'struct
> disk_driver' rather than a 'struct td_state'?

Gah, merge error going from 3.0.4 to 3.0.5. This one builds.

Thanks,
Mark.
RE: [PATCH] Segments can span multiple clusters with tap:qcow
Hi,

This patch is back to only allocating enough requests for one segment:

+ /* A segment (i.e. a page) can span multiple clusters */
+ s->max_aio_reqs = (getpagesize() / s->cluster_size) + 1;

In fact, this code allocates exactly two AIO requests for QCoW images
created by qcow-create, which have a default cluster size of 4K.

For a while now, tapdisk has supported EBUSY -- that is, if a plugin
returns -EBUSY to tapdisk, tapdisk will put the last segment back on its
queue and wait until the plugin has made progress before reissuing the
request. Thus users should not observe an error when QCoW runs out of
AIO requests. This is attested by the fact that even with only 2 AIO
requests allocated, QCoW block devices can handle a heavy load: I just
mkfs'ed and copied a 1GB file to a QCoW image with no problem --
although it took quite a long while to do so, since only two segments
were served at a time ;).
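
The pattern on the tapdisk side is roughly (hypothetical names, not
the actual source):

	#include <errno.h>

	struct segment;                                  /* stand-in types */
	struct segment *next_segment(void *queue);
	int  queue_segment(void *dd, struct segment *seg);
	void requeue_segment(void *queue, struct segment *seg);

	static void drain_queue(void *dd, void *queue)
	{
		struct segment *seg;

		while ((seg = next_segment(queue)) != NULL) {
			if (queue_segment(dd, seg) == -EBUSY) {
				/* Plugin is out of aio slots: park the segment
				 * and reissue once the plugin makes progress. */
				requeue_segment(queue, seg);
				break;
			}
		}
	}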

If you were observing errors while writing to QCoW devices, I'd like to
know how you were causing them -- we may need to make some other changes
to fix them. However, I'm not convinced that this patch is necessary.

Thanks,
Jake

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com [mailto:xen-devel-
> bounces@lists.xensource.com] On Behalf Of Mark McLoughlin
> Sent: Thursday, April 26, 2007 3:21 AM
> To: Keir Fraser
> Cc: xen-devel
> Subject: Re: [Xen-devel] [PATCH] Segments can span multiple clusters
> with tap:qcow
>
> On Thu, 2007-04-26 at 11:00 +0100, Keir Fraser wrote:
> > On 26/4/07 10:18, "Mark McLoughlin" <markmc@redhat.com> wrote:
> >
> > >> The current code allocates aio-request info for every segment in a
> > >> request ring (MAX_AIO_REQUESTS == BLK_RING_SIZE * MAX_SEGMENTS_PER_REQUEST).
> > >> This patch seems to take into account that each segment (part-of-page)
> > >> can itself be split into clusters, hence the page_size/cluster_size
> > >> calculation, but shouldn't this be multiplied by the existing
> > >> MAX_AIO_REQUESTS? Otherwise you provide only enough aio requests for
> > >> one segment at a time, rather than a request ring's worth of segments?
> > >
> > > Absolutely, well spotted. I fixed that typo after testing, but
> > > obviously forgot to run "quilt refresh" before sending ...
> > >
> > > Fixed version attached.
> >
> > This one doesn't build (free_aio_state: line 164: structure has no
> > member named 'private'). Perhaps free_aio_state() should take a 'struct
> > disk_driver' rather than a 'struct td_state'?
>
> Gah, merge error going from 3.0.4 to 3.0.5. This one builds.
>
> Thanks,
> Mark.

Re: [PATCH] Segments can span multiple clusters with tap:qcow
On 3/5/07 02:06, "Jake Wires" <Jake.Wires@xensource.com> wrote:

> This patch is back to only allocating enough requests for one segment:
>
> + /* A segment (i.e. a page) can span multiple clusters */
> + s->max_aio_reqs = (getpagesize() / s->cluster_size) + 1;
>
> In fact, this code allocates exactly two AIO requests for QCoW images
> created by qcow-create, which have a default cluster size of 4K.

What shall we do -- revert the whole patch or fix this line?

-- Keir


Re: [PATCH] Segments can span multiple clusters with tap:qcow
On Thu, 2007-05-03 at 18:40 +0100, Keir Fraser wrote:
> On 3/5/07 02:06, "Jake Wires" <Jake.Wires@xensource.com> wrote:
>
> > This patch is back to only allocating enough requests for one segment:
> >
> > + /* A segment (i.e. a page) can span multiple clusters */
> > + s->max_aio_reqs = (getpagesize() / s->cluster_size) + 1;
> >
> > In fact, this code allocates exactly two AIO requests for QCoW images
> > created by qcow-create, which have a default cluster size of 4K.
>
> What shall we do -- revert the whole patch or fix this line?

Goodness, words fail me. In all the confusion I created, that fix got
lost again. It should be:

+ s->max_aio_reqs = ((getpagesize() / s->cluster_size) + 1) * MAX_SEGMENTS_PER_REQ * MAX_REQUESTS;
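
With, say, 4K pages, the default 4K clusters, MAX_SEGMENTS_PER_REQ == 11
and MAX_REQUESTS == 32 (assumed values), that works out as:

	/* (4096/4096 + 1) * 11 * 32 = 2 * 11 * 32 = 704 aio requests:
	 * a full ring's worth of segments, each allowed to straddle a
	 * cluster boundary. */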

Thanks,
Mark.


RE: [PATCH] Segments can span multiple clusters with tap:qcow
On Wed, 2007-05-02 at 18:06 -0700, Jake Wires wrote:
> Hi,
>
> This patch is back to only allocating enough requests for one segment:
>
> + /* A segment (i.e. a page) can span multiple clusters */
> + s->max_aio_reqs = (getpagesize() / s->cluster_size) + 1;
>
> In fact, this code allocates exactly two AIO requests for QCoW images
> created by qcow-create, which have a default cluster size of 4K.
>
> For a while now, tapdisk has supported EBUSY -- that is, if a plugin
> returns -EBUSY to tapdisk, tapdisk will put the last segment back on its
> queue and wait until the plugin has made progress before reissuing the
> request.

Yep.

> Thus users should not observe an error when QCoW runs out of
> AIO requests. This is attested by the fact that even with only 2 AIO
> requests allocated, QCoW block devices can handle a heavy load: I just
> mkfs'ed and copied a 1GB file to a QCoW image with no problem --
> although it took quite a long while to do so, since only two segments
> were served at a time ;).

Uggh.

> If you were observing errors while writing to QCoW devices, I'd like to
> know how you were causing them -- we may need to make some other changes
> to fix them. However, I'm not convinced that this patch is necessary.

I was seeing errors with the pre-EBUSY code, but I figured we would
still prefer to allocate the max number of segments.

Point taken that it's not critical, though. At this point, reverting
until after 3.1.0 wouldn't be crazy.

Thanks,
Mark.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel