Strange Problem when using DRBD + ext3
Hi,

I'm running into a nasty problem when copying files via scp onto an ext3
filesystem on a drbd device - see attached oops.txt.

The problem has not yet been seen when using ftp or rsync to copy the same
directory, but it can be reproduced 100% of the time using scp. It also does
not occur when mounting /dev/sda10 directly without the drbd device, or when
mounting /dev/nb0 as ext2. The drbd device is in the normal, connected state
at the beginning of the copy.

Here's what shows up in the logs of the secondary node:
Sep 5 13:59:14 delphi2 kernel: drbd0: magic?? m: 1602992775 c: 64534 l: 29694
Sep 5 13:59:14 delphi2 kernel: drbd0: Connection lost.(pc=0,uc=0)
Sep 5 13:59:14 delphi2 kernel: drbd0: Connection established.
Sep 5 13:59:14 delphi2 kernel: drbd0: size=497983 KB / blksize=4096 B
Sep 5 13:59:14 delphi2 kernel: drbd0: magic?? m: -1179153002 c: 65064 l: 52046
Sep 5 13:59:14 delphi2 kernel: drbd0: Connection lost.(pc=0,uc=0)

command line:

scp -pr otest:/usr/local/apache/htdocs/* /mnt/test/htdocs

Copied data:

[root@tropica2 test]# du -s *
109824 htdocs
32804 journal.dat

[root@tropica2 test]# find htdocs |wc
1456 1456 69523

Versions:
Linux: 2.2.17
ext3 patches: 0.0.2f
drbd version: 0.5.7 (.tar.gz from website)
scp: SSH Version OpenSSH_2.1.1

Loaded Modules:
[root@tropica2 log]# lsmod
Module Size Used by
drbd 30432 1
ecc 8148 0 (unused)
w83781d 16984 0 (unused)
sensors 5596 0 [w83781d]
i2c-isa 1124 0 (unused)
i2c-viapro 3520 0 (unused)
i2c-core 11676 0 [w83781d sensors i2c-isa i2c-viapro]

(lm-sensors + ecc monitoring module + drbd)

Any idea where to go looking for this one?

Bye, Martin

"you have moved your mouse, please reboot to make this change take effect"
--------------------------------------------------
Martin Bene vox: +43-316-813824
simon media fax: +43-316-813824-6
Andreas-Hofer-Platz 9 e-mail: mb@example.com
8010 Graz, Austria
--------------------------------------------------
finger mb@example.com for PGP public key
Re: Strange Problem when using DRBD + ext3
Hi,

On Tue, Sep 05, 2000 at 02:45:19PM +0200, Martin Bene wrote:
>
> I'm running into a nasty problem when copying files via scp onto an ext3
> filesystem on a drbd device - see attached oops.txt.

Looks like exactly the same misbehaviour coming from drbd as the one
that stops raid1/5 from working with journaling.

> Sep 5 13:58:49 tropica2 kernel: Assertion failure in jfs_prelock_buffer_check() at journal.c line 337: "bh->b_jlist == 0 || bh->b_jlist == BJ_LogCtl || bh->b_jlist == BJ_IO"

What this means is that the buffer is being journaled by the
filesystem, but that drbd is trying to force it out to disk anyway.
That is bad --- if drbd commits buffers to storage before the
journaling system wants it to, then we have a write ordering violation
and an unexpected reboot can result in incorrect journal recovery.
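
To make the ordering constraint concrete, here is a minimal user-space
sketch. It is not ext3 code: the event names and the ordering_ok() helper are
hypothetical, and the model is simplified to a single transaction. It assumes
the usual write-ahead rule that a block may only be written to its final
location after the transaction that logs it has been committed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical events for one transaction, in the order they hit the disk. */
enum disk_event {
    EV_JOURNAL_WRITE,   /* block copy written into the journal          */
    EV_COMMIT,          /* commit record for the transaction is on disk */
    EV_INPLACE_WRITE    /* block written to its final on-disk location  */
};

/*
 * Returns true if no in-place write reaches the disk before the commit
 * record. An early in-place write is the kind of ordering violation that
 * can leave recovery with an inconsistent view after a crash.
 */
static bool ordering_ok(const enum disk_event *ev, size_t n)
{
    bool committed = false;

    for (size_t i = 0; i < n; i++) {
        if (ev[i] == EV_COMMIT)
            committed = true;
        else if (ev[i] == EV_INPLACE_WRITE && !committed)
            return false;
    }
    return true;
}
```

In this model, a driver that pushes a journaled buffer out early corresponds
to an EV_INPLACE_WRITE appearing before EV_COMMIT in the sequence.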

At some point I'll probably teach ext3 to use its own internal
mechanism to mark dirty buffers rather than reusing the buffer_head
dirty flag, just to try to make it less likely for such device drivers
to do the wrong thing in these situations.

Cheers,
Stephen
Re: Re: Strange Problem when using DRBD + ext3
On Tue, 5 Sep 2000, Stephen C. Tweedie wrote:

> Hi,
>
> On Tue, Sep 05, 2000 at 02:45:19PM +0200, Martin Bene wrote:
> >
> > I'm running into a nasty problem when copying files via scp onto an ext3
> > filesystem on a drbd device - see attached oops.txt.
>
> Looks like exactly the same misbehaviour coming from drbd as the one
> that stops raid1/5 from working with journaling.
>
> > Sep 5 13:58:49 tropica2 kernel: Assertion failure in jfs_prelock_buffer_check() at journal.c line 337: "bh->b_jlist == 0 || bh->b_jlist == BJ_LogCtl || bh->b_jlist == BJ_IO"
>
> What this means is that the buffer is being journaled by the
> filesystem, but that drbd is trying to force it out to disk anyway.
> That is bad --- if drbd commits buffers to storage before the
> journaling system wants it to, then we have a write ordering violation
> and an unexpected reboot can result in incorrect journal recovery.

Does this problem still exist in the 2.4 RAID code?
Re: Re: Strange Problem when using DRBD + ext3
Hi,

On Wed, Sep 06, 2000 at 01:00:23PM -0300, Marcelo Tosatti wrote:

> Does this problem still exist in the 2.4 RAID code?

No, 2.4 raid should be fine.

Cheers,
Stephen
Re: Re: Strange Problem when using DRBD + ext3
On Tue, 05 Sep 2000, Stephen C. Tweedie wrote:
>Hi,
>
>On Tue, Sep 05, 2000 at 02:45:19PM +0200, Martin Bene wrote:
>>
>> I'm running into a nasty problem when copying files via scp onto an ext3
>> filesystem on a drbd device - see attached oops.txt.
>
>Looks like exactly the same misbehaviour coming from drbd as the one
>that stops raid1/5 from working with journaling.
>
>> Sep 5 13:58:49 tropica2 kernel: Assertion failure in jfs_prelock_buffer_check() at journal.c line 337: "bh->b_jlist == 0 || bh->b_jlist == BJ_LogCtl || bh->b_jlist == BJ_IO"
>
>What this means is that the buffer is being journaled by the
>filesystem, but that drbd is trying to force it out to disk anyway.
>That is bad --- if drbd commits buffers to storage before the
>journaling system wants it to, then we have a write ordering violation
>and an unexpected reboot can result in incorrect journal recovery.
>
>At some point I'll probably teach ext3 to use its own internal
>mechanism to mark dirty buffers rather than reusing the buffer_head
>dirty flag, just to try to make it less likely for such device drivers
>to do the wrong thing in these situations.
>

Stephen, are you really sure that drbd is forcing blocks to disk?
DRBD _only_ writes blocks to disk that were handed to it by
the do_request function. For each request, DRBD issues a _new_ request
to the local hard disk (by copying the buffer head). The request
to DRBD is finished when the new request has finished and the remote
write is done.

I have not yet tested ext3 on DRBD, but I got the following patch:

#ifdef BH_JWrite
	if (test_bit(BH_JWrite, &req->bh->b_state))
		set_bit(BH_JWrite, &bh->b_state);
#endif

where

req .... the original request for DRBD
bh .... the cloned buffer head for the lower device.
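
For readers who want to try the idea outside the kernel, the patch can be
sketched in user space as follows. Everything here is a stand-in: the bit
position chosen for BH_JWrite is made up, struct buffer_head is reduced to its
state word, and test_bit()/set_bit() are simple non-atomic replacements for
the kernel helpers of the same name. The sketch also assumes the cloned buffer
head starts with a fresh state word.

```c
#include <stdbool.h>

/* Hypothetical bit position; the real BH_JWrite comes from the ext3 patch. */
#define BH_JWrite 7

/* Reduced stand-in for the kernel structure: only the state word is kept. */
struct buffer_head {
    unsigned long b_state;
};

/* Non-atomic user-space stand-ins for the kernel bit helpers. */
static bool test_bit(int nr, const unsigned long *addr)
{
    return (*addr >> nr) & 1UL;
}

static void set_bit(int nr, unsigned long *addr)
{
    *addr |= 1UL << nr;
}

/*
 * Carry the BH_JWrite flag from the buffer head of the original request
 * over to the cloned buffer head that is sent to the lower device.
 */
static void propagate_jwrite(const struct buffer_head *orig,
                             struct buffer_head *clone)
{
    if (test_bit(BH_JWrite, &orig->b_state))
        set_bit(BH_JWrite, &clone->b_state);
}
```

The flag is only ever set on the clone, never cleared, so a clone that already
carries it is left unchanged.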

Also, the syncer (the resynchronisation process that runs after a disk has
been replaced) never forces a block to disk and never uses a block from the
buffer cache (since a block in the buffer cache might be modified by the
filesystem).
-- Isn't this the problem with the 2.2.x implementation of raid1/5?

-Philipp
Re: Re: Strange Problem when using DRBD + ext3
Hi,

Resurrecting an old email due to new evidence...

On Thu, Sep 07, 2000 at 09:07:23PM +0200, Philipp Reisner wrote:
> >Looks like exactly the same misbehaviour coming from drbd as the one
> >that stops raid1/5 from working with journaling.
> >
> >> Sep 5 13:58:49 tropica2 kernel: Assertion failure in jfs_prelock_buffer_check() at journal.c line 337: "bh->b_jlist == 0 || bh->b_jlist == BJ_LogCtl || bh->b_jlist == BJ_IO"
> >
> >What this means is that the buffer is being journaled by the
> >filesystem, but that drbd is trying to force it out to disk anyway.

> Stephen, are you really sure that drbd is forcing blocks to disk?
> DRBD _only_ writes blocks to disk that were handed to it by
> the do_request function.

It looks as if we may have a better understanding of this now ---
some of the more cunning block device drivers can sleep in
ll_rw_block() after being called to queue a write but before the
buffer gets as far as make_request(). It is the make_request() which
locks the buffer, so while the driver is asleep, there is nothing
protecting the buffer from being journaled. Once the driver wakes up,
the buffer is now under journaling control and the locking check
fails.
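
The suspected sequence can be modelled in user space. This is only a sketch
under assumptions: struct buffer, journal_claim(), make_request_step() and
write_succeeds() are hypothetical names, the enum values are made up, and the
sleeping driver is represented by a flag rather than by real concurrency. Only
the checked condition is taken from the assertion message quoted above.

```c
#include <stdbool.h>

/* Hypothetical list identifiers; the real values are defined by ext3. */
enum jlist { BJ_None = 0, BJ_Metadata, BJ_IO, BJ_LogCtl };

struct buffer {
    enum jlist b_jlist;  /* which journal list the buffer is on, if any */
    bool locked;         /* set once the request has been built         */
};

/* Same condition as the failing assertion in the oops. */
static bool prelock_check(const struct buffer *bh)
{
    return bh->b_jlist == BJ_None ||
           bh->b_jlist == BJ_LogCtl ||
           bh->b_jlist == BJ_IO;
}

/* The journal may take over any buffer that is not yet locked. */
static bool journal_claim(struct buffer *bh)
{
    if (bh->locked)
        return false;
    bh->b_jlist = BJ_Metadata;
    return true;
}

/* Check, then lock. Returns false where the real code would assert. */
static bool make_request_step(struct buffer *bh)
{
    if (!prelock_check(bh))
        return false;
    bh->locked = true;
    return true;
}

/*
 * One write, queued while the buffer is not journaled. If the driver
 * sleeps before make_request, the journal gets a chance to claim the
 * still-unlocked buffer in between.
 */
static bool write_succeeds(bool driver_sleeps)
{
    struct buffer bh = { BJ_None, false };

    if (driver_sleeps)
        (void)journal_claim(&bh);
    return make_request_step(&bh);
}
```

In this model, write_succeeds(false) passes the check, while
write_succeeds(true) fails it, which matches the failure described above.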

ext3-0.0.6 is going to deal with dirty buffers in a different way and
should fix this (if I'm right about what is going on here).

Cheers,
Stephen