Mailing List Archive

RAID1 boot - no bootable media found
Hi all,
Sorry for a bit of cross-posting between gentoo-amd64 and
linux-raid where I first posted this earlier this morning. I figure if
it's a RAID issue they will likely see the problem but if it's a
Gentoo problem (software or documentation) or a BIOS issue then likely
the very best folks in the world are here. Thanks in advance.

The basic issue is trying to boot from a RAID1 boot partition using
grub-static. Apparently grub itself isn't found. See the link below
for the Gentoo instructions on doing this. Note that I'm SATA-based on
the motherboard, whereas he had some sort of controller, although he is
using software RAID.

Hopefully the post below is self-explanatory, but if not, ask
questions. Since I made this post I've tried a couple more things:

1) Switching to AHCI in BIOS - no change

2) Documenting drive hook-up on the DX58SO motherboard

P0: drive 1
P1: CD_RW
P2: drive 2
P3: unused
P4: drive 3
P5: unused

3) Documenting codes shown on screen:
a) When Intel logo shows up:
BA
b) After the logo goes away
E7
E7
BA
BA

These appear to be POST codes from this page:

http://www.intel.com/support/motherboards/desktop/sb/CS-025434.htm

BA Detecting presence of a removable media (IDE, CD-ROM detection, etc.)
E7 Waiting for user input

My motherboard is the last in the list at the bottom of the page.

The monitor is slow to display after changes, so possibly these are two
identical strings of codes and I only see the last one the first time
through.

Thanks. More info follows below.

- Mark


[PREVIOUSLY POSTED TO LINUX-RAID]

Hi,
I brought up new hardware yesterday for my first RAID install. I
followed this Gentoo page describing a software RAID1/LVM install:

http://www.gentoo.org/doc/en/gentoo-x86+raid+lvm2-quickinstall.xml

Note that I followed this page verbatim, even where it wasn't what I
wanted, with exceptions:

a) My RAID1 is 3 drives instead of 2
b) I'm AMD64 Gentoo based.
c) I used grub-static

I did this install mostly just to get a first-hand feel for how to
do a RAID install and to try out some of the mdadm commands for real.
My intention was to blow away the install if I didn't like it and do
it again for real once I started to get a clearer picture about how
things worked. For instance, this set of instructions used RAID1 on
the /boot directory which I wasn't sure about.

NOTE: THIS INSTALL PUTS EVERYTHING ON RAID1. (/, /boot, everything)
I didn't start out thinking I wanted to do that.

So, the first problem is that on the reboot to see if the install
worked the Intel BIOS reports 'no bootable media found'. I am very
unclear how any system boots software RAID1 before software is loaded,
assuming I understand the Gentoo instructions. The instructions I used
to install grub were

root (hd0,0)
setup (hd0)
root (hd1,0)
setup (hd1)
root (hd2,0)
setup (hd2)

but the system finds nothing to boot from. To me this sounds like a
BIOS issue. Looking around, I see I'm currently set up for compatibility
mode, but I would think that switching to AHCI support would be a better
long-term solution. Any chance this setting is the root cause?

I can boot from CD and assemble the /boot RAID

livecd ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
livecd ~ # mdadm --assemble /dev/md1 /dev/sda1 /dev/sdb1 /dev/sdc1
mdadm: /dev/md1 has been started with 3 drives.
livecd ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sda1[0] sdc1[2] sdb1[1]
112320 blocks [3/3] [UUU]

unused devices: <none>
livecd ~ # mdadm --misc --stop /dev/md1
mdadm: stopped /dev/md1
livecd ~ # cat /proc/mdstat
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
unused devices: <none>
livecd ~ #


Everything I expect to see on /boot seems to be there when using ls.

Note one possible clue - when the Intel BIOS screen first
comes up I see some hex digits flashing around in the lower right.
I've not seen this before on other machines, and I believe the
motherboard (DX58SO) does support some sort of RAID in hardware, so
maybe there's confusion there? I've not selected RAID in the BIOS, but
is it possible it's trying to be too clever?

Let me know what other info might be needed. I have concerns about
this install and will likely blow it away today and do a new one but I
figured maybe there's an opportunity to learn here before I do that.

Cheers,
Mark
Re: RAID1 boot - no bootable media found
Mark Knecht posted on Sun, 28 Mar 2010 10:14:03 -0700 as excerpted:

> I brought up new hardware yesterday for my first RAID install. I
> followed this Gentoo page describing a software RAID1/LVM install:
>
> http://www.gentoo.org/doc/en/gentoo-x86+raid+lvm2-quickinstall.xml
>
> Note that I followed this page verbatim, even if it wasn't what I
> wanted, with exceptions:
>
> a) My RAID1 is 3 drives instead of 2
> b) I'm AMD64 Gentoo based.
> c) I used grub-static

Had you gotten anything off the other list? I see no other replies here.
Do you still have that install, or are you starting over as you mentioned
you might?

That post was a bit long to quote in full, and somewhat disordered to try
to reply per element, so I just quoted the above and will cover a few
things as I go.

1) I'm running kernel/md RAID here, too (and was formerly running LVM2,
which is what I expect you mean by LVM, and I'll continue simply calling
it LVM), so I know some about it.

2) The Gentoo instructions don't say to, but just in case... you didn't
put /boot and / on LVM, only on the RAID-1, correct? LVM is only for non-
root non-boot. (Actually, you can put / on LVM, if and only if you run an
initrd/initramfs, but it significantly complicates things. Keeping / off
of LVM simplifies things considerably, so I'd recommend it.) This is
because, while the kernel can auto-detect and configure RAID itself (or
the RAID config can be fed to it on the command line), the kernel cannot by itself
figure out how to configure LVM -- only the LVM userspace knows how to
read and configure LVM, so an LVM userspace and config must be available
before it can be loaded. This can be accomplished by using an initrd/
initramfs with LVM loaded on it, but things are MUCH less complex if /
isn't LVM, so LVM can be loaded from the normal /.
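
To illustrate that point (just a sketch -- the kernel image name and
md/partition numbers are assumptions loosely based on the guide, not your
actual setup), the two ways the kernel can learn about the RAID would look
roughly like this in grub.conf, neither needing an initramfs:

# relying on in-kernel autodetection (partition type fd, persistent superblocks):
kernel /boot/kernel-2.6.33-gentoo root=/dev/md3

# or spelling the array out on the kernel command line instead:
kernel /boot/kernel-2.6.33-gentoo md=3,/dev/sda3,/dev/sdb3,/dev/sdc3 root=/dev/md3

There's no equivalent for LVM, which is why / on LVM forces the
initrd/initramfs route.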

3) You mention not quite understanding how /boot works on md/RAID -- how
does grub know where to look? Well, it only works on md/kernel RAID-1,
and that only because RAID-1 is basically the same as a non-RAID setup,
only instead of one disk, there's several, each a mirror duplicate of the
others (but for a bit of RAID metadata). Thus, grub basically treats each
disk as if it wasn't in RAID, and it works, because it's organized almost
the same as if it wasn't in RAID. That's why you have to install grub
separately to each disk, because it's treating them as separate disks, not
RAID mirrors. But it doesn't work with other RAID levels because they mix
up data stripes, and grub doesn't know anything about that.

4) Due to personal experience recovering from a bad disk (pre-RAID, that's
why I switched to RAID), I'd actually recommend putting everything portage
touches or installs to on / as well. That way, everything is kept in sync
and you don't get into a situation where / including /bin /sbin and /etc
are a snapshot from one point in time, while portage's database in /var/db
is a different one, and stuff installed to /usr may be an entirely
different one. Not to mention /opt if you have anything installed
there... If all that's on /, then it should all remain in sync. Plus
then you don't have to worry about something boot-critical being installed
to /usr, which isn't mounted until about midway thru the boot cycle.

4 cont) What then goes on other partitions is subdirs of the above.
/usr/local, very likely, as you'll probably want to keep it if you
reinstall. /home, for the same reason. /var/log, so a runaway log can't
eat up all the space on /; it's limited to eating up everything on the
log partition. Likely /tmp, which I have as tmpfs here but which
otherwise you may well want to be RAID-0 for speed. /var/tmp, which here
is a symlink to my /tmp so it's on tmpfs too. Very possibly /usr/src and
the linux kernel tree it contains, as RAID-0 is fine for that since it
can simply be redownloaded off the net if need be. Same with your
portage dir, /usr/portage by default tho you can point that elsewhere
(maybe to the same partition holding /usr/src, but if you use
FEATURES=buildpkg, you probably want your packagedir on something with
some redundancy, so not on the same RAID-0), etc. If you have a system-wide mail setup
with multiple users, you may want a separate mail partition as well (if
not, part of /home is fine). Desktop users may well find a separate,
likely BIG, partition for their media storage is useful, etc... FWIW,
the / partition on my ~amd64 workstation with kde4 is 5 gigs (according to
df). On my slightly more space constrained 32-bit netbook, it's 4 gigs.
Used space on both is ~2.4 gigs, with the various other partitions as
mentioned separate, but with everything portage touches on /. (That
compares to what appears to be a 1-gig / md3 root in the guide, with /var
and /usr on their own partitions/volumes, but they have an 8 gig /usr, a 4
gig /var, and a 4 gig /opt, totaling 17 gigs, that's mostly on that 4-5
gig /, here.)
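
To make the above concrete, a purely hypothetical layout along those
lines might look something like this (device names and the exact split
are illustrative only, not a prescription):

/dev/md3   /          everything portage touches: /bin, /etc, /usr, /var/db, /opt...
/dev/md5   /home      kept across reinstalls
/dev/md6   /var/log   a runaway log can only fill this partition
/dev/md7   /usr/src   RAID-0 is fine here, it's all redownloadable
tmpfs      /tmp       with /var/tmp symlinked to it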

5) The hexadecimal digits you mentioned during the BIOS POST process
indicate, as you guessed, BIOS POST and config process progress. I wasn't
aware that they're documented, but as your board is an Intel and the link
you mentioned appears to be Intel documentation for them, it seems in your
case they are, which is nice. =:^)

6) Your BIOS has slightly different SATA choices than mine. Here, I have
RAID or JBOD (plain SATA, "just a bunch of disks") as my two choices.
JBOD mode would compare to your AHCI, which is what I'd recommend. (Seems
Intel wants AHCI to be a standard, thus killing the need for individual
SATA controller drivers like the SATA_SIL drivers I run here. That'd be
nice, but I don't know how well it's being accepted by others.)
Compatibility mode will likely slow things down, and RAID mode would be
firmware based RAID, which on Linux would be supported by the device-
mapper (as is LVM2). JBOD/SATA/AHCI mode, with md/kernel RAID, is
generally considered a better choice than firmware RAID with device-mapper
support, well, unless you need MSWormOS RAID compatibility, in which case
the firmware/device-mapper mode is probably better as it's more compatible.

6 cont) So I'd recommend AHCI. However, the on-disk layout may be
different between compatibility and AHCI mode, so it's possible the disk
won't be readable after switching and you'd need to repartition and
reinstall, which you were planning on doing anyway, so no big deal.


OK, now that those are covered... what's wrong with your boot?

Well, there's two possibilities. Either the BIOS isn't finding grub
stage-1, or grub stage-1 is found and loaded, but it can't find stage 1.5
or 2, depending on what it needs for your setup. Either way, that's a
grub problem. As long as you didn't make the mistake of putting /boot on
your LVM, which grub doesn't grok, and since it can pretend md/kernel
RAID-1 is an ordinary disk, we really don't need to worry about the md/
RAID or LVM until you can at LEAST get to the grub menu/prompt.

So we have a grub problem. That's what we have to solve first, before we
deal with anything else.

Based on that, here's the relevant excerpt from your post (well, after a
bit of a detour I forgot to include in the above, so we'll call this point
7):

> NOTE: THIS INSTALL PUTS EVERYTHING ON RAID1. (/, /boot, everything)
> I didn't start out thinking I wanted to do that.

7) Well, not quite. /boot and / are on RAID-1, yes. But the guide puts
the LVM2 physical volume on md4, which is created as RAID-0/striped. I
don't really agree with that as striped is fast but has no redundancy.
Why you'd put stuff like /home, /usr (including stuff you may well want to
keep in /usr/local), /var (including portage's package database in /var/
db), and presumably additional partitions as you may create them (media
and mail partitions were the examples I mentioned above) on a non-
redundant RAID-0, I don't know. That'd be what I wanted on RAID-1, here,
to make sure I still had copies of it if any of the disks died.

7 cont) Actually, given that md/raid is now partitionable (years ago it
wasn't, with LVM traditionally layered on top to overcome that), and after
some experience of my own with LVM, I decided it wasn't worth the hassle
of the extra LVM layer here, and when I redid my system last year, I
killed the LVM and just use partitioned md/kernel RAID now. If you want
the flexibility of LVM, great, but here, I decided it simply wasn't worth
the extra hassle of maintaining it. So I'd recommend NOT using LVM and
thus not having to worry about it. But it's your choice.

OK, now on to the grub issue...

> So, the first problem is that on the reboot to see if the install
> worked the Intel BIOS reports 'no bootable media found'. I am very
> unclear how any system boots software RAID1 before software is loaded,
> assuming I understand the Gentoo instructions. The instructions I used
> to install grub where
>
> root (hd0,0)
> setup (hd0)
> root (hd1,0)
> setup (hd1)
> root (hd2,0)
> setup (hd2)

That /looks/ correct. But particularly with RAID, grub's mapping between
BIOS drives, kernel drives, and grub drives sometimes gets mixed up.
That's one of the things I always hate touching, since I'm never quite
sure if it's going to work or not, or that I'm actually telling it to
setup where I think I'm telling it to setup, until I actually test it.

Do you happen to have a floppy on that machine? If so, probably the most
error resistant way to handle it is to install grub to a floppy disk,
which unlike thumb drives and possibly CD/DVD drives, has no potential to
interfere with the hard drive order as seen by BIOS. Then boot the floppy
disk to the grub menu, and run the setup from there.
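
In case it's useful, the classic grub-legacy way of building such a
floppy is simply raw-copying the stage files onto it, something like the
following (paths assume the usual /boot/grub location -- double-check on
your system, and note it overwrites whatever is on the floppy):

dd if=/boot/grub/stage1 of=/dev/fd0 bs=512 count=1
dd if=/boot/grub/stage2 of=/dev/fd0 bs=512 seek=1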

One thing I discovered here is that I could only setup one disk at a time,
regardless of whether I was doing it from within Linux, from a floppy grub
menu, or from a bootable USB stick grub menu. Changing the root
would seem to work after the first setup, but the second setup would have
some weird error and testing a boot from that disk wouldn't work, so
obviously it didn't take.

But doing it a disk at a time, root (hd0,0) , setup (hd0), reboot (or
restart grub if doing it from Linux), root (hd1,0), setup (hd1), reboot...
same thing for each additional disk (you have three, I have four). THAT
worked.

However you do it, test them, both with all disks running, and with only
one running (turn off or disconnect the others). Having a RAID-1 system
and installing grub to all the disks isn't going to do you a lot of good
if when one dies, you find that it was the only one that had grub
installed correctly!

There's another alternative that I'd actually recommend instead, however.
The problem with a RAID-1 boot, is that if you somehow screw up something
while updating /boot, since it's RAID-1, you've screwed it up for all
mirrors on that RAID-1. Since RAID-1 is simply mirroring the data across
the multiple disks, it can be better to not RAID that partition at all,
but to have each disk have its own /boot partition, un-RAIDed, which
effectively gives you a working /boot plus one or more /boot backups (two
in your case of three disks, three in my case of four, tho here I
actually went with two separate RAID-1s instead).

That solves a couple problems at once. First of all, when you first
install, you install to just one, as an ordinary disk, test it, and when
it's working and booting, you can copy that install to the others, and do
the boot sector grub setup on each one separately, as its own disk, having
tested that the first one is working. Then you'd test each of the others
as well.

Second, when you upgrade, especially when you upgrade grub, but also when
you upgrade the kernel, you only upgrade the one. If it works, great, you
can then upgrade the others. If it fails, no big deal, simply set your
BIOS to boot from one of the others instead, and you're back to a known
working config, since you had tested it after the /last/ upgrade, and you
didn't yet do this upgrade to it since you were just testing this upgrade
and it broke before you copied it to your backups.

So basically, the only difference here as opposed to the guide, is that
you don't create /dev/md1, you configure and mount /dev/sda1 as /boot, and
when you have your system up and running, /then/ you go back and setup
/dev/sdb1 as your backup boot (say /mnt/boot/). And when you get it setup
and tested working, then you do the same thing for /dev/sdc1, except that
you can use the same /mnt/boot/ backup mount-point when mounting it as
well, since presumably you won't need both backups mounted at once.

Everything else will be the same, and as it was RAID-1/mirrored, you'll
have about the same space in each /dev/sd[abc]1 partition as you did in
the combined md1.
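
As a sketch of the ongoing maintenance (assuming /dev/sda1 is the live
/boot and /dev/sdb1 the backup -- substitute your real devices, and the
same /mnt/boot mount-point as above):

mount /dev/sdb1 /mnt/boot
rsync -a --delete /boot/ /mnt/boot/    # copy kernels and grub files to the backup
umount /mnt/boot
grub                                   # then: root (hd1,0), setup (hd1), quit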

As for upgrading the three separate /boot and backups, as I mentioned,
when you upgrade grub, DEFINITELY only upgrade one at a time, and test
that the upgrade worked and you can boot from it before you touch the
others. For kernel upgrades, it doesn't matter too much if the backups
are a bit behind, so you don't have to upgrade them for every kernel
upgrade. If you run kernel rcs or git-kernels, as I do, I'd suggest
upgrading the backups once per kernel release (so from 2.6.32 to 2.6.33,
for instance), so the test kernels are only on the working /boot, not its
backups, but the backups contain at least one version of the last two
release kernels. Pretty much the same if you run upstream stable kernels
(so 2.6.33, 2.6.33.1, 2.6.33.2...), or Gentoo -rX kernels. Keep at least
one of each of the last two kernels on the backups, tested to boot of
course, and only update the working /boot for the stable or -rX bumps.

If you only upgrade kernels once a kernel release cycle or less (maybe
you're still running 2.6.28.x or something), then you probably want to
upgrade and test the backups as soon as you've upgraded and tested a new
kernel on the working /boot.

Hope it helps...

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: Re: RAID1 boot - no bootable media found
On Mon, Mar 29, 2010 at 11:39 PM, Duncan <1i5t5.duncan@cox.net> wrote:
<SNIP>
>
> Hope it helps...
>
> --
> Duncan - List replies preferred.   No HTML msgs.

Immensely!

OK, a quick update and keeping it short for now:

1) I dumped the RAID install Sunday. It's new hardware and it wasn't
booting, and I didn't know why, but I do know how to install Gentoo without
RAID, so I went for simple instead of what I wanted. That said, the
machine still didn't boot. Same "no bootable media" message. After
scratching my head for an hour it dawned on me that maybe this BIOS
actually required /boot to be marked bootable. I changed that and the
non-RAID install booted and is running Gentoo at this time. This is
the only machine of 10 in this house that requires that but at least
it's a reasonable fix. The machine now runs XFCE & Gnome, X seems fine
so far, haven't messed with sound, etc., so the bootable flag was the
key this time around.

2) Even non-RAID I'm having some troubles with the machine. (kernel
bugs in dmesg) I've asked a question on LKML and gotten one
response, as well as on the Linux-RAID list, but I'm not making much
headway there yet. I'll likely post something here today or tomorrow
in some other thread with a better title having to do with 100% waits
for long periods of time. Those are probably non-Gentoo so I am
hesitant to start a thread here and bother anyone but I suspect that
you or others will probably have some good ideas at least about what
to look at.

3) I LOVE your idea of managing 3 /boot partitions by hand instead of
using RAID. Easy to do, completely testable ahead of time. If I ensure
that every disk can boot then no matter what disk goes down the
machine still works, at least a little. Not that much work and even if
I don't do it for awhile it doesn't matter as I can do repairs without
a CD. (well....)

4) You're correct that the guide did md4 as striped. I forgot to say
that I didn't. I used RAID1 there also as my needs for this machine
are reliability not speed.

5) Last for now, I figured that once the machine was running non-RAID
I could always redo the RAID1 install from within Gentoo instead of
using the install CD. That's where I'm at now but not sure if I'll do
that yet due to the issue in #2.

As always, thanks very much for the detailed post. Lots of good stuff there!

Cheers,
Mark
Re: RAID1 boot - no bootable media found
Mark Knecht posted on Tue, 30 Mar 2010 06:56:14 -0700 as excerpted:

> 3) I LOVE your idea of managing 3 /boot partitions by hand instead of
> using RAID. Easy to do, completely testable ahead of time. If I ensure
> that every disk can boot then no matter what disk goes down the machine
> still works, at least a little. Not that much work and even if I don't
> do it for awhile it doesn't matter as I can do repairs without a CD.
> (well....)

That's one of those things you only tend to realize after running a RAID
for awhile... and possibly after having grub die, for some reason I don't
quite understand, on just a kernel update... and realizing that had I
setup multiple independent /boot and boot-backup partitions instead of a
single RAID-1 /boot, I'd have had the backups to boot to if I'd have
needed it.

So call it the voice of experience! =:^)

Meanwhile, glad you figured the problem out. A boot-flag-requiring-
BIOS... that'd explain the problem for both the RAID and no-RAID version!

100% waits for long periods... I've seen a number of reasons for this.
One key to remember is that backed-up I/O has a way of stopping many other
things at times. Among the reasons I've seen:

1a) Dying disk. I've had old disks that would sometimes take some time to
respond, especially if they had gone to sleep. If you hear several clicks
(aka "the click of death") as it resets the disk and tries again... it's
time to think about either replacing the old disk, or sending in the new
one for a replacement.

1b) Another form of that is hard-to-read data sectors. It'll typically
try to read a bad sector quite a number of times, often for several
minutes at a time, before either giving up or reading it correctly.
Again, if you're seeing this, get a new disk and get your data transferred
before it's too late!

2) I think this one was fixed and I only read of it, I didn't experience
it myself. Back some time ago, if a network interface were active using
DHCP, but couldn't get a response from a DHCP server, it could cause
pretty much the entire system to hang for some time, every time the fake/
random address normally assigned from the zero-conf reserved netblock
expired. The system would try to find a DHCP server again, and again, if
one didn't answer, would eventually assign the zero-conf block fake/random
address again, but would cause a system hang of up to a minute (the default
timeout, AFAIK), before it would do so. Again, this /should/ have been
fixed quite some time ago, but one can never be sure what similar symptom
bug may be lurking in some hardware or other.

3) Back to disks, but not the harbinger of doom that #1 is, perhaps your
system is simply set to suspend the disks after a period of inactivity,
and it takes them some time to spin back up. I've had this happen to me,
but it was years ago and back on MS. But because of the issues I've had
more recently with (1), I'm sure it'd still be an issue in some
configurations. (Fortunately, laptop mode on my netbook with 120 gig SATA
hard drive seems to work very well and almost invisibly, to the point I
don't worry about disk sleep there at all, as the resume is smooth enough
I basically don't even notice -- save for the extra hour and a half of
runtime I normally get with laptop mode active! FWIW, the thing "just
works" in terms of both suspend2ram and hibernate/suspend2disk, as well.
=:^)

4) Kernels before... 2.6.30 I believe... could occasionally exhibit a read/
write I/O priority inversion on ext3. The problem had existed for
some time, but was attributed to the normal effects of the then-default
data=ordered as opposed to data=writeback journaling, until some massive
stability issues with ext4 (which ubuntu had just deployed as a non-
default option for their new installs, the problem came in combining that
with stuff like the unstable black-box nVidia drivers, which crashed
systems in the middle of writes on occasion!) prompted a reexamination of
a number of related previous assumptions. 2.6.30 had a quick-fix. 2.6.31
had better fixes, and additionally and quite controversially, switched
ext3 defaults to data=writeback, which with the new fixes, was judged
sufficiently stable to be the default. (As a reiserfs user who lived thru
the period before it got proper data=ordered, I'll never trust
data=writeback again, so I disagree with Linus' decision to make it the
ext3 default, but at least I can change that on the systems I run.) So if
you're running a kernel older than 2.6.30 or .31, this could potentially
be an issue, tho it's unlikely to be /too/ bad under normal conditions.

Those are the possibilities I know of.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: Re: RAID1 boot - no bootable media found
On Tue, Mar 30, 2010 at 11:08 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Tue, 30 Mar 2010 06:56:14 -0700 as excerpted:
>
>> 3) I LOVE your idea of managing 3 /boot partitions by hand instead of
>> using RAID. Easy to do, completely testable ahead of time. If I ensure
>> that every disk can boot then no matter what disk goes down the machine
>> still works, at least a little. Not that much work and even if I don't
>> do it for awhile it doesn't matter as I can do repairs without a CD.
>> (well....)
>
> That's one of those things you only tend to realize after running a RAID
> for awhile... and possibly after having grub die, for some reason I don't
> quite understand, on just a kernel update... and realizing that had I
> setup multiple independent /boot and boot-backup partitions instead of a
> single RAID-1 /boot, I'd have had the backups to boot to if I'd have
> needed it.
>
> So call it the voice of experience! =:^)
>
> Meanwhile, glad you figured the problem out.  A boot-flag-requiring-
> BIOS... that'd explain the problem for both the RAID and no-RAID version!

I've set up a duplicate boot partition on sdb and it boots. However,
one thing I'm unsure of: when I change which hard drive boots, does the
old sdb become the new sda because it's what got booted? Or is the order
still as it was? The answer determines what I do in grub.conf as to
which drive I'm trying to use. I can figure this out later by putting
something different on each drive and looking. Might be system/BIOS
dependent.

>
> 100% waits for long periods...  I've seen a number of reasons for this.
> One key to remember is that I/O backups have a way of stopping many other
> things at times.  Among the reasons I've seen:
>

OK, so some new information: another person on the RAID list is
experiencing something very similar with different hardware.

As for your ideas:

> 1a) Dying disk.
> 1b) hard to read data sectors.

All new drives, smartctl says no problems reading anything and no
registered error correction has taken place.

>
> 2) DHCP

Not using it, at least not intentionally. Doesn't mean networking
isn't doing something strange.

>
> 3) suspend the disks after a period of inactivity

This could be part of what's going on, but I don't think it's the
whole story. My drives (WD Green 1TB drives) apparently park the heads
after 8 seconds (yes 8 seconds!) of inactivity to save power. Each
time it parks it increments the Load_Cycle_Count SMART parameter. I've
been tracking this on the three drives in the system. The one I'm
currently using is incrementing while the 2 that sit unused until I
get RAID going again are not. Possibly there is something about how
these drives come out of park that creates large delays once in
awhile.
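
For reference, the counter I've been watching is just this (assuming
smartmontools is installed; sda shown, but the same for the other drives):

smartctl -A /dev/sda | grep -i load_cycle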

OK, now the only problem with that analysis is that the other guy
experiencing this problem doesn't use this drive, so the theory
requires that he has something similar happening in his drives.
Additionally I just used one of these drives in my dad's new machine
with a different motherboard and didn't see this problem, or didn't
notice it but I'll go study that and see what his system does.

>
> 4) I/O priority inversion on ext3

Now this one is an interesting idea. Maybe I should try a few
different file systems for no other reason than eliminating the file
system type as the cause. Good input.

Thanks for the ideas!

Cheers,
Mark
Re: RAID1 boot - no bootable media found
Mark Knecht posted on Tue, 30 Mar 2010 13:26:59 -0700 as excerpted:

> I've set up a duplicate boot partition on sdb and it boots. However one
> thing I'm unsure if when I change the hard drive boot does the old sdb
> become the new sda because it's what got booted? Or is the order still
> as it was? The answer determines what I do in grub.conf as to which
> drive I'm trying to use. I can figure this out later by putting
> something different on each drive and looking. Might be system/BIOS
> dependent.

That depends on your BIOS. My current system (the workstation, now 6+
years old but still going strong as it was a $400+ server grade mobo) will
boot from whatever disk I tell it to, but keeps the same BIOS disk order
regardless -- unless I physically turn one or more of them off, of
course. My previous system would always switch the chosen boot drive to
be the first one. (I suppose it could be IDE vs. SATA as well, as the
switcher was IDE, the stable one is SATA-1.)

So that's something I guess you figure out for yourself. But it sounds
like you're already well on your way...

>> 100% waits for long periods...

>> 1a) Dying disk.
>> 1b) hard to read data sectors.
>
> All new drives, smartctl says no problems reading anything and no
> registered error correction has taken place.

That's good. =:^) Tho of course there's an infant mortality period of the
first few (1-3) months, too, before the statistics settle down. So just
because they're new doesn't necessarily mean they're not bad.

FWIW, when I switched to RAID was after having two drives go out at almost
exactly the year point. Needless to say I was a bit paranoid. So when I
got the new set to setup as RAID, the first thing I did (before I
partitioned or otherwise put any data of value on them) was run
badblocks -w on all of them. It took well over a day, actually ~3 days
IIRC but don't hold me to the three. Luckily, doing them in parallel
didn't slow things down any, as it was the physical disk speed that was
the bottleneck. But I let the thing finish on all four drives, and none
of them came up with a single badblock in any of the four patterns.
Additionally, after writing and reading back the entire drive four times,
smart still said nothing relocated or anything. So I was happy. And the
drives have served me well, tho they're probably about at the end of their
five year warranty right about now.

The point being... it /is/ actually possible to verify that they're
working well before you fdisk/mkfs and load data. Tho it does take
awhile... days... on drives of modern size.

>> 3) suspend the disks after a period of inactivity
>
> This could be part of what's going on, but I don't think it's the whole
> story. My drives (WD Green 1TB drives) apparently park the heads after 8
> seconds (yes 8 seconds!) of inactivity to save power. Each time it parks
> it increments the Load_Cycle_Count SMART parameter. I've been tracking
> this on the three drives in the system. The one I'm currently using is
> incrementing while the 2 that sit unused until I get RAID going again
> are not. Possibly there is something about how these drives come out of
> park that creates large delays once in awhile.

You may wish to take a second look at that, for an entirely /different/
reason. If those are the ones I just googled on the WD site, they're
rated 300K load/unload cycles. Take a look at your BIOS spin-down
settings, and use hdparm to get a look at the disk's powersaving and
spindown settings. You may wish to set the disks to something more
reasonable, as with 8 second timeouts, that 300k cycles isn't going to
last so long...
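
As a rough sketch of where to look (the hdparm queries are harmless, but
note the WD Greens' 8-second idle timer is a firmware setting that APM
tweaks may or may not override -- WD ships its own utility for that, so
treat this as a starting point, not a guaranteed fix):

hdparm -B /dev/sda                               # current APM level; low values allow aggressive parking
hdparm -I /dev/sda | grep -i 'power management'  # what the drive claims to support
hdparm -B 254 /dev/sda                           # least aggressive APM setting, if the drive honours it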

You may recall a couple years ago when Ubuntu accidentally shipped with
laptop mode (or something, IDR the details) turned on by default, and
people were watching their drives wear out before their eyes. That's
effectively what you're doing, with an 8-second idle timeout. Most laptop
drives (2.5" and 1.8") are designed for it. Most 3.5" desktop/server
drives are NOT designed for that tight an idle timeout spec, and in fact,
may well last longer spinning at idle overnight, as opposed to shutting
down every day even.

I'd at least look into it, as there's no use wearing the things out
unnecessarily. Maybe you'll decide to let them run that way and save the
power, but you'll know about the available choices then, at least.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: Re: RAID1 boot - no bootable media found
A bit long in response. Sorry.

On Tue, Mar 30, 2010 at 11:56 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Tue, 30 Mar 2010 13:26:59 -0700 as excerpted:
>
>> I've set up a duplicate boot partition on sdb and it boots. However one
>> thing I'm unsure if when I change the hard drive boot does the old sdb
>> become the new sda because it's what got booted? Or is the order still
>> as it was? The answer determines what I do in grub.conf as to which
>> drive I'm trying to use. I can figure this out later by putting
>> something different on each drive and looking. Might be system/BIOS
>> dependent.
>
> That depends on your BIOS.  My current system (the workstation, now 6+
> years old but still going strong as it was a $400+ server grade mobo) will
> boot from whatever disk I tell it to, but keeps the same BIOS disk order
> regardless -- unless I physically turn one or more of them off, of
> course.  My previous system would always switch the chosen boot drive to
> be the first one.  (I suppose it could be IDE vs. SATA as well, as the
> switcher was IDE, the stable one is SATA-1.)
>
> So that's something I guess you figure out for yourself.  But it sounds
> like you're already well on your way...
>

It seems to be a constant mapping, meaning (I guess) that I need to
change the drive specs in grub.conf on the second drive to actually
use the second drive.

I made the titles for booting different for each grub.conf file to
ensure I was really getting grub from the second drive. My sda grub
boot menu says "2.6.33-gentoo booting from sda" on the first drive,
sdb on the second drive, etc.

<SNIP>
>
> The point being... it /is/ actually possible to verify that they're
> working well before you fdisk/mkfs and load data.  Tho it does take
> awhile... days... on drives of modern size.
>

I'm trying badblocks right now on sdc, using

badblocks -v /dev/sdc

Maybe I need to do something more strenuous? It looks like it will be
done in an hour or two. (i7-920 with SATA drives, so it should be fast,
as long as I'm not just reading the buffers or something like that.)

Roughly speaking 1TB read at 100MB/S should take 10,000 seconds or 2.7
hours. I'm at 18% after 28 minutes so that seems about right. (With no
errors so far assuming I'm using the right command)

>>> 3) suspend the disks after a period of inactivity
>>
>> This could be part of what's going on, but I don't think it's the whole
>> story. My drives (WD Green 1TB drives) apparently park the heads after 8
>> seconds (yes 8 seconds!) of inactivity to save power. Each time it parks
>> it increments the Load_Cycle_Count SMART parameter. I've been tracking
>> this on the three drives in the system. The one I'm currently using is
>> incrementing while the 2 that sit unused until I get RAID going again
>> are not. Possibly there is something about how these drives come out of
>> park that creates large delays once in awhile.
>
> You may wish to take a second look at that, for an entirely /different/
> reason.  If those are the ones I just googled on the WD site, they're
> rated 300K load/unload cycles.  Take a look at your BIOS spin-down
> settings, and use hdparm to get a look at the disk's powersaving and
> spindown settings.  You may wish to set the disks to something more
> reasonable, as with 8 second timeouts, that 300k cycles isn't going to
> last so long...

Very true. Here is the same drive model I put in a new machine for my
dad. It's been powered up and running Gentoo as a typical desktop
machine for about 50 days. He doesn't use it more than about an hour a
day on average. It's already hit 31K load/unload cycles. At 10% of
300K, that's about 1.5 years of life before I hit that spec. I've watched
his system a bit and his system seems to add 1 to the count almost
exactly every 2 minutes on average. Is that a common cron job maybe?

I looked up the spec on all three WD lines - Green, Blue and Black.
All three were 300K cycles. This issue has come up on the RAID list.
It seems that some other people are seeing this and aren't exactly
sure what Linux is doing to cause this.

I'll study hdparm and BIOS when I can reboot.

My dad's current data:

gandalf ~ # smartctl -A /dev/sda
smartctl 5.39.1 2010-01-28 r3054 [x86_64-pc-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   129   128   021    Pre-fail  Always       -       6525
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       21
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1183
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       20
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       5
193 Load_Cycle_Count        0x0032   190   190   000    Old_age   Always       -       31240
194 Temperature_Celsius     0x0022   121   116   000    Old_age   Always       -       26
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

gandalf ~ #


>
> You may recall a couple years ago when Ubuntu accidentally shipped with
> laptop mode (or something, IDR the details) turned on by default, and
> people were watching their drives wear out before their eyes.  That's
> effectively what you're doing, with an 8-second idle timeout.  Most laptop
> drives (2.5" and 1.8") are designed for it.  Most 3.5" desktop/server
> drives are NOT designed for that tight an idle timeout spec, and in fact,
> may well last longer spinning at idle overnight, as opposed to shutting
> down every day even.
>
> I'd at least look into it, as there's no use wearing the things out
> unnecessarily.  Maybe you'll decide to let them run that way and save the
> power, but you'll know about the available choices then, at least.
>

Yeah, that's important. Thanks. If I can solve all these RAID problems
then maybe I'll look at adding RAID to his box with better drives or
something.

Note that on my system only I'm seeing real problems in
/var/log/messages, non-RAID, like 1000's of these:

Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 45276264 on sda3
Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46309336 on sda3
Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567488 on sda3
Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567680 on sda3

or

Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555752 on sda3
Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555760 on sda3
Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555768 on sda3
Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555776 on sda3


However I see NONE of that on my dad's machine using the same drive
but different chipset.

The above problems seem to result in this sort of problem when I try
going with RAID, as I tried again this morning:

INFO: task kjournald:5064 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kjournald D ffff880028351580 0 5064 2 0x00000000
ffff8801ac91a190 0000000000000046 0000000000000000 ffffffff81067110
000000000000dcf8 ffff880180863fd8 0000000000011580 0000000000011580
ffff88014165ba20 ffff8801ac89a834 ffff8801af920150 ffff8801ac91a418
Call Trace:
[<ffffffff81067110>] ? __alloc_pages_nodemask+0xfa/0x58c
[<ffffffff8129174a>] ? md_make_request+0xde/0x119
[<ffffffff810a9576>] ? sync_buffer+0x0/0x40
[<ffffffff81334305>] ? io_schedule+0x3e/0x54
[<ffffffff810a95b1>] ? sync_buffer+0x3b/0x40
[<ffffffff81334789>] ? __wait_on_bit+0x41/0x70
[<ffffffff810a9576>] ? sync_buffer+0x0/0x40
[<ffffffff81334823>] ? out_of_line_wait_on_bit+0x6b/0x77
[<ffffffff81040a66>] ? wake_bit_function+0x0/0x23
[<ffffffff8111f400>] ? journal_commit_transaction+0xb56/0x1112
[<ffffffff81334280>] ? schedule+0x8f4/0x93b
[<ffffffff81335e3d>] ? _raw_spin_lock_irqsave+0x18/0x34
[<ffffffff81040a38>] ? autoremove_wake_function+0x0/0x2e
[<ffffffff81335bcc>] ? _raw_spin_unlock_irqrestore+0x12/0x2c
[<ffffffff8112278c>] ? kjournald+0xe2/0x20a
[<ffffffff81040a38>] ? autoremove_wake_function+0x0/0x2e
[<ffffffff811226aa>] ? kjournald+0x0/0x20a
[<ffffffff81040665>] ? kthread+0x79/0x81
[<ffffffff81002c94>] ? kernel_thread_helper+0x4/0x10
[<ffffffff810405ec>] ? kthread+0x0/0x81
[<ffffffff81002c90>] ? kernel_thread_helper+0x0/0x10
Thanks,
Mark
Re: RAID1 boot - no bootable media found
Mark Knecht posted on Thu, 01 Apr 2010 11:57:47 -0700 as excerpted:

> A bit long in response. Sorry.
>
> On Tue, Mar 30, 2010 at 11:56 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>> Mark Knecht posted on Tue, 30 Mar 2010 13:26:59 -0700 as excerpted:
>>
>>> [W]hen I change the hard drive boot does the old sdb become the new
>>> sda because it's what got booted? Or is the order still as it was?

>> That depends on your BIOS.

> It seems to be constant mapping meaning (I guess) that I need to change
> the drive specs in grub.conf on the second drive to actually use the
> second drive.
>
> I made the titles for booting different for each grub.conf file to
> ensure I was really getting grub from the second drive. My sda grub boot
> menu says "2.6.33-gentoo booting from sda" on the first drive, sdb on
> the second drive, etc.

Making the titles different is a very good idea. It's what I ended up
doing too, as otherwise, it can get confusing pretty fast.

Something else you might want to do, for purposes of identifying the
drives at the grub boot prompt if something goes wrong or you are
otherwise trying to boot something on another drive, is create a (probably
empty) differently named file on each one, say grub.sda, grub.sdb, etc.

That way, if you end up at the boot prompt you can do a find /grub.sda
(or /grub/grub.sda, or whatever), and grub will return a list of the
drives with that file, in this case, only one drive, thus identifying your
normal sda drive.

You can of course do similar by cat-ing the grub.conf file on each one,
since you are keeping your titles different, but that's a bit more
complicated than simply doing a find on the appropriate file, to get your
bearings straight on which is which in the event something screws up.
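
A sketch of the idea, reusing the /mnt/boot backup mount-point from
before (device names assumed, as usual):

touch /boot/grub.sda
mount /dev/sdb1 /mnt/boot && touch /mnt/boot/grub.sdb && umount /mnt/boot

Then at the grub prompt:

grub> find /grub.sda
 (hd0,0)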

>>
>> The point being... [using badblocks] it /is/ actually possible to
>> verify that they're working well before you fdisk/mkfs and load data.
>> Tho it does take awhile... days... on drives of modern size.
>>
> I'm trying badblocks right now on sdc. using
>
> badblocks -v /dev/sdc
>
> Maybe I need to do something more strenuous? It looks like it will be
> done an an hour or two. (i7-920 with SATA drives so it should be fast,
> as long as I'm not just reading the buffers or something like that.
>
> Roughly speaking 1TB read at 100MB/S should take 10,000 seconds or 2.7
> hours. I'm at 18% after 28 minutes so that seems about right. (With no
> errors so far assuming I'm using the right command)

I used the -w switch here, which actually goes over the disk a total of 8
times, alternating writing and then reading back to verify the written
pattern, for four different write patterns (0xaa, 0x55, 0xff, 0x00, which
is alternating 10101010, alternating 01010101, all ones, all zeroes).

But that's "data distructive". IOW, it effectively wipes the disk. Doing
it when the disks were new, before I fdisked them let alone mkfs-ed and
started loading data, was fine, but it's not something you do if you have
unbacked up data on them that you want to keep!

Incidentally, that's not /quite/ the infamous US-DOD 7-pass wipe, as it's
only 4 passes, but it should reasonably ensure against ordinary recovery,
in any case, if you have reason to wipe your disks... Well, except for
any blocks the disk internals may have detected as bad and rewritten
elsewhere, you can get the SMART data on that. But a 4-pass wipe, as
badblocks -w does, should certainly be good for the normal case, and is
already way better than just an fdisk, which doesn't even wipe anything
but the partition table!

But back to the timing. Since the -w switch does a total of 8 passes (4
each write and read, alternating) while you're doing just one with just
-v, it'll obviously take 8 times the time. So 80K seconds... 22+ hours.

So I guess it's not days... just about a day. (Probably something more,
as the first part of the disk, near the outside edge, should go faster
than the last part, so figure a bit over a day, maybe 30 hours...)
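
For reference, the destructive run I described would be something like
this -- do NOT point it at a disk holding data you want, since it wipes
everything (-s just adds a progress display):

badblocks -wsv /dev/sdc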


[8 second spin-down timeouts]

> Very true. Here is the same drive model I put in a new machine for my
> dad. It's been powered up and running Gentoo as a typical desktop
> machine for about 50 days. He doesn't use it more than about an hour a
> day on average. It's already hit 31K load/unload cycles. At 10% of 300K
> that about 1.5 years of life before I hit that spec. I've watched his
> system a bit and his system seems to add 1 to the count almost exactly
> every 2 minutes on average. Is that a common cron job maybe?

It's unlikely to be a cron job. But check your logging, and check what
sort of atime you're using on your mounts (relatime is the new kernel
default, but it was atime until relatively recently, say 2.6.30 or .31 or
some such, and noatime is recommended unless you have something that
actually depends on atime, alpine is known to need it for mail, and some
backup software uses it, tho little else on a modern system will, I always
use noatime on my real disk mounts, as opposed to say tmpfs, here). If
there's something writing to the log every two minutes or less, and the
buffers are set to timeout dirty data and flush to disk every two
minutes... And simply accessing a file will change the atime on it if you
have that turned on, thus necessitating a write to disk to update the
atime, with those dirty buffers flushed every X minutes or seconds as well.
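
A quick way to check and experiment without committing anything to fstab
yet (a sketch; shown for /, do the same for the other mounts):

grep ' / ' /proc/mounts        # shows whether / is mounted atime, relatime or noatime
mount -o remount,noatime /     # takes effect until the next reboot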

> I looked up the spec on all three WD lines - Green, Blue and Black. All
> three were 300K cycles. This issue has come up on the RAID list. It
> seems that some other people are seeing this and aren't exactly sure
> what Linux is doing to cause this.

It's probably not just Linux, but a combination of Linux and the drive
defaults.

> I'll study hdparm and BIOS when I can reboot.
>
> My dad's current data:

> ID# ATTRIBUTE_NAME      FLAG   VALUE WORST THRESH TYPE    UPDATED WHEN_FAILED RAW_VALUE
>   4 Start_Stop_Count    0x0032 100   100   000    Old_age Always      -       21
>   9 Power_On_Hours      0x0032 099   099   000    Old_age Always      -       1183
>  12 Power_Cycle_Count   0x0032 100   100   000    Old_age Always      -       20
> 193 Load_Cycle_Count    0x0032 190   190   000    Old_age Always      -       31240

Here's my comparable numbers, several years old Seagate 7200.8 series:

  4 Start_Stop_Count    0x0032 100   100   020    Old_age Always      -       996
  9 Power_On_Hours      0x0032 066   066   000    Old_age Always      -       30040
 12 Power_Cycle_Count   0x0032 099   099   020    Old_age Always      -       1045

Note that I don't have #193, the load-cycle counts. There's a couple
different technologies here. The ramp-type load/unload yours uses is
typical of the smaller 2.5" laptop drives. These are designed for far
shorter idle/standby timeouts and thus a far higher load-cycle rating,
typically 300,000 to 600,000. Standard desktop/server drives
use a contact park method and a lower power cycle count, typically 50,000
or so. That's the difference.

At 300,000 load cycle count rating, your WDs are at the lower end of the
ramp-type ratings, but still far higher than comparable contact power-
cycle ratings. Even tho the ramp-type they use is good for far more
cycles, as you mentioned, you're already at 10% after only 50 days.

My old Seagates, OTOH, about 4.5 years old best I can figure (I bought
them around October, 30K operating hours ~3.5 years, and I run them most
but not all of the time, so 4.5 years is a good estimate), rated for only
50,000 contact start/stop cycles (they're NOT the ramp type), but SMART
says only about 1,000, or 2% of the rating, gone. (If you check the stats
they seem to recommend replacing at 20%, assuming that's a percentage,
which looks likely, but either way, that's a metric I don't need to worry
about any time soon.)

OTOH, at 30,000+ operating hours (about 3.5 years if on constantly, as I
mentioned above), that one's running rather lower. Again assuming it's a
percentage metric, it would appear they rate them @ 90,000 hours. (I
looked up the specsheets tho, and couldn't see anything like that listed,
only 5 years lifetime and warranty, which would be about half that, 45,000
hours. But given the 0 threshold there, it appears they expect the start-
stop cycles to be more critical, so they may indeed rate it 90,000
operating hours.) That'd be three and a half years of use, straight thru,
so yeah, I've had them, probably four and half years now, probably five in
October -- I don't have them spin down at all and often leave my system on
for days at a time, but not /all/ the time, so 3.5 years of use in 4.5
years sounds reasonable.

> Yeah, that's important. Thanks. If I can solve all these RAID problems
> then maybe I'll look at adding RAID to his box with better drives or
> something.

One thing they recommend with RAID, which I did NOT do, BTW, and which I'm
beginning to worry about since I'm approaching the end of my 5 year
warranties, is buying either different brands or models, or at least
ensuring you're getting different lot numbers of the same model. The idea
being, if they're all the same model and lot number, and they're all part
of the same RAID so in similar operating conditions, they're all likely to
go out pretty close to each other. That's one reason to be glad I'm
running 4-way RAID-1, I suppose, as one hopes that when they start going,
even if they are the same model and lot number, at least one of the four
can hang on long enough for me to buy replacements and transfer the
critical data. But I also have an external 1 TB USB I bought, kept off
most of the time as opposed to the RAID disks which are on most of the
time, that I've got an external backup on, as well as the backups on the
RAIDs, tho the external one isn't as regularly synced. But in the event
all four RAID drives die on me, I HAVE test-booted from a USB thumb drive
(the external 1TB isn't bootable -- good thing I tested, eh!), to the
external 1TB, and CAN recover from it, if I HAVE to.

> Note that on my system only I'm seeing real problems in
> /var/log/message, non-RAID, like 1000's of these:
>
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 45276264 on sda3
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46309336 on sda3
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567488 on sda3
> Mar 29 14:06:33 keeper kernel: rsync(3368): READ block 46567680 on sda3
>
> or
>
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555752 on
> sda3
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555760 on
> sda3
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555768 on
> sda3
> Mar 29 14:07:36 keeper kernel: flush-8:0(3365): WRITE block 33555776 on
> sda3

That doesn't look so good...

> However I see NONE of that on my dad's machine using the same drive but
> different chipset.
>
> The above problems seem to result in this sort of problem when I try
> going with RAID, as I tried again this morning:
>
> INFO: task kjournald:5064 blocked for more than 120 seconds. "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.

[snipped the trace]

Ouch! Blocked for 2 minutes...

Yes, between the logs and the 2-minute hung-task, that does look like some
serious issues, chipset or other...

Talking about which...

Can you try different SATA cables? I'm assuming you and your dad aren't
using the same cables. Maybe it's the cables, not the chipset.

Also, consider slowing the data down. Disable UDMA or reduce it to a
lower speed, or check the pinouts and try jumpering OPT1 to force SATA-1
speeds (150 MB/sec instead of 300 MB/sec) as detailed here (watch the
wrap!):

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1337

If that solves the issue, then you know it's related to signal timing.

Unfortunately, this can be mobo related. I had very similar issues with
memory at one point, and had to slow it down from the rated PC3200, to
PC3000 speed (declock it from 200 MHz to 183 MHz), in the BIOS.
Unfortunately, initially the BIOS didn't have a setting for that; it
wasn't until a BIOS update that I got it. Until I got the update and
declocked it, it would work most of the time, but was borderline. The
thing was, the memory was solid and tested so in memtest86+, but that
tests memory cells, not speed, and at the rated speed, that memory and
that board just didn't like each other, and there'd be occasional issues
(bunzip2 erroring out due to checksum mismatch was a common one, and
occasional crashes). Ultimately, I fixed the problem when I upgraded
memory.

So having experienced the issue with memory, I know exactly how
frustrating it can be. But if you slow it down with the jumper and it
works, then you can try different cables, or take off the jumper and try
lower UDMA speeds (but still higher than SATA-1/150MB/sec), using hdparm
or something. Or exchange either the drives or the mobo, if you can, or
buy an add-on SATA card and disable the onboard one.

Oh, and double-check the kernel driver you are using for it as well.
Maybe there's another that'll work better, or driver options you can feed
to it, or something.
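
A couple of harmless ways to see what's actually driving the controller
(nothing here changes anything):

lspci -k | grep -i -A 3 sata              # which kernel driver is bound to the SATA controller
dmesg | grep -i -e ahci -e 'ata[0-9]'     # how the ports and link speeds came up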

Oh, and if you hadn't re-fdisked, re-created new md devices, remkfsed, and
reloaded the system from backup, after you switched to AHCI, try that.
AHCI and the kernel driver for it is almost certainly what you want, not
compatibility mode, but that could potentially screw things up too, if you
switched it and didn't redo the disk afterward.

I do wish you luck! Seeing those errors brought back BAD memories of the
memory problems I had, so while yours is disk not memory, I can definitely
sympathize!

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: Re: RAID1 boot - no bootable media found
Good stuff. I'll snip out the less important to keep the response
shorter but don't think for a second that I didn't appreciate it. I
do!

On Fri, Apr 2, 2010 at 2:43 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Mark Knecht posted on Thu, 01 Apr 2010 11:57:47 -0700 as excerpted:
<SNIP>
>
> Making the titles different is a very good idea.  It's what I ended up
> doing too, as otherwise, it can get confusing pretty fast.
>
> Something else you might want to do, for purposes of identifying the
> drives at the grub boot prompt if something goes wrong or you are
> otherwise trying to boot something on another drive, is create a (probably
> empty) differently named file on each one, say grub.sda, grub.sdb, etc.

I'll consider that, once I get the hard problems solved.
<SNIP>
>> Roughly speaking 1TB read at 100MB/S should take 10,000 seconds or 2.7
>> hours. I'm at 18% after 28 minutes so that seems about right. (With no
>> errors so far assuming I'm using the right command)
>
> I used the -w switch here, which actually goes over the disk a total of 8
> times, alternating writing and then reading back to verify the written
> pattern, for four different write patterns (0xaa, 0x55, 0xff, 0x00, which
> is alternating 10101010, alternating 01010101, all ones, all zeroes).

OK, makes sense then.

I ran one pass of badblocks on each of the drives. No problem found.

I know some Linux folks don't like Spinrite but I've had good luck
with it, so that's running now. The problem is it cannot test the drives
in parallel, and it looks like it wants at least 24 hours per drive, so
covering all three would take 3 days. I will likely let it run
through the first drive (I'm busy today) and then tomorrow drop back
into Linux and possibly try your badblocks on all 3 drives. I'm not
overly concerned about losing the install.
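
If I do end up running your badblocks -w pass on all 3 drives, the plan
is roughly this; it's destructive, but that's fine since I'm writing off
this install anyway (device names are from this box):

# Write-mode test (4 patterns, each read back), one drive per terminal;
# -s shows progress, -v reports what it finds
badblocks -w -s -v /dev/sda
badblocks -w -s -v /dev/sdb
badblocks -w -s -v /dev/sdc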

<SNIP>
>
>
> [8 second spin-down timeouts]
>
>> Very true. Here is the same drive model I put in a new machine for my
>> dad. It's been powered up and running Gentoo as a typical desktop
>> machine for about 50 days. He doesn't use it more than about an hour a
>> day on average. It's already hit 31K load/unload cycles. At 10% of 300K
>> that's about 1.5 years of life before I hit that spec. I've watched his
>> system a bit and his system seems to add 1 to the count almost exactly
>> every 2 minutes on average. Is that a common cron job maybe?
>
> It's unlikely to be a cron job.  But check your logging, and check what
> sort of atime you're using on your mounts (relatime is the new kernel
> default, but it was atime until relatively recently, say 2.6.30 or .31 or
> some such, and noatime is recommended unless you have something that
> actually depends on atime, alpine is known to need it for mail, and some
> backup software uses it, tho little else on a modern system will, I always
> use noatime on my real disk mounts, as opposed to say tmpfs, here).  If
> there's something writing to the log every two minutes or less, and the
> buffers are set to timeout dirty data and flush to disk every two
> minutes...  And simply accessing a file will change the atime on it if you
> have that turned on, thus necessitating a write to disk to update the
> atime, with those dirty buffers flushed every X minutes or seconds as well.
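
I can check what my dad's machine is actually running with, something
like this (just a sketch; these are the stock kernel knobs, nothing
tuned on his box as far as I know):

# Which atime behaviour each mount ended up with; entries showing
# neither noatime nor relatime are doing full atime updates
grep -E 'noatime|relatime' /proc/mounts

# How often dirty data gets flushed out, in centiseconds
cat /proc/sys/vm/dirty_writeback_centisecs
cat /proc/sys/vm/dirty_expire_centisecs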

Here is fstab from my dad's machine which racks up 30
Load_Cycle_Counts an hour:

# NOTE: If your BOOT partition is ReiserFS, add the notail option to opts.
LABEL="myboot" /boot ext2 noauto,noatime 1 2
LABEL="myroot" / ext3 noatime 0 1
LABEL="myswap" none swap sw 0 0
LABEL="homeherb" /home/herb ext3 noatime 0 1
/dev/cdrom /mnt/cdrom auto noauto,ro 0 0
#/dev/fd0 /mnt/floppy auto noauto 0 0

# glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
# POSIX shared memory (shm_open, shm_unlink).
# (tmpfs is a dynamically expandable/shrinkable ramdisk, and will
# use almost no memory if not populated with files)
shm /dev/shm tmpfs nodev,nosuid,noexec 0 0

On the other hand there is some cron stuff going on every 10 minutes
or so. Possibly it's not 1 event every 2 minutes but maybe 5 events
every 10 minutes?

Apr 2 07:10:01 gandalf cron[6310]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 07:20:01 gandalf cron[6322]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 07:30:01 gandalf cron[6335]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 07:40:01 gandalf cron[6348]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 07:50:01 gandalf cron[6361]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 07:59:01 gandalf cron[6374]: (root) CMD (rm -f
/var/spool/cron/lastrun/cron.hourly)
Apr 2 08:00:01 gandalf cron[6376]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 08:10:01 gandalf cron[6388]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 08:20:01 gandalf cron[6401]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 08:30:01 gandalf cron[6414]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 08:40:01 gandalf cron[6427]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 08:50:01 gandalf cron[6440]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 08:59:01 gandalf cron[6453]: (root) CMD (rm -f
/var/spool/cron/lastrun/cron.hourly)
Apr 2 09:00:01 gandalf cron[6455]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 09:10:01 gandalf cron[6467]: (root) CMD (test -x
/usr/sbin/run-crons && /usr/sbin/run-crons )
Apr 2 09:18:01 gandalf sshd[6479]: Accepted keyboard-interactive/pam
for root from 67.188.27.80 port 51981 ssh2
Apr 2 09:18:01 gandalf sshd[6479]: pam_unix(sshd:session): session
opened for user root by (uid=0)
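
To nail down the every-2-minutes pattern I'll probably just log the raw
counter for a while, roughly like this (a sketch; 193 Load_Cycle_Count
is where these drives report it, and the hdparm line is only worth
trying if the firmware honors APM at all):

# Timestamped Load_Cycle_Count reading once a minute
while true; do
    date
    smartctl -A /dev/sda | grep -i load_cycle
    sleep 60
done

# If it really is the 8-second idle timer, a higher APM level is
# supposed to curb the head parking (some WD firmware ignores this)
hdparm -B 254 /dev/sda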


<SNIP>
>
> Note that I don't have #193, the load-cycle counts.  There's a couple
> different technologies here.  The ramp-type load/unload yours uses is
> typical of the smaller 2.5" laptop drives.  These are designed for far
> shorter idle/standby timeouts and thus a far higher cycle count, load
> cycles, typical rating 300,000 to 600,000.  Standard desktop/server drives
> use a contact park method and a lower power cycle count, typically 50,000
> or so.  That's the difference.

I also purchased two Enterprise Edition drives - the 500GB size. They
are also spec'ed at 300K

http://www.wdc.com/en/products/products.asp?DriveID=489

My intention was to use them in a RAID0 and then back them up daily to
RAID1 for more safety. However I'm starting to think this TLER feature
may well be part of this problem. I don't want to start using them
however until I understand this 30/hour issue. No reason to wear
everything out!
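
Before they go into service I at least want to look at the TLER setting
directly, roughly like this (this needs a smartctl new enough to know
about SCT Error Recovery Control, and /dev/sdd is just a placeholder for
wherever the drive ends up):

# Query the current error recovery (TLER) timeouts
smartctl -l scterc /dev/sdd

# Example of setting 7 second read/write limits (values are tenths of a
# second), assuming the firmware allows changing it at all
smartctl -l scterc,70,70 /dev/sdd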

<SNIP>
>
> One thing they recommend with RAID, which I did NOT do, BTW, and which I'm
> beginning to worry about since I'm approaching the end of my 5 year
> warranties, is buying either different brands or models, or at least
> ensuring you're getting different lot numbers of the same model.  The idea
> being, if they're all the same model and lot number, and they're all part
> of the same RAID so in similar operating conditions, they're all likely to
> go out pretty close to each other.  That's one reason to be glad I'm
> running 4-way RAID-1, I suppose, as one hopes that when they start going,
> even if they are the same model and lot number, at least one of the four
> can hang on long enough for me to buy replacements and transfer the
> critical data.

Exactly! My plan for this box is a 3-disk RAID1, as 3 disks is all it will hold.

Most folks don't understand that if 1 drive has a 1% chance of failing,
then a set of 3 drives has roughly a 3% chance of at least one failure
(1 - 0.99^3 is about 2.97%), and that assumes the drives are truly
independent. If they all come from the same lot and 1 fails, then it's
logically more likely that the other 2 will fail in the next few days or
weeks. Certainly much faster than if they came from different makers.


<SNIP>
>>
>> INFO: task kjournald:5064 blocked for more than 120 seconds. "echo 0 >
>> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> [snipped the trace]
>
> Ouch!  Blocked for 2 minutes...
>
> Yes, between the logs and the 2-minute hung-task, that does look like some
> serious issues, chipset or other...
>
> Talking about which...
>
> Can you try different SATA cables?  I'm assuming you and your dad aren't
> using the same cables.  Maybe it's the cables, not the chipset.

Now that's an interesting thought. On my other machines I used the
cables Intel shipped with the MB. However, in this case I couldn't
because the SATA connectors don't point upward but come out
horizontally. Due to proximity to the drive container I had to get 90
degree cables and all 3 drives are using those right now. I can switch
two of the drives to the Intel cables.

That said, Spinrite has been running for hours without any problem at
all, and it will tell me if there are delays, sectors not found, etc.,
so if the problem were as blatant as it appears to be when running
Linux I really think I would have seen it by now. I would have
guessed badblocks would have shown it too, but possibly not.

>
> Also, consider slowing the data down.  Disable UDMA or reduce it to a
> lower speed, or check the pinouts and try jumpering OPT1 to force SATA-1
> speeds (150 MB/sec instead of 300 MB/sec) as detailed here (watch the
> wrap!):
>
> http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?
> p_faqid=1337
>
> If that solves the issue, then you know it's related to signal timing.

Will try it.

>
> Unfortunately, this can be mobo related.  I had very similar issues with
> memory at one point, and had to slow it down from the rated PC3200, to
> PC3000 speed (declock it from 200 MHz to 183 MHz), in the BIOS.
> Unfortunately, initially the BIOS didn't have a setting for that; it
> wasn't until a BIOS update that I got it.  Until I got the update and
> declocked it, it would work most of the time, but was borderline.  The
> thing was, the memory was solid and tested so in memtest86+, but that
> tests memory cells, not speed, and at the rated speed, that memory and
> that board just didn't like each other, and there'd be occasional issues
> (bunzip2 erroring out due to checksum mismatch was a common one, and
> occasional crashes). Ultimately, I fixed the problem when I upgraded
> memory.

OK, so I have 6 of these drives and multiple PCs. While not a perfect
test, I can try putting a couple into another machine and building a
2-drive RAID1 just to see what happens.
>
> So having experienced the issue with memory, I know exactly how
> frustrating it can be.  But if you slow it down with the jumper and it
> works, then you can try different cables, or take off the jumper and try
> lower UDMA speeds (but still higher than SATA-1/150MB/sec), using hdparm
> or something.  Or exchange either the drives or the mobo, if you can, or
> buy an add-on SATA card and disable the onboard one.
>
> Oh, and double-check the kernel driver you are using for it as well.
> Maybe there's another that'll work better, or driver options you can feed
> to it, or something.

The kernel driver is ahci. I don't know that I have any alternatives
when booting AHCI from BIOS, but I can look at the other modes with
other drivers and see if the problem still occurs. That's a bit of
work but probably worth it.
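
First step there is just confirming what's actually bound and what
knobs it exposes, roughly like this (output formats vary a bit between
versions):

# Which kernel driver is attached to the SATA controller
lspci -k | grep -i -A 3 sata

# Parameters the ahci driver accepts (works when it's built as a module)
modinfo -p ahci

# Anything it complained about at boot
dmesg | grep -i ahci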

This is all one big matrix of experiments that should eventually
narrow the problem down to a single cause. (Hopefully!)

>
> Oh, and if you haven't re-fdisked, re-created the md devices, re-mkfsed,
> and reloaded the system from backup since you switched to AHCI, try that.
> AHCI and the kernel driver for it are almost certainly what you want, not
> compatibility mode, but the switch could potentially screw things up too, if you
> switched it and didn't redo the disk afterward.
>
> I do wish you luck!  Seeing those errors brought back BAD memories of the
> memory problems I had, so while yours is disk not memory, I can definitely
> sympathize!

As always, thanks for the help. I'm very interested, and yes, even a
little frustrated! ;-)

Cheers,
Mark
Re: Re: RAID1 boot - no bootable media found [ In reply to ]
On Fri, Apr 2, 2010 at 10:18 AM, Mark Knecht <markknecht@gmail.com> wrote:
<SNIP>
>
> I also purchased two Enterprise Edition drives - the 500GB size. They
> are also spec'ed at 300K
>
> http://www.wdc.com/en/products/products.asp?DriveID=489
>
> My intention was to use them in a RAID0 and then back them up daily to
> RAID1 for more safety. However I'm starting to think this TLER feature
> may well be part of this problem. I don't want to start using them
> however until I understand this 30/hour issue. No reason to wear
> everything out!
>
> <SNIP>

Duncan,
Just a quick follow-up. I tested the WD Green drives for the last
36 hours in Spinrite and found no problems. I think the drives are OK.

I then took the two 500GB Raid Enterprise drives above and made a
new RAID1 on which I did a quick Gentoo install. This uses the same
SATA controller, same cables and same drivers as I was using earlier
with the Green drives. While doing the install I saw no problems or
delays on them. Now I've got a new RAID1 install but cannot get it to
boot as grub doesn't seem to automatically assemble the RAID1 device.
I'm moderately confident that once I figure out how to get grub to
understand / is on RAID1 then I may be in good shape.
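
For the record, the sort of grub.conf stanza I'm fiddling with looks
roughly like this; the kernel filename and the md numbers are
placeholders from my test setup, so treat it as a sketch rather than a
known-good recipe:

# /boot/grub/grub.conf: (hd0,0) is the RAID1 /boot partition, so the
# kernel path below is relative to that partition
default 0
timeout 5

title Gentoo on RAID1 (sda)
root (hd0,0)
# With 0.90 metadata and partition type fd the kernel can autodetect and
# assemble the array itself; otherwise an initramfs has to do it
kernel /kernel-2.6.33-gentoo root=/dev/md3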

Thanks for your help,

Cheers,
Mark
Re: Re: RAID1 boot - no bootable media found [ In reply to ]
On Sat, Apr 3, 2010 at 4:13 PM, Mark Knecht <markknecht@gmail.com> wrote:
> On Fri, Apr 2, 2010 at 10:18 AM, Mark Knecht <markknecht@gmail.com> wrote:
> <SNIP>
>>
>> I also purchased two Enterprise Edition drives - the 500GB size. They
>> are also spec'ed at 300K
>>
>> http://www.wdc.com/en/products/products.asp?DriveID=489
>>
>> My intention was to use them in a RAID0 and then back them up daily to
>> RAID1 for more safety. However I'm starting to think this TLER feature
>> may well be part of this problem. I don't want to start using them
>> however until I understand this 30/hour issue. No reason to wear
>> everything out!
>>
>> <SNIP>
>
> Duncan,
>   Just a quick follow-up. I tested the WD Green drives for the last
> 36 hours in Spinrite and found no problems. I think the drives are OK.
>
>   I then took the two 500GB Raid Enterprise drives above and made a
> new RAID1 on which I did a quick Gentoo install. This uses the same
> SATA controller, same cables and same drivers as I was using earlier
> with the Green drives. While doing the install I saw no problems or
> delays on them. Now I've got a new RAID1 install but cannot get it to
> boot as grub doesn't seem to automatically assemble the RAID1 device.
> I'm moderately confident that once I figure out how to get grub to
> understand / is on RAID1 then I may be in good shape.
>
>   Thanks for your help,
>
> Cheers,
> Mark
>

OK, after a solid day of testing it really looks like the cause is the
WD10EARS drive. With RAID1 on these new enterprise drives I no longer
see 100% waits and I have no error messages anywhere that I can find.
Additionally, the early SMART data suggests there's not the
same LOAD_CYCLE_COUNT incrementing problem either.

I sort of feel like this should be reported to someone - LKML or maybe
the SATA driver folks. It's completely reproducible, and I'd be happy
to help some developer debug it if they wanted to, but I'm not sure
how to go about something that specific.

Anyway, it appears my problems are solved. I'm using RAID for / and
your ideas about multiple /boot partitions. I haven't tested the other
boots yet but I expect they will work.

Thanks,
Mark
Re: RAID1 boot - no bootable media found [ In reply to ]
Mark Knecht posted on Mon, 05 Apr 2010 11:17:21 -0700 as excerpted:

> Anyway, it appears my problems are solved. I'm using RAID for / and your
> ideas about multiple /boot partitions. I haven't tested the other boots
> yet but I expect they will work.

Good to see that. As you may have noticed, I've been busy and haven't
replied to... this makes three of your posts (this is just a quick reply,
not my usual detail).

I still have them marked unread, and will try to get back to them and see
if there are any loose ends to comment on, soon (a week or so), tho no
promises.

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman