Mailing List Archive

Hard drive failure...
Hi,

Was running Myth CVS just great as of about a week ago. Have about 60gb
worth of shows recorded, and then things just "hung". Couldn't even task
switch to do a clean shutdown. Did a hard reset and then it wouldn't even
load up lilo - gave me an Error 0x10 (media error). Booted from Mandrake 9
CD 1, got to the point where the install was ready to pick a partition and
switched to another session and repaired the /dev/hda5 partition (where I
load from) and got things working again (after it fixed a zillion errors).
It then booted up ok, I made sure to back everything up to another networked
PC, and it ran ok for 24 hours then hung again. Now it won't boot again.

This PC had run perfectly for about 2 years with Win2k and never a glitch of
trouble. Do you think turning on the dma stuff as suggested in the
setup/faq could have screwed it up? If so, would it really take a few weeks
to screw up?

Also, should I try and switch the file system (using linux default now -
ext2 right?) to some journaled one or something "safer"?

Finally, is there an exhaustive test that I can run to see what/where the
problem is (like dma or some parameter)? I hate to throw out an 80gb
7200rpm drive but if it's really bad I will. I'm almost tempted to reformat
with NTFS and see how long it will run Win2k again - "almost" being the key
word there.

What's the right/best way to repair the partition if any?

Thanks for any advice,

JC
Still a linux newbie!
Re: Hard drive failure...
JC wrote:

>This PC had run perfectly for about 2 years with Win2k and never a glitch of trouble. Do you think turning on the dma stuff as suggested in the
>setup/faq could have screwed it up? If so, would it really take a few weeks to screw up?
>
This is technically possible with older hardware, but I find it highly
unlikely with new hardware, unless your IDE controller or something else
on your motherboard is also malfunctioning.

>Also, should I try and switch the file system (using linux default now -
>ext2 right?) to some journaled one or something "safer"?
>
It can't hurt, and may prevent a lot of problems if this happens again.
I'd give ext3 a try, simply because you can switch back and forth from
ext2 to ext3 with no problems in about a minute. There should be docs
for your distro of choice on how to do this, but it's usually as simple
as running "tune2fs -j /dev/hdaX", and then changing /etc/fstab's entry
to ext3.
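
For the fstab half of that change, here is a minimal sketch. The function name to_ext3 is made up for illustration, and it only rewrites text piped into it rather than touching /etc/fstab itself. It assumes "tune2fs -j" has already been run on the partition.

```shell
#!/bin/sh
# to_ext3 DEVICE: read fstab text on stdin and print it with DEVICE's
# filesystem type (third field) changed from ext2 to ext3.
# Hypothetical helper, not part of any distro. Note that awk re-joins the
# fields of the changed line with single spaces.
to_ext3() {
  awk -v dev="$1" '$1 == dev && $3 == "ext2" { $3 = "ext3" } { print }'
}

# Example using the /video entry that appears in JC's fstab in this thread.
printf '/dev/hda5 /video ext2 defaults 1 2\n' | to_ext3 /dev/hda5
```

One way to use it is against a copy, for example "to_ext3 /dev/hda5 < /etc/fstab > /tmp/fstab.new", and then compare the two files before replacing anything.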

>What's the right/best way to repair the partition if any?
>
>
I haven't had to do this in a while, but it seems to me that fsck should
have some options to do a more exhaustive disk scan... let's check the
man page! ;)

Well, passing -c to e2fsck apparently forces it to run badblocks and
mark all corrupt blocks on your disk in the "bad block inode", which I
assume just keeps track of bad blocks so the fs doesn't try to use them.
Apparently using -c twice will force it to do a non-destructive
read-write test to do this, which is probably about as comprehensive as
you're going to get. Another possibly useful option is -f, which forces
a check even if it thinks it's fine.
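
Putting those flags together, here is a cautious sketch. check_cmd is a hypothetical helper that only prints the e2fsck command line and never runs it. It refuses when the device appears in the mount table, since e2fsck should be run on an unmounted filesystem (from a rescue CD, for example). The optional second argument, a mounts file to read, is an assumption added so the check can be tried safely.

```shell
#!/bin/sh
# check_cmd DEVICE [MOUNTS_FILE]
# Print the e2fsck command for a forced check (-f) with a non-destructive
# read-write bad block scan (-c given twice). Refuse if DEVICE is listed
# as mounted in MOUNTS_FILE (default: /proc/mounts).
check_cmd() {
  dev=$1
  mounts=${2:-/proc/mounts}
  # awk exits 0 when the first field of some line equals the device name.
  if awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"; then
    echo "refusing: $dev is mounted" >&2
    return 1
  fi
  echo "e2fsck -f -c -c $dev"
}
```

Calling "check_cmd /dev/hdaX" prints the command to review and run by hand once the partition is unmounted. Expect the read-write scan to take a long time on a large partition.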

Give that a shot and let me know what the output is. I'm curious as to
what could have caused such extensive damage.

Graeme
Re: Hard drive failure...
A little-known trick: if you change the fstab filesystem-type entry to
"ext3,ext2" the mount operation will attempt mounting as ext3 and then
fall back to ext2.


#if Unit3 /* Mar 06, 12:51 */
> It can't hurt, and may prevent a lot of problems if this happens again.
> I'd give ext3 a try, simply because you can switch back and forth from
> ext2 to ext3 with no problems in about a minute. There should be docs
> for your distro of choice on how to do this, but it's usually as simple
> as running "tune2fs -j /dev/hdaX", and then changing /etc/fstab's entry
> to ext3.
#endif /* unit3@demoni.ca */
Re: Hard drive failure...
Andy Davidoff wrote:

>A little-known trick: if you change the fstab filesystem-type entry to
>"ext3,ext2" the mount operation will attempt mounting as ext3 and then
>fallback to ext2.
>
>
Wow, that's incredibly handy.

Mind you, in newer kernels you can just put "auto", and while you get
some bitching in the boot log as it tries to detect the fs type, it
hasn't failed for me so far. :)

Graeme
Re: Hard drive failure...
From: Unit3 <unit3@demoni.ca>
Date: Thu, 06 Mar 2003 12:51:01 -0600
> JC wrote:
> >This PC had run perfectly for about 2 years with Win2k and never a glitch of trouble. Do you think turning on the dma stuff as suggested in the
> >setup/faq could have screwed it up? If so, would it really take a few weeks to screw up?
> >
> This is technically possible with older hardware, but I find it highly
> unlikely with new hardware, unless your IDE controller or something else
> on your motherboard is also malfunctioning.

It seems pretty unlikely to me too.

> >Also, should I try and switch the file system (using linux default now -
> >ext2 right?) to some journaled one or something "safer"?

A journaled filesystem like ext3 works great as a filesystem for
normal files. For the type of file usage that MythTV does (create a
huge file once, and then only do reads from it), it really only
provides 1 big advantage: fast fsck time due to better crash
tolerance. Running fsck on a 100gig partition takes too long for
ext2, so I switched to ext3, and I've been pretty happy with it.
Here's my entry in /etc/fstab:
/dev/hda7 /data ext3 defaults,noatime,data=writeback 0 2

> >What's the right/best way to repair the partition if any?
>
> I haven't had to do this in a while, but it seems to me that fsck should
> have some options to do a more exhaustive disk scan... let's check the
> man page! ;)
>
> well, passing -c to e2fsck apparently forces it to run badblocks and
> mark all corrupt blocks on your disk in the "bad block inode", which I
> assume just keeps track of bad blocks so the fs doesn't try to use them.
> Apparently using -c twice will force it to do a non-destructive
> read-write test to do this, which is probably about as comprehensive as
> you're going to get. Another possibly useful option is -f, which forces
> a check even if it thinks it's fine.

I recently lost the hard drive in my MythTV machine. I ran the
badblocks non-destructive read-write test (which takes a long time to
run), and it came up with nothing. Apparently modern hard drives do
their own bad block reallocation automatically. A lot of hard drives
have 3 year warranties, so mine was still covered. I ran Maxtor's
diagnostic program on it and got a failure. After that, it was as
simple as an RMA.

Cliff Draper Sun Microsystems, Forte Tools
My opinions may or may not reflect those of my employer.
---------------------------- food for thought ---------------------------
The meta-Turing test counts a thing as intelligent if it seeks to
devise and apply Turing tests to objects of its own creation.
Re: Hard drive failure...
JC wrote:
> Hi,
>
> Was running Myth CVS just great as of about a week ago. Have about 60gb
> worth of shows recorded, and then things just "hung". Couldn't even task
> (snip)
> Now it won't boot again.
>
> This PC had run perfectly for about 2 years with Win2k and never a glitch of
> trouble. Do you think turning on the dma stuff as suggested in the
> setup/faq could have screwed it up? If so, would it really take a few weeks
> to screw up?
>

It is NOT likely a hardware problem. It is likely a corrupt filesystem.
Type "man fsck", read it, understand it, then run "fsck". This should get
your computer back up and running.


> Also, should I try and switch the file system (using linux default now -
> ext2 right?) to some journaled one or something "safer"?
> (snip)
> JC

Another alternative to running a journaled file system is to mount your
existing partitions using the "sync" option. IMHO this is a much better
solution.

Linux, by default, mounts 'ext2' partitions async. This is
not a good thing, because data is not written to the disk right away;
if you hit the reset button, you are doomed to lose data
every time. (No other OS does this, as far as I know.) Mounting the
filesystems with the sync option will definitely sacrifice speed,
but it is much safer. I say this is a better option because it
doesn't carry the overhead/hassle of a journaled filesystem and is almost
as safe.

But I would only do this if you have set up a separate partition for your
video that can still be mounted 'async', so that you don't lose speed
there.
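
To make that split concrete, here is a rough fstab sketch. The device names and mount points are borrowed from JC's setup elsewhere in this thread, and keeping both partitions on ext2 is an assumption for illustration only.

```
# OS partition: synchronous writes (slower, but less is lost on a hard reset)
/dev/hda6  /       ext2  defaults,sync  1 1
# Recordings: left asynchronous so capture speed is not affected
/dev/hda5  /video  ext2  defaults       1 2
```

"defaults" already implies async, so only the OS line needs the extra option.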

-rac
Re: Hard drive failure...
Ok, here's the scoop...

I booted from the CD, entered "rescue", did e2fsck -c -y /dev/hda5 and it
found and fixed lots of errors. I rebooted when done and all came up just
like it was never messed up. I suspect some files *must* be messed up, but
it's not visible. Myth even came up.

So next, I ran tune2fs -j /dev/hda5 (my 68gb /video partition) and also on
/dev/hda6 (my 3.4gb os partition). I also have a 1gb swap partition (have
512mb ram). That ran fine. Next I put in "ext3,ext2" in my fstab for both
the hda5 and hda6 as suggested. Here's my fstab (feel free to criticize!):

[root@p800 mythtv]# cat /etc/fstab
# WAS: /dev/hda6 / ext2 defaults 1 1
/dev/hda6 / ext3,ext2 defaults 1 1
none /dev/pts devpts mode=0620 0 0
//192.168.0.6/C /mnt/C smbfs credentials=/etc/samba/auth.192.168.0.6.crombej 0 0
//192.168.0.6/DVD /mnt/DVD smbfs credentials=/etc/samba/auth.192.168.0.6.crombej 0 0
none /mnt/floppy supermount dev=/dev/fd0,fs=auto,--,iocharset=iso8859-1,sync,codepage=850,umask=0 0 0
//c117787-d/C /mnt/p4_C smbfs credentials=/etc/samba/auth.c117787-d.crombej 0 0
//192.168.0.6/V /mnt/p4_v smbfs credentials=/etc/samba/auth.192.168.0.6.crombej 0 0
//c117787-d/X_VIDEO /mnt/p4_x smbfs credentials=/etc/samba/auth.c117787-d.crombej 0 0
none /proc proc defaults 0 0
# WAS: /dev/hda5 /video ext2 defaults 1 2
/dev/hda5 /video ext3,ext2 defaults 1 2
/dev/hda7 swap swap defaults 0 0

And here's the output from "mount"
[root@p800 mythtv]# mount
/dev/hda6 on / type ext3,ext2 (rw)
none on /proc type proc (rw)
none on /dev/pts type devpts (rw,mode=0620)
/dev/hda5 on /video type ext2 (rw)
//192.168.0.6/C on /mnt/C type smbfs (0)
//192.168.0.6/DVD on /mnt/DVD type smbfs (0)
//c117787-d/C on /mnt/p4_C type smbfs (0)
//192.168.0.6/V on /mnt/p4_v type smbfs (0)
//c117787-d/X_VIDEO on /mnt/p4_x type smbfs (0)
[root@p800 mythtv]#

Notice how hda6 shows ext3,ext2 but hda5 only shows ext2? Why is that?
To prove that I really did the tune2fs -j on it, I ran it again and it
reported "The filesystem already has a journal.". Is that right?

Finally, should I change hda5 and hda6 both to
"defaults,noatime,data=writeback 0 2", or only the OS partition, or only the
video one, or what? I've read about it but your practical experience is way more
valuable than my books!

Thanks!
JC

P.S. Now that Myth isn't currently running and I'm just sitting at KDE, is
there some "system test" program I should run to just run the heck out of it
(hard drive, memory, video, etc.) to see if all is working well and will
stay up? Hate to start doing some heavy Myth coding and have it crash on me
again.


----- Original Message -----
From: "Cliff Draper" <Cliff.Draper@sun.com>
To: <mythtv-users@snowman.net>
Sent: Thursday, March 06, 2003 3:10 PM
Subject: Re: [mythtv-users] Hard drive failure...


> (snip)
Re: Hard drive failure...
Just out of curiosity, now that my system is up and running, what's the
best way to back up the whole thing to a networked drive?

Do a tar or something? Can I do it from the root (/) and get everything
(even "open" files) but excluding my /video partition and mounted samba
shares? In Windows I'd run the system backup program, check a few boxes and
let it rip. What do I do in Mandrake 9 / linux?

Thanks!
JC

P.S. Here's my current mounts:
[root@p800 mythtv]# mount
/dev/hda6 on / type ext3,ext2 (rw)
none on /proc type proc (rw) --- Don't know what this is! ;-)
none on /dev/pts type devpts (rw,mode=0620) --- Don't know what this is! ;-)
/dev/hda5 on /video type ext2 (rw) --- Don't want to backup this
//192.168.0.6/C on /mnt/C type smbfs (0) --- Don't want to backup this
//192.168.0.6/DVD on /mnt/DVD type smbfs (0) --- Don't want to backup this
//c117787-d/C on /mnt/p4_C type smbfs (0) --- Don't want to backup this
//192.168.0.6/V on /mnt/p4_v type smbfs (0) --- Don't want to backup this
//c117787-d/X_VIDEO on /mnt/p4_x type smbfs (0) --- Don't want to backup this

----- Original Message -----
From: "Ryan A. Carris" <rac@racarris.com>
To: "Discussion about mythtv" <mythtv-users@snowman.net>
Sent: Thursday, March 06, 2003 8:18 PM
Subject: Re: [mythtv-users] Hard drive failure...


> (snip)
RE: Hard drive failure...
I found rsync a good way to do backups. Here's a good explanation of
different ways to use rsync for backups:
http://www.mikerubel.org/computers/rsync_snapshots/
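
For the plain tar route JC asked about, here is a minimal sketch. backup_tree is a hypothetical helper name, and the excluded directories (/video, /mnt, /proc) are assumptions taken from the mount list in his message. It relies on tar's --exclude option.

```shell
#!/bin/sh
# backup_tree SRC_DIR ARCHIVE
# Create a gzip-compressed tar of SRC_DIR, leaving out the recordings
# partition, the mounted network shares and /proc. Because of -C, the
# exclude patterns are written relative to SRC_DIR.
backup_tree() {
  src=$1
  archive=$2
  tar -czf "$archive" -C "$src" \
      --exclude=./video --exclude=./mnt --exclude=./proc .
}
```

Something like "backup_tree / /mnt/C/p800-backup.tar.gz" would write the archive to one of the samba shares (the file name is made up). Since /mnt is excluded, the archive does not try to include itself. Files that change while tar is reading them may be stored in an inconsistent state.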

Ben

-----Original Message-----
From: mythtv-users-bounces@snowman.net
[mailto:mythtv-users-bounces@snowman.net] On Behalf Of JC
Sent: Friday, March 07, 2003 12:47 AM
To: Discussion about mythtv
Subject: Re: [mythtv-users] Hard drive failure...


(snip)

_______________________________________________
mythtv-users mailing list
mythtv-users@snowman.net
http://www.snowman.net/mailman/listinfo/mythtv-users
Re: Hard drive failure...
On Thu, Mar 06, 2003 at 12:01:51PM -0500, JC wrote:
> Hi,
>
> Was running Myth CVS just great as of about a week ago. Have about 60gb
> worth of shows recorded, and then things just "hung". Couldn't even task
> switch to do a clean shutdown. Did a hard reset and then it wouldn't even
> load up lilo - gave me an Error 0x10 (media error). Booted from Mandrake 9
> CD 1, got to point where the install was ready to pick a partition and
> switched to another session and repaired the /dev/hda5 partition (where I
> load from) and got things working again (after it fixed a zillion errors).
> It then booted up ok, I made sure to back everything up to another networked
> PC, and it ran ok for 24 hours then hung again. Now it won't boot again.

Go to your drive maker's web site and download whatever hard drive test
utilities they supply. Linux won't make a drive go bad, but incorrect hdparm
settings could cause data or filesystem corruption. Basically you want to test the
following (preferably in this order).

1. Is the drive physically good?

2. If so, is the cable good and properly attached, and is the drive jumpered
properly?

3. If so, you've got a driver or configuration issue, so make sure your
kernel has support for your particular IDE chipset compiled in and read up
on hdparm.

--
Ray