Mailing List Archive

0x7f in SectorIdNotFound errors
A while ago, I saw a rash of errors on a hard drive at the same alleged
sector:

Jul 26 06:30:12 ithaki kernel: hdb: drive_cmd: error=0x7f {
DriveStatusError UncorrectableError SectorIdNotFound TrackZeroNotFound
AddrMarkNotFound }, LBAsect=1647111536511, high=98175, low=8355711,
sector=0
Jul 26 06:30:12 ithaki kernel: hdb: drive_cmd: status=0x7f { DriveReady
DeviceFault SeekComplete DataRequest CorrectedError Index Error }
Aug 6 06:30:18 ithaki kernel: hdb: drive_cmd: error=0x04 {
DriveStatusError }
Aug 6 06:30:18 ithaki kernel: hdb: drive_cmd: status=0x51 { DriveReady
SeekComplete Error }
Aug 12 01:10:17 ithaki kernel: hdb: drive_cmd: status=0x7f { DriveReady
DeviceFault SeekComplete DataRequest CorrectedError Index Error }
Aug 12 01:10:18 ithaki kernel: hdb: drive_cmd: error=0x7f {
DriveStatusError UncorrectableError SectorIdNotFound TrackZeroNotFound
AddrMarkNotFound }, LBAsect=1647111536511, high=98175, low=8355711,
sector=0
Aug 14 01:10:33 ithaki kernel: hdb: drive_cmd: error=0x7f {
DriveStatusError UncorrectableError SectorIdNotFound TrackZeroNotFound
AddrMarkNotFound }, LBAsect=1647111536511, high=98175, low=8355711,
sector=0
Aug 14 01:10:33 ithaki kernel: hdb: drive_cmd: status=0x7f { DriveReady
DeviceFault SeekComplete DataRequest CorrectedError Index Error }
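
For readers following along, here is a rough sketch of how the status and error bytes in the log map onto the names printed inside the braces. The names are copied from the log lines above. The bit positions and the "Busy" name are an assumption, based on the classic Linux IDE register definitions, not something stated in this thread.

```python
# Names are those printed in the log; bit positions are ASSUMED from the
# classic Linux IDE register definitions and are not given in this thread.
STATUS_BITS = [
    (0x80, "Busy"),            # assumed name; never set in the log above
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]

# Only the five error bits that the log actually names.
ERROR_BITS = [
    (0x40, "UncorrectableError"),
    (0x10, "SectorIdNotFound"),
    (0x04, "DriveStatusError"),
    (0x02, "TrackZeroNotFound"),
    (0x01, "AddrMarkNotFound"),
]


def decode(value, table):
    """Return the names of the bits set in value, in table order."""
    return [name for mask, name in table if value & mask]


def unnamed_bits(value, table):
    """Return the bits of value that have no entry in table."""
    known = 0
    for mask, _ in table:
        known |= mask
    return value & ~known


print(decode(0x7F, STATUS_BITS))
print(decode(0x51, STATUS_BITS))
print(decode(0x04, ERROR_BITS))
print(hex(unnamed_bits(0x7F, ERROR_BITS)))
```

Under that assumption, 0x7f in either register means every bit except the top one is set, which is an implausible combination for a real drive to report.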

In hex, that LBAsect is 0x17f7f7f7f7f. The disk is:

Model Number: WDC WD3000JB-00KFA0
Serial Number: WD-WCAMR2255526
Firmware Revision: 08.05J08
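
The hex conversion mentioned above can be checked with a few lines of Python. This is only a sketch using the three numbers from the log. The relationship between LBAsect, high and low is an assumption that the numbers happen to satisfy, not something confirmed in this thread.

```python
lba = 1647111536511   # LBAsect= from the log
high = 98175          # high= from the log
low = 8355711         # low= from the log

print(hex(lba))
print(hex(high))
print(hex(low))

# ASSUMPTION: LBAsect is the high field shifted left by 24 bits, OR'd with
# the 24-bit low field. The logged numbers are consistent with that.
combined = (high << 24) | low

# The same value as six little-endian bytes: five 0x7f bytes, then 0x01.
lba_bytes = lba.to_bytes(6, "little").hex()
print(lba_bytes)
```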

Kernel is Linux ithaki 2.6.16.18d865-sata #1 SMP Wed May 31 17:10:12 PDT
2006 i686 GNU/Linux. IDE controller is:

0000:00:1f.1 IDE interface: Intel Corp. 82801EB/ER (ICH5/ICH5R) Ultra
ATA 100 Storage Controller (rev 02)

FS is ext3. smartctl didn't report any errors but, then, it wouldn't
necessarily if the problem was garbage fs metadata. I found a few other
LKML postings with 0x7f patterns in part of the LBAsect.

http://www.ussg.iu.edu/hypermail/linux/kernel/0605.3/1124.html
http://www.ussg.iu.edu/hypermail/linux/kernel/0405.2/1227.html
http://www.ussg.iu.edu/hypermail/linux/kernel/0307.2/1725.html

I haven't found any postings in which anyone points out the 0x7f
pattern, which is why I thought I should post. I've failed to reproduce
the problem since the above dates. I'm not subscribed to the list but
would be interested to be cc'd on any replies.
-------------------------------------
Martin's Outlook, BlueArc Engineering

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: 0x7f in SectorIdNotFound errors
On Mon, 2006-08-28 at 12:56 -0700, Martin Dorey wrote:
> Aug 14 01:10:33 ithaki kernel: hdb: drive_cmd: status=0x7f { DriveReady
> DeviceFault SeekComplete DataRequest CorrectedError Index Error }
>
> In hex, that LBAsect is 0x17f7f7f7f7f. The disk is:

That may well have been f7f7f7f7f7... because several bits will have
been masked (it's not 32-bit addressed at controller<>device level)

> FS is ext3. smartctl didn't report any errors but, then, it wouldn't
> necessarily if the problem was garbage fs metadata. I found a few other
> LKML postings with 0x7f patterns in part of the LBAsect.

If you force an fsck do you see any errors ? I guess not but if you have
a chance to check please do.

I'm not sure where F7 came from as it is not a poison value we typically
use. The fact the value is odd is also significant. Most of the kernel
deals in 1K block sizes so any error/corruption occurred fairly low down
once we got into sectors. That seems to rule out, for example, ext3
metadata corruption because it would be very strange drive geometry to
start a partition on an odd sector boundary, and the ext3 metadata
doesn't go down to sector granularity.

Alan

RE: 0x7f in SectorIdNotFound errors
> it would be very strange drive geometry to
> start a partition on an odd sector boundary

In which case, perhaps I should have mentioned this before:

martind@ithaki:~$ sudo fdisk -lu /dev/hdb

Disk /dev/hdb: 300.0 GB, 300069052416 bytes
255 heads, 63 sectors/track, 36481 cylinders, total 586072368 sectors
Units = sectors of 1 * 512 = 512 bytes

Device Boot Start End Blocks Id System
/dev/hdb1 63 586067264 293033601 83 Linux
martind@ithaki:~$
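
To make the parity argument concrete, here is a small sketch using only the numbers from the log and from the fdisk output above. The 1K block size is an assumption taken from Alan's remark, and block_to_sector is a hypothetical helper for illustration, not a kernel function.

```python
lba = 1647111536511        # LBAsect from the kernel log
total_sectors = 586072368  # "total ... sectors" from fdisk -lu
part_start = 63            # start of /dev/hdb1 in 512-byte sectors
SECTORS_PER_BLOCK = 2      # ASSUMPTION: 1K blocks, per Alan's remark


def block_to_sector(n):
    """Absolute first sector of block n within the partition (hypothetical helper)."""
    return part_start + n * SECTORS_PER_BLOCK


lba_is_odd = lba % 2 == 1
part_start_is_odd = part_start % 2 == 1

# With an odd partition start, every block boundary lands on an odd sector.
boundaries_all_odd = all(block_to_sector(n) % 2 == 1 for n in range(1000))

# The reported sector is also far beyond the end of the disk.
beyond_end_of_disk = lba >= total_sectors

print(lba_is_odd, part_start_is_odd, boundaries_all_odd, beyond_end_of_disk)
```

So on this disk an odd sector number does not by itself rule out a request that started life at block granularity.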

> If you force an fsck

I'll schedule some downtime but I thought the above might be worth
mentioning immediately.
-------------------------------------
Martin's Outlook, BlueArc Engineering

RE: 0x7f in SectorIdNotFound errors
> If you force an fsck do you see any errors ?

Some leaks but that's all?

martind@ithaki:~$ sudo e2fsck -f -n /dev/hdb1
e2fsck 1.37 (21-Mar-2005)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(27960582--27961605) -(27961607--27962101)
Fix? no

Free blocks count wrong for group #1126 (17197, counted=17216).
Fix? no

Free blocks count wrong for group #1293 (17902, counted=18046).
Fix? no

Free blocks count wrong (40205032, counted=40205195).
Fix? no

Free inodes count wrong for group #852 (16301, counted=16302).
Fix? no

Free inodes count wrong for group #1126 (15834, counted=15837).
Fix? no

Free inodes count wrong for group #1292 (15671, counted=15677).
Fix? no

Free inodes count wrong (35235302, counted=35235312).
Fix? no


u89: ********** WARNING: Filesystem still has errors **********

u89: 1399322/36634624 files (1.7% non-contiguous), 33053368/73258400
blocks
martind@ithaki:~$
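
As a quick sanity check on the output above, the per-group discrepancies can be added up and compared with the filesystem-wide totals. This is only a sketch. All figures are copied from the e2fsck output and the variable names are made up for illustration.

```python
# (recorded, counted) pairs copied from the e2fsck output above.
free_blocks_by_group = {1126: (17197, 17216), 1293: (17902, 18046)}
free_inodes_by_group = {852: (16301, 16302),
                        1126: (15834, 15837),
                        1292: (15671, 15677)}
free_blocks_total = (40205032, 40205195)
free_inodes_total = (35235302, 35235312)


def delta(pair):
    """Counted value minus recorded value."""
    recorded, counted = pair
    return counted - recorded


block_delta = sum(delta(p) for p in free_blocks_by_group.values())
inode_delta = sum(delta(p) for p in free_inodes_by_group.values())

print(block_delta, delta(free_blocks_total))
print(inode_delta, delta(free_inodes_total))
```

The group-level differences account for the whole of each total, so the free-count discrepancies are confined to the groups listed.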
-------------------------------------
Martin's Outlook, BlueArc Engineering