Mailing List Archive

Stardom SATA HSM violation
Hi LKML,

I am installing Gentoo 2007.0 (kernel 2.6.19) on a dual AMD Opteron server (4 cores in total). The hard disk is a Stardom 2611-2S-S1 device: actually two 250GB drives in a RAID0 configuration managed by the device itself, so it should appear to the kernel as one SATA drive. If it matters, the underlying drives are Seagate Barracuda 7200.10s. Here's the device:

http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm

At different points during the install I get an "HSM violation" and the system becomes unresponsive. It looks similar to the situation described here:

http://lkml.org/lkml/2007/6/6/195

Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like an Adaptec card)?

Thanks!
Bryan

--
console output:

tag 0 cmd 0x39 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen

--
Output from hdparm -I /dev/sda:

/dev/sda:

ATA device, with non-removable media
Model Number: STARDOM V.36.A0B
Serial Number:
Firmware Revision: V.36.A0B
Standards:
Used: ATA/ATAPI-6 T13 1410D revision 0

[snip]

Commands/features:
Enabled Supported:
* SMART feature set
* Power Management feature set
* Advanced Power Management feature set
* 48-bit Address feature set
* Mandatory FLUSH_CACHE
* SATA-I signaling speed (1.5 Gb/s)
* SATA-II signaling speed (3.0 Gb/s)

--
Parts of dmesg:
libata version 2.00 loaded
sata_nv 0000:00:05.0: version 2.0
ata1: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xCC00 irq 21
ata2: SATA max UDMA/133 cmd 0xD080 ctl 0xD002 bmdma 0xCC08 irq 21
scsi0 : sata_nv
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: ATA-6, max UDMA/133,976794112 sectors: LBA48
ata1.00: ata1: dev 0 multi count 1
ata1.00: applying bridge limits
ata1.00: configured for UDMA/100



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: Stardom SATA HSM violation [ In reply to ]
Hi,

[Adding linux-ide to CC]

On 25/08/07, Bryan Woods <bryan@arbores.ca> wrote:
> Hi KML
>
> I am installing gentoo 2007.0 (kernel 2.6.19) on a dual AMD Opteron server (total of 4 cores). The hard disk is a Stardom 2611-2S-S1 device: actually two 250GB drives in a RAID0 config managed by the device itself - it should appear to the kernel as one SATA drive. If it matters, the underlying HDs are "Seagate Barracuda 7200 10"s. Here's the device:
>
> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm
>
> During the install and at different points in the process I get an "HSM violation" and the system becomes unresponsive. It looks like a similar situation to:
>
> http://lkml.org/lkml/2007/6/6/195
>
> Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like an Adaptec card)?
>
> Thanks!
> Bryan
>
> --
> console output:
>
> tag 0 cmd 0x39 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
> exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>
> --
> Output from hdparm -I /dev/sda:
>
> /dev/sda:
>
> ATA device, with non-removable media
> Model Number: STARDOM V.36.A0B
> Serial Number:
> Firmware Revision: V.36.A0B
> Standards:
> Used: ATA/ATAPI-6 T13 1410D revision 0
>
> [snip]
>
> Commands/features:
> Enabled Supported:
> * SMART feature set
> * Power Management feature set
> * Advanced Power Management feature set
> * 48-bit Address feature set
> * Mandatory FLUSH_CACHE
> * SATA-I signaling speed (1.5 Gb/s)
> * SATA-II signaling speed (3.0 Gb/s)
>
> --
> Parts of dmesg:
> libata version 2.00 loaded
> sata_nv 0000:00:05.0: version 2.0
> ata1: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xCC00 irq 21
> ata2: SATA max UDMA/133 cmd 0xD080 ctl 0xD002 bmdma 0xCC08 irq 21
> scsi0 : sata_nv
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1.00: ATA-6, max UDMA/133,976794112 sectors: LBA48
> ata1.00: ata1: dev 0 multi count 1
> ata1.00: applying bridge limits
> ata1.00: configured for UDMA/100
>

Regards,
Michal

--
LOG
http://www.stardust.webpages.pl/log/
Re: Stardom SATA HSM violation [ In reply to ]
Michal Piotrowski wrote:
> Hi,
>
> [Adding linux-ide to CC]
>
> On 25/08/07, Bryan Woods <bryan@arbores.ca> wrote:
>> Hi KML
>>
>> I am installing gentoo 2007.0 (kernel 2.6.19) on a dual AMD Opteron server (total of 4 cores). The hard disk is a Stardom 2611-2S-S1 device: actually two 250GB drives in a RAID0 config managed by the device itself - it should appear to the kernel as one SATA drive. If it matters, the underlying HDs are "Seagate Barracuda 7200 10"s. Here's the device:
>>
>> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm
>>
>> During the install and at different points in the process I get an "HSM violation" and the system becomes unresponsive. It looks like a similar situation to:
>>
>> http://lkml.org/lkml/2007/6/6/195
>>
>> Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like an Adaptec card)?
>>
>> Thanks!
>> Bryan
>>
>> --
>> console output:
>>
>> tag 0 cmd 0x39 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
>> exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>>
>> --
>> Output from hdparm -I /dev/sda:
>>
>> /dev/sda:
>>
>> ATA device, with non-removable media
>> Model Number: STARDOM V.36.A0B
>> Serial Number:
>> Firmware Revision: V.36.A0B
>> Standards:
>> Used: ATA/ATAPI-6 T13 1410D revision 0
>>
>> [snip]
>>
>> Commands/features:
>> Enabled Supported:
>> * SMART feature set
>> * Power Management feature set
>> * Advanced Power Management feature set
>> * 48-bit Address feature set
>> * Mandatory FLUSH_CACHE
>> * SATA-I signaling speed (1.5 Gb/s)
>> * SATA-II signaling speed (3.0 Gb/s)
>>
>> --
>> Parts of dmesg:
>> libata version 2.00 loaded
>> sata_nv 0000:00:05.0: version 2.0
>> ata1: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xCC00 irq 21
>> ata2: SATA max UDMA/133 cmd 0xD080 ctl 0xD002 bmdma 0xCC08 irq 21
>> scsi0 : sata_nv
>> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>> ata1.00: ATA-6, max UDMA/133,976794112 sectors: LBA48
>> ata1.00: ata1: dev 0 multi count 1
>> ata1.00: applying bridge limits
>> ata1.00: configured for UDMA/100

Please post full dmesg and full 'hdparm -I' result. Also, if possible,
please try 2.6.22.5. Even if it doesn't fix the problem, it would
report error conditions better.

--
tejun
Re: Stardom SATA HSM violation [ In reply to ]
> On Mon, 03 Sep 2007 17:53:00 +0900 Tejun Heo <htejun@gmail.com> wrote:
> Michal Piotrowski wrote:
> > Hi,
> >
> > [Adding linux-ide to CC]
> >
> > On 25/08/07, Bryan Woods <bryan@arbores.ca> wrote:
> >> Hi KML
> >>
> >> I am installing gentoo 2007.0 (kernel 2.6.19) on a dual AMD Opteron server (total of 4 cores). The hard disk is a Stardom 2611-2S-S1 device: actually two 250GB drives in a RAID0 config managed by the device itself - it should appear to the kernel as one SATA drive. If it matters, the underlying HDs are "Seagate Barracuda 7200 10"s. Here's the device:
> >>
> >> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm
> >>
> >> During the install and at different points in the process I get an "HSM violation" and the system becomes unresponsive. It looks like a similar situation to:
> >>
> >> http://lkml.org/lkml/2007/6/6/195
> >>
> >> Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like an Adaptec card)?
> >>
> >> Thanks!
> >> Bryan
> >>
> >> --
> >> console output:
> >>
> >> tag 0 cmd 0x39 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
> >> exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> >>
> >> --
> >> Output from hdparm -I /dev/sda:
> >>
> >> /dev/sda:
> >>
> >> ATA device, with non-removable media
> >> Model Number: STARDOM V.36.A0B
> >> Serial Number:
> >> Firmware Revision: V.36.A0B
> >> Standards:
> >> Used: ATA/ATAPI-6 T13 1410D revision 0
> >>
> >> [snip]
> >>
> >> Commands/features:
> >> Enabled Supported:
> >> * SMART feature set
> >> * Power Management feature set
> >> * Advanced Power Management feature set
> >> * 48-bit Address feature set
> >> * Mandatory FLUSH_CACHE
> >> * SATA-I signaling speed (1.5 Gb/s)
> >> * SATA-II signaling speed (3.0 Gb/s)
> >>
> >> --
> >> Parts of dmesg:
> >> libata version 2.00 loaded
> >> sata_nv 0000:00:05.0: version 2.0
> >> ata1: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xCC00 irq 21
> >> ata2: SATA max UDMA/133 cmd 0xD080 ctl 0xD002 bmdma 0xCC08 irq 21
> >> scsi0 : sata_nv
> >> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> >> ata1.00: ATA-6, max UDMA/133,976794112 sectors: LBA48
> >> ata1.00: ata1: dev 0 multi count 1
> >> ata1.00: applying bridge limits
> >> ata1.00: configured for UDMA/100
>
> Please post full dmesg and full 'hdparm -I' result. Also, if possible,
> please try 2.6.22.5. Even if it doesn't fix the problem, it would
> report error conditions better.

Presumably in the week and a half between Bryan's report and your request,
Bryan has gone off and got an Adaptec card. Bryan, it would be helpful if
you could rebuild the original system and help us get to the bottom of this
bug, thanks.

Re: Stardom SATA HSM violation [ In reply to ]
Andrew Morton wrote:
>> On Mon, 03 Sep 2007 17:53:00 +0900 Tejun Heo <htejun@gmail.com> wrote:
>> Michal Piotrowski wrote:
>>> Hi,
>>>
>>> [Adding linux-ide to CC]
>>>
>>> On 25/08/07, Bryan Woods <bryan@arbores.ca> wrote:
>>>> Hi KML
>>>>
>>>> I am installing gentoo 2007.0 (kernel 2.6.19) on a dual AMD Opteron server (total of 4 cores). The hard disk is a Stardom 2611-2S-S1 device: actually two 250GB drives in a RAID0 config managed by the device itself - it should appear to the kernel as one SATA drive. If it matters, the underlying HDs are "Seagate Barracuda 7200 10"s. Here's the device:
>>>>
>>>> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm
>>>>
>>>> During the install and at different points in the process I get an "HSM violation" and the system becomes unresponsive. It looks like a similar situation to:
>>>>
>>>> http://lkml.org/lkml/2007/6/6/195
>>>>
>>>> Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like an Adaptec card)?
>>>>
>>>> Thanks!
>>>> Bryan
>>>>
>>>> --
>>>> console output:
>>>>
>>>> tag 0 cmd 0x39 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
>>>> exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>>>>
>>>> --
>>>> Output from hdparm -I /dev/sda:
>>>>
>>>> /dev/sda:
>>>>
>>>> ATA device, with non-removable media
>>>> Model Number: STARDOM V.36.A0B
>>>> Serial Number:
>>>> Firmware Revision: V.36.A0B
>>>> Standards:
>>>> Used: ATA/ATAPI-6 T13 1410D revision 0
>>>>
>>>> [snip]
>>>>
>>>> Commands/features:
>>>> Enabled Supported:
>>>> * SMART feature set
>>>> * Power Management feature set
>>>> * Advanced Power Management feature set
>>>> * 48-bit Address feature set
>>>> * Mandatory FLUSH_CACHE
>>>> * SATA-I signaling speed (1.5 Gb/s)
>>>> * SATA-II signaling speed (3.0 Gb/s)
>>>>
>>>> --
>>>> Parts of dmesg:
>>>> libata version 2.00 loaded
>>>> sata_nv 0000:00:05.0: version 2.0
>>>> ata1: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xCC00 irq 21
>>>> ata2: SATA max UDMA/133 cmd 0xD080 ctl 0xD002 bmdma 0xCC08 irq 21
>>>> scsi0 : sata_nv
>>>> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>> ata1.00: ATA-6, max UDMA/133,976794112 sectors: LBA48
>>>> ata1.00: ata1: dev 0 multi count 1
>>>> ata1.00: applying bridge limits
>>>> ata1.00: configured for UDMA/100
>> Please post full dmesg and full 'hdparm -I' result. Also, if possible,
>> please try 2.6.22.5. Even if it doesn't fix the problem, it would
>> report error conditions better.
>
> Presumably in the week and a half between Bryan's report and your request,
> Bryan has gone off and got an adaptec card. Bryan, it would be helpful if
> you could rebuild the original system and help us get to the bottom of this
> bug, thanks.

I reported a very similar bug back a few releases ago.
Anyone who wants to try it themselves, can do this with hdparm-7.7 (from sourceforge):

hdparm --drq-hsm-error /dev/sda

Whether or not it hangs the machine does depend upon exactly which SATA LLD is used,
and what model/revision of drive is installed. But if it hangs for you (eg. Tejun),
then you now have a way to reproduce a HSM error "on demand" for testing. :)

Cheers
Re: Stardom SATA HSM violation [ In reply to ]
> On Wed, 05 Sep 2007 13:23:35 -0400 Mark Lord <liml@rtr.ca> wrote:
> Andrew Morton wrote:
> >> please try 2.6.22.5. Even if it doesn't fix the problem, it would
> >> report error conditions better.
> >
> > Presumably in the week and a half between Bryan's report and your request,
> > Bryan has gone off and got an adaptec card. Bryan, it would be helpful if
> > you could rebuild the original system and help us get to the bottom of this
> > bug, thanks.
>
> I reported a very similar bug back a few releases ago.
> Anyone who wants to try it themselves, can do this with hdparm-7.7 (from sourceforge):
>
> hdparm --drq-hsm-error /dev/sda
>
> Whether or not it hangs the machine does depend upon exactly which SATA LLD is used,
> and what model/revision of drive is installed. But if it hangs for you (eg. Tejun),
> then you now have a way to reproduce a HSM error "on demand" for testing. :)
>

Hey, we just found something which doesn't crash my Vaio!

sony:/home/akpm/hdparm-7.7> 0 ./hdparm --drq-hsm-error /dev/sda

/dev/sda:
triggering "stuck DRQ" host state machine error
do_drq_hsm_error: Success
ata status=0x58 ata error=0x00


ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 0
res 58/00:01:00:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation)
ata3: soft resetting port
ata3.00: configured for UDMA/100
ata3: EH complete
sd 2:0:0:0: [sda] 195371568 512-byte hardware sectors (100030 MB)
sd 2:0:0:0: [sda] Write Protect is off
sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA


How dull. (ata_piix)
Re: Stardom SATA HSM violation [ In reply to ]
Andrew Morton wrote:
>..
> Hey, we just found something which doesn't crash my Vaio!
>
> sony:/home/akpm/hdparm-7.7> 0 ./hdparm --drq-hsm-error /dev/sda
>
> /dev/sda:
> triggering "stuck DRQ" host state machine error
> do_drq_hsm_error: Success
> ata status=0x58 ata error=0x00
>
> ata3.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
> ata3.00: cmd ec/00:00:00:00:00/00:00:00:00:00/40 tag 0 cdb 0x0 data 0
> res 58/00:01:00:00:00/00:00:00:00:00/40 Emask 0x2 (HSM violation)
> ata3: soft resetting port
> ata3.00: configured for UDMA/100
> ata3: EH complete
> sd 2:0:0:0: [sda] 195371568 512-byte hardware sectors (100030 MB)
> sd 2:0:0:0: [sda] Write Protect is off
> sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
> sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
> How dull. (ata_piix)

:)

On my two very similar notebooks, it crashes libata when a PATA drive is used
behind a Marvell converter chip, but not when a SATA drive is used directly.

Cheers
Re: Stardom SATA HSM violation [ In reply to ]
Tejun Heo wrote:
> Michal Piotrowski wrote:
>> Hi,
>>
>> [Adding linux-ide to CC]
>>
>> On 25/08/07, Bryan Woods <bryan@arbores.ca> wrote:
>>> Hi KML
>>>
>>> I am installing gentoo 2007.0 (kernel 2.6.19) on a dual AMD Opteron server (total of 4 cores). The hard disk is a Stardom 2611-2S-S1 device: actually two 250GB drives in a RAID0 config managed by the device itself - it should appear to the kernel as one SATA drive. If it matters, the underlying HDs are "Seagate Barracuda 7200 10"s. Here's the device:
>>>
>>> http://www.synetic.net/Synetic-Products/Stardoms/SR-2611-SA/Stardom-2611.htm
>>>
>>> During the install and at different points in the process I get an "HSM violation" and the system becomes unresponsive. It looks like a similar situation to:
>>>
>>> http://lkml.org/lkml/2007/6/6/195
>>>
>>> Will more recent kernels work with this hardware (should I keep it and try the install again) or should I switch hardware to something more compatible (like an Adaptec card)?
>>>
>>> Thanks!
>>> Bryan
>>>
>>> --
>>> console output:
>>>
>>> tag 0 cmd 0x39 Emask 0x2 stat 0x58 err 0x0 (HSM violation)
>>> exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
>>>
>
> Please post full dmesg and full 'hdparm -I' result. Also, if possible,
> please try 2.6.22.5. Even if it doesn't fix the problem, it would
> report error conditions better.
>

The full dmesg and hdparm -I command output are attached.

I have received word from the vendor that the Stardom 2611 will do
RAID0 or RAID1 under Windows, but only RAID1 under Linux. (Their manual
said it worked with Linux but failed to mention the RAID mode
restriction: argh!)

They recommended the 2600 model for RAID0 with Linux, but that model
is only SATA-I so I will probably go with alternate hardware.

The vendor also suggested the possibility of a firmware upgrade to
the 2611 - I am still waiting to hear. I will post a followup if
this happens.

Thanks all for your help and suggestions!

Regards,
Bryan
Re: Stardom SATA HSM violation [ In reply to ]
Bryan Woods wrote:

> The full dmesg and hdparm -I command output are attached.
>
> I have received word from the vendor that the Stardom 2611 will do
> RAID0 or RAID1 under Windows, but only RAID1 under Linux. (Their manual
> said it worked with Linux but failed to mention the RAID mode
> restriction: argh!)
>
> They recommended the 2600 model for RAID0 with Linux, but that model
> is only SATA-I so I will probably go with alternate hardware.
>
> The vendor also suggested the possibility of a firmware upgrade to
> the 2611 - I am still waiting to hear. I will post a followup if
> this happens.

If possible, please post dmesg from 2.6.22.5.

Thanks.

--
tejun
Re: Stardom SATA HSM violation [ In reply to ]
Hello,

Mark Lord wrote:
> I reported a very similar bug back a few releases ago.
> Anyone who wants to try it themselves, can do this with hdparm-7.7 (from
> sourceforge):
>
> hdparm --drq-hsm-error /dev/sda
>
> Whether or not it hangs the machine does depend upon exactly which SATA
> LLD is used,
> and what model/revision of drive is installed. But if it hangs for you
> (eg. Tejun),
> then you now have a way to reproduce a HSM error "on demand" for
> testing. :)

Neat. Is this the FIFO-draining issue?

Thanks.

--
tejun
Re: Stardom SATA HSM violation [ In reply to ]
Tejun Heo wrote:
> Hello,
>
> Mark Lord wrote:
>> I reported a very similar bug back a few releases ago.
>> Anyone who wants to try it themselves, can do this with hdparm-7.7 (from
>> sourceforge):
>>
>> hdparm --drq-hsm-error /dev/sda
>>
>> Whether or not it hangs the machine does depend upon exactly which SATA
>> LLD is used,
>> and what model/revision of drive is installed. But if it hangs for you
>> (eg. Tejun),
>> then you now have a way to reproduce a HSM error "on demand" for
>> testing. :)
>
> Neat. Is this the FIFO-draining issue?

Yeah, that's the one. And I still patch my own kernels to
automatically drain up to 512 words from the FIFO when this happens.

Works like a charm. Patch below for demonstration purposes.

Signed-Off-By: Mark Lord <mlord@pobox.com>
---

--- linux/drivers/ata/libata-sff.c.orig	2007-04-26 12:02:46.000000000 -0400
+++ linux/drivers/ata/libata-sff.c	2007-04-29 08:29:27.000000000 -0400
@@ -413,6 +413,24 @@
 		ap->ops->irq_on(ap);
 }
 
+static void ata_drain_fifo (struct ata_port *ap, struct ata_queued_cmd *qc)
+{
+	u8 stat = ata_chk_status(ap);
+	/*
+	 * Try to clear stuck DRQ if necessary.
+	 */
+	if ((stat & ATA_DRQ) && (!qc || qc->dma_dir != DMA_TO_DEVICE)) {
+		unsigned int i, limit = 512;
+		printk("Draining up to %u words from data FIFO.\n", limit);
+		for (i = 0; i < limit ; ++i) {
+			ioread16(ap->ioaddr.data_addr);
+			if (!(ata_chk_status(ap) & ATA_DRQ))
+				break;
+		}
+		printk("Drained %u/%u words.\n", i, limit);
+	}
+}
+
 /**
  *	ata_bmdma_drive_eh - Perform EH with given methods for BMDMA controller
  *	@ap: port to handle error for
@@ -469,7 +487,7 @@
 	}
 
 	ata_altstatus(ap);
-	ata_chk_status(ap);
+	ata_drain_fifo(ap, qc);
 	ap->ops->irq_clear(ap);
 
 	spin_unlock_irqrestore(ap->lock, flags);
Re: Stardom SATA HSM violation [ In reply to ]
Mark Lord wrote:
> Tejun Heo wrote:
>> Hello,
>>
>> Mark Lord wrote:
>>> I reported a very similar bug back a few releases ago.
>>> Anyone who wants to try it themselves, can do this with hdparm-7.7 (from
>>> sourceforge):
>>>
>>> hdparm --drq-hsm-error /dev/sda
>>>
>>> Whether or not it hangs the machine does depend upon exactly which SATA
>>> LLD is used,
>>> and what model/revision of drive is installed. But if it hangs for you
>>> (eg. Tejun),
>>> then you now have a way to reproduce a HSM error "on demand" for
>>> testing. :)
>>
>> Neat. Is this the FIFO-draining issue?
>
> Yeah, that's the one. And I still patch my own kernels to
> automatically drain up to 512 words from the FIFO when this happens.
>
> Works like a charm. Patch below for demonstration purposes.
>
> Signed-Off-By: Mark Lord <mlord@pobox.com>

I think there have been enough cases where this draining was necessary.
IIRC, ata_piix was involved in those cases, right? If so, can you
please submit a patch which applies this only to affected controllers?
I don't feel too confident about applying this to all SFF controllers.

Thanks.

--
tejun

Re: Stardom SATA HSM violation [ In reply to ]
> I think there have been enough cases where this draining was necessary.
> IIRC, ata_piix was involved in those cases, right? If so, can you
> please submit a patch which applies this only to affected controllers?
> I don't feel too confident about applying this to all SFF controllers.

Old IDE does it on all controllers bar a couple, so we have very good
knowledge of what does and doesn't work. The one thing that needs care in
old IDE is an ordering issue: a state machine reset done first causes the
drain of the I/O to hang.
Re: Stardom SATA HSM violation [ In reply to ]
Alan Cox wrote:
>> I think there have been enough cases where this draining was necessary.
>> IIRC, ata_piix was involved in those cases, right? If so, can you
>> please submit a patch which applies this only to affected controllers?
>> I don't feel too confident about applying this to all SFF controllers.
>
> Old IDE does it on all controllers bar a couple. So we have a very good
> knowledge of what does/doesn't work. The one that needs care in old ide
> is an ordering issue where a state machine reset done first causes the
> drain of the I/O to hang.

Hmmm... So, do we apply draining to all PATA? Or is ata_piix SATA
affected too?

--
tejun
Re: Stardom SATA HSM violation [ In reply to ]
Tejun Heo wrote:
> Alan Cox wrote:
>>> I think there have been enough cases where this draining was necessary.
>>> IIRC, ata_piix was involved in those cases, right? If so, can you
>>> please submit a patch which applies this only to affected controllers?
>>> I don't feel too confident about applying this to all SFF controllers.
>> Old IDE does it on all controllers bar a couple. So we have a very good
>> knowledge of what does/doesn't work. The one that needs care in old ide
>> is an ordering issue where a state machine reset done first causes the
>> drain of the I/O to hang.
>
> Hmmm... So, do we apply draining to all PATA? Or is ata_piix SATA
> affected too?

I would think all SFF controllers, since a lot of first gen SATA are
really bridged solutions. If they are flagging DRQ, I say oblige them :)

Jeff



Re: Stardom SATA HSM violation [ In reply to ]
Jeff Garzik wrote:
> Tejun Heo wrote:
>> Alan Cox wrote:
>>>> I think there have been enough cases where this draining was necessary.
>>>> IIRC, ata_piix was involved in those cases, right? If so, can you
>>>> please submit a patch which applies this only to affected controllers?
>>>> I don't feel too confident about applying this to all SFF controllers.
>>> Old IDE does it on all controllers bar a couple. So we have a very good
>>> knowledge of what does/doesn't work. The one that needs care in old ide
>>> is an ordering issue where a state machine reset done first causes the
>>> drain of the I/O to hang.
>>
>> Hmmm... So, do we apply draining to all PATA? Or is ata_piix SATA
>> affected too?
>
> I would think all SFF controllers, since a lot of first gen SATA are
> really bridged solutions. If they are flagging DRQ, I say oblige them :)

Alright, then the posted patch should be good enough. Mark, can you be
bothered to regenerate the patch and post it one more time (again)? It
seems we all agree the update is needed.

Thanks a lot.

--
tejun
Re: Stardom SATA HSM violation [ In reply to ]
Tejun Heo wrote:
> Alan Cox wrote:
>>> I think there have been enough cases where this draining was necessary.
>>> IIRC, ata_piix was involved in those cases, right? If so, can you
>>> please submit a patch which applies this only to affected controllers?
>>> I don't feel too confident about applying this to all SFF controllers.
>> Old IDE does it on all controllers bar a couple. So we have a very good
>> knowledge of what does/doesn't work. The one that needs care in old ide
>> is an ordering issue where a state machine reset done first causes the
>> drain of the I/O to hang.
>
> Hmmm... So, do we apply draining to all PATA? Or is ata_piix SATA
> affected too?

ata_piix SATA is definitely affected when a PATA_drive to SATA_host bridge is present.
Possibly other times.

Cheers
