Would
# echo 1 > /sys/module/firewire_sbp2/parameters/workarounds
# <plug device in>
improve it? This reduces the maximum data size per SCSI request and is said to
be necessary for some bridges from Symbios (bought by LSI). The Datafab
enclosures contain a Symbios bridge AFAIK.
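
For readers unfamiliar with the parameter: "workarounds" is a bitmask, and the value 1 above is the bit that limits the transfer size. A minimal sketch for decoding such a mask follows. The flag names and values are assumptions recalled from the driver source, not taken from this report, so verify them against the kernel you actually run.

```shell
#!/bin/sh
# Decode a firewire_sbp2 "workarounds" bitmask into flag names.
# ASSUMPTION: names and values below are recalled from the driver source;
# they may differ between kernel versions.
decode_workarounds() {
    mask=$(( $1 ))
    out=""
    for entry in \
        "1:128K_MAX_TRANS" \
        "2:INQUIRY_36" \
        "4:MODE_SENSE_8" \
        "8:FIX_CAPACITY" \
        "256:OVERRIDE"
    do
        bit=${entry%%:*}     # numeric value of the flag
        name=${entry#*:}     # symbolic name of the flag
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "${out:-none}"
}

decode_workarounds 0x1
```

With the mask from the command above (0x1), only the transfer size limit is reported as set.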

(In reply to comment #1)
> Would
> # echo 1 > /sys/module/firewire_sbp2/parameters/workarounds
> # <plug device in>
> improve it? This reduces the maximum data size per SCSI request and is said to
Uhm, no, same as before.
> be necessary for some bridges from Symbios (bought by LSI). The Datafab
> enclosures contain a Symbios bridge AFAIK.
The chipset has LSI on it, so it could be Symbios.
OK, I tried to enable the SCSI debug with:
echo 9216 > /sys/module/scsi_mod/parameters/scsi_logging_level
Apart from flooding the logfile, in this condition the device seems to work, with
or without the workaround, and it continues to work after disabling the SCSI debug
logging.
BTW, the logfile seems to contain only "Read(10)" commands.
If I power-cycle the device, the problem reappears on the next test.
Does this help?
pg
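
A note on the value 9216 used above: scsi_logging_level packs one 3-bit level per logging facility into a single integer. The sketch below splits a value into those fields. The facility order is an assumption based on the SCSI logging header as I remember it, so treat the names as illustrative.

```shell
#!/bin/sh
# Split a scsi_logging_level value into its 3-bit per-facility fields.
# ASSUMPTION: facility order (lowest bits first) follows the kernel's
# SCSI logging header; check it against your kernel source.
decode_scsi_logging_level() {
    level=$(( $1 ))
    shift_by=0
    for facility in ERROR TIMEOUT SCAN MLQUEUE MLCOMPLETE \
                    LLQUEUE LLCOMPLETE HLQUEUE HLCOMPLETE IOCTL
    do
        value=$(( (level >> shift_by) & 7 ))
        if [ "$value" -ne 0 ]; then
            echo "$facility=$value"
        fi
        shift_by=$(( shift_by + 3 ))
    done
}

decode_scsi_logging_level 9216
```

Under that assumption, 9216 (0x2400) enables level 2 for mid-layer command queueing and completion, which would be consistent with the log showing each "Read(10)" command.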

(In reply to comment #3)
> This smells like a duplicate of bug 434830, which should be resolved by patches
> added to rawhide and f8 kernel 2.6.24.3-23.fc8, which just finished building in
> koji earlier today. Give that a spin, if you would...
>
> http://koji.fedoraproject.org/packages/kernel/2.6.24.3/23.fc8/
Uhm, I installed the -28, which should have the same patches (I also need the WiFi
updates).
A first test with "hdparm" and "dd" showed no visible improvement.
One thing I forgot to mention is that removing and reloading the
firewire-sbp2.ko module also brings the device back to life.
Other ideas?
pg
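
The exact "hdparm" and "dd" invocations are not shown in this report. For anyone trying to reproduce the problem, a sequential read test along the following lines should exercise large transfers. The device node /dev/sdX and the read size are placeholders, not values from the report.

```shell
#!/bin/sh
# Sequential read test: read the first N MiB of a device (or file) and
# discard the data. ASSUMPTION: device node and size are placeholders.
read_test() {
    device=$1
    mib=${2:-1024}
    dd if="$device" of=/dev/null bs=1M count="$mib"
}

# Usage (as root, with the FireWire disk attached):
#   hdparm -t /dev/sdX        # timed device reads
#   read_test /dev/sdX 1024   # 1 GiB sequential read via dd
```

If the transfer size is the trigger, errors should show up in dmesg during the read rather than at plug-in time.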

(In reply to comment #1)
> Would
> # echo 1 > /sys/module/firewire_sbp2/parameters/workarounds
> # <plug device in>
> improve it? This reduces the maximum data size per SCSI request and is said to
> be necessary for some bridges from Symbios (bought by LSI). The Datafab
> enclosures contain a Symbios bridge AFAIK.
OK, I repeated the suggested operation, starting with the device off (I was reading
too fast and missed the <plug device in> step, sorry).
A first test, with the new kernel, seems to be successful: "hdparm" and "dd"
worked fine, without causing errors.
So it seems this is the real issue.
Should we close this one or wait a little bit?
Thanks a lot!
pg

(In reply to comment #1)
> Would
> # echo 1 > /sys/module/firewire_sbp2/parameters/workarounds
> # <plug device in>
> improve it? This reduces the maximum data size per SCSI request and is said to
> be necessary for some bridges from Symbios (bought by LSI). The Datafab
> enclosures contain a Symbios bridge AFAIK.
I forgot to thank you: so thank you very much!
pg

Re comment #5:
If you are reasonably sure that the workaround fixes the heavy transfers, then
check dmesg for a message from firewire_sbp2 about the activated workaround +
firmware revision + model ID of the drive. We can then use these values to
permanently enable the request size limit for this device, so that it will work
out of the box.
(BTW, I have two devices which appear to be LSI based too, but these are both
CD-RWs and they are sealed devices so that I can't swap in HDDs for test
purposes. I occasionally use them for tests like CDDA extraction or CD burning,
but they have never shown a problem like this. Apparently those applications
never reach the dangerous request sizes or the devices have other revisions of
the bridge or of the firmware.)

(In reply to comment #7)
> If you are reasonably sure that the workaround fixes the heavy transfers, then
> check dmesg for a message from firewire_sbp2 about the activated workaround +
> firmware revision + model ID of the drive. We can then use these values to
> permanently enable the request size limit for this device, so that it will work
> out of the box.
I guess these are the lines you might need:
firewire_sbp2: Workarounds for fw1.0: 0x1 (firmware_revision 0x002600, model_id 0x000000)
firewire_sbp2: fw1.0: logged in to LUN 0000 (0 retries)
scsi 17:0:0:0: Direct-Access LSILogic SYM13FW500-Disk 1.00 PQ: 0 ANSI: 0
I'm going to use this device, so if something new happens I'll report.
I guess in a few days this bug could be closed.
> (BTW, I have two devices which appear to be LSI based too, but these are both
> CD-RWs and they are sealed devices so that I can't swap in HDDs for test
> purposes. I occasionally use them for tests like CDDA extraction or CD burning,
> but they have never shown a problem like this. Apparently those applications
> never reach the dangerous request sizes or the devices have other revisions of
> the bridge or of the firmware.)
For the casual observer: why does the old stack not show the problem (I mean the
udev one, bug #429430)? Is it automatically limiting the transfer size?
pg
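
The "Workarounds for" line quoted in this comment carries the three values needed for a quirk entry. A small filter like the one below can pull them out of dmesg. It is a sketch that assumes the message appears on a single line in exactly the format pasted above.

```shell
#!/bin/sh
# Extract workaround mask, firmware revision and model ID from a
# "firewire_sbp2: Workarounds for ..." log line read on stdin.
# ASSUMPTION: the message format matches the line pasted in this report.
parse_sbp2_workarounds() {
    sed -n 's/.*firewire_sbp2: Workarounds for \([^:]*\): \(0x[0-9a-fA-F]*\) (firmware_revision \(0x[0-9a-fA-F]*\), model_id \(0x[0-9a-fA-F]*\)).*/device=\1 workarounds=\2 firmware_revision=\3 model_id=\4/p'
}

echo 'firewire_sbp2: Workarounds for fw1.0: 0x1 (firmware_revision 0x002600, model_id 0x000000)' |
    parse_sbp2_workarounds
```

On a live system the input would come from "dmesg | parse_sbp2_workarounds" instead of the echo.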

> firewire_sbp2: Workarounds for fw1.0: 0x1 (firmware_revision 0x002600,
> model_id 0x000000)
> firewire_sbp2: fw1.0: logged in to LUN 0000 (0 retries)
> scsi 17:0:0:0: Direct-Access LSILogic SYM13FW500-Disk 1.00 PQ: 0 ANSI: 0
Thanks, we will add these markers to fw-sbp2's internal quirks list.
> For the casual observer: why the old stack does not show the problem
> (I mean the udev one, bug #429430)? Is it automatically limiting the
> transfer size?
The old sbp2 driver always had the limit set to the low level of that
workaround. Therefore we never noticed whether there may be more devices out
there which need the transfer size limitation.
However, I sent a patch into Linux 2.6.25-rc1 which adjusts sbp2 to use the
same default limit which firewire-sbp2 does (i.e. the Linux SCSI stack's
defaults). So, if it weren't for your report, all users with HDDs with
SYM13FW500 bridge would encounter that problem from Linux 2.6.25-rc1 onwards
with the old drivers too.
(Note to self: My two presumably LSI based CD-RWs have firmware_revision
0x000038, model_id 0x000000, and firmware_revision 0x000035, model_id 0x000000
respectively.)
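
To illustrate what an entry in such a quirks list does, here is a simplified model of the lookup. It is not the driver's code: the table format, the "any" wildcard and the function name are made up for illustration, and the real matching rules may differ. Only the 0x002600 entry comes from this report.

```shell
#!/bin/sh
# Hypothetical quirks list: "firmware_revision:model_id:workarounds".
# "any" as model_id matches every model. ASSUMPTION: illustrative format only.
QUIRKS="0x002600:any:0x1"

# Print the effective workaround mask for one device: the global module
# parameter (third argument, default 0) OR-ed with matching quirk entries.
effective_workarounds() {
    fw_rev=$(( $1 )); model=$(( $2 )); mask=$(( ${3:-0} ))
    for quirk in $QUIRKS; do
        q_rev=${quirk%%:*}
        rest=${quirk#*:}
        q_model=${rest%%:*}
        q_mask=${rest#*:}
        [ $(( $q_rev )) -eq "$fw_rev" ] || continue
        if [ "$q_model" = "any" ] || [ $(( $q_model )) -eq "$model" ]; then
            mask=$(( mask | $q_mask ))
        fi
    done
    printf '0x%x\n' "$mask"
}

effective_workarounds 0x002600 0x000000   # the SYM13FW500 disk bridge
effective_workarounds 0x000038 0x000000   # one of the CD-RWs
```

In this model the global parameter is only an input and is never modified, so a device can get a workaround from the list while the module parameter itself still reads 0.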

(In reply to comment #16)
> This message should automatically appear (without specifying a module parameter)
> if the kernel contains the patch from comment #10.
Well, it does not, so I see the following possibilities:
1) I got the wrong kernel
2) The patch is not there or was not applied properly
3) The patch is wrong
4) Something else: for example, the firmware revision is not propagated properly
and what we see in the logs is not what the workarounds section receives.
I can double check 1) :-)...
In the rpm changelog I see (note the -37; I have the -38, but that should include
the previous one, I hope):
* Fri Mar 14 2008 Jarod Wilson <jwilson@redhat.com> 2.6.24.3-37
- Resync firewire patches w/linux1394-2.6.git
- Add firewire selfID/AT/AR debug support via optional
module parameters
- firewire: fix DMA coherence on x86_64 systems w/memory mapped
over the 4GB boundry (#434830)
One more thing: in order to re-read the option "workarounds", it is not enough
to detach/attach the sbp2 bay; I also have to remove the firewire-sbp2 module.
If I remember correctly, previously it was enough to reset the 1394 bus (power-cycle
the sbp2 bay or detach/attach).
Is there any way I can check/confirm 2) and/or 3)?
Some debugging option or similar?
I think 4) is beyond my abilities.
Thanks.
pg

D'oh, my mistake. I didn't check closely enough. The patch with the workaround
for your device didn't actually make it into the f8 kernel yet. It's definitely
there for rawhide, and I'll get it into F8 properly soonish.

> One more thing, in order to re-read the option "workarounds",
> it is not enough to detach/attach the sbp2 bay, I've also to
> remove the firewire-sbp module.
The echo command from comment #1 should be effective during runtime for each
plugged in (or re-plugged) device.
If you alter the module parameter by means of /etc/modprobe.d/some_file or
/etc/modprobe.conf, then the module does indeed need to be reloaded to become
aware of the altered parameter.
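
The two ways of setting the parameter can be summarized as follows. This is a sketch: the sysfs path is the one from comment #1, while the helper names are made up for illustration.

```shell
#!/bin/sh
# Sysfs path from comment #1; overridable so the helper can be tried on a file.
SBP2_PARAM=${SBP2_PARAM:-/sys/module/firewire_sbp2/parameters/workarounds}

# Runtime: applies to devices that are plugged in (or re-plugged) afterwards.
set_workarounds_runtime() {
    echo "$1" > "$SBP2_PARAM"
}

# Persistent: print the line to put into /etc/modprobe.d/some_file or
# /etc/modprobe.conf. It is read only when the module is loaded.
modprobe_options_line() {
    printf 'options firewire_sbp2 workarounds=%s\n' "$1"
}

# Usage (as root):
#   set_workarounds_runtime 1       # then plug the device in
#   modprobe_options_line 1 >> /etc/modprobe.conf
#   modprobe -r firewire_sbp2 && modprobe firewire_sbp2
```

The reload in the last line is only needed for the persistent variant.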

(In reply to comment #18)
> D'oh, my mistake. I didn't check closely enough. The patch with the workaround
> for your device didn't actually make it into the f8 kernel yet. Its definitely
> there for rawhide, and I'll get it into F8 properly soonish.
Oh, OK, no problem.
As soon as it pops up I'll give it a try and report here, so you can close this
bug (or can I close it?).
pg

Got the patch into 2.6.24.3-39.fc8, but I haven't fired up a build (waiting for
other kernel guys to add more stuff before a new build). I think that as the
original bug reporter, you do have permissions to close the bug. I'm somewhat
inclined to just close it anyhow, we're about 99.999% sure this fixes the
problem, given that manually specifying the workaround fixes it.

(In reply to comment #21)
> Got the patch into 2.6.24.3-39.fc8, but I haven't fired up a build (waiting for
> other kernel guys to add more stuff before a new build). I think that as the
Thank you very much!
I'll watch the koji page closely :-)
> original bug reporter, you do have permissions to close the bug. I'm somewhat
OK, good.
> inclined to just close it anyhow, we're about 99.999% sure this fixes the
> problem, given that manually specifying the workaround fixes it.
Are you in a hurry? :-)
Actually, I like to close bugs, too... ;-)
Anyway, if it is OK with you, I would prefer to give at least one try to the
coming kernel, then I'll close this bug and the other, #429430.
Thanks again for your support!
If you need some help, testing something else in the firewire stack (or others),
just let me know.
pg

(In reply to comment #22)
> > inclined to just close it anyhow, we're about 99.999% sure this fixes the
> > problem, given that manually specifying the workaround fixes it.
>
> Are you in hurry? :-)
Nah, I just like to keep the bug list I'm looking at free of stuff
that I'm pretty sure is already fixed, or I get easily distracted. :)
> Actually, I like to close bugs, too... ;-)
>
> Anyway, if it is OK with you, I would prefer to give at least one try to the
> coming kernel, then I'll close this bug and the other, #429430.
Works for me. I'll just put the bug into NEEDINFO for now (then it's not closed,
but it's also off my main bug tracking view :).
> Thanks again for your support!
> If you need some help, testing something else in the firewire stack (or others),
> just let me know.
Absolutely. I think we're in pretty damned good shape on the storage side now,
might be getting back over to the dv side of the house soon... :)

OK, I quickly tested the new kernel, 2.6.24.3-40.fc8, and it seems to work fine:
hdparm & dd did not cause any errors and the workaround activation is reported
by dmesg (of course, without the modprobe.conf entry).
I'll close the bug, BUT one thing I noticed is that:
cat /sys/bus/firewire/drivers/sbp2/module/parameters/workarounds
returns 0, I hope this is correct.
pg

(In reply to comment #26)
> The 0 is correct there, because there are no globally enabled workarounds,
> they're only being activated for the one device. Glad we finally got it taken
> care of! :)
Yep, now you'll have to fix the DV thing! :-)
Anyhow, thank you very much for your support, well done!
pg