WD Book not working

Description

Hi Lars, hi Greg,

first of all many thanks for your hard work.
I have used all your builds (1-8) in the past. I mostly use them to attach a WD MyBook external hard drive from time to time, and I had no problems so far. The only difference is that with your driver my backup is much faster.

But yesterday I found out that there is another problem. Starting with build 7 of the driver, my camera is no longer able to connect to the PC. After some minutes the camera thinks that the connection is established, but I can't use the drive from OS/2. Additionally, the boot process has changed: it stalls for nearly one minute while loading ndfs32.ifs (NetDrive, latest available version).

Both problems are gone after switching back to build 6.

I have a dual core AMD machine here with ACPI 3.18 running.
If I can provide more info, please let me know.

In the meantime, I have received the WD MyBook you sent me. Many thanks.
I can report that the WD MyBook is properly detected on my system (VIA plug-in PCI card with EHCI and UHCI companion controllers). I cannot properly access it yet; I am currently formatting it with NTFS to do additional checking (I would appreciate instructions on how to create a 465 GB JFS partition on it ...). Nonetheless it is correctly detected and can be seen as a USB device in USBRESMGR (USB Resource Manager). In short: it works ok from a low level data transmission point of view.
1.) Is the WD book not usable at all ? Can you check with USBRESMGR if the drive shows up as a USB device or not at all ?
2.) Can you please try the version of USBEHCD.SYS attached to ticket #6 ?
3.) Can you please try to operate your system with the OS2APIC.PSD driver ? Try to run with OS2APIC.PSD /APIC. I can give additional help on additional switches if that won't work.
4.) Can you please try to operate your system with ACPI.PSD /SMP /PIC ?

Note: my system is a single core system. Nonetheless, I am running the SMP kernel with OS2APIC.PSD /APIC as the SMP kernel is the standard eCS configuration.

0.) In the meantime I have formatted the MyBook as one single NTFS partition. I copied files to it (using WinXP). Under OS/2, using the NTFS.IFS driver, I could properly read those files.
1.) NeoWPS widget is broken. Use the old USBMSDD tool (the one that was used with eCS 1.2. It still exists, let me know if you don't remember how to set it up).
2.) Read the notes in Ticket #6 about the beep. Can you hear a low beep or a high beep ?

If you hear a high beep, you properly get through the "BIOS handoff" phase which seems to be a problem with other systems.

One additional thought: as a test boot your system WITHOUT any PSD driver at all (you can still use the SMP kernel though). You will end up with only using one core. Check if you can then properly use USB.

Maybe another option is to do some kind of manual delta debugging (http://de.wikipedia.org/wiki/Delta_Debugging).
Because we know that one of the changes between build 9 and 10 introduces the problem, maybe you can build different intermediate versions from the svn. Then I can test these, and so we can isolate the change that introduces the problem. What do you think?
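To make the idea concrete, here is a minimal sketch of the search I have in mind. It is a plain bisection over revisions (a simpler relative of delta debugging, not the full algorithm). The revision numbers and the `is_bad` check are hypothetical; in practice `is_bad` would mean "build that revision, boot with it, and see if the WD book is detected". It also assumes that once a revision is bad, all later ones are bad too.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[int], is_bad: Callable[[int], bool]) -> int:
    """Return the first revision for which is_bad() is true.

    Assumes: revisions are sorted, the first one is known good,
    the last one is known bad, and badness never goes away again.
    """
    lo, hi = 0, len(revisions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # problem was introduced at or before mid
        else:
            lo = mid  # problem was introduced after mid
    return revisions[hi]


# Hypothetical example: revisions 100..120, problem introduced in 113.
revs = list(range(100, 121))
print(first_bad(revs, lambda r: r >= 113))
```

With about 20 intermediate revisions this needs only four or five test boots instead of twenty.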

I think it will be more complicated than that. Changes in the USBOHCD driver might have a repercussion on the correct operation of the USBEHCD driver as the USBOHCD driver is the "companion" to USBEHCD on your system. And I have experienced the same issues on my system.
Furthermore, USB is very timing sensitive.
I think I will need to release a complete set of drivers and you will need to check with new USBOHCD and USBEHCD.

1.) Can you unplug and replug the WD book and see if it is then recognized ?
2.) Does the WD book show up in USB dock ?
3.) Does ANY USB 2.0 device (memory stick for example) show up in USB dock or is only WD book affected ?
4.) Can you turn off USB 2.0 in BIOS ? If yes, please do and tell me if then WD book is detected and can be operated. I need to check if there is a general problem in handover between HC (EHCI) and companion HC (OHCI in your case). There will be if 3.) indicates that no USB 2.0 device can be properly operated at all.

1.) Can you unplug and replug the WD book and see if it is then recognized ?

still not recognized

2.) Does the WD book show up in USB dock ?

no

3.) Does ANY USB 2.0 device (memory stick for example) show up in USB dock or is only WD book affected ?

The only device that works is the camera. I also tried two USB sticks, but these are not visible either.

4.) Can you turn off USB 2.0 in BIOS ? If yes, please do and tell me if then WD book is detected and can be operated. I need to check if there is a general problem in handover between HC (EHCI) and companion HC (OHCI in your case). There will be if 3.) indicates that no USB 2.0 device can be properly operated at all.

Turning off USB 2.0 does not help. But without 2.0 the camera is also not working.

Can you just do the opposite: use my latest USBEHCD.SYS and use USBOHCD.SYS 10.162 ?
Yes please take trace (make sure you also use the correct TFF files when you mix drivers).
I am still not clear if it's the USBOHCD.SYS or the USBEHCD.SYS driver that is causing the problem. If you disable USB 2.0 (EHCI), the camera and also the WD book should fall back to using USBOHCD.SYS. But that also seems to fail. Is the camera a USB 2.0 device or a USB 1.x device ?

Can you just do the opposite: use my latest USBEHCD.SYS and use USBOHCD.SYS 10.162 ?

no change; the only detected device is the camera

Yes please take trace (make sure you also use the correct TFF files when you mix drivers).

I do not have the original 10.162 TFF files any longer.

I am still not clear if it's the USBOHCD.SYS or the USBEHCD.SYS driver that is causing the problem. If you disable USB 2.0 (EHCI), the camera and also the WD book should fall back to using USBOHCD.SYS. But that also seems to fail. Is the camera a USB 2.0 device or a USB 1.x device ?

I disabled USB 2.0 in the BIOS and, in a second try, also removed the driver from config.sys. In both cases no device was detected (not even the camera).

Try new rbri.zip. If it doesn't work don't send any traces. If it still does not work, I will need to add additional tracepoints to the driver to really add any benefit to tracing for the problem at hand. As far as I understand USB 2.0 camera works but WD book + other memory sticks do not. That is clearly a timing problem.

Try new rbri.zip. If it doesn't work don't send any traces. If it still does not work, I will need to add additional tracepoints to the driver to really add any benefit to tracing for the problem at hand. As far as I understand USB 2.0 camera works but WD book + other memory sticks do not. That is clearly a timing problem.

No change with the new version :-(

BTW: Can you verify the driver versions in use from the traces? After all these changes during the last weeks, I sometimes fear I have a wrong version running (just to be sure).

The driver currently is not under version control. On the other hand I only have been changing USBEHCD.SYS. Always throw away older versions.
1.) please take a trace for USBEHCD.SYS ONLY. Do not trace USBOHCD.SYS.
2.) only attach USB 2.0 devices. In fact only attach the WD book and nothing else. That will give me a chance to see how the port the WD book attached to behaves: is it powered on ? Does it change to enabled on a port reset ? etc.
3.) as an additional step for you to test if it has an influence: comment out all USBOHCD.SYS from config.sys. See if that changes anything.

Sorry but tracing is the only way for me to gather enough info from your system to really see what's going on. Unfortunately USB HCs from different manufacturers all behave differently. In short: we might need to go additional rounds.

Sorry but tracing is the only way for me to gather enough info from your system to really see what's going on. Unfortunately USB HCs from different manufacturers all behave differently. In short: we might need to go additional rounds.

For me that is not a problem. My fear is more that it is no fun for you.

I am not sure if it will fix the problem but:
can you change config.sys lines to read:
BASEDEV=USBEHCD.SYS /S:1 /V /FS
BASEDEV=USBOHCD.SYS /V /FS
BASEDEV=USBOHCD.SYS /V /FS

In particular the /S:1 is important. You can also vary it from 1,2,4,8,16,32,64 and see if that makes a difference. Also /FS might be necessary so that on shutdown the ports are reset. Just experiment, start off with the lines given above.
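If it helps with the experimenting, here is a small sketch that just prints the config.sys variants for each /S value so they can be pasted in one at a time. It only generates the lines given above; the helper name `config_lines` and the default of two USBOHCD.SYS instances are assumptions taken from the lines as written, not anything the drivers require.

```python
from typing import List

# /S values suggested above for USBEHCD.SYS.
S_VALUES = [1, 2, 4, 8, 16, 32, 64]


def config_lines(s_value: int, ohci_instances: int = 2) -> List[str]:
    """Build the BASEDEV lines for one /S variant (hypothetical helper)."""
    lines = [f"BASEDEV=USBEHCD.SYS /S:{s_value} /V /FS"]
    lines += ["BASEDEV=USBOHCD.SYS /V /FS"] * ohci_instances
    return lines


for s in S_VALUES:
    print(f"REM --- variant /S:{s} ---")
    print("\n".join(config_lines(s)))
```

Test one variant per boot and note which /S value (if any) makes the WD book appear.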

Your trace output looks like what I would expect and is comparable to what I see on my system therefore device attach/detach as such does not seem to be the problem. I will need to add additional tracing to see if the HC is actually executing or just sitting there doing nothing.
Another cause of problem might be that the ISR is never executed (again, I would need to add tracing to the ISR to see if it fires).
Is it possible that there is an IRQ conflict in your system ?
Your initial pci.exe output shows that the EHCI controller is on the same IRQ as some other device. As a test, try to use no PSD at all (which will leave you with one CPU only), or use OS2APIC.PSD /APIC or use latest ACPI.PSD 3.19.14.

Also, can you completely power down your system and power back up ? I have to do that on my development machine as power down via front button is NOT the same as powering it off and back on via power switch on the rear.

I still think I screwed up something about root hub handling. Please try with new rbri.zip.
As an additional test: plug in your WD book into a port. Hopefully it'll work. Then unplug the WD book and plug in a USB 1.x device (USB mouse etc.) into the SAME port. It should then also work. You should be able to go back and forth.

I still think I screwed up something about root hub handling. Please try with new rbri.zip.
As an additional test: plug in your WD book into a port. Hopefully it'll work. Then unplug the WD book and plug in a USB 1.x device (USB mouse etc.) into the SAME port. It should then also work. You should be able to go back and forth.

No change; would you like to have a new trace ?

By the way: did you change anything about your system ? Installed additional memory and the like ?

I switched from ACPI 3.18 to ACPI.PSD 3.19.14. But at the moment I have to run this one with some special parameters because of traps (http://svn.netlabs.org/acpi/ticket/524). Nothing else changed.

To be honest: I am clueless as to why it won't work. The camera still works, correct ?
If yes, it cannot be an IRQ problem ...
Have you completely powered off and repowered the system ? Are you using this system with other OSes ?
Also as a test: Please check without any PSD loaded. This works also with SMP kernels (but only 1 processor will be active).
Also, comment out USBOHCD.SYS from config.sys and try with USBEHCD.SYS only.
Also, can you try to plug your WD book into the same port that you plugged the camera ?

I now backlevelled the only significant thing for EHCI that changed in between usbhcd9 and usbhcd10. Retry rbri.zip.
If it works I will have a conflict: I had to change EHCI BIOS handover due to ticket #6. I will have to find a solution that satisfies both ...

Yes, beeps stem from using /FS. The beeps have different pitches with increased frequency from USBUHCD.SYS - USBOHCD.SYS - USBEHCD.SYS. In short: USBEHCD.SYS gives a very high beep, USBOHCD.SYS (in your case) a lower beep. There is a beep per driver instance, for you that's 1 high beep from USBEHCD.SYS and 2 lower beeps from USBOHCD.SYS.
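As a rough cross-check of how many beeps to expect, the rule above (one beep per driver instance loaded with /FS) can be sketched like this. The function name and the parsing are assumptions for illustration only; it simply counts BASEDEV lines in a config.sys text and says nothing about the order in which the beeps occur.

```python
from typing import Dict

# Host controller drivers, listed from lowest to highest beep pitch.
DRIVERS = ("USBUHCD.SYS", "USBOHCD.SYS", "USBEHCD.SYS")


def expected_beeps(config_sys: str) -> Dict[str, int]:
    """Count driver instances loaded with /FS (hypothetical helper)."""
    counts = {name: 0 for name in DRIVERS}
    for line in config_sys.splitlines():
        text = line.strip().upper()
        if not text.startswith("BASEDEV="):
            continue  # skips REM lines and everything unrelated
        parts = text[len("BASEDEV="):].split()
        if parts and parts[0] in counts and "/FS" in parts[1:]:
            counts[parts[0]] += 1
    return counts


sample = """BASEDEV=USBEHCD.SYS /S:1 /V /FS
BASEDEV=USBOHCD.SYS /V /FS
BASEDEV=USBOHCD.SYS /V /FS
"""
print(expected_beeps(sample))
```

For the sample lines this gives one USBEHCD.SYS instance and two USBOHCD.SYS instances, matching the 1 high beep plus 2 lower beeps described above.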

In comment 4 you state that usbhcd9 (version 10.171) works ok whereas usbhcd10 (version 10.172) does not. Are you now saying that usbhcd9 did not work ?

1.) Look back at usbhcd6 (10.168, SVN: 93) and usbhcd7 (10.169, SVN: 116):
moved PCIPMPowerUp from INIT_COMPLETE to INIT. It used to be right before EHCIStopBios. Reconsider also for USBUHCD.SYS + USBOHCD.SYS ?
2.) reconsider setting just PORTSC_P_OWNER instead of (PORTSC_P_OWNER | PORTSC_WKDSCNNT_E)
3.) restore EHCI handover, change didn't help

How much RAM do you have installed ? More than 2GB, correct ? How much ? If it is not too much hassle (and as a test only, a fix will follow): reduce your RAM to <= 2 GB and retry with latest drivers.
I have a suspicion that all this centers around a nasty NVidia chipset EHCI HC related HW bug.
I DID make a change in dynamic memory allocation in between usbhcd9 and usbhcd10. But I thought it wouldn't turn out to be that relevant ...

0.) please let me know how much RAM your system has. I'd expect it to have at least 4 GB.
1.) send trace in any case (if fixed or not)
2.) reget device report from USB dock for the camera. It should now announce itself as a USB 2.0 device (with a non-functional USBEHCD it announces itself as a USB 1.1 device ...)
3.) let me know if all your USB 2.0 devices now work ok
4.) it should now work with/without ACPI.PSD regardless. The error was somewhere else, see below.

If it works: The error was so fundamental that about everything could have happened, in particular, the EHCI HC DMA engine could overwrite/corrupt memory in the system arena with whatever side effect that can have. You should retry to run ACPI.PSD without any parameters (unless you also had problems with USBEHCD.SYS 10.162). If it now works, report back to the ACPI developer.
Be aware that your EHCI HC HW has a severe bug and what I did was to work around it. The negative side effect is that more memory <= 16 MB physical address is now used. This could lead to the well known problem that if you have many device drivers loaded (where for historical reasons most of these dynamically allocate below the 16 MB phys. addr. boundary) that some device driver might refuse to load. The USBEHCD.SYS device driver is now behaving like 10.162 in that respect.

How much RAM do you have installed ? More than 2GB, correct ? How much ? If it is not too much hassle (and as a test only, a fix will follow): reduce your RAM to <= 2 GB and retry with latest drivers.
I have a suspicion that all this centers around a nasty NVidia chipset EHCI HC related HW bug.
I DID make a change in dynamic memory allocation in between usbhcd9 and usbhcd10. But I thought it wouldn't turn out to be that relevant ...

Too late, I have already disassembled the PC.
I have 4GB RAM.
What I did:

The latest trace you have taken: do I interpret correctly that you have now taken this trace with 2 GB of RAM but with the OLD driver (in other words: not my latest version of rbri.zip) ?
For your info: I am looking at "EHCIResetHost Trace Exit": HCOR.fmListBaseAddr and HCOR.nextAsyncListAddr. For your system with the quirky NVidia chipset, it is essential that these physical addresses are <= 2 GB if you have >= 4 GB of RAM installed (which you normally do, as you have 4 GB installed).
On the next trace (taken with new rbri.zip) you should then see physical addresses <= 16 MB (that is <= 0x1000000). Currently I see 0xc7e3d000 and 0xc7e47000, that is somewhere above 3 GB.
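To check the addresses from a trace yourself, the comparison can be sketched as below. The two sample values are the ones quoted above; the `classify` helper and its labels are just an illustration of the thresholds mentioned (16 MB and 2 GB), not anything taken from the driver source.

```python
MIB = 1 << 20
GIB = 1 << 30


def classify(addr: int) -> str:
    """Place a physical address relative to the 16 MB and 2 GB limits."""
    if addr <= 16 * MIB:  # 16 MB == 0x1000000
        return "<= 16 MB"
    if addr <= 2 * GIB:   # 2 GB == 0x80000000
        return "<= 2 GB"
    return "> 2 GB"


# The two addresses seen in the current trace.
for addr in (0xC7E3D000, 0xC7E47000):
    print(f"0x{addr:08x} is {addr / GIB:.2f} GiB -> {classify(addr)}")
```

Both current values land in the "> 2 GB" class; with the new driver they should land in "<= 16 MB".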

I don't pretend to understand it. I will do some quick hack to step through the memory allocation routines. Please check in the next couple of days if everything is working as expected. If something goes bad again just report back. For the time being I leave the bug open. Just let me know when I can close it.