Thanks for the configuration tips. I'll give those a try tomorrow or Friday; hopefully that will establish a point of comparison.

Which kernel config do you want? I've got the one for the kernel that I primarily use with the iwlwifi driver compiled into the kernel, and the other experimental one that's the same except the iwlwifi driver is a module. I'd be happy to post either (or both).

jyoung ... well, for the sake of being able to debug this I would suggest you use just one. Now, wasn't there some issue with the kernel module not being able to load because it was already compiled into the kernel? This should never happen, and I think you should decide whether the driver and firmware are to be in the kernel or not, as I'm inclined to think this issue is due to a mistake on your part (possibly not running make && make modules, or not copying the kernel to /boot, or your bootloader pointing to a previously compiled kernel ... something of that nature).

Unless we know that the driver/module/firmware is loaded and working correctly, it's difficult to debug things in userland, so this needs to be straightened out first. I'd suggest recompiling the kernel from a clean slate (at least a 'make clean').

You would then check that the bootloader has an entry for this specific kernel, and that it is the default kernel at boot time (or is selected at boot time). You should then check whether the driver is loaded (via dmesg) or can be modprobe'd (without error). This would then be the kernel config to pastebin, and the kernel that all future information is in sync with.

khayyam, I do kind of agree that it sounds like I've missed a step (forgotten to copy over the kernel, booted off the wrong kernel, set up grub.conf to boot off the wrong kernel, etc.), and that was my reaction when I started getting this error. But I've gone through the steps, and I'm not missing one that I'm aware of. That said, I'd be happy to post a line-by-line breakdown of what I'm doing when I recompile my kernel, as well as grub.conf.

When I posted back on Sept. 7, I mentioned an even more zealous experiment that I tried, going back to the original config file and making only one change - setting iwlwifi to be compiled as a module - and even then the problem was present. It seems really weird that a fresh setup would have this problem.

When I went to wpa_supplicant after having problems almost every time a kernel or wicd/networkmanager version changed, I uninstalled wicd and/or networkmanager, and also manually removed ALL related config files I could find that emerge -C seemed to miss. Then I removed them from all run levels.

Since then, about 6 months ago, I've never had a single connection problem that wasn't directly due to my router or cable modem location (just too many walls and appliances between it and my computer, and a lousy Cox cable connection). This is with three Gentoo installs, and various other distros like Arch, Mint, Ubuntu, Pclos, Mageia, Kubuntu, and a few others.

It sees your wireless, just iwconfig needs the legacy wireless extensions which are disabled by default for any modern nl80211 based driver.
Use iw instead of iwconfig, or if you depend on wireless-tools for whatever reason enable CONFIG_CFG80211_WEXT in your kernel. wpa_supplicant should be passed the -Dnl80211 parameter.

Looking at this thread is worth it, as it's full of iwlwifi details discussed by long-time users and one Gentoo dev.

I re-read your thread and didn't see any CONFIG_CFG80211_WEXT reference (maybe I missed it), but since apparently iwlwifi needs CONFIG_CFG80211_WEXT enabled but it's disabled by default, maybe you missed it? Whether or not it would help in your case is another question, but if it's disabled it might be worth a shot.

Sorry, I've been a bit slow in responding; I haven't had much time to go over the .config.

There seem to be two issues here: the first is the driver loading as a module and the second is the disassoc (which I think may, as wrc1944 suggests, have some relation to CONFIG_CFG80211_WEXT).

Having taken a close look at the above .config I think perhaps the reason the module is seen to be loaded when calling modprobe is as I first suggested some posts back. The debug options for iwlwifi are enabled (CONFIG_IWLWIFI_DEBUG=y, CONFIG_IWLWIFI_DEBUG_EXPERIMENTAL_UCODE=y, and CONFIG_IWLWIFI_DEVICE_TRACING=y), the 'ucode' option "enables the use of experimental ucode for testing and debugging", CONFIG_FIRMWARE_IN_KERNEL=y and CONFIG_EXTRA_FIRMWARE="iwlwifi-6000-4.ucode" are also enabled, so the ucode is in kernel, and I assume that this then treats the module as (partially) builtin. So, the solution seems to be to disable the DEBUG options, and not include the firmware in kernel. The firmware should then be loaded when the module is loaded.

As to the second problem: CONFIG_CFG80211_WEXT=y but the card uses -Dnl80211, so the use of WEXT may be unnecessary here (the posts wrc1944 links to explain better the possible issues ITR). Issues of disassoc are in my experience often related to powersaving, and in that regard you have CONFIG_PM_RUNTIME=n (which allows I/O devices to be put into energy-saving states) and CONFIG_CFG80211_DEFAULT_PS=y (which enables powersave for 80211). My guess is that both of these should either be enabled or disabled.

best ... khay

EDIT: seeing that the card is an iwlwifi you might also want to check this thread

The last few posts have been great - a lot of ideas, and the two threads seem like they could be a real help with the wpa_supplicant issues. Unfortunately, I'm starting to think I have a more serious problem.

After trying a few things based on the recent posts, I decided to try compiling with . My thinking was that, perhaps, compiling this into the kernel was somehow pulling iwlwifi in. At least it couldn't hurt to try. But the newly compiled kernel didn't load the cfg80211 module at boot, and when I tried to load it manually, it said that it was already in the kernel - just like iwlwifi.

So, I tried two experiments. First, I switched CONFIG_TUN=y to =m. I've been using the universal tun driver compiled into the kernel for a while, but I wanted to see if it would produce the same results as iwlwifi and cfg80211 when compiled as a module. It did; the kernel couldn't load it because it was already in the kernel. Then, I switched CONFIG_B43 to =m. This is a driver for a Broadcom wireless card; I don't have such a card anymore, but I used to, so I'm familiar with the driver. It hasn't been in the kernel or built as a module in a while, but because I knew about it I decided to do the same test on it. This time, 'modprobe b43 --first-time' produced 'FATAL: Module b43 not found.' I checked, and that is the right name for the module. In fact, the module was compiled and lives in a subdirectory of /lib/modules/; the kernel just couldn't detect it.

I repeated this several times, making sure that I was compiling the right kernel. I confirmed that the parameters were set in .config after I ran menuconfig. I ran 'make clean' before running 'make && make modules_install'. And, I was certain to copy the new kernel each time over to /boot and that I was booting off the new kernel when I did these tests.

It's starting to look like I've done something horribly wrong that's preventing my operating system from correctly identifying which modules exist and are loaded. I think that I need to resolve this issue before I move forward with the wpa_supplicant situation; any ideas? I could see if there's a new kernel available and emerge gentoo-sources and start from scratch; maybe that would wipe out whatever's causing this issue.

grep CONFIG_CFG80211 /usr/src/linux/.config
CONFIG_CFG80211=m
# CONFIG_CFG80211_DEVELOPER_WARNINGS is not set
# CONFIG_CFG80211_REG_DEBUG is not set
# CONFIG_CFG80211_DEFAULT_PS is not set
# CONFIG_CFG80211_DEBUGFS is not set
# CONFIG_CFG80211_INTERNAL_REGDB is not set
# CONFIG_CFG80211_WEXT is not set
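
For anyone repeating this check across several options, here is a minimal shell sketch that reports how a single option is set in a given .config. The function name config_state is made up for illustration; it only reads the file it is given and says nothing about which kernel is actually running.

```shell
#!/bin/sh
# config_state FILE OPTION   (hypothetical helper, not part of any package)
# Prints "builtin", "module" or "unset" depending on how OPTION appears
# in the kernel config FILE (for example /usr/src/linux/.config).
config_state() {
    file=$1
    opt=$2
    if grep -q "^${opt}=y\$" "$file"; then
        echo builtin
    elif grep -q "^${opt}=m\$" "$file"; then
        echo module
    else
        # covers both "# OPTION is not set" and a missing line
        echo unset
    fi
}
```

Something like `config_state /usr/src/linux/.config CONFIG_CFG80211` should then print "module" for the config shown above. If the running kernel was built with /proc/config.gz support, the same check can be pointed at an unpacked copy of that file to confirm the booted kernel matches the source tree.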

jyoung ... yes, as I've said this is definitely not normal. When running 'make modules_install', 'depmod' is called, and this should create a list of module dependencies (modules.dep). Why this would subsequently be missing information about available modules, and why the kernel would then think they are 'in kernel', I really can't say, but something is amiss.

Can you provide the installed package versions of the following: module-init-tools (or kmod if you're using this in place of module-init-tools), gcc, binutils, and glibc. Also, if you run 'revdep-rebuild -pv' does it show anything (specifically, zlib or module-init-tools) that is broken? I'm clutching at straws as I'm not sure where else to start looking.

(since the first line is just the output header). But, only one loaded:

Code:

lsmod|wc -l
2

The module that's loaded is the nvidia graphics card driver, which seems to load as a module regardless of what I do to the other drivers. This could be an important clue - I haven't messed with that one in over a year, so it seems likely that whatever happened to scramble the kernel's list of modules is more recent than that.

One experiment I could try would be to switch nvidia to be compiled into the kernel, and then see if I can switch it back. But, I dare not do this since a failure would cripple my ability to launch X.

jyoung ... and the Changelog for kmod-9-r2.ebuild (17 Jul 2012) states "remove broken version". So, this package has been out of tree for some time, probably this is the cause as the other packages (bar gcc which is ~arch), and the output of revdep-rebuild, are fairly sane (though you should probably revdep-rebuild and fix geoclue).

khayyam, I'm happy to make those changes, but if kmod is no longer in the tree then maybe I should switch to something else - you mentioned module-init-tools? I'm not actually aware of what the merits of either one are; I'll look them up today when I get to work.

The output of ls -l /lib/modules/$(uname -r) is quite revealing. These files should all have the same date, yet modules.dep is dated Aug 1 21:15, along with several others with an August time stamp. This indicates that these August files were not updated when you compiled the kernel, which would explain the problem with loading modules.

I see khayyam has already advised on a course of action so we'll leave it at that for now and see what happens.
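
The timestamp comparison described above can be automated. The following is only a sketch: the function name stale_modules_dep is hypothetical, and it assumes module files end in .ko (optionally compressed, for example .ko.gz).

```shell
#!/bin/sh
# stale_modules_dep DIR   (hypothetical helper)
# Returns 0 and lists the offenders when any module file under DIR is newer
# than DIR/modules.dep, or when modules.dep is missing. Returns 1 otherwise.
stale_modules_dep() {
    dir=$1
    dep="$dir/modules.dep"
    if [ ! -f "$dep" ]; then
        echo "missing: $dep"
        return 0
    fi
    # find -newer compares modification times against modules.dep
    newer=$(find "$dir" -type f -name '*.ko*' -newer "$dep")
    if [ -n "$newer" ]; then
        echo "modules newer than modules.dep:"
        echo "$newer"
        return 0
    fi
    return 1
}
```

It could be run as `stale_modules_dep "/lib/modules/$(uname -r)"`. A positive result would suggest that depmod did not run, or did not succeed, after the last 'make modules_install'.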

jyoung ... no, I said that particular package version was removed; kmod is still in tree with kmod-10 being the current version (note kmod isn't stabilised, so these packages are keyworded ~arch). I made it clear that the package could be updated when I wrote "edit /etc/portage/package.accept_keywords if you're keywording on that particular version, and emerge --update --oneshot kmod" ... I suspect you have '=sys-apps/kmod-9-r2' in package.accept_keywords, which would explain why you still have the broken package and why it wasn't updated after it was removed.
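
A quick way to look for a version-pinned entry of the kind suspected here is sketched below. The function name pinned_atoms is invented for illustration, and the sketch assumes package.accept_keywords is a single file rather than a directory of files.

```shell
#!/bin/sh
# pinned_atoms FILE PKG   (hypothetical helper)
# Prints the lines of FILE that keyword one exact version of PKG, i.e.
# lines that start with "=PKG-<digit>". PKG is given as category/name.
# Exit status is that of grep: 0 if a pinned entry was found.
pinned_atoms() {
    file=$1
    pkg=$2
    grep -E "^[[:space:]]*=${pkg}-[0-9]" "$file"
}
```

For example, `pinned_atoms /etc/portage/package.accept_keywords sys-apps/kmod` would print a line such as '=sys-apps/kmod-9-r2' if one is present. Replacing such a line with an unversioned entry for the package lets later versions be picked up by 'emerge --update --oneshot kmod'.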

Hey, thanks folks for the fix! With the update to kmod I recompiled my kernel and, after rebooting, iwlwifi loaded just fine as a module.

With that more serious problem out of the way, I'm interested in the wpa_supplicant issue. With iwlwifi as a module, I was able to use the '-Dnl80211' option instead of '-Dwext'. Unfortunately, that doesn't seem to have affected my connectivity. I'm still getting a spotty connection that drops after a few minutes on the secured network (at work). With the modifications to wpa_supplicant.conf that khayyam suggested last week (thanks), I did the experiment of connecting to other secured networks. Surprisingly, that worked just fine. It seems like it's just this one network (or kind of network) that's giving me trouble.

There's a couple of ideas that some of you posted about on October 7th. I'm planning on exploring those this afternoon, and I'll post back when I have results.

jyoung ... ok, good.

jyoung wrote:

With that more serious problem out of the way, I'm interested in the wpa_supplicant issue. With iwlwifi as a module, I was able to use the '-Dnl80211' option instead of '-Dwext'. Unfortunately, that doesn't seem to have affected my connectivity. I'm still getting a spotty connection that drops after a few minutes on the secured network (at work). With the modifications to wpa_supplicant.conf that khayyam suggested last week (thanks), I did the experiment of connecting to other secured networks. Surprisingly, that worked just fine. It seems like it's just this one network (or kind of network) that's giving me trouble.

This suggests that the issue isn't with powersave, as that would affect all connections and not simply one network. Is the problem AP's ESSID hidden, or are there multiple APs with the same ESSID (some sort of WDS setup)? If so, try adding "ap_scan=2" to wpa_supplicant.conf. Also, what sort of network is this (A,G,N ... A,G ... N only)? Some APs have issues with N clients. You can disable N by passing "11n_disable=1".

/etc/modprobe.d/iwlwifi.conf

Code:

options iwlwifi 11n_disable=1

or via modprobe ...

Code:

# modprobe -r iwlwifi && modprobe iwlwifi 11n_disable=1

jyoung wrote:

There's a couple of ideas that some of you posted about on October 7th. I'm planning on exploring those this afternoon, and I'll post back when I have results.

If you are talking about kernel options re powersave then this is looking less likely. I'm inclined to think the issue is something specific to the AP.

I agree; yesterday and today I tried first disabling CONFIG_CFG80211_DEFAULT_PS and then disabling CONFIG_CFG80211_WEXT. There might be a *slight* improvement, but with spotty connections, it's hard to say. I'd have to collect data for a few days on how long I remain connected before I get dropped, but in any case the connectivity isn't great and it certainly isn't much of an improvement over what it was before.

I'm in a situation where there are multiple APs with the same ESSID. What is the functional meaning of 'ap_scan=2'?

I'm actually not sure what kind of network it is - can I tell from iwlist? I found the thread from the October 7 post really interesting. It would be too bad if I couldn't take advantage of N speed, but at this point I'm more interested in having a reliable connection.

... this may not be the case with iwlwifi, but the point is there is a common symptom of disassoc with powersave. Again, I don't think this is the reason your connection to that one AP drops, as the symptom would be seen with other APs.

jyoung wrote:

I'm in a situation where there are multiple APs with the same ESSID. What is the functional meaning of 'ap_scan=2'?

OK, I'm fairly certain this is what's causing the disassoc. 0 = wpa_supplicant does no scanning itself (the driver handles it), 1 = wpa_supplicant requests a scan and uses the results to select the AP, 2 = wpa_supplicant does not scan but just requests to associate. When the AP doesn't broadcast its ESSID, or there are multiple BSSIDs for the same ESSID, scanning either isn't going to provide any useful information (as in the case of a hidden ESSID) or, in the case of WDS, the BSSID will not allow wpa_supplicant to disambiguate the network. I'm not altogether sure what happens, but I believe that with ap_scan=2 wpa_supplicant will be less likely to get confused when the BSSID doesn't match the ESSID. It's actually an area I need to do more research on.

jyoung wrote:

I'm actually not sure what kind of network it is - can I tell from iwlist? I found the thread from the October 7 post really interesting. It would be too bad if I couldn't take advantage of N speed, but at this point I'm more interested in having a reliable connection.

You mean if the AP is set for N only or mixed A,G,N? Well, I imagine you would see "HT40" for 802.11n and "HT20/HT40" for mixed, in the output of 'iw dev wlan0 scan'. Anyhow, you should really read a little about how 802.11n produces the high throughput, and the reality of the claims being made for it.
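
One way to see which APs advertise 802.11n at all is to look for an "HT capabilities" element in the scan output. The sketch below reads 'iw dev wlan0 scan' output on stdin and prints one line per BSS. The function name ht_report is made up, and the parsing assumes the usual iw layout (a "BSS <bssid>(on <iface>)" line at column 0, then indented "SSID:" and "HT capabilities:" lines). It does not tell N-only from mixed mode.

```shell
#!/bin/sh
# ht_report   (hypothetical helper)
# Reads `iw dev <iface> scan` output on stdin and prints, for each BSS,
# "<bssid> <ssid> HT" or "<bssid> <ssid> no-HT".
ht_report() {
    awk '
        /^BSS / {
            # a new BSS starts: flush the previous one first
            if (bss != "") print bss, ssid, (ht ? "HT" : "no-HT")
            bss = substr($2, 1, 17); ssid = ""; ht = 0
        }
        $1 == "SSID:" { ssid = $0; sub(/^[^S]*SSID: ?/, "", ssid) }
        $1 == "HT" && $2 == "capabilities:" { ht = 1 }
        END { if (bss != "") print bss, ssid, (ht ? "HT" : "no-HT") }
    '
}
```

Usage would be along the lines of `iw dev wlan0 scan | ht_report`, run as root, with wlan0 standing in for whatever the interface is called.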

Well folks, I think disabling N did it. I tried it on Monday, and had success for most of the day. At the end I lost and regained the connection many times in 10 minutes, right before I had to leave. Because of this, I gave it another trial today, and had no issues. I can report back in a few days to confirm and log this thread as solved, but it's looking good. khayyam, good call (and an interesting link).

I also tried 'ap_scan=2' on Monday. Oddly enough, this prevented me from connecting at all.

Also, before we close this out, wrc1944 mentioned a while back having issues with wicd or networkmanager configuration files around. I should probably ask, has anyone had issues with high-level packages depending on networkmanager? During my last world update, networkmanager kept getting pulled in because things like banshee depend on it. This kind of baffles me, and it's rather annoying since I don't really like networkmanager. Nothing against it, but it can't run without X (last time I checked), and I have had issues involving it interfering with other software (iwconfig, in that case).

I see, I should have perhaps also mentioned providing 'scan_ssid=1' in the 'network={}' block. This should make wpa_supplicant omit the broadcast, and skip straight to associate (note if doing this you need to provide 'key_mgmt=', 'group=', 'pairwise=', and 'proto=' for that network). As I said, this is an area I need to research more, but the above should work (at least it has for me), though I tend to keep only one network 'enabled' (meaning, the AP I intend to connect to is 'disabled=0' and the others are 'disabled=1', which may play some role in the matter).
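
As a rough illustration of the settings just described, a wpa_supplicant.conf along these lines might be a starting point. This is a sketch only: the SSID and passphrase are placeholders, and the WPA-PSK/RSN/CCMP values assume a WPA2-Personal network, which may well not match the network in question (a workplace network could be using WPA-EAP, which needs different fields).

```conf
# Sketch of /etc/wpa_supplicant/wpa_supplicant.conf; every value is a placeholder.
ctrl_interface=/var/run/wpa_supplicant
ap_scan=2

network={
    # hypothetical network name
    ssid="ExampleSSID"
    scan_ssid=1
    # assumption: WPA2-Personal; with ap_scan=2 each of the next four
    # settings should name exactly one option
    key_mgmt=WPA-PSK
    proto=RSN
    pairwise=CCMP
    group=CCMP
    psk="example passphrase"
    disabled=0
}
```

The reason for spelling out key_mgmt, proto, pairwise and group is that with ap_scan=2 there are no scan results to negotiate against, so each should carry a single explicit value.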

jyoung wrote:

Also, before we close this out, wrc1944 mentioned a while back having issues with wicd or networkmanager configuration files around. I should probably ask, has anyone had issues with high-level packages depending on networkmanager? During my last world update, networkmanager kept getting pulled in because things like banshee depend on it. This kind of baffles me, and it's rather annoying since I don't really like networkmanager. Nothing against it, but it can't run without X (last time I checked), and I have had issues involving it interfering with other software (iwconfig, in that case).

@jyoung, is networkmanager in your USE flags? If so, any package that has it as an optional USE flag will pull it in.
Some packages may also have a built-in, non-fatal conditional/preferential dependency at runtime to call networkmanager.

It's been over two weeks, and my connection is still solid. I'm going to list this thread as solved. Thanks for all the help!

DONAHUE, I did find networkmanager in my USE flags, thanks!

To sum up for any future readers of this thread, the instabilities in the wireless connection were resolved by forcing the driver to not connect at N speeds, even though the wireless card is an N-speed card.