Asus RT-AC66U + shibby = WAN limited to 100M?

I'm running shibby's Tomato (1.28.0000 MIPSR2-121 K26AC USB AIO-64K) on my Asus RT-AC66U. I had my ISP upgrade my line to a 1G (1000M) connection. But, when they set the line to 1G, the connection goes offline and won't come back online.

The Asus RT-AC66U is capable of 1G speeds, and I've swapped out all my ethernet cables with new (tested) Cat5e cables that can sustain 1G speeds.

But, if I go to Advanced > Miscellaneous > WAN Port Speed the drop down menu only lists AUTO and port speeds up to 100M FULL (1000M FULL isn't listed). Is this drop down list only showing "available" selections? What I mean is, since the WAN port is only reporting a 100M FULL port state (because my ISP has the connection set to 100M on their end), is that why there's no 1000M available in the drop down? Or, should 1000M be available?

I have a really good feeling this is an issue with my ISP (and not my hardware), but just wanted to get other people's opinion. The reason I say I have a good feeling it's an issue on their end is, we also ran a test connecting direct to one of my machines (ISP's ethernet direct into the 1G ethernet port on my computer), and same thing happened—port link speed would be 100M initially (which is correct, because the ISP has the connection set to 100M), but as soon as they set it to 1G (1000M), the port would go offline. But, I know the port on that computer is capable of 1G (as I have it attached to my switch stack right now and the link speed is 1G).

As well, I checked another router I have (an ASUS RT-N16) and it's also capable of 1G on the WAN. But, it's only connected to a 100M line—and same thing—Advanced > Miscellaneous > WAN Port Speed the drop down menu only lists AUTO and port speeds up to 100M FULL (1000M FULL isn't listed). Thus, I'm assuming Tomato limits what's in that drop down to "available" speeds.

To expand a bit on one point: when your ISP says "they set the line to 1G", you should ask them exactly what that means. With 1000mbit (gigE) you are not supposed to hard-set the speed or duplex; 802.3ab mandates it be negotiated between both ends of the link. This may be why you don't see a pulldown option in TomatoUSB (or it could be a bug/missing feature). There is equipment out there (some switches, etc.) which let you force the speed anyway.

Now, that said: if your ISP is hard-setting the speed and duplex, then for this to work correctly, you also need to hard-set the speed and duplex to the same thing. This is the only reliable way to properly work around 802.3 speed/duplex negotiation problems. Let me be crystal clear: you CANNOT have one end using auto and the other end using hard-set speed/duplex. On classic 10mbit and 100mbit networks this would often result in variable behaviour (depends on driver, firmware, and actual PHY capabilities), but most commonly it would result in either wrong speed negotiated or (much more common) incorrect duplex -- and consistently one end would claim something (utilities showing duplex/speed reporting "100/full") even though the actual PHY itself is not operating at that. In the case of 1000mbit networks, if one end is hard-set and the other is auto, it simply won't work: period. And that's certainly what you're experiencing.
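To make the rule above concrete, here is a rough sketch in Python. It is only a simplified model of the behaviour described in this post, not something taken from the 802.3 text, and the `PortConfig` and `link_outcome` names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PortConfig:
    """Hypothetical description of how one end of a link is configured."""
    auto: bool                      # True = autonegotiate, False = hard-set
    speed: Optional[int] = None     # Mb/s, only used when hard-set
    duplex: Optional[str] = None    # "full" or "half", only used when hard-set


def link_outcome(a: PortConfig, b: PortConfig) -> str:
    """Return a rough prediction for the link, following the rules in the post."""
    if a.auto and b.auto:
        return "ok: both ends negotiate"
    if not a.auto and not b.auto:
        if (a.speed, a.duplex) == (b.speed, b.duplex):
            return f"ok: both ends hard-set to {a.speed}/{a.duplex}"
        return "broken: hard-set values differ"
    # One end is auto and the other is hard-set: the case to avoid.
    forced = b if a.auto else a
    if forced.speed is not None and forced.speed >= 1000:
        return "no link: gigE needs negotiation on both ends"
    return f"unreliable: expect wrong speed or duplex mismatch at {forced.speed}"


# Assumed scenario: ISP hard-sets 1000/full, router is left on auto.
isp_forced = PortConfig(auto=False, speed=1000, duplex="full")
router_auto = PortConfig(auto=True)
print(link_outcome(router_auto, isp_forced))
```

With the assumed scenario (ISP forced to 1000/full, router on auto) the sketch reports no link, which matches what you are describing.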

GigE solved all of this nonsense by requiring autonegotiation. And it works about 99% of the time. I have only seen on very rare occasion a case where one of my LAN workstations (which neg'd 1000mbit/full correctly every time) would "randomly" fall back to 100mbit. And I believe that to be purely a Realtek driver bug, because moving to a different brand of NIC (Atheros or Intel) resolved the problem permanently (no cabling changes).

What you may be experiencing, however, is PHY vendor incompatibility between your RT-N16 and whatever PHY type/brand your ISP is using (on whatever you connect to -- it may be a fibre-to-copper bridge somewhere between you and their plant, I simply do not know). They must be using some kind of repeater or otherwise, because you can't run Ethernet for miles, as you know... so there must be a "middle-man" device somewhere in the mix. They need to ensure that they're touching the right device/repeater -- if they mess with one further up the chain, you wouldn't see it (this happens all the time in the enterprise world, where you have router X <--> media converter <--> repeater <--> media converter <--> switch <--> router Y, and some idiot thinks the topology is just router X <--> router Y and thinks that changing things on router Y will somehow magically fix a link problem between router X <--> media converter. Bzzt, wrong!)

I have personally witnessed this problem on RT-N16 H/W revision A1 units when connected to very specific versions and models of Motorola cable modems. It's safe to say that Motorola changed the PHY vendor within their SB6120 products at some point. I documented all of this (and if you read the linked thread, please read it very slowly and in full. (Sadly) The way I wrote it, it's easy to overlook/misunderstand). The original (non-HW-rev-A1) units behaved fine, so like I said, it's a vendor incompatibility thing. It happens.

Chances are your ISP isn't going to be able to replace the PHY on whatever GBIC or mini-GBIC they have you connected to somewhere, so you may end up having to try another model of router, or possibly a different H/W revision of RT-N16 (it's only printed on the bottom of the router, not on the box).

Otherwise, as ridiculous as it sounds, you might try getting an inexpensive 4-port switch and put it in between your connection from your ISP and your router (ex. Ethernet-from-ISP <--> 4-port switch <--> RT-N16 WAN port). The problem with this solution, aside from needing more hardware + AC outlets, is that the only visibility you have into the internal state of the switch are stupid LEDs (many of which just tell you speed and not duplex). You may try it and find that the switch shows 1000mbit to both you and the ISP but still doesn't work, for example (this is one of many areas where consumer-grade hardware fails the team). I make no promises that it'll work, but it's one way around vendor incompatibilities. Otherwise possibly consider investing in a managed gigE switch; HP ProCurve products tend to be reasonably-priced and can give a high amount of detail even at the link level. Oh, and remember: if you try this workaround, the MAC address your ISP sees will change (if they track or register that kind of thing, sometimes for ACLs); they won't see the MAC address of your RT-N16 any longer, only the MAC that's associated with the intermediary switch. Do not use MAC address cloning or spoofing of any kind; all this does is make the situation worse. If you need a product recommendation, I've had good experience with the (8-port) D-Link DGS-2208, but I think it's been EOL'd.

P.S. -- I really hope you don't think you're going to get gigE speeds out of your RT-N16 if using the WAN port because you won't. There isn't enough processing power to handle that amount of packets per second; consumer-grade routers are not made to handle that kind of speed. I think even the higher-end RT-N66U tops out at something like 600mbit. There really isn't any consumer-grade router that I know of which can do this; this is why people using, for example, Verizon FiOS end up using the router/device that Verizon gives them. Otherwise if you really need that kind of speed, you need to start talking to SMB or enterprise-grade companies like Juniper or Cisco or SonicWall to find out if they offer routers that can handle gigE traffic with NAT + whatever features you demand (DHCP server, etc.). Expect prices in the 4-digit range (USD).
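To put a number on "that amount of packets per second", here is a back-of-the-envelope calculation in Python. It assumes the standard Ethernet per-frame overhead on the wire (8 bytes of preamble/start delimiter plus a 12-byte inter-frame gap); the function name is just for illustration, and how many of these packets a given router can actually route and NAT is a separate question this does not answer.

```python
PREAMBLE_AND_SFD_BYTES = 8   # sent before every frame
INTERFRAME_GAP_BYTES = 12    # mandatory idle time after every frame


def max_frames_per_second(link_bps: int, frame_bytes: int) -> float:
    """Theoretical frame rate for a link speed and frame size (header + payload + FCS)."""
    bits_on_wire = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / bits_on_wire


GIGE = 1_000_000_000
print(round(max_frames_per_second(GIGE, 64)))    # smallest frames  -> 1488095
print(round(max_frames_per_second(GIGE, 1518)))  # full-size frames -> 81274
```

So even in the friendliest case (all full-size frames) the router has to handle roughly 81,000 packets per second in each direction, and far more with small packets.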

We initially attempted the connection via auto-negotiating, but it would continually fall back to 100Mb. That's when they tried to force it at 1000Mb but I told them that wouldn't work because, on the router (using shibby's Tomato), 1000Mb isn't an option (only 10Mb/100Mb/Auto). That's when I plugged the connection directly into my MacBook, which I know has/can sustain 1000Mb. Initially, it was set to Auto, but again, dropped to 100Mb. Then we tried to hard-set to 1000Mb, but when we did this, we couldn't even establish a connection.

This is why I'm pretty sure it's more of an issue on their end. My initial feeling is that the run, from the switch in the basement to the server room is too long. Basically, they run the backbone of their network (they only provide internet to a small area here in Toronto) on fibre, installing their own switches in the building they "cover". From there, a Cat6 is run from the switch to your demarc (which is in my server room). I'm making a rough estimate there, but I'm pretty sure the run is over 300', so I'm thinking that the run length might be limiting the connection to 100Mb? But, that's only a guess?

It could also be a PHY incompatibility as well, though I honestly don't know enough about that topic to say either way (would be a shame though, if it was causing issues with both the RT-N66U and the MacBook). But, this will be something we'll need to look into further when the technician comes in.

Really, I was just trying to rule out my hardware/software before getting the technician in to check the run length, etc. In any of the Tomato builds I have running (RAF/Victek, Toastman and Shibby), none of them give the option to hard-set the WAN port to 1000Mb—all three builds only give the options I'd listed above (10Mb/100Mb/Auto). So, I was concerned that the Tomato builds were limited to 100Mb. But, as a test, on the weekend, I plugged the WAN port into my 1000Mb switch and as soon as I did, the WAN port status (on the router) lit up at 1000Mb. So, at least I've answered that question.

And just an FYI, the routers I'm working with are: Asus RT-N66U (RAF/Victek Tomato), an Asus RT-N66U (Toastman Tomato) and an Asus RT-AC66U (Shibby Tomato). I realize I won't get full 1000Mb, but our ISP recently started offering 1000Mb (unlimited transfers) lines for $100 less than the current 100Mb (1TB transfers) package we have. So, while I know we won't get the full 1000Mb, I was pretty sure (based on smallnetbuilder.com tests) any of these routers would be able to get at least 500Mb speeds, and anything beyond that would be a bonus (basically, I figured, getting 5x the speed + unlimited transfer for $100 LESS than I'm paying now would be a good deal <G>). And since I was already running an Asus RT-AC66U (Shibby Tomato) as my primary router (handling the 100Mb connection for over a year now without any issues), I figured it would be an "easy upgrade" (is it ever, though? <G>). I'd been running a Cisco 1841 before that (which was never stable, and could only sustain around 80Mb, and I really hated the Cisco UI) and before that, I was running a SonicWALL (which I really liked, but it was quite old and could only sustain 30Mb). One day when the 1841 was giving me a (weekly) headache, I needed to take it offline. Usually, I'd temporarily swap it out with the SonicWALL, but on that particular day I had an extra RT-N66U, so as an experiment, I quickly configured the Tomato build I had on it (I've been running various versions of Tomato for years on my personal gear) to match what the 1841 was doing and I just found that, in my situation, it out-performed the Cisco box (totally stable, and sustained 90–95Mb on the DL and about 85–90Mb on the upload). So, I just left it connected and "forgot about it".

Anyway, now that I've confirmed the RT-N66U (RAF/Victek Tomato) can do 1000Mb, I'll get the technician in to test their run to my demarc. We'll see what happens...

Re: cable length: for 1000mbit, 300 feet is bordering on too long for CAT5/CAT5e/CAT6/CAT6a/CAT7. The maximum length is 100 metres (328 feet), but degradation often happens earlier than that. You need to include the length of intermediary connections like patch panels and all of that; professionally I start worrying once the cable reaches around 275 feet.
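As a quick sanity check on the arithmetic, here is a small Python sketch that adds up the pieces of a run and compares the total against the 100 metre limit and my own 275 foot "start worrying" mark. The segment lengths in the example are made-up numbers, not measurements of your building, and the function name is just for illustration.

```python
FEET_PER_METRE = 3.28084
MAX_CHANNEL_M = 100.0    # maximum length for twisted-pair Ethernet
WORRY_FT = 275.0         # personal rule of thumb from this post, not a standard


def assess_run(segments_ft):
    """Sum every segment (wall run, patch cables, etc.) and classify the total."""
    total_ft = sum(segments_ft)
    total_m = total_ft / FEET_PER_METRE
    if total_m > MAX_CHANNEL_M:
        verdict = "over limit"
    elif total_ft >= WORRY_FT:
        verdict = "marginal"
    else:
        verdict = "ok"
    return total_ft, total_m, verdict


# Hypothetical example: 300 ft in the wall plus two patch cables.
total_ft, total_m, verdict = assess_run([300, 10, 6])
print(f"{total_ft} ft = {total_m:.1f} m -> {verdict}")
```

The point is that a "300 foot" run is already marginal once the patch cables on each end are counted, and it does not take much more to go over the limit.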

Solutions for Ethernet length problems include:

a) Use multi-mode fibre (not single-mode! That's for super long distance, e.g. up to 80km), then use media converters on both ends (to convert from fibre to copper). The media converters are also responsible for link and PHY negotiation, so having good/high-end ones is necessary. They will also need to be AC-powered (inline ones will not work reliably given the distances of cables being used). A media converter might only be needed on one end as well, depending on what's decided (ex. ISP's device has a fibre mini-GBIC installed in it natively, rather than a 1000BT copper GBIC, so they run a fibre pair to you, which then connects to your media converter, then from that goes pure CAT5e/CAT6 to your router). Just remember that in this topology speed and duplex negotiations happen at 3 places: 1) between the two media converters connected with MM fibre (this should always be gigE/full, no exceptions), 2) between the ISP's media converter and their equipment, and 3) between your media converter and your equipment.

b) Stick to using CAT5e/CAT6 cable, and invest in a gigE Ethernet signal booster. The distance increases vary, but usually they double or triple the length, so up to ~900 feet in a lot of cases. These require AC power, obviously. I believe these are transparent boosters, in the sense that they do not act as intermediary switches (i.e. in this topology: device A <--> cable <--> booster <--> cable <--> device B, device B will see the MAC of device A and vice-versa).

c) Stick to using CAT5e/CAT6 cable, and install an AC-powered switch (that includes one of those little generic 4-port desktop switches) somewhere mid-way or even 2/3rds of the way. Effectively what this does is act as a signal booster but without any signal degradation (should make obvious sense why). The downside to this is that the MACs seen by both ends are the intermediary switch. This is the most inexpensive solution, but is used even in medium-sized businesses who cannot afford to invest in fibre runs or expensive GBICs.

Re: other issues: I understand your predicament. I think vendor PHY incompatibility is still a possibility; if your router and your MacBook both behave identically, then chances are it's the ISP's PHY that is misbehaving (they should open a case with their product vendor and work with them to find out what's going on -- all PHYs have very low-level tracking details that can be used to determine this, but some products don't show this stuff by default. I know how to view this on Juniper devices but not Cisco).

A MacBook is a laptop, so here's an idea: have a technician show up and have him take you to the device that your Ethernet cable actually terminates at (on their end). Bring with you a short (3-5 feet) piece of CAT5e/CAT6 cable (probably not crossover, although if auto-MDIX is involved it doesn't matter). Have him plug one end into your laptop and the other into whatever your gigE port would be on their device.

If 1000mbit suddenly negotiates correctly (which was previously failing on your MacBook when within your abode), then you've now determined cable distance is responsible and can discuss the above options. It's a 5 minute test, if that, and it's pretty conclusive. They should have no problem with it either, barring some security/access restriction concerns, but it's really the only choice. Otherwise if they won't let you do that (silliness!!!), then you can invest in an Ethernet analysis device -- you hook it to both ends of an Ethernet cable and it'll tell you distance, faults, quality degradation, signal level, all sorts of stuff. They are very, very useful for situations like this. A good technician (i.e. your ISP) would have equipment like this readily available for all techs. The one I linked you there is a Fluke device and costs around US$1500 (yes you read that right).
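For the laptop test it helps to read the negotiated speed and duplex directly instead of trusting an LED. The sketch below assumes the test machine runs Linux, where the kernel exposes `speed` and `duplex` files under `/sys/class/net/<interface>/`; a MacBook running OS X would need its own tools instead. The interface name `eth0` and the helper names are assumptions for illustration.

```python
from pathlib import Path


def read_link(iface, sysfs_root="/sys/class/net"):
    """Return (speed_mbit, duplex) for an interface, or None if there is no usable link."""
    base = Path(sysfs_root) / iface
    try:
        speed = int((base / "speed").read_text().strip())
        duplex = (base / "duplex").read_text().strip()
    except (OSError, ValueError):
        # Missing interface, link down, or a driver that does not report speed.
        return None
    if speed <= 0:
        return None
    return speed, duplex


def describe(link):
    """Format the result in the speed/duplex style used in this thread."""
    if link is None:
        return "no link / unknown"
    speed, duplex = link
    return f"{speed}mbit/{duplex}"


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute the real one.
    print(describe(read_link("eth0")))
```

Run it once with the short cable at the ISP's device and once at the far end of the long run; if the first shows 1000mbit/full and the second does not, the cable run is the prime suspect.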

Re: router and throughput: once you solve the above, if you still have speed problems, I do suggest talking to Juniper. They have solutions that can do what you want (NAT + routing + include a couple gigE ports) but they're usually in the 4-digit range. Consumer-grade products are still built and designed around the concept of residential networks being classic ADSL or early-2000s-era cable modems, i.e. 20-30mbit connections at most. And no, I do not work for Juniper or any network equipment vendor.

Well, at this point, whatever the issue, it's on the ISP's end, and these guys are "good guys" and will do whatever is necessary to get the service up and running. We're not talking some massive, nation-wide ISP who doesn't give a damn about their customers; these guys provide this service to a small neighbourhood within a single city (Toronto) and will do pretty much what they have to to get it up and running. They're sending a technician in early next week (my timeline request; they'd have come the next day if it worked for me) to see where the issues lie and we'll then work on getting it resolved. Fingers crossed. <G>

Anyway, a couple questions...

1. If it is, in fact, a PHY issue, would that not affect the 100M service as well? Since we're not using any different hardware between the 100M service and the 1000M service (they're just increasing the advertised rate on the port on their end), would a PHY issue be present "across the board" (i.e., affecting the 100M service), or would it/could it just become apparent when increasing the advertised rate to 1000M (on their end), resulting in it falling back to 100M?

2. Same question with the run length. If the run is beyond an acceptable distance, what would actually happen? Would it just result in degradation of the connection, or would it/could it result in the connection falling back to 100M?

1. No, it would not necessarily affect 100mbit or 10mbit. It has to do with how 802.3 works with regards to gigE and what is supposed to happen if negotiation can't work (hint: it is designed per protocol to fall back to 100mbit and all the nuances of 802.3u). So the latter part of your question/point is correct.

2. If the run is beyond acceptable distance, two things could happen (or a combination of both, e.g. intermittently going between the two): a) signal loss (you don't see link, or lose link), or b) packet loss (an effect of essentially layer 2 problems (more like "layer 1.5" though)). In my experience I've seen (b) happen when the length starts to reach around the 300 foot mark, and I've seen (a) happen when the length started to get around the 320-330 foot mark. Yes, there is a possibility that "somewhere" in there, with some deviation, the link might just fall back to 100mbit/full and work, but I imagine that would work intermittently or have bad signal degradation (probably show up as packet loss).

Electrical distance can be confirmed (accurately) with an Ethernet tester, like the Fluke one I listed (though there are non-Fluke models which are significantly cheaper that can do the same). Signal level (dB) and SNR (noise) can also be checked. Decent models (ex. Fluke) can also determine what sorts of 802.3 negotiation speeds you'd get depending on the quality of cable, its bandwidth (frequency), and other things. E.g. "after 8 link negotiation tests, only 2 of the 8 got link, and only at 100mbit/full".
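If you don't have a tester handy, you can get a crude version of that last kind of report by unplugging/replugging a few times, noting what each attempt negotiates, and tallying the results. A minimal Python sketch follows; the sample data is invented to mirror the example above, and `summarise` is a hypothetical helper, not something a tester or Tomato provides.

```python
from collections import Counter


def summarise(results):
    """results: one entry per attempt, either None (no link) or (speed_mbit, duplex)."""
    linked = [r for r in results if r is not None]
    counts = Counter(linked)
    parts = [f"{speed}mbit/{duplex} x{n}" for (speed, duplex), n in sorted(counts.items())]
    detail = ", ".join(parts) if parts else "none"
    return f"{len(linked)} of {len(results)} tests got link ({detail})"


# Invented data: 8 attempts, only 2 of which linked, both at 100/full.
attempts = [None, None, (100, "full"), None, None, None, (100, "full"), None]
print(summarise(attempts))
```

This says nothing about signal level or noise, so it is no substitute for a real cable test, but it gives the technician something concrete to look at.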

Whoever the ISP is, hopefully their tech has one of these devices. If you or they aren't sure, inform them in advance that they should bring a low-level Ethernet analysis/cable tester -- AND NOT FOR CONTINUITY BUT FOR ACTUAL QUALITY (you know continuity is fine -- you just don't know the actual quality/signal level). Many places do stupid crap like bring out generic $10 devices and say "well I see link, so everything is fine" -- this is an improper analysis and is half-ass. I'd fire anyone who did that (as the old saying goes: if you're going to do a job, do it well and to the fullest capability possible, otherwise don't bother doing it at all). For all we know, the CAT5e that's in the wall is actually horrible quality and works fine for 100mbit but craps the bed with gigE and higher bandwidths (it's very possible/a very real thing).

I know this is an older thread but it's one of the top google results for this issue so I'm going to bump it with what I just ran into.

I just purchased an asus rt-n66r and pretty much right away flashed the latest Shibby build on to it. I kept noticing that when I would plug the n66r into the cable modem (SB6120) the link port was showing a 100mb (amber) connection instead of 1000mb (blue) connection. I didn't have this issue with the last router (e3200 w/ shibby), or with the n66r on the stock or Merlin firmwares. It turned out to be a flaky network cable. As soon as I replaced the old CAT6 cable with new CAT6 cable it was able to negotiate a gigabit connection every single time.

I don't know why but Shibby's tomato on the rt-n66r seems to be much pickier about the quality of the network cable in regards to negotiating a gigabit connection but it definitely is. I spent most of my morning trying to figure out what was going on, resetting NVRAM, flashing stock and merlin builds, and tweaking various other settings until it dawned on me to try replacing the network cable.

Not currently possible with Tomato. Maybe a quarter of that bandwidth is possible right now. If you need features like what Tomato offers on gigabit WAN then you should consider an x86 solution like pfsense.

Humm, so WAN gigabit is now available if I read this current post ? Just to be sure...

minos I think you're misunderstanding the issue. The issue isn't just the link speed. Even with a 1Gb link speed negotiated the WNR3500L isn't capable of pushing 1Gb of data between WAN & LAN/WLAN. This isn't a software issue. It's a hardware issue.

If you're talking about pushing 1Gb of data between WAN & LAN/WLAN, you need more power than any consumer router is capable of providing. That is an incredible amount of data being shoved around, you either need commercial hardware (with ASICs dedicated to shoving packets around) or a PC. The PC is a far cheaper option. It doesn't even have to be a new PC, just find something used with a Core2 inside and you'd likely have more routing power than you'd ever want or need.