PCI-E Lane problem?

So recently (kinda) I burned out one of my 6970s. I RMA'd it and XFX sent me a refurb unit. After installing the refurb and enabling CFX, I started getting strange artifacts when adjusting resolutions. Looked like GPU corruption, sort of like this. Now this ONLY happened with CFX enabled and only on resolution changes, alt-tabs, anything that affected fullscreen, and only for the split second the screen was adjusting. We'll say it lasted 1 frame. It was simply a flash of corruption, blink and you'd miss it. No other artifacts besides that.

So XFX and I have been going back and forth for the last 3 months trying to figure out the problem. Before I contacted them I had already tested some things, e.g. setting my OCs back to normal, updating the mobo BIOS, and trying several ATI drivers. They suggested it might be PSU related since it only happened when the screen was changing: the PSU might not be supplying adequate power while the card downclocks and upclocks in those moments. So I tested the PSU with a PSU tester and believe it or not it actually failed. It was giving me an H.H. on the PG. So I RMA'd the PSU, got the replacement, tested it, and it showed 500 for the PG, which was within spec. So I installed the new PSU and did some testing hoping that was the problem... well... it wasn't.

So now I'm back to pointing fingers at XFX. I tried a couple more things, such as changing the finger I had CFX running through and changing the bridge. Nothing made any difference. Then they told me to test the cards by themselves several different ways, even though I had already told them I'd done that and that the artifacts only show up in CFX.

Last night they told me to download Uni and Furmark (which I already have and had tested with) and run them on each card. So I did, and ran several other game benches as well. To my surprise (not really), my old card was running better than my refurb card. The average FPS difference in the synthetic benchmarks (Uni/Furmark/3DMark11) was about 6 frames in each test, all at the max settings each one allowed.

Now here's where it starts getting interesting. I ran a few game benchmarks (FEAR, Metro, WiC) and the results varied.
In Metro fully maxed, the cards were about 2-3 frames apart on average and minimum frames. But max frames on the refurb card was almost 300, while the other card only hit around 50. Why there was such a big difference I do not know. Then I ran the FEAR benchmark and the refurb card got about two-thirds the average frames of the other card (116 compared to 178) and about half the minimum (62 compared to 122), but the maximum frame rate was equal at 245 fps. So I moved on to the next bench, World in Conflict. Again the older card beat the refurb card, by double digits on average. The older card averaged 74 with a low of 43 and a max of 129. The refurb card ran 64/37/111 (avg/min/max). Fully maxed out that's a pretty big difference, and while it's completely playable at those numbers, it's still a rather large gap.
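To put those gaps into percentages, here is a quick Python sketch that uses only the FEAR and World in Conflict numbers quoted above (Metro is left out since I only have rough figures for it). The names `results`, `deficit_percent`, and `summarize` are made up for illustration, nothing official.

```python
# Frame rates quoted in this post, as (average, minimum, maximum).
results = {
    "FEAR": {"old": (178, 122, 245), "refurb": (116, 62, 245)},
    "World in Conflict": {"old": (74, 43, 129), "refurb": (64, 37, 111)},
}


def deficit_percent(old_fps, refurb_fps):
    """Percent by which the refurb card trails the old card."""
    return round((old_fps - refurb_fps) / old_fps * 100, 1)


def summarize(table):
    """Build {game: {"avg": pct, "min": pct, "max": pct}} from the table."""
    labels = ("avg", "min", "max")
    return {
        game: {
            label: deficit_percent(old, refurb)
            for label, old, refurb in zip(labels, cards["old"], cards["refurb"])
        }
        for game, cards in table.items()
    }


summary = summarize(results)
for game, row in summary.items():
    print(game, row)
```

By this math the FEAR average is roughly a 35% deficit and the minimum roughly 49%, while World in Conflict trails by about 14% across the board.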

So I will add that this whole time I was lazy and didn't want to take out either card, because there's a lot of stuff that has to be disconnected and reconnected, etc. So both cards were left in and I simply unplugged the one that wasn't being tested but left it in the slot. So both cards were running at x8 this whole time. As a last-ditch effort, just to keep the testing environment as even as possible, I completely took out the refurb card, which was in my primary PCI-E lane, and swapped in the older card, which had been in the secondary PCI-E lane. So now that card was the only card in the system and the lane was running at a full x16. And to my surprise (kinda... more like dread), upon running the benches I got the exact same numbers as the refurb card...

So all day while running the boring tests over and over I was thinking about pointing more fingers at XFX, and just before finishing I found out that it's probably not the card but my primary PCI-E lane. So my question is... is the PCI-E lane no good? What would cause this sort of thing to happen? Would it cause those kinds of artifacts? I suppose if the two cards in CFX aren't running in sync, with one card not fully performing because of connection issues, then artifacts are possible, in theory at least? What are your thoughts? Is this an RMA'able offense? Should I report my findings to XFX? I still don't know if this is the cause of the artifacts in CFX, as I still wasn't able to replicate the problem on either card by itself. So it is possible that one of the cards is the culprit for the original problem and this is just another side problem like the PSU.

TL;DR: Skim through paragraphs 5-6 and then read the last paragraph.

Edited by dekciW - 4/25/12 at 8:26pm