As the title says, this blog is about the PC world, new trends and new technologies. I'm personally a chip fanatic, but I like programming too. Anything that fits these categories will be discussed here.

Monday, June 13, 2011

Update: I'm waiting on the complete AT review (still in preview stage). He did retest the A8 with 1600 and 1866MHz RAM and it made a massive difference. On the CPU side Llano is around 3-8% faster than Deneb at the same clock, a solid improvement but nothing spectacular. For a slightly tweaked shrink it is a good result. Turbo may be a bit of a letdown, since the max. turbo states are rarely hit due to the shared TDP, in which the GPU part is prioritized over the CPU cores. The top desktop part has no Turbo and works at a fixed 2.9GHz clock, with power management P-states in between (800MHz-2.9GHz).

I'm making a chart that summarizes Llano's general performance vs Sandy Bridge parts, stay tuned.

Sunday, June 12, 2011

You all remember my original blog post about the BD ES weirdness going on in the Far East (and probably elsewhere). The problem is/was that those chips are gimped in many ways, so that the competition is unable to figure out the true potential of the first new AMD core design since the original K7.

Intel announced yesterday the new AVX2 ISA extensions that should be introduced with Haswell in 2013. We finally get 256-bit integer AVX instructions, since AVX1 was limited to FP when it comes to 256-bit support. There are other additions, like support for FMA (256b/128b, but FP only), but the main one is 256b integer SIMD support.

It seems someone got hold of a retail A8 3850 part and tested it here. Thanks go to dresdenboy for the link.
The part works at 2.9GHz (correction: I originally wrote 2.6GHz with turbo up to 2.9GHz, but this model has no turbo and runs at a fixed 2.9GHz). The GPU part works @ 600MHz and features 400SP, so it is on par with the 6550M discrete part.

Now on to results!

The user ran a set of Futuremark tests: 3dmark11, Vantage and 3dmark06. He also managed to OC both the CPU and GPU portions of the part to some rather high levels, just via serial bus tuning (45%). He used air cooling. Final OC speeds are 3.77GHz (30% OC) for the CPU and 870MHz (45% OC) for the GPU. DDR3 was also OCed, to 2320MHz (45% OC). The memory OC is very important, since the GPU still depends on memory BW, and it ensures that GPU performance scales linearly with the (GPU) clock speed increase.
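As a quick sanity check on those percentages, here is a minimal Python sketch. The stock CPU and GPU clocks (2.9GHz and 600MHz) come from this post; the helper name `oc_percent` is made up for illustration, and the stock memory speed is not stated anywhere in the post, so it is only back-calculated from the 45% figure.

```python
def oc_percent(stock_mhz, overclocked_mhz):
    """Overclock expressed as a percentage over the stock clock."""
    return (overclocked_mhz / stock_mhz - 1.0) * 100.0

# Stock and overclocked clocks quoted in the post, in MHz.
cpu_oc = oc_percent(2900, 3770)
gpu_oc = oc_percent(600, 870)

# Assumption: the memory got the same 45% bump as the reference clock,
# so the implied stock speed is simply the final speed divided by 1.45.
implied_stock_ddr3 = 2320 / 1.45

print(f"CPU OC: {cpu_oc:.0f}%")
print(f"GPU OC: {gpu_oc:.0f}%")
print(f"Implied stock DDR3 speed: {implied_stock_ddr3:.0f}MHz")
```

The CPU ends up at a lower percentage than the bus, which would fit a reduced CPU multiplier, but that part is my guess and not something the tester stated.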

Thursday, June 9, 2011

Sorry in advance for the longer post :).
Disclaimer: This is just my speculation, and it is probably just that, speculation. I have not signed any NDA documents, nor do I have the hardware discussed here.

Since we have all been witnesses to very strange Zambezi (and Llano) ES scores, this is my attempt to "predict" and explain what may have been going on with these scores. I will post what I expect as an end score, so in the end, when Zambezi launches, we can see how far from the real thing I was :).

Let's start with my theory about why Zambezi X8 has such low scores. I do believe there is at least some BIOS microcode patching going on, but mostly it's something else. As dresdenboy suggested before, and I agree with him, there is some power cap pre-programmed into the ES we are seeing in Chinese forums. This may explain the frowning AMD motherboard partners, who received the same tweaked chips for the validation process ;).
Just like in Llano's case, actual clocks are being kept really low in order to keep the CPU within the TDP spec (via the Turbo 2.0 interface) that AMD designated in the ES sampling process. This may be 35, 45, 65, 95 or 125W. From the looks of things, current BD ES are limited to 45 or 65W and they keep throttling down whenever the limit is crossed (measured and estimated digitally in BD).
What does this mean in practice? Just as in the case of the Llano ES in the "New Llano leaks" thread, the BD ES throttles down to approx. half its clock speed in single-threaded workloads (from what is shown in CPU-Z). This also happens in MT workloads, as the limit is easily reached in that case. There seems to be a limited "Turbo" ability too, so a 2.8GHz ES part, say, may be able to Turbo to what I think is 2.0GHz (10x multi in reality :P) or up to the power limit, which is reached in this case.
So for example, the 2.8GHz ES (a 1.4GHz chip with 2GHz Turbo and advanced C6 power savings turned ON) scored 23.4s in SPi. When the tester disabled C6 and seemingly locked the ES @ 3.2GHz (1.6GHz effectively, while preventing the cores from going into deep sleep and thus reaching the TDP limit sooner), the SuperPI result actually got worse, at 26.7s. The cores now did "Turbo" only approx. one multiplier up and finished the test at 1.8GHz. This is in line with the slower SPi time.
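To show what "in line" means here, a small Python sketch of the check. It assumes SuperPI time scales inversely with the effective clock, which is my simplification and not something measured; `scaled_time` is just an illustrative helper name.

```python
def scaled_time(time_s, clock_from_ghz, clock_to_ghz):
    """Rescale a run time assuming it is inversely proportional to clock."""
    return time_s * clock_from_ghz / clock_to_ghz

# First run: 23.4s with the cores (as I believe) turboing to 2.0GHz.
# Second run: cores (as I believe) finishing the test at 1.8GHz.
predicted_s = scaled_time(23.4, 2.0, 1.8)
observed_s = 26.7

error_pct = (observed_s - predicted_s) / observed_s * 100.0

print(f"Predicted: {predicted_s:.1f}s, observed: {observed_s:.1f}s")
print(f"Difference: {error_pct:.1f}%")
```

The two clocks are my reading of the leaks, so a small gap between the predicted and observed time is to be expected.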

Now that I have explained my theory and what I think is going on here, let's move on to my prediction of the scores, all based on the Chinese leaks thread.

Next one is Fritz chess. This is a tricky one. The 1-core score from here is 1877pts, with C6 enabled and turbo limited to 2GHz. The user runs the MT test with 8 cores and gets a 9454pts result. How is this possible? Well, in my opinion, the TDP limit kicked in again, limiting each core to 1.6GHz while the multithreaded (MT) test was run. We know that a module scales to 80% of a hypothetical native Bulldozer dual-core design (as per AMD themselves), meaning a 6.4x factor instead of a perfect 8x => 0.8 x 8 = 6.4. We have: 1877 x (1.6/2) x 6.4 = ~9610pts, close enough huh? The error is under 2% of the actual score ;).
What do I think the score of a 3.2GHz Zambezi 8C will be in Fritz chess? 19220pts, give or take 2%.
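The Fritz arithmetic above, written out as a minimal Python sketch. The scores, clocks and the 80% module scaling are the ones quoted in this post; linear scaling with clock is my assumption, and `fritz_mt_estimate` is just an illustrative name. It prints the unrounded values.

```python
SINGLE_CORE_SCORE = 1877     # 1-core Fritz score from the leak
SINGLE_CORE_CLOCK_GHZ = 2.0  # clock I believe that run turboed to
MODULE_SCALING = 0.8         # AMD's figure vs. a native dual core
CORES = 8
LEAKED_MT_SCORE = 9454

def fritz_mt_estimate(clock_ghz):
    """Estimated 8-core score, assuming linear scaling with clock."""
    per_core = SINGLE_CORE_SCORE * clock_ghz / SINGLE_CORE_CLOCK_GHZ
    return per_core * CORES * MODULE_SCALING

throttled = fritz_mt_estimate(1.6)  # the power-capped ES, per my theory
retail = fritz_mt_estimate(3.2)     # hypothetical retail 3.2GHz part

error_pct = (throttled - LEAKED_MT_SCORE) / LEAKED_MT_SCORE * 100.0

print(f"Throttled ES estimate: {throttled:.0f}pts ({error_pct:.1f}% off the leak)")
print(f"Retail 3.2GHz estimate: {retail:.0f}pts")
```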

Next one is the popular Cinebench 11.5. The "gimped" BD ES scored just 4.6pts. Too low? You bet. This is in line with a Phenom X4 @ 3.5GHz, while this was supposed to be a brand new octal core from AMD running at close to 3.2GHz. Well, the explanation is easy and is again, as in the previous case, power capping.
So we have a supposedly 2.8GHz part scoring 4.6pts in C11.5. As my theory goes, this is actually the score of a 1.4GHz or 1.6GHz 8C Zambezi which is limited via the power cap (I don't know what they set in BIOS, 2.8GHz or 3.2GHz). What will be the score of a retail 3.2GHz Zambezi in this benchmark? My estimate is as follows: 1) worst case scenario, 4.6pts x 3.2/1.6 = 9.2pts; 2) best case scenario, 4.6 x 3.2/1.4 = 10.51pts. Now 10.51pts may sound too high, since the 980X scores 9.2pts, but remember this slide?

This AMD slide, leaked by Donanimhaber, states that Zambezi 8C @ XX GHz will be approximately 1.80x faster in C11.5 than the Thuban X6 1100T, which scores 5.9pts. That works out to: 5.9 x 1.8 = 10.6pts. Close enough?
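Here are the two Cinebench estimates and the slide figure side by side, as a short Python sketch. All inputs are the numbers quoted in this post; the linear scaling with clock is my assumption, and `scale_score` is only an illustrative helper.

```python
ES_SCORE = 4.6          # C11.5 score of the power-capped ES
RETAIL_CLOCK_GHZ = 3.2  # assumed retail clock

def scale_score(score, from_ghz, to_ghz):
    """Rescale a score assuming it is proportional to clock speed."""
    return score * to_ghz / from_ghz

# The ES really ran at either 1.6GHz or 1.4GHz, per my theory.
worst_case = scale_score(ES_SCORE, 1.6, RETAIL_CLOCK_GHZ)
best_case = scale_score(ES_SCORE, 1.4, RETAIL_CLOCK_GHZ)

# Slide claim: approx. 1.8x the 1100T, which scores 5.9pts.
slide_projection = 5.9 * 1.8

print(f"Worst case: {worst_case:.2f}pts")
print(f"Best case: {best_case:.2f}pts")
print(f"Slide projection: {slide_projection:.2f}pts")
```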

Following C11.5 is the famous 3dmark06. The score is also at the same link as all of the above. The "3.2GHz" Zambezi 8C supposedly scores ~4500pts... Yeah, that's right, a score just a bit better than what a Phenom X4 @ 3GHz, launched in Jan 2009, is getting. So we supposedly have a brand new core design, packing 8 cores in total, with IPC improvements, that still somehow sucks so badly that it is 2x slower per core and per clock than the ancient Phenom II X4 (just forget the X6, it's miles ahead of this poor Zambezi).
So what do I think is the real score of Zambezi in this benchmark? It should be between 7800 and 8800pts. The point is that this is one weird test that doesn't scale nicely past 6 cores, so it's a bit trickier to figure out what a "normal" Zambezi will score here. The Donanimhaber slide indicates a 50% better score for the 3+GHz (I assume) X8 vs the 1100T, which scores 5900pts. So we have, projected by AMD: 1.5 x 5900 = 8850, so roughly 8800pts, and estimated here, by me, 7800-8800pts.
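For completeness, the 3dmark06 numbers as a tiny Python sketch. The inputs are the figures quoted in this post; the variable names are mine. It only shows how far the leaked ES score sits below the slide projection and below my estimated range.

```python
THUBAN_1100T_SCORE = 5900      # 3dmark06 score quoted for the 1100T
SLIDE_SPEEDUP = 1.5            # "50% better" per the leaked slide
ES_SCORE = 4500                # leaked score of the gimped ES
ESTIMATE_RANGE = (7800, 8800)  # my own estimate for a retail part

slide_projection = THUBAN_1100T_SCORE * SLIDE_SPEEDUP

# How many times faster than the leaked ES each figure would be.
ratio_low = ESTIMATE_RANGE[0] / ES_SCORE
ratio_high = ESTIMATE_RANGE[1] / ES_SCORE
ratio_slide = slide_projection / ES_SCORE

print(f"Slide projection: {slide_projection:.0f}pts ({ratio_slide:.2f}x the ES)")
print(f"My estimate: {ratio_low:.2f}x to {ratio_high:.2f}x the ES")
```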

So there you go, I tried my best to figure out what is going on with those gimped Bulldozer ES out in the (Chinese) wilderness. Not much time left to go, around 45 days or so. We shall soon know how wrong I was. Stay tuned for more.