Topic: CPU and GPU benchmarks (Read 71567 times)

We have managed to bring some interesting news for the community here...

Anandtech (http://www.anandtech.com/) has decided to use PhotoScan in its 2014 benchmark suite, so they will test a lot of CPUs and GPUs in a controlled environment, and all of us will see which devices are the best price/performance choices for your projects. The results will be released later this year; we expect that dual-CPU systems will be tested in Q1-Q2 as well (4P systems later??). So stay tuned for when it is released.........

This way I want to thank Ian Cutress from Anandtech: he and his team were inspired by this interesting software from the Agisoft team and are willing to use it in their benchmark suite for 2014. http://www.anandtech.com/ is the site where I get a lot of in-depth information about how hardware works, which has helped me many times with hardware problems and issues....

I might get a chance to try the Mac Pro next week. The boss just reviewed one, but I have to get him to give me remote access to test. The 1866 C13 memory is going to kill performance though. ECC JEDEC standards are pretty lax, which makes me wonder if you can put normal non-ECC memory in it for a bit of a boost (and the system will enable XMP too).

I do hope you use a project with a decent photo size. The 'official' benchmark file we use on this forum does not have large enough photos, in my opinion. Especially for the Dense Cloud stage, the results in million samples/sec deviate way too much from results on real-life projects. Because of the small photo size of the sample project, I think too much time is spent on setup and overhead, and not enough on the actual number crunching.

For example, the AMD R9 290 GPU scores around 750 million samples/sec with this sample project, but on big projects (a set of 300 images at 21 MP) the average score is around 850. And I've processed projects where the same card scores over 1000.
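To put a number on the skew described above, here is a quick sketch of the arithmetic (using only the throughput figures quoted in the post; the function name is just for illustration):

```python
# How far the small sample project underestimates real-world
# dense-cloud throughput, per the R9 290 figures quoted above.

def underestimate_pct(sample_score, real_score):
    """Percentage by which the sample-project score falls short
    of the typical real-project score."""
    return (real_score - sample_score) / real_score * 100

# Figures from the post: ~750 Msamples/s on the forum sample project,
# ~850 Msamples/s on a 300-image, 21 MP real project.
gap = underestimate_pct(750, 850)
print(f"sample project reads ~{gap:.0f}% low")
```

So the small-project score reads roughly 12% low against the typical real project, and more against the best cases the poster mentions.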

Basically we have to balance benchmark time - we can't have something that runs for a day. Wouldn't get paid otherwise! Especially as my normal benchmark routine takes 20-30 hours to get through (we are a computer component review website).

Even if you feel you cannot take the results quantitatively, take them qualitatively - it at least shows X > Y.

Yup, with the GPU benches it's a bit complicated, because depth map reconstruction depends on a few more things than just the Mpix value. It depends on scene composition, so the same Mpix value will give different Mpix/sec performance as the scene changes. That means a GIS scene and, say, a statue or building reconstruction will give somewhat different results. From my observations it is best to use HIGH or ULTRA settings for the reconstruction; then the GPU is used at 100%. When set to medium, the GPU is only used at maybe 60-70%, but that depends on CPU speed as well.

But for the other stages the settings are OK......

On my dual-Xeon setup I get the best results with all cores disabled, because of interconnect saturation as data moves over the PCI-E bus and the CPU bus. But that's a dual-CPU machine; a single-CPU setup could get slightly better results..... It's just my observation; it could be a bit different on the new E5 v2 CPUs, as they improved some CPU features....

For standard tests this is enough. To get more precise dual-CPU results a bigger dataset could/should be used. If I get a bit more FREE time, we will dig into this matter with IAN, so stay tuned and enjoy the single CPU + GPU setup results....

As mentioned in a BUG REPORT forum thread, there is one problem with an 80-160 core quad-CPU system, so if everything goes well and the problem is solved, we could run a test even on such an ULTRA system for you.....

And as IAN says, the tests are somewhat balanced: they cannot run a benchmark for too long. It's a review site, not a specialized company that does in-depth benches for a specific app for free.....

As results come in we could do a specialized benchmark for the most powerful CPU + GPU setups: say fastest CPU + fastest GPU, dual CPU, quad CPU, and multi-GPU with the fastest cards, all on one heavy scene. Then we get results for those, say, three systems and can see the efficiency of CPU + GPU scaling. Because if you add a second CPU you do not get a 100% speedup, just say 80-85%; with 4 CPUs you get about a 70% speedup per CPU added to the system......
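The diminishing returns described above can be sketched with a simple model (an illustration of the figures quoted in the post, not measured data; the function is hypothetical):

```python
# Rough model of multi-CPU scaling: each CPU added beyond the first
# contributes only a fraction of a full CPU's worth of extra throughput.

def total_speedup(n_cpus, per_added_cpu_efficiency):
    """Speedup vs. a single CPU, assuming each *added* CPU contributes
    per_added_cpu_efficiency of one CPU's throughput."""
    return 1.0 + (n_cpus - 1) * per_added_cpu_efficiency

# Figures from the post: ~80% effective for a second CPU,
# ~70% per added CPU on a quad-socket system.
dual = total_speedup(2, 0.80)   # about 1.8x one CPU
quad = total_speedup(4, 0.70)   # about 3.1x one CPU

print(f"dual CPU: {dual:.2f}x, quad CPU: {quad:.2f}x")
```

So under these assumptions a quad-socket box is only around 3.1x a single CPU, not 4x, which is exactly the scaling-efficiency question a dedicated multi-socket benchmark would pin down.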

On Agisoft's recommendation, the GPU numbers were each run with two CPU cores disabled. This means that stages 1/3/4, which do not use the GPU, might look slower than when the GPU is not used.

.....

I have two computers, each with a quad-core i7 and a GPU. PhotoScan on each is set with "2 CPUs disabled". During much of Align Photos and Build Mesh, all CPUs are running at or very close to 100% according to Windows Performance Monitor and Open Hardware Monitor. There is no difference in performance during these stages when the number of CPUs "disabled" in PhotoScan is changed.

Indeed, the number of available cores only applies to the Dense Cloud phase. I can tell for sure because I had it set to use zero cores, and it still aligns fine.

My comment about the project size was not to say your benchmark is not useful (on the contrary, I love it!). I just wanted to warn you that the results you get from a small project can be slightly skewed, because PhotoScan spends relatively little time actually crunching numbers.

There are also phases in the Mesh Generation where the process runs entirely on a single core.


I love this thread! Glad to see you here, Ian and Wishgranter - splendid initiative!

Looking forward to the upcoming tests. Maybe I'll see what I did wrong, since I will have to purchase my own workstation very soon and don't have time to wait for the test results.

Speculating that perhaps not many of my upcoming projects will need 256GB, I am tempted to go for the faster 128GB. But how much performance difference should I expect between these two options, in theory? Assuming the projects generally stay within the 128GB limit, and splitting them into chunks if bigger (does that sound reasonable?).

Here's a general graph on a memory intensive benchmark (Dirt3), FPS vs PI:

When I get around to doing a memory scaling article on a few upcoming platforms and scenarios, I will have data to back it up. I should have access to a dual-socket motherboard within the next few months, along with some high-end CPUs (8, 10, or 12 cores) to do some other testing for you, so please bear with me.