FINAL FANTASY XIV: A Realm Reborn Official Benchmark (Exploration)
Tested on: 5/10/2013 11:49:23 PM
Score: 7659
Average Framerate: 64.499
Performance: Extremely High - Easily capable of running the game on the highest settings.

My boy needed a new card, so I put in a few bucks and got myself one of these :)

FINAL FANTASY XIV: A Realm Reborn Official Benchmark (Exploration)
Tested on: 5/12/2013 11:35:23 AM
Score: 7220
Average Framerate: 60.433
Performance: Extremely High - Easily capable of running the game on the highest settings.

Waco wrote:One can hope that both new consoles being 8-way parallel will help developers finally see the light of SMP.

One can still...hope...right? :lol:

It's not as if developers these days don't want to parallelize things, it's just that some things can't really be solved or even helped by parallelization.

In the end, a simulation can only go as fast as its slowest part, and when you dig right down into it, that means single-thread performance is the most important thing for a simulation. Sure, you need to run all parts of the simulation, and massive parallelism lets you run more complex simulations, but the speed of the simulation as a whole is still limited by your single-threaded performance.
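That intuition is basically Amdahl's law; here's a rough sketch in Python (the 30% serial fraction is a made-up number, just for illustration):

```python
def amdahl_speedup(serial_fraction, cores):
    """Overall speedup when only the parallel portion scales with core count."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# If 30% of a frame's work is inherently serial (made-up number),
# even a huge number of cores can't get you past ~3.3x:
print(round(amdahl_speedup(0.3, 8), 2))     # → 2.58
print(round(amdahl_speedup(0.3, 1000), 2))  # → 3.33
```

No matter how many cores you throw at it, that serial fraction puts a hard ceiling on the whole simulation.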

Most games are relatively simple simulations, so you don't really need to parallelize the CPU work that much; the graphics are the complex part, and thanks to how strictly separated the CPU and GPU are (with respect to memory addressing and such), it's almost impossible to have the CPU and GPU work effectively on the same data and task at the same time.

This is why AMD's pushing toward HSA and hUMA, but it remains to be seen how much of those concepts these machines (PS4 and XB1) actually implement. At a very high theoretical level, it would be optimal if the CPU cores can take over when there's some computationally-intense single-thread work to be done for the graphics, and if the GPU cores could take over when there is lots of relatively simple, easily-parallelized work to be done, and I think that's their aim. We can already do this on a machine with discrete graphics and CPU hardware, but it requires some very intelligent programming with foresight into what hardware you're using and exactly what kind of loads you're applying; I suspect AMD's goal is to automate all that.

In the end, I personally wonder whether the compute throughput of the next-gen consoles' CPUs and GPUs, even with the benefit of HSA/hUMA and lightweight console-game OSes (though rumor has it the XB1's OS takes up 3GB of memory?!), will be enough to make next-gen console games significantly more impressive in any way than current-gen games. The machines are so weak...

I've done a lot of physics engine programming as a hobby, and I just have to say: there's no excuse not to take advantage of multiple cores. At the very least you can thread your more complex physics code so that no single thread has to handle so many operations.
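For what it's worth, here's a toy sketch of that idea in Python: split the body list into chunks and integrate each chunk on its own worker. (A real engine would do this with native threads in C or C++; Python's GIL means this particular version won't actually scale, but the partitioning logic is the same.)

```python
from concurrent.futures import ThreadPoolExecutor

def integrate(bodies, dt):
    """Naive Euler step for one chunk of (position, velocity) bodies."""
    return [(p + v * dt, v) for p, v in bodies]

def integrate_parallel(bodies, dt, workers=4):
    """Split the body list into one chunk per worker and step the chunks
    concurrently. Safe because each body's update is independent."""
    chunk = max(1, len(bodies) // workers)
    parts = [bodies[i:i + chunk] for i in range(0, len(bodies), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda part: integrate(part, dt), parts)
    return [b for part in results for b in part]

bodies = [(float(i), 1.0) for i in range(8)]
print(integrate_parallel(bodies, dt=0.5)[:2])  # → [(0.5, 1.0), (1.5, 1.0)]
```

The point isn't the physics (it's deliberately trivial here), it's that the per-body work divides cleanly, so no one thread is stuck doing all the operations.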

auxy wrote:It's not as if developers these days don't want to parallelize things, it's just that some things can't really be solved or even helped by parallelization.

I'm a parallel software dev. Some things can't be solved or helped, no, but most game engine tasks are pretty easy to parallelize. Most developers (publishers and bosses really) don't care about getting the engine to run faster as long as it's "fast enough" though.

Waco wrote:I'm a parallel software dev. :P Some things can't be solved or helped, no, but most game engine tasks are pretty easy to parallelize. Most developers (publishers and bosses really) don't care about getting the engine to run faster as long as it's "fast enough" though.

"Most" game engine tasks outside of graphics and physics don't need to be parallelized, which was my point. (I guess you could also include pathfinding in that.) Sure, you CAN, but from what I have come to understand, it's not really helpful and generally more trouble than it ends up being worth.

My point was twofold. One, I don't think that expanding CPU parallelization is really going to make that much of a difference in the overall quality or capability of games moving forward. I know Cryptic, the developer of Champions Online, Star Trek Online, and the recent Neverwinter (it's an online game, though "Online" isn't actually part of the title), has said they only run one game thread and one graphics thread, because running more doesn't make their games run better. I don't know how true that is (it seems like you could run at least a couple of graphics threads), but I do think it echoes the sentiments of a lot of developers. The Dolphin emulator team says their emulator runs 3 threads at most (4 if you run low-level sound emulation in its own thread, but that's foolish) because there's just no use for more.

Leading on from that, the second point: I don't think the "next-gen" consoles are going to be all that impressive. I think they're going to peak (in terms of shiny graphics) really early thanks to the developer-friendly x86/GCN hardware (and the overall lack of utility of eight slow cores for gaming), and then people will lose interest in actually gaming on them. This is one place where I think Microsoft actually has the advantage; in the end the XB1 might possibly be the more financially successful console simply because Microsoft is hedging its bets, counting on the hardcore console-gaming crowd losing interest in gaming and using the machine for other things.

Anyway, this is all off-topic for the thread, and it's possible I'm completely wrong about all of this -- I'm running on logic derived from hearsay and speculation, after all, since I'm neither a game developer nor a parallel software developer.

auxy wrote:"Most" game engine tasks outside of graphics and physics don't need to be parallelized, which was my point.

If you have a fast CPU, sure, that's true.

If you have 8 low-power CPU cores you kinda do.

To echo this, I have programmed a lot for WarCraft 3. Its engine is limited to one running CPU core (no parallel threads), and each thread has an operation limit of 30K. In-order execution at its worst. Hitting that limit crashes the thread with no way to catch the error, and a lot of coders have scratched their heads over what happened to half their engine mechanics. If there were a way to break the War3 engine out and use more cores, the performance wouldn't be just so plain terrible. The fact that developers are getting paid to program and still choose not to break up their code is a slap in the face - I would have paid good money for an improved engine on that game.

Star Brood wrote:To echo this, I have programmed a lot for WarCraft 3. Its engine is limited to 1 running CPU core (no parallel threads) and each thread has an operation limit of 30K.

That's scripting, not programming; you're just writing a script for an ancient game engine to process. The operation limit is just a limitation of the engine, not of single-core processing; this has nothing to do with parallelization, really.

Besides, keep in mind that when War3 came out, there was no such thing as a dual-core processor. Nobody had SMP, so why would they have parallelized it?

Star Brood wrote:The fact that developers are getting paid to program and they choose not to break up their code is a slap to the face - I would have given good money to have an improved engine on that game.

They're getting paid to make things work.

If a single process/thread is fast enough for a given task on most machines they don't have a whole lot of reason to make it SMP-aware.

I mentioned the op limit on a thread because it forced us to branch our code out into multiple threads. A projectile system, for example, would crash the thread somewhere between 20 and 100 projectiles, depending on how much physics was applied to them. I was using the op limit as an example of when we've needed to break things into multiple threads, and as a testament to the opportunity for multiple cores to come into play. There's no reason something like a projectile system's operations couldn't be split over multiple cores.
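As a toy illustration (not War3 script, obviously - just a Python sketch with made-up physics constants), each projectile's update touches only its own state, so a pool of workers can chew through them in chunks on separate cores:

```python
from multiprocessing import Pool

def step_projectile(proj):
    """Advance one projectile by one ~60fps tick: gravity, then position.
    Each projectile's state is independent, so any core can take any chunk."""
    x, y, vx, vy = proj
    vy -= 9.8 * 0.016                    # gravity over one 16 ms tick
    return (x + vx * 0.016, y + vy * 0.016, vx, vy)

def step_all(projectiles, workers=4):
    """Fan the per-projectile updates out over a process pool."""
    with Pool(workers) as pool:
        # chunksize batches projectiles so each core gets sizable pieces
        return pool.map(step_projectile, projectiles, chunksize=64)

if __name__ == "__main__":
    shots = [(0.0, 10.0, 30.0, 5.0)] * 200
    print(step_all(shots)[0])
```

Using processes rather than threads sidesteps Python's GIL; in a compiled engine you'd just hand the chunks to worker threads directly.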

There are plenty of areas where this would improve performance in games - you mentioned pathing algorithms earlier. Pathing calculations apply to almost every game, but they almost always run on a single core despite the massive number of calculations needed to predict where a unit might go. Some developers may claim that parallelizing doesn't improve performance, but they must not be running the right simulations if they really think that. A single core of the CPU in their "recommended system specs" should never be hitting 100% unless the remaining cores are already running a substantial number of subroutines.
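Path queries in particular are embarrassingly parallel when the map is read-only during a tick: each unit's search can run on a different core. A minimal Python sketch (toy grid and plain BFS, just to show the shape of it):

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

# Toy map: '.' is walkable, '#' is a wall.
GRID = [
    "....#...",
    "..#.#.#.",
    "..#...#.",
    "....#...",
]

def shortest_path_len(start, goal):
    """BFS over the shared, read-only grid; returns step count or -1."""
    rows, cols = len(GRID), len(GRID[0])
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and GRID[nr][nc] != '#' and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return -1

# Each unit's query only READS the map, so queries can run on separate cores.
units = [((0, 0), (3, 7)), ((3, 0), (0, 7)), ((2, 3), (0, 0))]
with ThreadPoolExecutor(max_workers=4) as pool:
    lengths = list(pool.map(lambda q: shortest_path_len(*q), units))
print(lengths)  # → [10, 10, 5]
```

The searches never write to shared state, so there's no locking to worry about; you only have to serialize the moment where the chosen paths get applied back to the units.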

And Microsoft now says the Xbox One should be "fully capable" of gaming at 4K...

On a side note, the GPU core in the X1/PS4 does not seem to be the same one in Kabini, and since AMD claimed a while back that they'd separated their CPU and GPU core engineering, they might indeed not be (at the very least the clock speeds are different; I don't know about the rest of it). Kabini is also running with plain DDR3, where the X1 has the eSRAM and the PS4 has GDDR5. I don't know if any other aspects of the consoles differ from Kabini, but I wouldn't be surprised.

Savyg wrote:On a side note, the GPU core in the X1/PS4 does not seem to be the same one in Kabini, and since AMD claimed a while back that they'd separated their CPU and GPU core engineering, they might indeed not be (at the very least the clock speeds are different; I don't know about the rest of it). Kabini is also running with plain DDR3, where the X1 has the eSRAM and the PS4 has GDDR5. I don't know if any other aspects of the consoles differ from Kabini, but I wouldn't be surprised.

Well, if you look at the benchmarks codedivine did, they were comparing CPU performance (thus "Jaguar vs. Stars", not "Kabini vs. Llano") specifically and that was my concern (since the topic of the thread has been the CPU-heavy nature of this benchmark.)

auxy wrote:Well, if you look at the benchmarks codedivine did, they were comparing CPU performance (thus "Jaguar vs. Stars", not "Kabini vs. Llano") specifically and that was my concern (since the topic of the thread has been the CPU-heavy nature of this benchmark.)

I don't think the consoles are as light as they look on CPU, considering they're custom processors and AAA titles aren't likely to be optimized for dual cores or smaller amounts of RAM. At least one of his benchmarks is optimized for dual core from what I can tell.