VideoReDo does a great job of transcoding to .m2ts from the file formats I use, but the downer for me is that they don't plan to support hardware encoding acceleration. It's all CPU based.

Software like VRD that runs on my eight cores still takes a while (almost two hours to transcode a 90+ minute feature) and all the while my system fans are blasting. Rejig with the default settings uses the GPU but screws up the output file so that it's all jerky or has major video sync issues.

So I'm looking for recommendations (including payware) that do a good, fast job of transcoding. The problem I've seen with most paid options is that they tend to be both expensive and bloated.

You'll find that all the GPU based encoders deliver lower quality and are slower than x264 at the veryfast preset. If you don't want to use the CLI encoder, use one of the graphical front ends like HandBrake or VidCoder.

If your CPU is an Intel with Quick Sync, the QS encoder is about 2x faster than x264, but it still delivers lower quality.

You'll find that all the GPU based encoders deliver lower quality and are slower than x264 at the veryfast preset.

you're still spreading this FUD? only now you've made it even more egregious: it used to be x264 + ultrafast, now you're claiming that x264 + veryfast is faster.

can we be honest with ourselves for a second? you haven't tested all the gpu based encoders, have you? you have tested the basic CUDA based encoder that apps like Badaboom and MediaCoder offer, but i'm willing to bet you haven't tested all of the following:

AMD's APP encoder, the Quick Sync version featured on Haswell cpus, the OpenCL (OCL) and CUDA powered MainConcept encoders, the Sony OCL and CUDA powered encoder, Elemental's high end CUDA encoder, the new hardware encoder that was revised for Maxwell based gpus, or even the OCL capabilities built into the latest builds of x264.

if you haven't tested each and every single one of these for yourself then you shouldn't give people inaccurate information.

and realistically, if the rumors about the upcoming GTX 880 are true (the rumors flying around say it will feature a 3 core Denver based ARM processor built into the gpu), pure software based encoding will soon go the way of the dodo, assuming cpus themselves don't eventually become little more than basic processors that just shovel data to add-in cards that do all the actual processing.

You'll find that all the GPU based encoders deliver lower quality and are slower than x264 at the veryfast preset.

you're still spreading this FUD? only now you've made it even more egregious: it used to be x264 + ultrafast, now you're claiming that x264 + veryfast is faster.

can we be honest with ourselves for a second? you haven't tested all the gpu based encoders, have you? you have tested the basic CUDA based encoder that apps like Badaboom and MediaCoder offer, but i'm willing to bet you haven't tested all of the following:

AMD's APP encoder, the Quick Sync version featured on Haswell cpus, the OpenCL (OCL) and CUDA powered MainConcept encoders, the Sony OCL and CUDA powered encoder, Elemental's high end CUDA encoder, the new hardware encoder that was revised for Maxwell based gpus, or even the OCL capabilities built into the latest builds of x264.

if you haven't tested each and every single one of these for yourself then you shouldn't give people inaccurate information.

Have you tested each and every one of those encoding methods yourself? Can you post the results?
The thread title is "Affordable transcode options with OpenCL / GPU support". How many of the encoders you've mentioned fit the thread topic?
The number one rule when accusing someone of providing inaccurate information would be to provide evidence to the contrary. There's nothing in your post to that effect.

Originally Posted by deadrats

and realistically, if the rumors about the upcoming GTX 880 are true (the rumors flying around say it will feature a 3 core Denver based ARM processor built into the gpu), pure software based encoding will soon go the way of the dodo, assuming cpus themselves don't eventually become little more than basic processors that just shovel data to add-in cards that do all the actual processing.

Realistically, what's that got to do with x264's encoding quality?
I can't recall the last time I read an article on h264 encoding where the x264 encoder wasn't used as the quality benchmark. I'm yet to read an article where anybody's claimed a hardware encoder matches x264 for quality. Can you provide a link?

The only option that I know of now is the CUDA encoder inside of the Hybrid program or the external encoder feature in Virtualdub, using the same CUDA encoder that is in the Hybrid program.

I've heard that future Nvidia graphics cards will do better than what is possible now. In my experimenting, the current CUDA encoder was no faster than x264 superfast, with quality at best as good and maybe worse, which isn't saying a lot since x264 superfast is not great quality. The other downfall of CUDA in my experimenting was that the file size was almost twice as big as the x264 file, and larger than the original file that I was trying to compress. So my conclusion was that CUDA H264 did a much worse job of compressing a file at the same quality and speed as x264 superfast.

I would recommend using the external encoder feature to encode with x264, via either the Hybrid program or HandBrake, which is the easiest to use. Selur can help with the Hybrid program. There is a thread here just for using his program, which can drive most CLI encoders, and they come bundled with it.

The VirtualDub forum has instructions on how to use the external encoder feature. VirtualDub doesn't come with its own encoders, so you'll need to track those down (the forum also lists where to download these files), or you could cheat and just download the Hybrid program and point VirtualDub at the CLI encoders in it.

All of these programs are just frontends for the x264.exe encoder, which can be run from a command prompt, or for ffmpeg, which has libx264 support built in.
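For anyone curious what those frontends actually run behind the scenes, here's a rough sketch of the equivalent command lines for the standalone x264 encoder and for ffmpeg's libx264. The file names and the CRF value are placeholders, not settings anyone in this thread used.

```python
# Rough sketch of the command lines the GUI frontends generate for
# the standalone x264 encoder and for ffmpeg's libx264. File names
# and the CRF value are placeholders.
def x264_cmd(infile, outfile, preset="veryfast", crf=18):
    return ["x264", "--preset", preset, "--crf", str(crf),
            "--output", outfile, infile]

def ffmpeg_cmd(infile, outfile, preset="veryfast", crf=18):
    return ["ffmpeg", "-i", infile, "-c:v", "libx264",
            "-preset", preset, "-crf", str(crf), outfile]

print(" ".join(x264_cmd("input.avs", "out.264")))
print(" ".join(ffmpeg_cmd("input.mkv", "out.mp4")))
```

Either route produces the same video stream; the frontends just fill in these switches for you.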

Have you tested each and every one of those encoding methods yourself? Can you post the results?
The thread title is "Affordable transcode options with OpenCL / GPU support". How many of the encoders you've mentioned fit the thread topic?
The number one rule when accusing someone of providing inaccurate information would be to provide evidence to the contrary. There's nothing in your post to that effect.

let's see, the "software" cuda based encoder found in apps like MediaCoder is free; yes, i have tested it.

the hardware h264 encoding chip found in maxwell and kepler class cards is free, as in built into the gpu. no, i haven't tested it; the only app that supports it is MediaCoder, again free.

MainConcept's cuda based encoder varies in cost depending on what apps integrate the sdk. i have tested it, and the price varies from as little as $50 for some apps to as much as $1500 for TotalCode. i have not tested the OCL version by MainConcept; pricing is similar.

i have tested sony's cuda powered encoder, cost is as little as $50, have not tested the OCL version.

AMD's APP encoder is built into video cards that use the GCN architecture. it's supported by a few apps, notably A's Video Converter, cost free (it also supports MS' h264 encoder).

i have tested intel's QS encoder, only the h264 portion, and only the IB version, cost is built into the processor.

I'm yet to read an article where anybody's claimed a hardware encoder matches x264 for quality. Can you provide a link?

if you have the cash, Elemental makes a gpu powered encoder that is world class. they are the guys behind Badaboom; they used the general public as guinea pigs to develop their cuda powered encoder, then drove interest by partnering with Adobe for their "Mercury" engine, and then they went after the professional market. these guys already have a gpu powered hevc encoder that can do real time 4k encoding, and there is an admin over at the doom9 forums that has used it personally and swears that it matches x264 quality but flat out smokes it in throughput.

gpu's are awesome for video encoding, it's just that the people that write software for casual users are lazy, good for nothing, jackasses who are barely passably competent as programmers.

they suck balls and they know it.

edit: i found one of the threads over at doom9 where the poster known as Blu Misfit talks about Elemental's stuff, and this guy is known as an x264 cheerleader, so for him to say this says something about Elemental's product.

In my testing QS on a Haswell is slightly better than QS on a Sandy Bridge but still worse than x264 at veryfast.

which haswell do you now have? did you test it with lookahead cranked up to the max, at the highest quality setting? plus we need to keep in mind that the QS found on haswell supports a bunch of advanced h264 features, like b-pyramid, that unfortunately no software exposes; basically the hardware is capable of much better quality, but no one is writing code that fully uses it.

this is one area where i feel companies like Nvidia, AMD and Intel have really dropped the ball: they spend millions developing these technologies (i read one report that claimed Intel spent 5 years and 100 million bucks developing QS) and then they wait for some 3rd party developer to write software that uses it.

they have so many engineers on staff, they should write basic software in house that fully exploits all the hardware's capabilities and then release that code as open source so that adoption is rapid.

if you have the cash, Elemental makes a gpu powered encoder that is world class.......

The "if you have the cash" qualification would seem to eliminate it from coming under the topic of this thread. Is that link to a review or is it just a paid advertisement? No mention of the x264 encoder in the article anywhere. No mention of the names of any software used for comparison encodes. No real comparison encodes, for that matter.
Desktop 1, 2, & 3? Seriously??

"In terms of performance, the only close-to-equivalent test bed that I could configure was a 3.33 GHz 12-core workstation running a fast software enterprise encoder program that shall remain nameless."

The majority of the article covers topics like de-interlacing and closed caption encoding which have nothing to do with the actual encoding quality.

Originally Posted by deadrats

gpu's are awesome for video encoding, it's just that the people that write software for casual users are lazy, good for nothing, jackasses who are barely passably competent as programmers.

they suck balls and they know it.

Are you supporting jagabo's claim now?

Originally Posted by deadrats

edit: i found one of the threads over at doom9 where the poster known as Blu Misfit talks about Elemental's stuff and this guy is known as a x264 cheerleader so for him to say this it says something about Elemental's product.

The "if you have the cash" qualification would seem to eliminate it from coming under the topic of this thread.

you conveniently ignored the evolution of the discussion: a question was asked, and jagabo made an overly broad statement, which i challenged. another poster then made a statement that he didn't know of any gpu powered encoder that could match x264, and i offered him a link to a solution that could match it and was significantly faster.

Are you supporting jagabo's claim now?

jagabo's position is, and always has been, that the very basic underlying technology behind CUDA, and gpgpu in general, is somehow poorly suited for video encoding.

my contention has always been that the relatively poor showing of gpu powered encoders on the desktop has zip to do with the underlying technology and more to do with other factors such as developers with ulterior motives, developers with inadequate coding skills and other special interest segments that have a vested interest in keeping software based encoders at the forefront.

So is there anything in the "affordable GPU encoding" area which competes with x264 for quality while encoding at a higher speed?

the answer to that is not that cut and dried.

if you are talking about bit rate starved encodes done at stupid low bit rates, like say 1500 kbps for 1080p resolution, then no, there is no gpu powered encoder available to the general public that is capable of beating any software based encoder, primarily because software based encoders blur images significantly more than gpu powered encoders, thus masking artifacts.

in fact, much of x264's vaunted quality comes from its deblocking filter, which can be cranked real high, and its so-called psy-rd optimizations also serve to hide artifacts by blurring the image in some parts while sharpening it in others. the deblocking filter in particular acts like a customizable blurring filter, and at low bit rates this software encoder is unmatched, if only because it kills details to the point where artifacts can no longer be seen.

if the quality test is at sane bit rates, i.e. blu-ray level bit rates for 1080p, then yes, gpu powered encoders can match software encoders. in fact one of the main developers, DS, has repeatedly complained about "unfair comparisons in which the tester uses too much bit rate and then concludes that there's no difference between encoder quality", and when pressed he admitted to me on a number of occasions over at doom9 that "of course if you use enough bit rate all encoders will look good, even mpeg-2".

as far as speed and cost are concerned, that's a bit of a head fake. yes, x264 is free, and many apps that support it are free, but the hardware required to run it isn't. saying that i7 + x264 + veryfast is faster than a cuda encoder conveniently ignores the fact that one has to spend a considerable amount to buy an i7, the motherboard and the ram, and anyone who wants faster encoding performance usually has to upgrade all 3 (maybe not the ram) and then re-install the OS and apps.

with a gpu powered encoder, like the NVENC block built into Kepler and Maxwell class Nvidia cards, or even a CUDA based encoder, you have the handful of free apps. but let's say you have to spend $100 on a piece of software like TMPGEnc or Sony's apps to get good gpu encoding, added to the cost of a fast card; say you buy a good mid range card in the $250 range. that's still cheaper than going the x264 route, and it will be much cheaper and easier to upgrade.

in fact, AMD's cards offer a much better value, the GCN architecture is tailor made for gpu compute and all benchmarks have it flying when used with the OCL encoders found in sony's apps and you can get a good card for about $100.

don't dismiss gpu encoding just because some guys seem to have a bit rate starvation fetish where they seem to love seeing how badly they can mess up their video by dropping the bit rate before it becomes unwatchable.

with a gpu powered encoder, like the NVENC block built into Kepler and Maxwell class Nvidia cards, or even a CUDA based encoder, you have the handful of free apps. but let's say you have to spend $100 on a piece of software like TMPGEnc or Sony's apps to get good gpu encoding, added to the cost of a fast card; say you buy a good mid range card in the $250 range. that's still cheaper than going the x264 route, and it will be much cheaper and easier to upgrade.

I'd like to find something that does an OK job of using the hardware and also lets me give it the desired output file size, with a two-pass approach taking best advantage of variable bit rate. AVCWare offers hardware encoding, but doesn't let me fine tune the output file size. In its easy pulldown convert-to list I didn't even see .m2ts as one of the output options. Probably one of the options gives you that output but the list doesn't tell you what the file extension is going to be.

I tried the Sony Movie Studio 13 Platinum trial and it has a pretty interface but I don't know that it does any better. I can afford to pay something over $100, but my experience is that prettier interfaces give you less quality programming effort in the long run ...

i neglected to mention that x264 DOES have some rudimentary gpu acceleration via OpenCL, in which only the lookahead is offloaded to the gpu.

test results have been a mixed bag. on my system (X6 1045t, 8 gigs ddr3, 9600GSO), enabling OCL results in a slightly slower encode, by about 3-5 fps, when the encode using ultrafast is in the 130 fps range. with slower presets there is no statistical difference in encode speed.

one of the main x264 developers has said that lookahead performance can increase by about 40% on the latest AMD apu's and by a factor of 2 on the latest AMD discrete video cards, my tests were done with a relatively old and slow card that only uses ddr2 and a narrow memory width and most of the benchmarks i have seen were also done by people with older cards where memory bandwidth was an issue.

there have been some reports that enabling OCL in x264 results in slightly lower quality, but in my tests i could see no difference, and in all honesty, with better hardware, if you could use higher lookahead settings with OCL than without, that should offset any quality differences.
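for reference, the lookahead offload described above is a single switch on the x264 command line; this sketch (file names are placeholders) shows the only difference between the two runs:

```python
# Minimal sketch: the only difference between an OpenCL-assisted x264
# run and a plain one is the --opencl switch, which offloads just the
# lookahead stage (x264 falls back to the CPU if no usable OpenCL
# device is found). File names are placeholders.
def x264_cmd(infile, outfile, use_opencl=False):
    cmd = ["x264", "--preset", "medium", "--crf", "18"]
    if use_opencl:
        cmd.append("--opencl")
    cmd += ["--output", outfile, infile]
    return cmd

cpu_only = x264_cmd("clip.avs", "cpu.264")
with_ocl = x264_cmd("clip.avs", "ocl.264", use_opencl=True)
print(" ".join(with_ocl))
```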

Anyone here know of anybody that's tried VSO Video Converter? I have VSO ConvertXtoDVD and like it well enough for what it does. I think the other product supports hardware encoding and sells for under $40 ...

you're confusing CUDA with OCL, and the 2 available CUDA encoders with the OCL patch for x264. and no, it's not only available in the paid premium version.

on the video tab choose x264 from the drop down menu, go to advanced and you will see "enable OCL if possible".

i just tested it on my system (X6 1045t, 8 gigs ddr3, 9600GSO). the source was a vc-1 12 mb/s 1080p file and the target was 12 mb/s encoded with x264 using the medium preset, 1 pass, no resizing, no de-interlacing, no sharpening, and all other settings left at default. without OCL it took 535.5 seconds to finish the test encode (the source was 4 min 11 sec long); with OCL the encode was done in 589.3 seconds, and according to GPU-Z it barely loaded up the gpu and onboard ram. quality-wise i see no difference between the two encodes.
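putting those timings in perspective (the durations are the ones reported above; the arithmetic is just a sanity check):

```python
# Sanity-check arithmetic on the timings reported above: a 4 min 11 s
# source, 535.5 s to encode without OpenCL lookahead, 589.3 s with it.
clip_seconds = 4 * 60 + 11  # 251 s of source video

def realtime_factor(encode_seconds):
    # seconds of video encoded per wall-clock second
    return clip_seconds / encode_seconds

no_ocl = realtime_factor(535.5)     # about 0.47x realtime
with_ocl = realtime_factor(589.3)   # about 0.43x realtime
slowdown = (589.3 - 535.5) / 535.5  # the OpenCL run was ~10% slower
print(f"{no_ocl:.2f}x vs {with_ocl:.2f}x realtime, {slowdown:.0%} slower")
```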

keep in mind that the X6 was released sometime in the second quarter of 2010 and the 9600gso was released in late 2008 and it was just a rebadged 8800gs which in turn was released in 2007.

i would love to see someone with a high end hexacore intel cpu do a similar test, using OCL both with the built in intel gpu and maybe a discrete high end video card like an r9 290 or a 780ti.

I know their web site claims it:
"H.264 encoding GPU acceleration (Intel QuickSync, nVidia CUDA, OpenCL)"
But even when I go into the advanced interface and try to check "forced OpenCL if available" (something like that) it doesn't seem to matter. I'll try reinstalling it and see if I get lucky.

How wrong am I about this? I don't see the arguments in there for OpenCL support ... and from the ffmpeg help ...
"When FFmpeg is configured with --enable-opencl, it is possible to set the options for the global OpenCL context.
The list of supported options follows:
‘build_options’
Set build options used to compile the registered kernels.
‘device_idx’
Select the index of the device used to run OpenCL code.
The specified index must be one of the indexes in the device list which can be obtained with ffmpeg -opencl_bench or av_opencl_get_device_list()."

it's not ffmpeg that supports OCL, it's x264; i know for a fact that the x264 build contained in MediaCoder is built with OCL support. did you try with the 32-bit MediaCoder build or the 64-bit (i used the 64-bit)?

it's not ffmpeg that supports OCL, it's x264; i know for a fact that the x264 build contained in MediaCoder is built with OCL support. did you try with the 32-bit MediaCoder build or the 64-bit (i used the 64-bit)?

Thanks for the links. I'll check them out. If you visit the pages I listed, I think you could understand how I could be confused.

now i understand what you're doing wrong: MediaCoder is set up a bit confusingly. that "gpu" checkbox is meant to enable the 2 CUDA encoders; when that checkbox is checked, the encoder is changed to the CUDA encoder or NVENC, both of which are nvidia only encoders.

you don't need to check the "gpu" box in order to use x264's OCL capabilities.

with regards to "improving quality practically", what the MediaCoder author has done is enable a high quality denoise filter by default if a hardware encoder is chosen, such as intel's quick sync or nvidia's cuda.

OpenCL is not an ATI/AMD only gpu computing framework, even though it's come to be synonymous with them. Nvidia released their framework for general purpose computing on their gpus and called it CUDA; OpenCL (OCL) is an open standard alternative, originally developed by Apple and now maintained by the Khronos Group, that runs on all gpus, including Intel's, AMD's and Nvidia's.

the reality is more complex than that, as general purpose computing on a gpu can be achieved via DX9 class HLSL (high level shader language), DirectCompute (DX10/DX11 class), OCL, CUDA (which is essentially C for Nvidia gpus), FORTRAN (Nvidia has a FORTRAN compiler for their gpus), and there are probably one or two more that i'm not familiar with (i heard rumors a while ago of a Java compiler designed for gpgpu).

this is why it annoys me to no end when people make broad statements like gpu's are no good for video encoding or some similar claim because it completely ignores numerous intertwined variables that make up the gpgpu compute landscape.

with regards to you, do as i said before for enabling OCL in media coder but ignore the "gpu" checkbox (leave it unchecked).

with regards to you, do as i said before for enabling OCL in media coder but ignore the "gpu" checkbox (leave it unchecked).

Thanks for the explanation. A's Video Converter is processing a file now, and I can see that the GPU is engaged on the AMD System Monitor tool. It's basically running the GPU at 20-22 percent, and the CPU cores are nowhere near as taxed as previously. Only one core is really spiking, and windows task manager is saying that the program is pulling well under 20% overall. It also seems to be going considerably faster than the software-only programs. I'll let you know about the result later.

you conveniently ignored the evolution of the discussion: a question was asked, and jagabo made an overly broad statement, which i challenged. another poster then made a statement that he didn't know of any gpu powered encoder that could match x264, and i offered him a link to a solution that could match it and was significantly faster.

You're rewriting history even though it's all there in black and white. I asked a question. You quoted me and replied. I replied again. Nothing to do with anything anyone else posted.

jagabo's position is, and always has been, that the very basic underlying technology behind CUDA, and gpgpu in general, is somehow poorly suited for video encoding.

I can't say I've ever taken it that way. He's said it's not up to x264's quality, but the "underlying technology" assumption seems to be yours.

Well, despite a previous vow to never install MediaCoder on a PC of mine again, I did just that. You'll have to provide me with the magic CUDA encoder settings, because I couldn't find them. "Keyframe pumping"..... I've never been sure I've witnessed it before, given I generally use a fairly low CRF value, but now I'm sure I'm quite familiar with the effect.

Anyway......... I ran an x264 CRF18 encode (default settings) on an AVI I had handy. This old E6750 CPU managed about 45fps. Using the default MediaCoder/CUDA settings I was encoding at over 100fps, resulting in an encode of just over half the bitrate and about a tenth of the quality (if it can be measured that way). I couldn't seem to coax anything better out of a CUDA encode while selecting the quality based encoding method, and I couldn't fix the problem of a few rows of pixels worth of crud down one side. The source video was mod16 and I didn't resize.

Maybe it's just this antiquated PC. An 8600GT is the newest video card I have. I'm not a gamer so I don't upgrade video cards often. The same applies to the video card drivers, if that's likely to make a difference. The drivers I'm using would be over a year old. I'm running XP.
The difference though, is despite all that, the quality of the x264 encode was fine.

Seeing as I couldn't find a way to run a quality based CUDA encode which gave me an acceptable quality output... it was the same (ie awful) regardless of the quality value I chose.... I tried an average bitrate encode instead using the bitrate the CRF18 x264 encode gave me. The result was much better, although encoding speed slowed to around 80fps. Still nearly twice the speed of x264 on this PC though. It reduced the keyframe pumping significantly, but didn't eliminate it. Not that it matters. I prefer quality based encoding. If another encoder doesn't offer an equivalent of x264's CRF encoding I can't see myself using it, even if it is much faster. MediaCoder/CUDA doesn't even seem to allow 2 pass encoding.
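For anyone following along, the two approaches contrasted above look roughly like this as ffmpeg/libx264 invocations (file names and the example bitrate are placeholders, not figures from this thread):

```python
# Sketch of quality-based (CRF) encoding versus 2-pass average-bitrate
# encoding with ffmpeg/libx264. File names and the example bitrate are
# placeholders.
def crf_encode(infile, outfile, crf=18):
    # one pass, quality target; final bitrate depends on the content
    return [["ffmpeg", "-i", infile, "-c:v", "libx264",
             "-crf", str(crf), outfile]]

def two_pass_encode(infile, outfile, bitrate="3000k"):
    # two passes, size/bitrate target; quality depends on the content
    base = ["ffmpeg", "-i", infile, "-c:v", "libx264", "-b:v", bitrate]
    return [base + ["-pass", "1", "-an", "-f", "null", "/dev/null"],
            base + ["-pass", "2", outfile]]
```

CRF gives you consistent quality with an unpredictable file size; two-pass gives you a predictable size with whatever quality the bitrate allows, which is why an encoder with no CRF equivalent is a hard sell for quality-focused use.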

jagabo's position is, and always has been, that the very basic underlying technology behind CUDA, and gpgpu in general, is somehow poorly suited for video encoding.

Nonsense. Obviously some parts are well suited to GPUs. I can't say for sure whether the differences between x264 and the free GPU encoders come down to those parts that aren't well suited to GPUs (and hence not implemented there). The x264 developers claim that's the case.

could you upload the test avi you used? i'd like to try it out myself. also, what bit rate did x264 crf 18 give you, and what preset did you use?

and yes, the drivers do make a difference, it's been my experience that some driver revisions seem to kill quality, usually when Nvidia is trying to improve overall performance (the latest beta drivers improved performance considerably but definitely had a quality effect with some encoders).

software encoders like x264, divx, xvid and MainConcept's h264 encoder can never be fully ported to CUDA, and it has nothing to do with the h264 spec, the difficulty of threading certain parts, or any of the other claims the x264 developers have made over the years. the simple fact is that h264 encoders like x264 and mc h264 were first built from the ground up as far back as 2002 (maybe even before that), and CUDA wasn't introduced until 2007. these developers had sunk 6 years into x86 compatible code, with assembler and compiler intrinsics; there's no way to now port that C/C++/assembler code to run on a completely different architecture, using different coding conventions and memory constructs, in any remotely efficient way.

this is why even the main concept CUDA and OCL powered encoders only feature a subset of all the features that their full software based encoder does.

now, if these developers had the desire, i have no doubt that they could build h264, mpeg-2 and mpeg-4 asp encoders from the ground up that ran entirely on a gpu, and fast at that; hell, the gpeg-2 people did.

the sad thing is we're seeing the same crap being repeated with hevc, a standard built from the ground up with an eye towards gpu acceleration and easy parallelism. you know the threading model i suggested that encoders like x264 use in order to benefit from gpus, namely segmenting the video on gop boundaries, the one you said would lead to cache thrashing? you might find it interesting to note that x265 uses gop level parallelism for its threading model.

you know the threading model i suggested that encoders like x264 use in order to benefit from gpus, namely segmenting the video on gop boundaries, the one you said would lead to cache thrashing? you might find it interesting to note that x265 uses gop level parallelism for its threading model.

And eventually gop level parallelism will start cache/memory thrashing. The question is, "at what point?" That will vary of course, depending on the video, the hardware.

And eventually gop level parallelism will start cache/memory thrashing. The question is, "at what point?" That will vary of course, depending on the video, the hardware.

it doesn't thrash with 6gb 4k source files being transcoded to 1080p (i've run numerous tests using x265 + Tears of Steel). i've never quite understood why you believed it would behave this way: desktop intel cpus have about 25 gigabytes/sec of memory bandwidth between the cpu and system ram, video cards easily double, triple, quadruple and go beyond that (mid range nvidia and amd cards are at the 100 gigabytes/sec mark), and blu-ray spec video is roughly a 25 mb/s data rate.
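to put rough numbers on that (the 25 GB/s figure is the one quoted above; note that a gop-parallel encoder touches decoded frames, so raw 1080p bandwidth is the fairer yardstick than the 25 mb/s compressed rate):

```python
# Back-of-the-envelope check on the bandwidth argument above.
# 25 GB/s CPU memory bandwidth is the figure quoted in the post;
# a GOP-parallel encoder works on decoded frames, so the relevant
# per-stream rate is raw 4:2:0 1080p, not the compressed stream rate.
MEM_BW = 25e9                       # bytes/s between cpu and ram
raw_1080p = 1920 * 1080 * 1.5 * 24  # bytes/s, 8-bit 4:2:0 at 24 fps

streams = MEM_BW / raw_1080p
print(f"raw 1080p: {raw_1080p / 1e6:.0f} MB/s per stream")
print(f"about {streams:.0f} raw streams fit in 25 GB/s")
```

even counting multiple reference frames and intermediate buffers per thread, the raw numbers leave a lot of headroom before bandwidth alone becomes the bottleneck; cache behavior is a separate question.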

then we have this question: intel uses inclusive cache hierarchies, meaning the data in the L1 is mirrored in the L2 and so on, and AMD uses exclusive cache hierarchies. do you believe that one is more susceptible than the other? and what about intel's new L4, the 128mb eDRAM found on some haswells that will eventually be included in all intel cpus; won't that help eliminate thrashing?

lastly, we have MediaCoder, which already features segmented video encoding; i have yet to see any thrashing taking place in my tests.

in short, i know you're an old school programmer, but i think that maybe you're a bit too old school; you still think in terms of pre-Pentium cpus. i honestly don't think it would thrash with 500 threads, and certainly not on any modern video card.

@hello_hello
and yes, the drivers do make a difference, it's been my experience that some driver revisions seem to kill quality, usually when Nvidia is trying to improve overall performance (the latest beta drivers improved performance considerably but definitely had a quality effect with some encoders).

Yeah, the driver on the disc wasn't that good, and somewhere along the way the driver stopped working at all. I had to get help in the Nvidia forum trying to find a driver that worked with CUDA. I don't know what it's like now. It just wasn't worth my trouble at the time to mess with it, since the quality was inferior to x264.exe. To get CUDA to look comparable to x264 I had to make the file size twice as big. My only interest in it in the first place was to see if I could get it to work with VirtualDub's External Encoder, which I could.

EDIT: I just used the CUDA encoder at default settings and the results were way worse than I had remembered. Looking at the numbers, you would think that the CUDA H264 encoder would've produced the best looking picture but the picture was pathetic looking. The DivX265, X265 and X264 pictures looked identical to my "human" eye.

Using the same 4097x2160 input avi in Virtualdub with the external encoder feature to compare CUDA H264 using default settings, X264 at superfast preset, DivX265 at fastest preset and X265 at ultrafast preset...