Like you, I used XMP RAM settings. I noticed all my RAM timings were off until I enabled XMP, which then set them correctly. I still think those running into random segfaults have RAM-related issues, since it has also been shown that Ryzen is very RAM dependent due to the CCX.
I should put an option on to see if people are setting their RAM to the RAM stick's settings OR to what the BIOS thinks it should be. I am 99% sure this is the issue (assuming the hardware is good).

AGESA 1.0.0.6 is due out within the next two weeks, so that should sort out some more RAM issues (i.e. defaults as well as overclocking).

Yes, most issues are probably bad RAM timings. I am currently stable at 3.9GHz with 1.3626v CPU, 1.0750v NB and 3.5v DRAM. Anything higher would require more RAM than I can handle on air, so 3.9GHz will stay for now \o/
I can probably also assume that gcc-7, a >=4.9 Linux kernel and proper RAM timings according to your kit specifications should give users a stable setup. XMP is not enabled by default, and most people ignore BIOS settings and expect things to just work™, but with Ryzen you do need to pay attention, like with anything good, like Gentoo.

Can you post 'for i in gcc libreoffice llvm chromium; do genlop -t $i|tail -3;done' I like to compare what others have.

_________________The best argument against democracy is a five-minute conversation with the average voter
Great Britain is a republic, with a hereditary president, while the United States is a monarchy with an elective king

If someone at AMD is interested in core files, I can gather them.
But I doubt they'll be of much use: the crashes are totally random.
And I've never seen an ICE in GCC; all my crashes are in bash and head.

Here the dmesg line is, 99% of the time, "error 6 in bash[400000+a7000]".

Error 6 is a user-mode write to a page that is not present, so I suspect the CPU cache has a problem, in silicon or whatever (architectural).
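For anyone comparing dmesg output, here is a minimal Python sketch of how the "error N" value can be decoded. It assumes the usual x86 page-fault error-code bits (bit 0 = protection violation, bit 1 = write, bit 2 = user mode) and that the kernel prints the code in hex. The function names and the sample line (pid and addresses included) are made up for illustration; only the "error 6 in bash[...]" tail comes from this thread.

```python
import re

# x86 page-fault error-code bits (assumed layout, as used by the Linux kernel).
PF_PROT = 0x01   # 0 = page not present, 1 = protection violation
PF_WRITE = 0x02  # 0 = read access,      1 = write access
PF_USER = 0x04   # 0 = kernel mode,      1 = user mode
PF_RSVD = 0x08   # reserved bit set in a page-table entry
PF_INSTR = 0x10  # fault happened on an instruction fetch

# Matches the tail of a segfault line: "error 6 in bash[400000+a7000]".
SEGFAULT_RE = re.compile(r"error ([0-9a-f]+) in (\S+?)\[([0-9a-f]+)\+([0-9a-f]+)\]")


def decode_pf_error(code):
    """Return a human-readable description of a page-fault error code."""
    parts = [
        "user-mode" if code & PF_USER else "kernel-mode",
        "write" if code & PF_WRITE else "read",
        "protection violation" if code & PF_PROT else "page not present",
    ]
    if code & PF_RSVD:
        parts.append("reserved bit set")
    if code & PF_INSTR:
        parts.append("instruction fetch")
    return ", ".join(parts)


def parse_segfault_line(line):
    """Extract error code, binary name, and mapping base/size, or None."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    code, binary, base, size = match.groups()
    return {
        "code": int(code, 16),  # the kernel prints the code in hex
        "binary": binary,
        "base": int(base, 16),
        "size": int(size, 16),
    }


# Hypothetical sample line; pid and addresses are invented for the example.
SAMPLE = ("bash[1234]: segfault at 0 ip 0000000000443a2f "
          "sp 00007ffd5c0e1a10 error 6 in bash[400000+a7000]")

info = parse_segfault_line(SAMPLE)
print(info["binary"], "->", decode_pf_error(info["code"]))
```

Feeding it the lines from `dmesg | grep segfault` and counting by `binary` would show whether the crashes really cluster in bash, as several posters report.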

The situation is much better, but I still had a few compilation errors (weird ones, though). I'm now running with the XMP profile of my DDR4 without changing the voltage, Cool'n'Quiet on, with -j13 for MAKEOPTS. Before, I wasn't able to build anything in this configuration, and my whole system would become unstable once segfaults started to appear.
Now I've rebuilt GCC and mesa at random a few times, plus some other stuff. I did have two compilation errors with mesa, but overall it's much more stable. I'm going to start toying with my RAM timings and voltage again to see if it can reach a stable point.

RAM settings - JEDEC (2133MHz, 1.2V; CL 15); XMP profile 1 (2933 MHz, 1.35V; CL16); XMP profile 2 (3200 MHz, 1.35V; CL16)
Pulling one stick of RAM and just using 1x8GB (only at 2133) with either stick of RAM just in case one is bad.
Turning SMT on or off
OP Codes - no options in my BIOS to change
CPU frequency governor on Performance vs ondemand
Compiling GCC 6.3 and using -march=znver1
Reducing makeopts to -j13 or -j8

In all cases, I will get random segfaults during compiles. They happen more frequently with large packages and higher levels of threading (i.e. -j16 segfaults more than -j8 ) but even with memory at 2133, SMT off and -j8, compiles will segfault. I've just compiled kernel 4.11.3 and will see if that makes any difference, but I'm not holding my breath.
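Since the failures are random, comparing configurations (-j8 vs -j16, SMT on vs off, and so on) only means something if the same build is repeated several times per setting. Below is a rough Python sketch of such a loop. The function name `count_failures` is made up for illustration, and it only sees the exit status of the top-level command: a segfault inside a child of make or emerge normally shows up as a plain non-zero exit, not as a signal.

```python
import subprocess


def count_failures(cmd, runs):
    """Run `cmd` `runs` times and return (failed, killed_by_signal).

    `failed` counts every run that did not exit with status 0.
    `killed_by_signal` counts the subset where the top-level process itself
    was terminated by a signal (negative return code on POSIX).
    """
    failed = 0
    killed_by_signal = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,  # build logs are not needed for the tally
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failed += 1
        if result.returncode < 0:
            killed_by_signal += 1
    return failed, killed_by_signal
```

For example, `count_failures(["emerge", "--oneshot", "media-libs/mesa"], 5)` would rebuild mesa five times and report how many attempts failed. The package and the run count are just an illustration; pick whatever build fails most often on your box.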

Having gone through all of this, my question at this point is whether ANYONE has a stable Ryzen 7 system?

FWIW, memtest86+ runs through multiple passes with no issues. Prime95 under Win10 in stress test mode with 16 helpers running also doesn't crash (at least for the 2-3 hours that I let it run).

Harvard vs von Neumann architecture

The CPU apparently flips random bits in memory/cache.
If the memory area contains code the process may crash.

Hold that thought ..
I can't reconcile it with your earlier statement,

Quote:

It might be OK for a pure gaming PC

The inference being that it's OK for games to crash.

I'm not aware of segfaults in gcc. It's usually in bash, during a build, not gcc itself.

We only know that AM4 systems containing a Ryzen processor can generate segfaults under load.
It's a feature of the system, not the CPU. At least, it's not been demonstrated that it's the CPU.
AMD may know more but they don't have a fix yet.
Tweaking the CPU may fix the system problem but that does not imply the CPU was the root cause.

My money is on the Vcore or Vram PSU transient response behaviour causing brownouts.
They are the hardest bits of a PC to get right and issues there only appear as system load changes.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

Many BIOS options were cited in this thread to alleviate the problem and none worked for me. Then I discovered that setting LLC to "high" in the BIOS reduces the 10-20 errors for an emerge -e @world to about 5.

So, your money could be well placed, but I don't like the consequences, because if so, there may be no fix at all for this line of CPUs. The problem is that AMD is selling very appealing multi-core/multi-thread CPUs that fail at exactly the job that makes them appealing.

The CPU apparently flips random bits in memory/cache.
If the memory area contains code the process may crash.

Hold that thought ..
I can't reconcile it with your earlier statement,

Quote:

It might be OK for a pure gaming PC

The inference being that it's OK for games to crash.

I'm not aware of segfaults in gcc. It's usually in bash, during a build, not gcc itself.

No. I've seen several non-reproducible gcc segfaults myself.
Games don't use all cores at 100% like compiling with -j16 does.
This is a CPU bug. I wish AMD would officially acknowledge it.

I see random MCEs on a core ("u-op cache tag parity error") when all 16 are fully taxed for a few hours (e.g. encoding a DVD with x264 with all options set to max). The latest round of BIOS updates has reduced them significantly, though.
Reducing memory frequency to the minimum also cut down on random segfaults a lot.

EDIT: Oh, and there's a far more annoying bug: sometimes processes just stop. They don't crash or anything, but they don't make any progress either (i.e. RIP doesn't move), but can be killed easily.

Hmm, well I haven't read through every post, but I've gotten through at least half of them. Not sure if anyone else has mentioned this themselves yet, but I figured I'd at least throw it out there in case it's useful:

I built my Ryzen 7 1800X machine a week after launch day. Nothing was overclocked and was just running at stock speeds. My mobo's BIOS version at the time of initial install was v1.0. I (very sporadically) ran into the same segfaults when building larger packages on that initial install (identical to the first post in this thread), but for the most part everything seemed to build without issue and I didn't think much of the few segfaults. Not long after, I upgraded to BIOS v1.3 for my mobo which included the updated AGESA 1.0.0.4a code.

While running v1.3 of the BIOS, I needed to re-emerge world, and during that re-emerge is when I started seeing the segfaults constantly. They started to happen more and more often as I got through the package list. By the end, I could no longer build mesa without segfaulting, which became my standard for testing the segfaults. *Sometimes* gcc would build without segfaulting, but pretty much any large package couldn't make it through. This went on for a few weeks. I was about to RMA the CPU, and as a last-ditch effort just before, I decided to try a fresh install on a new partition. Interestingly, that install worked completely flawlessly and built every package (mesa included). For the past 1.5 - 2 months, that install has continued to work without issue and is what I'm still currently using. I update almost every day and haven't had any segfaults no matter what gets built.

Has anyone tried a second install running a brand new BIOS to see what happens? I don't know the inner workings of GCC all that well, but is it possible that something during those early buggy BIOS releases caused something to build incorrectly (GCC, libtool, something?) that would then propagate to anything else that was built? The only difference between my initial install when I first built the machine and my current install is the BIOS version. No other hardware has changed. The BIOSes before v1.3 were pretty flaky for me and I didn't use the machine too much, so I can't comment on whether the segfaults were actually happening a lot during that time or not. I did see the few segfaults with v1.0 initially, so at least something was going on from the beginning.

Bigfoot77, yes, I tried a complete Gentoo installation three times and the problem is still there. But your story made me realize I always used the same kernel configuration from my main installation... Did you change your kernel config when re-installing?

When I did my install I made a completely new start: fresh install, fresh make.conf (kept my USE flags) and a freshly configured kernel.