Today I submitted changes to the garbage collector that make typical worst-case stop-the-world times less than 100 microseconds. This should particularly improve pauses for applications with many active goroutines, which could previously inflate pause times significantly.

High GC pauses are one if the things JVM users struggle with for a long time.

What are the (architectural?) constraints which prevent JVM from lowering GC pauses to Go levels, but are not affecting Go?

The other collectors in openjdk are, unlike Go's, compacting generational collectors. That is to avoid fragmentation problems and to provide higher throughput on server-class machines with large heaps by enabling bump pointer allocation and reducing the CPU time spent in GC. And at least under good conditions CMS can achieve single-digit millisecond pauses, despite being paired with a moving young-generation collector.

Go's collector is non-generational, non-compacting and requires write barriers (see this other SO question), which results in lower throughput/more CPU overhead for collections, higher memory footprint (fragmentation) and less cache-efficient placement of objects on the heap (non-compact memory layout).