Our Unity project needs an open source C# physics library now, and Bepu v2 looks pretty good.
But as you mentioned above, performance may be a huge problem.
I actually ran some tests, and yes, it's a huge problem.

400 BoxCasts:
Unity: 2 ms, 0 GC Alloc
Bepu: 6494 ms, 53.2 MB GC Alloc

I can't wait for Unity to fix it; that would take years, and I only have a few weeks.
So I need to modify the source code to make it run faster.
But I am not familiar with SIMD, so I am not sure what to do.
Can you point me in a direction?

For example, GJKDistanceTester.Edge() costs 1753.64 ms and 3.2 MB GC Alloc.
How should I modify it?
Thanks in advance!

3000x slower is pretty wonky. I'd first rule out benchmark problems like running in debug mode, measuring the cost of JIT compilation, and so on. I don't think that'll fully explain it, though.
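To keep JIT cost out of the numbers, something like the following could work. It's only a minimal sketch: `BenchmarkSketch` and `MeasureAverageMilliseconds` are names made up for illustration, not part of Bepu or Unity, and the workload in `Run` is a stand-in for the actual box casts.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration; not part of Bepu or Unity.
public static class BenchmarkSketch
{
    // Runs `work` a few times untimed so JIT compilation happens up front,
    // then times `iterations` calls and returns the average milliseconds per call.
    public static double MeasureAverageMilliseconds(Action work, int warmupIterations, int iterations)
    {
        for (int i = 0; i < warmupIterations; i++)
            work();

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            work();
        stopwatch.Stop();

        return stopwatch.Elapsed.TotalMilliseconds / iterations;
    }

    public static void Run()
    {
        // An attached debugger usually disables optimizations, which skews timings.
        Console.WriteLine($"Debugger attached: {Debugger.IsAttached}");

        double sink = 0; // keeps the stand-in workload from being optimized away
        double ms = MeasureAverageMilliseconds(() =>
        {
            // Stand-in workload; replace with the 400 box casts being measured.
            for (int i = 0; i < 1000; i++)
                sink += Math.Sqrt(i);
        }, warmupIterations: 10, iterations: 100);

        Console.WriteLine($"Average: {ms} ms per call (sink = {sink})");
    }
}
```

Call `BenchmarkSketch.Run()` from a release build rather than from the editor; the warm-up count and iteration count are arbitrary choices.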

The GC alloc in particular makes no sense: that function definitely should not allocate. It may imply that the fallback implementation Unity is using for Vector<T> is allocating somehow.
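One way to check both suspicions from inside the runtime is sketched below. `VectorDiagnostics` and `MeasureAllocatedBytes` are hypothetical names. `Vector.IsHardwareAccelerated` and `Vector<T>.Count` are standard System.Numerics members; `GC.GetAllocatedBytesForCurrentThread` is a standard .NET API, but I'm assuming it is available and accurate on the runtime Unity ships, which I haven't verified.

```csharp
using System;
using System.Numerics;

// Hypothetical diagnostics helper for illustration; not part of Bepu or Unity.
public static class VectorDiagnostics
{
    // Returns the bytes allocated on the current thread while `work` runs once.
    // Assumes GC.GetAllocatedBytesForCurrentThread exists on the target runtime.
    public static long MeasureAllocatedBytes(Action work)
    {
        work(); // warm-up so one-time JIT and static-init allocations are not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        work();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Run()
    {
        // False means Vector<T> is running on a software fallback path.
        Console.WriteLine($"Vector.IsHardwareAccelerated: {Vector.IsHardwareAccelerated}");
        Console.WriteLine($"Vector<float>.Count: {Vector<float>.Count}");

        float sink = 0f; // keeps the loop from being optimized away
        long bytes = MeasureAllocatedBytes(() =>
        {
            var step = new Vector<float>(1f);
            var sum = Vector<float>.Zero;
            for (int i = 0; i < 1000; i++)
                sum += step;
            sink = sum[0];
        });

        // Vector<T> is a struct, so plain arithmetic like this should not need the heap.
        Console.WriteLine($"Bytes allocated by Vector<float> arithmetic: {bytes} (sink = {sink})");
    }
}
```

If the reported byte count is large for a loop this simple, that would point at the Vector<T> fallback rather than at anything in the physics code itself.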

If there isn't some simple environmental explanation for this (some runtime settings, debug mode, or something), there's no realistic way for you to fix this. If you're working with a time constraint, I'd strongly recommend using something that is proven to work with Unity as-is.

Doing benchmarks in the editor with the profiler usually gives weird, unreliable results like this. Honestly, if you want to benchmark, it's usually better to make a build instead of using the editor and the profiler. But this does confirm that we need Unity to support hardware acceleration for the Vector class.

The best we can do is create a ticket for this and smash the vote button until they add it. Do you know if one already exists? If not, it would be a good idea to create one.

I submitted a bug report. Unity QA did not treat it as a bug and recommended that I make a feature request in the forums. After going through the process with QA, I believe the thread is our best bet. So it might be worth going in, picking up the convo, and talking about why you'd like to see it supported...

System.Numerics.Vectors gets about 50k downloads a day from NuGet, so there are clearly quite a few indirect users at least. Many aren't things that would obviously need it; you can see some of the top GitHub users on the NuGet website: https://www.nuget.org/packages/System.Numerics.Vectors/

Hardware intrinsics are a little harder to get information on since they ship as a part of .NET Core, not as a NuGet package. These sorts of things form the lowest level foundation that everything else sits on, so even with relatively few direct users, there tend to be a ton of indirect users. The core framework libraries make use of them now, too.