C# without SIMD (the naive, unoptimized version) is in the same ballpark as other similar GC languages. The Nim version is not a naive version either; it seems to be specially crafted so that it can be vectorized, and it still loses to C# SIMD.
Loses? My comparison uses GP's metric, perf/lines_of_code. Let m := perf/lines_of_code = 1/(t × lines_of_code) [higher is better], or to make comparison simpler*, m' := 1/m = t × lines_of_code [lower is better]. Then**:
Nim 1672
Julia 3012
D 3479
C# (SIMD) 5853
C# 8919
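A minimal sketch of how those m' scores compare relative to each other. The scores are the ones listed above; the helper `m_prime` and the "relative to best" column are my own illustration, not part of GP's metric.

```python
def m_prime(t: float, lines_of_code: int) -> float:
    """m' = t * lines_of_code, i.e. 1 / (perf / lines_of_code). Lower is better."""
    return t * lines_of_code


# m' scores exactly as quoted in this comment.
scores = {
    "Nim": 1672,
    "Julia": 3012,
    "D": 3479,
    "C# (SIMD)": 5853,
    "C#": 8919,
}

best = min(scores.values())
ranking = sorted(scores.items(), key=lambda kv: kv[1])

# Ratio of each score to the best (lowest) one.
relative = {name: score / best for name, score in ranking}

for name, ratio in relative.items():
    print(f"{name:10s} {scores[name]:5d}  {ratio:.2f}x")
```

By this metric the SIMD C# entry comes out at roughly 3.5 times the Nim score, which is the point being made.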
>Nim version is not some naive version
It's a direct translation of the formula, using `mod` rather than `x = -x`.
*Rather than comparing numbers << 1.
**Excluding blank and comment lines, as cloc and similar tools count.
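To make the `mod` versus `x = -x` distinction concrete, here is a rough sketch of both styles. It assumes the formula in question is the Leibniz series for π, which this part of the thread does not state, and the function names are hypothetical. Python will not vectorize either loop; the sketch only shows the structural difference: the first version carries the sign from one iteration to the next, the second derives it from the index alone.

```python
import math


def leibniz_sign_flip(rounds: int) -> float:
    """Alternate the sign by flipping a variable carried across iterations."""
    x = 1.0
    total = 0.0
    for i in range(rounds):
        total += x / (2 * i + 1)
        x = -x  # sign depends on the previous iteration
    return 4.0 * total


def leibniz_mod(rounds: int) -> float:
    """Derive the sign from the index parity, with no carried sign state."""
    total = 0.0
    for i in range(rounds):
        sign = 1.0 if i % 2 == 0 else -1.0  # sign depends only on i
        total += sign / (2 * i + 1)
    return 4.0 * total


# Both add the same terms in the same order, so the results are identical.
print(leibniz_sign_flip(1000) == leibniz_mod(1000))
print(abs(leibniz_mod(100_000) - math.pi))
```

Whether a given compiler vectorizes one form more readily than the other depends on the compiler and flags, so treat this as an illustration rather than a performance claim.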
Nim "cheats" in a similar way C and C++ submissions do: -fno-signed-zeros -fno-trapping-math
Although arguably these flags are more reasonable than allowing the use of -march=native.
Also consider the inherent advantage popular languages have: you don't need to break out to a completely niche language to achieve high performance. That said, this microbenchmark is naive and does not showcase the realistic bottlenecks applications would face, such as how well optimized the standard library and popular frameworks are, whether the compiler deals well with complexity and abstractions, whether there are issues with multi-threaded scaling, and so on. You can tell this from the performance of dynamically typed languages: since all data is defined in the scope of a single function, the compiler needs to do very little work and can hide the true cost of using something like Lua (LuaJIT).