One argument from the git devs is that it’s very hard to implement smarter algorithms in C, though. For example, it uses arrays in places where a higher level language would use a hash table, because the C version of that is harder to write, maintain, and debug. It’s also much easier to write correct threaded code in Rust than C. Between those two alone, using a more robust language could make it straightforward to add performance gains that benefit everyone.
That's a one-time gain though. There's no reason for every platform to check the validity of some hash table implementation when that implementation is identical on all of them.
In my opinion, the verification of the implementation should be separate from the task of translating that implementation to bytecode. This leaves you with a simple compiler that is easy to implement, plus a strong verifier that is harder to implement but optional.
Nobody needs to change a language standard for 9 lines of code. When you really want to use a hash map, it's likely that you care about performance, so you don't want to use a generic implementation anyway.
> or a at least a community consensus about which one you pick
There is a hash table API in POSIX:
GNU libc: https://sourceware.org/glibc/manual/latest/html_node/Hash-Search-Function.html
Linux hsearch(3): https://man7.org/linux/man-pages/man3/hsearch.3.html
hsearch(3posix): https://www.man7.org/linux/man-pages/man3/hcreate.3p.html
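For anyone who hasn't used it, here is a rough sketch of what that POSIX interface looks like in practice. The `table_*` wrapper names are made up for illustration, and storing a `long` in the `data` pointer via `intptr_t` is just one convenient assumption; only `hcreate`, `hsearch`, `hdestroy`, `ENTRY`, `ENTER` and `FIND` come from `<search.h>`.

```c
#include <assert.h>
#include <search.h>   /* POSIX: hcreate, hsearch, hdestroy, ENTRY */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper: create the one process-wide table.
 * Returns nonzero on success. */
int table_open(size_t max_entries)
{
    return hcreate(max_entries);
}

/* Hypothetical wrapper: insert key -> value, overwriting an existing value.
 * The key string is NOT copied, so it must outlive the table.
 * Returns nonzero on success, 0 if the table is full. */
int table_put(const char *key, long value)
{
    ENTRY item;
    item.key = (char *)key;
    item.data = (void *)(intptr_t)value;

    /* ENTER returns the existing entry if the key is already present,
     * so the value is assigned explicitly to get overwrite semantics. */
    ENTRY *slot = hsearch(item, ENTER);
    if (slot == NULL)
        return 0;
    slot->data = item.data;
    return 1;
}

/* Hypothetical wrapper: look up key. Returns nonzero and fills *out if found. */
int table_get(const char *key, long *out)
{
    assert(out != NULL);

    ENTRY item;
    item.key = (char *)key;
    item.data = NULL;

    ENTRY *found = hsearch(item, FIND);
    if (found == NULL)
        return 0;
    *out = (long)(intptr_t)found->data;
    return 1;
}

/* Hypothetical wrapper: free the table (keys are left to the caller). */
void table_close(void)
{
    hdestroy();
}
```

Typical use would be `table_open(16)`, a few `table_put("tree", 2)` calls, `table_get("tree", &v)`, then `table_close()`. The sketch also shows the limits of the API: there is one global table per process, the capacity is fixed at creation, and there is no way to delete an entry, which is part of why people tend to reach for something else.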
And who’s volunteering for that verification using the existing toolchain? I don’t think that’s been overlooked just because the git devs are too dumb or lazy or unmotivated.
That came across more harshly than I meant, but I stand by the gist of it: this stuff is too hard to do in C or someone would’ve done it. It can be done, clearly, but there’s not the return on investment in this specific use case. But with better tooling, and more ergonomic languages, those are achievable goals by a larger pool of devs — if not today, because Rust isn’t as common as C yet, then soon.
As a practical example, the latest Git version can be compiled by an extremely simple (8K lines of C) C compiler[1] without modification and pass the entire test suite. Gonna miss the ability to make this claim.
In theory you should be able to use TCC to build git currently [1] [2]. If you have a lightweight system or you're building something experimental, it's a lot easier to get TCC up and running over GCC. I note that it supports arm, arm64, i386, riscv64 and x86_64.
A specification is not a standard. It's a good start, but what makes a standard more valuable is that it requires more than one entity to approve a change to it. This does look like a step in the right direction though.
The nature of considering the future is that our actions _now_ affect the answer _then_. If we tie our foundational tools to LLVM, then it's very unlikely a new platform can exist without support for it. If we don't tie ourselves to it, then it's more likely we can exist without it. It's not a matter of whether LLVM will be supported: we guarantee that it will be by making anything else impossible. It's a self-fulfilling prophecy.
I prefer to ask another question: "Is this useful?" Would it be useful, if we were to spin up a different platform in the future, to be able to do so without LLVM? I think the answer to that is a resounding yes.
That doesn't leave rust stranded. A _useful_ path for rust to pursue would be to define a minimal subset of the compiler that you'd need to implement to compile all valid programs. The type checker, borrow checker, unused variable tracker, and all other safety features should be optional extensions to the core of a minimal portable compiler. This way, the rust compiler could feasibly be as simple as the simplest C compiler while still supporting all the complicated validation on platforms with deep support.
rustc is only loosely tied to LLVM. Other code generation backends exist in various states of production-readiness. There are also two other compilers, mrustc and GCC-rs.
mrustc is a bootstrap Rust compiler that doesn't implement a borrow checker but can compile valid programs, so it's similar to your proposed subset. Rust minus verification is still a very large and complex language though, just like C++ is large and complex.
A core language that's as simple to implement as C would have to be very different and many people (I suspect most) would like it less than the Rust that exists.
The beauty of the unsafety of C is partially that it's pretty easy to spin up a compiler on a new platform. The same cannot be said of Rust.