
What is up with this argument that LLMs cannot be Turing complete?

> There is a simple argument showing that GPTs, even with an infinite context window, cannot be Turing complete. Since the vocabulary size is finite, the rows of the embedding matrix are bounded, meaning each embedding is in a compact subset of R^n. By Tychonoff's Theorem, the product of the compact spaces the input embeddings live in is also compact. Because the transformer architecture does a bounded number of continuous operations, the output probabilities are bounded from above and below. In particular, the probability of emitting the end-of-sequence token (the special token that halts the completion) is bounded from below by some constant, so the probability of stopping in finite time is 1.
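
For what it's worth, the calculation the quoted argument seems to be gesturing at (my reconstruction, not the original author's) is: if the end-of-sequence probability is bounded below by some ε > 0 at every step, then

    P(\text{not halted after } n \text{ steps}) \le (1-\varepsilon)^n \to 0 \quad (n \to \infty),

so the run halts with probability 1.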

Even if the conclusion might be correct, the argument is pure nonsense.

First of all, why do we need to invoke compactness and Tychonoff's Theorem at all? Any serious discussion of computability should avoid real numbers, and here they are completely unnecessary: every set involved is actually finite, so "compact" is just fancy wording for "finite" in this case.

Second, a probability argument doesn't prove anything about Turing completeness. Realistically, we must treat LLMs as deterministic machines. The fact that we like to interpret the model's output as probabilities says nothing about what actually happens during execution of an LLM (again, it is deterministic).
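
To make that concrete, here is a minimal sketch of greedy decoding (the `model` interface and token ids are hypothetical placeholders, not any particular library's API). The "probabilities" never matter: each next token is a deterministic function of the context.

    import numpy as np

    def greedy_step(logits: np.ndarray) -> int:
        # no sampling: the next token is just the argmax over the vocabulary
        return int(np.argmax(logits))

    def generate(model, context: list[int], eos_id: int, max_steps: int) -> list[int]:
        # model(context) is assumed to return a logits vector over the vocabulary
        for _ in range(max_steps):
            token = greedy_step(model(context))
            context = context + [token]
            if token == eos_id:
                break
        return context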

Third, even if we accept the probability argument, we can easily get around it by simply removing the end-of-sequence token, guaranteeing that the process never stops.
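
(Sketch of what I mean, reusing the hypothetical interface above: mask the EOS logit before decoding, and the halting-probability argument has nothing left to bite on.)

    def step_without_eos(logits: np.ndarray, eos_id: int) -> int:
        masked = logits.copy()
        masked[eos_id] = -np.inf  # EOS can never be chosen, so generation never stops
        return int(np.argmax(masked))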

I think the real argument against Turing completeness of GPTs (even with infinite context) is more along the lines that they must operate with finite-precision number representations. A consequence is that they map an infinite context into a finite set of states, so the output must eventually become periodic.
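
That last step is just the pigeonhole principle: iterate any deterministic update over a finite state space and you must revisit a state, after which the trajectory cycles forever. A toy illustration (the `step` function here stands in for the LLM's state update, nothing more):

    def find_cycle(step, start):
        seen = {}  # state -> first index at which it appeared
        state, i = start, 0
        while state not in seen:
            seen[state] = i
            state = step(state)
            i += 1
        return seen[state], i - seen[state]  # (index where the cycle starts, period)

    # e.g. a toy update over 10 states enters a cycle of period 2
    cycle_start, period = find_cycle(lambda s: (3 * s + 1) % 10, 7)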

Finally, as a meta-point, why do so many people enjoy making such quasi-mathematical arguments? They often invoke theorems that obviously can't apply, which would be clear if they took a bit of time to define the question precisely. To me it seems like ML people know just enough math to notice a superficial correspondence, but lack the rigor to truly evaluate it.




