Computing entropies of high-dimensional random vectors for a theoretical neuroscience study. The journey is mostly a repetition of (1) almost giving up because it's completely hopeless, (2) taking a hot shower, (3) realizing there might actually be a path forward, (4) almost giving up because it's completely hopeless.
A notable and interesting point of this article is that convolutions and correlations (convolutions without flipping the filter) are quite a bit more subtle on the sphere than on Cartesian spaces. For a convolution between a function and a filter on R^N, you just "slide" the filter around, integrating at each shift, which produces another function on R^N. On a sphere, however, there is no clear-cut way to slide a filter around. For instance, there are multiple ways to slide a filter centered at the north pole to the south pole, and they result in different filter orientations.
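To make the contrast concrete, here is a sketch of the planar definitions in one common convention (both take a shift x in R^N as their argument and return another function on R^N; convolution flips the filter, correlation only shifts it):

    (f * \psi)(x) = \int_{\mathbb{R}^N} f(y)\, \psi(x - y)\, dy      % convolution: filter flipped and shifted
    (f \star \psi)(x) = \int_{\mathbb{R}^N} f(y)\, \psi(y - x)\, dy  % correlation: filter shifted only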
More generally, the space of rotations, which serves as the argument of the convolution (analogous to the shift being the argument of a standard convolution), is 3D (three Euler angles), whereas the space of points on the sphere is 2D (polar and azimuthal angles). Thus, whereas convolution over R^N returns a function over R^N, convolution over the sphere actually returns a function over the 3D rotation group SO(3). This has interesting consequences for, e.g., the convolution theorem on the sphere, which is not as clear-cut as simply rewriting the standard convolution theorem in spherical terms; a sketch of why is below.
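A minimal sketch of the spherical version, in one common convention (normalizations and the presence of the filter flip vary across references; here f and \psi are functions on the sphere S^2 and R ranges over rotations):

    (f \star \psi)(R) = \int_{S^2} f(x)\, \overline{\psi(R^{-1} x)}\, dx, \qquad R \in SO(3)

so the output is indexed by a full 3-parameter rotation R rather than a 2-parameter point on the sphere. Correspondingly, if \hat{f}^\ell_m and \hat{\psi}^\ell_m denote spherical-harmonic coefficients, the Fourier coefficients of f \star \psi with respect to the Wigner D-matrices D^\ell_{mn}(R) come out (up to normalization) as an outer product,

    \widehat{(f \star \psi)}^{\,\ell}_{mn} = \hat{f}^\ell_m\, \overline{\hat{\psi}^\ell_n}

i.e. a matrix for each degree \ell, rather than the simple pointwise product of the planar convolution theorem.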
Sagawa was mistaken in this article; he failed to appreciate the role of mutual information in computing, which is the proper basis for understanding Landauer's principle. I discussed this in https://www.mdpi.com/1099-4300/23/6/701.
If you don't mind my asking, how much does the role of mutual information in linking logical and thermodynamic reversibility depend on considering quantum systems? I.e. does your footnote 37, which discusses independent systems vs "subsystems of correlated systems" hold for classical systems as well?
There are a few folks working on this in neuroscience, e.g. training transformers to "decode" neural activity (https://arxiv.org/abs/2310.16046). It's still pretty new and a bit unclear what the most promising path forward is, but it will be interesting to see where things go. One challenge that gets brought up a lot is that neuroscience data is often high-dimensional but has limited samples (since it has traditionally been quite expensive to record from neurons for extended periods), which is a fairly different regime from the very large data sets typically used to train LLMs, etc.
This reminds me of a project I worked on during my PhD, in which you create a network of scientific documents and notes/threads via Markdown, with a similarly structured "rabbit-hole" linking system: https://github.com/rkp8000/hypothesize . I'm not really a software engineer, so I never made it ready for general use, but I'm very happy to see a similar idea turned into something real! Kudos to the authors.
MSR has a very clear and accessible tutorial on quantum computing for anyone interested in getting up to speed with the fundamentals: https://www.youtube.com/watch?v=F_Riqjdh2oM .
Yep! This relationship is well known in statistical mechanics. I was just surprised that in many years of intersecting with information theory in other fields (computational neuroscience in particular) I'd never come across it before, even though IMO it provides an insightful perspective.