To be clear: INFTech is a for-profit (I think…?) firm out of Shanghai, and MAP is an international FOSS collective (https://m-a-p.ai/about).
Speaking generally, a lot of software engineering worldwide is done in English, so it makes sense that they’re training models in English even if some/most of the researchers also speak a Chinese language. Plus, HuggingFace is English-native, and working on FOSS models (FOSLMs?) without targeting that community would be like making a command line accounting tool and not immediately posting it to the HackerNews community.
Your comment seems to imply some sort of hidden motivation, but idk, seems pretty straightforwardly benign to me! Plus it’s hard to say how many papers are published in other languages about LLMs, considering we wouldn’t read them.
For clarity: I'm a pandoc diehard (especially because it's written by a philosopher!) but it intentionally doesn't approach this level of functionality, AFAIK.
Wow, I love this -- all the more so because it's implemented in JavaScript! The purists are spinning in their beds/graves, but it clearly made the visualization and audio followup steps easier, at the very least. The visuals are killer, and the obvious next step is to somehow translate the higher-level structure of existing programs into this; I would imagine nerds would pay good money to get Dijkstra's algorithm on their wall, or an ANN backprop algorithm.
I did find this part funny:
> One interesting problem that I did not anticipate while imagining the language was that it turned out so purely functional and absolutely state-less, that it becomes impossible to implement a "print" statement, for to print is to change state, to expect some things to be printed in some particular order is to assume that some expressions will be evaluated in some order.
Isn't this just to say "not imperative"? Regardless, it does make me wonder how one would encode state... maybe introduce variables (icon + color?) and have individual statements ordered on one or both axes?
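FWIW, the usual purely-functional answer is to thread the "output so far" through as an ordinary value, Writer/State-monad style, so ordering falls out of data dependencies rather than evaluation order. A tiny Python sketch of that idea (every name here is invented for illustration, nothing to do with the actual visual language):

```python
def pure_print(text, output):
    """Return a *new* output log with `text` appended; nothing is mutated."""
    return output + (text,)

def program(output):
    output = pure_print("step 1", output)
    output = pure_print("step 2", output)
    return 42, output

value, log = program(())          # start from an empty log
print(value)                      # 42
print("\n".join(log))             # the "printed" lines, in a well-defined order
```

The ordering is recovered because each call consumes the log produced by the previous one, which is exactly the kind of dependency you could draw on one of those axes.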
Fascinating — I thought ultrasound was already in regular use for reading oxygenation levels; I had no idea this was new!! I’ve gotta try this. I don’t love the modulation side, but the measurement side is incredible. Invasive tech is unnecessary and terrifying IMHO
Hmm, I see, I think I understand a bit better now -- thanks.
Is it fair to say that their claims about spatial resolution being >>> existing EEG options are jumping the gun? If I understand correctly, you need to be targeting individual 1mm^2 regions with individual acoustic lenses, which means 17,000 channels would require 17,000 separate, uniquely-tuned ultrasound emitters, yes? Even if that's possible without messing up the data (the MHz range is big, but is it that big?) it seems trivially impossible to fit that in one headset -- even the standard 32-64 EEG channels alone seem like a long shot. But maybe I'm overly cynical, or one emitter could be used to usefully excite multiple regions at once?
Another oddity in that paper is that it reads like we're trying to find persistent signals in the brain, like a needle in a haystack, whereas my understanding was that the field is moving decisively towards tracking signal changes over time in a given region. Is my intuition correct that accounting for a moving target would add considerable complexity to this approach?
Either way, thanks for sharing the link. Definitely thought-provoking stuff...
Thanks for your questions! I was one of the people who worked on the project. To answer your questions:
> Is it fair to say that their claims about spatial resolution being >>> existing EEG options are jumping the gun? If I understand correctly, you need to be targeting individual 1mm^2 regions with individual acoustic lenses, which means 17,000 channels would require 17,000 separate, uniquely-tuned ultrasound emitters, yes? Even if that's possible without messing up the data (the MHz range is big, but is it that big?) it seems trivially impossible to fit that in one headset -- even the standard 32-64 EEG channels alone seem like a long shot. But maybe I'm overly cynical, or one emitter could be used to usefully excite multiple regions at once?
Since the system is linear, you could use a single probe to focus at multiple spots. Each focus would be at a slightly different modulation frequency.
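As a toy illustration of that point (my own sketch, not the actual system): tag each focus with its own modulation frequency, let linearity sum the returns, and the two components separate cleanly in the frequency domain.

```python
import numpy as np

fs = 100_000                              # toy sample rate, Hz
t = np.arange(0, 0.1, 1 / fs)             # 100 ms window
f1, f2 = 2_000.0, 2_300.0                 # two nearby modulation frequencies

focus_a = 1.0 * np.sin(2 * np.pi * f1 * t)    # return tagged with f1
focus_b = 0.5 * np.sin(2 * np.pi * f2 * t)    # return tagged with f2
received = focus_a + focus_b                  # linearity: the sensor sees the sum

freqs = np.fft.rfftfreq(len(t), 1 / fs)
spectrum = np.abs(np.fft.rfft(received)) * 2 / len(t)
for f in (f1, f2):
    print(f, spectrum[np.argmin(np.abs(freqs - f))])   # recovers ~1.0 and ~0.5
```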
> Another oddity in that paper is that it reads like we're trying to find persistent signals in the brain, like a needle in a haystack, whereas my understanding was that the field is moving decisively towards tracking signal changes over time in a given region. Is my intuition correct that accounting for a moving target would add considerable complexity to this approach?
This method would indeed let you track signals that change over time. Lock-in-amplifiers can output time-varying signals.
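Rough sketch of that idea (again illustrative, not the real pipeline): mix the received signal with a reference at the modulation frequency, low-pass the product, and the output tracks an envelope that changes over time.

```python
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100_000
t = np.arange(0, 1.0, 1 / fs)
f_mod = 2_000.0                                  # modulation frequency, Hz

envelope = 1.0 + 0.3 * np.sin(2 * np.pi * 2 * t) # slow, time-varying "signal" at 2 Hz
received = envelope * np.sin(2 * np.pi * f_mod * t) + 0.1 * np.random.randn(len(t))

reference = np.sin(2 * np.pi * f_mod * t)
mixed = received * reference                     # shifts the envelope down to DC

b, a = butter(4, 50 / (fs / 2))                  # 50 Hz low-pass keeps the envelope
recovered = 2 * filtfilt(b, a, mixed)            # factor 2 undoes the mixing loss

print(np.corrcoef(recovered, envelope)[0, 1])    # ~1.0: the moving target is tracked
```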
I still have lots of questions, but I think that's on me haha. Thanks so much for taking the time for this, and for pushing forward the human race in such a groundbreaking manner. Hope y'all are doing well in these dark times.
A modern clinical ultrasound probe has only something in the range of 128 to 512 elements. But despite that, you get a real-time video stream at far higher resolution than a 512-pixel postage stamp.
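A toy delay-and-sum sketch of why that works (my own simplification, monostatic pulse-echo with a single point scatterer): the same 128 element recordings are re-focused at every pixel in software, so the image grid can be far denser than the element count.

```python
import numpy as np

c, fs = 1540.0, 40e6                          # sound speed (m/s), sample rate (Hz)
n_elem = 128
elem_x = np.linspace(-0.02, 0.02, n_elem)     # 4 cm linear array

# Synthetic echoes from one scatterer at (0, 30 mm), one trace per element.
scat_x, scat_z = 0.0, 0.03
t = np.arange(4096) / fs
dist = np.sqrt((elem_x - scat_x) ** 2 + scat_z ** 2)
rf = np.array([np.sinc((t - 2 * d / c) * 2e6) for d in dist])   # (128, 4096)

# Delay-and-sum: re-focus the SAME channel data at every image point.
grid_x = np.linspace(-0.01, 0.01, 120)
grid_z = np.linspace(0.02, 0.04, 120)
img = np.zeros((len(grid_z), len(grid_x)))
for iz, z in enumerate(grid_z):
    for ix, x in enumerate(grid_x):
        tau = 2 * np.sqrt((elem_x - x) ** 2 + z ** 2) / c       # round-trip delays
        idx = np.round(tau * fs).astype(int)
        img[iz, ix] = rf[np.arange(n_elem), idx].sum()

print(np.unravel_index(np.abs(img).argmax(), img.shape))        # peaks at the grid center, where the scatterer sits
```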
Why isn't ultrasound used in orthopedics? Instead of MRI rigamarole, why can't the doc put the wand to my shoulder in the office and tell me if it's a tear?
I love the joke, but tbf React and Next.js have shown some serious sticking power so far, and show no sign of slowing. Also, JavaScript is so 2022, TypeScript is mandatory now IMO!
Well I’d encourage you to follow your dreams/preferences and adjust your spending accordingly, especially now that the recent mini-shock reminded us that absurd salaries can’t always be the norm, but:
The boring answer based on current demand is that the old languages are still dominant. Of the newer languages, Rust and Go are indeed at the top, but they still sit below C/C++, Java, and C# in overall demand. OTOH, they are definitely near the top of the fastest-growing languages, which is probably where your sense of them being “chosen” comes from; if you’re cynically trying to maximize long-term career earnings, IMO either would be worth some investment.
The elephant in the room is, as you briefly mentioned, Python and its relation to the AI boom. There are lots of fantastic shovels being made in other languages (eg llama-cpp), but the huge majority of new libraries are written with Python APIs in mind (eg VLLM, Langchain, BentoML, and ofc the classics like PyTorch/Keras, SciKit, and numpy/pandas). Again, speaking cynically, I think there’s a lot of money flying around the Python space right now.
Finally, I think it’s worth mentioning my take on the old refrain: languages aren’t really that different so don’t stress about it, but it can be worth it to invest in new paradigms/spaces/application types. It sounds like you’re not a fan of webdev, but instead of hyper focusing on picking a language, maybe consider picking new spaces to explore! I mentioned LLM shovels (aka quantizers, inference platforms) above, but there’s also some other booming spaces such as CRDTs/LocalFirst and spatial computing, to name my two faves.
Best of luck! Exciting time to be a member of the puzzle-solving class :)
Fascinating! Reminds me of the new generation of AR headsets (eg Orion) that are making the impossible possible simply by adding an ANN(-derived) layer above some of their device controllers. I wonder how many problems will fundamentally change in the face of mature brute-forcing techniques…
Lol whenever you can so casually say that a great scientist is wrong, that might be a good heuristic to know you misunderstood the question. It's a philosophical paradox stemming from our conception of species, not an empirical question about fossil records :)
Very true and well put, but IMHO that's not a productive definition of "scientist". You're definitely on the side of common usage, but this is one of the many hills I'd die on; all scientists necessarily employ intuitive intellectual tools at some point in their process, so it feels silly to cut out those who primarily use them if they're still productively employing systematic thought.
The upside of this is that Mathematicians get to be scientists, too! The downside is that you also have to let the darn philosophers in ;)
You're basically saying that scientist and "using systematic thought" should be synonymous. Why have two terms for the same thing? Whereas we definitely need a separate term for people who focus on empirical work, since that implies distinct properties of the kind of work. Mathematical and philosophical work simply aren't the same as scientific work, despite the non-zero overlap.
It's almost like there's a reason the common usage is the way it is.
Well, the alternative is "focuses on empirical work" -- every defined term is inherently redundant, I'd say.
> Mathematical and philosophical work simply aren't the same as scientific work, despite **the non-zero overlap.**
Hopefully this makes it clear why I see it as a matter of taste :)
Personally, I find the utility of including all systematic human pursuits as one lineage to be much greater than the utility of cordoning off empirical physical science. The latter is the status quo, often phrased along the lines of "for a long time everyone was silly and wishy-washy, and then science came around in the 1600s." For one thing this silences many great voices from the past, and for another I'd say it's behind the current crises around what exactly constitutes "human" or "social" sciences.
For example: the answer to "what is psychology" is a lot easier to productively answer if we start the search in 400 BCE instead of 1900 CE.
Non-zero overlap is a laughably bad reason for dissolving distinctions between kinds of things. Almost everything has overlap. It's the differences that need names.
"Silencing voices" is entirely beside the point. We don't need to invalidate all past mathematical or philosophical work by not calling it science. Except, of course, that quite a lot of it is flat wrong.
"Social science" and psychology are actually excellent examples. Their status as science is questioned exactly because their connection to empirical evidence is so tenuous.
The title is XKCD-style Nerd Sniping ;) The body of the article is about how a unicellular species formed multicellular-/animal-like groups via 'polar' division. I don't really understand what the difference between a colony of unicellular organisms and a multicellular organism is, but given the last sentence of the article, I don't think they do either:
> This discovery could also shed new light on a long-standing scientific debate concerning 600 million-year-old fossils that resemble embryos, and could challenge certain traditional conceptions of multicellularity.
The devil doesn't exist, and real life is complicated. Have fun telling the living Palestinians "whelp, a lot of you already died, so we're gonna let the rest of you die/be deported to a country you've never been to."