This is the important part. It's not guaranteed to be accurate. They claim it "delivers essentially the same correctness as the model it imitates -- sometimes even finer detail". But can you really trust that? Especially if each frame of the simulation derives from the previous one, errors could compound.
It seems like a fantastic tool for quickly exploring hypotheses. But it seems like once you find the result you want to publish, you'll still need the supercomputer to verify it?
I don't know if it's the same thing, but it feels like an analogy:
Protein structure prediction is not considered to be "solved," but to the extent it has been solved, it was not through physics applied to what is clearly a physics problem. Instead it was solved with lots of data, with protein language modeling, with deep nets applied to contact maps (which are an old tool in the space), and with some refinement at the end.
The end result is correct not because physics simulations are capable of doing the same thing and we could check AlphaFold against it, but because we have thousands of solved crystal structures from decades of grueling work by thousands of people.
We still need that crystal structure to be sure of anything, but we can get really good first guesses with AlphaFold and the models that followed, and it has opened new avenues of research because a very very expensive certainty now has very very cheap mostly-right guesses.
When it comes to very complicated things, physics tends to fall down and we need to try non-physics modeling, and/or come up with non-physics abstraction.
I have a thing where I immediately doubt any ML paper that imitates a process then claims that the model is sometimes “even better” than the original process. This almost always means that there is an overzealous experimenter or a PI who didn’t know what they were dealing with.
Hello, lead author here.
First: you are right! A surrogate model is a fancy interpolator, so eventually it will only be as good as the model it is trying to mimic, not better. The piece that probably got lost in translation is that the codes we are mimicking have accuracy settings that you sometimes can't push to the maximum because of the computational cost. But with the kind of tools we are developing, we can push these settings when we create the training dataset (as this is cheaper than running the full analysis). In this way, the emulator might be more precise than the original code run with "standard settings" (because it has been trained using more accurate settings). This claim of course needs checking: if I am including an effect that might have a 0.1% impact on the final answer but the surrogate has an emulation error of order 1%, clearly the previous claim would not be true.
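To make that last caveat concrete, here is a toy error-budget check with illustrative numbers (not values from the paper): the extra physics only survives emulation if the emulator's own error is well below the size of the effect.

```python
# Illustrative error-budget check (hypothetical numbers, not from the paper).
effect_size = 0.001      # ~0.1% impact of the high-accuracy setting on the answer
emulation_error = 0.01   # ~1% error of the surrogate relative to its training code

if emulation_error < effect_size:
    print("The emulator can meaningfully resolve the extra effect.")
else:
    print("The emulation error swamps the extra effect; the claim would not hold.")
```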
That 'finer detail' sounds suspiciously like inventing significant digits from less significant inputs. You can interpolate, for sure, but it isn't going to add any information.
I'm not sure what you mean by that. Neural networks are pretty good statistical learning tools, and in this kind of application you'll need some stochastic learning regardless of whether you use a laptop or a supercomputer. It's not like they used an LLM to predict the simulation steps. If you read the paper, they seem to use a simple fully-connected 5-layer neural network architecture, which is a completely different beast from, say, the trillion-parameter transformers used for LLMs.
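For a sense of scale, a minimal sketch of a fully-connected 5-layer network of that kind in PyTorch (the layer widths and input/output sizes here are placeholders, not values from the paper):

```python
import torch
import torch.nn as nn

class Emulator(nn.Module):
    """Small fully-connected network mapping input parameters to an output vector."""
    def __init__(self, n_params: int, n_outputs: int, width: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_params, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
            nn.Linear(width, n_outputs),  # 5 linear layers in total
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

# Example with made-up shapes: 6 input parameters -> 100 output values.
model = Emulator(n_params=6, n_outputs=100)
y = model(torch.randn(32, 6))  # batch of 32 parameter vectors
```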
Physicists have been doing this sort of thing for a long time. Arguably they invented computers to do this sort of thing.
That depends entirely upon a definition of computer vs. calculator, and upon the distinction between "invented" (conceived) and "assembled and working".
ENIAC (1945) wasn't assembled for cryptography, nor was the Difference Engine (1820s) designed for that purpose.
Between these, the Polish bombas (1938) were adapted from other designs to break Enigma codes, but they lacked the features of general-purpose computers like ENIAC.
Tommy Flowers' Colossus (1943–1945) was a rolling series of adaptations and upgrades purposed for cryptography, but it was programmed via switches and plugs rather than a stored program and lacked the ability to modify programs on the fly.
Thanks, this was going to be essentially my response. I'm glad you beat me to it so I didn't have to look up the dates.
But for the interested: von Neumann became one of the lead developers on the ENIAC. The von Neumann architecture is based on a write-up he did of the EDVAC. Von Neumann and Stanislaw Ulam worked out Monte Carlo simulations for the Manhattan Project.
The first programmable electronic computer was developed at the same time as randomized physics simulations and with the same people playing leading roles.
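For a flavor of what those early Monte Carlo methods do at their simplest, a toy example (estimating pi, not a physics simulation):

```python
import random

# Toy Monte Carlo estimate of pi: sample random points in the unit square and
# count the fraction that fall inside the quarter-circle of radius 1.
n = 1_000_000
hits = sum(1 for _ in range(n) if random.random() ** 2 + random.random() ** 2 <= 1.0)
print(4 * hits / n)  # converges toward pi as n grows
```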
> Especially if each frame of the simulation derives from the previous one.
How do you think this universe works? To me that sounds exactly the same.
Every moment is derived from the previous instant.
Leaving aside the question of whether the universe is discrete or continuous, a simulation would still have lower "resolution" than the real world, and some information can be lost with each time step. To compensate for this, it can be helpful to have simulation step t+1 depend on both the step t and step t-1 states, even if this dependency seems "unphysical."
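A minimal sketch of such a two-step dependency (the derivative function `f` is hypothetical, and the Adams-Bashforth-style weights are just one common choice, not anything from the article):

```python
# Sketch: the next state depends on both the current and the previous state,
# unlike a purely one-step (Markovian) update. `f` is a hypothetical function
# returning the time derivative of the state.
def step_with_history(state_t, state_t_minus_1, dt, f):
    # Two-step (Adams-Bashforth-style) extrapolation of the derivative.
    return state_t + dt * (1.5 * f(state_t) - 0.5 * f(state_t_minus_1))
```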
The universe evolves exactly under physical laws, but simulations only approximate those laws with limited data and finite precision. Each new frame builds on the last step’s slightly imperfect numbers, so errors can compound. Imagine trying to predict wind speeds with thermometers in the ocean — you can’t possibly measure every atom of water, so your starting picture is incomplete. As you advance the model forward in time, those small gaps and inaccuracies grow. That’s why “finer detail” from a coarse model usually isn’t new information, just interpolation or amplified noise.
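As a toy illustration of how quickly small errors compound in an iterated system (a logistic-map stand-in, not the article's simulation):

```python
# Two logistic-map trajectories that start 1e-9 apart diverge to order-one
# differences within a few dozen iterations.
x, y = 0.4, 0.4 + 1e-9
for _ in range(60):
    x = 3.9 * x * (1 - x)
    y = 3.9 * y * (1 - y)
print(abs(x - y))  # typically O(0.1): the tiny initial gap has been amplified
```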
> The universe evolves exactly under physical laws
Has this been confirmed already? 1) It seems the 'laws' we know are just an approximation of reality. 2) Even if no external intervention has been detected, that doesn't mean there was none.
Fine details. We are talking about an NN model vs. an algorithm. Both are approximations, and in practice the model can fill gaps in the data that the algorithm cannot, or does not by default. A good example would be image scaling with in-painting for scratches and damaged parts.
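A minimal sketch of that in-painting idea using OpenCV (the file names are placeholders; the mask's nonzero pixels mark the damaged regions):

```python
import cv2

# Hypothetical input files: a damaged image and a mask whose nonzero pixels
# mark the scratches to be filled in.
img = cv2.imread("damaged.png")
mask = cv2.imread("scratch_mask.png", cv2.IMREAD_GRAYSCALE)

# Fill the masked regions from the surrounding pixels (Telea's method).
restored = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite("restored.png", restored)
```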
Google also has a global weather model yielding ten-day predictions, and OpenStreetMap runs locally as well. Just today, with GraphHopper and a map of Europe, I can generate 2700 routes per second on my workstation. When I was young these were not things you could run at home!
Add to that Qwen3-Omni, which can run on a well-spec'd workstation, will happily carry on natural-language spoken conversations with you, and can work intelligently with images and video as well as doing all the other stuff LLMs already do.
I don't think Paramount would look kindly on giving it Majel Barrett's voice, but it sure feels like talking to the computer on the holodeck.