Quick look: applications of chaos theory
Introduction
Recently we (Elizabeth Van Nostrand and Alex Altair) started a project investigating chaos theory as an example of field formation.[1] The number one question you get when you tell people you are studying the history of chaos theory is “does that matter in any way?”.[2] Books and articles will list applications, but the same few seem to come up a lot, and when you dig in, application often means “wrote some papers about it” rather than “achieved commercial success”.
In this post we checked a few commonly cited applications to see if they pan out. We didn’t do deep dives to prove the mathematical dependencies, just sanity checks.
Our findings: Big Chaos has a very good PR team, but the hype isn’t unmerited either. Most of the commonly touted applications never received wide usage, but chaos was at least instrumental in several important applications that are barely mentioned on Wikipedia. And it was as important for weather as you think it is.
Applications
Cryptography and random number generators- Strong No (Alex)
The Wikipedia page for chaos theory has a prominent section on cryptography. This sounds plausible; you certainly want your encryption algorithm to display sensitive dependence on initial conditions in the sense that changing a bit of your input randomizes the bits of your output. Similarly, one could imagine using the sequence of states of a chaotic system as a random number generator. However, a quick Google search makes me (Alex) think this is not a serious application.
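To make "changing a bit of your input randomizes the bits of your output" concrete, here is a minimal sketch using SHA-256 from Python's standard library. SHA-256 is a hash function rather than an encryption algorithm, and it is only a stand-in chosen for this illustration; the message and the helper `bit_difference` are made up for the example. The mixing here comes from deliberately designed bitwise operations, not from a chaotic dynamical system.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Hypothetical message, used only for illustration.
message = bytearray(b"chaos theory")
flipped = bytearray(message)
flipped[0] ^= 0b00000001  # flip a single input bit

digest_a = hashlib.sha256(bytes(message)).digest()
digest_b = hashlib.sha256(bytes(flipped)).digest()

changed = bit_difference(digest_a, digest_b)
print(f"{changed} of {len(digest_a) * 8} output bits changed")
```

For a well-mixed function one would expect roughly half of the 256 output bits to change, which is the property the paragraph above compares to sensitive dependence.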
I’ve seen it claimed[3] that one of the earliest pseudo-random number generators used the logistic map, but I was unable to find a primary reference to this from a quick search.
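For readers who have not seen it, the idea of a logistic-map generator can be sketched in a few lines. This is an illustration only, not a reconstruction of any historical generator: the parameter r = 4, the seed 0.2, and the "above 0.5 means 1" bit rule are all assumptions chosen for the example, and nothing here is suitable for cryptographic use.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return every visited state."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def logistic_bits(x0, n, r=4.0):
    """Naive bit stream: emit 1 when the state is above 0.5, else 0."""
    return [1 if x > 0.5 else 0 for x in logistic_orbit(x0, n, r)[1:]]

# Two seeds that differ by one part in a trillion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-12, 60)
gap = [abs(p - q) for p, q in zip(a, b)]

print("initial gap:", gap[0])
print("largest gap over 60 steps:", max(gap))
print("first 16 bits:", logistic_bits(0.2, 16))
```

The two orbits start indistinguishable and end up unrelated, which is the appeal. The catch is that floating-point states are finite, so any such generator must eventually repeat, and its statistical quality depends on details that designed generators handle explicitly.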
Some random number generators use physical entropy from outside the computer (rather than a pseudo-random mathematical computation). There are some proposals to do this by taking measurements from a physical chaotic system, such as an electronic circuit or lasers. This seems to be backward, and not actually used in practice. The idea is somewhat roasted in the Springer volume “Open Problems in Mathematics and Computational Science” 2014, chapter “True Random Number Generators” by Mario Stipčević and Çetin Kaya Koç.
Other sources that caused me to doubt the genuine application of chaos to crypto include this Crypto StackExchange question, and my friend who has done cryptography research professionally.
As a final false positive example, a use of lava lamps as a source of randomness once gained some publicity. Though this was patented under an explicit reference to chaotic systems, it was only used to generate a random seed, which doesn’t really make use of the chaotic dynamics. It sounds to me like it’s just a novelty, and off-the-shelf crypto libraries would have been just fine.
Anesthesia, Fetal Monitoring, and Approximate Entropy- No (Elizabeth)
Approximate Entropy (ApEn) is a measurement designed to assess how regular and predictable a system is, a simplification of Kolmogorov-Sinai entropy. ApEn was originally invented for analyzing medical data, such as brain waves under anesthesia or fetal heart rate. It has several descendants, including Sample Entropy; for purposes of this article I’m going to refer to them all as ApEn. Researchers have since applied the hammer of ApEn and its children to many nails, but as far as I (Elizabeth) can tell it has never reached widespread usage.
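A minimal sketch of the textbook ApEn calculation may help show what "regular and predictable" means operationally. It counts how often short windows of the series that look alike (within a tolerance r) still look alike when extended by one sample. The window length m = 2, the absolute tolerance r = 0.2, and both test signals are assumptions for this example; practitioners often scale r by the standard deviation of the data instead.

```python
import math
import random

def _phi(series, m, r):
    """Average log-fraction of length-m windows lying within r of each window."""
    n = len(series) - m + 1
    windows = [series[i:i + m] for i in range(n)]
    total = 0.0
    for w in windows:
        # Chebyshev distance; every window matches itself, so matches >= 1.
        matches = sum(
            1 for v in windows
            if max(abs(a - b) for a, b in zip(w, v)) <= r
        )
        total += math.log(matches / n)
    return total / n

def approximate_entropy(series, m=2, r=0.2):
    """ApEn = phi(m) - phi(m + 1); lower values mean a more regular series."""
    return _phi(series, m, r) - _phi(series, m + 1, r)

rng = random.Random(0)
regular = [math.sin(0.3 * i) for i in range(300)]     # smooth, repetitive
noisy = [rng.uniform(-1.0, 1.0) for _ in range(300)]  # no structure

apen_regular = approximate_entropy(regular)
apen_noisy = approximate_entropy(noisy)
print(f"sine wave: {apen_regular:.3f}   uniform noise: {apen_noisy:.3f}")
```

The sine wave should score well below the noise. This direct version does quadratic work in the length of the series, which is fine for a sketch but not for long recordings.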
ApEn’s original application was real-time fetal heart monitoring; however, as far as I can tell it never achieved commercial success and modern doctors use simpler algorithms to evaluate fetal monitoring data.
ApEn has also been extensively investigated for monitoring brain waves under anesthesia. However, commercially available products only offer Spectral Entropy (based purely on information theory, no chaos) and Bispectral Index.
ApEn has been tried out in other fields, including posture, neurological issues, finance, and weather. I was unable to find any evidence any of these made it into practice, although if some day trader was making money with ApEn I wouldn’t expect them to tell me.
Empirical Dynamical Modeling- Unproven (Elizabeth)
EDM is a framework for modeling chaotic systems without attempting to use parameters. It was first created by George Sugihara and Robert May (a prominent early advocate and developer of chaos theory), but Stephen Munch is the scientist doing the most to put the tool into practice. Munch has an excellent-looking experiment in which he applies EDM to wild shrimp management (fisheries being one of two places you can make money with theoretical ecology[4]) and compares his output with other models. Alas, his results will not be available until 2041. At least they’ll be thorough.
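To give a flavor of what modeling "without parameters" can look like, here is a rough sketch in the spirit of simplex projection, one of the forecasting techniques associated with EDM. It is not Sugihara's or Munch's code. The idea: describe the present by the last few observed values, find past moments that looked most similar, and predict that the future will resemble what followed them. The logistic-map test data (r = 3.9, a setting usually described as chaotic), the embedding dimension E = 2, and all function names are assumptions for this example; E is still a choice the analyst makes, so the method avoids a parametric model rather than every knob.

```python
import math

def embed(series, E):
    """Delay vectors (x[t], x[t-1], ..., x[t-E+1]) for each t with enough history."""
    return [tuple(series[t - k] for k in range(E)) for t in range(E - 1, len(series))]

def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def simplex_forecast(library, query, E=2):
    """Predict the value after `query` from its E + 1 nearest past analogues."""
    vectors = embed(library, E)[:-1]   # the last vector has no known successor
    successors = library[E:]           # the value that followed each vector
    nearest = sorted(
        (distance(v, query), nxt) for v, nxt in zip(vectors, successors)
    )[:E + 1]
    d_min = nearest[0][0]
    if d_min == 0.0:                   # exact analogue found: use it alone
        weights = [1.0 if d == 0.0 else 0.0 for d, _ in nearest]
    else:                              # closer analogues get more weight
        weights = [math.exp(-d / d_min) for d, _ in nearest]
    return sum(w * nxt for w, (_, nxt) in zip(weights, nearest)) / sum(weights)

# Toy data: 500 points as the library, 100 held out for testing.
series = [0.2]
for _ in range(599):
    series.append(3.9 * series[-1] * (1.0 - series[-1]))
library, held_out = series[:500], series[500:]

E = 2
errors = []
for t in range(E - 1, len(held_out) - 1):
    query = tuple(held_out[t - k] for k in range(E))
    errors.append(abs(simplex_forecast(library, query, E) - held_out[t + 1]))
mean_abs_error = sum(errors) / len(errors)
print(f"mean one-step error: {mean_abs_error:.4f}")
```

On clean, low-dimensional toy data like this, one-step forecasts come out accurate. Noisy field data with short records is a much harder test, which is presumably why a long-running experiment is needed.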
Sugihara himself applied the framework across numerous fields (including a stint as a quant manager at Deutsche Bank); however, the website for his consulting practice only mentions systems he’s modeled, not instances where his work was put into practice. His work as an investment quant sounds like exactly the kind of thing that could show a decisive success, except there’s no evidence he was successful and mild evidence he wasn’t.
Process note: one of the reasons I believed in the story of Chaos Theory as told in the classic Gleick book was that I (Elizabeth) studied theoretical ecology in college, and distinctly remembered learning chaos theory in that context. This let me confirm a lot of Gleick’s claims about ecology, which made me trust his claims about other fields more. I recently talked to the professor who taught me and learned that in the mid 00s he was one of only 2 or 3 ecologists taking chaos really seriously. If I’d gone to almost any other university at the time, I would not have walked out respecting chaos theory as a tool for ecology.
Weather forecasting- Yes (Alex)
Weather forecasting seems to be a domain where ideas from chaos theory had substantial causal impact. That said, it is still unclear to me (Alex) how much this impact depended on the exact mathematical content of chaos theory; it’s not like current weather modeling software is importing a library called chaos.cpp. I think I can imagine a world where people realized early on that weather was pretty complicated, and that predicting it required techniques that didn’t rely on common simplifying assumptions, like locally linear approximations, or using maximum likelihood estimates.
Here is a brief historical narrative, to give you a sense of the entanglement between these two fields. Most of the below can be found in “Roots of Ensemble Forecasting” (Lewis 2004), although I have seen much of it corroborated across many other sources.
By the 1940s, weather forecasting was still being done manually, and there was not much ability to predict that far into the future. As large electronic computers were being developed, it became clear that they could provide substantially more computation for this purpose, perhaps making longer predictions feasible. John von Neumann was especially vocally optimistic on this front.
Initially people assumed that we would make useful weather predictions by doing the following: 1) formulate a dynamical model of the weather based on our knowledge of physics, 2) program that model into the computer, 3) take measurements of current conditions, and 4) feed those measurements into the computer to extrapolate a prediction for a reasonable timespan into the future. People knew this would be very challenging, and they expected to have to crank up the amount of compute, the number of measurements, and the accuracy of their model in order to improve their forecasts. These efforts began to acquire resources and governmental bodies to give it a serious go. Researchers developed simple models, which would have systematic errors, and then people would go on to attempt to find corrections to these errors. It sounds like these efforts were very much in the spirit of pragmatism, though not entirely consistent with known physical principles (like conservation of energy).
After a decade or so, various scientists began to suggest that there was something missing from the above scheme. Perhaps, instead of using our best-guess deterministic model run on our best-guess set of observations, we should instead run multiple forecasts, with variations in the models and input data. In case our best guess failed to predict some key phenomenon like a storm, this “ensemble” strategy may at least show the storm in one of its outputs. That would at least let us know to start paying attention to that possibility.
It sounds like there was some amount of resistance to this, though not a huge amount. Further work was done to make estimates of the limits of predictability based on the growth rate of errors (Philip Thompson, E. Novikov) and construct more physically principled models.
Around this point (the mid 1950s) enters Edward Lorenz, now known as one of the founding fathers of chaos theory. The oft-related anecdote is that he accidentally noticed sensitive dependence on initial conditions while doing computer simulations of weather. But in addition to this discovery, he was actively trying to convince people in weather forecasting that their simplifying assumptions were problematic. He impacted the field both by producing much good work and by being an active proponent of these new ideas. It is especially notable that the Lorenz system, a paradigmatic chaotic system, came from his deliberate attempt to take a real weather model (of convection cells in a temperature differential) and simplify it down to the smallest possible system that maintained both the chaotic behavior and the reflection of reality. By cutting it down to three dimensions, he allowed people to see how a deterministic system could display chaotic behavior, with spectacular visuals.
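The three-variable Lorenz system is small enough to simulate in a few lines, which makes the "deterministic but unpredictable" point easy to check for yourself. The sketch below uses the classic parameter values (sigma = 10, rho = 28, beta = 8/3); the fourth-order Runge-Kutta integrator, the step size, and the starting points are choices made for this example rather than anything taken from the sources above.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))
    k1 = lorenz(state)
    k2 = lorenz(shifted(k1, dt / 2))
    k3 = lorenz(shifted(k2, dt / 2))
    k4 = lorenz(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def trajectory(state, steps, dt=0.01):
    out = [state]
    for _ in range(steps):
        state = rk4_step(state, dt)
        out.append(state)
    return out

# Two runs whose starting points differ in the eighth decimal place.
run_a = trajectory((1.0, 1.0, 1.0), 4000)
run_b = trajectory((1.0 + 1e-8, 1.0, 1.0), 4000)
separation = [
    max(abs(p - q) for p, q in zip(u, v)) for u, v in zip(run_a, run_b)
]

print("separation at start:", separation[0])
print("largest separation:", max(separation))
```

Both runs obey exactly the same equations, yet the tiny initial difference grows until the two trajectories are on unrelated parts of the attractor. Plotting x against z for either run gives the familiar butterfly shape.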
Through continued work (especially Edward Epstein’s 1969 paper “Stochastic Dynamic Prediction”) people became convinced that weather forecasting needed to be done with some kind of ensemble method (i.e. not just using one predicted outcome). However, unlike the Lorenz system, useful weather models are very complicated. It is not feasible to use a strategy where, for example, you input a prior probability distribution over your high-dimensional observation vector and then analytically calculate out the mean and standard deviation etc. of each of the desired future observations. Instead, you need to use a technique like Monte Carlo, where you randomly sample from the prior distribution, and run each of those individual data points through the model, producing a distribution of outputs.
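The Monte Carlo recipe described above can be sketched with a toy stand-in for the weather model. Everything specific here is an assumption for illustration: the logistic map plays the role of the "model", the measurement error is taken to be Gaussian, and the function names and numbers are made up. Real ensemble systems perturb a very high-dimensional state and are far more careful about how members are chosen.

```python
import random
import statistics

def toy_model(x, steps, r=4.0):
    """Stand-in for a forecast model: iterate the logistic map `steps` times."""
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x

def ensemble_forecast(observed, obs_error, lead, members=500, seed=0):
    """Sample plausible initial states around the observation and run each forward."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(members):
        x0 = min(max(rng.gauss(observed, obs_error), 0.0), 1.0)  # keep in [0, 1]
        outcomes.append(toy_model(x0, lead))
    return outcomes

near = ensemble_forecast(observed=0.3, obs_error=1e-4, lead=2)
far = ensemble_forecast(observed=0.3, obs_error=1e-4, lead=40)

spread_near = statistics.pstdev(near)
spread_far = statistics.pstdev(far)
print(f"spread at lead 2:  {spread_near:.5f}")
print(f"spread at lead 40: {spread_far:.5f}")
print("share of members above 0.9 at lead 40:",
      sum(1 for v in far if v > 0.9) / len(far))
```

At short lead times the members agree, so a single forecast would have done. At long lead times they fan out, and the useful output is the distribution (for example, the share of members showing an extreme value) rather than any one run.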
But now we have another problem: instead of calculating one prediction, you are calculating many. There is an inherent trade-off in how to use your limited compute budget. So for something like two decades, people continued to use the one-best-guess method while computing got faster, cheaper and more parallelized. During this wait, researchers worked on technical issues, like just how much uncertainty they should expect from specific weather models, and how exactly to choose the ensemble members. (It turns out that people do not even use the “ideal” Monte Carlo method mentioned above, and instead use heuristic techniques involving things like “singular vectors” and “breeding vectors”.)
In the early 1990s, the major national weather forecast agencies finally switched to delivering probabilistic forecasts from ensemble prediction systems. The usefulness of these improved predictions is universally recognized; they are critical not just for deciding whether to pack an extra jacket, but also for evacuation planning, deciding when to harvest crops, and staging military operations.
Fractals- Yes (Elizabeth)
Fractals have been credited for a number of advancements, including better mapping software, better antennas, and Nassim Taleb’s investing strategy. I (Elizabeth) am unclear how much the mathematics of fractals were absolutely necessary for these developments (and would bet against for that last one), but they might well be on the causal path in practice.
Mandelbrot’s work on phone line errors is more upstream than downstream of fractals, but produced legible economic value by demonstrating that phone companies couldn’t solve errors via their existing path of more and more powerful phone lines. Instead, they needed redundancy to compensate for the errors that would inevitably occur. Again I feel like it doesn’t take a specific mathematical theory to consider redundancy as a solution, but that may be because I grew up in a post-fractal world where the idea was in the water supply. And then I learned the details of TCP/IP where redundancy is baked in.
Final thoughts
Every five hours we spent on this, we changed our mind about how important chaos theory was. Elizabeth discovered the fractals applications after she was officially done and waiting for Alex to finish his part.
We both find the whole brand of chaos confusing. The Wikipedia page on fractals devotes many misleading paragraphs to applications that never made it into practice. But nowhere does it mention fractal antennas, which first created economic value 30 years ago and now power cell phones and wifi. It’s almost like unproductive fields rush to invoke chaos to improve their PR, while productive applications don’t bother. It’s not that they hide it; they just don’t go out of their way to promote themselves and chaos.
Another major thread that came up was that there are a number of cases that benefited from the concepts of uncertainty and unpredictability, but didn’t use any actual chaos math. I have a hunch that chaos may have provided cover to many projects whose funders and bosses would otherwise have demanded an impossible amount of predictability. Formal chaos shouldn’t have been necessary for this, but working around human stupidity is an application.
Acknowledgements
Thank you to Lightspeed Grants and Elizabeth’s Patreon patrons for supporting her part of this work. Did you know it’s pledge drive week at AcesoUnderGlass.com?
[1] This is a follow-up to Elizabeth’s 2022 work on plate tectonics
[2] The second most popular is “oh you should read that book by Gleick”
[3] In Chaos: a very short introduction, page 44, and in this youtube video
[4] The other is epidemiology