Gaia Network: An Illustrated Primer

Primarily written by Rafael Kaufmann

In our first LW post on the Gaia Network, we framed it as a solution to the challenges of building safe, transformative AI. However, the true potential of Gaia as a “world-wide web of causal models” goes far beyond that, and in fact, justifying it in terms of its value to other use cases is key to showing its viability for AI safety. At the same time, the previous post focused more on the “what” and “why”, and didn’t really talk much about the “how”. In this piece, we’ll correct both of these flaws: we’ll visually walk through the Gaia Network’s mechanics, with concrete use cases in mind.

The first two parts will cover use cases related to making science more effective and efficient. These would already be sufficient to justify the importance of building the Gaia Network: as science is the only news, improving science can have a huge positive multiplier effect on our future survival and prosperity. Yet despite a workforce of 8.8 million researchers and funding that adds up to 1.7% of global GDP, science is rightly criticized for inefficiency and limited accountability. The third part will expand beyond the epistemic (scientific) benefits of the Gaia Network and towards pragmatic impact—ie, making all decision-making more effective and efficient, which impacts the entire world population and GDP. And the last two sections will focus on the applications of the Gaia Network on existential risk—first specifically with regard to AI safety, and finally as a general tool for collective sensemaking and coordination around the Metacrisis.

For brevity’s sake, we will not cover any of the implementation details or mathematical grounding. We’ll focus on the core concepts and capabilities, and try to explain them in plain language. We’ll also skim over many of the “hard parts”: the economics and trust modeling. Finally, we will not cover the arguments for convergence and resilience of the network; these have already been sketched out in our previous paper, and merit a more formal and in-depth analysis than we can incorporate into this primer. If there’s some hand-waving below that makes you uncomfortable, please let us know in the comments and we will attempt to address your concerns.

The beginning spends some time on Bayesian statistics, as these concepts are foundational to the Gaia architecture. Feel free to skip the footnotes if you’re overwhelmed. Also, note that everything below assumes explicit or clear-box models (where model parameters have names that reflect their intended semantics). In a future article, we’ll discuss how to incorporate black-box models like neural networks, where most components (neurons) have opaque semantics (or are mostly polysemantic).

So let’s get started. Fast forward to a few years from now…[1]

Better science bottom-up

You’re a plant geneticist working on the analysis of some experimental results that you want to publish. You have a model of how your new maize strain improves yields, and you’ve tested it against an experimental data set. (In the example pseudocode below, we use a Python-based syntax for concreteness, but this could be implemented in any statistical analysis software or framework, like R or Julia or even Excel spreadsheets.)

def model(strainplanted, soiltype, rainfall, cropyield):
	## Set parameter priors
	deltayield ~ Normal(...)
	avgyield_control ~ Normal(...)
	avgyield_experimental ~ Normal(avgyield_control + deltayield, ...)
	𝛽_soiltype ~ Normal(...)
	𝛽_rainfall ~ Normal(...)
	...

	## Define likelihood of the target variable cropyield given the covariates and parameters:
	## p(cropyield | strainplanted, soiltype, rainfall, ...params)
	with field = plate("field"):
		with t = plate("t"):
			baseyield = avgyield_control if strainplanted == "control" else avgyield_experimental
			soiltype_effect = 𝛽_soiltype[soiltype]
			rainfall_effect = 𝛽_rainfall * rainfall
			cropyield ~ Normal(baseyield + soiltype_effect + rainfall_effect, ...)
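
| field | t | strainplanted | soiltype | rainfall | cropyield |
|---|---|---|---|---|---|
| Martha’s Meadow | 2023 | control | good | 0.5 | 20 |
| Martha’s Meadow | 2024 | control | good | 0.4 | 18 |
| Ada’s Acres | 2023 | control | bad | 0.8 | 15 |
| Ada’s Acres | 2024 | control | bad | 0.1 | 12 |
| Peter’s Patch | 2023 | experimental | good | 0.4 | 35 |
| Peter’s Patch | 2024 | experimental | good | 0.4 | 41 |
| Lee’s Lot | 2023 | experimental | bad | 0.1 | 33 |
| Lee’s Lot | 2024 | experimental | bad | 0.2 | 34 |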

Like most scientific analyses, this is a hierarchical model, where your local variables represent observations or states of the current context – say, the yield in each given season and field – and are influenced by parameters that represent more generic or abstract variables – average yield for your strain across all fields and seasons, which in turn depends on the expected yield improvement from a given genomics technique. (The latter is generic enough that it’s not really specific to your study, which is why it’s highlighted in orange below.)

Running this model on a data set can be understood as propagating information through the graph. First, the priors for the parameters inform the expected distributions for the local variables. Then as we gather observations for some variables, that information flows back up, giving updated posteriors for the parameters. The amount of information (uncertainty reduction or negentropy) being propagated can be understood as a flow on this graph[2] and indeed can be estimated as an output of many kinds of common inference algorithms.[3]
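To make the "information flowing back up" idea concrete, here is a minimal sketch using a conjugate Normal-Normal update on a single parameter. It is not the Gaia engine's algorithm: the prior, the known observation noise `obs_sd`, and the use of KL divergence (in nats) as a stand-in for "information propagated" are all our illustrative assumptions. The yields are the experimental-strain rows from the data set above.

```python
import math

def normal_update(prior_mean, prior_sd, obs, obs_sd):
    """Conjugate update of a Normal prior on a mean, given Normal observations
    with a known standard deviation obs_sd (an assumption of this sketch)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(obs) / obs_sd ** 2
    post_prec = prior_prec + data_prec
    sample_mean = sum(obs) / len(obs)
    post_mean = (prior_prec * prior_mean + data_prec * sample_mean) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

def kl_normal(m1, s1, m0, s0):
    """KL( N(m1, s1^2) || N(m0, s0^2) ) in nats."""
    return math.log(s0 / s1) + (s1 ** 2 + (m1 - m0) ** 2) / (2 * s0 ** 2) - 0.5

# Observed yields for the experimental strain (all four field-seasons).
experimental_yields = [35, 41, 33, 34]

# Hypothetical wide prior on avgyield_experimental: (mean, sd).
prior = (25.0, 20.0)

# Observations flow "back up" to the parameter, giving its posterior.
post = normal_update(*prior, experimental_yields, obs_sd=5.0)

# How far the posterior moved from the prior: our proxy for information gained.
info_gain = kl_normal(*post, *prior)
print(post, info_gain)
```

The posterior is narrower than the prior and sits close to the sample mean; the KL term grows both when the mean moves and when the uncertainty shrinks.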

It’s really useful to think informally of the free energy of the model as the discrepancy between the inferred distribution and the information we have available, between priors and observations.[4] Zero free energy is the ideal state in which all information has been fully incorporated into the inference, is completely internally consistent, and explains away all the uncertainty in the system. Typically we can’t achieve zero free energy, as there’s always some uncertainty (whether aleatoric or epistemic), but we want to minimize it so that our model doesn’t have “extra”, unwarranted uncertainty. To get a better understanding of the concept of free energy and its role in Bayesian modeling and active inference, there are many excellent resources available; we particularly recommend this paper by Gottwald and Braun. Going forward, you can just think of Free Energy Reduction (FER) as a standard unit of account used by each model.

Source: Gottwald and Braun (2020)

But here’s a problem: How do you set priors for your parameters in the first place? Sure, you expect your strain to increase yield, but it would be circular reasoning to build that expectation into the priors. The common practice is to use a weakly informative prior (also known as a regularizing prior), which incorporates only information that you have an objective or incontrovertible reason to believe in (e.g., penalizing unreasonably low or high values). This can be seen as “not sneaking information into the model”, to avoid fooling yourself (and your stakeholders, the people who will use your study results to make decisions) by publishing unjustifiably confident results.[5] However, typically, most parameters in your study do not represent hypotheses you’re actively trying to learn about; instead, they represent assumptions that are justified by previous studies or expert opinions. For those, you want the opposite kind of prior, a sharp or strong prior.
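The difference between the two kinds of prior is easy to see numerically. The sketch below applies the same control-group yields to one wide prior and one sharp prior, again using a conjugate Normal update with a known observation noise. All prior values and `obs_sd` are made up for illustration.

```python
def posterior(prior_mean, prior_sd, obs, obs_sd):
    """Normal-Normal conjugate posterior (mean, sd) with known obs_sd."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(obs) / obs_sd ** 2
    post_prec = prior_prec + data_prec
    mean = (prior_prec * prior_mean + data_prec * sum(obs) / len(obs)) / post_prec
    return mean, post_prec ** -0.5

control_yields = [20, 18, 15, 12]   # control rows of the data set; mean 16.25

# Weakly informative prior: wide, so the data dominate.
weak_mean, weak_sd = posterior(20.0, 50.0, control_yields, obs_sd=4.0)

# Sharp prior: narrow, so the prior dominates unless there is a lot of data.
sharp_mean, sharp_sd = posterior(20.0, 1.0, control_yields, obs_sd=4.0)

print(round(weak_mean, 2), round(sharp_mean, 2))
```

With the wide prior the posterior mean lands almost exactly on the sample mean; with the sharp prior it barely moves from 20. That is the desired behavior for a well-grounded assumption, and a liability for a hypothesis under test.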

In the past, if you were very lucky, there would be a published meta-analysis about the parameters for each of your assumptions, to save you the pain of combing through thousands of PDFs, understanding each, and copy-pasting numbers from the relevant tables into your workspace. Unfortunately, this work was so mind-numbingly boring, expensive, thankless and error-prone, that high-quality meta-analyses were exceedingly rare.[6] To make matters worse, unlike the toy example above, real-world scientific models often utilize hundreds to thousands of parameters, and often far more if machine learning is used. Gathering the outputs of every relevant study for every relevant parameter, by hand, was infeasible, so we ended up with constant wheel reinvention and cargo-culted, unjustified assumptions, often used as point estimates with no uncertainty attached.[7]

No longer: now you can simply connect your local model to the Gaia Network by annotating each parameter (in our example, average yield and drought tolerance, for both the control group of traditional maize and the experimental group of genetically modified maize). Your annotation attaches each parameter to a global namespace called the Gaia Ontology. You can browse the Ontology to see the exact definition of the parameter, with example code, and make sure you’re using the right one. Many other scientists have published their studies on the Gaia Network; each published study contributes a posterior distribution for its parameters, and these are algorithmically aggregated into a “sort of weighted average” called a pooled distribution.[8][9]
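Footnote 8 distinguishes pooling from mixing; a small sketch shows why the difference matters. Here each study's posterior is summarized as a (mean, variance) pair. `pool_normals` averages those parameters, while `mixture_moments` computes the moments of a weighted mixture of the same distributions. The function names, the two studies and the equal weights are hypothetical; the actual Gaia aggregation rule may combine parameters differently.

```python
def _normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

def pool_normals(posteriors, weights):
    """Parameter pooling: weighted average of each study's mean and variance."""
    w = _normalize(weights)
    mean = sum(wi * m for wi, (m, _) in zip(w, posteriors))
    var = sum(wi * v for wi, (_, v) in zip(w, posteriors))
    return mean, var

def mixture_moments(posteriors, weights):
    """Mean and variance of the mixture distribution with the same weights."""
    w = _normalize(weights)
    mean = sum(wi * m for wi, (m, _) in zip(w, posteriors))
    second_moment = sum(wi * (v + m * m) for wi, (m, v) in zip(w, posteriors))
    return mean, second_moment - mean * mean

# Two hypothetical studies' posteriors for the same parameter: (mean, variance).
studies = [(10.0, 4.0), (14.0, 4.0)]
weights = [1.0, 1.0]

pooled = pool_normals(studies, weights)      # (12.0, 4.0)
mixed = mixture_moments(studies, weights)    # (12.0, 8.0)
print(pooled, mixed)
```

Both have the same mean, but the mixture carries extra variance from the disagreement between studies, and would be bimodal if the means were far apart.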

So at inference time, the Gaia engine just queries the network for the current pooled distributions for each of these parameters – effectively conducting a meta-analysis on the fly – and adopts them as priors. [10][11]

@gaia_bind(deltayield={"v0.Agronomy.YieldImprovementPct":
						{"species":"v0.Agronomy.Species.Maize",
						"intervention":"v0.Agronomy.Genomics.CRISPR"}})
def model(strainplanted, soiltype, rainfall, cropyield):
	## Model code is unchanged

As the illustration above shows, your model is importing information from other studies in the network and using it to increase FER. Gaia keeps track of the “credit assignment”[12], which will prove valuable in the next step: publishing your work.
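For readers wondering what an annotation like `@gaia_bind` might do under the hood, here is one possible sketch. Everything except the decorator name and the ontology identifiers is hypothetical: `MOCK_POOLED` stands in for a network query, `resolve_priors` and its default prior are our inventions, and the pooled numbers are made up.

```python
from typing import Dict, Tuple

# Hypothetical stand-in for the network: ontology key -> pooled (mean, sd).
MOCK_POOLED: Dict[Tuple, Tuple[float, float]] = {}

def _ontology_key(binding: dict) -> Tuple:
    """Turn {ontology_id: {qualifier: value}} into a hashable lookup key."""
    (ontology_id, qualifiers), = binding.items()
    return (ontology_id, tuple(sorted(qualifiers.items())))

def gaia_bind(**bindings):
    """Attach ontology bindings to a model function without changing its code."""
    def decorator(fn):
        fn.gaia_bindings = {param: _ontology_key(b) for param, b in bindings.items()}
        return fn
    return decorator

def resolve_priors(fn, default=(0.0, 100.0)):
    """Look up the pooled distribution for each bound parameter, falling back
    to a wide default prior when the (mock) network has nothing to offer."""
    return {p: MOCK_POOLED.get(k, default) for p, k in fn.gaia_bindings.items()}

@gaia_bind(deltayield={"v0.Agronomy.YieldImprovementPct":
                       {"species": "v0.Agronomy.Species.Maize",
                        "intervention": "v0.Agronomy.Genomics.CRISPR"}})
def model(strainplanted, soiltype, rainfall, cropyield):
    ...  # model code is unchanged

# Pretend the network already holds a pooled posterior for this parameter.
MOCK_POOLED[model.gaia_bindings["deltayield"]] = (12.0, 3.0)
priors = resolve_priors(model)
print(priors)
```

The design point is that the binding lives outside the model body, so the same model can be run with or without network-supplied priors.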

To contribute to the network, all you have to do is commit your study to GitHub. Gaia will save your posterior distributions for all parameters that you’ve annotated, and share them back with every other study in the network. Your study and your peer studies each have an update chain, an append-only sequence of distributions representing the state of posteriors from each study’s perspective. These are effectively independent representations of the state of knowledge of the parameters in question.[13]
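An update chain can be pictured as a very small data structure. The sketch below is one way to model it, assuming each entry is a Normal summary tagged with the study that contributed it. The class and field names are ours, not part of any published Gaia interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass(frozen=True)
class Update:
    """One immutable entry: a posterior summary and the study it came from."""
    source: str
    mean: float
    sd: float

@dataclass
class UpdateChain:
    """Append-only sequence of posterior states for one parameter,
    as seen from one study's perspective."""
    parameter: str
    _entries: List[Update] = field(default_factory=list)

    def append(self, update: Update) -> None:
        self._entries.append(update)

    def head(self) -> Optional[Update]:
        """Current state of knowledge, or None if nothing has been recorded."""
        return self._entries[-1] if self._entries else None

    def history(self) -> Tuple[Update, ...]:
        """Read-only view of the chain; there is no way to edit or delete."""
        return tuple(self._entries)

chain = UpdateChain("deltayield")
chain.append(Update("own-study", 11.0, 4.0))
chain.append(Update("peer-study", 12.5, 3.0))
print(chain.head())
```

Because entries are frozen and the chain exposes only `append`, each study keeps a full audit trail of how its view of a parameter evolved.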

So, immediately you can see that any other agricultural studies about different experimental strains will have their posteriors affected by adding your study to the pool of updates.[14] This effect can be quite large if few studies are being pooled, but it converges, so that after some point the updates become minimal.[15]
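A quick numerical illustration of why the effect of one more study shrinks as the pool grows. We assume the simplest possible pooling rule, an equal-weight average of posterior means; real weights would come from the metamodels discussed further down. The numbers are invented.

```python
def pooled_mean(means):
    """Equal-weight pooling of posterior means (a simplifying assumption)."""
    return sum(means) / len(means)

def shift_from_new_study(existing_means, new_mean):
    """How far the pooled mean moves when one more study is added."""
    before = pooled_mean(existing_means)
    after = pooled_mean(existing_means + [new_mean])
    return abs(after - before)

small_pool = [10.0, 12.0]          # two prior studies
large_pool = [10.0, 12.0] * 50     # a hundred prior studies, same average
surprising_result = 20.0

small_shift = shift_from_new_study(small_pool, surprising_result)
large_shift = shift_from_new_study(large_pool, surprising_result)
print(small_shift, large_shift)
```

With two studies in the pool, the new result moves the pooled mean by 3 units; with a hundred, by less than 0.1.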

But Gaia doesn’t just propagate posterior updates to “sibling” studies: If there are higher-level models for which your parameter is a leaf, it will propagate up to those as well. For instance, a model that forecasts advances in crop technology and their impact on global food security:

Note that, by publishing your model on the Network, you’re not exporting any information other than updates to the values of the specific parameters that you’ve connected – in particular, you’re not sharing any of the underlying data. (This is a “privacy-centric” inference approach, analogous to federated learning. In a follow-up article, we’ll discuss how we can solve the problems of trust that this imposes.)

As mentioned above, the Gaia Protocol assigns credit to every publication (also called attribution). The mechanism for attribution is primarily “subjective”: each node (ie, each study) just measures the net FER impact of each contribution as it’s incorporated into its own update chain.

Above, we mentioned that the pooled distribution is a “sort of weighted average” of each study’s posterior. So where do the weights come from? The Gaia Protocol also answers this question in a bottom-up, “subjective” way. Nodes can independently infer the “right” weights for each parameter and study. To do so, they can use arbitrary “metamodels”. These range from simple “beauty contest” models that just aggregate the net FER impact that other studies have attributed to a contribution; to more sophisticated “web of trust” models that try to factor out low-quality studies or deliberate fraud via social-network-type analysis; to true “metamodels” that infer study quality and parameter relevance, using outside data such as the publisher’s credentials, analyses of the model code and third-party verifications of the data. This means that, at least in the short term, the pooled distribution for a given parameter is actually different depending on which node you ask! Even if all nodes have seen all the updates in the same order, they can give arbitrarily different weights to them. But as the different metamodels themselves accumulate quality signals, nodes eventually converge on a shared inference of the right metamodel to use on which kind of parameter. (As discussed in the introduction, we will not attempt to justify the claim that this protocol converges and is resilient to noise and misinformation/fraud. For now, see the arguments here.)
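Here is a toy version of the simplest metamodel mentioned above, the "beauty contest", showing how two nodes can hold different pooled values for the same parameter. The clipping of negative net FER to zero, the uniform fallback, and all numbers are assumptions of this sketch.

```python
def beauty_contest_weights(attributed_fer):
    """attributed_fer maps study -> list of net FER credits that peers reported.
    Weight each study by its total credit, clipped at zero (our assumption)."""
    totals = {s: max(sum(credits), 0.0) for s, credits in attributed_fer.items()}
    norm = sum(totals.values())
    if norm == 0:
        # No signal at all: fall back to uniform weights.
        return {s: 1.0 / len(totals) for s in totals}
    return {s: t / norm for s, t in totals.items()}

def pooled_mean(posterior_means, weights):
    return sum(weights[s] * posterior_means[s] for s in posterior_means)

posterior_means = {"study_1": 10.0, "study_2": 14.0}

# Each node has seen different attribution reports, so it infers different weights.
node_a_view = {"study_1": [2.0, 1.0], "study_2": [1.0]}
node_b_view = {"study_1": [1.0], "study_2": [3.0]}

pooled_a = pooled_mean(posterior_means, beauty_contest_weights(node_a_view))
pooled_b = pooled_mean(posterior_means, beauty_contest_weights(node_b_view))
print(pooled_a, pooled_b)   # 11.0 13.0
```

Node A trusts study 1 more and lands at 11; node B trusts study 2 more and lands at 13. Convergence, in this picture, means the nodes' views of the attribution data (and their choice of metamodel) come to agree.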

Funding science – retroactively and prospectively

Now approaching this from the opposite perspective, say you’re an analyst at a philanthropic foundation, trying to make recommendations for a prize that will be awarded to the most impactful scientific studies. Rather than solely rely on recommendations from the scientific community, or use “impact factors” that just measure popularity, you can query the Gaia Network to get quantitative, apples-to-apples impact metrics.

First, we should just note that being able to understand the “graph of science” in a live, transparent way – what are the research questions, how well developed, how much intensity in explore vs exploit mode, and how they connect to each other – is a game-changer. In the past, you needed to pay expensive fees for products like Web of Science and Scopus, which were based on manual curation and benefitted from the opaqueness of text-on-PDFs as the primary means of scientific communication. Having all the world’s science directly represented as machine-readable and connected models on Gaia – just like code and its dependencies on a package manager – makes all analytics orders of magnitude easier.

Now, back to the question of impact. Here we should distinguish two kinds of impact: epistemic impact—how much a given study has contributed to reducing uncertainty in the Network; and pragmatic impact—how much it has contributed to improving decisions. We’ll leave the pragmatic impact for later and focus on the epistemic impact for now.

So, for every model on the network, it’s easy to compute how much it contributed to FER flow across the network – what’s called credit assignment in neural networks. We just look at the net flow across the model boundary, which is accounted by the Gaia Protocol:

Some care needs to be taken here. First of all, note that the FER credited to a model due to its contributions is always computed by the model that’s receiving the contributions (it’s a subjective value). Plus, there might be significant differences in modeling practices between different fields, which may distort calculations. Later on, when we talk about economics, we’ll see that the Protocol also needs a way to turn that subjective value accounting into intersubjective, mutually agreed upon “local exchange rates”. For now, let’s say that you compute a normalization constant for each domain and use it to get a normalized, apples-to-apples net FER flow across domains.
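As a rough sketch of that accounting: treat every credit as a directed edge (contributing model, receiving model, FER credited by the receiver), take each model's net flow as credits earned minus credits granted, and divide by a per-domain scale. Reading "net flow" this way, and choosing the mean absolute net flow as the normalization constant, are our assumptions. The models and values are invented.

```python
from collections import defaultdict

def net_fer_flow(credits):
    """credits: iterable of (source_model, receiving_model, fer).
    Net flow = FER credited to a model minus FER it credited to others."""
    net = defaultdict(float)
    for source, receiver, fer in credits:
        net[source] += fer
        net[receiver] -= fer
    return dict(net)

def normalize_by_domain(net, domain_of):
    """Divide each model's net flow by its domain's mean absolute net flow."""
    by_domain = defaultdict(list)
    for model, value in net.items():
        by_domain[domain_of[model]].append(abs(value))
    scale = {d: sum(v) / len(v) for d, v in by_domain.items()}
    return {m: (v / scale[domain_of[m]] if scale[domain_of[m]] else 0.0)
            for m, v in net.items()}

credits = [
    ("maize_study", "food_security", 4.0),
    ("soil_study", "maize_study", 1.0),
    ("trial_a", "trial_b", 40.0),   # a field whose models report FER on a larger scale
]
domain_of = {
    "maize_study": "agronomy", "soil_study": "agronomy", "food_security": "agronomy",
    "trial_a": "medicine", "trial_b": "medicine",
}

net = net_fer_flow(credits)
normalized = normalize_by_domain(net, domain_of)
print(net["maize_study"], normalized["maize_study"], normalized["trial_a"])
```

Raw FER makes `trial_a` look more than ten times as impactful as `maize_study`; after normalization the two are comparable.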

So this covers retroactive funding in the form of prizes. But this isn’t (and shouldn’t be) how most science gets funded. Most researchers cannot internalize the risk and cost of self-funding their work upfront and hoping for retroactive funding later. Instead, funders – who have access to cheaper capital costs, lower marginal risk sensitivity, and the other advantages that come with a big pile of cash – contract with researchers upfront to trade capital now for a future flow of impact. Before the Gaia Network, establishing effective contracts was very challenging, as it was extremely hard to predict impact, even for the researchers themselves, let alone for the funders. (In economic parlance, it was a classic principal-agent problem created by uncertainty and information asymmetry.) Now, the Gaia Network itself provides the solution: it contains metascience models that model the flow of FER across the network and use it to design interventions – adding more models and more data to specific fields and individual lines of research – that are likely to deliver the highest future flow of FER. Funders and researchers can use these models equally to guide where they should spend the most time and resources.[16] Compared with the recent past, where there were no meaningful metrics of scientific productivity or value added, let alone predictive models of how to improve these, the Gaia Network is a game-changer for science funding.

A distributed oracle for decision-making

The above covers the advancement of science. However, the same capabilities can aid any decision-making that pertains to the real world[17] – what we’ve called pragmatic impact above. Indeed, the Gaia Network has given everyone an actionable, reliable way to “trust the science” – not just on big things like climate change and pandemics, but also on day-to-day things like your diet, exercise, relationships, and so on. And the same applies to business decision-making, which is where we’ll focus next.

Say you manage Ada’s Acres, a large farming operation in the US Midwest. You’re planning your next planting for your 30 thousand hectares, and as usual, your suppliers are trying to push you new seeds, new herbicides and all manner of hardware and software. Meanwhile, your usual buyers are all calling to let you know that global demand forecasts are through the roof, so you stand to gain a lot of money if you have an outstanding harvest. However, you’ve noticed that the soil has been increasingly poor and in need of fertilizer and that herbicide resistance has increased a lot as well. The weather has been increasingly volatile, and you know it’s a matter of time before you have a major crop failure. Maybe it’s time to start giving regenerative farming a real shot?

Luckily, your farm operations software is now connected to the Gaia Network. It gives you a predictive digital twin of your farm that directly learns not just from every scientific experiment in agronomy, but from the “natural experiments” carried out by every other farm that uses the Network. So you can simulate the effects of any combination of practices, seed strains and products and estimate the outcomes, both short-term (expected yield and probability of crop failure for the next harvest) and long-term (soil health and herbicide resistance).

So that was a “small” (operational) use case. Now let’s zoom out to strategy[18]: let’s say that you’re the CEO of Acme Foods, a major food company. In light of increased droughts and crop failures, you’re trying to invest in your supply chain to minimize the risk of supply shortages. Your innovation teams have aggregated a long list of potential investments in precision farming, genomics, and regenerative agriculture. In the past, assembling an investment portfolio out of that long list would have required a long, expensive and very political negotiation exercise. Now that all your suppliers are connected to the Gaia Network and share limited access with you, your portfolio management system becomes a distributed digital twin of your supply chain. You can run complex distributed queries across all the nodes, simulate the effects of different investment combinations and different sets of assumptions (like climate and pest spread scenarios), factor in things like unintended consequences, and pull out an aggregate like a Pareto frontier for the investment profile you want.
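The final step, pulling out a Pareto frontier, is simple once the simulations have produced an expected return and a risk figure for each candidate portfolio. In the sketch below those figures are hard-coded, hypothetical numbers; in the scenario described they would come from distributed queries across the suppliers' nodes.

```python
def pareto_frontier(options):
    """options: {name: (expected_return, risk)}.
    Keep every option that no other option beats on both criteria
    (return at least as high, risk at least as low, and not identical)."""
    def dominates(a, b):
        return a[0] >= b[0] and a[1] <= b[1] and a != b

    return sorted(
        name for name, value in options.items()
        if not any(dominates(other, value) for other in options.values())
    )

# Hypothetical simulated outcomes: (expected supply gain, probability of shortfall).
portfolios = {
    "precision_farming": (5.0, 0.2),
    "genomics": (8.0, 0.5),
    "regenerative": (6.0, 0.2),
    "mixed": (7.0, 0.6),
}

frontier = pareto_frontier(portfolios)
print(frontier)   # ['genomics', 'regenerative']
```

Here `regenerative` dominates `precision_farming` (more gain at the same risk) and `genomics` dominates `mixed`, leaving a two-point frontier for the decision-maker to choose from.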

Most of the demand for the intelligence in the Gaia Network will come from these decision engines (DEs), like the farm operations software and the portfolio management system. Combined with the ability of the Protocol to assign credit where it’s due, it can provide signals and incentives to provide a better supply of intelligence: more and better models in the places where they are most needed by decision-makers. In a future paper, we will further develop our vision of how these signals can be developed into a complete market and contracting mechanism for directing applied research, exploration and analysis: what we call the knowledge economy.

Even further, if we have “non-local” DEs that use Gaia models to design coordinated strategies that internalize the benefits of cooperation between multiple agents, then we can turn those DEs into Gaia models themselves! They become decision models performing “planning as inference” on behalf of agents (individuals and collectives), helping to solve all kinds of principal-agent problems. In the example above, the food company can use a DE not only to infer the best investments for its own goals but also to design adequate contracts and incentives that will best equalize the goals and constraints of all the players in the supply chain. This delegation economy will also be further explored in a future paper.

A distributed oracle for AI safety

The above discussion of decision-making is our link to AI safety. Yoshua Bengio has proposed to tackle AI safety by building an “AI scientist” – a comprehensive probabilistic world model that would serve as a universal gatekeeper to evaluate the safety of every high-stakes action from every AI agent, instead of attempting to design safety into agents. This is similar to Davidad’s Open Agency Architecture (OAA) proposal. But of course, developing such a monolithic, centralized and comprehensive gatekeeper from scratch would be an extremely costly and lengthy undertaking. Further, as Bengio’s proposal makes clear, the AI scientist needs to have “epistemic humility”: its evaluations need to incorporate the limitations and uncertainty of its own model so that it doesn’t confidently allow actions that seem safe at the time but turn out to be unsafe in retrospect.

We argue that the Gaia Network, including the DEs that work as decision models, qualifies perfectly for the job of a distributed AI scientist. The DEs can query the diverse and constantly evolving knowledge in the network to form an “effective world model” with epistemic humility built in. They can provide the demand signals and resources to improve and expand the world model. They can then use this model to simulate counterfactual outcomes of actions that take into account all available local context and dependencies between contexts, and use these simulations to approximately estimate probabilities for outcomes (marginalization). They can factor in the preferences and safety constraints of all agents that use the Network, which they have already shared in order to enable the DEs to help with their own decisions. This gives all the terms in Bengio’s notional risk evaluation formula (adapted from slide 17 here):
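To illustrate what "epistemic humility" can mean operationally, here is a toy Monte Carlo sketch of marginalization. It is not Bengio's formula and not a Gaia interface. We assume an action's outcome is Normal around an uncertain mean, that the mean's uncertainty comes from a pooled posterior, and that "harm" means the outcome falling below a threshold. All names and numbers are hypothetical.

```python
import random

def estimate_risk(mu_mean, mu_sd, noise_sd, harm_below, n=20000, seed=0):
    """Estimate P(outcome < harm_below) by sampling over both the model's
    uncertainty about the mean (epistemic) and outcome noise (aleatoric)."""
    rng = random.Random(seed)
    harmful = 0
    for _ in range(n):
        mu = rng.gauss(mu_mean, mu_sd)       # draw a plausible world model
        outcome = rng.gauss(mu, noise_sd)    # simulate the action in that world
        harmful += outcome < harm_below
    return harmful / n

def gatekeeper_allows(risk, max_risk=0.05):
    """Allow an action only if its estimated risk is within tolerance."""
    return risk <= max_risk

# Two actions with the same expected outcome (30) but different model uncertainty.
risk_well_studied = estimate_risk(30.0, 2.0, 3.0, harm_below=20.0)
risk_poorly_studied = estimate_risk(30.0, 10.0, 3.0, harm_below=20.0)

print(gatekeeper_allows(risk_well_studied), gatekeeper_allows(risk_poorly_studied))
```

Both actions look equally good in expectation, but the poorly studied one is blocked: the gatekeeper's own uncertainty about the world model is what pushes the estimated risk over the threshold.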

Possibly the most important aspect of this design – which comes particularly to light when comparing it to the OAA design – is that none of the above components is specific to AI safety; they are just repurposed from existing and day-to-day use cases for which the users/agents already have the incentives to share the required information with the Gaia Network and the DEs. This means that tackling AI safety is no longer “one of the most ambitious scientific projects in human history”, but rather a “fringe benefit” from our pursuit of knowledge and better decision-making. That pursuit, in turn, benefits from all improvements to its efficiency and effectiveness that have already been produced by past and ongoing advances in computational statistics and machine learning – and from all those that will be generated by the Gaia Network connecting and interoperating the many millions of such models in existence, and increasing the RoI of creating and improving models.

This outcome is not dependent on AI safety funders; nor the foibles of political will in the scientific and policy communities; nor the desire of billions of humans to independently share their preferences with an elicitor. All that is required – beyond some cheap work on core infrastructure, modeling and developer experience – is the same economic behaviors and incentives that exist today: the desire for profit, the pursuit of greater scientific knowledge, and the existence of institutions willing and able to internalize the cost of coordinated action.

An overview of this architecture, adapted from our last post, is given below.

A distributed oracle for the Metacrisis

The very same architecture helps us identify shared pathways through the Metacrisis. Below is a nice visual of the high-level causal model we have in mind when thinking about the Gaia Network’s role. By connecting all the relevant domain models and making apparent not only their interdependencies but also their common causes – the “generator functions” or underlying self-reinforcing dynamics – Gaia helps us understand likely future outcomes of the current trends and establish strategies with the highest potential for nudging our global course away from the two catastrophic attractors that currently seem most likely (chaos and totalitarianism). Not only that, but as we’ve seen, Gaia-powered DEs are also used as coordination surfaces: shared tools for establishing and monitoring contracts, treaties and institutions, with unprecedented scale and reliability. While this “infrastructure for model-augmented wisdom” doesn’t immediately or inherently solve conflicts of power and interests, it does provide a consistent, repeatable and scalable institution for achieving and retaining incremental advances towards a positive-sum, cooperative Gaia Attractor.

Source: Adapted from Potentialism, via Sloww

Conclusion: Back from the future

We just claimed that a lot will change in “a few years from now”. How realistic is this? Here’s the really good news: all the capabilities described above can be implemented with today’s technology.[19] Not only that: we’re already doing it. We have assembled several organizations and individuals into a growing Gaia Consortium, and have of course been leveraging loads of existing components and building some of our own. Examples:

  • Ocean Protocol and DefraDB: Decentralized computing and data management.

  • Fangorn (coming soon): a decentralized platform for building and performing (active) inference on Gaia-connected state space models.

  • Sentient Hubs: Federated model-based decision support.

We are simultaneously working on specific applications of the Gaia Network, focusing primarily on bioregional economies and sustainable supply chains. These have been useful for providing concrete use cases (some of which we saw above) and resourcing. But ultimately we intend to evolve this into a fully open and collaborative R&D effort to build the general-purpose capabilities described above.

If you’re interested in contributing to this work, here are some possible ways to do it:

If you’re interested in any of the above, please reach out!

  1. ^

    Below, Gaia Network and its applications are described both in present and future tense in different narration modes. To avoid confusion, note that Gaia Network is not yet implemented and deployed on a large scale.

  2. ^

    In a simple structure like this, a single backward propagation is enough, but there are cases where we need to iteratively update (message passing). For those cases, think of the net flow that is obtained after propagating up and down enough times.

  3. ^

    For instance, in variational inference algorithms, the free energy (or stochastic estimates of it) is directly used as a minimization objective. Equivalently, its negative, the Evidence Lower Bound (ELBO), is maximized.

  4. ^

    There is an additional concept of free energy associated with decision-making, corresponding to the discrepancy between the veridical posterior justified by priors and observations and the one “desired” in light of a given reward function/​model/​distribution.

  5. ^

    If you already have information that comes from past experiments, or knowledge elicited by independent experts, you can also incorporate it into the priors. The challenge is how to keep track of the grounding behind all of this imported information. This is, in a sense, what the Gaia Network does algorithmically, as we’ll see.

  6. ^
  7. ^

    Even for the parameters of interest in your study, there is a high value in having access to past studies’ posteriors: after having your posteriors “in isolation”, you now want to compare them to previous results in the literature, to check for novelty or consistency.

  8. ^

    Technically, a pooled distribution is not a weighted average of distributions (that would be a mixture distribution); instead, it’s a distribution whose parameters are a weighted average (or other combination) of the parameters of the original distributions. Just so we’re clear: here we’re talking about statistical parameters of the posteriors of scientific parameters; for instance, the mean and variance of the average yield.

  9. ^

    In practice, different studies often use different model structures and local ontologies. Sometimes these are just syntax differences, such as alternative parameterizations (e.g., centered vs. non-centered parameters), but often they represent different semantics – different statistical constructs, reflecting differences in context and/or scientific methodology. To enable aggregations to happen between models with these differences, translations are required. To this end, Gaia contributors often publish lens models that perform data translation. As an added benefit of this approach, in cases when there are different semantics that inevitably lead to a loss in translation (as W. V. O. Quine pointed out and Chris Fields has recently formalized), it’s useful for there to be a separate lens model that accounts for and “absorbs” that loss.

  10. ^

    This does mean your model is colored by using the informative Gaia posteriors as priors for a parameter of interest. But you can always turn off the annotations for those parameters to isolate the effects of the information contributed by your study (aka the likelihood).

  11. ^

    In this example, these are independent scalar parameters, but they could be any multidimensional array with any kind of internal correlation structure.

  12. ^

    See also “The Credit Assignment Problem” by Abram Demski.

  13. ^

    This is unlike a blockchain, which is designed to ensure that all nodes are “almost always” in full consensus about the entire contents of the global state (which then requires hacks like “L2” chains to improve speed and flexibility).

  14. ^

    How? It depends on the parameterization used, but in most cases, partial pooling brings posterior means closer together. You can have parameterizations with multiple modes, like a Gaussian mixture distribution, but this tends to imply that your parameter is representing multiple categories instead of a scalar and should be changed to reflect that.

  15. ^

    No matter how small, Gaia eventually propagates every nonzero update to every parameter on the network, so we can have eventual consistency. The protocol can choose to batch small updates for efficiency.

  16. ^

    Of course, no one cares about an abstract quantity like FER; they care about concrete advancements in specific areas of science. But that’s the same as saying no one cares about money, but about the goods and services they can buy with it.

  17. ^

    That primarily means we’re excluding “teach AI how to play video games” or “decide which next token to generate for a user” types of scenarios.

  18. ^

    We could zoom even further out to tackle the domains of strategy consulting, and ask more “meta” questions. What are the theories of change, how do they connect to each other, how well developed, and how much intensity is in explore vs exploit mode? We will explore these further in a follow-up article.

  19. ^

    There are some areas where current solutions aren’t fully adequate, but these are matters of incremental progress, not qualitative breakthroughs.