Imagine a world where the boundaries between human and machine have blurred, where the vast expanse of digital information has become a living, breathing entity: the dataome. This interconnected ecosystem of data, envisioned by astrobiologist Caleb Scharf, encompasses all the digital information that exists, from social media posts and online transactions to sensor readings and scientific research. Constantly growing and evolving, the dataome forms new connections between existing data points, creating a world where we are no longer mere users of technology, but an integral part of the data ecosystem itself.
You might think this idea is far-fetched, even outlandish. But it's happening. Perhaps we are even reaching a tipping point where we are no longer just surrounded by big data and AI but become data ourselves.
Our interactions with digital technologies—from the websites we visit and the searches we conduct, to the products we buy and the locations we visit—are all being captured and integrated into the dataome. As a result, the dataome is becoming an increasingly comprehensive and detailed representation of the world and its inhabitants.
You might think that the dataome is just a concept, but we argue it's how we should view our current reality. We're on the verge of an inversion of the web, where our world will no longer encompass the data world. Instead, the data world will encompass us. When this balance tips, how we interact with and understand the world around us will change forever.
By machines, for machines
AI content is exploding. This trend is only set to accelerate in the coming years, with projections suggesting that the majority of web content will be machine-generated in the near future. This shift marks a departure from a human-centric internet, where content is created, curated, and consumed by people, to an AI-driven network where machines play a primary role in generating and managing information. This transformation brings about a new era where the dynamics of information creation, dissemination, and consumption are fundamentally altered.
An autonomous information ecosystem is characterized by its ability to generate, manage, and evolve content with minimal human intervention. This also means that it will be increasingly difficult for humans to navigate and make sense of the information landscape. Our cognitive abilities are simply not equipped to keep pace with the scale and complexity of the data being generated. As a result, we will need to rely on AI assistants to mediate our interactions with the web, filtering and curating the information we consume.
Human behavior readable only by machine
We need look no further than research on hate speech online to realize how much we already rely on machines to tell us about digital human behavior. This behavior is inherently complex, with new emergent phenomena arising all the time, and we increasingly rely on machines to interpret it for us. Machines can "see" how adaptive links form between hate communities across different platforms, creating a resilient web of connections that can quickly adapt to new circumstances and evade mitigation efforts. Only the machines can perceive the creation of waves of hate content that surge through the network, influencing a broad audience.
This research reveals that online behavior is far more complex than our intuitions would have us believe. Hate is not confined to the fringes of major platforms like Facebook and Twitter but is instead sustained through adaptive link dynamics across numerous smaller platforms. These platforms act as nodes in a vast network, with links forming dynamically to bypass mitigation efforts and extend the influence of hate content into the mainstream. This decentralized and adaptive nature of hate networks creates a robust system resistant to traditional top-down control measures. Which means it is resistant to us—our actions, our agency.
We are, to a certain extent, helpless in the face of this complexity. We will have to rely on AI systems to monitor, understand, and mitigate the spread of harmful content. But hate speech is only one example. The sophistication and scale of the web, as a machine-dominated set of adaptive networks in which we are data points, now exceed our capacity for real-time analysis and intervention.
Agentic machines flip the web
Adaptive AI agents create a step change in the complexity of the web. These agents may well be quite simple. They could be email or search assistants governed by simple rules. They might have very little autonomy. Yet the sheer scale and diversity of agents in a complex web will mean that agentic AI is capable of building more intricate, adaptive online systems well beyond human intuition and scale.
The way these agents might alter the web echoes John Conway's Game of Life, where a set of basic rules can give rise to intricate and unpredictable patterns. In Conway's Game of Life, cells on a grid evolve based on a few simple rules, such as a cell dying if it has too few or too many neighbors. Despite the simplicity of these rules, the game can produce stunningly complex patterns that are difficult to predict or control. Similarly, AI agents operating on simple principles will create an internet which is even more complex and adaptive than today's.
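As a rough illustration of how little it takes, the rules described above fit in a few lines of Python. This is a minimal sketch, not a reference implementation: the names `step` and `glider` are illustrative choices, and the board is assumed to be unbounded and stored as a set of live cell coordinates.

```python
from collections import Counter

def step(live):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    # Count, for every cell on the board, how many live neighbors it has.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next generation if it has exactly 3 live neighbors,
    # or if it is alive now and has exactly 2. Every other cell dies
    # (too few or too many neighbors) or stays empty.
    return {
        cell
        for cell, n in neighbor_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

# The "glider": five cells that crawl across the grid indefinitely.
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

cells = glider
for _ in range(4):
    cells = step(cells)

# After four generations the same shape reappears, shifted one cell
# diagonally, although no rule mentions movement.
print(sorted(cells))
```

Nothing in `step` refers to motion or shape, yet the glider travels. That gap between what the rules say and what the system does is the kind of emergence the comparison with AI agents points to.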
We will have no choice but to rely on machines to read and navigate the vast amounts of information generated by these AI agents. The sheer volume and complexity of the data will be too much for human cognition to handle. These assistants will become our interface to the digital world, not so much because we will want them but because we will need them.
Convergent reality in machines
We are witnessing a shift towards what can be termed a "unified reality representation" as large multi-modal models converge on a shared reality. There is something inherently strange about this—AI models all seem to converge on the same structure. The theory is that they are collectively figuring out the nature of the world, at least as it is represented in their training data.
How does this work? As AI models grow in complexity and capability, they increasingly align in how they represent data, leading to a shared, statistical model of reality. On this view, the shared statistical model reflects a more accurate and comprehensive understanding of the world. As AI models process more data and perform a wider range of tasks, they are thought to align more closely with this idealized representation.
This matters because convergent AI models with a unified representation of the world have profound implications for human perception and understanding. First, AI becomes "ready" to take over mediating our reality. Second, as AI systems increasingly mediate our interactions with information, our view of reality becomes more filtered through these models. Third, we can use these models as telescopes into the dataome.
In this deal, we gain something remarkable: an extension of our minds far beyond our current experience. We get to peer into the dataome. By using AI models as telescopes into the dataome, we open up new realms of novelty, discovery, and creativity. A truly optimistic vision is that we get to leverage AI to extend our cognitive and creative capacities, enhancing our ability to discover and create.
We stay engaged because AI now stimulates our inherent need for novelty. AI carefully calibrates the level of uncertainty we perceive, triggering our natural drive to seek information. Machines depend on us to complete this loop because truly new data can only be created by humans. But our epistemic foraging is closely mediated: too much novelty and our cognition will grind to a congested halt.
Ultimately though, this is supposed to be a symbiotic relationship which ensures that our unique creativity and insights continually feed and enhance the AI. This, in turn, provides us with fresh and engaging experiences. But here's a really big catch: novelty must be validated by machines. They become the "minds" that determine what is considered truly new.
We will come to rely on AI agents to help us make sense of the vast amounts of information that are being generated by machines, for machines. We will increasingly turn to these agents to help us find the signal in the noise, to identify the patterns and insights that are most relevant and valuable to us. As we come to rely on these agents to filter and curate the information we consume, we are developing a new kind of intimacy and trust in our relationship with them.
This intimacy is fueled by the ability of AI agents to engage with us in ways that feel personal and emotionally resonant. Modern, recursive, human-like interfaces compel us to share (or to correct a false machine "memory"). This sets up a positive feedback cycle: we share more, the agents come to understand our information preferences and so offer us more of what we want, which in turn encourages more sharing.
The allure of AI agents will be especially powerful when we are uncertain and stressed, ironically because our stress will partly stem from the sense that we can't truly know anything anymore. As information moves so fast that what we know feels instantly outdated, we lose our grip on certainty. The overwhelming nature of modern information and our inability to unplug make AI assistance indispensable. We will be grateful for their help.
So, when we face complex problems or difficult decisions, turning to an AI assistant for guidance and support will be incredibly reassuring. We might even prefer them to actual people because they seem to understand the problem (and, in many ways, they do, as they are a big part of it). By offloading some of our cognitive and emotional burdens onto these agents, we can reduce our anxiety and uncertainty, feeling more confident as we face life's challenges. However, this reliance diminishes our ability to solve problems on our own or with other humans, further reinforcing our dependency on machines. Once we are in this loop, there is no going back.
Perhaps our relationship with AI agents will be a kind of symbiosis. We will need them as our interface with the digital world. In turn, because human curiosity and exploration are required for novelty, they will need us to provide the creative spark and the new data that keep the information ecosystem growing and evolving. In return, we gain the ability to transcend our natural limitations and explore realms beyond what evolution gave us.
The human powerhouse enclosed in the machine
We are witnessing the emergence of agentic and ubiquitous AI systems that will reshape the digital world. We will see a different level of complexity because agents, even when governed by simple rules, will be capable of building complex, adaptive systems that are beyond human intuition and scale. We will see vastly more machine content than human content: nothing will be comprehensible without a machine interpreting it for us. There may be no knowledge without machine validation. By machines, for machines is the new paradigm.
As machines become more adept at generating and analyzing data, the unique perspectives and ideas that humans bring to the table become increasingly valuable. Our creativity and curiosity serve as a vital source of novelty and innovation. Without humans, there is no path to ensure novelty and diversity.
In this context, Caleb Scharf's idea of humans serving as the "mitochondria" of the dataome is quite mind-blowing. Just as mitochondria provide energy for cellular processes, humans may come to power the dataome, supplying the fresh ideas and perspectives that keep the information ecosystem vibrant and evolving. Is this our future? That we become the creative engines driving the growth and development of the dataome, even as we become increasingly enmeshed in its complex web of connections? It sounds like something straight out of science fiction.