The $1 Trillion Question
The $1 Trillion Question, Mortality in the Age of Generative Ghosts, Your Mind and AI, How to Design AI Tutors for Learning, The Imagining Summit Preview: Adam Cutler, and Helen's Book of the Week.
Graph RAG doesn't necessarily replace knowledge graphs but can serve as a complementary tool, especially in scenarios where rapid, scalable, and dynamic summarization of large unstructured datasets is required.
Knowledge management and enterprise search are notoriously challenging endeavors. For years, people have yearned for a "Google for the enterprise" or an "Alexa, tell me sales from last quarter"-style capability. However, there are numerous reasons why it's not that straightforward. First and foremost, while enterprises might believe they possess a substantial amount of data, it pales in comparison to the vast expanse of the internet. Moreover, enterprise data is often siloed, lacking in metadata and context, and riddled with ambiguities stemming from how and why it was collected. Furthermore, access and security models are of utmost importance.
The primary tools for enhancing data access and knowledge management have centered on building knowledge graphs, which is an arduous task in itself. Then came the advent of LLMs, which sparked widespread excitement about the possibility of fine-tuning an LLM on enterprise data. Nevertheless, the same problems persist. Only banks and other highly regulated companies have successfully developed robust internal LLMs, thanks to the strict data management protocols that underpin their strong data cultures.
Now, with the emergence of RAG (retrieval augmented generation), it's possible to bypass fine-tuning and directly query a corpus of information. However, the holy grail remains the contextual and relational representation of knowledge. There's a bit of magical thinking involved here too—the notion that by somehow networking the information, knowledge and wisdom will magically emerge from the graph.
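The core RAG loop is simple enough to sketch. Below is a minimal, self-contained illustration: documents are scored against a query by term-frequency cosine similarity (standing in for a real embedding model), and the top matches are handed to the language model as context. The `answer_with_llm` function is a hypothetical stub; in a real system it would be an API call to an LLM.

```python
from collections import Counter
from math import sqrt

# Minimal RAG sketch. Term-frequency cosine similarity stands in for a
# real embedding model; answer_with_llm is a hypothetical LLM stub.

def tf_vector(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    # Rank documents by similarity to the query and keep the top k.
    qv = tf_vector(query)
    return sorted(corpus, key=lambda doc: cosine(qv, tf_vector(doc)), reverse=True)[:k]

def answer_with_llm(query: str, context: list[str]) -> str:
    # Stub: a real system would send this prompt to an LLM.
    return f"Answer '{query}' using: {' | '.join(context)}"

corpus = [
    "Q3 sales rose 12 percent in the EMEA region.",
    "The security model restricts access by business unit.",
    "Enterprise data is often siloed and lacks metadata.",
]
print(answer_with_llm("What were Q3 sales?", retrieve("Q3 sales", corpus)))
```

Note what this sketch makes plain: retrieval finds lexically or semantically similar chunks, but nothing in the loop represents relationships between chunks. That is the gap knowledge graphs aim to fill.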
Knowledge graphs provide a sophisticated method for information retrieval, presenting a holistic view of interconnected data. This approach allows for a deeper understanding of global context, going beyond mere data compression to reveal extensive connections and interactions between entities. Knowledge graphs also scale well, which is crucial for enterprises dealing with large data repositories, enabling them to maintain robust performance even as data volumes expand. Unlike vector embeddings, which can pinpoint specifics like who, what, when, and where, knowledge graphs excel at illustrating 'why': the reasons for, and deep links between, various pieces of information. This capability makes knowledge graphs uniquely powerful for contextual understanding, as they present not only data but also its interdependencies and underlying rationale.
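A toy example makes the 'why' concrete. In the sketch below, a graph is stored as labeled edges, and a path search enumerates the chains of relations connecting two entities. The entities and relations are invented for illustration; the point is that the answer is a path of relations, something a similarity lookup over either entity alone cannot produce.

```python
# Toy knowledge graph as labeled (subject, relation, object) edges.
# All entity and relation names here are illustrative only.
from collections import defaultdict

graph = defaultdict(list)  # subject -> list of (relation, object)

def add_fact(subj: str, rel: str, obj: str) -> None:
    graph[subj].append((rel, obj))

add_fact("Acme Corp", "acquired", "Widget Co")
add_fact("Widget Co", "supplies", "Gizmo Ltd")
add_fact("Acme Corp", "competes_with", "Gizmo Ltd")

def explain_paths(start: str, goal: str, path=None):
    """Enumerate relation paths from start to goal: the 'why' that a
    vector lookup of either entity alone would miss."""
    path = path or []
    if start == goal and path:
        yield path
        return
    for rel, obj in graph[start]:
        if (start, rel, obj) not in path:  # avoid revisiting an edge
            yield from explain_paths(obj, goal, path + [(start, rel, obj)])

for p in explain_paths("Acme Corp", "Gizmo Ltd"):
    print(" -> ".join(f"{s} {r} {o}" for s, r, o in p))
```

Running this surfaces both the direct relation (Acme competes with Gizmo) and the indirect one (Acme acquired a supplier of Gizmo), which is exactly the kind of interdependency the paragraph above describes.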
A less obvious emergent property of LLMs is the way they inherently construct knowledge graphs. Building these graphs, whether from public or private data, is typically challenging. However, new research from Microsoft leverages this emergent property to enhance information retrieval significantly. It achieves this by enabling the language model to query data using its own intrinsic knowledge graph.
The paper from Microsoft—From Local to Global: A Graph RAG Approach to Query-Focused Summarization—shows how an LLM can build a knowledge graph and how this can be used to essentially leapfrog both regular RAG and traditional knowledge graph building techniques.
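The pipeline the paper describes can be sketched end to end: an LLM extracts entity-relation triples from text chunks, the triples form a graph, the graph is clustered into communities, and each community is summarized for later query answering. In the sketch below, every LLM call is a stub (`extract_triples` returns canned triples, `summarize` concatenates them), and connected components stand in for the Leiden community detection the paper actually uses; all names and chunks are invented for illustration.

```python
# Sketch of the Graph RAG indexing pipeline, with all LLM calls stubbed.
# In the real system, triple extraction and community summarization are
# LLM prompts, and community detection uses the Leiden algorithm rather
# than the connected-components shortcut below.
from collections import defaultdict

def extract_triples(chunk: str) -> list[tuple[str, str, str]]:
    # Stub for the LLM extraction prompt, keyed on toy chunk ids.
    lookup = {
        "chunk_a": [("Acme", "acquired", "WidgetCo")],
        "chunk_b": [("WidgetCo", "supplies", "Gizmo")],
        "chunk_c": [("Zeta", "founded_in", "1999")],
    }
    return lookup.get(chunk, [])

def build_graph(chunks):
    adj, triples = defaultdict(set), []
    for c in chunks:
        for s, r, o in extract_triples(c):
            adj[s].add(o)
            adj[o].add(s)
            triples.append((s, r, o))
    return adj, triples

def communities(adj):
    # Connected components as a crude stand-in for Leiden clustering.
    seen, comps = set(), []
    for node in adj:
        if node in seen:
            continue
        comp, stack = set(), [node]
        while stack:
            n = stack.pop()
            if n not in comp:
                comp.add(n)
                stack.extend(adj[n] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def summarize(comp, triples):
    # Stub for the LLM community-summarization prompt.
    return "; ".join(f"{s} {r} {o}" for s, r, o in triples if s in comp)

adj, triples = build_graph(["chunk_a", "chunk_b", "chunk_c"])
for comp in communities(adj):
    print(sorted(comp), "->", summarize(comp, triples))
```

At query time, the paper's approach answers global questions map-reduce style over these community summaries rather than over raw chunks, which is how it covers a whole corpus without stuffing everything into one context window.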
The Artificiality Weekend Briefing: About AI, Not Written by AI