This week we are leaning into multiple metaphors: AI as a mirror, UX design as a window or room, and life as information. Plus: read about Michael Levin's upcoming presentation at the Imagining Summit, Helen's Book of the Week, and our upcoming events.
It’s easy to fall prey to the design illusion that because LLMs look sleek, they must be well-designed. But aesthetics alone do not equal design. As Steve Jobs once said, “Design is not just what it looks and feels like. Design is how it works.”
It’s curious that these two papers, tackling such similar ideas, came out at the same time. Is this coincidence, or does it tell us something about where the study of life and intelligence is heading?
Researchers are exploring methods to enable large language models to display more human-like, emergent reasoning capabilities.
Chain of Thought and Tree of Thoughts approaches improved reasoning but were still fundamentally linear.
Human reasoning operates more like a graph, interconnecting thoughts in parallel, dynamic networks.
Graph of Thoughts (GoT) models allow language models to reason in an interconnected, non-linear graph structure.
In an initial test, GoT outperformed other techniques in a sorting task, increasing quality while reducing costs.
For complex planning, GoT could enable simultaneously considering multiple options and constraints.
This approach may allow more sophisticated reasoning and decision-making, closer to human cognition, though whether it enables emergent capabilities remains to be seen.
A recent paper on Graph of Thoughts reasoning highlights the progression towards Agentic AI, one of our Artificiality Pro Obsessions. As AI tools become increasingly autonomous and capable of handling complex tasks with minimal supervision, their reasoning abilities, task design, and capacity to recover from failures are crucial.
To encourage Large Language Models (LLMs) to reason in more human-like ways, researchers have been exploring various methods, with a recent focus on enabling models to be more flexible in how they combine ideas. We see three big steps in prompt engineering techniques, each of which demonstrates a significant step up in the ability of prompt engineers to tap into the knowledge in a large language model: chain of thought, tree of thoughts, and, most recently, graph of thoughts reasoning.
Initially, Chain of Thought reasoning was found to enhance the effectiveness of LLMs. By breaking down complex problems into simpler components, models could reason more effectively. Humans do this too: one of the best predictors of someone’s ability to solve a problem is how early in the process they break the problem up. Multiple chains of thought advanced this technique by enabling multiple independent paths to be generated. Even with this advance, reasoning is still limited because there is no way to perform any "local exploration" such as backtracking.
Building on this, the Tree of Thoughts approach was introduced, which allowed AI models to follow multiple paths, similar to human decision-making processes, while still breaking down problems. Think of it this way: the LLM generates many thoughts (outputs), all of these thoughts sit in a structure similar to a decision tree, and the final reasoned output is the best path that can be constructed through this tree. Though an improvement, this method remained fundamentally linear in its approach.
How can AI break out of its linear thinking? It has to be able to think a bit more like a human: recursively and in a graph (network) kind of way.
Human reasoning, as highlighted by thinkers like Andy Clark, is often "loopy." We pull ideas from various sources, combine them, revisit previous thoughts, and continuously integrate new insights or preferences. Our thinking operates in parallel, is dynamic, and exhibits complex systems behavior such as phase transitions and criticality. For example, if you're working on a novel problem you might start down one path, backtrack, pause, combine an idea from a previous thought process, merge ideas, and then consider strengths and weaknesses of various ideas.
This process of loopy human reasoning is closer to a graph structure, which is prevalent in information systems, where thoughts interconnect in a network. A social network is a graph where people are nodes and their relationships are links.
A Graph of Thoughts (GoT) process allows LLMs to operate in a graph-like structure, more closely resembling human thought processes. This method represents the reasoning of LLMs as a graph, with thoughts as nodes and their dependencies and relationships as links or edges. This structure allows the combination of various thoughts in new and recursive ways. This in turn facilitates more complex and interconnected reasoning patterns, including novel transformation and aggregation of thoughts.
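To make the idea of thoughts as nodes and dependencies as edges concrete, here is a minimal Python sketch. It is our own illustration, not the researchers' code: the names `Thought`, `generate`, `aggregate`, `refine`, and `ancestors` are hypothetical, and the thought texts are placeholders rather than real model output.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # eq=False keeps identity-based hashing so thoughts can go in a set
class Thought:
    """One node in the graph: a piece of text plus edges back to its parents."""
    text: str
    parents: list["Thought"] = field(default_factory=list)


def generate(parent, texts):
    """Branch: several new thoughts that each depend on a single parent."""
    return [Thought(t, [parent]) for t in texts]


def aggregate(thoughts, text):
    """Merge: one new thought with many parents (the step a tree cannot express)."""
    return Thought(text, list(thoughts))


def refine(thought, text):
    """Loop: an improved version of a thought, linked back to the original."""
    return Thought(text, [thought])


def ancestors(thought):
    """Every earlier thought that can influence this one, found by walking edges backwards."""
    seen = set()
    stack = list(thought.parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(current.parents)
    return seen


# Placeholder texts (hypothetical); in practice each would be an LLM reply.
root = Thought("Plan an article about graphs")
outline_a, outline_b = generate(root, ["Outline A", "Outline B"])
merged = aggregate([outline_a, outline_b], "Outline combining A and B")
final = refine(merged, "Combined outline, tightened")

print(len(ancestors(final)))
```

The `aggregate` step is what distinguishes a graph from a tree: `merged` has two parents, so ideas from separate branches can be combined instead of one branch simply winning.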
For example, some nodes could model the plans for writing a paragraph while others model the actual paragraphs of text. GoT offers a structure for transforming these "thoughts" or for looping over a thought in order to enhance it. These graph-enabled transformations could enable better document merging, for example. The researchers used GoT to generate a new NDA based on several input documents that partially overlapped in their contents, with the goal of minimizing duplication while maximizing information retention.
The architecture of the system is quite complicated: a set of modules that themselves interact. The prompter prepares messages as inputs to the LLM, encoding the graph structure within the prompt. The parser extracts information from the LLM's replies and constructs a thought state. The scoring module verifies and scores the replies, either by referring back to the LLM or to a human. The controller coordinates the entire process and decides how to progress it.
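The division of labor between these modules can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: `fake_llm` is a stand-in for a real model call, the class and method names are hypothetical, and the scorer here is a simple local check rather than a call back to the LLM or to a human.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a real model (assumption): sorts the numbers after the last colon."""
    numbers = [int(x) for x in prompt.rsplit(":", 1)[1].split()]
    return " ".join(str(n) for n in sorted(numbers))


class Prompter:
    """Turns the current thought state into a message for the model."""
    def build(self, state):
        return f"Sort these numbers in ascending order: {' '.join(map(str, state))}"


class Parser:
    """Extracts a new thought state from the model's reply."""
    def parse(self, reply):
        return [int(x) for x in reply.split()]


class Scorer:
    """Counts adjacent pairs that are out of order; 0 means fully sorted."""
    def score(self, state):
        return sum(a > b for a, b in zip(state, state[1:]))


class Controller:
    """Coordinates the other modules and decides whether to keep going."""
    def __init__(self, llm, prompter, parser, scorer, max_rounds=3):
        self.llm = llm
        self.prompter = prompter
        self.parser = parser
        self.scorer = scorer
        self.max_rounds = max_rounds

    def run(self, state):
        for _ in range(self.max_rounds):
            if self.scorer.score(state) == 0:  # good enough, stop early
                break
            reply = self.llm(self.prompter.build(state))
            state = self.parser.parse(reply)
        return state


controller = Controller(fake_llm, Prompter(), Parser(), Scorer())
result = controller.run([3, 1, 2])
print(result)
```

Keeping the controller separate from the prompter, parser, and scorer means the loop logic can change (more rounds, branching, merging) without touching how prompts are written or replies are read.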
In sorting, a task that is simple for a human, the researchers found that GoT outperformed other prompting techniques. GoT prompting resulted in a 62% increase in quality over ToT, while simultaneously reducing costs by more than 30%. GoT seems to do this by improving the tradeoff between latency (the number of hops in the graph of thoughts needed to reach a given final thought) and volume (the number of preceding LLM thoughts that impact the final thought). In other words, the result demonstrates the efficiency of graph-based structures for what is essentially a search process.
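The latency and volume tradeoff is easier to see on toy graphs. The sketch below is our own illustration, not the researchers' measurement code: the function names are hypothetical, each graph is a plain dictionary mapping a thought to its parents, and we assume "hops" means the longest path from a starting thought to the final one.

```python
def volume(parents, node):
    """Number of earlier thoughts that can influence `node`."""
    seen, stack = set(), list(parents[node])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return len(seen)


def latency(parents, node):
    """Longest number of hops from any starting thought to `node` (our assumption)."""
    if not parents[node]:
        return 0
    return 1 + max(latency(parents, p) for p in parents[node])


def chain(n):
    """A single chain of n thoughts, each depending on the previous one."""
    return {i: ([i - 1] if i else []) for i in range(n)}


def merge_graph(leaves):
    """Pairwise merging, as in a merge sort: 8 leaves -> 4 -> 2 -> 1."""
    parents = {("leaf", i): [] for i in range(leaves)}
    level, depth = list(parents), 0
    while len(level) > 1:
        depth += 1
        next_level = []
        for i in range(0, len(level), 2):
            node = (depth, i // 2)
            parents[node] = level[i:i + 2]
            next_level.append(node)
        level = next_level
    return parents, level[0]


chain_graph = chain(15)
graph, final = merge_graph(8)

print("chain:", volume(chain_graph, 14), "thoughts,", latency(chain_graph, 14), "hops")
print("graph:", volume(graph, final), "thoughts,", latency(graph, final), "hops")
```

In both cases 14 earlier thoughts feed the final one, but the chain needs 14 hops to get there while the merging graph needs only 3, because merging lets many thoughts contribute in parallel.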
The baseline performance improvement is quite impressive, but even more importantly, GoT seems to get better as problems get bigger. Quality increases as the problem size (and complexity) grows. What's going on here? Unlike Input/Output, CoT, or ToT prompting schemes, Graph of Thoughts demonstrates that a graph structure can bring more thoughts to a problem and allows for those thoughts to change as the problem becomes more elaborate.
The approach is perhaps best conceptualized as a generic framework to enhance an LLM architecture without having to update the model itself. The researchers have creatively adapted graph abstraction, a general approach in computing that also underpins developments like AlphaFold, and applied it to prompting. We see this as a notable step forward for the field.
What interests us more is how it may accelerate the use of LLMs in more complex reasoning tasks where humans have to make a lot of interrelated decisions. For example, planning a complex itinerary. Here the model could consider various travel options, user preferences, and constraints simultaneously, navigating through these choices in a non-linear, interconnected, and recursive manner.
The graph of different thoughts doesn't emerge on its own: there isn't a magic prompt for having ChatGPT unbundle an elaborate problem for you. Unlike chain of thought reasoning, it isn't something you can invoke with a single prompt: it's more of an orchestration scheme, in which software around the model builds and manages the graph while the model itself stays unchanged. To put it into action, designers will have to take on the challenge of creating an interface and affordances that allow a user to guide the generation of thoughts and their relationships in partnership with the AI.
Helen Edwards is a Co-Founder of Artificiality. She previously co-founded Intelligentsia.ai (acquired by Atlantic Media) and worked at Meridian Energy, Pacific Gas & Electric, Quartz, and Transpower.