This week we are leaning into multiple metaphors: AI as a mirror, UX design as a window or room, and life as information. Plus: read about Michael Levin's upcoming presentation at the Imagining Summit, Helen's Book of the Week, and our upcoming events.
It’s easy to fall prey to the design illusion that because LLMs look sleek, they must be well-designed. But aesthetics alone do not equal design. As Steve Jobs once said, “Design is not just what it looks like and feels like. Design is how it works.”
It’s curious that these two papers, tackling such similar ideas, came out at the same time. Is this coincidence, or does it tell us something about where the study of life and intelligence is heading?
What are the capabilities of each model? Which model is better at what? What is the risk of using each model?
How were the models trained? On what data? Is the data freely available or copyrighted? What data was excluded? Why?
Which humans were involved in the training? How does human feedback affect the model? How were these people incentivized for their training input?
How does the model connect data across its data cosmos? How often are its answers true? How often are they biased? When are those biases problematic?
What are the effects on users? Positive? Negative? Unknown?
What are the effects on the planet? How much energy is consumed? How much water is used for cooling?
How much does training a generative AI model cost? How much does inference cost? How are those costs expected to change in the short and long term?
Some of these mysteries may come across as nice-to-know, while others are essential-to-know. To trust generative AI in your daily workflow, you need to know how it works, when to trust it, and when not to. The same is true of any intelligence you work with. Consider the people you work with. Whom do you trust to do what? Whom do you not trust to do what? Now think of your favorite generative AI model: do you have the same level of insight?
The question of trust and understanding is even more important for companies developing applications on top of foundation models. Let’s say you’re considering developing a chatbot to communicate with your customers. Leveraging a foundation model for your chatbot provides an incredible opportunity because you can tap into its vast knowledge and superior language understanding and generation. But by using a foundation model, you may also inherit all of that model’s problems and perils.
Mitigating the risk of using foundation models requires understanding the models themselves. So, how much do we know about these models?
Today, the answer is very little. New research from the Stanford Center for Research on Foundation Models shows that foundation models are largely mysterious. The research team scored 10 major foundation models on 100 indicators of transparency, spanning how the model is built, how it works, and how it is used. The resulting scores on the Foundation Model Transparency Index paint a clear picture—with a high of 54 and a low of 12 (out of 100), these models are far from transparent.
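To make the scoring concrete, here is a minimal, hypothetical sketch of how indicator-level assessments could roll up into a single number on a 0–100 scale. The indicator names and values below are invented for illustration; the actual index assesses 100 indicators spanning how a model is built, how it works, and how it is used.

```python
# Hypothetical illustration: aggregating binary transparency indicators into
# a 0-100 score, in the spirit of the Foundation Model Transparency Index.
# Indicator names and values are invented, not taken from the real index data.

indicators = {
    "training data sources disclosed": False,
    "data filtering criteria described": False,
    "human feedback process documented": True,
    "energy use reported": False,
    "model limitations documented": True,
}

def transparency_score(results):
    """Percentage of indicators a model satisfies, on a 0-100 scale."""
    return round(100 * sum(results.values()) / len(results))

print(transparency_score(indicators))  # -> 40
```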
The need for transparency in generative AI is different from that of previous technologies. It never mattered much how Microsoft Word works because the document output is based on the words you type into it. Generative AI, however, is creating the words. The choices made by the model creators are embedded inside the system and affect the outcome. When we don’t understand the context of those choices, we cannot understand the constraints on the model’s output.
Many of the mysteries above represent choices made that are either difficult or impossible to undo. For instance, the choice of which data to include takes a model down a path that cannot be reversed without starting from scratch. In complexity science, this is called path dependency—decisions that create lock-in, limiting future options. Without transparency, we don’t understand the decisions made today that will constrain future options. History matters.
Foundation model companies largely claim that transparency conflicts with competitiveness. It’s true that some disclosures would give competitors a better understanding of how their models are built. But what happens if generative AI customers—both consumers and enterprises—demand greater transparency, creating a competitive incentive for disclosure?
Perhaps it’s best to reverse the question: why wouldn’t generative AI customers demand greater transparency? Would you delegate tasks to someone who wouldn’t tell you anything about their methods? If not, why would you do the same with a mysterious machine?
Dave Edwards is a Co-Founder of Artificiality. He previously co-founded Intelligentsia.ai (acquired by Atlantic Media) and worked at Apple, CRV, Macromedia, Morgan Stanley, and Quartz.