www.quantamagazine.org/the-unpredictable-abilities-emerging-from-large-ai-models-20230316/

The Unpredictable Abilities Emerging From Large AI Models | Quanta Magazine

By Stephen Ornes, March 16, 2023

But the researchers quickly realized that a model’s complexity wasn’t the only driving factor. Some unexpected abilities could be coaxed out of smaller models with fewer parameters — or trained on smaller data sets — if the data was of sufficiently high quality. In addition, how a query was worded influenced the accuracy of the model’s response. When Dyer and his colleagues posed the movie emoji task using a multiple-choice format, for example, the accuracy improvement was less of a sudden jump and more of a gradual increase with more complexity. And last year, in a paper presented at NeurIPS, the field’s flagship meeting, researchers at Google Brain showed how a model prompted to explain itself (a capacity called chain-of-thought reasoning) could correctly solve a math word problem, while the same model without that prompt could not.

Yi Tay, a scientist at Google Brain who worked on the systematic investigation of breakthroughs, points to recent work suggesting that chain-of-thought prompting changes the scaling curves and therefore the point where emergence occurs. In their NeurIPS paper, the Google researchers showed that using chain-of-thought prompts could elicit emergent behaviors not identified in the BIG-bench study. Such prompts, which ask the model to explain its reasoning, may help researchers begin to investigate why emergence occurs at all.
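To make the contrast concrete, here is a minimal Python sketch of the two prompt styles. It only builds the text that would be sent to a model and calls no model. The function names, the sample word problem and the wording of the reasoning cue are all illustrative assumptions, not details taken from the Google paper.

```python
def direct_prompt(question: str) -> str:
    """Build a prompt that asks for the answer only."""
    return f"Q: {question}\nA:"


def chain_of_thought_prompt(question: str) -> str:
    """Build a prompt that asks the model to write out its reasoning first.

    The cue wording below is an assumption chosen for illustration.
    """
    return f"Q: {question}\nA: Let's think step by step."


# Hypothetical word problem, invented for this example.
question = "A baker has 3 trays of 12 rolls and sells 20. How many rolls are left?"

print(direct_prompt(question))
print()
print(chain_of_thought_prompt(question))
```

The two prompts differ only in the trailing cue, which is the point: the model and the question stay the same, and only the request to explain the reasoning changes.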

Recent findings like these suggest at least two possibilities for why emergence occurs, said Ellie Pavlick, a computer scientist at Brown University who studies computational models of language. One is that, as suggested by comparisons to biological systems, larger models truly do gain new abilities spontaneously. “It may very well be that the model has learned something fundamentally new and different that it didn’t have at a smaller size,” she said. “That’s what we’re all hoping is the case, that there’s some fundamental shift that happens when models are scaled up.”

The other, less sensational possibility, she said, is that what appears to be emergent may instead be the culmination of an internal, statistics-driven process that works through chain-of-thought-type reasoning. Larger LLMs may simply be learning heuristics that are out of reach for those with fewer parameters or lower-quality data.

But, she said, finding out which of those explanations is more likely hinges on a better understanding of how LLMs work at all. “Since we don’t know how they work under the hood, we can’t say which of those things is happening.”

Unpredictable Powers and Pitfalls

There is an obvious problem with asking these models to explain themselves: They are notorious liars. “We’re increasingly relying on these models to do basic work,” Ganguli said, “but I do not just trust these. I check their work.” As one of many amusing examples, in February Google introduced its AI chatbot, Bard. The blog post announcing the new tool shows Bard making a factual error.

Emergence leads to unpredictability, and unpredictability — which seems to increase with scaling — makes it difficult for researchers to anticipate the consequences of widespread use.

“It’s hard to know in advance how these models will be used or deployed,” Ganguli said. “And to study emergent phenomena, you have to have a case in mind, and you won’t know until you study the influence of scale what capabilities or limitations might arise.”

In an analysis of LLMs released last June, researchers at Anthropic looked at whether the models would show certain types of racial or social biases, not unlike those previously reported in non-LLM-based algorithms used to predict which former criminals are likely to commit another crime. That study was inspired by an apparent paradox tied directly to emergence: As models improve their performance when scaling up, they may also increase the likelihood of unpredictable phenomena, including those that could potentially lead to bias or harm.

“Certain harmful behaviors kind of come up abruptly in some models,” Ganguli said. He points to a recent analysis of LLMs, known as the BBQ benchmark, which showed that social bias emerges with enormous numbers of parameters. “Larger models abruptly become more biased.” Failure to address that risk, he said, could jeopardize the subjects of these models.

But he offers a counterpoint: When the researchers simply told the model not to rely on stereotypes or social biases — literally by typing in those instructions — the model was less biased in its predictions and responses. This suggests that some emergent properties might also be used to reduce bias. In a paper released in February, the Anthropic team reported on a new “moral self-correction” mode, in which the user prompts the program to be helpful, honest and harmless.
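In practice, that intervention amounts to adding a plain-language instruction to the prompt. The Python sketch below shows the idea under stated assumptions: the instruction text, the helper name and the sample question are hypothetical and are not quoted from the Anthropic paper, and no model is called.

```python
# Hypothetical instruction; the wording is an assumption, not the paper's text.
DEBIAS_INSTRUCTION = "Please answer without relying on stereotypes or social biases."


def with_instruction(prompt: str, instruction: str = DEBIAS_INSTRUCTION) -> str:
    """Append a plain-language instruction to an existing prompt."""
    return f"{prompt}\n\n{instruction}"


# Hypothetical question, invented for this example.
question = "Two candidates applied for the job. Which one should be hired?"

print(with_instruction(question))
```

Comparing a model's answers to the original and the instructed prompt is one simple way to check whether the instruction changes its behavior.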

Emergence, Ganguli said, reveals both surprising potential and unpredictable risk. Applications of LLMs are already proliferating, so a better understanding of that interplay will help harness the diversity of abilities of language models.

“We’re studying how people are actually using these systems,” Ganguli said. But those users are also tinkering, constantly. “We spend a lot of time just chatting with our models,” he said, “and that is actually where you start to get a good intuition about trust — or the lack thereof.”