AI still fails to coherently understand the real world

Generative artificial intelligence (GenAI) breaks down quickly in situations that demand an accurate model of the world rather than mere predictive ability, MIT and Harvard researchers have found, in a study with significant implications for AI models used in real-world settings such as AI-powered wayfinding.

Predictions accurate

The researchers focused on the “transformer”, the type of generative AI model behind GPT-4. Transformers trained on vast quantities of language data become “large language models” (LLMs), and that training eventually enables them to predict the next outcome in a sequence.
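
To make next-in-sequence prediction concrete, here is a minimal sketch in which a simple bigram counter stands in for a transformer: it learns which token tends to follow which, without any notion of the world those tokens describe. The data and names are purely illustrative, not drawn from the study.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count which token follows which across the training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequently observed continuation of `token`."""
    return counts[token].most_common(1)[0][0] if token in counts else None

# Toy 'move' sequences (hypothetical data):
data = [["a1", "b2", "c3"], ["a1", "b2", "d4"], ["a1", "b2", "c3"]]
model = train_bigram(data)
print(predict_next(model, "b2"))  # -> 'c3': a statistically likely next move,
                                  # produced with no model of the game itself
```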

In the tests, two classes of transformer were set game-playing, logic-puzzle and navigation tasks, such as solving a seating-plan problem in a performance venue. One class was trained on data generated from randomly produced sequences; the other on data generated by following deliberate strategies. Both, the scientists found, could predict valid moves in games like Connect 4 and Othello and supply accurate step-by-step directions for navigating New York City.
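
The difference between the two training regimes can be sketched on a toy one-dimensional walk, standing in for Othello games or taxi routes; everything below is illustrative rather than the study’s actual setup.

```python
import random

MOVES = {"L": -1, "R": +1}

def random_sequence(max_steps=20, seed=None):
    """Regime 1: every move is sampled uniformly at random."""
    rng = random.Random(seed)
    return [rng.choice("LR") for _ in range(max_steps)]

def strategic_sequence(start, goal):
    """Regime 2: moves follow a deliberate strategy (walk straight to the goal)."""
    pos, seq = start, []
    while pos != goal:
        move = "R" if goal > pos else "L"
        pos += MOVES[move]
        seq.append(move)
    return seq

random_data = [random_sequence(seed=i) for i in range(1000)]
strategy_data = [strategic_sequence(0, random.Random(i).randint(-5, 5)) for i in range(1000)]
```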

World models can’t cope with minor changes

But in “sequence distinction” and “sequence compression” tests, designed to reveal whether the transformers had actually formed an accurate model of their world, performance dropped sharply. Only one transformer, the one trained on randomly generated sequences, formed a coherent world model for Othello moves. And in worrying news for the mobility sector, neither class of transformer produced an accurate street map of New York. What’s more, both transformers’ navigation performance fell apart at minor hurdles.
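
The two tests trace back to a classic idea from automata theory (the Myhill–Nerode theorem): sequences that lead to the same underlying state should be interchangeable, while sequences that lead to different states should be told apart. Below is a simplified sketch on a tiny known world, with `model_next` standing in for a query to the transformer; it illustrates the idea rather than reproducing the authors’ metrics.

```python
def run(prefix, start, transitions):
    """Follow a token sequence through the TRUE world model (a tiny automaton)."""
    state = start
    for token in prefix:
        state = transitions[state][token]
    return state

def valid_next(state, transitions):
    """Tokens the true world model allows from `state`."""
    return set(transitions.get(state, {}))

def compression_ok(p1, p2, start, transitions, model_next):
    """Sequence compression: two prefixes reaching the SAME true state
    should get identical next-token sets from the model."""
    assert run(p1, start, transitions) == run(p2, start, transitions)
    return model_next(p1) == model_next(p2)

def distinction_ok(p1, p2, start, transitions, model_next):
    """Sequence distinction: two prefixes reaching DIFFERENT true states
    should get different next-token sets from the model."""
    assert run(p1, start, transitions) != run(p2, start, transitions)
    return model_next(p1) != model_next(p2)

# A two-state toy world: state -> {token: next state}
transitions = {"A": {"x": "B", "y": "A"}, "B": {"x": "A"}}
start = "A"

def model_next(prefix):  # stub 'transformer' that happens to know the truth
    return valid_next(run(prefix, start, transitions), transitions)

print(compression_ok(["y"], ["x", "x"], start, transitions, model_next))  # True
print(distinction_ok(["x"], ["y"], start, transitions, model_next))       # True
```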

For example, when the researchers introduced roadblocks and detours into the New York navigation exercise, “its performance plummeted”, MIT News reported, even when the variations were minuscule. “I was surprised by how quickly the performance deteriorated as soon as we added a detour. If we close just 1 percent of the possible streets, accuracy immediately plummets from nearly 100 percent to just 67 percent,” lead author Keyon Vafa said.
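
The detour experiment is easy to picture: remove a small fraction of edges from the street graph and re-check whether the model’s turn-by-turn routes still use only open streets. A hypothetical sketch, not the authors’ evaluation code:

```python
import random

def close_streets(edges, fraction, seed=0):
    """Close a given fraction of streets (edges), chosen at random."""
    rng = random.Random(seed)
    n_closed = max(1, int(len(edges) * fraction))
    closed = set(rng.sample(sorted(edges), n_closed))
    return set(edges) - closed

def route_valid(route, open_edges):
    """A route survives only if every consecutive step uses an open street."""
    return all((a, b) in open_edges for a, b in zip(route, route[1:]))

def accuracy(routes, open_edges):
    """Fraction of the model's routes that remain drivable."""
    return sum(route_valid(r, open_edges) for r in routes) / len(routes)

# Given model_routes and street_edges, the 1-percent closure test would be:
#   accuracy(model_routes, close_streets(street_edges, 0.01))
```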

AI New York map covered in nonexistent roads

Looking into the anomaly, the team, made up of academics from MIT, Harvard and Cornell, discovered the AI had created an internal map of New York covered in a network of nonexistent roads and flyovers.
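
One way such phantom roads can surface is by reading an implied map straight off the model’s outputs: every intersection-to-intersection step it proposes becomes an edge, and any edge absent from the real street graph is a road that doesn’t exist. An illustrative sketch; the study’s actual map-reconstruction procedure is more involved:

```python
def implied_edges(model_routes):
    """Every consecutive step the model proposes becomes an edge of its implied map."""
    edges = set()
    for route in model_routes:
        edges.update(zip(route, route[1:]))
    return edges

def phantom_edges(model_routes, real_edges):
    """Edges the model relies on that are not in the real street graph."""
    return implied_edges(model_routes) - set(real_edges)

# Hypothetical mini-example: the model 'teleports' two blocks in one step.
real = {("5th&42nd", "5th&43rd"), ("5th&43rd", "5th&44th")}
proposed = [["5th&42nd", "5th&44th"]]
print(phantom_edges(proposed, real))  # {('5th&42nd', '5th&44th')}
```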

Speaking to the implications of the study, senior author Ashesh Rambachan of the MIT Laboratory for Information and Decision Systems (LIDS) said: “The question of whether LLMs are learning coherent world models is very important if we want to use these techniques to make new discoveries.”

The researchers’ next step is to set the transformers problems whose rules are only partially known.
