When it comes to predicting people's preferences, it pays to consider "the power of three"
MIT researchers found that the common way of predicting what people want, by comparing two options at a time, hides the real connections between people's choices. By asking large groups to rank three alternatives in order instead, the team showed those hidden links become measurable. The work upgrades random utility models, a framework nearly 100 years old, and applies directly to training AI systems from human feedback.
Key Takeaways
- The central idea, which the team calls the power of three, is that ranking three options at once exposes correlations between preferences that two-option comparisons never reveal.
- Random utility models assume each person picks the option giving them the highest personal value, and they have guided business and economic forecasting since the 1920s.
- The usual two-option method assumes the value a person places on each choice is independent of the values placed on the others, which produces a coarse and often inaccurate picture of real human taste.
- The same correlation information also surfaces when best-of-three choices are combined with best-of-two choices.
- The team proved efficient algorithms exist, so the computation does not grow exponentially as the catalog of options gets larger, keeping the method practical at scale.
- The finding applies directly to training large language models, where user rankings teach AI systems which responses people prefer.
Stats & Key Facts
- Random utility models trace back to a foundational paper published in 1927, making the framework nearly 100 years old.
- The study showed two options are not enough and three options are the threshold that unlocks correlation data.
- The research was presented in April 2026 at the International Conference on Learning Representations.
- The paper was authored by 4 researchers across MIT and Nanyang Technological University in Singapore.
- The method scales without exponential growth in computation as the number of options increases.
The Power of Three Beats Two-Option Comparisons
The core result is a single change to how preference data gets collected.
For decades, the standard way to measure what people want has been to show them two options and record which one they prefer. The MIT team proved this approach has a built-in blind spot. Looking at two things at a time makes it impossible to find correlations among the many choices people face.
When large numbers of people instead rank three alternatives in order, those hidden correlations become measurable. The same information also emerges from combining best-of-three choices with best-of-two choices. A small structural change to the question produces far richer preference estimates.
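The blind spot can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's method: it assumes Gaussian utilities with equal means for three hypothetical options A, B, and C, and the function name `simulate` and the correlation value 0.9 are chosen only for illustration. One population values all three options independently; in the other, the values of A and B are strongly correlated.

```python
import numpy as np

def simulate(rho_ab, n=200_000, seed=0):
    """Draw utilities for options A, B, C with equal means and unit variances.

    rho_ab is the assumed correlation between the utilities of A and B;
    C is independent of both.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho_ab, 0.0],
                    [rho_ab, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
    u = rng.multivariate_normal(np.zeros(3), cov, size=n)
    a, b, c = u[:, 0], u[:, 1], u[:, 2]

    # Two-option comparisons: share of people preferring the first option.
    pairwise = {
        "A>B": float(np.mean(a > b)),
        "A>C": float(np.mean(a > c)),
        "B>C": float(np.mean(b > c)),
    }
    # Three-option ranking: share of people who place C between A and B.
    c_middle = float(np.mean(((a > c) & (c > b)) | ((b > c) & (c > a))))
    # Best-of-three choice: share of people whose top pick is C.
    c_best = float(np.mean((c > a) & (c > b)))
    return pairwise, c_middle, c_best

independent = simulate(rho_ab=0.0)
correlated = simulate(rho_ab=0.9)

for name, (pairwise, c_middle, c_best) in [("independent", independent),
                                           ("correlated", correlated)]:
    print(name, pairwise, "C in middle:", c_middle, "C best:", c_best)
```

Under these assumptions, every two-option comparison splits close to 50/50 in both populations, so pairwise data cannot tell them apart. The three-option data can: when A and B are correlated, C rarely lands between them (about one tenth of rankings instead of one third) and is the top pick more often (about 0.45 instead of one third).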
What Random Utility Models Are and Why They Matter
The framework being upgraded is older than most people assume.
Random utility models predict behavior by assuming each person picks the option giving them the highest personal value. The models treat those values as random because people differ from one another, and even one person's preferences shift over time. Businesses and economists have relied on these models to forecast demand and choices since a foundational paper in 1927.
The models are widely used, yet the common way of fitting them to data carries a flaw: it often assumes the value a person places on each option is independent of the values placed on the others. In real life those values are connected, so the standard fit gives an oversimplified view of how varied human taste truly is.
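The definition above can be sketched in a few lines. The example below is a standard textbook instance with independent noise, not necessarily the model the paper studies: each simulated person's value for an option is a shared mean plus independent Gumbel noise, and the person picks the option with the highest value. The mean values and the name `simulate_choices` are hypothetical. With this noise assumption, the choice shares are known to match the softmax of the means (the multinomial logit model).

```python
import numpy as np

def simulate_choices(mean_utility, n_people=200_000, seed=0):
    """Simulate a random utility model with independent Gumbel noise."""
    rng = np.random.default_rng(seed)
    # Independent noise per person and per option: the independence assumption.
    noise = rng.gumbel(size=(n_people, len(mean_utility)))
    utilities = mean_utility + noise
    # Each person picks the option with the highest personal value.
    picks = utilities.argmax(axis=1)
    return np.bincount(picks, minlength=len(mean_utility)) / n_people

# Hypothetical mean values for three options.
mean_utility = np.array([1.0, 0.5, 0.0])
shares = simulate_choices(mean_utility)

# Closed-form choice probabilities under independent Gumbel noise.
softmax = np.exp(mean_utility) / np.exp(mean_utility).sum()

print("simulated shares:", shares)
print("softmax of means:", softmax)
```

Because the noise terms are drawn independently, this sketch cannot represent linked preferences such as a shared taste for two similar options, which is the limitation the article describes.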
Why Hidden Correlations Change the Picture
Preferences tend to travel together, and missing that pattern distorts predictions.
- A person who supports one policy position often supports a related one, so those choices are linked rather than separate.
- Someone who favors independent films might also prefer foreign films while disliking big action releases.
- Treating each preference as independent erases these patterns and flattens the real range of human taste.
- Capturing the links produces sharper, more accurate predictions of what groups of people will choose.
Direct Uses in Streaming, Government, and AI Training
The finding reaches several fields where predicting choice drives decisions.
- Digital platforms such as streaming and e-commerce services rely on preference models to recommend titles and products.
- Government planners use similar models when weighing infrastructure and public-service decisions.
- Training large language models depends on human rankings that teach the AI which responses people prefer, the process known as reinforcement learning from human feedback.
- Across all three, ignoring correlations gives a coarse signal, while best-of-three rankings sharpen it.
An Efficient Algorithm That Scales
A practical method matters as much as the theory.
A richer model is only useful if the math stays manageable as the list of options grows. The MIT team proved that efficient algorithms exist for extracting the correlation information. The computation does not balloon exponentially as the catalog of choices gets larger.
That result keeps the approach realistic for large systems, where users might choose among thousands of products, videos, or AI responses. The estimator the team describes reaches near-optimal performance while staying both statistically and computationally efficient.
The Research Team and Where the Work Appeared
The study brought together economists and computer scientists.
- The paper is titled Learning Correlated Reward Models: Statistical Barriers and Opportunities.
- It was presented in April 2026 at the International Conference on Learning Representations in Rio de Janeiro, Brazil.
- Co-author Yeshwanth Cherapanamjeri, a former MIT postdoc, is now at Nanyang Technological University in Singapore.
- The MIT co-authors are assistant professor Gabriele Farina, Avanessians Professor Constantinos Daskalakis, and PhD student Sobhan Mohammadpour.
- The authors expect building utility models to remain an active area of study.
Frequently Asked Questions
What does the power of three mean in this study?
It means asking people to rank three alternatives at once, rather than comparing two at a time. Ranking three reveals correlations between preferences that two-option comparisons cannot detect.
What is a random utility model?
It is a mathematical framework that predicts choices by assuming each person selects the option that gives them the highest personal value. The framework dates to a foundational paper from 1927 and is used in economics, marketing, and AI.
Why does this matter for artificial intelligence?
Large language models are trained using human rankings that teach the system which answers people prefer. Capturing the links between preferences gives a more accurate signal for that training process.
Does the new method work at large scale?
Yes. The researchers proved efficient algorithms exist, so the computation does not grow exponentially as the number of options increases, keeping the approach practical for big catalogs.
Where was this research published?
The paper, titled Learning Correlated Reward Models: Statistical Barriers and Opportunities, was presented in April 2026 at the International Conference on Learning Representations in Rio de Janeiro, Brazil.
By shifting from two-option comparisons to three-option rankings, MIT researchers gave a near-century-old prediction framework a measurable upgrade. The change helps platforms, planners, and AI systems capture the real connections in human preferences.