This or that? The psychology behind picking just one
If you’ve ever taken a personality test or job assessment, chances are you’ve encountered what we call “Likert-type questions”: On a scale from 1 (strongly disagree) to 5 (strongly agree), how much does this statement sound like you?
These kinds of questions are on surveys everywhere, and they certainly have the advantage of being simple, intuitive, and easy to score.
But they also come with baggage.
People often respond in ways that are socially desirable or self-enhancing, especially in high-stakes settings like job interviews – after all, who wouldn’t rate themselves “5 out of 5” on a statement like “I complete my tasks on time” when applying for a job?
Even in casual settings, respondents often gravitate toward the middle of the scale (e.g., “3 – Neither agree nor disagree”), or they struggle to meaningfully distinguish between overlapping items like “I encourage my team members” vs. “I listen to my team members’ feedback.”
That’s where forced-choice (FC) measurement comes in.
🧩 How Forced-Choice Works
Instead of rating one statement at a time, FC measures ask you to choose between two or more statements. For example:
Which sounds more like you?
A) I enjoy organizing my work.
B) I get along well with others.
You can’t say “both.” You have to choose.
This simple tweak creates a measurement format that’s comparative rather than absolute. It’s harder to fake and often more precise in revealing what makes someone tick.
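To see what "comparative" means in practice, here's a minimal sketch in the spirit of Thurstone's law of comparative judgment (an illustration of ours, not code from the paper): each statement gets a latent "utility" tied to the respondent's trait levels, and the probability of picking A over B depends only on the difference between them. The function name `p_choose_a` and the unit-variance noise assumption are purely illustrative.

```python
import numpy as np
from scipy.stats import norm

def p_choose_a(trait_a, trait_b, noise_sd=1.0):
    """Probability of choosing statement A over statement B.

    trait_a, trait_b: the respondent's standing on the trait each
    statement taps (e.g., orderliness for A, agreeableness for B).
    noise_sd: assumed standard deviation of the utility noise.
    """
    # Thurstonian idea: choose A iff A's noisy utility exceeds B's.
    # Only the *difference* in trait levels matters, so uniformly
    # inflating both (e.g., self-enhancement) changes nothing.
    return norm.cdf((trait_a - trait_b) / (np.sqrt(2) * noise_sd))

print(p_choose_a(1.2, 0.4))  # ~0.71: A sounds more "like me"
print(p_choose_a(3.2, 2.4))  # identical: inflating both traits cancels out
```

That cancellation is the intuition behind the faking resistance: a rating scale rewards inflating every answer, but a forced choice between two equally flattering statements gives inflation nowhere to go.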
📜 A Long (and Complicated) History
Forced-choice formats have actually been around since the 1940s. But for decades, they were treated as a niche technique, used primarily in military and personnel settings.
One reason? Scoring them is hard.
When people pick between equally attractive options, their raw choices yield only relative information – how traits rank within a person, not how that person compares to others (the classic "ipsativity" problem) – unless we use advanced mathematical models to infer their underlying traits. For a long time, the tools just weren't there.
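To make the scoring problem concrete, here's a toy sketch (not the method of any particular paper) that infers a trait from a block of binary choices by maximum likelihood. The function name `neg_log_likelihood`, the 40-item block, and the true trait gap of 0.8 are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

def neg_log_likelihood(delta, choices):
    """delta: latent trait gap; choices: 1.0 where the target statement won."""
    p = norm.cdf(delta / np.sqrt(2))  # choice probability under the model
    p = np.clip(p, 1e-9, 1 - 1e-9)   # guard against log(0)
    return -np.sum(choices * np.log(p) + (1 - choices) * np.log(1 - p))

# Simulate a hypothetical 40-item block with a true trait gap of 0.8.
rng = np.random.default_rng(0)
true_gap = 0.8
choices = (rng.random(40) < norm.cdf(true_gap / np.sqrt(2))).astype(float)

fit = minimize_scalar(neg_log_likelihood, args=(choices,),
                      bounds=(-4, 4), method="bounded")
print(f"estimated trait gap: {fit.x:.2f}")  # typically lands near 0.8
```

Notice that even this toy case identifies only the *gap* between traits, not anyone's absolute standing. That is exactly why more sophisticated approaches, like Thurstonian IRT, were needed to recover scores that can be compared across people.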
But over the past two decades, thanks to breakthroughs in item response theory, latent modeling, and now even generative AI, the field has seen a major resurgence. Popular assessments like the Clifton StrengthsFinder now use the FC format.
🚨 Our Contribution
In our new paper, "The Journey of Forced Choice Measurement Over 80 Years: Past, Present, and Future" – published this month in Organizational Research Methods – we take a big-picture look at this entire literature.
The goal? To give researchers a clear, updated map of what we know, what we’ve learned, and where the field should go next.
✨ Key Takeaways
It generally works. FC formats are often more predictive of real-world outcomes than traditional Likert scales. Properly designed, they reduce socially desirable responding and faking, which is especially useful in hiring and promotion contexts.
Scoring is getting smarter. Advances in statistical modeling continue to improve the accuracy, reliability, and fairness of FC scoring.
Design matters. FC questions can be cognitively demanding – people have to carefully compare options rather than rate them in isolation. Newer approaches, like graded FC formats, show promise in balancing rigor with user experience.
The field is evolving. From AI-assisted item generation to computer-adaptive testing and hybrid formats that blend FC with traditional scales, innovation is moving fast – and changing how we think about assessment.
Whether you're a psychometrician, HR researcher, or just a measurement nerd, there’s something here for you.
💡 Want to Learn More?
We break down the history, the models, the controversies, and the innovations in the full review – read the paper here.
Ever used forced-choice in your own research? Curious about how AI might generate psychometric items? Want to try building your own?
Reach out! We’re building tools and sharing resources at www.statslabatcmc.com all the time, and we’d love to work with you.