What does leadership really mean?
We all know and can agree that the concept of leadership exists. Moreover, we know that there are differences between good and bad leaders, and even among good leaders, there are differences in how leaders lead.
And yet, despite decades of research on leadership — from a variety of angles ranging from psychology to political science — scholars still struggle to answer the basic question, “What does leadership really mean?”
In this research review, I discuss how a recent paper by Fischer and Sitkin (2023) articulates growing concerns about how we study leadership. They focus specifically on the measurement of leader behavioral styles, that is, popular concepts such as “transformational leadership,” “authentic leadership,” and “servant leadership” that appear to dictate a set of behaviors that leaders can and should enact.
Fischer and Sitkin go so far as to claim that the problems they identify with leadership styles “call into question the entire evidence base of leadership style research” (p. 331, emphasis added).
I hope that more and more of us, scholars and practitioners alike, read their work, find ourselves appropriately challenged in the way we think about leadership, and together take important steps toward a more meaningful, accurate study of leadership.
The core problem: Valence-based conflation
Leadership research is overflowing with styles that promise better leaders and better workplaces: transformational, authentic, ethical, servant, empowering. You’ve probably heard of these in your classes, research, or management training programs. Each “style” offers its own vocabulary for “good” leadership, while the opposites (abusive, destructive) map neatly onto the “bad” side.
But what if this moral sorting is a scientific mirage?
That’s the main concern Fischer and Sitkin raise: existing conceptualizations of leadership styles are simply clusters of behaviors that align on “positive” or “negative” characteristics.
The result is a kind of “moral coloring” of leadership theory. Behaviors are defined as inherently good or bad, and outcomes become self-fulfilling: positive styles lead to positive outcomes, and negative styles to negative ones — not because that’s necessarily true, but because that’s how they were defined in the first place.
A more nuanced view of leadership styles
Instead, we should look at leadership styles from four different perspectives:
Behavioral content – what leaders actually do.
Intentions – why they do it.
Quality of execution – how well they do it.
Effects – how effective those actions are.
The problem arises because every leadership style Fischer and Sitkin reviewed conflates these categories. For example, “ethical leadership” assumes leaders who act with moral intent and achieve positive outcomes. “Empowering leadership” presupposes that followers actually feel empowered. Even the classic “task vs. relational” framework from the 1950s bundles descriptive and evaluative elements like “accurate decisions” or “well-defined communication patterns” into its measures.
Blending these four perspectives makes it impossible for scholars and practitioners to isolate which aspect of leadership is being captured. In other words, we can’t know whether outcomes (such as team performance) stem from leader behaviors themselves, or from our judgments of those behaviors’ quality or success.
If a demanding visionary leader succeeds, we might label them “transformational.” If they fail, the same behavior might be branded “abusive.”
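The conflation described above can be made concrete with a small simulation. This is only a sketch under stated assumptions, not an analysis from Fischer and Sitkin: the `halo` weight, the noise level, and the sample size are all hypothetical. Leader behavior is generated to be unrelated to team outcomes, but the "style score" a rater gives is partly colored by how things turned out.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_leaders=5000, halo=0.7, seed=0):
    """Return (r_behavior, r_style) for one simulated sample.

    Assumptions (hypothetical, for illustration only):
    - behavior and outcome are drawn independently, so behavior has no
      real effect on the outcome;
    - the rated style score mixes the behavior with an evaluation of the
      outcome, weighted by `halo`, plus some rating noise.
    """
    rng = random.Random(seed)
    behavior = [rng.gauss(0, 1) for _ in range(n_leaders)]
    outcome = [rng.gauss(0, 1) for _ in range(n_leaders)]
    style_score = [
        b + halo * y + rng.gauss(0, 0.5)
        for b, y in zip(behavior, outcome)
    ]
    return pearson(behavior, outcome), pearson(style_score, outcome)


r_behavior, r_style = simulate()
print(f"behavior vs. outcome:    r = {r_behavior:.2f}")
print(f"style score vs. outcome: r = {r_style:.2f}")
```

In this toy setup the behavior itself is essentially uncorrelated with the outcome, while the style score correlates with it substantially, purely because the evaluation of the outcome was built into the score.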
Why it matters
The implications are not just semantic. If leadership styles are built on valence (good vs. bad), then decades of findings linking “positive” leadership to positive outcomes may be statistical artifacts, not evidence. The authors call this the “do-good” and “don’t-do-bad” logic: the comforting but circular belief that good leadership works because it’s good.
This echoes a broader theme in quantitative and organizational research: construct redundancy and tautology. Many leadership styles overlap in behaviors, differ mainly in label, and measure outcomes that are baked into their definitions. That means we may be studying linguistic clusters rather than distinct, causally testable constructs.
As a notable example of this problem, I sat in a presentation a couple of years ago where the presenters demonstrated how easy it was to use ChatGPT to “create” a new theory of leadership. All it took was placing a new positive-sounding adjective in front of the word leadership!
I tried it myself with ChatGPT and “discovered” fractional leadership: “the extent to which individuals intentionally lead by offering partial, provisional, and revisable guidance that is explicitly communicated as incomplete.” Then I asked ChatGPT to give me a five-item survey I could use to measure fractional leadership. All I need now is to come up with some theoretical justification, then off it goes for peer review!
This somewhat tongue-in-cheek illustration points towards the problem we face in the cluttered field of leadership studies, with (at last count) over 700 different definitions of leadership, constructs of leadership styles, and survey-based measures claiming to predict successful leadership.
A way forward: Deconflation and Configuration
To be clear, Fischer and Sitkin don’t call for abandoning leadership style research — rather, we need to take a completely different approach.
First, separating the four aforementioned layers could clarify what’s descriptive versus evaluative, and reduce the “amalgamation” of unrelated traits into one glossy package. We can intentionally use a combination of different methods (surveys, observations, simulations) to isolate the behavior from the intent from the result.
Second, researchers and scholars can take a configurational approach. Perhaps leadership is best treated as a pattern or combination of behaviors rather than a single construct. I teach this to my students using a toolkit analogy: leaders can pull from a range of different overlapping behaviors, depending on the situation at hand.
Finally, we need to think better about leadership. It’s easy to simply claim that “good thing leads to good thing” (maybe depending on a third good thing). In our papers, public discourse, and teaching, we need to help people think more critically about what leadership actually is and how to develop it.
Stay in touch!
Here in the STATS Lab, our ongoing research on the intersection of leadership and measurement aims to directly respond to Fischer and Sitkin’s calls. We hope to move beyond self-report adjectives, bringing in modern data science methods to better understand leadership.
As a brief example, one exciting paper we are working on uses natural language processing to isolate clusters of semantic meaning across hundreds of leadership survey items.
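To give a rough sense of what comparing survey items by their wording can look like, here is a minimal sketch. It is not the lab's actual method: it assumes a simple bag-of-words cosine similarity as a stand-in for richer language models, and the four items and their style labels are hypothetical, written only for this illustration.

```python
import math
import re
from collections import Counter

# Hypothetical survey items, invented for illustration; they are not
# taken from any published leadership scale.
ITEMS = [
    ("transformational", "My leader communicates a clear vision of the future"),
    ("servant", "My leader puts the needs of followers first"),
    ("authentic", "My leader communicates a clear and honest vision"),
    ("ethical", "My leader puts the interests of followers first"),
]

# Words shared by nearly every item carry no distinguishing meaning.
STOPWORDS = {"my", "leader", "a", "of", "the", "and"}


def vectorize(text):
    """Turn an item into a bag-of-words count vector."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two count vectors."""
    dot = sum(u[t] * v[t] for t in u)  # missing words count as 0
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


vectors = [vectorize(text) for _, text in ITEMS]


def nearest(i):
    """Index of the item whose wording is most similar to item i."""
    others = [j for j in range(len(ITEMS)) if j != i]
    return max(others, key=lambda j: cosine(vectors[i], vectors[j]))


for i, (label, _) in enumerate(ITEMS):
    j = nearest(i)
    sim = cosine(vectors[i], vectors[j])
    print(f"{label:>16} item is closest to the {ITEMS[j][0]} item (cosine {sim:.2f})")
```

In this toy example, each item's closest match carries a different style label, which is the kind of overlap across nominally distinct styles that this line of work aims to detect.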
We need behaviorally grounded, empirically falsifiable leadership measures, ones that describe what leaders do without presupposing that those actions are moral, competent, or effective.
Untangling those threads isn’t just good theory. It’s good measurement science.
Read the full article here: Fischer, T., & Sitkin, S. B. (2023). Leadership styles: A comprehensive assessment and way forward. Academy of Management Annals, 17(1), 331–372.
Find out more about our lab’s research here: https://www.statslabatcmc.com/research