Linearity, Non-Linearity, and a Confounding Conjunction
May 28, 2025
Is the Efficient Market Hypothesis true?
The Efficient Market Hypothesis (EMH) has been a cornerstone of financial theory for decades, asserting that market prices fully reflect all available information. It’s practically dogma in some circles. But is it “true”?
In this post, we’ll delve into a specific scenario that highlights the limitations of market-based information aggregation, particularly when faced with highly complex, non-linear realities.
What is a Market?
Before we dive into market efficiency, let’s take a step back and ask a foundational question: what is a market?
Imagine a bustling marketplace, not for fruits and vegetables, but for something like copper. Copper is a vital commodity, its price influenced by a myriad of factors: new mining discoveries, global industrial demand, construction booms, geopolitical stability in copper-producing regions, innovations in electronics, and even the cost of energy to extract and refine it.
In this copper market, we have numerous participants – let’s call them ‘traders’ or ‘experts’, though their expertise levels will vary:
- Information Gathering: Each trader has access to some information. A mining company executive might have early insights into production yields. An industrial purchasing manager might foresee a surge in demand from their sector. An economist might be tracking global macroeconomic indicators that affect commodity prices. A speculator might be analyzing weather patterns in mining regions. Crucially, no single participant typically knows everything. Each holds different, often overlapping, pieces of a giant puzzle.
- Forming Estimations: Based on their unique sliver of information, each trader forms an estimation of value. If a trader learns of a new, easily accessible copper deposit, they might revise their internal valuation of copper downwards, anticipating increased supply. If another sees strong manufacturing orders, they might revise their valuation upwards, expecting higher demand.
- Acting on Estimations (Buying/Selling): This is where the information translates into action.
  - If a trader believes copper is currently undervalued by the market (i.e., the current market price is lower than their estimated true value), they will be motivated to buy.
  - Conversely, if they believe copper is overvalued (the market price is too high), they will be motivated to sell (or short-sell). The strength of their conviction and the amount of capital they have will influence the size of their buy or sell orders.
- The Aggregation Mechanism: The collective impact of all these buy and sell orders is what forms the market price. If more participants are trying to buy than sell at the current price, their combined demand will push the price up. If selling pressure dominates, the price will fall. In this sense, the market price acts as a dynamic aggregator – a constantly updating signal that, in theory, reflects the combined knowledge, expectations, and sentiments of all participants.
So, at its heart, a market is a decentralized mechanism for information aggregation. Thousands of individuals, each with their partial view and acting in their own interest, contribute to a collective “wisdom” that is (ideally) reflected in the price.
The Efficient Market Hypothesis: A Quick Refresher
Formalized by Eugene Fama in the 1960s[^1], the EMH makes three core assertions:
- Market prices fully reflect all available information.
- It’s impossible to consistently achieve returns exceeding the market average (except through luck or by taking on higher risk).
- New information is rapidly incorporated into prices.
The hypothesis is often discussed in three forms, each more stringent than the last:
- Weak-form efficiency: Prices reflect all historical market data (past prices, trading volumes, etc.).
- Semi-strong-form efficiency: Prices reflect all publicly available information (news articles, company reports, economic data, etc.).
- Strong-form efficiency: Prices reflect all information, including private or insider information.
The magic behind market efficiency supposedly lies in the collective actions of informed traders. As they act on their unique insights, they nudge prices, and the market, in theory, aggregates this dispersed knowledge into a single, coherent price signal[^2].
A Mathematical Glimpse at Information Aggregation
To understand how markets try to aggregate information, let’s sketch a simplified mathematical framework:
- $P_t$: The price of an asset at time $t$.
- $V$: The true, intrinsic value of the asset.
- $X = \{X_1, X_2, \ldots, X_n\}$: A set of variables that collectively determine $V$.
- $I_j$: The information set available to trader $j$.
- $E[V\mid I_j]$: The expected value of the asset according to trader $j$, given their specific information.
The process of price formation can be abstractly represented as:
\[P_t = f(D_t, S_t)\]

Where $D_t$ represents demand (buying pressure) and $S_t$ represents supply (selling pressure). Both are, in turn, functions of the collective valuations $E[V\mid I_j]$ from all traders.
For small changes in supply and demand, this relationship is often approximated as linear:
\[\Delta P \approx \lambda(\Delta D - \Delta S)\]

Here, $\lambda$ is a liquidity parameter, indicating how much the price moves for a given net order imbalance.
The EMH, in essence, posits that this dynamic process leads to a market price that reflects the combined knowledge of all participants:
\[P_t \approx E[V\mid I_{\text{union}}]\]

Where $I_{\text{union}}$ represents the union of all traders’ information sets[^3]. This is the ideal scenario.
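To make this concrete, here is a minimal Python sketch of linear price aggregation, with made-up trader valuations, $\lambda$, and demand schedule rather than anything calibrated: each trader's demand is proportional to the gap between their valuation and the current price, and the price adjusts by $\lambda$ times the net imbalance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical beliefs: each trader's E[V | I_j] is the true value plus noise
# reflecting their partial information (an assumption for illustration).
true_value = 100.0
valuations = true_value + rng.normal(0.0, 5.0, size=500)

lam = 0.05    # liquidity parameter: price response per unit of net imbalance
price = 80.0  # arbitrary starting price

for _ in range(200):
    # Each trader's desired order is proportional to their perceived mispricing,
    # so net order flow is the average gap between valuations and the price.
    net_order_flow = np.mean(valuations - price)
    price += lam * net_order_flow  # price moves in proportion to net demand

print(f"final price:    {price:.2f}")
print(f"mean valuation: {np.mean(valuations):.2f}")  # the price converges toward this
```

When individual errors are independent and roughly symmetric, this linear averaging works beautifully: the price lands near the consensus estimate of $V$. The conjunction problem below is designed to break exactly this property.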
The Conjunction Problem: A Fly in the Ointment
Now, let’s construct a counterexample – a “conjunction problem” – that throws a wrench into this elegant aggregation machinery.
Imagine a peculiar stock whose true value, $V$, is determined as follows:
\[V = \begin{cases} \text{Significant value} & \text{if } X_1 \land X_2 \land \cdots \land X_{10000} \\ 0 & \text{otherwise} \end{cases}\]

In plain English: this stock is immensely valuable if and only if all 10,000 distinct binary (true/false) conditions are met. If even one condition is false, the stock is worthless. Think of it as needing 10,000 keys to unlock a treasure chest.
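As a sanity check, the payoff rule fits in a few lines of Python (the payoff constant and the list-of-booleans encoding are just illustrative choices):

```python
SIGNIFICANT_VALUE = 1_000.0  # hypothetical payoff when every condition holds
N_CONDITIONS = 10_000

def true_value(conditions: list[bool]) -> float:
    """All-or-nothing payoff: a 10,000-way logical AND."""
    assert len(conditions) == N_CONDITIONS
    return SIGNIFICANT_VALUE if all(conditions) else 0.0
```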
Now, let’s add some assumptions about our market participants:
- Dispersed Knowledge: Each trader knows the values of only a small, unique subset of these variables (e.g., 100 variables each). No single trader has the full picture.
- Information Opacity: Traders don’t know which specific variables other traders have observed.
- Cognitive & Computational Limits: Crucially, traders (even aided by their current computational tools) have limitations. They cannot realistically process or verify all 10,000 variables simultaneously, even if, hypothetically, all the data were dumped on them. This reflects a real-world constraint on human cognition and on the processing power available to most individual market actors.
Why Market Aggregation Stumbles
Consider a rational trader, Alice. She observes her 100 variables, and all of them happen to be true. What is her valuation of the stock?
\[E[V\mid I_{\text{Alice}}] = (\text{Significant value}) \times P(\text{all other 9,900 variables are true})\]

If we assume, for simplicity, that each of the unknown variables has an independent probability $p$ of being true (say, $p=0.9$, which is optimistic), then:
\[P(\text{all other 9,900 variables are true}) = p^{9900} = (0.9)^{9900} \approx 10^{-453}\]

This value is astronomically small, practically indistinguishable from zero. Even if $p$ is very high (e.g., 0.99), $p^{9900}$ is still vanishingly tiny, on the order of $10^{-43}$.
The outcome? Every rational trader, even if all the variables in their own limited subset are true, will value the stock at approximately zero. Consequently, no one has an incentive to buy the stock at any positive price. The market price will languish near zero, even though the collective knowledge of all traders, if perfectly pooled and processed, would reveal the stock’s significant true value. The market fails to connect the dots.
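A quick numerical check of Alice’s expectation, working in log space because $p^{9900}$ underflows ordinary floating point (the \$1,000 payoff is the same placeholder value as above):

```python
import math

SIGNIFICANT_VALUE = 1_000.0  # same hypothetical payoff as above
n_unknown = 9_900            # variables Alice cannot observe

for p in (0.9, 0.99, 0.999):
    log10_prob = n_unknown * math.log10(p)  # log10 of P(all unknowns true)
    expectation = SIGNIFICANT_VALUE * 10.0 ** log10_prob if log10_prob > -300 else 0.0
    print(f"p = {p}: P(all true) ~ 1e{log10_prob:.0f}, E[V | I_Alice] ~ {expectation:.3g}")
```

Even at $p = 0.999$ per unknown variable, Alice’s expectation is about five cents on a \$1,000 payoff, so she has essentially no reason to bid.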
The Underlying Linearity of Market Aggregation
This failure isn’t just a quirk of our example; it points to a more fundamental characteristic of how prices are formed: market aggregation mechanisms are, by their nature, predominantly linear.
Theoretical Underpinnings
A standard model of price formation in market microstructure can be depicted as:
\[P_t = P_{t-1} + \lambda \sum_{j=1}^{m} w_j (E[V\mid I_j] - P_{t-1})\]

Where:
- $P_t$ is the new price.
- $P_{t-1}$ is the previous price.
- $\lambda$ is the market liquidity parameter.
- $w_j$ is the relative market power (e.g., capital, influence) of trader $j$.
- $E[V\mid I_j]$ is trader $j$’s valuation based on their information.
This equation essentially describes the new price as the old price plus a weighted average of how much each trader’s valuation deviates from that old price. A weighted average is a linear combination. The market, in this view, computes a linear aggregate of individual beliefs, scaled by their capacity to deploy capital[^4].
From Individual Beliefs to Price Moves
The process generally unfolds like this:
- Each trader forms a belief $E[V\mid I_j]$ using their private information.
- If $E[V\mid I_j] > P_{t-1}$, they are inclined to buy. If $E[V\mid I_j] < P_{t-1}$, they are inclined to sell.
- The size of their order is often (though not always perfectly) proportional to the perceived mispricing: $(E[V\mid I_j] - P_{t-1})$.
- These buy and sell orders collectively shift the price, typically through a mechanism that approximates a linear summation of these pressures.
Kyle’s Lambda: A Model of Linearity
The influential Kyle (1985) model of market microstructure formalizes this linear relationship[^5]. In its basic form, it posits:
\[\Delta P = \lambda \cdot \text{Order Flow}\]

Where “Order Flow” is the net imbalance between buy and sell orders. This $\lambda$ (Kyle’s lambda) represents market illiquidity – how much the price moves for a given unit of net order flow. While the real world is more complex, this linear model has found empirical support in many contexts, particularly for small to moderate order sizes relative to market depth. Researchers often test it by regressing price changes on net order flow data from exchanges, though isolating the “informed” component of order flow from “noise” trading is a significant challenge. Some studies also find that for very large trades, the impact might become non-linear (concave).
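For intuition on how such a $\lambda$ is estimated in practice, here is a toy regression on synthetic data (the true $\lambda$, flow distribution, and noise level are all invented for illustration, not drawn from any real market):

```python
import numpy as np

rng = np.random.default_rng(42)

true_lambda = 0.002  # assumed price impact per unit of net order flow
n_obs = 5_000

order_flow = rng.normal(0.0, 1_000.0, size=n_obs)  # net (buy - sell) volume per interval
noise = rng.normal(0.0, 0.5, size=n_obs)            # price moves unrelated to order flow
price_changes = true_lambda * order_flow + noise    # Delta P = lambda * OrderFlow + eps

# OLS slope = Cov(Delta P, OrderFlow) / Var(OrderFlow): an estimate of Kyle's lambda.
lambda_hat = np.cov(price_changes, order_flow)[0, 1] / np.var(order_flow, ddof=1)
print(f"estimated lambda: {lambda_hat:.4f} (true value used: {true_lambda})")
```

On real exchange data the regression itself is the easy part; as noted above, the hard part is separating informed order flow from noise trading.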
The Linearity Bottleneck
This inherent linearity is the crux of the problem. If the true value function of an asset is highly non-linear (like our all-or-nothing conjunction example), a linear aggregation mechanism struggles to accurately reflect it unless a single actor possesses (and can process) almost all the critical information.
In our conjunction problem, the market price approximates:
\[P_t \approx \sum_{j=1}^{m} w_j E[V\mid I_j]\]

Since every individual trader’s expectation $E[V\mid I_j]$ is near zero (as shown earlier), their weighted sum will also hover near zero, irrespective of the weights $w_j$. The market remains blind to the potential “significant value.”
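The sketch below simulates this directly, using the weighted update rule from earlier: every one of the 10,000 conditions is in fact true, each of 100 traders observes a disjoint slice of 100 variables, and yet the linearly aggregated price never leaves zero. (The payoff, weights, and $\lambda$ are again illustrative assumptions.)

```python
import numpy as np

rng = np.random.default_rng(1)

SIGNIFICANT_VALUE = 1_000.0
N_VARS, VARS_PER_TRADER = 10_000, 100
N_TRADERS = N_VARS // VARS_PER_TRADER

x = np.ones(N_VARS, dtype=bool)  # ground truth: every condition holds, so V = 1,000
p = 0.9                          # each trader's prior that an unobserved variable is true

# Each trader's valuation: E[V | I_j] = V * p^(unobserved) if their slice is all true, else 0.
valuations = np.array([
    SIGNIFICANT_VALUE * p ** (N_VARS - VARS_PER_TRADER)
    if x[j * VARS_PER_TRADER:(j + 1) * VARS_PER_TRADER].all() else 0.0
    for j in range(N_TRADERS)
])

# Linear price formation: P_t = P_{t-1} + lambda * sum_j w_j (E[V | I_j] - P_{t-1})
weights = rng.dirichlet(np.ones(N_TRADERS))  # relative market power, sums to 1
lam, price = 0.5, 0.0
for _ in range(100):
    price += lam * np.sum(weights * (valuations - price))

print(f"market price after trading:    {price:.2e}")  # stays at ~0
print(f"value with pooled information: {SIGNIFICANT_VALUE if x.all() else 0.0}")
```

No choice of weights rescues this: a weighted average of (effectively) zeros is zero.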
The Formal Constraint
This leads to a more general statement:
Theorem (Informal): If an asset’s true value $V$ is determined by a complex, non-linear function of many variables $X_i$ (such as a large conjunction or other intricate logical relationships), and no single trader’s information set $I_j$ (or their cognitive/computational capacity) covers a sufficiently comprehensive portion of these variables and their interdependencies, then a market mechanism relying on linear aggregation of individual valuations through price signals alone may fail to converge to $V$.
In our example, even if every trader knows that all variables in their specific, limited subset are true, the market price can remain stubbornly at zero. Traditional market activity – buying and selling based on these limited individual valuations – won’t solve this particular puzzle because:
- The value function is starkly non-linear (a logical AND across thousands of variables).
- Each trader’s rational valuation, based on incomplete information, is near zero.
- The market’s price aggregation mechanism is fundamentally linear.
- Cognitive and computational limitations prevent any single participant from integrating the full, dispersed set of variables.
Markets vs. Boosting: Two Flavors of Aggregating Weak Predictions
This market limitation becomes particularly stark when contrasted with concepts from ensemble learning in machine learning, notably “boosting.” Both markets and boosting algorithms aim to combine many “weak” individual assessments to form a stronger, more accurate collective judgment, but their methodologies diverge significantly[^6].
The Power of Weak Learners: A Machine Learning Marvel
A cornerstone of machine learning theory is the surprising and powerful result that “weak learners” can be combined to create “strong learners.” A weak learner is an algorithm that performs only slightly better than random guessing. Robert Schapire’s seminal work in 1990 demonstrated that, through a process called boosting, such weak learners can be iteratively combined to produce a classifier with arbitrarily high accuracy.
The Crucial Distinction
The critical difference lies in how this combination occurs. Boosting algorithms learn the optimal way to combine the outputs of weak learners, often in highly non-linear ways. They adaptively re-weight the importance of different weak learners and the data points themselves. Markets, in their price discovery role, predominantly use a fixed, linear aggregation rule (weighted by capital).
For our conjunction problem:
- A sophisticated boosting algorithm, given access to the outputs of many “traders” (each reporting on their subset of variables), could potentially learn the underlying AND function. It would learn that only when all (or a very high proportion of) inputs are true does the positive outcome occur.
- The market’s linear price formation, as we’ve seen, cannot inherently represent this logical AND relationship across widely dispersed information.
This divergence helps explain why our market counterexample fails where an ensemble learning approach might succeed. The market, in its basic price-setting function, lacks the adaptive, non-linear combination capabilities that make boosting so effective. This also hints at why specialized information intermediaries, analysts who synthesize diverse data, and increasingly, sophisticated AI-driven funds, play a role in financial markets – they are, in effect, attempting to perform these more complex, non-linear aggregation functions that raw price signals alone cannot achieve.
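To make the contrast tangible, here is a scaled-down toy experiment (20 variables instead of 10,000, with data-generating choices that are my assumptions, not the post’s full setup): each of five “traders” reports their conditional valuation of their own slice, the “market” averages those reports with equal weights, and a gradient-boosted ensemble is trained on the very same reports.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)

# Toy setup: 20 binary conditions, 5 "traders" each observing a disjoint slice of 4.
PAYOFF, N_VARS, N_TRADERS, P_TRUE = 1_000.0, 20, 5, 0.9
SLICE = N_VARS // N_TRADERS

X = rng.random((100_000, N_VARS)) < P_TRUE  # the underlying conditions
y = X.all(axis=1).astype(int)               # stock is valuable iff ALL conditions hold

# Each trader reports E[V | I_j]: PAYOFF * p^(unseen vars) if their slice is all true, else 0.
slice_ok = np.stack(
    [X[:, j * SLICE:(j + 1) * SLICE].all(axis=1) for j in range(N_TRADERS)], axis=1
)
reports = slice_ok * PAYOFF * P_TRUE ** (N_VARS - SLICE)

# "Market" aggregation: a fixed, equal-weight linear average of the reports.
market_price = reports.mean(axis=1)

# Boosting aggregation: learn a non-linear combination of the same reports.
R_train, R_test, y_train, y_test = train_test_split(reports, y, test_size=0.2, random_state=0)
booster = GradientBoostingClassifier(n_estimators=100, max_depth=2, random_state=0)
booster.fit(R_train, y_train)

valuable = y.astype(bool)
print(f"avg market price when the stock IS valuable (true value {PAYOFF:.0f}): {market_price[valuable].mean():.1f}")
print(f"avg market price when the stock is worthless (true value 0): {market_price[~valuable].mean():.1f}")
print(f"boosting accuracy on held-out data: {booster.score(R_test, y_test):.3f}")
```

The equal-weight average lands far below the true payoff precisely when the stock is valuable and stays well above zero when it is worthless, so it distinguishes the two states poorly; the boosted combiner, free to learn a non-linear rule, recovers the AND structure almost perfectly. That gap is the linearity bottleneck in miniature.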
Beyond Price: The Market’s Other Voices
The limitations of price as the sole aggregator of complex, non-linear information help explain why real-world markets have evolved a richer tapestry of communication channels[^7]:
- Direct Corporate Disclosures: Companies don’t just let the market guess. Earnings calls, annual reports, investor presentations, and detailed financial statements provide multi-dimensional information far richer than a single price point.
- Information Intermediaries: Sell-side analysts, credit rating agencies, and alternative data providers make a business of collecting, processing, and interpreting complex, often non-linear, information, then disseminating their conclusions (which are, themselves, new information).
- Derivative Markets and Market Microstructure Clues: Options prices can reveal implied volatility and probability distributions. The depth of the order book can signal conviction and potential support/resistance. Futures curves offer insights into expected future spot prices. These are all richer signals than a single last-traded price.
- Institutional Coordination & Direct Communication: Industry conferences, investor coalitions, and even informal networks allow for more direct and nuanced information exchange among significant market players, sometimes facilitating a more holistic understanding before large-scale trading occurs[^8].
In our conjunction problem, these alternative channels could, in principle, offer a way out. Specialists could explicitly communicate, “My 100 variables are ALL TRUE,” rather than just placing a near-zero bid. Coalitions could form to pool and explicitly share their findings before committing capital. However, the feasibility and efficiency of such coordination on a massive scale, especially with 10,000 variables, remains a challenge.
The Quest for Perfect Foresight: A Sisyphean Task?
Ultimately, accurately valuing complex entities like companies, let alone predicting future market movements, requires a profound understanding of how the world works. Imagine the ideal: a causal model of the global economy and all its interacting parts, fed by yottabytes of real-time data. Such a model would need to capture countless non-linear relationships, feedback loops, and emergent behaviors. While this is the holy grail, it remains far beyond our current capabilities, highlighting that even with advanced tools, perfect information aggregation is a monumental, perhaps perpetually elusive, goal.
Implications for AI and the Future of Markets
The “conjunction problem” and the linearity constraint have significant implications as artificial intelligence becomes more integrated into financial markets. An AI system with the sheer computational capacity to monitor, process, and find patterns across all 10,000 variables (and their non-linear interactions) could, in theory, identify the stock’s true value where the human-driven market fails. This points to a potential avenue for AI to outperform markets, specifically in scenarios characterized by:
- High-dimensionality: Vast numbers of relevant variables.
- Complex Non-Linearities: Value derived from intricate interactions, not simple sums.
- Dispersed Information: No single human agent possessing the full picture.
Unlike human traders, who are bound by cognitive limits (we can only juggle so much information at once!) and often rely on heuristics, advanced AI systems can potentially:
- Process and “remember” vastly larger and more complex datasets.
- Detect subtle, non-linear correlations and causal links that escape human intuition.
- Integrate information from incredibly diverse sources – news, social media, satellite imagery, financial statements, etc. – in a more holistic way.
However, it’s crucial to add a caveat: if we simply scale up the “conjunction problem” to be even larger and more complex (say, a million variables with even more convoluted interdependencies), it could once again exceed the capabilities of current AI. The frontier of computational capacity is always a moving target.
Can We “Fix” Markets for Non-Linearity?
This raises the question: can market mechanisms themselves be redesigned to better handle such non-linear aggregation challenges? Some ideas, though speculative, include:
- More Expressive Betting Mechanisms: Markets that allow traders to bet on complex combinations of events or variable states, rather than just price levels.
- Information Aggregation Platforms: Systems designed to explicitly pool and analyze dispersed information snippets in a secure and verifiable way, perhaps using cryptographic methods, before translating that into market actions.
- AI-Augmented Market Making: Market makers or liquidity providers who use sophisticated AI to detect potential non-linear patterns from order flow and other data sources, adjusting their pricing and liquidity provision accordingly.
- Prediction Markets with More Complex Payoff Structures: Moving beyond simple binary outcomes to contracts that reward the correct identification of complex conditional events.
These are frontiers of research and development, but they point towards a future where markets might evolve to become more adept at tackling the kinds of complexities our conjunction example highlights.
Conclusion: Beating the Market – Infeasible, or Just Really, Really Hard?
The Efficient Market Hypothesis provides an invaluable lens for understanding how financial markets generally operate. However, our journey through the “conjunction problem” reveals that the market’s ability to aggregate information through price mechanisms alone is not absolute. It faces fundamental constraints when confronted with highly complex, non-linear value functions and widely dispersed information, especially when coupled with the inherent cognitive and computational limits of individual participants.
So, is it possible to “beat the market”? Our example suggests that, in principle, if you could uniquely access and process information in a way that overcomes these aggregation failures (particularly the non-linear aspect), then yes. However, achieving this is monumentally challenging. It would necessitate possessing superior information, superior processing capabilities (potentially on a vast scale), and the ability to act on it before this insight becomes widely disseminated or the market structure itself adapts.
Perhaps the EMH is not so much “wrong” as it is a powerful approximation that holds true most of the time, for most participants. The edge cases, like our conjunction problem, highlight that outperformance is rare not because of some magical market omniscience, but because it is infeasibly difficult for any single agent to consistently gather, process, and act upon the totality of complex, non-linear information faster and more accurately than the collective, albeit imperfect, wisdom of the crowd. The challenge is less about the market being perfectly efficient, and more about the sheer difficulty of being more efficient than everyone else in a computationally and cognitively bounded world.
[^1]: Fama, E. F. (1970). Efficient Capital Markets: A Review of Theory and Empirical Work. The Journal of Finance, 25(2), 383-417.

[^2]: Malkiel, B. G. (2003). The Efficient Market Hypothesis and Its Critics. Journal of Economic Perspectives, 17(1), 59-82.

[^3]: Grossman, S. J., & Stiglitz, J. E. (1980). On the Impossibility of Informationally Efficient Markets. The American Economic Review, 70(3), 393-408.

[^4]: Hayek, F. A. (1945). The Use of Knowledge in Society. The American Economic Review, 35(4), 519-530.

[^5]: Kyle, A. S. (1985). Continuous Auctions and Insider Trading. Econometrica, 53(6), 1315-1335.

[^6]: Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5(2), 197-227.

[^7]: Bouchaud, J. P., Farmer, J. D., & Lillo, F. (2009). How markets slowly digest changes in supply and demand. In T. Hens & K. R. Schenk-Hoppé (Eds.), Handbook of Financial Markets: Dynamics and Evolution (pp. 57-160). North-Holland.

[^8]: Diamond, D. W., & Verrecchia, R. E. (1991). Disclosure, Liquidity, and the Cost of Capital. The Journal of Finance, 46(4), 1325-1359.