Laketap Team

Why Lakehouse Performance Is a Coordination Problem

Why lakehouse performance is governed by coordination across independent systems, not by execution speed inside any one of them.

In this series: Acceleration in Modern Lakehouse Platforms

Tags: Distributed Coordination · System-Level Performance · Lakehouse

From shared data to fragmented access paths


Performance loses a single point of control

Lakehouse systems are built to maximize flexibility.

Storage is shared. Engines evolve independently. Workloads choose the best runtime for the job.

This flexibility is the lakehouse’s defining advantage. But it also introduces a less obvious consequence.

Once execution is decoupled, performance loses a single point of control.

Each engine can be fast. Each component can be well optimized. And still, end-to-end performance fails to compound.

That failure is not accidental. It reflects a deeper issue: lakehouse performance is governed by coordination across independent systems, not by execution speed inside any one of them.

In a lakehouse, performance is a coordination problem.


What “coordination” means in a data platform

Coordination is often misunderstood as an organizational problem—teams, processes, or communication.

In data platforms, the term has a more precise meaning.

Coordination means that multiple independent decisions must align for a query to be fast.

In a lakehouse, those decisions are made by different components, at different times, with different scopes of visibility:

  • Catalogs decide what data exists and which version is current
  • Planners decide how data should be accessed
  • Engines decide how work is executed
  • Storage and networks decide where data flows

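The independence of these decisions can be made concrete with a small sketch. This is a hypothetical model with illustrative names, not any real engine's API: each component makes its own decision with only local visibility, and hands a bare result to the next stage with no shared context.

```python
from dataclasses import dataclass

@dataclass
class Catalog:
    # Decides what data exists and which snapshot is current.
    snapshots: dict  # table name -> current snapshot id

    def resolve(self, table: str) -> int:
        return self.snapshots[table]

@dataclass
class Planner:
    # Decides how data should be accessed, seeing only the catalog's answer.
    def plan(self, snapshot: int) -> list:
        return [f"file-{snapshot}-{i}.parquet" for i in range(3)]

@dataclass
class Engine:
    # Decides how work is executed, seeing only the plan it was handed.
    def execute(self, files: list) -> int:
        return sum(len(f) for f in files)  # stand-in for an actual scan

catalog = Catalog(snapshots={"orders": 42})
planner = Planner()
engine = Engine()

# Each handoff passes a result, not context: the engine never learns
# why these files were chosen, and the catalog never learns how they
# were read.
snapshot = catalog.resolve("orders")
files = planner.plan(snapshot)
rows = engine.execute(files)
```

Each call here is locally correct, which is exactly the point: nothing in this chain is wrong, and nothing in it is coordinated.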
Each decision can be locally correct. Performance problems emerge when they fail to align globally.

From this point on, performance in a lakehouse should be understood not as an execution problem, but as a coordination problem.


Why execution can improve while performance does not

Modern engines are highly optimized.

Execution is vectorized. Operators are efficient. Runtimes are faster than ever.

Yet platform-level performance remains inconsistent and difficult to reason about.

The reason is simple: execution is optimized locally, while performance depends on the entire access path.

No single engine sees that full path. Each optimizes its own segment, assuming the rest of the system will cooperate.

That assumption rarely holds.

Execution gets faster. But the cost of misalignment remains.


Coordination in practice: the modern lakehouse IO path

This becomes clearer when we follow a typical lakehouse IO path.

A query begins with a catalog lookup. Metadata is resolved. Files are selected. Splits are constructed. Data is scanned, decoded, filtered, and transferred.

At each stage, decisions are made independently.

Metadata conclusions are derived early but discarded once execution begins. Planning decisions assume execution behavior that may not hold across engines. Split construction proceeds with no memory of prior scans. Each workload starts from a cold understanding of the same data.

What matters is not the individual steps, but what doesn’t happen between them.

Decisions made early in the path are not preserved, shared, or coordinated downstream. Each stage recomputes what the previous stage already learned.
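The cost of this recomputation can be sketched directly. In the hypothetical example below (illustrative stats-based file pruning, not a real engine's implementation), two engines run the same query against the same table, and each independently re-reads the per-file statistics to reach an identical answer.

```python
# Count how many metadata reads the system performs in total.
metadata_reads = 0

def prune_files(table_files, predicate):
    """Read per-file min/max stats and keep files that may match."""
    global metadata_reads
    lo_q, hi_q = predicate
    kept = []
    for name, (lo, hi) in table_files.items():
        metadata_reads += 1  # one footer/stats read per file, per caller
        if not (hi < lo_q or lo > hi_q):  # ranges overlap -> may match
            kept.append(name)
    return kept

files = {"a.parquet": (0, 10), "b.parquet": (20, 30), "c.parquet": (5, 25)}

# Two engines run the same scan independently.
engine1 = prune_files(files, (0, 10))
engine2 = prune_files(files, (0, 10))

# Both arrive at the same conclusion...
assert engine1 == engine2 == ["a.parquet", "c.parquet"]
# ...but the system paid for the same knowledge twice.
assert metadata_reads == 6
```

Nothing failed here. Every read was necessary from each engine's point of view, because neither could see that the answer already existed.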


The problem is not length — it is fragmentation

It is tempting to describe this as a “long IO path.”

But length alone does not explain poor performance.

A long path with coordination can be efficient. A short path without coordination cannot.

The defining characteristic of the lakehouse IO path is fragmentation.

Each segment operates in isolation. Each optimization resets at component boundaries. Each engine re-discovers the same facts about the same data.

The system pays repeatedly for the same knowledge because there is no place where that knowledge is coordinated.
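What coordination changes can be shown by extending the same sketch: if pruning results are preserved in a layer above the engines (here a simple memoized function, purely illustrative), each fact about the data is paid for once, no matter how many engines ask.

```python
from functools import lru_cache

metadata_reads = 0

FILES = {"a.parquet": (0, 10), "b.parquet": (20, 30), "c.parquet": (5, 25)}

@lru_cache(maxsize=None)
def shared_prune(predicate):
    """Pruning computed once, reused by any engine asking the same question."""
    global metadata_reads
    lo_q, hi_q = predicate
    kept = []
    for name, (lo, hi) in FILES.items():
        metadata_reads += 1  # stats are read only on a cache miss
        if not (hi < lo_q or lo > hi_q):
            kept.append(name)
    return tuple(kept)

# Three engines issue the same scan; only the first pays for metadata.
for _ in range(3):
    result = shared_prune((0, 10))

assert result == ("a.parquet", "c.parquet")
assert metadata_reads == 3  # one read per file in total, not per engine
```

The engines are unchanged. What changed is where the knowledge lives: above any single runtime, where it can be reused.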


Why engines cannot solve coordination

At this point, a natural response is to ask whether engines can simply become more aware.

In practice, they cannot.

Engines are designed to optimize execution within their own context. They see a plan, a runtime, and a bounded workload. They do not see:

  • Access patterns across engines
  • Historical behavior across workloads
  • Reuse opportunities across time

This is not a limitation of engine design, but of engine scope.

Coordination requires a vantage point that spans engines, workloads, and lifecycles. That vantage point does not exist inside any single engine.


Performance as a system property

This is why lakehouse performance cannot be understood as an engine property or a storage property.

It is a system property.

Performance emerges from how the system coordinates access to shared data—how decisions made at one stage are preserved, reused, or lost at the next.

When coordination is absent, performance fragments. When coordination exists, performance compounds.

The difference is not execution speed. It is where coordination lives.


Where acceleration must live

If performance is a coordination problem, acceleration cannot live inside engines alone.

It must live where coordination is possible—above individual runtimes, across access paths, and over time.

This is the missing layer in the lakehouse model.

The question is no longer whether we need another engine, but where coordination can actually exist in a lakehouse.