
As datasets grow, feeding large JSON blobs back to LLMs becomes inefficient and impractical. Code orchestration emerges as a more scalable solution for managing complex data flows in real-world applications.
When working with Model Context Protocol (MCP) servers and integrating them into real-world applications, one common practice is to feed the outputs from tool calls back into the Large Language Model (LLM) as messages. The idea is that the model will interpret this data and determine the next steps. This approach can be effective for small datasets, but it quickly becomes problematic with larger, more complex data.
Let's dive into why LLM function calls don't scale well:
Tool responses often include id fields and other metadata that take up many tokens but offer little semantic value. This inefficiency can quickly exhaust the token limit of LLMs, leading to slower processing times and higher costs.
To illustrate this, consider our use case with Linear and Intercom:
The JSON returned by these tools is full of id fields that are not semantically meaningful.
When using Claude with these MCPs, the entire JSON blob is sent back to the model verbatim. This approach can lead to significant performance issues and data loss, as the model might fail to accurately reproduce or process all the data.
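To make the overhead concrete, here is a minimal Python sketch that estimates how much of a serialized tool result is spent on id-like fields. The sample records, their field names, and the helpers `is_id_key`, `strip_ids`, and `id_overhead` are all hypothetical and are not taken from the Linear or Intercom APIs. Character counts are used only as a rough stand-in for tokens.

```python
import json

# Hypothetical tool result; field names are illustrative, not a real API schema.
issues = [
    {
        "id": "a3f1c9e2-7b44-4e0d-9c1a-5d2e8f6b7a10",
        "teamId": "0c9d8e7f-6a5b-4c3d-2e1f-0a9b8c7d6e5f",
        "assigneeId": "11111111-2222-3333-4444-555555555555",
        "title": "Fix login redirect",
        "priority": 2,
    },
    {
        "id": "b4e2d0f3-8c55-4f1e-8d2b-6e3f9a7c8b21",
        "teamId": "0c9d8e7f-6a5b-4c3d-2e1f-0a9b8c7d6e5f",
        "assigneeId": "66666666-7777-8888-9999-000000000000",
        "title": "Crash on export",
        "priority": 1,
    },
]


def is_id_key(key: str) -> bool:
    """Assumed naming convention: a key is id-like if it is 'id' or ends in 'Id'."""
    return key == "id" or key.endswith("Id")


def strip_ids(value):
    """Recursively drop id-like keys from parsed JSON."""
    if isinstance(value, dict):
        return {k: strip_ids(v) for k, v in value.items() if not is_id_key(k)}
    if isinstance(value, list):
        return [strip_ids(v) for v in value]
    return value


def id_overhead(payload) -> float:
    """Fraction of serialized characters attributable to id-like fields."""
    full = len(json.dumps(payload))
    slim = len(json.dumps(strip_ids(payload)))
    return (full - slim) / full


print(f"{id_overhead(issues):.0%} of the serialized characters are id fields")
```

In this toy payload most of the characters belong to identifiers, which is the kind of overhead the model pays for when the blob is passed back verbatim.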

The core issue here is that we are conflating orchestration (managing the workflow) with data processing (handling the actual data). This confusion leads to inefficiencies and scalability problems. Here’s how code orchestration can help:
For example, a program can run a sort operation directly on the parsed data.
Let’s walk through a simple example to see how code orchestration can be more effective:
Call ordinary operations (such as sort) to process the data as needed.
By following this approach, you can handle large datasets more efficiently and avoid the pitfalls of token overhead and data reproduction.
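The approach can be sketched in Python as follows. This is an illustration under stated assumptions, not the original author's implementation: `list_issues` is a hypothetical stand-in for a tool call that returns a JSON string, the field names are invented, and a lower `priority` number is assumed to mean more urgent. The point is that the bulk data stays in program variables, and only a short summary would need to go back to the model.

```python
import json


def list_issues() -> str:
    """Hypothetical stand-in for a tool call that returns a JSON string."""
    return json.dumps([
        {"id": "iss-3", "title": "Fix login redirect", "priority": 2},
        {"id": "iss-1", "title": "Crash on export", "priority": 1},
        {"id": "iss-2", "title": "Update onboarding copy", "priority": 3},
    ])


def top_issues(raw: str, limit: int) -> list[dict]:
    """Parse the tool output and sort it in code instead of in the model."""
    issues = json.loads(raw)
    # Assumption: a lower priority number means a more urgent issue.
    issues.sort(key=lambda issue: issue["priority"])
    return issues[:limit]


def summarize(issues: list[dict]) -> str:
    """Build the compact text that would be handed back to the model."""
    return "\n".join(f"{issue['priority']}: {issue['title']}" for issue in issues)


summary = summarize(top_issues(list_issues(), limit=2))
print(summary)
# 1: Crash on export
# 2: Fix login redirect
```

The sort and the truncation happen deterministically in code, so the model never has to reproduce the full JSON blob token by token.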
While LLM function calls are a powerful tool, they hit a wall when dealing with large, real-world datasets. By separating orchestration from data processing using code orchestration, we can achieve better performance, reduce costs, and minimize the risk of errors. This approach is more scalable and aligns well with the structured nature of modern APIs.
Original Sources
↗ https://jngiam.bearblog.dev/mcp-large-data/?utm_source=tldrai
About the author
Kai built ML infrastructure at a Bay Area startup before developing an obsession with transformer architectures and inference optimisation that eventually pulled him out of product work entirely. A stint at a compute research lab sharpened his instinct for what actually matters in a model release versus what is marketing. He writes from the inside — from the perspective of someone who has debugged the systems he is describing at three in the morning. He is allergic to hype and instinctively drawn to the unglamorous plumbing questions that everyone else skips over.
22 May 2025