Improving Performance by Reducing API Data Size

Published:
Keywords: performance

In a recent consulting engagement, I worked on improving performance for a ticketing e-commerce website. This is the first of a series of posts about what helped. I’ll discuss what led to the decision to reduce data size, the degree to which that change helped, and other adjacent ideas like Backend-for-Frontend (BFF).

Diagnosis

One of the first things I did to diagnose the website’s performance was to use Chrome DevTools to observe a page load with the Network tab open. I noticed that one particular API call took 5+ seconds, and the page was blocked from its first contentful paint until that call completed. The API response included multiple megabytes of content covering one month of an ongoing event series - for example: river cruises and museum tours.

Because the payload was so large, the browser tied up the main thread parsing JSON, causing “micro-stutters” where the UI could briefly appear frozen and stop responding to user input. This is especially noticeable on underpowered devices such as kiosks.

Further, there’s an interesting performance opportunity when you can fit a payload under roughly 14kB: doing so can significantly help page load and API performance, due to the implementation details of the TCP slow start algorithm. Without getting into too much technical detail, this happens because the initial TCP congestion window (conventionally 10 TCP segments, each carrying about 1,460 bytes, for roughly 14.6kB total) limits how much data the server can send in the first round trip.

In practice, staying under this window can make the first response arrive up to roughly twice as fast compared to slightly larger payloads. Missing the ~14kB window by even 1kB can add 600ms or more to that network call on higher-latency connections, because the excess bytes must wait for at least one additional round trip. Relatedly, for webpage content that can’t be made that small, it can be helpful to stream meaningful “above-the-fold” content within the first 14kB, focusing on what is most important for the user to see when they visit your website. This can include headers, titles, and key CSS styling.
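
As a rough illustrative check, a test or build step could flag payloads that overflow the first round trip. The 14.6kB budget below assumes the conventional initial congestion window of 10 segments at ~1,460 bytes each; real values vary by server configuration and network path, and the check should really be applied to the compressed (e.g. gzip/brotli) bytes that actually go over the wire:

```typescript
// Sketch: flag payloads that won't fit in the first TCP round trip.
// Assumes an initial congestion window of 10 segments * ~1460 bytes (~14.6kB);
// real-world values depend on the server and the network path.
const INITIAL_CWND_BYTES = 10 * 1460;

function fitsFirstRoundTrip(payload: string): boolean {
  // Buffer.byteLength counts UTF-8 bytes, not JavaScript string length.
  return Buffer.byteLength(payload, "utf8") <= INITIAL_CWND_BYTES;
}
```

In practice you would run this against the compressed response body, and treat it as a budget warning rather than a hard failure.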

Solution

Returning to the earlier example: we control both the frontend and the backend code, and the product requirements only call for showing the complete details of one day at a time. It therefore made sense to segment the data into per-day API calls, with the YYYY-MM-DD date value as part of each call’s path. Requesting a single day only took around 900ms, which significantly sped up the page load.
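
A minimal sketch of the per-day routing idea (the endpoint path is hypothetical, not the client’s actual API):

```typescript
// Hypothetical per-day endpoint: /api/events/YYYY-MM-DD
function eventsPathFor(date: Date): string {
  // toISOString() is UTC-based; a production version would format the
  // date in the event venue's time zone instead.
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `/api/events/${day}`;
}
```

The frontend then requests only the day currently in view, and can prefetch adjacent days in the background if snappy navigation between days matters.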

In general, the split point doesn’t have to be a date - other criteria can make sense depending on the use case. Reducing data might even require multiple split points, in which case the API path can include all of them (either in clear text or in hashed form).
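
For the multi-split-point case, one approach is to canonicalize the split keys and hash them into a single opaque path segment. This is a sketch under assumed names (the `/api/events/segment/` prefix and the key names are hypothetical):

```typescript
import { createHash } from "node:crypto";

// Combine several split criteria (e.g. date + category) into one path
// segment. Hashing keeps the URL short and opaque when the combined key
// would otherwise be long or expose internal values.
function segmentPath(splitKeys: Record<string, string>): string {
  const canonical = Object.keys(splitKeys)
    .sort() // stable ordering so the same keys always hash identically
    .map((k) => `${k}=${splitKeys[k]}`)
    .join("&");
  const digest = createHash("sha256").update(canonical).digest("hex").slice(0, 16);
  return `/api/events/segment/${digest}`;
}
```

The sort step matters: without a canonical key order, `{date, category}` and `{category, date}` would hash to different URLs and defeat HTTP caching.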

Opportunity: BFF Endpoints

One architectural pattern that encourages these kinds of optimizations is Backend-for-Frontend (BFF). A Backend-for-Frontend is a backend whose only consumer is a single frontend product. It sits between the client and other backend services, and is expected to be developed alongside the frontend. Ideally, the BFF and the frontend even live in the same monorepo, where a single pull request can span changes to both.

This pattern is especially helpful when there are dependencies between individual existing backend API calls. Rather than having the frontend wait for one API call to complete before making the next, bundle that work into one combined BFF API call that returns all of the needed attributes and/or related entities instead of just IDs.
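
The shape of that aggregation might look like the sketch below. The entity and fetcher names (`fetchOrder`, `fetchCustomer`, `fetchTicket`) are hypothetical; the point is that the dependent second hop runs in parallel on the server rather than as sequential round trips from the browser:

```typescript
// Fetchers are injected so the aggregation logic stays testable.
type Fetchers = {
  fetchOrder: (orderId: string) => Promise<{ customerId: string; ticketIds: string[] }>;
  fetchCustomer: (id: string) => Promise<{ name: string }>;
  fetchTicket: (id: string) => Promise<{ seat: string }>;
};

async function orderPageData(orderId: string, f: Fetchers) {
  // First hop: resolve the IDs of the related entities.
  const order = await f.fetchOrder(orderId);
  // Second hop: resolve all related entities in parallel, server-side,
  // instead of making the browser issue N sequential requests.
  const [customer, tickets] = await Promise.all([
    f.fetchCustomer(order.customerId),
    Promise.all(order.ticketIds.map(f.fetchTicket)),
  ]);
  return { order, customer, tickets };
}
```

The BFF endpoint then returns this single combined object, and the frontend makes exactly one network call per page (or per page section).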

One can imagine a progression where gradual improvements are made to the existing architecture.

  1. As a first step, implement a single BFF API call for the data needed on the page, or perhaps for one section of a complex page. Within that API, include each kind of data as a top-level (or nearly top-level) payload, ideally scoping that data appropriately to keep API response payload size under control.

    Remember: Don’t worry about having perfect levels of abstraction, as the only consumer of this call is the frontend.

  2. Later, implement other optimizations.

    • Potentially allow the BFF to render HTML server-side for the part of the page that it controls instead of returning JSON payloads, leveraging tools like Next.js or TanStack Start; React Compiler can further reduce client-side rendering work.
    • For the backend APIs that the BFF calls, potentially adjust these to provide smaller data payloads and/or more ways to filter response data using query parameters or POST parameters.
    • Potentially allow the BFF to have a bit more direct access to the data sources to skip a few layers, as long as rate limits and other controls are properly implemented.
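
For the second bullet, the BFF side of a filterable upstream API could build its requests along these lines. The parameter names (`fields`, `date`, `limit`) are illustrative, not a real API’s contract:

```typescript
// Hypothetical helper: ask an upstream API for only the fields and rows
// the BFF actually needs, via query parameters.
function filteredUrl(
  base: string,
  opts: { fields: string[]; date: string; limit?: number }
): string {
  const url = new URL(base);
  url.searchParams.set("fields", opts.fields.join(","));
  url.searchParams.set("date", opts.date);
  if (opts.limit !== undefined) url.searchParams.set("limit", String(opts.limit));
  return url.toString();
}
```

Field selection like this is where most of the payload reduction tends to come from: the upstream service stops serializing attributes no page ever renders.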

Conclusion

APIs that are designed for general consumption by any kind of frontend or backend provide helpful flexibility, but they can lead to poor webpage performance without further data segmentation. When your API calls start dragging past the 1-second mark, look at the payload size. Are you sending the entire history when the page only needs a smaller subset?

A BFF can be an interesting further idea that allows treating your API payload as part of the UI design. By restricting data to the current context (specific dates, active items, or relevant categories), your application can, as a general rule, remain performant and flexible.

What’s Next

Our segmented API calls in this example now take 900ms each instead of 5+ seconds. But there’s more to be done.

  • What can be done to reduce that 900ms even further?
  • What can cause temporary mysterious latency spikes of 10+ seconds?
  • And what if users in other regions are noticing slowdowns that seem much worse than what the primary region experiences?

Stay tuned to future blog posts in this series for answers to those questions.