Scaling React/Redux in Yahoo Finance
Yahoo started modernizing its stack across the board roughly three years ago. While internally we have a stable framework on top of React & Fluxible, we decided to experiment with the Redux architecture in Yahoo Finance Mobile Web, with the potential to expand it to certain pieces of desktop. In terms of RUM, Finance Mobile Web commands roughly 2M DAU and generates 3.3M daily PVs: a big enough space to explore new opportunities, yet small enough to manage user-impacting risks.
Architecture
At a high level, we believe there are a few key components in building an isomorphic web application:
- API Layer: In charge of fetching data from the source of truth and updating user data accordingly
- Data Massaging Layer: In charge of massaging data into UI-friendly business model objects
- State Management: Manages and synchronizes state between various pieces of the app, including UI local states, the caching layer, upstream APIs…
- Router: Handles route changes, which include setup/teardown of certain application state
- Rendering Engine: Renders DOM and manages local UI states (such as selected, active…)
API Layer
We decided to build our API layer on top of fetch, thanks to its modern Promise-based API. While making a simple call isn't hard, in a production system a lot of complexity comes from a combination of contextualizing the request, measuring, caching and the like.
To keep the layer lean yet extensible, instead of integrating performance labels, proxies, fallbacks… directly into the layer, we offload the complexity into the message action instead.
The core action throughout the application is a POJO with exactly four fields:

export interface FSAction<T, U> {
  /**
   * Type (required)
   */
  type: string
  /**
   * Payload; if error is true, the payload is the Error,
   * otherwise it's the data
   */
  payload?: T
  /**
   * Metadata that comes with the payload
   */
  meta?: U
  /**
   * Error flag
   */
  error?: any
}
For example, instead of bundling a caching layer into our API layer, we parse Cache-Control and expose it in the meta field. Instead of bundling performance logging, we capture response timing and expose a performance field in meta as well. Both pieces of information are delegated to a centralized layer that examines the action and determines the appropriate behavior.
export interface CacheControl {
  /**
   * Max-age from the cache-control spec
   */
  'max-age': number
  /**
   * Stale while revalidate
   */
  'stale-while-revalidate'?: number
}

export interface PerformanceTiming {
  label?: string
  start?: number
  stop?: number
  failure?: boolean
}
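To make the convention concrete, here is a minimal sketch of an API call that emits such an action. The endpoint, action types and the parseCacheControl helper are illustrative, not our production code:

// Illustrative helper: parse the relevant cache-control directives
// so they can ride along in the action's meta field.
function parseCacheControl(header: string | null): CacheControl | undefined {
  if (!header) return undefined
  const parsed: Partial<CacheControl> = {}
  for (const part of header.split(',')) {
    const [key, value] = part.trim().split('=')
    if (key === 'max-age' || key === 'stale-while-revalidate') {
      parsed[key] = Number(value)
    }
  }
  return parsed['max-age'] !== undefined ? (parsed as CacheControl) : undefined
}

// Illustrative wrapper: the resulting FSAction carries cache-control
// and timing information in meta for downstream middlewares.
async function fetchQuotes(symbols: string[]): Promise<FSAction<any, any>> {
  const timing: PerformanceTiming = { label: 'quotes', start: Date.now() }
  try {
    const res = await fetch(`/api/quotes?symbols=${symbols.join(',')}`)
    const payload = await res.json()
    timing.stop = Date.now()
    return {
      type: 'QUOTES_FETCH_SUCCESS',
      payload,
      meta: {
        cacheControl: parseCacheControl(res.headers.get('cache-control')),
        performance: timing,
      },
    }
  } catch (err) {
    timing.stop = Date.now()
    timing.failure = true
    return {
      type: 'QUOTES_FETCH_FAILURE',
      payload: err,
      error: true,
      meta: { performance: timing },
    }
  }
}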
Data Massaging Layer
This layer is primarily Redux reducers, so nothing too interesting: we do HTML decoding, number manipulation and such here.
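As an illustration (using the FSAction type from above; the action type and payload shape are hypothetical), a massaging reducer might look like:

interface Quote {
  symbol: string
  price: string
}

interface QuotesState {
  [symbol: string]: Quote
}

function quotes(state: QuotesState = {}, action: FSAction<any, any>): QuotesState {
  switch (action.type) {
    case 'QUOTES_FETCH_SUCCESS': {
      if (action.error) return state
      const next = { ...state }
      for (const raw of action.payload.quotes) {
        // Massage raw API data into a UI-friendly shape:
        // decode HTML entities, format numbers, etc.
        next[raw.symbol] = {
          symbol: raw.symbol,
          price: Number(raw.regularMarketPrice).toFixed(2),
        }
      }
      return next
    }
    default:
      return state
  }
}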
State Management (Scaling Redux)
Thanks to Redux's centralized middleware flow, we essentially compartmentalize complicated data flow into a set of middlewares; a simplified API call lifecycle flows through the chain described below.
Some of the key components:
- API Middleware sandwiched between two Cache Middlewares (Get/Set): An API action flows through the CacheGet middleware first, then ApiMiddleware, then the CacheSet middleware. This gives the caching layer an opportunity to modify request params before data is actually fetched; e.g. if we ask for 5 things and 3 are already in cache, it only fetches the other 2 (see the sketch after this list). The caching middleware also provides the opportunity to use different cache backends (localStorage vs memcached vs in-memory).
- Beacon Middleware at the end: This allows us to beacon not only every action that has an error, but also to capture every performance timing happening in between, based on the PerformanceTiming payload described above.
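A minimal sketch of the CacheGet idea (the action types, payload shape and Map-backed cache are illustrative):

import { Middleware } from 'redux'

// Before an API action reaches ApiMiddleware, serve what we already have
// and shrink the request down to the cache misses.
const cacheGet = (cache: Map<string, any>): Middleware => store => next => (action: any) => {
  if (action.type !== 'QUOTES_FETCH_REQUEST') return next(action)

  const hits = action.payload.symbols.filter((s: string) => cache.has(s))
  const misses = action.payload.symbols.filter((s: string) => !cache.has(s))

  // Cache hits are dispatched immediately as a success action.
  if (hits.length) {
    store.dispatch({
      type: 'QUOTES_FETCH_SUCCESS',
      payload: { quotes: hits.map(s => cache.get(s)) },
      meta: { fromCache: true },
    })
  }

  // Only the missing symbols continue down the chain to the API middleware.
  if (misses.length) {
    return next({ ...action, payload: { ...action.payload, symbols: misses } })
  }
}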
The middleware flow also acts as a gateway for foreign systems to hook into our centralized Store while potentially maintaining their own internal state. For example, our real-time streamer exposes a middleware that allows components to (un)subscribe while pumping data into the centralized Store.
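As a sketch of that pattern (the streamer API and action types here are hypothetical), such a middleware holds its subscriptions privately and dispatches incoming data into the Store:

import { Middleware } from 'redux'

interface Streamer {
  // Returns an unsubscribe function.
  subscribe(symbol: string, onData: (data: any) => void): () => void
}

const streamerMiddleware = (streamer: Streamer): Middleware => store => {
  // Internal state that never leaves the middleware.
  const subscriptions = new Map<string, () => void>()

  return next => (action: any) => {
    if (action.type === 'STREAMER_SUBSCRIBE') {
      const { symbol } = action.payload
      subscriptions.set(symbol, streamer.subscribe(symbol, data =>
        // Pump streamed data into the centralized Store.
        store.dispatch({ type: 'STREAMER_DATA', payload: data })
      ))
    }
    if (action.type === 'STREAMER_UNSUBSCRIBE') {
      const { symbol } = action.payload
      subscriptions.get(symbol)?.()
      subscriptions.delete(symbol)
    }
    return next(action)
  }
}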
In production, we have roughly 10 different Redux middlewares handling a myriad of things:
- Config Middleware: Pumps config into actions
- Contextualize Middleware: Pumps user-sensitive data into actions
- CacheGet
- API
- CacheSet
- Roughly 4 middlewares from foreign systems
- Beacon Middleware
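The ordering above is the important part. Assembled with Redux's applyMiddleware, the chain might look like this (pass-through stubs stand in for the real middlewares):

import { createStore, applyMiddleware, Middleware } from 'redux'

// Pass-through stubs standing in for the real middlewares;
// only the ordering is meaningful here.
const passthrough = (name: string): Middleware => () => next => action => next(action)

const rootReducer = (state: any = {}, action: any) => state

const store = createStore(
  rootReducer,
  applyMiddleware(
    passthrough('config'),
    passthrough('contextualize'),
    passthrough('cacheGet'),
    passthrough('api'),
    passthrough('cacheSet'),
    // ...roughly four foreign-system middlewares here...
    passthrough('beacon'), // last, so it observes every action and timing
  ),
)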
Router
When we initially started the experiment, our first goal was to introduce an architecture that is isomorphic (universal) to maximize logic reusability. The immediate challenge is the difference between server and client: the server aims to be as stateless as possible and emphasizes handling as many requests as possible, while the client is stateful by nature, storing user state locally to facilitate a faster user experience. Replicating all client state on the server would heavily increase memory usage and subsequently put stress on GC. This approach is also prone to memory leaks due to the nature of socket reuse in Node.js. Therefore, we decided to model the client app after the server's request/response lifecycle instead.
In our implementation, we built a "server shim" layer to normalize the client/server differences. First of all, such a layer establishes the practice of treating each page navigation as a request/response cycle. This provides a good level of abstraction and normalizes routing methodologies. It also populates relevant data from window to replicate a basic request object, such as cookie, user-agent, router…
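A minimal sketch of that shim on the client (the exact fields copied are illustrative):

// Build a request-like object from window/document state so a page
// navigation can be handled exactly like an incoming server request.
function createClientRequest(url: string) {
  return {
    url,
    method: 'GET',
    headers: {
      cookie: document.cookie,
      'user-agent': navigator.userAgent,
    },
  }
}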
Once the routing abstraction is established, we declared a straightforward routing life cycle:
- contextualize: focuses on extracting information from request headers/cookies to provide a preliminary context
export interface Req extends Request {
  context: Context
}

export interface Context {
  // "0/1"
  authed: string
  bucket: string
  devFlags: DevFlags
  env: string
  intl: string
  config: ApiConfigs
}
- preinit: fetches the initial data for the page; enough to determine the statusCode, certain SEO info and enough information to resolve to a certain layout.
- renderHead: sets the response's statusCode and headers, and potentially renders/dehydrates initial data. It will also render out certain <script> tags to allow pre-fetching content.
- init: fetches the rest of the data for the page and populates the corresponding sections, ready to be rendered.
- renderBody: renders the rest of the page with the rest of the dehydrated data, potentially as multiple React trees.
- postInit: client-only, in case we defer rendering.
The clear separation of the preinit & init phases allows more tuning of the Above-The-Fold experience and more optimization opportunities. It also allows us to easily shuffle the phases to handle Failsafe/Crawler scenarios.
For example, full server side rendering means changing the order of those phases to:
- contextualize
- preinit
- renderHead
- init
- postInit
- renderBody
By moving renderBody to the bottom, we force all Below-The-Fold API calls to finish before rendering, making it a full server-side render.
On the other hand, a full client-side render moves renderBody above init and flushes early.
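Modeled as data, the lifecycle becomes an ordered array of async phases, so each render mode is just a different ordering. A sketch, with no-op stubs standing in for the real phase implementations:

type Phase = (req: Req, res: any) => Promise<void>

// No-op stubs; the real phases do the work described above.
const phase = (name: string): Phase => async () => { /* ... */ }
const contextualize = phase('contextualize')
const preinit = phase('preinit')
const renderHead = phase('renderHead')
const init = phase('init')
const renderBody = phase('renderBody')
const postInit = phase('postInit')

// Default flow: flush the head early, render the body once init finishes.
const defaultFlow = [contextualize, preinit, renderHead, init, renderBody, postInit]

// Full server-side render: renderBody last, so all Below-The-Fold calls finish first.
const fullServerRender = [contextualize, preinit, renderHead, init, postInit, renderBody]

// Full client-side render: renderBody above init, flushing early.
const fullClientRender = [contextualize, preinit, renderHead, renderBody, init, postInit]

async function run(flow: Phase[], req: Req, res: any) {
  for (const p of flow) await p(req, res)
}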
Rendering Engine (Scaling React)
While React excels on the client side thanks to its DOM-diffing mechanism, its performance on the stateless server side has been an issue on Node.js. The first thing we decided to do was measure its performance using a mock application state, which our State Management philosophy makes easy. Using this data point, we can safely adjust the ratio between server-rendered components and client-rendered components, thus balancing UX and scalability.
Nebula@0.1.457

rehydration x 2,340 ops/sec ±6.53% (53 runs sampled)
navrail x 2,870 ops/sec ±4.36% (57 runs sampled)
LoadingBar x 2,813 ops/sec ±2.16% (64 runs sampled)
UH x 218 ops/sec ±5.68% (50 runs sampled)
hero x 48.76 ops/sec ±6.60% (45 runs sampled)
main x 1,174 ops/sec ±6.39% (56 runs sampled)
lightbox x 2,762 ops/sec ±3.19% (59 runs sampled)
feedback x 710 ops/sec ±2.70% (61 runs sampled)

CPU time for rehydration: 0.42734133643922173ms
CPU time for navrail: 0.3483773579622734ms
CPU time for LoadingBar: 0.35554301294768764ms
CPU time for UH: 4.596207153906486ms
CPU time for hero: 20.508138746296293ms
CPU time for main: 0.8519489946518364ms
CPU time for lightbox: 0.36210397088059915ms
CPU time for feedback: 1.407506364826446ms

total CPU time per req: 28.857166937910844ms
max theoretical rps/core: 34.65343642886367
Based on our max rps/core (1,000ms / 28.86ms of CPU per request ≈ 34.65 requests per second per core), we can safely do capacity planning and adjust our caching TTLs for certain components.
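Output in that shape is what a benchmark.js suite prints. A sketch of measuring one component this way (the component and mock state below are placeholders, not our production setup):

import Benchmark from 'benchmark'
import { createElement } from 'react'
import { renderToString } from 'react-dom/server'

// Placeholder component and mock application state for the benchmark.
const Hero = (props: { title: string }) => createElement('div', null, props.title)
const mockState = { hero: { title: 'AAPL 150.00' } }

new Benchmark.Suite()
  .add('hero', () => {
    renderToString(createElement(Hero, mockState.hero))
  })
  .on('cycle', (event: any) => {
    // Prints e.g. "hero x 48.76 ops/sec ±6.60% (45 runs sampled)"
    console.log(String(event.target))
  })
  .run()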
However, we’ve also made the decision to drastically reduce the scope of React integration within the app itself. This means instead of heavily relying on React component life cycle to trigger certain behaviors, we have been treating React as a pure stateless rendering engine.
This philosophy allows us to safely and easily mix in multiple rendering engines where it’s best needed, such as D3/Canvas for chart rendering and Ads without having them depend on each other, thus reducing complexity. There have also been several more lightweight Virtual DOM implementations that we would like to explore, in case React’s direction does not match with our own. This mental shift allows us to separate out presentation layer and make it reusable across properties.
Additionally, in production, every sizable component flows through a centralized caching layer that caches the HTML based on a combination of props & unique dimensions, preventing us from having to re-render things.
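A minimal sketch of component-level HTML caching along those lines (the cache key scheme and TTL are illustrative, not the production layer):

import { createElement, ComponentType } from 'react'
import { renderToStaticMarkup } from 'react-dom/server'

const htmlCache = new Map<string, { html: string; expires: number }>()

// Render a component once per unique (name, props) combination and reuse
// the markup until the TTL expires.
function renderCached(
  name: string,
  Component: ComponentType<any>,
  props: Record<string, any>,
  ttlMs = 30_000
): string {
  const key = `${name}:${JSON.stringify(props)}`
  const hit = htmlCache.get(key)
  if (hit && hit.expires > Date.now()) return hit.html

  const html = renderToStaticMarkup(createElement(Component, props))
  htmlCache.set(key, { html, expires: Date.now() + ttlMs })
  return html
}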
Memory & Garbage Collection
Since the server logic is constructed to be stateless, we create a new application context for every incoming request, and as soon as the request finishes the server should be able to clean up all the resources associated with and consumed by it. If the server fails to correctly garbage collect unused memory, the heap becomes a significant problem. By using immutable data structures as well as custom control flow, the YFinance architecture minimizes the exposure of the original request object, reducing the risk of memory leaks. As a result, the typical GC scavenge cycle takes less time to run, and the mark-sweep cycle runs less frequently thanks to the improved scavenge results. This is crucial to server scalability: the more frequent and the longer each GC cycle, the lower the server's throughput for incoming requests, because the process is blocked during a GC cycle due to the single-threaded nature of Node.
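One concrete habit from that principle, sketched below using the Context interface from earlier: copy only the fields you need off the request into a small plain object and pass that around, rather than closing over the request itself.

// Downstream code receives a plain context object and never retains a
// reference to req, so the request (and its sockets/buffers) stays
// collectible as soon as the response ends.
function buildContext(req: Req): Context {
  const { authed, bucket, devFlags, env, intl, config } = req.context
  return { authed, bucket, devFlags, env, intl, config }
}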
Some Graphs (on 1 box)
There are some graphs that we cannot publish, but our Above-The-Fold Time is reduced by ~2x compared to our previous legacy stack, and our RPS is ~3x that of the legacy stack.