Data Architecture · Omnichannel · D2C · Technology · India · Analytics · Operations

Building a Single Data Layer Across All Channels

A D2C brand operating across its own website, Amazon, Flipkart, Blinkit, and modern trade retail is generating performance data in five separate systems with five separate data models and five separate definitions of the metrics that matter. Without a single data layer that unifies these streams, the business is navigating with five partial maps rather than one complete one.

By Manthan Sharma · 04-05-2026 · 9 min read

The weekly performance review took ninety minutes not because the analysis was complex, but because the first forty were spent reconciling numbers across five different reports. The Shopify report showed one revenue figure. The Amazon Seller Central report showed another. The Blinkit partner dashboard showed a third. The retail distributor's Excel-based sales report showed a fourth. And the finance team's accounting system showed a fifth, different from all of them, for reasons the finance manager could explain but that required a five-minute explanation every week.

The management team was spending forty minutes per week, more than thirty hours per year of senior leadership time, on a problem that was entirely architectural. The data existed, and it was accurate within each source system. The problem was that it lived in five separate data models with five separate definitions of revenue, five separate treatments of returns and cancellations, and no unified layer presenting a single, consistent view of the business's performance across all channels.

Building that single data layer is not a luxury for a multi-channel D2C brand. It is the operational prerequisite for managing the business with the clarity that multi-channel scale requires.

01

Why Each Channel Has a Different Data Model

Every sales channel records performance data according to the operational logic of that channel, and that logic differs across channels in ways that make naive aggregation of their data produce incorrect results. Shopify records revenue at order creation, in the brand's base currency, before platform fees and net of discount codes applied at checkout. Amazon Seller Central records revenue as settlement amounts that include the product price and any marketplace promotions but exclude the Amazon fulfilment fee (recorded separately as a cost), with settlement occurring fourteen days after the sale. Blinkit's partner dashboard records sales at the dark-store level by day, with returns recorded separately, on a delayed basis, as they are processed through the platform's return system. Retail distributor reports record secondary sales (distributor to retailer) rather than primary sales (brand to distributor), creating a timing difference between the brand's revenue recognition and the sell-through data that reflects actual consumer demand.

Each of these data models is internally consistent and appropriate for the operational purpose of the system that generates it. The problem arises when someone tries to answer the question 'how did the business perform this week?' by adding the numbers from each system together, because the numbers represent different things, measured at different points in the transaction flow, with different treatment of returns, promotions, and fees. The result is a number that is arithmetically precise and operationally misleading.
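The core fix is to map every channel's raw rows into one explicit schema before any aggregation. A minimal sketch, with illustrative field names (`total_price`, `principal`, `fulfilment_fee`, and similar are assumptions, not the actual Shopify or Amazon report columns):

```python
from dataclasses import dataclass
from datetime import date

# Unified record: every channel's raw row is mapped into this one shape,
# so 'net_revenue' means the same thing regardless of source.
@dataclass
class UnifiedSale:
    channel: str
    order_date: date
    gross_value: float      # value actually charged to the consumer
    platform_fees: float    # marketplace/fulfilment fees deducted at source
    returns_value: float    # returns attributed to this record
    net_revenue: float      # gross - fees - returns, the single definition

def from_shopify(row: dict) -> UnifiedSale:
    # Shopify reports order value net of discounts; no platform fee at source.
    gross = row["total_price"]
    returns = row.get("refunded_amount", 0.0)
    return UnifiedSale("shopify", row["created_at"], gross, 0.0,
                       returns, gross - returns)

def from_amazon_settlement(row: dict) -> UnifiedSale:
    # Amazon settlements already net out fees; restate gross, fees, and net
    # on the same basis as every other channel.
    gross = row["principal"] + row.get("promotion_amount", 0.0)
    fees = row["fulfilment_fee"] + row.get("commission", 0.0)
    return UnifiedSale("amazon", row["posted_date"], gross, fees,
                       0.0, gross - fees)
```

The point of the dataclass is that disagreements about definitions surface once, in code review of the mapping functions, rather than weekly in the performance review.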

02

The Architecture of a Single Data Layer

A single data layer does not replace the source systems; it creates a unified view on top of them by applying consistent transformation logic to each source system's data before aggregation. The architecture has three components.

The first is a set of source connectors: the technical integrations that pull raw data from each source system into a central repository on a defined schedule (daily at minimum, hourly for operational decisions). For most D2C brands, this is achievable through a combination of platform APIs (Shopify, Amazon, and Flipkart all provide API access to order and settlement data), partner portal data exports (Blinkit, Zepto, retail distributors), and accounting system integrations.

The second component is a transformation layer: the code or configuration that converts each source system's data model into the unified data model defined by the brand's metric dictionary. This layer is where the consistency work happens: defining that 'revenue' in the unified model means net settlement value received (or to be received within the settlement period) after platform fees and returns, and writing the transformation logic that derives this consistently from each source's different raw data structure.

The third component is the presentation layer: the dashboards, reports, or data exports that surface the unified data model in the format decision-makers need. The presentation layer is the visible output of the single data layer, but it is only as good as the transformation layer beneath it.
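The three components compose in a straightforward way. A minimal sketch of the wiring, under the assumption that connectors and transforms are plain callables (the names and row shapes here are illustrative, not any vendor's API):

```python
from typing import Callable, Dict, Iterable, List

def run_pipeline(
    connectors: Dict[str, Callable[[], Iterable[dict]]],
    transforms: Dict[str, Callable[[dict], dict]],
) -> List[dict]:
    # Component 1: each connector pulls raw rows from one source system.
    # Component 2: each transform maps that source's rows into the unified model.
    unified: List[dict] = []
    for channel, fetch in connectors.items():
        to_unified = transforms[channel]
        for raw in fetch():
            unified.append(to_unified(raw))
    return unified

def weekly_net_revenue(unified: List[dict]) -> float:
    # Component 3 (presentation): one metric, one definition, every channel.
    return sum(row["net_revenue"] for row in unified)
```

Keeping the per-channel logic in the `transforms` mapping means adding a sixth channel later is a new entry in two dictionaries, not a rewrite of the aggregation code.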

03

Building It: Sequence, Tools, and Trade-offs

For a D2C brand at ₹50 lakh to ₹2 crore monthly revenue, the single data layer can be built with a cloud data warehouse (Google BigQuery, Amazon Redshift, or, for smaller data volumes, a PostgreSQL backend fronted by Metabase), a data pipeline tool (Fivetran, Airbyte, or custom API integrations, depending on budget and technical capacity), and a transformation layer (dbt for teams with SQL capability, or a simpler spreadsheet-based transformation for teams without dedicated data engineering).

The build sequence that minimises time-to-value starts with the two or three channels that account for the largest share of revenue and the most frequent reconciliation effort: typically the direct website and the primary marketplace. Getting these two sources into a unified model with consistent revenue and returns definitions eliminates the majority of the reconciliation problem immediately. Additional channels are added sequentially once the foundation is stable, with each addition requiring transformation logic for that channel's specific data model.

The trade-off to manage throughout the build is the tension between comprehensiveness and usability. A data layer that attempts to unify every possible metric from every source system simultaneously is an engineering project that takes months to complete and often produces a system too complex to maintain. A data layer that unifies the ten metrics that most matter for the business's key decisions, built in weeks rather than months, delivers the majority of the value at a fraction of the complexity.
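One check worth building on day one is an automated reconciliation between the unified layer and the accounting system, so divergence is caught in the pipeline rather than argued over in the weekly review. A minimal sketch, with an assumed (negotiable) tolerance threshold:

```python
def reconcile(unified_total: float, finance_total: float,
              tolerance_pct: float = 1.0) -> bool:
    # True when the unified layer agrees with the accounting system within
    # the agreed tolerance; False flags the week's numbers for investigation
    # before they reach the performance review.
    if finance_total == 0:
        return unified_total == 0
    gap_pct = abs(unified_total - finance_total) / abs(finance_total) * 100.0
    return gap_pct <= tolerance_pct
```

The tolerance exists because some gap is structural (settlement timing, returns in transit); the check's job is to separate that known, explainable gap from a new one that signals a broken transform or a missed feed.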