
Developing A Data Pricing Framework For Data Excha

Majumdar et al. propose a comprehensive data pricing framework to address the challenges of pricing in data markets, which often suffer from information asymmetry between buyers and sellers. The framework categorizes data pricing models into five themes and identifies eight key factors influencing pricing decisions. This study aims to enhance data exchange efficiency by providing structured pricing guidelines that consider various attributes and stakeholder dynamics.


Majumdar et al. Future Business Journal (2025) 11:4
https://doi.org/10.1186/s43093-025-00422-z

REVIEW Open Access

Developing a data pricing framework for data exchange
Rupsa Majumdar1, Anjula Gurtoo2* and Minnu Maileckal2

Abstract
Despite the emergence of data markets such as Windows Azure Marketplace and India Urban Data Exchange (IUDX), comprehensive frameworks to determine data pricing and/or determine parameters for profit maximization remain a gap. Data valuation often gets guided by the sellers, ignoring the interests of the buyers. The information asymmetry results in lopsided pricing. The data sellers fail to price optimally, and the buyers are unable to optimize their purchasing decisions, thus reinforcing the need for a structured data pricing framework. The paper reviews literature and applies the stages as reported by Ritchie and Spencer (in: Bryman, Burgess (eds) Analysing qualitative data, Routledge, London, 1994) for applied policy research to determine the main approaches of data pricing and develop a comprehensive pricing framework. Literature selection on pricing attributes and content analysis classifies data pricing models into five broad but distinct themes, based on the data pricing method, namely data characteristics-based pricing, quality-based pricing, query-based pricing, privacy-based pricing, and organizational value-based pricing. Application of the Ritchie and Spencer stages identifies eight factors, namely customer need, customer assigned value, market maturity, market structure, usable data, data quality, seller reputation and seller objectives, as defining and intersecting with the five pricing models. A framework is hence developed to guide data pricing. Thereby, the paper creates a platform for prescribing data pricing formulas.
Keywords Data pricing framework, Data attributes, Data exchange/trading, Data pricing

*Correspondence: Anjula Gurtoo, [email protected]
1 Department of Economic Sciences, Indian Institute of Technology Kanpur (IIT Kanpur), Kanpur 208016, India
2 Center for Society and Policy, Indian Institute of Science (IISc), Bangalore 560012, India

© The Author(s) 2025. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Introduction
Data as a resource have gradually been recognized as a new production factor. Data are utilized in business economics for market research, customer analysis, product creation, investment decisions, risk management, and the social fields like security monitoring, public delivery services like health care and education, and targeted population or area planning [26]. Data-based interventions not only improve efficiency and increase productivity, data-based problem analysis and solutions help develop better policies [56]. However, data utilization faces a significant concern of fair data pricing mechanism to better promote data circulation and sharing [2, 24, 26].

Data pricing, hence, forms one of the central issues in data exchange. Due to the intangible, non-rivalrous nature of data as well as the capacity to provide positive externalities, traditional data pricing models have limitations. Data are a virtual object as opposed to traditional commodities. Value of data gets determined by the nature of knowledge and insights gained about specific areas or topics [3, 5]. Hence, traditional pricing methods may be inadequate to realize the true value and utility of big data [46, 73].

Furthermore, compared to the sale of other information goods like telephone minutes and bandwidth, the sale of data is more complex. For a buyer, the quality of data becomes as important as the quantity of data. When given the same information, a buyer interested in contacting people who need to rent apartments, for example,
will be drawn to different records than a buyer wanting to contact people for an opinion poll. Thus, usability of a particular dataset becomes a significant factor for buyers and enables diverse users to endogenously select the records of their interest [51].

In the light of the above discussion, the paper aims to develop a conceptual framework of data pricing attributes for use in data exchange. Specifically, the research aims to: (a) identify pricing models and their attributes; (b) analyse the models for their positioning strategy and design; and (c) develop a comprehensive model of data pricing. The Ritchie and Spencer [58] stages for applied policy research provide the theoretical foundation for the analysis. We classify data based on attributes. An attribute is a quality, character or characteristic ascribed specifically to someone or something [16, 25, 30]. Attributes are unique traits providing qualities specific to a theme [56]. Other classifications of data pricing include pricing through auctions, such as weighted pairing auctions and repeated auctions ([3, 15]; Zhao et al. 2020), pricing on the basis of ownership like public data, private data, restricted data, etc. [7, 56, 63], and data pricing based on different data strategies [13, 18]. These classifications have utility within a limited domain, and hence, pricing attribute classification is conducted and used for the paper.

Literature review and content analysis form the methodological approach. A comprehensive pricing framework forms a knowledge gap area in the data marketplace research [1, 2, 21, 27, 38] and is addressed in the paper using the Ritchie and Spencer [58] approach. The study contributes to the literature in two ways. First, comprehensive assessment and drawing of attributes systematizes the body of scientific knowledge built around data economics and data exchange, and the value propositions for data pricing. The insights generate scholarly debate and provide useful knowledge for developing research ideas on the exploration of data value and data pricing and the impact of stakeholder dynamics on data pricing. Second, the study discusses the association between data value and data pricing, mediated by specific attribute variables like quality, customer demand, organizational goal, and market support. A framework is hence developed to guide data pricing. The knowledge from the results can be drawn upon to develop pricing models. Thereby, the paper creates a platform for prescribing data pricing formulas.

Literature review
The section reviews three themes, namely the complex dynamics between the main stakeholders (buyer, seller, platform/agent, and the regulators); the main theories guiding the pricing dynamics; and value of data attributes for data pricing. The section ends with the justification for the paper.

Stakeholder dynamics
Four prominent stakeholders participate in the data markets, namely data buyers, data sellers, data brokers (in case of exchange platforms), and the regulators. The data market falls under the two-sided market approach, where an economic platform has two distinct user groups providing each other with mutual benefits [8]. The two user groups are the data buyers constituting the demand side and the data sellers symbolizing the supply side. Fundamental economic principles argue that buyers make their purchasing decisions guided by the objective of utility maximization, subject to a budget constraint [29, 80]. The buyers choose datasets which either have a large volume (to increase the robustness of their objective) or choose datasets of superior quality and high periodicity like real-time datasets. Alongside, the buyers might have a reservation price for their preferred dataset [13, 51]. The buyer buys the dataset when the net utility gain from buying is greater than the utility from not buying [18, 60]. This forms the underlying mechanism guiding the buyer's purchasing decision.

The seller, on the other hand, has an edge over the quantity, quality, and precision of the data [78]. Customer surveys enable sellers to understand the reservation price of consumers. Sellers, therefore, have the opportunity to classify the dataset based on quality levels or the demand [3, 80]. Accordingly, they pursue profit maximization by adopting a differential pricing strategy.

The concept of a data exchange platform along with a data broker comes into play since the data market suffers from the problem of information asymmetry [1, 78]. The introduction of a broker might ease the constraints as the broker/platform can facilitate the match between buyers and sellers with similar quality values. The broker or platform charges a fee for enrolling in the platform and for every successful match [18].

Although the presence of a data broker is of vital importance to address the issue of information asymmetry, their involvement might give birth to inefficiencies. Kuempel [39] states how data brokers aggregate information from data buyers and convert it into marketable form. This conglomeration of information is detrimental,
since these transactions happen beyond the eyes of the general public, it bears the risk of violation of privacy of the data buyers. Data gathered on a large scale about particular vendors may end up giving an accurate picture of the individual, violating privacy, even though the unaggregated data, when held in silos, may appear to be harmless [77]. This in turn leads to a conflict of interest among the sellers and the data broker community. Rent seeking is an economic concept which involves asking privileges from the government for enhancing profit, without adding any value to the economy. Muralidhar and Palk [53] opine that the aggregation of information from the government provided Internet along with the profit collection from freely available public data by the brokers is a symbolization of the negative rent-seeking behaviour. This behaviour in turn is likely to exacerbate the inequality in accessing credible data. Therefore, this rent-seeking behaviour interferes with achieving socially optimum allocations, where all the data market players are better off. The earlier paragraph talks about the data brokerage charges, which are an essential component of the profit-maximizing behaviour of the broker cohort. Thus, based on the principles of economic theory, an escalation in the brokerage fees might hinder the other data market players from engaging in any transaction, which might prevent creation of some efficient matches.

The regulator plays a pivotal role in the exchange process as well as in every stage of development of the marketplace. With the infancy of the data markets and high demand for data, regulators currently focus on the data sellers [71]. Regulators provide enough incentives to compel sellers to participate in the data exchange process [3, 5]. Once the exchange ecosystem establishes itself as an authentic transaction medium, and the match between data buyers and sellers becomes accurate over time, regulators can focus on ecosystem optimization [55]. The evolution stage of the market (well established vs. nascent), thus, dictates the status of the ecosystem and the ecosystem in turn determines the focus of the regulator. The challenge of regulating data economies is compounded by the fact that it is a hard task to plan for risks that are still poorly understood, like the unbridled development of digital platforms into monopolistic behemoths, the responsibility and liability of digital entities operating internationally, and the growing apprehensions surrounding unsupervised algorithmic decision-making [33]. The length of time it takes for government legislation to pass, the financial burden of implementing and enforcing a data protection regime, and the lack of cooperation and knowledge between governmental entities that regulate in parallel are common obstacles to developing effective data protection regimes. One or more of the aforementioned issues are also made worse by a lack of knowledge and fear in society [62]. A common challenge is finding a balance between the need for strong regulatory foundations to ensure data security and privacy, and the need for flexible and practical processes that promote development of a conducive environment for data exchange across multiple sectors and industries. Data and information regulations must consider the dynamic dialectics of inclusion and exclusion, the complex environment they exist in, and the changes that occur within it [11].

Pricing theories
Data pricing theories get classified under two main themes, namely economic models and game-theory-based models [21, 38, 46]. Economic-based pricing models can be classified as cost model, consumer perceived value, supply-demand model, and dynamic data pricing. Based on the nature of the market and number of players, game-theory-based pricing models subdivide into non-cooperative game, bargaining game and Stackelberg game.

Economic models
Cost-based pricing models take into account the total cost of any commodity where total cost consists of fixed cost like resource and equipment and variable costs like labour, capital, and material costs. The price then equals the cost along with a percentage markup as profit. Cost models consider only the intrinsic factors (say cost of providing the data, cleaning the data, maintaining the data) to put a price, ignoring the market conditions, including the buyer [1, 3, 78].

For a commodity like data where reproduction cost is almost zero, consumer value-based theoretical models appear to be more apt [71]. Five main factors affect data pricing, namely feedback from consumers, the market environment influencing the actions of the consumers, consumer interest and motivation in purchasing the data, the credibility of the seller and the economic value guided by the demand and perception of the consumer [29, 60, 80].

To ensure fairness in the market, therefore, the supply-demand theoretical models come into play. The models ensure consistency in action between buyers and sellers, where the participants cannot change the terms and conditions as per their whims, since the entire exchange gets guided by market forces [1, 3, 78, 80]. Different consumers have preference towards different types of data and the differential pricing model allows the seller to charge different prices based on consumer preferences [60, 71]. For example, a high-quality dataset gets priced more as compared to the low-quality counterpart.
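As a rough illustration of the cost-based model described under "Economic models", the sketch below computes a price as total cost (fixed plus variable) with a percentage markup added as profit. The function name, the parameter names and the sample figures are assumptions introduced here for illustration; the source states only that price equals cost along with a percentage markup.

```python
def cost_plus_price(fixed_cost: float, variable_cost: float, markup: float) -> float:
    """Hypothetical helper: price = (fixed cost + variable cost) * (1 + markup).

    `markup` is a fraction, e.g. 0.25 for a 25% markup.
    """
    if fixed_cost < 0 or variable_cost < 0 or markup < 0:
        raise ValueError("costs and markup must be non-negative")
    total_cost = fixed_cost + variable_cost
    return total_cost * (1 + markup)


# Illustrative figures only (not taken from the paper).
price = cost_plus_price(fixed_cost=1000.0, variable_cost=500.0, markup=0.25)
print(price)  # 1875.0
```

Nothing about the buyer or the market enters this calculation, which is the limitation the text attributes to cost models.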
The dynamic data pricing model prescribes price depending on time and usage [78]. For instance, a particular dataset with high demand at a particular point of time will have a higher price during that time. Similarly, usage-based pricing is proportional to the volume of data used [60, 71]. For example, state level data on school enrolment rates gets provided at a base rate, but for district or household level details the prices increase.

For spheres like online marketplaces, online flash sales and loan pricing, where the market condition is dynamic, Cohen et al. [14] discuss the pricing mechanism followed by a firm that sells products characterized by a host of features. Given that the initial value of a feature remains indeterminable, the pricing strategy under such a dynamic structure is based on the purchase history of the products at the previously quoted prices. For rapidly evolving markets, the pricing models are responsive to four major components, namely people (say consumers), product configuration, time and locations, as highlighted by Kopalle et al. [36]. These factors guide the transition from static to dynamic pricing, wherein sellers offer personalized prices based on customer behaviour.

Saharan et al. [61], while talking about the mobility management of normal or Electric Vehicles (EVs), stress the importance of dynamic pricing in the Intelligent Transportation System (ITS). Time, demand, weather conditions and culture have the potential to influence demand and supply and competitors' prices, which in turn determine the optimal price in a dynamic market structure.

Game-theory-based models
Game-theory-based pricing models focus on the nature of the market and the number of players [55, 80]. The models are subdivided into non-cooperative game, bargaining game and Stackelberg game. In the non-cooperative game, the players (data sellers in this context) do not cooperate with each other and set prices in order to maximize profits [18]. Assuming all other players are doing the same, the game obtains a Nash equilibrium from where no seller has a unilateral incentive to deviate [82, 83]. The bargaining model is a negotiation model where the data sellers and the data buyers propose their reservation prices for sale and purchase of datasets, respectively. These prices are derived based on their objectives of profit and utility maximization [82, 83]. The final price is set based on the weighted average of the reservation prices of both parties. In the Stackelberg game, with two data sellers, one becomes the leader and announces their pricing strategy first [55]. Based on the profit motive of the follower and after considering the pricing strategy of the leader, the follower puts forward their own strategy. Here, the leader enjoys a first mover advantage [18].

Justification for the study
Pricing of data plays a pivotal role in building an effective market or platform for data trading. The literature highlights that accurate pricing emerges in alignment with market trends and market opportunities. First, data pricing occurs when the data owners give each dataset a price in order to push those datasets into markets [47]. If a standard model for data pricing existed, one which considers all aspects of value such as the age of the data, the reliability of the sample, and other factors, sellers would be able to price optimally in the market and buyers could make appropriate comparisons across data service providers to get a fair price [1, 56]. However, data valuation is based on maximizing value creation at a point of time. The data markets prevailing nowadays, consequently, are dynamic and changing [27].

Secondly, the current markets tend to be vertical within an industry and limited in reach due to the lack of an established mechanism for determining the value of datasets [1, 21]. The existing data market lacks sufficient transparency between buyers and sellers on the collection, manipulation, and usage of data both before and after the sale. The market, thus, creates a "market for lemons", reducing the value of all products [38]. This results in parties being deceived.

Thirdly, most economic models focus on data characteristics like quality, volume, supply, and demand and on data value like preferences, utility and similar, suggesting a differential pricing approach. The differential model also gets applied in the context of query-based pricing where price is a function of usage level. The game-theory-based models suggest the same. These models base data monetization on competition, which makes both buyers and sellers better off through data trading and allows firms to get feedback about their datasets. Value-based pricing gets restricted to intrinsic properties and to the services emerging from the dataset. Hence, all the modelling frameworks seek to establish a basis for deriving a relationship between data attributes and price. Developing a transparent and rigorous model of pricing, thus, becomes imperative.

Case examples and pricing of attributes
An analysis of 87 data businesses worldwide conducted in January–March 2024, using publicly available data to gain insight into the real-world pricing of data, highlighted the parameters being used for data pricing (Table 1). The parameters used for pricing decisions by data selling companies can be grouped under four dimensions, namely customer expectations, data characteristics, market situation, and seller conditions.
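The bargaining game described under the game-theory-based models sets the final price as a weighted average of the two parties' reservation prices. The sketch below is one possible reading of that rule, not the formulation of any cited paper: the function name, the `weight_on_buyer_price` parameter and the rule that no deal occurs when the buyer's reservation price is below the seller's are all assumptions made for illustration.

```python
from typing import Optional


def bargaining_price(seller_reservation: float,
                     buyer_reservation: float,
                     weight_on_buyer_price: float = 0.5) -> Optional[float]:
    """Hypothetical weighted-average bargaining price.

    seller_reservation: lowest price the seller would accept.
    buyer_reservation: highest price the buyer would pay.
    weight_on_buyer_price: weight in [0, 1] placed on the buyer's
        reservation price; the remainder goes to the seller's.
    Returns None when no mutually acceptable price exists (assumed rule).
    """
    if not 0.0 <= weight_on_buyer_price <= 1.0:
        raise ValueError("weight_on_buyer_price must lie in [0, 1]")
    if buyer_reservation < seller_reservation:
        return None  # assumption: no trade if the ranges do not overlap
    return (weight_on_buyer_price * buyer_reservation
            + (1.0 - weight_on_buyer_price) * seller_reservation)


# Illustrative figures only (not taken from the paper).
print(bargaining_price(40.0, 100.0))        # 70.0
print(bargaining_price(40.0, 100.0, 1.0))   # 100.0
print(bargaining_price(100.0, 40.0))        # None
```

A weight closer to 1 pushes the price toward the buyer's maximum, which can be read as stronger bargaining power on the seller's side; how the weight is chosen is left open here, as it is in the text.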
These real-world pricing decisions are highlighted in literature as well. Buyers' willingness to pay is based on data attributes, forcing the sellers to understand attributes and critical information about the attributes [68]. The seller screens the consumers and offers a menu of options with information on attributes and expected prices. Consumers perceive the product to have certain attributes and put weights on these attributes according to their preferences [10]. The optimal allocation will choose the buyer with maximum willingness to pay. Wilson [41] develops a model for the same, considering a single decision maker. The decision maker relates the attributes of the dataset to an individual’s order of preference. Price itself becomes a product attribute here. The base price is set high, and the prices are lowered based on the demand at various attribute levels. The utility function of the consumers therefore incorporates the perceived value of the attributes, and the utility score derived ranks the product according to the affinity of the consumer. A consumer is willing to pay the base price if all their preferred attributes are met.

Alternately, Ferreira et al. [19] discuss the importance of dynamic systems for diverse service requests. For value attributes like urgency, duration and amount of information, the authors argue that a dynamic pricing model is better applicable, where demand–supply conditions of the market decide the pricing structure. Within the dynamic pricing system, Chen et al. [12] propose a pricing model based on cost signalling by the data buyer. The authors classify datasets as homogeneous (for example, air quality level of different cities) and heterogeneous (for example, information on air quality parameters like humidity and precipitation for the considered cities). These datasets would vary in attributes like quality and quantity, and the seller charges a price based on the buyer’s signal on the most valuable attribute for them [12].

Table 1 Pricing mechanism of businesses selling data

Organization type | Parameters affecting pricing | Dimensions | Company examples
Sells data | Quantity, diversity of data, data features, updating frequency | Data characteristics | Brevo, Mintec Global, Shopgram, Skuuudle
Sells data | Amount of data, number of users, data features | Data characteristics, Market situation | Zenput, Retool, Lusha
Sells data | Level of customer support, data features, number of records | Data characteristics, Meeting customer expectation | Creatio, Retention.com, Seamless.AI
Sells data | Data features, email campaigns, sales tracking | Data characteristics, Seller conditions | Lead squared, AirTable
Sells data | Number of users/makers, level of support, team size | Data characteristics, Seller conditions, Meeting customer expectation | ProductBoard
Sells data | Number of users, features, level of support, bandwidth | Data characteristics, Seller conditions, Meeting customer expectation, Market situation | PrivCo
Sells data | Schemes, level of support, number of templates and emails | Meeting customer expectation | Zyte, Pipefy, TeamSpective
Sells data | Team size | Seller conditions | DataStreamX
Sells data | Number of contacts, team members, level of support | Seller conditions, Meeting customer expectation, Market situation | Podium
Sells and buys data | Features, number of users | Data characteristics, Meeting customer expectation | ActiveCampaign, Drift
Sells and buys data | Quantity and quality of data | Data characteristics | ImportGenius
Sells data and services | Amount of data, level of support | Data characteristics, Meeting customer expectation | Datamam
Sells data and services | Specific products and services that a customer needs | Data characteristics, Meeting customer expectation | BWT
Sells data and services | Support and maintenance | Meeting customer expectation | CrawlNow
Sells data and services | Investment performance, team experience, market conditions | Seller conditions, Market situation | Tailwind
Sells data and services | Amount of data, frequency of scrapes, complexity of scrapes, speed of delivery, customer support | Data characteristics, Meeting customer expectation | APISCRAPY
Data platform | Data usage, storage, processed, and transacted | Data characteristics | Narrative, Force24
Data discovery tool | Features | Data characteristics | Microsoft Power BI

Source: Publicly available secondary data
Methodology
The paper follows a mixed method content analysis approach by employing the methodology outlined by Mangiaracina et al. [50] and qualitative data analysis framework by Ritchie and Spencer [59]. We conducted the research in three phases. The first phase comprises a bibliometric search and literature review based on Mangiaracina et al. [50]. The second phase includes content analysis and identification of key variables using qualitative data analysis methodology by Ritchie and Spencer [59]. Results are then presented as obtained from the content analysis. The third phase develops a framework for data pricing. Appendix 1 provides a snapshot of the methodology employed.

Paper selection
A literature search on Google Scholar, ProQuest research, JSTOR, Wiley digital, Springer, Elsevier, Taylor and Francis, and other Web of Science publications from 2000 to 2024 gave 76 publications. An iterative search strategy with different combinations of broad keywords was used such as “pricing attributes”, “data pricing”, “approach” AND “data pricing”, “data” AND “pricing”, “pricing models” AND “attributes”, “attributes” OR “models” OR “strategy” AND “data pricing” OR “data strategy” among others.

The initial search of 76 publications got filtered to 54 based on the scope of the article and by excluding articles focusing on electronic business, cloud computing, wireless networks, mobile data, web-enabled application services, Internet of Things and e-commerce. Figure 1 illustrates the methodology.

Fig. 1 Selection methodology

Content analysis
Content analysis method identifies the relationship between pricing and the importance of various data attributes. Content analysis is a deductive approach, which utilizes existing theoretical concepts for categorization [65]. We apply the deductive approach to increase the reliability of the coding.

Ritchie and Spencer [59]’s framework is used as the base. The authors highlight the indexing technique to select relevant data and break down into a dataset of a manageable size. ‘Indexing’ refers to the process, whereby a thematic framework is systematically applied to data [59]. By applying a thematic framework or index to the whole dataset, assumptions and analysis of the dataset become systematic, potentially replicable and add robustness to the analysis [35].

Multiple keywords are identified from the literature to identify, extract, and categorize relevant text from the data pricing and data valuation literature. The keywords extract texts with arguments and evidence of different forms of data pricing and categorize the same into each construct. For extraction, in addition to relying on predetermined keywords from literature, we introduce new keywords as well and categorize texts based on the contextual meaning during the process. For example: the keyword “data platform owner” gets identified during the process of extraction, particularly in the context of quality-based pricing. Similarly, the keyword “interactive
Majumdar et al. Future Business Journal (2025) 11:4 Page 7 of 18

pricing” is not predetermined but gets selected due to Results: content analysis
contextual meaning. Tables 2 and 3 show the coding structure followed, along with the literature-based list of keywords used for extracts and the classification of each extract.

Table 2 First level coding structure for attribute categorization
Data valuation Data seller Data consumer
Privacy Accuracy Private valuation
Granularity Completeness Mechanism design
Google AdWords Redundancy Sensitivity to privacy
Bid prices Willingness to pay
Data market Data platform owner Interactive pricing
Consumer perceptions Versioning Conditional pricing
Nature of data Bi-level programming Positive cost
Context of pricing
Information entropy Pricing metrics Data type
Data tuple Dynamic pricing Traffic
Value weight Classification Raw
Data reference index Online pricing Processed
Reverse pricing Earth observation

Ritchie and Spencer [59] define four categories of questions, namely contextual, diagnostic, evaluative and strategic, each with its own goals and key questions (Table 4). Applying the framework to the literature makes the assumptions clear and the analysis systematic.

Stage 1: contextual analysis
The review of literature classified data pricing models into five broad but distinct themes, depending on the data pricing method, namely data characteristics-based pricing, quality-based pricing, query-based pricing, privacy-based pricing and organizational value-based pricing. The themes are derived based on whether the papers give a specific perspective on pricing data, focus on some specific attribute of data, or talk about some specific data type (Table 5).
Studies on data characteristics-based pricing focus on the taxonomy of the data and the data market for data valuation. In other words, studies price data based either on the fundamental properties of the data or on specific data ecosystem variables. For example, Zhang and Beltran [81] and Zhang et al. [83] group pricing strategies according to the fundamental data properties of privacy (Yes/No) and granularity (Coarse/Fine). The taxonomy considers the data structure as well as query type like clarity of background, quality of sample, data credibility and clarity of

Table 3 Examples of extracts and categorization of text

Code | Sub-code (keywords) | Explanation of the code | Source
Pricing based on all possible data characteristics | Data valuation; Data market | “If a standard model for data pricing existed—one that considered many aspects of value such as the age of the data, the reliability of the sample, and other factors—sellers would be able to price optimally in the market and buyers could make appropriate comparisons across data service providers to get a fair price” | Heckman et al. [31]
Pricing based on quality parameters | Data seller; Data platform owner | “The data platform can set the subscription fee based on the quality level of the provided data and service to determine the profit maximization. The data consumer decides whether or not to purchase according to their willingness to pay and consumption” | Yang et al. [75]
Pricing based on number of views and queries | Interactive pricing | “We consider a data producer who sells answers to aggregate queries. Consumers request the price of individual queries, or sets of queries, from the producer prior to purchase. We allow interaction between the producer and consumer: after paying for a set of queries, and receiving their answers, the consumer may choose to purchase additional queries.” | Li and Miklau [42]
Pricing of private data | Data consumer | “This mechanism takes into consideration the differences in the utility of different types of data users for data of different sensitivity levels and provides incentive for a data user to reveal his true purpose of data usage and acquire the data suiting the purpose” | Lia and Raghunathan [45]
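
The keyword-based coding of extracts summarized in Tables 2 and 3 can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure: the `CODEBOOK` name, the `code_extract` function, the plain substring matching and the assignment of each keyword to a sub-code are all assumptions made here, using a handful of keywords that appear in Table 2.

```python
from collections import defaultdict

# Keywords taken from Table 2; which sub-code each keyword belongs to is an
# assumption made for this illustration.
CODEBOOK = {
    "Data seller": ["accuracy", "completeness", "redundancy"],
    "Data consumer": ["private valuation", "mechanism design", "willingness to pay"],
    "Data platform owner": ["versioning", "bi-level programming"],
    "Interactive pricing": ["conditional pricing", "positive cost"],
}


def code_extract(extract: str) -> dict:
    """Return {sub_code: [matched keywords]} for one literature extract."""
    text = extract.lower()
    hits = defaultdict(list)
    for sub_code, keywords in CODEBOOK.items():
        for keyword in keywords:
            if keyword in text:  # plain substring match, no stemming
                hits[sub_code].append(keyword)
    return dict(hits)


if __name__ == "__main__":
    sample = ("The data consumer decides whether or not to purchase "
              "according to their willingness to pay and consumption")
    print(code_extract(sample))
```

A real content analysis would rely on human judgement for contextual meaning; the sketch only shows how a keyword list can give a first-pass classification of each extract.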

Table 4 Framework stages and key questions

Category | Goal | Key questions
Contextual | Identify the existing models | What are the key dimensions of each model? What needs and assumptions does each model fulfil?
Diagnostic | Examine the reasons and assumptions for each model | What factors underlie the assumptions? What factors underlie the design? What is the application potential of each model?
Evaluative | Appraise the effectiveness of reasons and assumptions | What common factors affect successful design of a pricing framework? What common factors affect successful application of a pricing framework?
Strategic | Developing a framework: new conceptual structure for action | How can the objectives be achieved? How can the system be practically implementable?
Adapted from “Qualitative data analysis for applied policy research” by Jane Ritchie and Liz Spencer in A. Bryman and R.G. Burgess (eds.) “Analysing qualitative data”, 1994, pp. 173–194

synthesis ([2, 31, 76]). Authors like Fricker and Maksimov [20] and Liang and Yuan [47] propose a potential dataset valuation model based on the hedonic price technique, with variables like data maturity, contexts targeted, data specification, data range, and data speed. Kushal et al. [40], Muschalle et al. [54] and Radic et al. [57] argue for a pragmatic transition from cost-based pricing to value-based pricing, which enables inclusion of vendor characteristics like trustworthiness and maturity of offerings. Studies on ecosystem context [4, 48] propose previously unrecognized aspects such as costs associated with various data categories, differences within data markets, value of data co-creation, perceived risk and user utility. Yu et al. [78] utilize a matchmaking method and employ reinforcement learning (RL) to optimize pricing strategies by modelling interaction attributes of ecosystem stakeholders.
Studies on quality-based pricing introduce quality as a utility in order to derive optimal prices. For example, Yang et al. [75] calculate a quality score by formulating a linear equation with accuracy, completeness, and redundancy as dependent variables. The square root of the quality score yields the quality level, which, when put into the profit-maximising equation along with the customer’s willingness to pay function, yields the optimal quality and price for data products and services. Yu and Zhang [79], Bataineha et al. [8] and Golrezaei and Nazerzadeh [23] propose a multi-version strategy where M data product versions get differentiated according to K quality dimensions. Tang et al. [70], Stahl and Vossen [69], Yu and Zhang [79] and Xing and Wang [72] focus on the value of data itself and find different pricing strategies based on degree of information asymmetry, sample set cost, and other available information.
Studies on query-based pricing provide three pricing functions. Bergemann and Bonatti [9] and Zheng et al. [84] describe the surplus function as the building block of the query-based model, which assigns a value to each realized match between a consumer and seller. Xu et al. [73], Deep and Koutris [17] and Li and Miklau [42] propose a similar scheme which allows for real-time pricing as a network flow problem, wherein the price of a dataset is determined by the number of views, by volume, or by revenue maximization. For cases in which the consumer has not purchased any query, the consumer is expected to pay a fixed cost, since every query provides some information about the dataset and is hence assumed to have a positive cost associated with it, as pointed out by Balazinska et al. [6]. Koutris et al. [37] and Cong et al. [15], on the other hand, suggest a bundled model where the seller specifies some explicit price points, where a price point gets defined as a pair consisting of a view (query bundle) and the respective price. Thus, the price of a query gets derived from the explicit price of views.
Privacy-based pricing studies take two approaches, namely cost of privacy or privacy trade-off. These models formulate a mechanism design problem driven mainly by an allocation function and a payment function, where the allocation function takes into account the kind of data provided to the individual [32, 51, 66]. The optimal solution is derived from the revenue maximization of sellers and the utility maximization of buyers [22, 73], where the sellers report their privacy cost. Lia and Ragunathan [45] and Li et al. [43] present the privacy trade-off scenario where the authors consider linear as well as convex cost functions. Here, the cost includes collecting and processing charges as well as the compensation to be paid

Table 5 Indexing the reviewed literature under identified themes

Models | Authors | Variables used
Data characteristics-based pricing | Zhang and Beltran [81] | Privacy, granularity
Data characteristics-based pricing | Liang et al. [46] | Cost of data collection, analysis and management, consumer perceptions, supply and demand, product characteristics and data market structures
Data characteristics-based pricing | Acemoglu et al. [2] | Clarity of background, data credibility and clarity of synthesis
Data characteristics-based pricing | Ye et al. [76] | Data sparseness, sample uniqueness, feature dependency
Data characteristics-based pricing | Fricker and Maksimov [20] | Aim fulfilment, sample quality, credibility of research, clarity of synthesis, nature of data, market structure, pricing scheme
Data characteristics-based pricing | Heckman et al. [31] | Age, periodicity, volume, accuracy
Data characteristics-based pricing | Hao et al. [27] | Data score, update time, usage times, data volume, scarcity score, consistency score and application score
Data characteristics-based pricing | Muschalle et al. [54] | Common queries and demands of participants of the data market, nature of pricing models benefitting consumers of data associated products, research challenges of data markets
Data characteristics-based pricing | Schomm et al. [64] | Type of product offered by vendor, time frame, domain, data origin, pricing model, data access, data output, language, target, trustworthiness, size of vendor, maturity of offerings
Data characteristics-based pricing | Kushal et al. [40] | Number of transactions to access the dataset
Data characteristics-based pricing | Azcoitia et al. [4] | Range of prices of data traded, categories and types of data products, data volume and granularity
Data characteristics-based pricing | Radic et al. [57] | Utility, data quality and data cost
Data characteristics-based pricing | Liang and Yuan [47] | Data specification and quantity, demand (users and query), data parameters, data range, response speed, sales volume, free trial option
Data characteristics-based pricing | Liao and d Li [48] | Value of data co-creation, risk, privacy/security, utility for the user, revenue obtained
Data characteristics-based pricing | Zhang et al. [82] | Kind of market structure, such as sell-side, buy-side, or two-sided, query type and privacy notion
Data characteristics-based pricing | Yu et al. [78] | Reinforcement learning, historical data, pre-training and fine-tuning strategies
Data characteristics-based pricing | Mendizabal Arrieta and Castellano [52] | Data quality, information entropy, value of data
Quality-based pricing | Yang [75] | Accuracy, completeness, redundancy
Quality-based pricing | Yu and Zhang [79] | Different quality dimensions
Quality-based pricing | Stahl and Vossen [69] | Accuracy, amount of data, availability, completeness, latency, response time, timeliness
Quality-based pricing | Yu and Zhang [79] | Number of data-product versions, number of data consumers, number of data-quality dimensions, quality level of data product, linear and integrated quality of data product and willingness to pay
Quality-based pricing | Tang et al. [70] | Data quality
Quality-based pricing | Bataineha et al. [8] | Categorization of market agents according to data quality
Quality-based pricing | Xing and Wang [72] | Degree of information asymmetry, sample set cost, quality of sample set
Query-based pricing | Bergemann and Bonatti [9] | Realized match between a consumer and firm
Query-based pricing | Balazinska et al. [6] | Different versions
Query-based pricing | Li and Miklau [42], Xu et al. [73] | Direct data pricing, set of views, set of users, real-time users
Query-based pricing | Deep and Koutris [17], Koutris et al. [37] | Query set pricing based on views, real-time pricing system

Table 5 (continued)

Models | Authors | Variables used
Privacy-based pricing | Mehta et al. [51] | Private valuation of buyer
Privacy-based pricing | Yang and Xing [74] | Categorization of data providers, consumer willingness to pay, sensitivity of private data
Privacy-based pricing | Shen et al. [66] | Value weight, information entropy and data reference index
Privacy-based pricing | Gkatzelis et al. [22] | Privacy cost, number of buyers getting access to private data
Privacy-based pricing | Li et al. [43] | Accuracy, compensation for privacy loss
Privacy-based pricing | Lia and Ragunathan [45] | Sensitivity level of data, number of individual records
Privacy-based pricing | Jaisingh et al. [32] | Categorization of buyers and third-party agents on their valuation for privacy
Privacy-based pricing | Shen et al. [67] | Loss of privacy of user, total privacy compensation, user’s privacy attitudes, level of noise variance
Organizational cases | Ye et al. [76] | Calendar price set by host, booking probability, market demand signals
Organizational cases | Golrezaei and Nazerzadeh [23] | Accuracy, reliability
Organizational cases | Li et al. [44] | Data size, accurate classification of data
Organizational cases | Zheng et al. [84] | Different versions, revenue
Organizational cases | Agarwal et al. [3], Harris and Longley [28] | Pricing strategies
Organizational cases | Cong et al. [15] | Pricing data in machine learning, data marketplace
Organizational cases | Jia-q and Zhang [34] | Future cash flow, income on asset maturity, discount rate, and value of making a decision
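
The quality-based entry for Yang [75] in Table 5 (accuracy, completeness, redundancy) can be made concrete with a small Python sketch. The review only states that the quality score is linear in these three dimensions, that its square root gives the quality level, and that the level enters a profit-maximising equation together with a willingness to pay function. Everything else below is an assumption made for illustration: the weights, the `(1 - redundancy)` term, the linear willingness to pay, and the brute-force search over candidate fees that stands in for the closed-form optimization.

```python
import math


def quality_score(accuracy, completeness, redundancy, weights=(0.5, 0.3, 0.2)):
    """Linear quality score from three dimensions, each scaled to [0, 1].

    The weights and the (1 - redundancy) term are illustrative assumptions,
    chosen so that less redundancy raises the score.
    """
    w_acc, w_comp, w_red = weights
    return w_acc * accuracy + w_comp * completeness + w_red * (1.0 - redundancy)


def quality_level(score):
    """Quality level as the square root of the quality score."""
    return math.sqrt(score)


def best_fee(level, candidate_fees, n_consumers=100):
    """Pick the candidate fee with the highest revenue by brute-force search.

    Assumed willingness to pay: consumer i values the product at
    theta_i * level, with theta_i spread evenly over (0, 1].
    """
    thetas = [(i + 1) / n_consumers for i in range(n_consumers)]

    def revenue(fee):
        buyers = sum(1 for theta in thetas if theta * level >= fee)
        return fee * buyers

    return max(candidate_fees, key=revenue)


if __name__ == "__main__":
    level = quality_level(quality_score(0.9, 0.8, 0.1))
    print(level, best_fee(level, [0.2, 0.4, 0.6, 0.8]))
```

The sketch treats costs as zero, so revenue and profit coincide; a seller-side cost term would change which fee is selected.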

as a result of privacy loss and, thus, assume a two-part tariff pricing optimization problem.
Organizational cases discuss pricing issues of specific datasets like earth observation data [28] and traffic data [23] and shed some light on the dynamic pricing strategy of Airbnb using regression models for optimal pricing [76]. Li et al. [44] put forward a data pricing metric referred to as information entropy in order to value data based on the amount of information the sellers possess. Data information entropy has been found to be positively related to the size of the data and accuracy of data classification [18, 80]. Jia-q and Zhang [34] use the present value method, taking variables like future cash flow desired, final income at asset maturity and discount rate at base point. Zheng et al. [84] concentrate on the mobile crowd-sensing data market to design a query-based data pricing mechanism.

Stage 2: diagnostic analysis
Data characteristics-based pricing gives insight into pricing techniques taking ideas from the tangible goods market like cost, supply–demand, usage and data attributes. Two limitations, however, can be identified. (a) The pricing strategies applicable for tangible goods fail to function properly in the case of data. For example, cost-based pricing gets guided by the sellers’ interests, hence prices are biased, while supply–demand-based pricing gets difficult to implement due to the evolving nature of the data market. (b) No comprehensive list of attributes can be considered while pricing data, since all data types cannot be brought under the same umbrella and the importance of data attributes varies across data consumers. For example, age of data becomes important for air quality data but not for location of bus stops. Hence, although general pricing shows the way forward, it fails to consider the market dynamics.
Quality-based pricing attempts to portray a bird’s eye view of the importance of quality as a parameter for data pricing. Quality being a subjective concept and determining data quality being a huge research arena, literature has failed to consider the perspective of data quality solely from the buyer’s point of view, leading to omission of quality parameters relevant to the buyer but not the seller. For example, sellers are not concerned about the accuracy of data, but buyers place a lot of value on accurate data. In addition, the importance of quality dimensions varies across datasets. Introduction of a data broker may further disrupt the entire process if the broker interferes with data quality or provides wrong information, giving rise to a problem of adverse selection. Consequently, while more structured than general pricing, quality-based pricing models are fuzzy on the definition of quality and thus on their application in the real world.
Query-based pricing derives prices based on the demand for a dataset, which gets determined by the

number of views for a particular dataset or the number of queries generated. Even though the pricing scheme is applicable in the real-world setting, two limitations exist. First, pricing proportional to the number of views leads to seller profit maximization without considering the utility perspective of the data consumers. Second, the pricing scheme can impede research with high social value. For example, research in the field of health may have many views or queries, increasing the price and making the data inaccessible to researchers. Despite having the potential of application in the real setting, this scheme undermines the value of data as a public good.
Privacy-based pricing is considerably more applicable because of the compensation component in the model. The strategy is unique in taking into account the cost of information loss and thereby attempts to nullify the prevalence of asymmetric information. However, the model does not apply to all datasets (for example, non-personal data like the number of parks or traffic lights in a city) as sensitivity levels are difficult to determine. The pricing scheme often suffers from the same problem of bias towards the data owners, even though some strategies have introduced a data broker to facilitate the transaction. Presence of a data broker, however, can lead to leakage of information.
Organizational cases bring forth some real-world pricing strategies like dynamic pricing and online pricing mechanisms. Dynamic pricing gets applied where demand is not easily quantifiable. The attributes essential for certain types of datasets are derived and priced accordingly. As a drawback, these pricing strategies fail to provide a clear outline of the method and the underlying logic for dynamic pricing for different datasets and their attributes.

Stage 3: evaluative analysis: factors across pricing models
While indexing literature on variables, we see distinct themes emerge, namely data usability, data quality, customer need, data value by customer, market structure, market maturity, seller reputation, and seller objectives. These themes come together under 4 dimensions (Table 6) and align with the case studies as well (Sect. "Case examples and pricing of attributes").
Two factors come together to form the data characteristics dimension, namely data usability and data quality. Literature shows data features, data worth and credibility, and data value coming together to demonstrate the proportional relationship between data attributes and data quality. Additionally, cost-based pricing suggests specific data ecosystem context such as costs associated with various data categories, value of data co-creation, and perceived risk and user utility, which demonstrate data characteristics.
A significant feature of the reviewed studies is that data often get investigated from the data seller’s point of view. Optimal price gets derived for profit maximization and profit optimization, with data consumers taken as a static constant. Customer needs and value assigned by the customer, two critical inputs to data pricing, get ignored.
Pricing, however, can vary based on the importance to the buyer. Research focuses on the perspective of buyers evaluating the worth of data, followed by optimization of the objectives of the data buyer. Identification of the price differences based on categorization of data consumers according to their preference towards the data is a sub-theme within the privacy research. However, research demonstrates the worth of data based on the information entropy as well as credibility. A data broker, thus, has been brought into the scenario in order to simultaneously optimize the objectives of the data buyer as well as the seller.
The pricing function gets derived on a number of dimensions, like customer value, seller objectives and buyer usage. The differences across studies arise on the realized match between the various dimensions. While buyer and seller form the basis for pricing, competition between datasets and within sellers forms a strong input as well. In other words, market maturity and market structure

Table 6 Emerging variable clusters and factors

S. no | Variable clusters | Factors | Dimensions
1 | Data attributes, worth, credibility | Data usability | Data characteristics
2 | Granularity, age, periodicity, volume, compatibility, accuracy, completeness, redundancy | Data quality | Data characteristics
3 | Ease of access, expectation, trust, price, value, utility | Customer need | Customer expectations
4 | Type of need, type of customer/buyer, perceptions, risk | Data valued by customer | Customer expectations
5 | Size, type of seller, type of buyer, type of query | Market structure | Market situation
6 | Supply, demand, competition, products offered, vendor management, service delivered | Market maturity | Market situation
7 | Cost of data collection, target customer, need fulfilment, strategy, risk perception | Seller objectives | Seller conditions
8 | Credibility, timeliness | Seller reputation | Seller conditions
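
The hierarchy in Table 6 (variable clusters grouped into eight factors, which in turn fall under four dimensions) maps naturally onto a nested dictionary. The Python sketch below is one possible encoding for readers who want to work with the framework programmatically; the names `FRAMEWORK` and `locate` are hypothetical, and the entries are copied from Table 6 in lower case.

```python
# Dimension -> factor -> variable cluster, transcribed from Table 6.
FRAMEWORK = {
    "Data characteristics": {
        "Data usability": ["data attributes", "worth", "credibility"],
        "Data quality": ["granularity", "age", "periodicity", "volume",
                         "compatibility", "accuracy", "completeness",
                         "redundancy"],
    },
    "Customer expectations": {
        "Customer need": ["ease of access", "expectation", "trust", "price",
                          "value", "utility"],
        "Data valued by customer": ["type of need", "type of customer/buyer",
                                    "perceptions", "risk"],
    },
    "Market situation": {
        "Market structure": ["size", "type of seller", "type of buyer",
                             "type of query"],
        "Market maturity": ["supply", "demand", "competition",
                            "products offered", "vendor management",
                            "service delivered"],
    },
    "Seller conditions": {
        "Seller objectives": ["cost of data collection", "target customer",
                              "need fulfilment", "strategy", "risk perception"],
        "Seller reputation": ["credibility", "timeliness"],
    },
}


def locate(variable):
    """Return every (dimension, factor) pair whose cluster lists the variable."""
    needle = variable.lower()
    return [
        (dimension, factor)
        for dimension, factors in FRAMEWORK.items()
        for factor, cluster in factors.items()
        if needle in cluster
    ]


if __name__ == "__main__":
    print(locate("credibility"))
```

A lookup such as `locate("credibility")` returns two pairs, because Table 6 lists credibility both under data usability and under seller reputation.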

govern the pricing. Price gets determined by the amount of information contained in data and how versioning can be carried out in practice, for example, the price differential between processed and raw data [25, 51]. Thus, optimal pricing comes into play based on existing market dynamics like number of buyers, number of sellers, regulatory safety of data providers and competition within the sector. Furthermore, for the data platform owner or data seller, data consumer needs and utility perspective become a dynamic input as well, so the pricing function is derived from the current market situation.
The above pricing models are applicable under the assumption of a static market structure, which opens up the scope for discussing their applicability if the dynamic elements of the market come into play. The preliminary dynamic components are the supply and demand side factors. Demand and supply of datasets would be highly influenced by socio-economic forces. For example, demand for health data peaked during the COVID-19 pandemic. Again, with the increasing importance of real-time datasets, real-time information on air pollution levels is being made readily available. Hence, dynamic pricing is the only feasible option under these scenarios, since quantification of individual demand curves is tedious. The guiding force of dynamic pricing is time, and its efficiency is critically dependent on the capability to forecast future demand [49]. Given real-time datasets, employing real-time demand learning results in a more robust dynamic pricing outcome. Thus, even if there is no clue about the demand or supply for a particular dataset in the future, the real-time information along with the current purchase rate can be utilized to dynamically price these datasets.

Conclusion: framework development
Data pricing is one of the most crucial parts of data exchange. Content analysis of recent literature (2000–2024) on pricing attributes highlights five broad but distinct pricing models, namely data characteristics-based pricing, quality-based pricing, query-based pricing, privacy-based pricing, and organizational value-based pricing. Application of the Ritchie and Spencer stages identifies eight factors, namely customer need, customer assigned value, market maturity, market structure, usable data, data quality, seller reputation and seller objectives as defining and intersecting with the five pricing models.
A comprehensive model, thus, can be developed linking the six significant factors influencing pricing strategy. The model can have features as follows:

1. Customer expectations include customer need fulfilment and value of data to the customer. Ease of access, credibility of the data, trust in the data source, and expected prices influence the demand. Different customers will have different utility for the same dataset. A dataset, therefore, may have buyers from different domains and sectors. Different buyers will have different perceptions about the same dataset. Together they form the value of data in the eyes of the customer.
2. Data characteristics include data usability as well as quality. Inherent data characteristics like age, granularity, periodicity, and volume affect usability, indirectly affecting the price. Quality parameters such as clarity, accuracy, completeness, and redundancy also influence data prices as elaborated in earlier sections.
3. Market structure and market maturity explain the market situation. In relatively new markets the size of the market, variety in data products, number of competitors and management of vendors are evolving. The service-delivery system and supply–demand mechanism, consequently, will get established as the market matures and comes to a stable and transparent equilibrium.
4. A seller’s organizational objectives as well as reputation play a significant role in deciding pricing strategy. Credibility of the seller builds customer trust and timeliness of data updating improves seller reputation.

Consolidation (Fig. 2) demonstrates the interlinkages between the four major dimensions of data attributes. A two-way relationship exists between data characteristics and market situation. Well-defined data attributes like data quality or usability parameters will enhance the market situation with more data products and additionally create a distinct class of buyers and sellers in the market. A well-developed market structure will enhance the data characteristics according to the signals provided by the market.
Quality of data provided by the seller will enhance the seller reputation and credibility, at the same time influencing the customer valuation of the data. The sellers are expected to keep upgrading the data assets to maintain their market worth, according to the buyer preferences.
Customer type, need and perceptions influence market mechanisms such as demand. Demand influences the evolution of the market. Matured markets have systems in place ensuring data credibility, which in turn improves customer trust and expectations. Thus, market situation and meeting customer expectations influence each other. With time, sellers acquire experience, target customers and cater to their preferences better, thus gaining the loyalty of the satisfied cohort of buyers, which

Fig. 2 Conceptual framework to guide pricing for data exchange
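
The interlinkages summarized in Fig. 2 can be read as a small directed graph over the four dimensions. The Python sketch below encodes only the links stated explicitly in the consolidation paragraphs (data characteristics and market situation influence each other, data quality affects seller reputation and customer valuation, and market situation and customer expectations influence each other). The `INFLUENCES` dictionary and the `two_way_links` helper are illustrative names, and leaving out the indirect links is a simplifying assumption.

```python
# Directed "influences" links between dimensions, as read from the text
# accompanying Fig. 2. Indirect links are deliberately left out.
INFLUENCES = {
    "Data characteristics": {"Market situation", "Seller conditions",
                             "Customer expectations"},
    "Market situation": {"Data characteristics", "Customer expectations"},
    "Customer expectations": {"Market situation"},
    "Seller conditions": set(),
}


def two_way_links(graph):
    """Return the sorted list of dimension pairs that influence each other."""
    pairs = set()
    for source, targets in graph.items():
        for target in targets:
            if source in graph.get(target, set()):
                pairs.add(tuple(sorted((source, target))))
    return sorted(pairs)


if __name__ == "__main__":
    for left, right in two_way_links(INFLUENCES):
        print(f"{left} <-> {right}")
```

Keeping the links in one structure makes it easy to check which relationships the framework treats as bidirectional and which as one-way.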

boosts demand, hence the bidirectional relationship. Finally, market structure and maturity of the market have an impact on the cost, provision, upgradation, and maintenance of data, thus influencing seller conditions via data characteristics and meeting customer expectations. Thus, the different components interact with each other and influence data pricing.
Given the developments in the data ecosystem, this framework can bolster development of data marketplaces by effectively valuing data as a corporate asset. The framework establishes data as a tradeable commodity and thus can facilitate seamless operations in data trading platforms. Hence, this framework can be adapted and used by any business or agency looking to monetize or trade data. Valuation of data is critically dependent on a number of factors like origin, quality, frequency of usage, etc., which the framework incorporates and thereby eases the construction of a transparent and robust pricing framework that data platforms, data marketplaces and data selling businesses can utilize. Marketing agencies often aggregate data to target eligible customers. This often involves a risk of loss of personal information, for which the data sellers should be appropriately compensated. The proposed pricing model addresses this aspect by including the privacy dimension, thereby avoiding any conflicts of interest. This pricing mechanism strengthens the transaction between marketing agencies (data buyers) and the sellers. The prescribed model can be employed by agencies or businesses that deal with valuation of datasets having some inbuilt component of privacy, such as companies aggregating credit card transactions data, health data, etc.
In short, this section focusses on the six crucial factors influencing data pricing: customer expectations, consisting of customer need fulfilment and value of data to the customer; data characteristics; market structure and maturity; and the seller’s organizational objective and reputation. As the dynamic pricing model hints on

parameters like purchase history, time, demand, culture and location, these factors are implicitly taken care of in the framework. Customer need fulfilment captures the dimension of demand, and trust captures the faith of the buyers: if a buyer develops trust in the seller, the buyer is more likely to purchase from that seller, thereby subtly including the dimension of purchase history. Data characteristics incorporate product configuration and time, which are essential for pricing, and the location parameter is considered in market structure. For example, if the data platform is operated by a developed country, market structure is much more organized as compared to developing counterparts. Hence, the pricing structure would be much more precise as compared to a relatively nascent platform. Perceptions about value of data incorporate the role of culture, because the surrounding society influences one’s valuation of data. In less developed countries where basic literacy is hard to achieve, valuation of data is a far-fetched concept. Demand for data is also abstract in such a scenario; hence, this dynamic element of culture is also included. Categorizing buyers in these evolving markets is difficult, but explicitly it is done based on their valuation of data and need for data. The six major factors interact with each other to create the platform for dynamic pricing. Meeting customer expectations creates the channel for real-time buyer feedback, and this is achieved by the growth of the market structure. As markets mature, they can cater better to the needs of the buyers, and hence the feedback improves. Finally, an improved market scenario can help to identify the major data attributes, satiating the needs of the buyers in the market.

Discussion
The paper provides direction for the orderly development of data pricing and data markets through a systematic review of the current literature on the attributes related to data pricing and develops a comprehensive framework for the same. The results demonstrate five broad but distinct themes around data pricing methods, namely data characteristics-based pricing, quality-based pricing, query-based pricing, privacy-based pricing, and organizational value-based pricing. The methods intersect and merge into eight factors, namely data usability, data quality, customer need, data value by customer, market structure, market maturity, seller reputation, and seller objectives. The application of the Ritchie and Spencer [58] framework classifies the factors under four main dimensions, namely customer expectations, data characteristics, market situation and seller conditions. We observe a match between the results obtained from literature analysis and the real-world pricing decisions outlined in Sect. "Case examples and pricing of attributes". Furthermore, our results bring forth four distinct arguments.
Analysis of pricing decisions by data selling companies highlighted the importance of parameters such as customer expectations, data characteristics, market situation, and seller conditions.
First, economic literature showcases data buyers and data sellers as the two main players in the data market, with the data market falling under the two-sided market approach [8, 80]. Thus, fundamental economic principles of utility maximization and budget constraints get applied to discuss pricing, with focus on net utility gains [18, 60]. This forms the underlying mechanism guiding pricing in general. Our results, however, show the relationship between data buyer and seller moderated by two other parameters, namely data quality and data value. All five pricing models identified in the analysis emphasized the significant influence of data quality, like completeness, timeliness and credibility, and the significant influence of the value defined by the buyers as impacting utility maximization.
Second, while a data exchange platform forms a significant stakeholder in the data market, its role in pricing is limited and underemphasized. Data platform and data broker come into play to remove the problem of information asymmetry [1, 78] and facilitate the match between buyers and sellers with similar values. However, the current pricing models do not include needs and constraints of a data platform. Our analysis demonstrates their limited role currently as well. Data exchange platforms like IUDX and Windows Azure Marketplace are essential for data sharing. Large exchanges, when done one-on-one or manually, are highly inefficient. However, the development of platforms is a slow process and their role in pricing remains indirect through data quality or value added services.
Third, dynamic pricing models look at the supply-demand equation, adding market fairness in the model, since the entire exchange gets guided by market forces [3, 78, 80]. Models focus on the nature of the market and the number of players [55, 80]. Literature highlights that accurate pricing emerges aligned with market trends and market opportunities. Our results highlight market maturity and market structure as strong influences on market trends and opportunities. However, market maturity remains underexplored in dynamic pricing. The market structure currently is highly dynamic and changing. The pricing models, however, take market structure as static, thus limiting pragmatic and real-world use of such models.
Fourth, the real-world pricing decisions (Sect. "Case examples and pricing of attributes") highlight the significance of data attributes. Sellers are seen to base their pricing on data attributes and the information revealed about the attributes. However, the pricing attributes parameters used currently fall under only four dimensions, namely
data characteristics, market situation, customer expectations and seller conditions. Value attributes like usability, quality, seller reputation and demand assumptions are missing from the pricing models. Our review and analysis highlight these four dimensions as significant as well, thus defining efficient pricing as more complex than understood currently.

Implications and limitations
The literature stresses the complexity of pricing a dataset and has attempted to provide methodologies by considering intrinsic data characteristics like quality, amount, and queries generated. We provide a comprehensive list of attributes, from the perspective of the data buyer, data seller, and the broker, that need to be weighed in for monetization. Thus, the framework tries to address the information asymmetry existing in the data marketplace. The framework outlined in Fig. 2 explicitly talks about the main attributes guiding pricing and emphasizes the interrelationships existing between them, to yield efficient outcomes.

The framework becomes pertinent for the evolving data exchange platforms, where appropriate pricing methodologies remain underdeveloped. Data pricing being an evolving concept, getting quantitative information or finding an appropriate proxy for the pricing of data is difficult. Finding target consumers who use data regularly and over a long period of time is also arduous. Thus, establishing conditions pertaining to data transactions becomes the basis to investigate data pricing. In the paper, we outline all the pertinent conditions and the relationships between them, to facilitate development of suitable pricing models.

Pricing of data as a service remains unexplored so far. Some obstacles and hurdles include categorizing data types, streamlining purpose, sorting data emergence (owner, industry, country) and bundling datasets based on some criteria. Moreover, information asymmetry creates a mismatch between supply and demand information. The value of data, thus, becomes challenging to quantify using a rigid and single formula. However, looking at attributes simplifies the process and also supports quantification. Using pricing attributes takes into account the requirements and interests of all three main parties, namely the seller, the buyer and the exchange (data trading) platform.

However, limitations of the study remain as follows:

• Consumer information has implications for retailers' discriminatory pricing strategies, thus manipulating the market outcome. Furthermore, consumer and buyer privacy form a significant part of any trading. Attributes-based pricing strategy, however, ignores such trends. The paper limits itself by not including the complex nuances of exchange/trading.
• While the paper provides a thorough review of qualitative literature on data pricing, the lack of quantitative analysis to complement the qualitative findings limits the paper. The study is limited to analysing already proposed pricing strategies. Future research can concentrate on utilizing quantitative data to study the proposed framework.
• The number of consumers and buyers and the differences in their preferences create competition in the market. The more consolidated the market, the higher the price for the good and the intensity of competition. In the current fragmented data market, competition remains low. However, with time, competition intensity will increase, where many buyers will choose the same supplier and switching suppliers will be common. In such a case the framework will have to include either various competition scenarios or competition as an attribute of the dataset. The current framework does not include competition intensity and hence is limited for future use.

Future recommendations and policy directions
Newer research and techniques are constantly emerging in the upcoming field of data pricing. Three avenues of future development in data pricing stemming from the results are:

• Future approaches for data pricing will utilize economic models as well as computer algorithms to develop more efficient and comprehensive models. Techniques such as deep learning will be applied to dynamic pricing, where all the stakeholders will come together as equal opportunity participants. In such circumstances, a comprehensive pricing model framework as suggested in the paper will have to be applied as the basis for the exchange between stakeholders. In other words, mapping all stakeholders to the various attributes and dimensions discussed in our results will create a workable and effective model.
• Pricing methods are still in the theoretical phase, and application to datasets or data marketplaces is in the nascent stage. The characteristics described in the framework can be applied to different real-world exchange scenarios to build newer data pricing methods and techniques. Additionally, data of different natures can be tested for the applicability of different attributes to the dataset. In such a case weighted averages can make the pricing method more robust.
• Data security and data privacy take centre stage for data exchange. Pricing methods have to be based on the nature and amount of privacy demanded by the data owner. Data privacy is not an attribute but an access variable, hence not included in the current model. However, methods will have to include privacy when pricing data. In such a scenario, a measure of privacy and compensation for privacy will have to be included in the pricing model.

Data have become an important strategic resource and factor of production, with significant impact on public and private decisions. However, the currently fragmented data limit exchange/trading, thus creating information silos and difficulty in commercial use. Furthermore, the lack of fair assessment techniques leaves the price of data unmatched to its true value. In such circumstances, a non-personal data governance framework by the regulators/governments requires development to reduce systemic problems of data quality, information asymmetry and ethics. Basic policy guidelines for data access, data disclosure norms, data quality and standards, user charges/pricing, usage rights, ethics and fair use of data, and data monitoring and security are an urgent imperative. Data governance policies are key to promoting the circulation of data and ensuring accurate pricing.

Performance-driven pricing strategies will take over the current base pricing methods. With machine learning development, data valuation will be aided by RL. In such circumstances, guidelines for future data collection become a government imperative to enable fairness and awareness among data owners. Policies for privacy-driven pricing strategies which include data owner roles, clarity on their responsibilities and options of sharing are essential to ensure a just data environment.

Appendix 1: Methodology

| Aspect | Details | Description |
|---|---|---|
| Study method | Mixed method content analysis | The research uses mixed method content analysis by employing the methodology outlined by Mangiaracina et al. [50] and qualitative data analysis framework by Ritchie and Spencer [59] |
| Paper selection | Paper selection using methodology outlined by Mangiaracina et al. [50] | An iterative search strategy using different combinations of broad keywords. Publications were then selected based on the scope of the article |
| Years of scope | Literature considered for the present study | Literature between 2000 and 2024 was considered for analysis |
| Coding structure and keywords identification | Coding structure and keywords are identified from literature for categorizations | The data get coded in two stages. In the first stage, five main codes are created as “first level code”. In the second stage, for the single construct of data pricing, sub-codes or “second level codes” are created based on data pricing attributes |
| Content analysis | Ritchie and Spencer [59]’s framework highlights the indexing technique to select relevant data and break down into a dataset of a manageable size | Indexing technique is employed where thematic framework is systematically applied to data to select relevant data and break down into a dataset of a manageable size. The literature then gets grouped into four categories: strategic, evaluative, contextual, and diagnostic. Recurring factors are identified across pricing models which can be categorized into distinct themes |
| Framework development | Developing conceptual framework based on results obtained | A conceptual framework for pricing data is developed based on the themes emerging from content analysis |
| Case study | Real-world data are used to demonstrate the practical application of our conceptualization | Analysis of data-selling businesses is conducted using publicly available data |

Abbreviations
IUDX: India Urban Data Exchange
RL: Reinforcement learning

Acknowledgements
The authors would like to gratefully acknowledge OMIDYAR NETWORK for their support through funds and IUDX INDIA for their support through multiple discussions on practical ways of pricing data.
Author contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by all authors. The first draft of the manuscript was written by all authors, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Funding
OMIDYAR NETWORK (Grant No. SP/OMNI-20-0001.06).

Availability of data and material
The data used in the paper are all secondary data. The secondary data are public data. The data can be accessed through a repository available with the first author.

Declarations

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors have no conflict of interest.

Received: 15 March 2024 Accepted: 23 December 2024

References
1. Abbas AE, Agahari W, Van de Ven M, Zuiderwijk A, De Reuver M (2021) Business data sharing through data marketplaces: a systematic literature review. J Theor Appl Electron Commer Res 16(7):3321–3339
2. Acemoglu D, Makhdoumi A, Azarakhsh M, Ozdaglar A (2022) Too much data: prices and inefficiencies in data markets. Am Econ J Microecon 14(4):218–256. https://doi.org/10.1257/mic.20200200
3. Agarwala A, Dahleha M, Thibaut H, Ruia M (2023) Towards data auctions with externalities a institute for data, systems, and society, working paper MA 02139, Massachusetts Institute of Technology, Cambridge
4. Azcoitia SA, Iordanou C, Laoutaris N (2022) Measuring the price of data in commercial data marketplaces. In: Proceedings of the 1st international workshop on data economy, December 6, ISBN: 978-1-4503-9923-4. Association for Computing Machinery, New York, pp 1–7
5. Badewitz W, Hengesbach C, Weinhardt C (2022) Challenges of pricing data assets: a literature review. In: IEEE 24th conference on business informatics (CBI), vol 1, pp 80–89, 15–17 June. IEEE Explore
6. Balazinska M, Howe B, Koutris P, Suciu D, Upadhyaya P (2013) A discussion on pricing relational data. In: Tannen V, Wong L, Libkin L, Fan W, Tan WC, Fourman M (eds) Search of elegance in the theory and practice of computation: essays dedicated to peter Buneman. Springer, Cham, pp 167–173
7. Ballantyn A (2020) How should we think about clinical data ownership? J Med Ethics 46(5):289–294
8. Bataineha A, Mizounib R, Barachic ME, Bentahar J (2016) Monetizing personal data: a two-sided market approach. Proc Comput Sci 83:472–479
9. Bergemann D, Bonatti A (2015) Selling cookies. Cowles foundation discussion paper no. 1920R. Comput Ind Eng 112:1–10
10. Bergemann D, Bonatti A, Gan T (2022) The economics of social data. Rand J Econ 53(2):263–296
11. Bräutigam T, Miettinen S (2016) Data protection, privacy and European regulation in the digital age. Unigrafia OY, Helsinki
12. Chen J, Li M, Xu H (2022) Selling data to a machine learner: pricing via costly signaling. In: International conference on machine learning. PMLR, pp 3336–3359
13. Campbell C, Sands S, Ferraro C, Tsao HY, Mavrommatis A (2020) From data to action: how marketers can leverage AI. Bus Horiz 63(2):227–243
14. Cohen MC, Lobel I, Paes Leme R (2020) Feature-based dynamic pricing. Manag Sci 66(11):4921–4943
15. Cong Z, Luo X, Pei J, Zhu F, Zhang Y (2022) Data pricing in machine learning pipelines. Knowl Inf Syst 16:1417–1455
16. Cornford IR (1996) The defining attributes of “skill” and “skilled performance”: some implications for training, learning, and program development. Aust N Z J Vocat Educ Res 4(2):1–25
17. Deep S, Koutris P (2017) QIRANA: a framework for scalable query pricing. In: Proceedings of the 2017 ACM international conference on management of data. Association for Computing Machinery, New York, pp 699–713
18. Fernandez RC, Subramaniam P, Franklin MJ (2020) Data market platforms: trading data assets to solve data problems. In: Proceedings of the VLDB endowment, vol 13, no 11, ISSN 2150-8097. https://doi.org/10.14778/3407790.3407800
19. Ferreira VC, Esmat HH, Lorenzo B, Kundu S, Mg FF (2022) Reinforcement learning based multi-attribute slice admission control for next-generation networks in a dynamic pricing environment. In: 2022 IEEE 95th vehicular technology conference: (VTC2022-Spring), IEEE, pp 1–5
20. Fricker SA, Maksimov YV (2017) Pricing of data products in data marketplaces. In: International conference on software business, Springer, pp 49–66
21. Fruhwirth M, Rachinger M, Prlja E (2020) Discovering business models of data marketplaces. In: Proceedings of the 53rd Hawaii international conference on system sciences, pp 5736–5747. https://scholarspace.manoa.hawaii.edu/handle/10125/64446
22. Gkatzelis V, Aperjis C, Huberman BA (2015) Pricing private data. Available at SSRN: https://ssrn.com/abstract=2146966 or https://doi.org/10.2139/ssrn.2146966
23. Golrezaei N, Nazerzadeh H (2014) Pricing scheme for metropolitan traffic data markets. In: Proceedings of 3rd international conference on data management technologies and applications, pp 266–271
24. Gurkan H, De Véricourt F (2022) Contracting, pricing, and data collection under the AI flywheel effect. Manag Sci 68:8791–8808. https://doi.org/10.1287/mnsc.2022.4333
25. Hager P, Holland S (2006) Graduate attributes, learning and employability. Springer, Cham
26. Hao J, Deng Z, Li J (2023) The evolution of data pricing: from economics to computational intelligence. Heliyon 9(9):e20274. https://doi.org/10.1016/j.heliyon.2023
27. Hao J, Yuan J, Li J, Liu M, Liu Y (2023) Ensemble pricing model for data assets with ranking-pruning-averaging strategy. Proc Comput Sci 221:813–820
28. Harris RJ, Longley PA (2002) New data and approaches for urban analysis: modelling residential densities. Trans GIS 4(3):217–234
29. Harmon R, Demirkan H, Hefley B, Auseklis N (2009) Pricing strategies for information technology services: a value-based approach. In: 2009 42nd Hawaii international conference on system sciences, IEEE, pp 1–10
30. Haserot FS (1953) Spinoza’s definition of attribute. Philos Rev 62(4):499–513
31. Heckman JR, Boehmer EL, Peters EH, Davaloo M, Kurup NG (2015) A pricing model for data markets. In: iConference 2015 proceedings, Core, UK
32. Jaisingh J, Barron J, Mehta S, Chaturvedi A (2008) Privacy and pricing personal information. Eur J Oper Res 187(3):857–870
33. Jeong Y (2023) Enhancing policy and regulatory approaches to strengthen digital, platform, and data economies. Asian Development Bank, Mandaluyong
34. Jia-qi W, Zhang M (2023) Research on data asset pricing based on bargaining model. Proc Comput Sci 221:601–608
35. Kiernan MD, Hill M (2018) Framework analysis: a whole paradigm approach. Qual Res J 18(3):248–261
36. Kopalle PK, Pauwels K, Akella LY, Gangwar M (2023) Dynamic pricing: Definition, implications for managers, and future research directions. J Retail 99(4):580–593
37. Koutris P, Upadhayaya P, Balazinska M, Howe B, Suciu D (2015) Query-based data pricing. J ACM 62(5):1–44
38. Koutroumpis P, Leiponen A, Thomas LD (2020) Markets for data. Ind Corp Chang 29(3):645–660
39. Kuempel A (2016) The invisible middleman: a critique and call for reform of the data broker industry. Northwest J Int Law Bus 36(1):207
40. Kushal A, Moorthy S, Kumar V (2011) Pricing for data markets. In: Working paper, University of Washington, USA. https://courses.cs.washington.edu/courses/cse544/11wi/projects/kumar_kushal_moorthy.pdf
41. Lawler-Wilson C (1979) Pricing new products: the application of a multi-attribute model. Manag Decis 17(4):304–316. https://doi.org/10.1108/eb001194
42. Li C, Miklau G (2012) Pricing aggregate queries in a data marketplace. In: WebDB, pp 19–24
43. Li C, Li DY, Miklau G, Suciu D (2014) A theory of pricing private data. ACM Trans Database Syst (TODS) 39(4):1–28
44. Li X, Yao J, Liu X, Guan H (2017) A first look at information entropy-based data pricing. In: IEEE 37th international conference on distributed computing systems, 5–8 June, IEEE Explore
45. Lia BX, Raghunathan S (2013) Pricing and disseminating customer data with privacy awareness. Decis Support Syst 59:63–73
46. Liang F, Yu W, An D, Qingyu Y, Fu X, Zhao W (2018) A survey on big data market: pricing, trading and protection. IEEE Spec Sect Priv Preserv Large-Scale User Data Soc Netw 6:15132–15154
47. Liang J, Yuan C (2021) Data price determinants based on a hedonic pricing model. Big Data Res 25:100249
48. Liao J, Li R (2023) Establishing a two-way transaction pricing model of “platform-individual” co-creation data property rights. J Innov Knowl 8(4):1004–1027
49. Lin KY (2006) Dynamic pricing with real-time demand learning. Eur J Oper Res 174(1):522–538
50. Mangiaracina R, Marchet G, Perotti S, Tumino A (2015) A review of the environmental implications of B2C e-commerce: a logistics perspective. Int J Phys Distrib Logist Manag 45(6):565–591
51. Mehta S, Dawande M, Janakiraman G, Mookerjee V (2019) How to sell a dataset? Pricing policies for data monetization. Inf Syst Res 32:1281
52. Mendizabal-Arrieta G, Castellano-Fernández E, Rapaccini M (2023) A pricing model to monetize your industrial data. Front Manuf Technol 3:1057537
53. Muralidhar K, Palk L (2018) A free ride: data brokers’ rent-seeking behavior and the future of data inequality. Vanderbilt J Entertain Technol Law 20(3):779
54. Muschalle A, Stahl F, Vossen G (2013) Pricing approaches for data markets. In: Castellanos M, Dayal U, Rundensteiner EA (eds) BIRTE 2012, LNBIP, vol. 54, pp 129–144
55. Niu C, Zheng Z, Tang S, Gao X, Wu F (2019) Making big money from small sensors: trading time-series data under pufferfish privacy. In: IEEE INFOCOM 2019-IEEE conference on computer communications, IEEE, pp 568–576
56. Pei J (2020) A survey on data pricing: from economics to data science. IEEE Trans Knowl Data Eng 34(10):4586–4608
57. Radic M, Herrmann P, Stein T, Kleine S (2023) Data marketplaces: value drivers and pricing approaches of data. In: ECIS 2023 research papers, vol 311. https://aisel.aisnet.org/ecis2023_rp/311
58. Ritchie J, Spencer L (1994) Qualitative data analysis for applied policy research. In: Bryman A, Burgess RG (eds) Analysing qualitative data. Routledge, London, pp 173–194
59. Ritchie J, Spencer L (2002) Qualitative data analysis for applied policy research. In: Huberman AM, Miles MB (eds) The qualitative researcher’s companion. SAGE Publications Inc, Thousand Oaks, pp 305–329
60. Ritala P, Keränen J, Fishburn J, Ruokonen M (2024) Selling and monetizing data in B2B markets: four data-driven value propositions. Technovation 130:102935
61. Saharan S, Bawa S, Kumar N (2020) Dynamic pricing techniques for intelligent transportation system in smart cities: a systematic review. Comput Commun 150:603–625
62. Sarangi U (2018) Information economy and data protection laws: a global perspective. Int J Bus Manag Res 6(2):15–35
63. Schmalz MC (2021) Recent studies on common ownership, firm behavior, and market outcomes. Antitrust Bull 66(1):12–38. https://doi.org/10.1177/0003603X20985804
64. Schomm F, Stahl F, Vossen G (2013) Marketplaces for data: an initial survey. SIGMOD Record 42(1):15–26
65. Seuring S, Gold S (2012) Conducting content-analysis based literature reviews in supply chain management. Supply Chain Manag J 17:544–555
66. Shen Y, Guo B, Shen Y, Duan X, Dong X, Hong Z (2016) A pricing model for big personal data. Tsinghua Sci Technol 21(5):482–490
67. Shen Y, Guo B, Shen Y, Duan X, Dong X, Zhang H, Zhang C, Jiang Y (2022) Personal big data pricing method based on differential privacy. Comput Secur 113:102529
68. Smolin A (2023) Disclosure and pricing of attributes. Rand J Econ 54(4):570–597
69. Stahl F, Vossen G (2016) Fair knapsack pricing for data marketplaces. In: Advances in databases and information systems: 20th east European conference, ADBIS 2016, Prague, Czech Republic, August 28–31, proceedings 20. Springer, pp 46–59
70. Tang R, Amarilli A, Senellart P, Bressan S (2016) A framework for sampling-based XML data pricing. Transactions on large-scale data-and knowledge-centered systems XXIV: special issue on database-and expert-systems applications. Springer, Cham, pp 116–138
71. Tian Y, Ding Y, Fu S, Liu D (2022) Data boundary and data pricing based on the Shapley value. IEEE Access 10:14288–14300
72. Xing A, Wang H (2024) Pricing and sample set strategies of data providers under quality information asymmetry. J Oper Res Soc 75(2):278–296
73. Xu J, Hong N, Xu Z, Zhao Z, Wu C, Kuang K, Shum H (2023) Data-driven learning for data rights, data pricing, and privacy computing. Engineering 25:66–76
74. Yang J, Xing C (2019) Personal data market optimization pricing model based on privacy level. Information 10(4):1–23
75. Yang J, Zhao C, Xing C (2019) Big data market optimization: pricing model based on data quality. Wiley, New York
76. Ye P, Qian J, Chen J, Wu C-H, Zhou Y, Mars SD (2018) Customized regression model for Airbnb dynamic pricing. Appl Data Sci Track Paper. https://doi.org/10.1145/3219819.3219830
77. Yeh CL (2018) Pursuing consumer empowerment in the age of big data: a comprehensive regulatory framework for data brokers. Telecommun Policy 42(4):282–292
78. Yu Y, Yao S, Li J, Wang FY, Lin Y (2023) SWDPM: a social welfare-optimized data pricing mechanism. In: 2023 IEEE international conference on systems, man, and cybernetics (SMC), IEEE, pp 2900–2906
79. Yu H, Zhang M (2017) Data pricing strategy based on data quality. Comput Ind Eng 112:1–10
80. Zhang M, Arafa A, Huang J, Poor HV (2021) Pricing fresh data. IEEE J Sel Areas Commun 39(5):1211–1225
81. Zhang M, Beltrán F (2020) A survey of data pricing methods. SSRN Electron J. https://doi.org/10.2139/ssrn.3609120
82. Zhang M, Beltrán F, Liu J (2023) A survey of data pricing for data marketplaces. IEEE Trans Big Data 9:1038
83. Zhang X, Yue WT, Yu Y, Zhang X (2023) How to monetize data: an economic analysis of data monetization strategies under competition. Decis Support Syst 173:114012
84. Zheng Z, Peng Y, Wu F, Tang S, Chen G (2017) An online pricing mechanism for mobile crowdsensing data markets. In: Proceedings of the 18th ACM international symposium on mobile ad hoc networking and computing, pp 1–10

Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
