Data is the Oil of the Digital World. What if Tech Giants had to Buy It from Us?

By Agnes Budzyn - 02 May 2019

Agnes Budzyn argues that data will gradually return to the hands of those who generate it.

In 2017, the Economist published a widely referenced article claiming that oil had been replaced as the world’s most valuable resource by data – specifically, data captured from users by tech companies like Google, Facebook, and Amazon. The article called for antitrust laws uniquely tailored to the tech industry in order to democratize access to data and provide room for future disruptors to enter the market. The Economist painted a bleak picture of a future where the FAANG (Facebook, Apple, Amazon, Netflix and Google) monoliths have so much data at their fingertips that they can spot and undermine a potential competitor at the earliest possible stage – before it poses a credible threat or is known by more than a handful of people.

What the Economist article doesn’t explore, however, is what happens to all the data the world creates – and will continue to create at exponential rates – regardless of whether those tech giants are subject to antitrust laws. The fact is that, whether in the hands of a few or a multitude of competing tech companies, data is not going anywhere. We are only creating more of it, and our technology is only getting better at analyzing and profiting from it.

For the past few decades, we have not had any satisfactory option but to hand over our data to tech companies (short of removing ourselves from online interconnection entirely). Data cannot simply “exist” without having somewhere to exist. Since the advent of the internet, the only place for data to really live has been in the centralized servers of tech companies. That has also meant, however, that the value and economic potential of that data have been largely restricted to those players.

We must shift how we view and treat data in order to better prepare for an increasingly data-saturated future. With the spread of blockchain technology and its integration into data-driven industries like healthcare and social media, we can gradually take back control of our data. We will be able to see which companies and entities have access to which portions of our online data (i.e. our online identity). We will be able to see how our data is being used and we will be able to respond accordingly, by allowing or restricting further access. We will have the option to request compensation in the form of access to services and money if a company wants to harvest, study and sell our information. At the end of the day, they’re making a fortune off it.

As those capabilities arrive and we begin owning and monetizing our own data, we will need to redefine “data”. One way to do so is to treat it as an asset class – just like oil and other commodities. This article is an exploration of what a world in which data is an asset class looks like. The decision on our hands will be whether or not to follow through.

“Information of information” – the argument for “asset-ization”

At the dawn of the digital age, Tim Berners-Lee, the inventor of the World Wide Web, described a world transacted on the web as the ability to buy information with information. In other words, everything becomes represented as data and information. We see this with money and it has changed our world. We are aware that our behavioural patterns, our finger movements, our tastes, our shopping habits, our movements, our likes and our political leanings are all captured and redirected at us in the form of ads. But we have only seen the tip of the iceberg. Blockchain technology will herald an age of colossal asset tokenization and transfer. There will be exponentially more data transacted on a daily basis. Without an efficient way to capture and value this data, it is likely to either go unused or be exploited against us.

Imagining data as an asset class

If we assume we could classify and treat data as an asset class today, how would it operate? One way is to imagine the creation of “data packets” – i.e. collections of information organized around a vertical, industry, behaviour or platform. Those data collections can be modular, and could be reconfigured and made more granular, based on what information a person is willing to share or what the market deems more valuable. Online shopping information, for instance, could be further divided into categories such as food, clothing, services and events. An individual could decide to bundle all those packets together and sell them to a behemoth like Walmart, since Walmart may be willing to pay roughly the same amount for all those categories. A specific clothing retailer, however, would only be interested in online clothing shopping data, and might pay a higher price to purchase just that one packet of data.
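To make the bundling idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `DataPacket` class, the `best_offer` helper, the buyer names and the prices are all invented, and no real marketplace works this way today. It simply shows a general retailer winning the full bundle while a specialist pays more for a single packet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataPacket:
    """Hypothetical 'data packet': one category of a person's data."""
    vertical: str   # e.g. "online shopping"
    category: str   # e.g. "clothing"


def best_offer(packets, offers):
    """Return (buyer, total) for the buyer paying the most for these packets.

    `offers` maps buyer -> {category: price}. A buyer pays nothing for
    categories it has not quoted a price for.
    """
    totals = {
        buyer: sum(prices.get(p.category, 0) for p in packets)
        for buyer, prices in offers.items()
    }
    buyer = max(totals, key=totals.get)
    return buyer, totals[buyer]


# Online shopping data split into the four categories named in the text.
shopping = [
    DataPacket("online shopping", category)
    for category in ("food", "clothing", "services", "events")
]

# Invented prices: the generalist pays the same for every category,
# the specialist pays more, but only for clothing.
offers = {
    "general_retailer": {"food": 5, "clothing": 5, "services": 5, "events": 5},
    "clothing_retailer": {"clothing": 12},
}

bundle_buyer, bundle_total = best_offer(shopping, offers)
clothing_only = [p for p in shopping if p.category == "clothing"]
single_buyer, single_total = best_offer(clothing_only, offers)
```

With these made-up numbers the generalist wins the bundle and the specialist wins the single clothing packet, which is the trade-off a seller would have to weigh.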

The classification and optimization of data packet configuration would, naturally, give rise to a parallel industry made up of services that organize and sort people’s data in the most profitable way possible. This model also allows for people to execute their own economic and personal beliefs. Someone could create a large portfolio of general and high-level data packets that would look similar to a market index fund of today, which casts a wide net with the expectation of smaller but more consistent returns. On the other hand, if someone cares deeply about data privacy and is only comfortable sharing certain details about their lifestyle, they can choose to configure and make available data packets that only share limited information.

Fundamental to the emergence and success of this third-party “data packet” industry is trust. Consumers of this new data services market cannot be forced into the same incentive structure that exists today, where their data is made visible to an exploitative industry. These companies must be able to offer their packet configuration services without compromising the security or identity of any piece of information or user.

Data

It’s also interesting to speculate on the mechanisms by which people could sell their data to paying companies. Someone could decide to sell historical data, providing a company with online shopping information between, say, 2010 and 2015. Something to consider, of course, would be questions around people’s ability to remove elements of their data during that historical time frame. If a company paid for data that had been “tampered” with by removing certain historical activity, would it be a true reflection of consumer behaviour and worthy of the price tag? Would companies have the right to determine whether information has been expunged from a data packet, and would they have the right to pay less for edited user data?

Users could also be paid on an ongoing basis, making certain lifestyle data available to a company or set of companies willing to pay for it. This ongoing relationship could legally be broken at any time; a user could decide s/he no longer wants to make certain information available, or a company could decide they no longer want to pay for certain data. Another option would be for a company and consumer to lock themselves in a contract – an agreement that might fetch a higher price. A user could consent to sell an agreed-upon set of data to a company for, say, the following three years. Theoretically, neither party could legally break that agreement – a situation that could provide users with confirmed and stable extra income over a certain amount of time. Something to think about is the idea of “life events” that could qualify a user to break the contract. If, for example, someone becomes ill, their consumer habits might change in a way that would make them easily identifiable as a patient of a certain illness. Clauses could be written into the agreement that allow users to break contracts if certain life events happen, like a particular illness.
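The difference between an open-ended arrangement and a locked contract with “life event” clauses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DataContract` class, its fields and the `"serious_illness"` label are hypothetical, and a real agreement would be a legal document rather than a data structure.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DataContract:
    """Hypothetical agreement to sell a set of data to one buyer."""
    seller: str
    buyer: str
    start: date
    end: date
    locked: bool = False                  # False: either side may leave at any time
    exit_events: frozenset = frozenset()  # life events that release the seller

    def seller_can_terminate(self, on: date, life_event: Optional[str] = None) -> bool:
        """Can the seller legally stop sharing data on the given date?"""
        if not self.locked:
            return True                   # ongoing, at-will arrangement
        if on >= self.end:
            return True                   # the agreed term has run out
        # Inside a locked term, only a listed life event breaks the contract.
        return life_event in self.exit_events


# A three-year locked contract with one invented exit clause.
contract = DataContract(
    seller="user",
    buyer="retailer",
    start=date(2020, 1, 1),
    end=date(2023, 1, 1),
    locked=True,
    exit_events=frozenset({"serious_illness"}),
)

mid_term = date(2021, 6, 1)
without_event = contract.seller_can_terminate(mid_term)
with_event = contract.seller_can_terminate(mid_term, "serious_illness")
```

Here the seller cannot walk away mid-term on a whim, but can do so if one of the listed life events occurs, which is the kind of clause described above.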

Exclusivity is another interesting thought exercise when it comes to the asset-ization and monetization of data. In a free, global, borderless, 24/7 and deeply liquid data market, a company might be willing to pay extra money to be the sole recipient of a user’s data instead of sharing that data among multiple competitors. The third-party data service companies discussed above might be able to flag a user as a high-worth individual – “worth” being measured by the amount and importance of the online data they create – and suggest they offer exclusivity to companies in a historical, ongoing or contract form. A user may also choose to sell their data to more buyers, which might make them less money per data packet, but more money overall if there are many companies seeking to buy.
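The exclusivity trade-off comes down to simple arithmetic: one higher exclusive price against a lower price multiplied by the number of buyers. The sketch below uses invented whole-number prices and hypothetical function names purely to illustrate where the break-even point falls.

```python
def shared_revenue(shared_price: int, buyers: int) -> int:
    """Total income from selling the same packet to several buyers."""
    return shared_price * buyers


def buyers_needed_to_beat_exclusive(exclusive_price: int, shared_price: int) -> int:
    """Smallest number of buyers whose combined payments exceed the exclusive price."""
    if shared_price <= 0:
        raise ValueError("shared_price must be positive")
    return exclusive_price // shared_price + 1


# Invented prices: 100 for exclusivity, 30 per buyer when shared.
exclusive_price = 100
shared_price = 30

break_even = buyers_needed_to_beat_exclusive(exclusive_price, shared_price)
three_buyers = shared_revenue(shared_price, 3)
four_buyers = shared_revenue(shared_price, 4)
```

With these numbers, three buyers bring in less than the exclusive deal and four bring in more, so selling widely only pays off when enough companies are seeking to buy.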

Challenges to data asset-ization

Whether or not you agree with the asset-ization of blockchain-represented data, the challenges to making it successful, manageable, decentralized and equitable are considerable:

• Technical development: Blockchain technology continues to be improved and developed upon, but is still in its infancy for most of its more esoteric and abstract applications, such as data asset-ization. Amid industry-wide interrogations over scaling, private vs public, interoperability and privacy, the matter of blockchain-based identity – i.e. being able to securely and undeniably associate one’s data with one’s identity – is unique to the question of data as an asset.

• User experience: Today’s Web3 user experience (UX) is largely embraced by the gaming industry, as an ecosystem that requires lower trust and asks for lower risk from its users. Blockchain applications that monitor people’s money, information, or identity, on the other hand, are still largely mediated by Web2-architectured companies that people feel more comfortable trusting. From a UX perspective, this makes sense. Ask new (or even veteran) Web3 users to manage a litany of wallet addresses, private keys, and seed phrases – and all the associated risk – and a company is bound to lose users. Eventually, however, the internet must become gradually more decentralized, and UX innovation will need to keep up in order to ensure users can confidently transact something as simple as a token or as nebulous as a “packet” of user data.

• Valuation: Imagining data as an asset requires packaging relevant pieces of data together into identifiable and valuable “packets” that relate to behavior, industry, lifestyle etc. How to value those packets becomes a matter of machine learning and companies’ profit strategies. How do we determine if one person’s data packet is more valuable than another’s, even if they both relate to the same set of online behaviour? How do we value quantity vs quality of data? Who determines the value of data packets – third-party companies, regulatory bodies, the data owners or the data sellers? What does a marketplace of data packets look like? If individuals rather than companies own their own data, can I sell my Facebook traffic information to Twitter? What we will likely see is a parallel industry rise alongside the asset-ization of data. These services would sort people’s information – in a secure manner non-identifiable to the third party itself – algorithmically into the most profitable data compositions based on existing market trends. These third parties may reasonably take a cut of a data packet’s profitability, but should have no visibility into the content of user data – or else we are back at square one, where third parties are scraping our data for their own profitability.

• Data learning technology: Hand-in-hand with the challenge of data valuation is the need for data to be properly analyzed, organized and valued using AI and machine learning. Most data churned into existence by online traffic is “noise”, and tech empires have been built around separating the signal from the noise. As machine learning technology improves, more data can be synthesized into useful information, bringing greater economic opportunity to both consumers and companies.

• Regulation: The entire blockchain ecosystem is more or less holding its breath, as regulators debate and determine how precisely to approach this new technology and the use cases it enables (particularly decentralized monetary behavior). Regulation has always been a step behind private technical innovation, for better or worse. So the question arises: if we decided that data should be an asset class right now, could it legally even be one? Of the five broadly understood asset classes – commodities, real estate, fixed income, equity and cash – where should one’s own digital information be placed? An argument could be made for Commodities [i.e. ownership of a “good” that has a utility] or Fixed Income [i.e. lending money (in the form of data) to someone (a company)]. Or we might need the creation of a new asset class. Real estate is defined as the “ownership of physical space”. Perhaps we need a digital asset class, defined as “ownership of digital space”. Regardless of how the regulation shakes out, companies must first create possible solutions to demonstrate to regulators what options are possible before asking them to make a decision.

Looking ahead

Powered by blockchain technology applications, data will gradually return to the hands of those who generate it. Companies will have to create alternative incentivization mechanisms in order to gain access to that data, allowing for radical market accountability concerning the ethical, permissioned and equitable use of one’s information. Recreating economic models takes time, but if we are able to integrate new blockchain-enabled monetary behaviour into existing financial systems, we can accelerate adoption. Classifying data as an asset class does just that, and brings us a step closer to an equitable and empowering relationship with the internet.

Agnes Budzyn, Managing Director, Global Growth, Consensys AG.

This first appeared on the World Economic Forum's Agenda blog.

Image credit: dirkcuys via Flickr (CC BY-SA 2.0)