Oracles and Data Security – Part I

An Oracle is middleware that serves as a bridge between a blockchain (or smart contract) and the outside world. 

It allows smart contracts to access information that is not stored on the blockchain, such as data about everyday events in the real world. This external connectivity greatly increases the number of conditions a smart contract can react to, allowing developers to capture more value in a wider range of markets. 

Blockchains that support smart contracts can serve many purposes beyond being a store of value (as Bitcoin does in its role as “digital gold”), such as insurance, supply chains, real estate, and finance.

How can we implement an insurance solution if we can only read token transactions? We cannot. That is why we need Oracles: tools that can provide reliable information to a smart contract. For example, an insurance contract must know the geographic coordinates where a tornado occurred in order to determine the location of the loss and whether the policy covers it.
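As a rough illustration of the tornado example, the sketch below checks whether an insured location falls within a covered radius of the coordinates an Oracle reports. The function names, the coordinates, and the idea of a fixed coverage radius are assumptions made for this example; a real policy would define coverage in its own terms.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_covered(insured, event, radius_km):
    """True if the reported event lies within radius_km of the insured location.

    `insured` and `event` are (latitude, longitude) tuples. The fixed radius is a
    hypothetical coverage rule chosen for illustration.
    """
    return haversine_km(*insured, *event) <= radius_km


# Hypothetical values: the event coordinates would come from the Oracle.
insured_location = (35.0, -97.0)
reported_tornado = (35.05, -97.0)
print(is_covered(insured_location, reported_tornado, radius_km=10.0))
```

The contract logic itself is simple; what it cannot do on its own is learn the tornado's coordinates, which is the part the Oracle supplies.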

Types of Oracles

Although there is no formal classification for Oracles, they can be differentiated by the way they deliver data:

  • Data Aggregator: data provided with a cryptographic signature and made available off-chain (for example, on a website or through an API). For example, a crypto exchange like Coinbase signs the data and makes it available through an API. A smart contract can verify the signature using the exchange’s public key.
  • Data Solutions: Oracles that work directly with sources to make their data available in smart contracts. They basically provide the infrastructure that makes the sources’ APIs compatible with the blockchain.
  • Decentralized Network: connected Oracle nodes that provide real-world data to smart contracts on a blockchain. For example, Chainlink.

How Can We Trust Oracles?

As the value secured by DeFi continues to grow at a rapid rate, so does the importance of ensuring that this new decentralized financial ecosystem provides stronger security and reliability guarantees to its users.

It is quite difficult to work with external data. The smart contract code might be adequate, but an attacker could fool the system by attacking the external data source, the Oracle.

The “Oracle problem” of blockchain is quite well known, as many articles have examined this topic in detail. However, the issue of the “data quality” provided by Oracles remains largely unknown and misunderstood. The misunderstanding stems from the assumption that Oracles are used both to transfer external data on-chain and to generate high-quality data. Blockchain Oracles are designed to transfer data on-chain and protect it against tampering, not to create the data itself.

Decentralized Oracle networks are a means of securely and reliably connecting on-chain and off-chain environments. If the Oracles are not held to the same security and reliability standards as the blockchain, the entire smart contract is at risk, even if the contract code is flawless.

Blockchain hash mining is a fairly uniform and secure operation, but generating high-quality data is a different kind of task. Rather than trying to use the Oracle mechanism to generate high-quality information from a collection of raw data, developers often design Oracles to pull data directly from data aggregation companies, which have large teams and full-stack infrastructure.

Generating high-quality proprietary data requires a lot of capital, so access to it is restricted. Nodes must either have a paid subscription to the data provider (API) or be specifically authorized by the data provider. Both access models require password and credential management capabilities to bridge the interaction between the node and the API. Therefore, node operators need the ability to store API keys and manage account logins to interact with these premium data providers.

Oracle solutions that cannot connect to premium APIs, due to a lack of credential management capabilities, are limited to offering open, free, or hacked APIs. These APIs generally have low-quality data, rate limits, unreliable response times, and no legally binding availability or quality-of-service guarantees, all of which can make such data sources unsuitable for high-, medium-, or even low-value use cases. Smart contracts that are fed low-quality data cannot guarantee the reliability or accuracy of the data they consume, which widens the attack surface. As with any other data-driven technology, “garbage in, garbage out.”

If the smart contracts underlying DeFi applications are open source, so that the public can inspect them in real time, then the Oracle mechanisms that supply their pricing data should also be transparent. Without transparency in the Oracle solution, DApp users cannot verify where the data is coming from, which nodes provide it, the latency of responses, the historical performance of the Oracle network, or the accuracy of its data.

Decentralization of Node Operators and Data Incorporation

Decentralization across high-quality node operators is a key design pattern that protects against unpredictable periods of downtime and eliminates the need to trust a single entity not to disrupt the process.

Decentralized consensus greatly increases the cost of an attack, because even if some nodes experience downtime or become malicious, they will have little effect on the final aggregate response.
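A minimal sketch of why a few bad nodes barely move the result, assuming the network aggregates node answers with a median (one common choice, not necessarily what any particular Oracle network uses). The function name, the `min_responses` parameter, and the sample reports are hypothetical.

```python
from statistics import median


def aggregate_reports(reports, min_responses):
    """Return the median of the prices reported by nodes.

    Nodes that are down are represented by None and ignored. If fewer than
    min_responses nodes answered, refuse to produce a value at all.
    """
    answered = [price for price in reports if price is not None]
    if len(answered) < min_responses:
        raise ValueError("not enough node responses to aggregate")
    return median(answered)


# Eight hypothetical nodes: five honest, one offline, two malicious
# (one reporting an absurdly high price, one an absurdly low price).
node_reports = [100.1, 99.9, 100.0, 100.2, 99.8, None, 1_000_000.0, 0.01]
aggregated = aggregate_reports(node_reports, min_responses=5)
print(aggregated)  # 100.0
```

With a median, an attacker has to control about half of the responding nodes before the aggregate can be moved arbitrarily, whereas a simple average would be skewed by a single extreme report.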

Developers use external adapters to quickly connect their smart contracts to any data source required for execution. They can also customize the degree of decentralization they need, the sources they want to pull data from, the algorithms used to aggregate data, and how often updates should occur. This provides maximum flexibility in how the contract consumes external data.
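To make “how often updates should occur” concrete, here is a sketch of one possible update rule: publish a new value when the price has moved by more than a configured fraction, or when a maximum interval has passed without an update. The function, its parameter names, and the numbers are assumptions for illustration, not the configuration of any specific Oracle network.

```python
def should_update(last_price, new_price, seconds_since_update,
                  deviation_threshold, heartbeat_seconds):
    """Decide whether a new value should be published on-chain.

    deviation_threshold is a fraction (0.005 means 0.5%); heartbeat_seconds is
    the longest the feed may go without an update. Both are hypothetical knobs.
    """
    if last_price <= 0:
        raise ValueError("last_price must be positive")
    if seconds_since_update >= heartbeat_seconds:
        return True  # too long since the last update, refresh regardless of price
    deviation = abs(new_price - last_price) / last_price
    return deviation >= deviation_threshold


# Small move shortly after an update: no new write is needed.
print(should_update(100.0, 100.4, 60, deviation_threshold=0.005, heartbeat_seconds=3600))
# A 1% move exceeds the 0.5% threshold: publish.
print(should_update(100.0, 101.0, 60, deviation_threshold=0.005, heartbeat_seconds=3600))
```

Tighter thresholds and shorter intervals give fresher data at the cost of more on-chain writes, which is the trade-off a developer tunes per contract.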

DeFi markets differ from traditional financial markets in that no exchange holds exclusive rights to list an asset, so no single exchange can lock users in and capture the entire trading market for that asset. Blockchain technology is permissionless, so anyone can list tokens on their exchange for traders to access at any time. Because of these dynamics, the trading volume of a cryptocurrency is spread across many exchanges and can shift quite quickly between them. An Oracle mechanism must take this into account in order to avoid market manipulation attacks, which become possible when most of the volume moves to an exchange that is not included in the data aggregation process.
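The sketch below illustrates the coverage problem with made-up numbers: a volume-weighted average price (VWAP) across exchanges, plus the share of total volume that the tracked exchanges actually represent. The exchange names, prices, and volumes are invented, and both functions are hypothetical helpers rather than part of any Oracle's API.

```python
def vwap(quotes):
    """Volume-weighted average price over (price, volume) pairs."""
    total_volume = sum(volume for _, volume in quotes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(price * volume for price, volume in quotes) / total_volume


def tracked_volume_share(volumes, tracked):
    """Fraction of total market volume that trades on the tracked exchanges."""
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum(volumes[name] for name in tracked) / total


# Invented market snapshot: most volume has moved to exchange_c.
volumes = {"exchange_a": 100.0, "exchange_b": 300.0, "exchange_c": 600.0}
prices = {"exchange_a": 100.0, "exchange_b": 102.0, "exchange_c": 101.0}

market_price = vwap([(prices[name], volumes[name]) for name in volumes])
coverage = tracked_volume_share(volumes, tracked={"exchange_a", "exchange_b"})
print(market_price, coverage)
```

In this snapshot an Oracle that tracks only exchange_a and exchange_b sees 40% of the volume, so an attacker needs to move prices on a much smaller slice of the market to influence what the Oracle reports.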

It is important to avoid applying decentralization without quality control standards, to prevent lower quality data sources from diluting the aggregation process. 

An attacker does not have to be an experienced developer to exploit these attack vectors. Any retail trader or small group of traders who become aware of such an opportunity can use an exchange’s user interface to manipulate particular markets. 

Oracle networks that pull price data from a single exchange not only have no protection against exchange downtime, sudden failures, and price manipulation, they also have extremely limited market coverage. Such a setup may appear to work during times of low volatility, but when market volatility increases, price discovery occurs and volume can shift frequently between different exchanges. Even if the Oracle is updated to track a different exchange, the new price point can be very inaccurate, because markets on different exchanges do not always move in the same way. This creates a scenario where, although the data source has changed, the Oracle still cannot reliably maintain market coverage.

Oracles that pull data directly from preselected exchanges are vulnerable to situations where the volume moves to new exchanges that were not included in the original aggregation process. While the exchanges originally chosen as data sources for the Oracle network may have been liquid when the network was created, there is no guarantee that volume will remain on these exchanges in the future. This reduces the cost of an attack for malicious actors, because only a small proportion of the overall volume of an asset needs to be manipulated.

On the other hand, there is a great risk in mixing highly secure and reliable pricing data with lower-quality data from untested and less transparent Oracle solutions, which do not support premium APIs and/or which source market data directly from exchange APIs.

Decentralization in an Oracle solution is key, but it shouldn’t come at the expense of data or node quality. 

Oracles and data sources are two distinct components that must be equally resilient when combined to provide end-to-end security. For an Oracle network to be reliable and capable of supporting a DeFi ecosystem that is responsible for billions and eventually trillions of dollars, the data that is supplied must be secure and reliable. 

In the next article I will tell you about the Oracles preparing for the Cardano ecosystem.
