The company behind this development deserves my respect and is well recognized in the Cardano developer community. Some time ago I wrote an article about TxPipe, which I link at the end of this article (1).
Most votes in Project Catalyst on the Cardano blockchain tend to go to end-consumer products, while infrastructure developments attract little attention from the community, because many do not understand that these proposals are essential for building those final products.
On-chain data indexing is complex because each project has unique requirements, and generic indexing solutions are either too heavy or lack the necessary features.
This proposal provides a framework that allows projects to define their data filtering and/or aggregation logic in a concise TypeScript file (a Map/Reduce algorithm). It is a tool that performs custom indexing and exposes a GraphQL API.
The team says that the approach proposed here has already been validated, as it is being applied in other blockchain ecosystems. They seek to improve the developer experience on Cardano compared to those ecosystems.
The team seeks to develop a tool to manage the following issues:
- On-chain data is massive and growing
- The entire structure of the general ledger is very complex
- The vast majority of dApps only care about a small fraction of on-chain data relevant to their use case. For example: UTxOs locked at a specific script address, UTxOs generated by a specific script address, or UTxOs holding a specific token or asset policy
- The vast majority of dApps only care about concrete ledger projections relevant to their use case. For example: the balance of a particular token per address, the Plutus datums attached to UTxOs, the Plutus data sent as redeemers, or the metadata contained under a particular tag
Current tools such as DB-Sync or Carp address the problem by trying to replicate the entire ledger into a relational database. While they are flexible enough to meet almost any query requirement, they are storage-intensive and compute-intensive, and complex queries are slow.
There are tools such as Kupo or Scrolls that approach the problem with an opinionated view of the data that needs to be indexed. They are lightweight, require few compute resources, and their queries are optimized, but the available queries and use cases are limited.
Ideally, each dApp would have a database tailor-made for its particular use case, containing only the subset of the chain relevant to the dApp, where the database schema is designed for the dApp’s needs.
All the code and infrastructure needed to sync the data on the chain should already be available as an SDK, and so developers should just focus on tuning the SDK to their dApp’s particular requirements.
Querying data should be easy and flexible, and in particular, a GraphQL endpoint should be available for querying from the browser or from any of the existing client SDKs.
For example, a DEX might create custom indices that represent its liquidity pools and swap transaction history, or an NFT marketplace might create custom indices that represent available funds, bid prices, total bids, and transaction history.
Another use case: an Oracle could create custom indexes that represent the current value and the value history of its fact statement catalogs.
In the case of the ADA Handles service, a custom index could map each handle to its latest address.
A dApp that requires batching of UTxOs could create custom indexes to keep track of the relevant UTxOs for its batching purposes.
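The use cases above differ mainly in how results are aggregated. As a sketch of one of them, the ADA Handle example calls for a last-write-wins aggregation (keep only the most recent handle-to-address assignment) rather than a sum. The record shape below is a hypothetical illustration, not the SDK's actual types:

```typescript
// Hypothetical record shape for a handle-to-address index (illustrative only;
// the real SDK types are not defined in the proposal).
type HandleRecord = { handle: string; address: string; slot: number };

// "Reduce" step: a handle can move between wallets, so we keep only the
// most recent assignment, ordering by the slot in which it was observed.
function reduceHandle(records: HandleRecord[]): HandleRecord {
  return records.reduce((latest, r) => (r.slot > latest.slot ? r : latest));
}
```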
This proposal submitted by the TxPipe team, in the Catalyst Fund10, was included in the category: ‘Developer Ecosystem – The Evolution’.
The idea is to build an open source indexing engine SDK, containing the libraries and toolchains needed to build plugins. It will be based on a modified version of Scrolls that supports plugins, and will be run using the Deno runtime or a WASM runtime.
Each plugin will be responsible for providing two functions: “Map” and “Reduce”. The “Map” function will take a Cardano block as a parameter and generate custom key/value pairs defined by the developer; the “Reduce” function will take an array of key/value pairs and aggregate them into a single key/value pair.
The indexing engine will take care of tracking the history of the chain and executing Map/Reduce operations for each block.
The result of Map/Reduce operations will be stored in a relational database (yet to be determined, but probably PostgreSQL). The plugin will declaratively provide the database schema.
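To make the Map/Reduce contract concrete, here is a sketch of a plugin that tracks the per-address balance of one token, one of the projections mentioned earlier. Every type shape and the policy ID below are assumptions for illustration; the actual SDK interface is still to be defined:

```typescript
// Hypothetical shapes -- the real SDK types are not finalized in the proposal.
type KeyValue = { key: string; value: bigint };

interface Block {
  txs: { outputs: { address: string; assets: Record<string, bigint> }[] }[];
}

// Assumed policy ID of the token we want to track (illustrative only).
const POLICY = "policy1abc";

// "Map": take a block and emit one key/value pair per output holding the token,
// keyed by the receiving address.
function map(block: Block): KeyValue[] {
  const pairs: KeyValue[] = [];
  for (const tx of block.txs) {
    for (const out of tx.outputs) {
      const amount = out.assets[POLICY];
      if (amount !== undefined) {
        pairs.push({ key: out.address, value: amount });
      }
    }
  }
  return pairs;
}

// "Reduce": aggregate pairs sharing a key into a single running balance.
function reduce(pairs: KeyValue[]): KeyValue {
  return pairs.reduce((acc, p) => ({ key: p.key, value: acc.value + p.value }));
}
```

The engine would feed each block through `map`, then call `reduce` to fold new pairs into the stored value, persisting the result in the declared database schema.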
Developers will have a base tool, and they will adapt the logic to each use case.
The tool will integrate existing GraphQL libraries that automatically generate APIs from existing database schemas. This will allow developers to query their data using modern and flexible technologies.
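To give an idea of what querying such an auto-generated API could look like from TypeScript, here is a hedged sketch. The endpoint URL, the `token_balance` table name, and the filter syntax (a Hasura-style `_eq` operator) are all assumptions; the actual schema and query dialect will depend on what each plugin declares and which GraphQL library the tool integrates:

```typescript
// Assumed local deployment URL (illustrative only).
const ENDPOINT = "http://localhost:4000/graphql";

// A query against a hypothetical `token_balance` table created by a plugin.
const query = `
  query BalancesByAddress($address: String!) {
    token_balance(where: { address: { _eq: $address } }) {
      policy_id
      amount
    }
  }
`;

// Build the JSON request body for the GraphQL POST request.
function buildRequest(address: string): string {
  return JSON.stringify({ query, variables: { address } });
}

// Send the query; requires a runtime with global fetch (Deno, Node 18+, browsers).
async function fetchBalances(address: string): Promise<unknown> {
  const res = await fetch(ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildRequest(address),
  });
  const { data } = await res.json();
  return data.token_balance;
}
```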
The proposal includes installation instructions, docker images, and provisioning scripts so that DevOps teams that want to run the system can do so on their own infrastructure.
The indexing engine will be integrated with the Demeter platform so developers can have one-click deployments of their custom indexes running in the cloud.
Success Metrics
The team considers the following dimensions to measure the success of the project:
- Activity on the Github open source repository through metrics such as, but not limited to: # of issues, clones, external contributors, stars, visitors, etc.
- Number of external repositories that include this project as a dependency
- Number of dApp projects using this feature via the hosted version on Demeter.run
As an open-source project, the results will be available to any developer in the ecosystem at every step of the development process.
The final product will be:
- An LTS version of the engine available for download in various formats (binary, Docker image, etc.)
- A CLI binary (Command Line Interface) that will serve as an entry point for developers
- A documentation website with usage and deployment instructions
- A collection with several examples of different common indexers to use as starting points
- A video tutorial showing how to create a custom indexer
Milestone 1: Scrolls Refactoring (1 month)
- Integrate Deno / WASM runtime into Scrolls pipeline
- Adapt processing logic for Map/Reduce operations
- Adapt storage backend to accept custom schemas
Milestone 2: SDK Development (1.5 months)
- Implement Deno (TypeScript) library for Map/Reduce definition
- Implement Rust library for Map/Reduce definition
- Implement Python library for Map/Reduce definition
- Implement Go library for Map/Reduce definition
- Implement a CLI (Command Line Interface) for scaffolding indexer code
- Prepare documentation and tutorials
Milestone 3: GraphQL Server (1 month)
- Implement GraphQL server to serve custom schemas
- Create end-to-end examples
Milestone 4: Demeter Integration (1 month)
- Add Scrolls as a new extension in Demeter.run
- Allow provisioning of new indexers by uploading the Map/Reduce definition
- Provide automatic provision of GraphQL endpoints for each indexer
The total requested is ₳ 207,857.
—Breakdown by resource type:
- Rust developers: 1 FTE x 4.5 months = ₳ 144,643
- Frontend / React developers: 1 FTE x 1 month = ₳ 14,286
- Technical writers: 1 FTE x 2 months = ₳ 10,714
- Project manager: 1/4 FTE x 4.5 months = ₳ 9,643
- Site-reliability engineers: 1 FTE x 1 month = ₳ 28,571
FTE = full-time equivalent
—Breakdown by milestone
- Milestone 1: ₳ 41,429
- Milestone 2: ₳ 67,500
- Milestone 3: ₳ 50,357
- Milestone 4: ₳ 48,571
The team has developed Pallas, a Rust library for Cardano that is used by several high-profile projects in the community (such as: cncli and Aiken).
With Catalyst funding they developed Oura, an off-chain data integration tool for Cardano used by many community projects (such as Pool.io and dcSpark’s Carp), as well as Dolos, a minimalist version of the Cardano node, which is gradually being released to the community as a beta version.
They have also created Demeter, a cloud hosting platform for Cardano infrastructure with several high-profile clients (such as: JPG.store, SummonPlatform, and TeddySwap).
The people in charge of this project are:
- Santiago Carmuega (TxPipe): will be in charge of system architecture and will contribute part-time as a Rust developer (Github)
- Paulo Bressan: will be responsible for Rust development (Github)
- Rodrigo Santamaría: will be responsible for UI / UX and frontend development. (Github)
- Federico Weill: will be responsible for project management.
- Florencia Luna: will be responsible for technical writing of tutorials and documentation.
LiberLion: What was your motivation for choosing Cardano to develop your work?
Santiago: Over the past 15 years, I’ve dedicated a significant portion of my career to the development of distributed systems. Around two years ago, my partner Federico Weill and I ventured into the realm of blockchain technology. Among the various options, Cardano stood out to us primarily due to its robust technical foundations, including its non-custodial proof-of-stake, the Ouroboros consensus protocol, and the extended UTXO model. As we interacted with the developers, SPOs and entrepreneurs, we realized the true potential of this ecosystem relied on its community. Our conviction grew stronger; not only did we see it as an exciting ecosystem to explore and contribute to, but we also identified an opportunity to build a sustainable business model within it. So, we made the decision to dedicate our efforts to helping this ecosystem flourish, that’s why we created TxPipe.
LiberLion: Why did you choose to develop infrastructure? It is easier to be profitable in the crypto ecosystem with the creation of end products for the mass public.
Santiago: Our decision to focus on infrastructure stemmed from our strengths and expertise in that area. It may come across as naive, but our primary motivation isn’t profit. We’re committed to building compelling technology that equips developers with the tools they need to create and innovate. We firmly believe that by offering valuable, user-friendly tools, we’ll naturally generate revenue over time. There’s a unique sense of fulfillment that comes from creating tools that others find useful, and we’re confident that we can deliver an infrastructure that many developers will appreciate and leverage.
LiberLion: If you had to choose only one negative feature of Cardano, what is the one that makes it more difficult, in general, for developers to build products?
Santiago: While I wouldn’t label it a “negative” feature per se, I would say the Extended UTxO model poses the most significant challenge when it comes to building on Cardano. This is essentially a new paradigm, one for which there is no established experience, best practices, or patterns to lean on. In comparison, the account model utilized by many other blockchains tends to be more developer-friendly. Don’t get me wrong, I’m fully convinced that in the long run, the Extended UTxO model is the superior approach. However, until the developer ecosystem matures, this model does impact product development quite significantly.
LiberLion: What is TxPipe’s business model? How do you finance your developments without Catalyst? How do you sell your infrastructure products to other developers?
Santiago: Our principal revenue stream is derived from Demeter.run, the Platform-as-a-Service (PaaS) we’ve developed. Our modus operandi is straightforward: we create open-source tools that anyone can download, customize, and operate. However, if you prefer to focus on your core business and product development without the hassle of managing infrastructure, we can handle that aspect for you via Demeter.run. Our platform is designed for those developers who would rather avoid dealing with infrastructure intricacies. Think of it as an alternative to services like Firebase or AWS, but specifically tailored for Cardano infrastructure.
LiberLion: Do you think that developing a generic indexer will be enough for developers to adapt it for each use case? I guess there will be more complex developments and maybe customizing with Scrolls is not possible.
Santiago: Indeed, there will certainly be use-cases that fall outside the scope of Scrolls. In software development, it’s rare to find a one-size-fits-all solution, and Scrolls is no exception. If your requirements involve extremely flexible data queries that account for every little detail of the ledger, DBSync is likely your best bet. Trying to tackle such requirements with Scrolls could be cumbersome and possibly less efficient. On the other hand, if your use-case perfectly aligns with what Kupo’s API provides, then you’ll significantly benefit from a lightweight, ready-made solution. Scrolls is designed to fill the gap at the middle of the spectrum between DBSync and Kupo, catering to scenarios where developers require a specific slice of the ledger, tailored to unique requirements.
LiberLion: A developer could blame you for her/his malpractice in building an incorrect indexer. Beyond the fact that your product will be open source, how do you protect yourselves from these problems? Could your company be hired to audit developments that use Scrolls?
Santiago: That’s an inherent part of the open-source landscape. Project maintainers often find themselves under considerable pressure from the community demanding swift bug fixes and feature enhancements. TxPipe is not immune to that pressure, but we’ve gained some experience on how to gracefully manage roadmaps that align our goals with those of the larger community of developers. At TxPipe, our main focus currently is the development of our open-source tools. However, we’re gradually expanding into services like consultancy, bespoke development, and audits for community projects. We welcome any projects seeking assistance to reach out to us through our various channels. In addition, if you’re a developer with questions or feedback, we encourage you to get in touch via Github Issues or our Discord channel. We believe in fostering an open dialogue with our user community to ensure we address their needs and keep improving our offerings.
You can read the original proposal at IdeaScale
. . .