In the past 50 days, we have completed many important improvements, and the Kasplex indexer will soon be open source. Before that, I would like to briefly share the technical points of the new system architecture, and btw, let my tense nerves relax a little😆.
Splitting of indexers (modularization).
The initial version had a single system responsible for all tasks, which obviously increased the complexity of the code and the difficulty of debugging/testing. In the new architecture, the system is split into multiple independent components: Node-Syncer, OP-Executor, OP-Stats/API.
Node-Syncer.
The Syncer communicates with the node in real time, archives the original data to the NoSQL cluster, and handles block-reorg events. The data includes three parts: block data, transaction data, and sorted data (VSPC).
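To make the three parts and the reorg handling concrete, here is a minimal in-memory sketch. The `Archive` class, its field names, and the shape of the chain update (a list of removed chain block hashes plus a list of added chain blocks with their accepted transaction ids) are assumptions for illustration, not the actual Kasplex schema or the node API.

```python
from dataclasses import dataclass, field


@dataclass
class Archive:
    """In-memory stand-in for the three archived tables (hypothetical layout)."""
    blocks: dict = field(default_factory=dict)        # block hash -> raw block data
    transactions: dict = field(default_factory=dict)  # txid -> raw transaction data
    vspc: list = field(default_factory=list)          # ordered (chain block hash, [txid, ...])

    def apply_chain_update(self, removed, added):
        """Drop reorged chain blocks from the tail, then append the new ones."""
        removed = set(removed)
        while self.vspc and self.vspc[-1][0] in removed:
            self.vspc.pop()
        for block_hash, txids in added:
            self.vspc.append((block_hash, list(txids)))


archive = Archive()
# Normal sync: two chain blocks are added.
archive.apply_chain_update([], [("b1", ["tx1"]), ("b2", ["tx2", "tx3"])])
# Reorg: b2 leaves the selected chain and is replaced by b2x and b3.
archive.apply_chain_update(["b2"], [("b2x", ["tx3"]), ("b3", ["tx2"])])
print([block_hash for block_hash, _ in archive.vspc])
```

Block and transaction data stay in the archive after a reorg in this sketch; only the sorted view changes, which is what the executor reads.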
OP-Executor.
The Executor scans the VSPC sorting table in the archive-db in real time, retrieves and processes transaction data that complies with the protocol spec, and writes the state data generated by each executed OP to both the local db and the cluster db. In addition, because block reorgs occur, a real-time state rollback mechanism is necessary.
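One common way to get such a rollback is an undo log that remembers the previous value of every key a chain block touched. The sketch below assumes that approach; the `Executor` class, the `(key, delta)` OP format, and the per-block granularity are hypothetical simplifications, not the real executor.

```python
class Executor:
    """Toy executor: applies balance deltas and can undo them per chain block."""

    def __init__(self):
        self.state = {}     # key -> value, e.g. an address -> balance
        self.undo_log = []  # one entry per executed chain block: (hash, {key: old value})

    def execute_block(self, block_hash, ops):
        before = {}
        for key, delta in ops:
            # Remember the value a key had before this block first touched it.
            before.setdefault(key, self.state.get(key))
            self.state[key] = self.state.get(key, 0) + delta
        self.undo_log.append((block_hash, before))

    def rollback_to(self, block_hash):
        """Undo every block executed after block_hash (all blocks if it is unknown)."""
        while self.undo_log and self.undo_log[-1][0] != block_hash:
            _, before = self.undo_log.pop()
            for key, old in before.items():
                if old is None:
                    self.state.pop(key, None)  # key did not exist before the block
                else:
                    self.state[key] = old


ex = Executor()
ex.execute_block("b1", [("alice", 100)])
ex.execute_block("b2", [("alice", -30), ("bob", 30)])
ex.rollback_to("b1")  # b2 was reorged out
print(ex.state)
```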
OP-Stats/API.
We need to provide a public api service that includes real-time query of state data (such as asset balance) and query of statistics or historical OPs. Considering the efficiency of OP execution, the latter should be separated from the core execution logic. OP-Stats scans the OP list, generates the indexes required for each query, and completes several statistics at the same time. When querying state data, the API gateway directly uses the real-time data provided by the executor, and uses the data generated by the OP-Stats in other cases.
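The split between the two data sources can be sketched as follows. The query kinds (`balance`, `history`, `op_count`), the OP fields, and the function names are made up for illustration and do not describe the real public API.

```python
from collections import defaultdict


def build_stats(op_list):
    """Scan the OP list once and build the indexes the history queries need."""
    by_address = defaultdict(list)     # address -> OPs in execution order
    count_by_type = defaultdict(int)   # OP type -> number of executed OPs
    for op in op_list:
        by_address[op["from"]].append(op)
        count_by_type[op["op"]] += 1
    return {"by_address": dict(by_address), "count_by_type": dict(count_by_type)}


def handle_query(kind, arg, executor_state, stats):
    """Route state queries to the executor's data, everything else to OP-Stats."""
    if kind == "balance":
        return executor_state.get(arg, 0)
    if kind == "history":
        return stats["by_address"].get(arg, [])
    if kind == "op_count":
        return stats["count_by_type"].get(arg, 0)
    raise ValueError(f"unknown query kind: {kind}")


ops = [
    {"op": "mint", "from": "alice", "amt": 100},
    {"op": "transfer", "from": "alice", "amt": 30},
    {"op": "mint", "from": "bob", "amt": 50},
]
stats = build_stats(ops)
executor_state = {"alice": 70, "bob": 80}  # real-time state kept by the executor
print(handle_query("balance", "alice", executor_state, stats))
print(handle_query("op_count", "mint", executor_state, stats))
```

Because the indexes are built outside the executor, a slow statistics pass never delays OP execution.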
Database.
The reasoning for this part is that RocksDB provides high-speed reads/writes and transaction capabilities on local disk; at the same time, the executor writes to the Cassandra cluster to allow a distributed system with high throughput. The cluster writes reduce the performance of the OP-Executor, but they will not be necessary for future Light-Indexers.
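A rough sketch of the dual-write idea is shown below. It uses in-memory stand-ins rather than the real RocksDB or Cassandra clients, and the write order (local first, then cluster) and the optional `cluster` argument are assumptions made for the example.

```python
class LocalStore:
    """Stand-in for the local db: applies a whole batch or nothing."""

    def __init__(self):
        self.data = {}

    def write_batch(self, batch):
        staged = dict(self.data)
        for key, value in batch.items():
            if value is None:
                staged.pop(key, None)   # None marks a deletion
            else:
                staged[key] = value
        self.data = staged              # swap in only after the whole batch is staged


class ClusterStore:
    """Stand-in for the shared cluster db; counts write round trips."""

    def __init__(self):
        self.data = {}
        self.writes = 0

    def write_batch(self, batch):
        self.writes += 1
        for key, value in batch.items():
            if value is None:
                self.data.pop(key, None)
            else:
                self.data[key] = value


def commit_state(batch, local, cluster=None):
    """Write state to the local db, and to the cluster db when one is configured."""
    local.write_batch(batch)
    if cluster is not None:             # a light indexer would pass no cluster
        cluster.write_batch(batch)


local, cluster = LocalStore(), ClusterStore()
commit_state({"alice": 70, "bob": 30}, local, cluster)
commit_state({"bob": None}, local, cluster)
print(local.data, cluster.data, cluster.writes)
```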
Necessity of archive db.
The current indexer system does not have a distributed P2P sync mechanism. The executor needs to build complete state data from the initial OP. It will become increasingly difficult for the archive node to provide a large amount of historical data (transaction and sorting data) while maintaining operation (especially at 10BPS). Another reason is that the indexer needs to calculate the transaction fee by querying the spent-TXO. In the initial version, we solved this by caching UTXO in advance, but it was not reliable enough.
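The fee calculation mentioned above follows the usual UTXO rule: the fee is the sum of the spent outputs minus the sum of the new outputs. The sketch below shows why the archived transactions are needed; the dictionary layout (`inputs` as `(previous txid, output index)` pairs, `outputs` as plain amounts) is a simplified assumption, not the real transaction format.

```python
def transaction_fee(tx, archived_txs):
    """fee = sum(amounts of the spent TXOs) - sum(amounts of the new outputs)."""
    total_in = 0
    for prev_txid, prev_index in tx["inputs"]:
        # The spent output lives in an earlier transaction, so it has to be
        # looked up in the archive; the spending transaction only references it.
        total_in += archived_txs[prev_txid]["outputs"][prev_index]
    total_out = sum(tx["outputs"])
    return total_in - total_out


archived_txs = {
    "tx_a": {"inputs": [], "outputs": [500, 300]},
    "tx_b": {"inputs": [], "outputs": [200]},
}
tx = {"inputs": [("tx_a", 1), ("tx_b", 0)], "outputs": [450, 40]}
print(transaction_fee(tx, archived_txs))  # (300 + 200) - (450 + 40)
```

If the referenced transaction is missing from the archive, the lookup fails, which is the reliability problem a partial UTXO cache runs into.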
Performance optimization measures.
In the new architecture, we replaced the db with NoSQL, and used multi-threaded concurrency and batch operations as much as possible for reads/writes. A batch state pre-reading mechanism is implemented in the OP executor to greatly reduce the number of db reads/writes. For block-reorg events, a state set list is maintained to implement batch rollback processing. In addition, we can configure aggressive or conservative policies for real-time scanning. The former stays as close to the latest block as possible, at the cost of more block reorgs and state rollbacks; the latter allows a lag of 2-3 blocks to avoid most rollbacks.
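Two of these measures are easy to sketch: the scan policy as a lag parameter, and the batch pre-read as a single multi-key fetch before executing a batch of OPs. The names (`scannable`, `CountingStore`, `execute_batch`) and the transfer-only OP format are hypothetical, and balance validation is left out to keep the example short.

```python
def scannable(vspc, cursor, lag):
    """Chain blocks after `cursor` that are at least `lag` blocks behind the tip."""
    end = max(cursor, len(vspc) - lag)
    return vspc[cursor:end]


class CountingStore:
    """Toy state db that counts read round trips."""

    def __init__(self, data):
        self.data = dict(data)
        self.round_trips = 0

    def get_many(self, keys):
        self.round_trips += 1
        return {key: self.data.get(key) for key in keys}


def execute_batch(store, ops):
    """Pre-read every key the batch touches once, then execute against the cache."""
    keys = {key for op in ops for key in (op["from"], op["to"])}
    cache = store.get_many(keys)  # one read for the whole batch
    for op in ops:
        cache[op["from"]] = (cache[op["from"]] or 0) - op["amt"]
        cache[op["to"]] = (cache[op["to"]] or 0) + op["amt"]
    return cache                  # would be written back as one batch


vspc = ["b1", "b2", "b3", "b4", "b5"]
print(scannable(vspc, 1, 0))  # aggressive: scan up to the tip
print(scannable(vspc, 1, 2))  # conservative: stay 2 blocks behind

store = CountingStore({"alice": 100})
ops = [
    {"from": "alice", "to": "bob", "amt": 30},
    {"from": "bob", "to": "carol", "amt": 10},
]
print(execute_batch(store, ops), store.round_trips)
```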
OP checkpoints and trust model.
As the basis for implementing the data trust model in the future, each successfully executed OP will generate a checkpoint, which includes the OP header, the state changes, and the previous checkpoint. We will deploy multiple independent and complete indexer systems with multiple partners (project parties, ecosystem builders, or individuals) to discover possible problems (system bugs or malicious attackers) through OP checkpoints.
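Chaining each checkpoint to the previous one means two indexers agree on a checkpoint only if they agree on the whole history before it. The sketch below assumes a SHA-256 hash over a JSON encoding of the three parts; the hash function, the encoding, and the function names are illustrative choices, not the format the indexer will use.

```python
import hashlib
import json


def next_checkpoint(prev, op_header, state_changes):
    """Hash the previous checkpoint together with the OP header and state changes."""
    payload = json.dumps([prev, op_header, state_changes], sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def checkpoint_chain(executed_ops, genesis=""):
    """Return one checkpoint per executed OP, each chained to the one before."""
    checkpoints, prev = [], genesis
    for op_header, state_changes in executed_ops:
        prev = next_checkpoint(prev, op_header, state_changes)
        checkpoints.append(prev)
    return checkpoints


def first_divergence(a, b):
    """Index of the first OP where two indexers disagree, or None if they match."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


honest = [
    ({"op": "mint", "tx": "t1"}, {"alice": 100}),
    ({"op": "transfer", "tx": "t2"}, {"alice": 70, "bob": 30}),
    ({"op": "mint", "tx": "t3"}, {"carol": 50}),
]
# A second indexer that computed a different state change for the second OP.
faulty = [honest[0], ({"op": "transfer", "tx": "t2"}, {"alice": 70, "bob": 31}), honest[2]]

print(first_divergence(checkpoint_chain(honest), checkpoint_chain(faulty)))
```

Comparing only the latest checkpoint is enough to detect a disagreement; walking the chains then locates the first OP where it happened.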
TODO, Archive-API and OP-Syncer initialization.
A newly deployed indexer system needs to at least initialize the OP-Syncer, which includes reading the historical data (starting from a certain chain block). We will provide this part of the data in the public api service, and the Syncer will then complete the initialization automatically.
R&D, Light-Indexer.
For the future high-throughput big data scenarios of 10BPS or 3000TPS, we need to complete the Light-Indexer. Ideally, it only needs to connect to a regular node (non-archival node), maintains only the necessary global state db, and verifies and corrects errors with other indexers through the P2P mechanism. Several problems need to be solved here: the data trust model (how to verify the authenticity of data, and how to trace and repair errors); how to quickly calculate the transaction fee without relying on the archive db; how to directly sync the state data when the node lacks historical data (is pruned), etc.
R&D, Decentralization and Layer-2 construction.
Decentralization raises more questions to consider. We see that KASPA will implement off-chain data verification through native support for ZK opcodes, which I think will be the basis for all second layers. In addition, for the Kasplex protocol itself, how to support general computing layers (such as smart contracts), which consensus mechanisms can be implemented, and what incentives can be used to encourage more people (or institutions) to run it are all still under consideration.