Adding your own task

Before developing a new task, first decide whether you want to modify an existing task or create a new one.

Usually, it's preferable to modify an existing task by adding a new parameter that users can specify in their execution plan. If you instead provide two variants of the same task (e.g. InsertBlock1 and InsertBlock2), child tasks have a harder time defining their dependencies, because they have to choose which variant to depend on.
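The parameter-based approach can be sketched in plain Rust, outside the Carp DSL. This is only an illustration: `InsertBlockConfig`, `include_payload`, `BlockRow`, and `insert_block` are hypothetical names, not part of Carp.

```rust
/// Hypothetical configuration for a single task (not a real Carp type).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct InsertBlockConfig {
    /// When true, the raw block payload is stored alongside the hash.
    pub include_payload: bool,
}

/// Hypothetical row that the task would write.
#[derive(Debug, PartialEq)]
pub struct BlockRow {
    pub hash: Vec<u8>,
    pub payload: Option<Vec<u8>>,
}

/// One task whose behaviour is selected by its configuration,
/// instead of two separate task variants.
pub fn insert_block(config: &InsertBlockConfig, hash: &[u8], payload: &[u8]) -> BlockRow {
    BlockRow {
        hash: hash.to_vec(),
        // Only keep the payload if the user asked for it.
        payload: config.include_payload.then(|| payload.to_vec()),
    }
}
```

Because there is still only one task name, child tasks can depend on it no matter which value of the parameter a user picks.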

If you still want to add a new task, here are the steps to follow:

  1. (If needed) Add a new entity for your schema definition in indexer/entity
  2. (If needed) Add a new migration that creates your new table in indexer/migration
  3. Add a new task to indexer/tasks
  4. Add the new task to your execution plan in indexer/execution_plan

Tasks are easiest to write using the carp_task DSL.

Here is an example task that you can use as a reference:

example_task.rs
use crate::config::EmptyConfig::EmptyConfig;
use crate::dsl::database_task::BlockGlobalInfo;
use crate::dsl::task_macro::*;

carp_task! {
  // The task name. This is what will show up in the task graph
  // and this is how you specify dependencies
  name ExampleTask;
  configuration EmptyConfig;
  doc "An example task to help people learn how to write custom Carp tasks";
  // The era your task operates on. Note: different eras have different block representations
  era multiera;
  // List of dependencies for this task. This is an array of names of other tasks
  // Note: your task will run if all dependencies either ran successfully OR were skipped for this block
  dependencies [];
  // Specify which fields your task will have read-access to
  read [multiera_txs];
  // Specify which fields your task will have write-access to
  write [multiera_addresses];
  // Specify whether or not your task needs to run for a given block
  // Note that by design, this function:
  // 1) CANNOT access parent task state
  // 2) Is NOT async
  // 3) CANNOT save intermediate state
  // (1) is because this function is called BEFORE any task is actually run to generate the actual execution plan for a block
  // (2) is because this is meant to be a cheap optimization to skip tasks if they clearly aren't required
  //     Ex: your task can be skipped if no txs exist in the block, if no metadata exists in the block, etc.
  // (3) is because the cost of storing and passing around intermediate state would be more expensive than recomputing
  should_add_task |_block, _properties| {
    true
  };
  // Specify what your task actually does
  // Your task has access to the full block data and any data you specified in either `read` or `write`
  execute |_previous_data, task| handle_dummy(
      task.db_tx,
      task.block,
  );
  // Specify how to merge the result of your task back into the global state
  merge_result |data, _result| {
  };
}

async fn handle_dummy(
    _db_tx: &DatabaseTransaction,
    _block: BlockInfo<'_, cml_multi_era::MultiEraBlock, BlockGlobalInfo>,
) -> Result<(), DbErr> {
    Ok(())
}
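The constraints on `should_add_task` describe a cheap, synchronous, stateless predicate that only inspects the block. A minimal sketch of that idea in plain Rust follows. It does not use the Carp DSL, and `Block`, `TaskSpec`, `plan_for_block`, and the task name `ExampleMetadataTask` are hypothetical stand-ins rather than Carp APIs.

```rust
/// Hypothetical, simplified stand-in for a decoded block.
pub struct Block {
    pub tx_count: usize,
    pub has_metadata: bool,
}

/// Hypothetical task descriptor: a name plus a cheap predicate.
pub struct TaskSpec {
    pub name: &'static str,
    /// Plain function pointer: not async, no access to other tasks' state.
    pub should_add: fn(&Block) -> bool,
}

/// A task that runs for every block.
fn always(_block: &Block) -> bool {
    true
}

/// A task that can be skipped when the block clearly has nothing for it.
fn needs_metadata(block: &Block) -> bool {
    block.tx_count > 0 && block.has_metadata
}

/// Build the list of tasks to run for one block BEFORE any task executes.
pub fn plan_for_block(block: &Block) -> Vec<&'static str> {
    let tasks = [
        TaskSpec { name: "ExampleTask", should_add: always },
        TaskSpec { name: "ExampleMetadataTask", should_add: needs_metadata },
    ];
    tasks
        .iter()
        .filter(|task| (task.should_add)(block))
        .map(|task| task.name)
        .collect()
}
```

Since the predicates see only the block, the whole plan can be computed up front, which is why they cannot read parent task state or keep intermediate results.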

As the example task shows, all tasks share access to an execution context that holds the variables they can read from and write to. This context usually contains things like the database IDs of recently added data, so that other tables can reference that data properly.
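To illustrate how database IDs flow through the shared context, here is a minimal sketch in plain Rust. The `ExecutionContext` struct, its `tx_ids` field, and both functions are hypothetical simplifications. In Carp the context fields are generated by the DSL and the IDs come from the database, not from a counter.

```rust
/// Hypothetical shared state passed between tasks for one block.
#[derive(Debug, Default)]
pub struct ExecutionContext {
    /// Database IDs of the transactions inserted for the current block.
    pub tx_ids: Vec<i64>,
}

/// Parent task: pretends to insert transactions and records their IDs
/// in the context (write access).
pub fn insert_txs(ctx: &mut ExecutionContext, first_free_id: i64, tx_count: usize) {
    ctx.tx_ids = (first_free_id..first_free_id + tx_count as i64).collect();
}

/// Child task: builds `(tx_id, output_index)` rows that reference the
/// parent's rows through the IDs stored in the context (read access).
pub fn build_output_rows(ctx: &ExecutionContext, outputs_per_tx: &[usize]) -> Vec<(i64, usize)> {
    ctx.tx_ids
        .iter()
        .zip(outputs_per_tx)
        .flat_map(|(&tx_id, &count)| (0..count).map(move |index| (tx_id, index)))
        .collect()
}
```

The child task never queries the database for the transaction IDs. It relies on the parent task having written them to the context, which is why its dependencies and `read` fields need to be declared.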

execution_context.rs
pub use crate::era_common::OutputWithTxData;
pub use entity::{
    prelude::*,
    sea_orm::{prelude::*, DatabaseTransaction},
};
pub use std::collections::BTreeMap;

#[macro_export]
macro_rules! data_to_type {
  // genesis
  (genesis_block) => { Option<BlockModel> };
  (genesis_txs) => { Vec<TransactionModel> };
  (genesis_addresses) => { Vec<AddressModel> };
  (genesis_outputs) => { Vec<TransactionOutputModel> };

  // byron
  (byron_block) => { Option<BlockModel> };
  (byron_txs) => { Vec<TransactionModel> };
  (byron_addresses) => { BTreeMap<Vec<u8>, AddressInBlock> };
  (byron_inputs) => { Vec<TransactionInputModel> };
  (byron_outputs) => { Vec<TransactionOutputModel> };

  // multiera
  (multiera_block) => { Option<BlockModel> };
  (multiera_txs) => { Vec<TransactionModel> };
  (vkey_relation_map) => { RelationMap };
  (multiera_queued_addresses_relations) => { BTreeSet<QueuedAddressCredentialRelation> };
  (multiera_stake_credential) => { BTreeMap<Vec<u8>, StakeCredentialModel> };
  (multiera_addresses) => { BTreeMap<Vec<u8>, AddressInBlock> };
  (multiera_metadata) => { Vec<TransactionMetadataModel> };
  (multiera_outputs) => { Vec<TransactionOutputModel> };
  (multiera_used_inputs) => { Vec<TransactionInputModel> };
  (multiera_used_inputs_to_outputs_map) => { BTreeMap<Vec<u8>, BTreeMap<i64, OutputWithTxData>> };
  (multiera_assets) => { Vec<NativeAssetModel> };
}

pub(crate) use data_to_type;
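The `data_to_type` macro maps a field name to the Rust type stored under that name. The standalone sketch below uses the same pattern with two made-up fields, `example_tx_ids` and `example_addresses`. These fields and the two helper functions are hypothetical and do not exist in Carp.

```rust
use std::collections::BTreeMap;

// Same pattern as the macro above: each arm maps a field name to a type.
// Both field names here are made up for illustration.
macro_rules! data_to_type {
    (example_tx_ids) => { Vec<i64> };
    (example_addresses) => { BTreeMap<Vec<u8>, i64> };
}

/// The return type is resolved by the macro to `Vec<i64>`.
pub fn collect_tx_ids(count: i64) -> data_to_type!(example_tx_ids) {
    (1..=count).collect()
}

/// The return type is resolved by the macro to `BTreeMap<Vec<u8>, i64>`.
/// Each raw address is mapped to its position in the input slice.
pub fn index_addresses(raw: &[&[u8]]) -> data_to_type!(example_addresses) {
    raw.iter()
        .enumerate()
        .map(|(position, bytes)| (bytes.to_vec(), position as i64))
        .collect()
}
```

Keeping the name-to-type mapping in one macro means that a field's type is declared once and every place that refers to the field by name gets the same type.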