Data Processor API
Documentation of Data Processor API
The Data Processor API allows users to submit computational tasks combining data lakes and aggregate functions. This document outlines how to structure requests to the API, manage authentication, and interpret the parameters needed for successful data processing.
Every call to the Data Processor API must include your API secret key. You can create your API key from the .
/submit-batch
URL: https://hdp.api.herodotus.cloud/submit-batch?apiKey={yourApiKey}
Method: POST
This endpoint accepts a JSON payload containing one or more computational tasks. Each task specifies a data lake configuration and an aggregate function to process the data.
Requests to the endpoint are organized into batches. A batch can contain multiple tasks, each defined by a combination of a data lake type and an aggregate function. For a detailed explanation of each request field, refer to .
Example: In the Ethereum Sepolia ("ETHEREUM_SEPOLIA") blockchain, calculate the average base_fee_per_gas for blocks 5,515,000 to 5,515,039.
Example: In the Ethereum Sepolia ("ETHEREUM_SEPOLIA") blockchain, determine the maximum nonce for transaction indices 10 to 40 in block 5,409,986.
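As an illustration, a request body for the first example might look like the following. This is a sketch assembled from the field reference below; the exact nesting, and the format of the destinationChainId value, are assumptions rather than a verified payload:

```json
{
  "destinationChainId": "ETHEREUM_SEPOLIA",
  "tasks": [
    {
      "type": "DatalakeCompute",
      "datalake": {
        "type": "BlockSampled",
        "chainId": "ETHEREUM_SEPOLIA",
        "blockRangeStart": 5515000,
        "blockRangeEnd": 5515039,
        "increment": 1,
        "sampledProperty": "header.base_fee_per_gas"
      },
      "compute": {
        "aggregateFnId": "avg"
      }
    }
  ]
}
```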
The endpoint returns a JSON object containing the batchId and an array of taskHashes. The task hashes are required for fetching the task results from the result map smart contract.
Example response (illustrative; the values below are placeholders):
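```json
{
  "batchId": "0x...",
  "taskHashes": [
    "0x..."
  ]
}
```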
destinationChainId: Defines the specific chain to which the result of your computation is delivered.
tasks: An array allowing you to define multiple tasks in one request. Each task will be processed in the same batch.
Each task object includes the following fields:
type: Defines the task type. Currently, we support DatalakeCompute and Module.
For DatalakeCompute Tasks:
datalake: Detailed data definition to compute over.
type: The type of data lake. For block sampled data, set to BlockSampled; for transactions in block data, set to TransactionsInBlock.
chainId: The chain ID the data should be sourced from, e.g., "ETHEREUM_SEPOLIA" for Sepolia.
blockRangeStart: Starting block number of the range.
blockRangeEnd: Ending block number of the range (inclusive).
sampledProperty: The specific property to sample. Three formats are available (see the examples after this field list):
header: Use the format header.{specific_header_field}. All RLP-decoded fields from the block header are available.
account: Use the format account.{target_address}.{specific_account_field}. All RLP-decoded fields from the account are available.
storage: Use the format storage.{target_address}.{storage_slot}. Given the target contract address, the property resolves to the value stored at the given storage slot.
increment: The step size over the range from blockRangeStart to blockRangeEnd. The default is 1.
compute:
aggregateFnId: The computation function that the task will execute. Available functions are: avg, sum, min, max, count.
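For illustration, here is one sampledProperty value in each of the three formats. The address and storage slot shown are placeholders, not real deployments:

```
header.base_fee_per_gas
account.0x1111111111111111111111111111111111111111.nonce
storage.0x1111111111111111111111111111111111111111.0x0000000000000000000000000000000000000000000000000000000000000001
```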
For Module Tasks:
programHash: The hash of the uploaded program to execute.
inputs: An array of input objects, each containing:
visibility: Specifies whether the input is public or private.
value: The value of the input parameter.
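Assembled from the fields above, a Module task might look like the following sketch; the programHash and input values are placeholders:

```json
{
  "type": "Module",
  "programHash": "0x...",
  "inputs": [
    { "visibility": "public", "value": "0x1" },
    { "visibility": "private", "value": "0x2" }
  ]
}
```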
count
The count function counts the values in the data lake that satisfy a comparison against a given value. It takes two additional fields:
operatorId: Operation symbol used to filter the value set. Available operations are:
eq (equal to ==)
nq (not equal to !=)
gt (greater than >)
gteq (greater than or equal to >=)
lt (less than <)
lteq (less than or equal to <=)
valueToCompare: The value to compare against using the specified operator.
Example: Given the data lake, count the number of values greater than 1000000000000.
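Put together, a compute object for this example might look like the sketch below. Placing operatorId and valueToCompare alongside aggregateFnId is an assumption based on the field descriptions above, and the value may need to be encoded as a string rather than a number:

```json
{
  "aggregateFnId": "count",
  "operatorId": "gt",
  "valueToCompare": 1000000000000
}
```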
/batch-query/{yourBatchId}
URL: https://hdp.api.herodotus.cloud/batch-query/{yourBatchId}
Method: GET
This endpoint allows you to query the current status of a submitted batch using the batchId.
Opened: The batch has been accepted and processing has been initiated.
ProofsFetched: Successfully fetched proofs from the preprocessor and generated the corresponding PIE object.
CachedMmrRoot: Successfully cached the MMR root and MMR size used during the preprocessing step to the smart contract.
PieSubmittedToSHARP: Successfully submitted the PIE to SHARP.
FactRegisteredOnchain: The fact hash of the batch is registered in the fact registry contract.
Finalized: Successfully authenticated the fact hash and batch, and finalized the valid result on the contract mapping.
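The exact response shape is not specified here, but a minimal sketch of a status response, assuming the statuses above are reported in a status field, might look like:

```json
{
  "batchId": "0x...",
  "status": "Finalized"
}
```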
To access computed data on Starknet, set destinationChainId to the appropriate Starknet network identifier.
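As a sketch, the top of the request body might look like the following; the identifier string "STARKNET_SEPOLIA" is a hypothetical placeholder, so check the list of supported destination chains for the exact value:

```json
{
  "destinationChainId": "STARKNET_SEPOLIA"
}
```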
When should I use the Data Processor instead of the original Herodotus API?
Depending on your use case, both products have pros and cons. If you intend to access data over large ranges of blocks, we recommend using the Data Processor. It is designed to handle large amounts of data at a much lower cost.
Note that not all RLP-decoded fields are compatible with all computations. Check out to ensure you are using a supported field.
Task hash values are returned in the /submit-batch response. Use these hashes as identifiers to fetch your valid results after the job is finished. Once the task is finalized, you can use the taskHash to query the result from the getFinalizedTaskResult function of the .
By specifying an L2 as the destination chain ID, you can access data computed with the Data Processor on that L2. This L2 delivery is facilitated by the .