Getting started with the MongoDB sink
The MongoDB integration provides a way to mirror onchain data to a MongoDB collection of your choice. Data is automatically inserted as it's produced by the chain, and it's invalidated in case of chain reorganizations.
- The integration can be used to populate a collection with data from one or more networks or smart contracts.
- Create powerful analytics with MongoDB pipelines.
- Change how collections are queried without re-indexing.
Installation
apibara plugins install sink-mongo
Collection schema
The transformation step is required to return an array of objects. Data is
converted to BSON and then written to the collection. The MongoDB integration
adds a _cursor field to each document so that data can be invalidated in case
of chain reorganizations.
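As a minimal sketch of a transformation step that returns an array of objects, consider the function below. The `Block` and `TransferEvent` shapes are hypothetical placeholders invented for illustration; the real input depends on the network and filter you are indexing. Note that the function does not set `_cursor` itself, since the integration adds it.

```typescript
// Hypothetical input shapes, used only for illustration.
type TransferEvent = { tokenId: string; from: string; to: string };
type Block = { events: TransferEvent[] };

// Each returned object becomes one document in the collection.
type OwnerDocument = { tokenId: string; owner: string };

function transform(block: Block): OwnerDocument[] {
  // Always return an array, even when the block has no matching events.
  return block.events.map((event) => ({
    tokenId: event.tokenId,
    owner: event.to,
  }));
}
```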
Important indices
To ensure the best performance, you need to add the following indices to your MongoDB collections. This is especially important if you're indexing pending blocks.
_cursor.from
_cursor.to
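One possible way to create these two indices from code is sketched below. It assumes a collection object that exposes a `createIndex` method taking an index specification and returning a promise, as the `Collection` class of the MongoDB Node.js driver does. The `IndexableCollection` interface and the helper names are hypothetical and exist only to keep the example self-contained.

```typescript
// Minimal structural type (an assumption): anything with a compatible
// createIndex method, such as a MongoDB Node.js driver collection.
interface IndexableCollection {
  createIndex(spec: Record<string, 1 | -1>): Promise<string>;
}

// One ascending single-field index per cursor field.
function cursorIndexSpecs(): Record<string, 1 | -1>[] {
  return ["_cursor.from", "_cursor.to"].map((field) => {
    const spec: Record<string, 1 | -1> = {};
    spec[field] = 1;
    return spec;
  });
}

// Creates both indices and returns the index names reported by the collection.
async function ensureCursorIndices(collection: IndexableCollection): Promise<string[]> {
  const names: string[] = [];
  for (const spec of cursorIndexSpecs()) {
    names.push(await collection.createIndex(spec));
  }
  return names;
}
```

Creating the indices once, before the indexer starts writing, is usually enough.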
Querying data
When querying data, you should always add the following property to your MongoDB filter to ensure you get the latest value:
{
  "_cursor.to": null
}
The next section contains information on why you need to add this condition to your filter.
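To avoid forgetting this condition, one option is a small helper that merges it into any filter. This is only a sketch; `latestOnly` and the `Filter` alias are hypothetical names and not part of the integration.

```typescript
// A plain object used as a MongoDB filter document.
type Filter = Record<string, unknown>;

// Returns a copy of the filter that only matches the latest value of each document.
function latestOnly(filter: Filter = {}): Filter {
  return { ...filter, "_cursor.to": null };
}
```

With the MongoDB Node.js driver, the result could be passed directly to a query, for example `collection.find(latestOnly({ owner: "0xC" }))`.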
Storage & data invalidation
Storing blockchain data poses an additional challenge since we must be able to
roll back the database state in case of chain reorganizations.
This integration adds a _cursor field to all documents to track the block
range for which a piece of data is valid.
type Cursor = {
  /** Block (inclusive) when this piece of data was created. */
  from: number;
  /** Block (exclusive) at which this piece of data became invalid. */
  to: number | null;
};
It follows that a document is valid at the most recent block if its _cursor.to
field is null.
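The same inclusive/exclusive rule can be used to check validity at any block, not only the latest one. The sketch below is an illustration derived from the field definitions above, not something provided by the integration; `isValidAt` and `validAtFilter` are hypothetical helper names. The filter uses the standard MongoDB query operators `$lte`, `$gt` and `$or`.

```typescript
type Cursor = { from: number; to: number | null };

// True when `block` falls in the half-open range [from, to).
function isValidAt(cursor: Cursor, block: number): boolean {
  return cursor.from <= block && (cursor.to === null || block < cursor.to);
}

// Equivalent MongoDB filter for "documents that were valid at `block`".
function validAtFilter(block: number): Record<string, unknown> {
  return {
    "_cursor.from": { $lte: block },
    $or: [{ "_cursor.to": null }, { "_cursor.to": { $gt: block } }],
  };
}
```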
Example: we're indexing an ERC-721 token with the following transfers:
- block: 1000, transfer from 0x0 to 0xA
- block: 1010, transfer from 0xA to 0xB
- block: 1020, transfer from 0xB to 0xC
If we put the token ownership on a timeline, it looks like the following diagram.
    1000                    1010                  1020
  --+-----------------------+---------------------+---- - - - - - - -
    [ { owner: "0xA" }      )
                            [ { owner: "0xB" }    )
                                                  [ { owner: "0xC" }
Which translates to the following documents in the MongoDB collection.
After the first transfer:
[{ "owner": "0xA", "_cursor": { "from": 1000, "to": null } }]
After the second transfer:
[
{ "owner": "0xA", "_cursor": { "from": 1000, "to": 1010 } },
{ "owner": "0xB", "_cursor": { "from": 1010, "to": null } }
]
And after the third transfer:
[
{ "owner": "0xA", "_cursor": { "from": 1000, "to": 1010 } },
{ "owner": "0xB", "_cursor": { "from": 1010, "to": 1020 } },
{ "owner": "0xC", "_cursor": { "from": 1020, "to": null } }
]
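The sequence above can be reproduced with a small in-memory simulation. This is a sketch of the bookkeeping only, not the integration's actual implementation. In particular, the reorganization handling in `invalidateAfter` (delete documents created after the last valid block, then reopen documents that were closed after it) is an assumption about how invalidation could work, and all function names are hypothetical.

```typescript
type Cursor = { from: number; to: number | null };
type OwnerDoc = { owner: string; _cursor: Cursor };

// Close the currently valid document at `block` and append the new owner.
function applyTransfer(docs: OwnerDoc[], block: number, newOwner: string): OwnerDoc[] {
  const closed = docs.map((doc) =>
    doc._cursor.to === null ? { ...doc, _cursor: { ...doc._cursor, to: block } } : doc,
  );
  return [...closed, { owner: newOwner, _cursor: { from: block, to: null } }];
}

// Assumed reorg handling: keep only the state up to and including lastValidBlock.
function invalidateAfter(docs: OwnerDoc[], lastValidBlock: number): OwnerDoc[] {
  return docs
    .filter((doc) => doc._cursor.from <= lastValidBlock)
    .map((doc) =>
      doc._cursor.to !== null && doc._cursor.to > lastValidBlock
        ? { ...doc, _cursor: { ...doc._cursor, to: null } }
        : doc,
    );
}

// Replay the three transfers from the example.
let docs: OwnerDoc[] = [];
docs = applyTransfer(docs, 1000, "0xA");
docs = applyTransfer(docs, 1010, "0xB");
docs = applyTransfer(docs, 1020, "0xC");
```

Under these assumptions, calling `invalidateAfter(docs, 1015)` removes the "0xC" document and reopens the "0xB" one, which brings the collection back to its state after the second transfer.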