Independently verifying DataTrails transparent merkle logs
Exploring DataTrails' merkle log with Veracity
Overview
Veracity is an open-source command line tool developed by DataTrails. With it, you can explore the merkle log and prove the inclusion of your event data. By default it connects to the DataTrails service to obtain a copy of the merkle log. Veracity can also work from copies of the merkle log on disk.
In this guide we’ll explore how you can use Veracity to:
- Prove the inclusion of events that matter in the DataTrails merkle log with verify-included
- Explore the DataTrails merkle log using the node command
Prerequisites
Verifying Event Data
DataTrails records the events that matter to your business and lets you prove them at a later date. This guide will show how to do this for both online and offline data scenarios.
For simplicity we’ll walk through an example of proving that a publicly attested event exists on the merkle log for the public tenant on DataTrails. If you want to try this with your own data, simply download a copy of your event from the DataTrails API and supply your tenant ID instead of the public one.
Setup
Let’s set some variables that reference the public tenant in DataTrails and a public event that we’d like to verify the inclusion of.
EVENT_ID=publicassets/046ad7b4-dc99-4f90-9511-d2fad2e72bed/events/fef3c753-52e5-406b-8e41-8a36a2cc4818
DATATRAILS_URL=https://app.datatrails.ai
PUBLIC_TENANT_ID=tenant/6ea5cd00-c711-3649-6914-7b125928bbb4
Loading the Event
curl -sL $DATATRAILS_URL/archivist/v2/$EVENT_ID > event.json
If you inspect the contents of event.json, you will see something like this (with some fields omitted for brevity):
{
"identity": "publicassets/046ad7b4-dc99-4f90-9511-d2fad2e72bed/events/fef3c753-52e5-406b-8e41-8a36a2cc4818",
"asset_identity": "publicassets/046ad7b4-dc99-4f90-9511-d2fad2e72bed",
"tenant_identity": "tenant/ee52ae46-d4fb-4030-9888-4696ef4b27da",
"event_attributes": {
"arc_display_type": "Business Critical Action",
"quality_system_ref": "1112345",
"approval_status": "2",
"arc_description": "An important event to your business"
},
"timestamp_declared": "2024-08-22T16:28:54Z",
"timestamp_accepted": "2024-08-22T16:28:54Z",
"timestamp_committed": "2024-08-22T16:29:04.130Z",
"confirmation_status": "CONFIRMED",
"merklelog_entry": {
"commit": {
"index": "5772",
"idtimestamp": "01917aeb9103048500"
},
"confirm": {
"mmr_size": "5774",
"root": "Z5S0ewjARI26IP04vJOC5pnH2V/M/BETAB4pojIZFkQ=",
"timestamp": "1724344145109",
"idtimestamp": "",
"signed_tree_head": ""
}
}
}
Prove Event Inclusion
cat event.json | veracity \
--data-url $DATATRAILS_URL/verifiabledata \
--tenant=$PUBLIC_TENANT_ID \
--loglevel=INFO \
verify-included
It’s that simple. Note that by default Veracity produces no output on success, which keeps build system integration simple. Supplying --loglevel=INFO gives some insight into what the tool is doing:
...
verifying for tenant: tenant/6ea5cd00-c711-3649-6914-7b125928bbb4
verifying: 5772 2889 01917aeb9103048500 publicassets/046ad7b4-dc99-4f90-9511-d2fad2e72bed/events/fef3c753-52e5-406b-8e41-8a36a2cc4818
OK|5772 2889|[c46a47677b043602dba8a9d1db3215207d1e2f4bdbb19bc07592602fa745b3b7, 18b5d6be487dc0b87d14cb7a389a6cf936aab2427dd26c1b230653f692964f06, a68a7678739a2e00431c25bf3d810b4f417830c3a95cfc692e771d6d54e37fa6, 907c561fd157a5a022aa4e42807bfca082c54d98531831847ad5414a1ad2b492, 9dfeaef9e86d6b857170245ec4cfc5d98fea11bba3937e211d134ab548eb743e, 04602adc424529275ce3415d55f31413743b67bf7e7fae03c90b08f1f5422264]
Detecting Tampering
Adversaries tampering with critical data is a serious risk, but DataTrails makes this straightforward to detect. Try tampering with event.json and re-running verify-included to observe the failure:
sed "s/Business Critical Action/Malicious Action/g" event.json | \
./veracity \
--data-url $DATATRAILS_URL/verifiabledata \
--tenant=$PUBLIC_TENANT_ID \
--loglevel=INFO \
verify-included
...
verifying for tenant: tenant/6ea5cd00-c711-3649-6914-7b125928bbb4
verifying: 5772 2889 01917aeb9103048500 publicassets/046ad7b4-dc99-4f90-9511-d2fad2e72bed/events/fef3c753-52e5-406b-8e41-8a36a2cc4818
XX|5772 2889
error: the entry is not in the log. for tenant tenant/6ea5cd00-c711-3649-6914-7b125928bbb4
...
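Because Veracity is quiet on success, a build step can gate on the result of the command. The sketch below assumes that veracity exits with a non-zero status when verification fails, which the failure output suggests but this guide does not state. check_event_included is a hypothetical helper name, not part of Veracity.

```shell
# Hypothetical CI gate: succeed only if the event file verifies.
# Assumption: veracity returns a non-zero exit status on failure.
check_event_included() {
  local event_file=$1
  if veracity \
      --data-url "$DATATRAILS_URL/verifiabledata" \
      --tenant="$PUBLIC_TENANT_ID" \
      verify-included < "$event_file"; then
    echo "included: $event_file"
  else
    echo "NOT included: $event_file" >&2
    return 1
  fi
}
```

Calling check_event_included event.json in a build script would then stop the build when the event cannot be proven, provided the exit status assumption holds for your version of the tool.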
Offline Verification
Veracity can also verify the inclusion of an event in an offline backup of a DataTrails merkle log. We can do this by supplying a --data-local argument instead of --data-url. First, we’ll need to get a copy of the massif.
Note: DataTrails breaks the merkle log down into manageable chunks called massifs. Once a massif is full, a new one is started. The filenames are numbered (e.g. 0000000000000000.log, 0000000000000001.log) to indicate order.
The --data-local argument accepts either a single massif file or a directory containing multiple massif files. The event we’re verifying in this example is contained within the first massif.
curl -H "x-ms-blob-type: BlockBlob" -H "x-ms-version: 2019-12-12" https://app.datatrails.ai/verifiabledata/merklelogs/v1/mmrs/tenant/6ea5cd00-c711-3649-6914-7b125928bbb4/0/massifs/0000000000000000.log -o mmr.log
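To fetch later massifs, the URL can be built from the massif number. This is a minimal sketch, assuming the filename is the massif number zero-padded to 16 decimal digits (the two example filenames are consistent with this but do not prove it) and that the rest of the path, including the /0/ segment, stays exactly as in the command above. massif_url is a hypothetical helper name.

```shell
# Hypothetical helper: print the download URL for massif number $1.
# Assumptions: 16-digit zero-padded decimal filenames, and a path layout
# copied unchanged from the curl example in this guide.
massif_url() {
  local massif_number=$1
  local base="${DATATRAILS_URL:-https://app.datatrails.ai}/verifiabledata/merklelogs/v1/mmrs"
  local tenant="${PUBLIC_TENANT_ID:-tenant/6ea5cd00-c711-3649-6914-7b125928bbb4}"
  printf '%s/%s/0/massifs/%016d.log\n' "$base" "$tenant" "$massif_number"
}

massif_url 0
```

The printed URL can be passed to curl with the same two headers as above, writing each massif to its own numbered file in a directory that is then supplied to --data-local.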
When we run the verify-included command using our local copy of the massif, it will also verify successfully, with the outputs matching.
cat event.json | \
./veracity \
--data-local mmr.log \
--tenant=$PUBLIC_TENANT_ID \
--loglevel=INFO \
verify-included
Note: Proof paths shown in the output were complete at time of writing. As the log grows the proof path increases in length. See this article for a deep-dive into our merkle log.
verifying for tenant: tenant/6ea5cd00-c711-3649-6914-7b125928bbb4
verifying: 5772 2889 01917aeb9103048500 publicassets/046ad7b4-dc99-4f90-9511-d2fad2e72bed/events/fef3c753-52e5-406b-8e41-8a36a2cc4818
OK|5772 2889|[c46a47677b043602dba8a9d1db3215207d1e2f4bdbb19bc07592602fa745b3b7, 18b5d6be487dc0b87d14cb7a389a6cf936aab2427dd26c1b230653f692964f06, a68a7678739a2e00431c25bf3d810b4f417830c3a95cfc692e771d6d54e37fa6, 907c561fd157a5a022aa4e42807bfca082c54d98531831847ad5414a1ad2b492, 9dfeaef9e86d6b857170245ec4cfc5d98fea11bba3937e211d134ab548eb743e, 04602adc424529275ce3415d55f31413743b67bf7e7fae03c90b08f1f5422264]
Exploring the Merkle Log
The node command is a convenience function for retrieving the value of a node in the merkle log without needing to download the entire massif. Let’s use our example event from earlier, which lives at index 5772 (this works with both --data-local and --data-url).
veracity --data-url $DATATRAILS_URL/verifiabledata \
--tenant=$PUBLIC_TENANT_ID \
node --mmrindex 5772
The value returned is the hash stored at that node:
26c7061166187363dd156f4f5f1f517a39323af3c70d572de28c5206de160ec2
Leaf nodes in the merkle log contain the hash of the event data (plus some metadata, see this article) while intermediate nodes hash together the content of their left and right children.
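The verify-included output earlier prints two numbers, 5772 2889. The first is the node index used with the node command. The second is not explained in this guide, but it is consistent with a zero-based leaf index under the usual Merkle Mountain Range numbering, where leaf number e sits at node index 2e minus the number of 1 bits in e. The sketch below only checks that arithmetic; popcount and leaf_to_mmr_index are hypothetical helper names, not Veracity commands.

```shell
# Count the 1 bits in a non-negative integer.
popcount() {
  local n=$1 count=0
  while [ "$n" -gt 0 ]; do
    count=$((count + (n & 1)))
    n=$((n >> 1))
  done
  echo "$count"
}

# Map a zero-based leaf index to its node index in the log.
# Assumption: the log uses standard zero-based MMR node numbering.
leaf_to_mmr_index() {
  local leaf=$1
  echo $((2 * leaf - $(popcount "$leaf")))
}

leaf_to_mmr_index 2889   # prints 5772
```

The gap between the two numbers is the count of intermediate nodes that were added to the log before this leaf.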