Alignment Tutorial¶
This tutorial walks through creating a Scripture Burrito metadata file for a
word alignment project. By the end you will have a valid metadata.json and
understand the structure of the alignment content files it describes.
Scenario: The Zarma translation team (from the Scripture Text Tutorial) wants to publish word alignments between their Zarma New Testament and the SBL Greek New Testament (SBLGNT). An automatic aligner has produced one JSON alignment file per book. We want to package these files as an alignment burrito.
The directory looks like this:
zarma-alignment/
  alignments/
    40MAT-dje-sblgnt.json
    41MRK-dje-sblgnt.json
    43JHN-dje-sblgnt.json
    44ACT-dje-sblgnt.json
We will build the metadata.json file section by section.
Common Fields¶
These fields are common to all burritos — see Scripture Burrito Structure for the full specification.
1. Format and meta¶
Every burrito begins the same way:
{
  "format": "scripture burrito",
  "meta": {
    "version": "1.0.0",
    "category": "source",
    "generator": {
      "softwareName": "AutoAligner",
      "softwareVersion": "1.0.0",
      "userName": "Fatima Maïga"
    },
    "defaultLocale": "en",
    "dateCreated": "2025-11-05T10:00:00+01:00"
  },
2. Identification section¶
Name the project so tools and people can identify it:
  "identification": {
    "name": {
      "en": "Zarma NT — SBLGNT Word Alignment"
    },
    "description": {
      "en": "Word-level alignment of the Zarma New Testament against the SBLGNT"
    },
    "abbreviation": {
      "en": "ZJNT-SBLGNT-align"
    }
  },
3. Languages section¶
An alignment burrito typically involves two languages. List both:
  "languages": [
    {
      "tag": "dje",
      "name": {"en": "Zarma"}
    },
    {
      "tag": "grc",
      "name": {"en": "Ancient Greek"}
    }
  ],
4. Agencies section¶
  "agencies": [
    {
      "id": "https://seedcompany.com",
      "roles": ["rightsHolder", "content"],
      "url": "https://seedcompany.com",
      "name": {"en": "Seed Company"},
      "abbr": {"en": "SC"}
    }
  ],
5. Common Ingredients¶
The ingredients object maps every file path (relative to the burrito root)
to a descriptor. These fields appear in every burrito regardless of flavor:
file path (the key) — relative to the burrito root, using forward slashes. Must match the actual layout exactly.
checksum — used by receiving tools to verify file integrity. MD5 is currently the standard algorithm.
mimeType — identifies the file format. Allowed values are flavor-specific; see below.
size — file size in bytes.
scope — lists the books the file contains. Each book code maps to either an empty array (whole book) or a list of chapter numbers.
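As a rough illustration of how these fields could be filled in for one file, here is a minimal Python sketch. The function name `describe_ingredient` and its parameters are hypothetical and not part of any Scripture Burrito tooling; it assumes the file already exists under the burrito root and that an MD5 checksum is wanted.

```python
import hashlib
from pathlib import Path


def describe_ingredient(root: Path, relative_path: str, scope: dict,
                        mime_type: str = "application/json") -> dict:
    """Build one ingredient descriptor for a file under the burrito root.

    Hypothetical helper: `relative_path` is the ingredient key (forward
    slashes, relative to `root`), and `scope` is supplied by the caller.
    """
    data = (root / relative_path).read_bytes()
    return {
        "checksum": {"md5": hashlib.md5(data).hexdigest()},  # integrity check
        "mimeType": mime_type,
        "size": len(data),  # file size in bytes
        "scope": scope,     # e.g. {"MAT": []} for the whole book of Matthew
    }
```

A call such as `describe_ingredient(Path("zarma-alignment"), "alignments/40MAT-dje-sblgnt.json", {"MAT": []})` would return a dictionary ready to be stored under that path in the ingredients object.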
Alignment Fields¶
These fields are specific to the Alignment flavor — see Alignment Specification for the full specification.
6. Type section¶
The type section declares this as an alignment burrito:
  "type": {
    "flavorType": {
      "name": "alignment",
      "flavor": {
        "name": "alignment"
      }
    }
  },
flavorType.name is "alignment"; this is what distinguishes an alignment burrito from a text translation or audio burrito.
flavor.name is also "alignment".
Unlike scripture burritos, currentScope is not required; the alignment files themselves record which references are covered.
7. Flavor-Specific Ingredients¶
Alignment files use "application/json" as their MIME type. The scope
records which book each file covers, using the same book-scope pattern as text
translation ingredients.
  "ingredients": {
    "alignments/40MAT-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60001..."},
      "mimeType": "application/json",
      "size": 184200,
      "scope": {"MAT": []}
    },
    "alignments/41MRK-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60002..."},
      "mimeType": "application/json",
      "size": 112500,
      "scope": {"MRK": []}
    },
    "alignments/43JHN-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60003..."},
      "mimeType": "application/json",
      "size": 161800,
      "scope": {"JHN": []}
    },
    "alignments/44ACT-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60004..."},
      "mimeType": "application/json",
      "size": 245600,
      "scope": {"ACT": []}
    }
  }
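Writing these entries by hand is error-prone, so a script can generate them. The sketch below is one possible approach, not an official tool: the helper names are hypothetical, and it assumes every file follows this tutorial's naming pattern (two digits, a three-character book code, then a hyphen, as in 40MAT-dje-sblgnt.json) and covers one whole book.

```python
import hashlib
import re
from pathlib import Path

# Assumption: names look like "40MAT-dje-sblgnt.json" (number, book code, hyphen).
BOOK_IN_NAME = re.compile(r"^\d{2}([A-Z0-9]{3})-")


def book_scope_from_name(file_name: str) -> dict:
    """Derive a whole-book scope such as {"MAT": []} from the file name."""
    match = BOOK_IN_NAME.match(file_name)
    if match is None:
        raise ValueError(f"cannot find a book code in {file_name!r}")
    return {match.group(1): []}


def build_ingredients(root: Path, subdir: str = "alignments") -> dict:
    """Scan root/subdir for *.json files and build the ingredients object."""
    ingredients = {}
    for path in sorted((root / subdir).glob("*.json")):
        data = path.read_bytes()
        # Keys are relative to the burrito root and use forward slashes.
        key = path.relative_to(root).as_posix()
        ingredients[key] = {
            "checksum": {"md5": hashlib.md5(data).hexdigest()},
            "mimeType": "application/json",
            "size": len(data),
            "scope": book_scope_from_name(path.name),
        }
    return ingredients
```

Sorting the paths keeps the output stable between runs, which makes the generated metadata easier to diff.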
8. Structure of an alignment content file¶
The ingredient files follow the
Scripture Burrito Alignment Format.
Here is an excerpt from 40MAT-dje-sblgnt.json showing alignment records for
Matthew 6:9:
{
  "format": "alignment",
  "version": "0.4",
  "groups": [
    {
      "type": "translation",
      "meta": {
        "creator": "AutoAligner/1.0.0",
        "timestamp": "2025-11-05T10:00:00Z"
      },
      "documents": [
        {"scheme": "BCVWP", "docid": "SBLGNT"},
        {"scheme": "BCVWP", "docid": "ZarmaNT"}
      ],
      "roles": ["source", "target"],
      "records": [
        {
          "references": [["400060090011"], ["400060090011"]],
          "meta": {"confidence": 0.97}
        },
        {
          "references": [["400060090021"], ["400060090021", "400060090031"]],
          "meta": {"confidence": 0.88}
        }
      ]
    }
  ]
}
Key points:
format and version identify this as an alignment format file.
Each group collects related alignment records. Here the type is "translation" and the roles are "source" (Greek) and "target" (Zarma).
documents hoists the reference scheme and document identifiers so they do not have to be repeated in every record. The BCVWP scheme identifies words by a 12-character BBCCCVVVWWWP string (book, chapter, verse, word, part). 400060090011 is Matthew 6:9, word 1, part 1.
roles hoists the role names so each references array is positional: references[0] is the source unit, references[1] is the target unit.
A reference unit with two selectors (["400060090021", "400060090031"]) means that Greek word 2 aligns to the combination of Zarma words 2 and 3, a one-to-many mapping.
meta.confidence records the aligner's confidence for each record. Individual records can add or override metadata hoisted from the group.
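To make the positional layout concrete, the following Python sketch splits a BCVWP selector into its fields and pairs each record's reference units with the group's role names. The names `WordRef`, `parse_bcvwp`, and `iter_links` are illustrative only, and the sketch assumes every document in the group uses the BCVWP scheme with purely numeric selectors, as in the excerpt.

```python
from typing import Dict, Iterator, List, NamedTuple


class WordRef(NamedTuple):
    book: int
    chapter: int
    verse: int
    word: int
    part: int


def parse_bcvwp(selector: str) -> WordRef:
    """Split a 12-digit BBCCCVVVWWWP selector into its numeric fields."""
    if len(selector) != 12 or not (selector.isascii() and selector.isdigit()):
        raise ValueError(f"not a 12-digit BCVWP selector: {selector!r}")
    return WordRef(
        book=int(selector[0:2]),
        chapter=int(selector[2:5]),
        verse=int(selector[5:8]),
        word=int(selector[8:11]),
        part=int(selector[11]),
    )


def iter_links(group: dict) -> Iterator[Dict[str, List[WordRef]]]:
    """Yield one {role: [WordRef, ...]} mapping per record in a group."""
    for record in group["records"]:
        # references[i] belongs to roles[i], so zip pairs them by position.
        yield {
            role: [parse_bcvwp(selector) for selector in unit]
            for role, unit in zip(group["roles"], record["references"])
        }
```

Applied to the Matthew 6:9 excerpt, the second record yields one source reference (word 2) and two target references (words 2 and 3), which is the one-to-many mapping described above.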
The complete file¶
Putting the metadata together:
{
  "format": "scripture burrito",
  "meta": {
    "version": "1.0.0",
    "category": "source",
    "generator": {
      "softwareName": "AutoAligner",
      "softwareVersion": "1.0.0",
      "userName": "Fatima Maïga"
    },
    "defaultLocale": "en",
    "dateCreated": "2025-11-05T10:00:00+01:00"
  },
  "identification": {
    "name": {
      "en": "Zarma NT — SBLGNT Word Alignment"
    },
    "description": {
      "en": "Word-level alignment of the Zarma New Testament against the SBLGNT"
    },
    "abbreviation": {
      "en": "ZJNT-SBLGNT-align"
    }
  },
  "languages": [
    {
      "tag": "dje",
      "name": {"en": "Zarma"}
    },
    {
      "tag": "grc",
      "name": {"en": "Ancient Greek"}
    }
  ],
  "type": {
    "flavorType": {
      "name": "alignment",
      "flavor": {
        "name": "alignment"
      }
    }
  },
  "agencies": [
    {
      "id": "https://seedcompany.com",
      "roles": ["rightsHolder", "content"],
      "url": "https://seedcompany.com",
      "name": {"en": "Seed Company"},
      "abbr": {"en": "SC"}
    }
  ],
  "ingredients": {
    "alignments/40MAT-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60001..."},
      "mimeType": "application/json",
      "size": 184200,
      "scope": {"MAT": []}
    },
    "alignments/41MRK-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60002..."},
      "mimeType": "application/json",
      "size": 112500,
      "scope": {"MRK": []}
    },
    "alignments/43JHN-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60003..."},
      "mimeType": "application/json",
      "size": 161800,
      "scope": {"JHN": []}
    },
    "alignments/44ACT-dje-sblgnt.json": {
      "checksum": {"md5": "a1b2c3d4e5f60004..."},
      "mimeType": "application/json",
      "size": 245600,
      "scope": {"ACT": []}
    }
  }
}
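Before publishing, it can help to confirm that the metadata matches the files on disk. The following is a minimal sketch of such a check, assuming metadata.json sits at the burrito root and every ingredient carries an md5 checksum; `verify_ingredients` is a hypothetical name, and this is not a substitute for validating against the official schema.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def verify_ingredients(root: Path) -> List[str]:
    """Return a list of problems found; an empty list means all files match."""
    metadata = json.loads((root / "metadata.json").read_text(encoding="utf-8"))
    problems = []
    for relative_path, descriptor in metadata["ingredients"].items():
        path = root / relative_path
        if not path.is_file():
            problems.append(f"{relative_path}: file is missing")
            continue
        data = path.read_bytes()
        if len(data) != descriptor["size"]:
            problems.append(f"{relative_path}: size is {len(data)}, "
                            f"metadata says {descriptor['size']}")
        if hashlib.md5(data).hexdigest() != descriptor["checksum"]["md5"]:
            problems.append(f"{relative_path}: md5 checksum mismatch")
    return problems
```

The checksums printed in this tutorial are truncated placeholders, so a check like this would report them as mismatches until real digests are substituted.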
Next steps¶
Add a relationships section to link this alignment burrito to the Zarma text translation burrito it was produced from; use "relationType": "source".
To record which alignments were manually reviewed, add per-record metadata in the alignment content files (e.g. "meta": {"curated": true}).
For the alignment content file format reference, see the Scripture Burrito Alignment Format specification.
For the complete metadata field reference see Alignment Specification.
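As an illustration of the per-record review flag mentioned above, this Python sketch sets "curated" on selected records of a loaded alignment file and lists the records still awaiting review. Both function names are hypothetical, and the sketch assumes the file has already been parsed into a dictionary with the structure shown in step 8.

```python
from typing import Iterable, List, Tuple


def mark_curated(alignment: dict, group_index: int,
                 record_indices: Iterable[int]) -> None:
    """Set meta.curated = true on the chosen records, keeping other metadata."""
    records = alignment["groups"][group_index]["records"]
    for index in record_indices:
        records[index].setdefault("meta", {})["curated"] = True


def uncurated_records(alignment: dict) -> List[Tuple[int, int]]:
    """Return (group index, record index) pairs not yet marked as curated."""
    return [
        (g, r)
        for g, group in enumerate(alignment["groups"])
        for r, record in enumerate(group["records"])
        if not record.get("meta", {}).get("curated", False)
    ]
```

Because `setdefault` is used, existing keys such as confidence are preserved when a record is marked.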