Describe Your Dataset
How to use dataset cards and codebooks to describe the data in a dataset
Dataset cards
Each BDM dataset includes a README.md
file in its root directory that provides both human- and machine-readable metadata. Machine readability relies primarily on the front-matter. It includes information about the dataset as a whole (such as the title, description, and author) and the data (such as the variable names, types, and descriptions). Here is an example:
---
title: "Example dataset"
description: "This is an example dataset that contains some example data."
author: "John Doe"
---
In BDM, part of the card that that describes the the data format is called codebook. It can include information about the variables in the dataset, such as the name, type, and description of each variable, the values that each variable can take on, and any missing data that may exist in the dataset. Codebooks are useful for understanding the data, sharing it with others, and for reproducibility.
`codebook` field to the front-matter. The value of the `codebook` field should be either a path to the codebook file or the codebook itself. For example: If you have a codebook for your dataset, you can include it in the dataset card by adding a
… codebook: “codebook.md”
Or:
… codebook: @context: “https://behaverse.org/schemas/bdm/codebook.json” @type: “Codebook” variable: - name: “age_group” description: “The age group of the subject.” type: “integer” missing: “NA” values: - “18-25”: “between 18 and 25 years old (inclusive)” - “26-35”: “between 26 and 35 years old (inclusive)” - “36-45”: “between 36 and 45 years old (inclusive)” - “46-55”: “between 46 and 55 years old (inclusive)” - “56-”: “56 years old (inclusive) or older”
## Citation
...
## License
...