Glossary

Controlled vocabulary for behavioral data terms and BDM concepts

This glossary is generated from the published behaverse/schemas artifacts: cross-cutting terms from the vocabulary, plus the field terms each schema defines.

General

accuracy float

Refers to a measure of performance. In many behavioral tasks, it reflects the percentage (0-100%) or fraction (0-1) of correct responses. In BDM, always use accuracy to refer to a performance measure that is a real number (float) bounded to the [0, 1] range.

Range

0 to 1 (inclusive)

correct boolean

A boolean that indicates whether the response in a given trial was correct. When no response was given although one was required (i.e., a timeout), correct evaluates to FALSE rather than N/A. This prevents subjects from obtaining a high performance score by avoiding all difficult trials and responding correctly only to easy ones.
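As a minimal sketch of how this rule affects accuracy, the example below scores a timed-out trial as incorrect and keeps it in the denominator. The trial records and the accuracy helper are hypothetical illustrations, not part of the BDM schemas.

```python
from statistics import fmean

# Hypothetical trial records; field names follow the glossary.
trials = [
    {"correct": True, "timed_out": False},
    {"correct": True, "timed_out": False},
    {"correct": False, "timed_out": True},   # no response: scored as incorrect
    {"correct": False, "timed_out": False},
]

def accuracy(rows):
    """Fraction of correct responses in [0, 1]; timeouts stay in the denominator."""
    return fmean(1.0 if row["correct"] else 0.0 for row in rows)

print(accuracy(trials))  # 0.5
```

Had the timed-out trial been coded as N/A and dropped, the same data would yield 2/3 instead of 0.5.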

response_time float

The meaning of response time or reaction time (and its unit) is not consistent across studies. In BDM, response_time is the duration in seconds between a) the moment the subjects fully completed their response on a given trial, and b) the moment that the earliest possible correct response could have been completed by a hypothetical agent with perfect knowledge of the task and ability to instantaneously execute the response.

Range

In seconds

Other measures of durations exist and may be useful to describe subjects’ responses. If such additional measures are needed, they should be specified explicitly; for example: response_onset, response_offset, or response_duration.

Units for response times are not consistent across papers and publicly available datasets: they are expressed in either seconds or milliseconds. BDM uses seconds as the default unit for response times to:
- avoid “exception” by always using seconds as the temporal unit;
- avoid additional computation by keeping the units as they currently are in our raw data and task specifications;
- avoid the temptation to round times to integers when expressed in milliseconds;
- take advantage of the fact that many popular packages to analyse response time seem to be using seconds as the default unit;
- be consistent with what seems to be the default unit in fMRI data standards (e.g., BIDS or DICOMs).

It is tempting to abbreviate response_time as rt. However, there are several other variables prefixed response_ which do not have abbreviations. Spelling the names out, while making the name longer, makes the overall data structure more consistent and explicit.

timed_out boolean

Indicates whether the subject failed to respond within the allocated time period.

Demographics

age float

Age is typically expressed in years. However, we don’t recommend rounding “age” to get integer values, as rounding implies losing data. It is better to keep real-valued variables as floats and let the data analysts decide whether or not rounding this variable is necessary for their specific use case.

gender/sex enum

Gender and sex are not exactly the same. Sex refers to biological sex, while gender is a more complex construct. For example, a person may have a male biological sex but identify as a woman. Depending on the question asked, the variable should therefore be either sex or gender.

For example, “What sex were you assigned at birth, such as on an original birth certificate?” is a question about biological sex and should be coded as sex. The possible values for sex are:

Range

female: female (girl, woman)

male: male (boy, man)

other: other non-binary

skip: prefer not to say

Generic Suffixes

length float

Refers to the length in centimeters of a physical object. When possible use a more specific word (e.g., height, width, distance).

Don’t use length to mean count or size, even though programming languages commonly use length for the number of elements in an array or list.

height float

Refers to the height of a physical object in centimeters.

width float

Refers to the width of a physical object in centimeters.

weight float

Refers to the weight of a physical object in kilograms.

*_count integer

Refers to the cardinality of that entity. A variable named page_count indicates the number of pages. Or, if an observation/row has car_count = 5 this means that this particular observation involves a total count of 5 cars; this 5 is unrelated to other rows in the table.

Avoid the use of size as this term is ambiguous; it could refer to the height of a person, the width × height dimensions of a screen, or a level within a Likert scale (e.g., “Medium”).

Note that “count” is different from “sum” (e.g., one can sum negative float values while count involves non-negative integers only) and from “index” (e.g., “this is the second” versus “there are two”).

Avoid the use of n to refer to counts. While using n to refer to counts is much shorter and might be standard in some circles, count is more explicit and less error-prone than n, which may mean different things in other contexts (e.g., the length of the variable, an iterator).

type enum

Type is always an enum with known values. The meaning of the particular enum value needs to be explained in a codebook.

It can be tempting to use synonyms of “type”, in particular when “type” is already used for something else. Such synonyms include things like “category”, “kind” or “set”. When those terms are not required, they should be avoided and replaced by “type”.

description string

Description is always a text (string) for human consumption. While it is not strictly necessary, a textual description can greatly facilitate the understanding and processing of the data by humans.

Aggregation Suffixes

mean float

The average of a numeric variable.

Don’t use avg or average to refer to the mean value.

median float

The median of the variable.

Don’t use med to refer to the median.

mode

The mode of a variable.

min

The minimum value of a variable.

max

The maximum value of a variable.

sd

The standard deviation of a variable.

Don’t use std or SD to refer to the standard deviation.

var

The variance of a variable.

iqr

The interquartile range of a variable.

Don’t use IQR to refer to the interquartile range.

sum float

The sum of all values of a variable (e.g., item_price_sum = sum(item_price)).

Don’t use total to designate the result of a sum operation.

quantile* float

Quantile is similar to percentile, as both refer to the value of a parameter Q that splits the data such that a given fraction of the data is smaller than Q. Quantile expresses that fraction as a number between 0 and 1 while percentiles express it as a percentage (between 0 and 100).

Use quantiles rather than percentiles because they allow naming the resulting variables in a simpler way. BDM uses the following convention to name the resulting variable:
- quantile(x, q = 0.23) -> quantile23
- quantile(x, q = 0.145) -> quantile145

Note that quantile(x, q = 1) cannot be expressed using this convention. However, quantile(x, q = 1) is in fact equivalent to max(x), which is the preferred expression.
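The naming convention can be sketched as a small helper. This is an illustration only: the function name quantile_name is hypothetical, and it assumes that q is given as a decimal string so that the digits after the decimal point are preserved exactly as written. How trailing zeros (e.g., 0.50) or q = 0 should be named is not specified in this glossary, so the sketch simply rejects q = 0 and keeps the digits verbatim.

```python
def quantile_name(q: str) -> str:
    """Build a variable suffix from a quantile written as a decimal string."""
    if q in ("1", "1.0"):
        return "max"  # quantile(x, q = 1) is expressed as max(x)
    whole, _, digits = q.partition(".")
    if whole != "0" or not digits.isdigit():
        raise ValueError(f"expected a fraction such as '0.23', got {q!r}")
    # Assumption: the suffix is the digits after the decimal point, verbatim.
    return "quantile" + digits

print(quantile_name("0.23"))   # quantile23
print(quantile_name("0.145"))  # quantile145
```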

rank integer

Rank of a value in a set (ascending or first to last).

Variables can be sorted (for example from the smallest to the largest values) and some values can be tied (in which case the rank may no longer be represented by integers). Also, it might not be clear whether the ranks are descending or ascending (e.g., age_rank). If such confusion arises, it is preferred to use a more explicit name (e.g., youngest_to_oldest or youngest_first_rank).

Transformation Suffixes

log float

Natural log.

log2 float

Log of base 2.

log10 float

Log of base 10.

Always specify the base when using the log except for the natural log.

sqrt

Square root.

pow2

Power of 2.

floor integer

Flooring of a number (e.g., 3.6 becomes 3).

ceil integer

Ceiling of a number (e.g., 3.6 becomes 4).

round integer

Rounding of a number to the closest integer (e.g., 3.6 becomes 4).

Referencing Suffixes

*_id string or integer

If a column or variable name is suffixed with _id (e.g., participant_id, task_id), it is expected that there exists a supplementary table with the same name (“participant”, “task”) and a primary key named id, such that a value in the first (participant_id = 215) refers to an entry in the second (a row in the participant table where id = 215). Values in a variable suffixed with _id are expected to be unique within a “local scope” of the source table; they are not expected to be globally unique. For that purpose, use _uuid.

Range

Unique within a table or within an explicit context

Note that “id” typically implies a context within which the “id” is unique. That context must be made explicit. For example, trial_id may identify trials within a trial table for one activity completed by one subject.

If there is a column named id (i.e., without prefix), it is expected to be a primary key and there exist other tables or files that refer to this column; if such a link between tables does not exist, use index or name instead.

The postfix _id does not imply a particular data type: both integers and strings are valid.
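The referencing convention can be illustrated with a small lookup. This is only a sketch: the two tables are hypothetical and are represented as plain lists of rows, and index_by_id is an illustrative helper name.

```python
# Hypothetical tables, each represented as a list of rows (dicts).
participant = [
    {"id": 215, "age": 23.4},
    {"id": 216, "age": 31.0},
]
response = [
    {"participant_id": 215, "correct": True},
    {"participant_id": 216, "correct": False},
    {"participant_id": 215, "correct": True},
]

def index_by_id(rows):
    """Map each primary-key value to its row, failing if an id is duplicated."""
    by_id = {}
    for row in rows:
        if row["id"] in by_id:
            raise ValueError(f"duplicate id: {row['id']!r}")
        by_id[row["id"]] = row
    return by_id

participant_by_id = index_by_id(participant)
# Resolve each participant_id in response against participant.id.
ages = [participant_by_id[row["participant_id"]]["age"] for row in response]
print(ages)  # [23.4, 31.0, 23.4]
```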

*_name string

Sometimes “name” is used in a way that is similar to a unique id (e.g., study_name or task_name). The difference between “id” and “name” is that “name” is expected to be a readable text (e.g., n-back versus f346-r23v). As with “id”, it is expected that it refers to other tables and that it is unique within a certain context (contrary to, for example, “label”).

*_uuid string

A Universally Unique Identifier (UUID) is a 128-bit label, usually written as 32 hexadecimal digits, that can be generated on the fly and will most likely be unique across computer systems. A UUID can be used to assign a record a unique identifier without having to check that the identifier is not already used by other records or tables.

Range

UUIDv7 or later

*_uuids are expected to be globally unique.

*_uuids are not expected to be human interpretable.

Avoid using _uid suffix to refer to a UUID variable.

Within BDM, string-formatted Version 7 UUIDs are preferred over older versions or corresponding 128-bit integers. For example: 01934efd-35d5-79db-9aca-fc29b0451cd1.
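A value can be checked against this preference with Python's standard uuid module, which parses the string and exposes its version. The helper name is_uuid_v7 is hypothetical and this is a sketch of one possible check, not a BDM validator.

```python
import uuid

def is_uuid_v7(text: str) -> bool:
    """True if text parses as an RFC-variant UUID whose version field is 7."""
    try:
        parsed = uuid.UUID(text)
    except ValueError:
        return False
    return parsed.variant == uuid.RFC_4122 and parsed.version == 7

print(is_uuid_v7("01934efd-35d5-79db-9aca-fc29b0451cd1"))  # True
print(is_uuid_v7(str(uuid.uuid4())))                       # False
```

Note that uuid.UUID also accepts some alternative spellings (for example without hyphens), so a stricter format check may be needed if only the canonical hyphenated form is acceptable.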

*_hash string

It is sometimes useful to create reproducible keys based on some data. A hash is not strictly necessary, as it can be recreated from the data, but it can be convenient for data processing.

There is no single widespread standard for hashing; rather, there are multiple algorithms that can be used depending on the use case. You can use either CRC32 (8 hexadecimal characters; e.g., “d87f7e0c” for the string “test”) or SHA256 (64 hexadecimal characters; e.g., “9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08” for the same string), depending on the acceptable probability of collision (i.e., two hashes for different data being identical). When that collision probability is deemed too high with CRC32, use SHA256.
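Both algorithms are available in Python's standard library. The sketch below shows how a reproducible key could be derived from some data and how long each hexadecimal digest is; the helper names and the example key are hypothetical.

```python
import hashlib
import zlib

def crc32_hex(data: bytes) -> str:
    """CRC32 checksum as 8 lowercase hexadecimal characters."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")

def sha256_hex(data: bytes) -> str:
    """SHA-256 digest as 64 lowercase hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()

key = b"participant=215;task=n-back"  # hypothetical data to hash
print(len(crc32_hex(key)), len(sha256_hex(key)))  # 8 64
```

With only 32 bits, CRC32 collisions become likely once a table holds tens of thousands of distinct keys, which is one reason to prefer SHA256 for larger datasets.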

*_index integer

Indices should be favored over labels and ids when a variable is used for referencing and when order is important (often, but not always, the chronological order). For example, a variable named stimulus_position_index implies its value points to an entry in a list of possible stimulus positions.

Range

1-based indices

Note that “index” typically implies a context, within which the indexing occurs and that context must be made explicit. For example, trial_index may index trials within a block.

BDM follows the convention of 1-based indexing: always starting counting/indexing from 1 rather than 0.

Avoid the use of *_number because it is ambiguous.
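Because many programming languages index from 0, reading a 1-based *_index in code requires an explicit offset. A minimal sketch in Python, where the helper item_at and the list of positions are hypothetical:

```python
def item_at(values, index):
    """Return the element at a 1-based index, as used for *_index variables."""
    if not 1 <= index <= len(values):
        raise IndexError(f"index {index} is outside 1..{len(values)}")
    return values[index - 1]  # shift to Python's 0-based indexing

stimulus_positions = ["left", "center", "right"]  # hypothetical lookup list
print(item_at(stimulus_positions, 1))  # left
```

Rejecting 0 explicitly matters in Python, where values[0 - 1] would otherwise silently return the last element.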

*_repetition integer

Repetition counts the number of times the same “thing” occurred, e.g., a participant completes the same test twice, the same stimulus appears multiple times.

Range

0-based

As with index and id, repetition assumes a context which must be clarified when ambiguous.

Repetition is 0-based: it starts “counting” at 0 rather than 1; *_iteration instead of *_repetition would make it 1-based like indices, but it is less explicit and thus less preferred.
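The difference between a 1-based index and a 0-based repetition can be shown side by side. In this sketch, the list of instrument identifiers is hypothetical and is assumed to be in chronological order for a single subject.

```python
from collections import Counter

instrument_ids = ["DS", "NB", "DS", "DS"]  # hypothetical, in chronological order

seen = Counter()  # how often each instrument has been completed so far
rows = []
for activity_index, instrument_id in enumerate(instrument_ids, start=1):
    rows.append({
        "activity_index": activity_index,               # 1-based
        "instrument_id": instrument_id,
        "instrument_repetition": seen[instrument_id],   # 0-based
    })
    seen[instrument_id] += 1

print([row["instrument_repetition"] for row in rows])  # [0, 0, 1, 2]
```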

*_label string

A text that is attached to a variable and identifies it. It is expected to be human readable, but not always unique.

Trial fields

activity_index index

When subjects complete multiple activities (e.g., a cognitive test followed by a questionnaire), this variable indicates the order of each activity (i.e., the first activity completed by the subject has activity_index = 1 and the second activity has activity_index = 2, even if the second activity is an exact repetition of the first one).

Range

1-based index of activity within the subject-level data.

See Studyflow table for more details.

adaptive_method_config string

More detailed configuration for the adaptive method, including initial values, step sizes, and termination criteria.

For example, 1up-2down. The 1-up, 2-down rule is a common adaptive staircase procedure used to estimate a subject’s threshold or sensitivity. In its conventional form, the stimulus level is raised (making the task easier) after one incorrect response and lowered (making the task harder) after two consecutive correct responses.
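A staircase of this kind can be sketched in a few lines. Conventions differ on whether “up” and “down” refer to stimulus level or to difficulty; the sketch below assumes the widely used transformed up-down rule, in which a higher level is easier, the level goes up after one incorrect response, and it goes down after two consecutive correct responses. The class name, start value, and step size are hypothetical.

```python
class Staircase1Up2Down:
    """Assumed rule: level up after 1 error, down after 2 consecutive corrects."""

    def __init__(self, start: float, step: float):
        self.level = start
        self.step = step
        self._correct_streak = 0

    def update(self, correct: bool) -> float:
        """Register one response and return the level for the next trial."""
        if correct:
            self._correct_streak += 1
            if self._correct_streak == 2:
                self.level -= self.step  # harder
                self._correct_streak = 0
        else:
            self.level += self.step      # easier
            self._correct_streak = 0
        return self.level

staircase = Staircase1Up2Down(start=10.0, step=1.0)
levels = [staircase.update(c) for c in [True, True, False, True, True]]
print(levels)  # [10.0, 9.0, 10.0, 10.0, 9.0]
```

In BDM terms, the value returned by update would correspond to adaptive_parameter_value_next for the trial that was just completed.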

adaptive_method_name enum

Specifies the adaptive procedure used to modify instrument parameters in response to subject performance (e.g., staircase).

adaptive_parameter_name string

Specific instrument parameter that is dynamically modified based on the subject’s performance.

adaptive_parameter_value any

The specific value of the instrument parameter that was used for this trial. This value is updated as the adaptive algorithm adjusts the parameter based on the subject’s responses.

Range

Data type depends on the type of the adaptive parameter (as defined in adaptive_parameter_name).

adaptive_parameter_value_next any

The next value of the adaptive parameter that will be used in the subsequent trial, as determined by the adaptive algorithm.

Range

Data type depends on the type of the adaptive parameter (as defined in adaptive_parameter_name).

additional_measures string

Indicates whether additional measures have been recorded during this trial and if so what kind of measures they are. Possible values include (non-exhaustive):

Range

- mouse_trajectories
- fmri
- eye_tracking
- heart_rate

Leave this field empty if no additional measures were recorded for this specific response.

agent_id string

A unique identifier assigned to the agent (typically a person) generating the responses. This ID tracks their participation and responses throughout the study. See Agent table.

animation string

Describes the animation used to display a specific stimulus in a human-readable format. For example, “fadeIn 3s” indicates a 3-second fade-in animation.

To maintain clarity and consistency, BDM recommends using CSS-style naming conventions for common animations (e.g., “3s linear slide-in”).

Shown as defined in the Stimulus table; also a field of: Option.

block_id string

Specific parameterization of the instrument for a single block of trials (e.g., “DS_FORWARD_PRACTICE” and “DS_FORWARD_TEST”). Block-level parameters override timeline-level parameters.

block_index index

Refers to the order in which this block has been experienced by the subject. When there are multiple blocks, this variable indicates the order of each block (i.e., the first block completed by the subject has block_index=1, the second block has block_index=2, even if the second block is an exact repetition of the first one).

Range

1-based index of the block within the timeline

In questionnaires, block_index may refer to distinct pages where each page may contain multiple questions.

block_name string

The name of a particular block in a timeline. If the same block is completed twice in a row, the two instances would have different block_index values (1 and 2, respectively) but the same block_name (e.g., “NB_timeline1_block1”). More details about the block_name are available in the Instrument table.

block_type enum

Specifies the experimental role of the block (e.g., tutorial, practice, test, instruction).

Range

- tutorial: A simplified version of the test designed to teach participants how the test works.
- practice: Typically identical to the main test blocks but used to get subjects accustomed to the task in a no-stakes environment.
- test: Primary blocks used to measure the desired behaviors.
- instruction: Presents written and/or visual instructions to the subject.

color_hex string

The hexadecimal RGB color code of the component (e.g., #FF0000 for red) with optional alpha channel for transparency.

Range

#000000 to #FFFFFF

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.
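A color_hex value can be checked with a regular expression. This sketch assumes that the optional alpha channel is written as two additional hexadecimal digits after the RGB digits (#RRGGBBAA, as in CSS); the glossary does not spell out the alpha encoding, so treat that part as an assumption. The helper name is_color_hex is hypothetical.

```python
import re

# Six hex digits (RGB), optionally followed by two more for the alpha channel.
COLOR_HEX = re.compile(r"#[0-9A-Fa-f]{6}(?:[0-9A-Fa-f]{2})?")

def is_color_hex(value: str) -> bool:
    """True for strings such as '#FF0000' or, with alpha, '#FF000080'."""
    return COLOR_HEX.fullmatch(value) is not None

print(is_color_hex("#FF0000"))  # True
print(is_color_hex("FF0000"))   # False
```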

color_name string

The human-readable name of the component color, e.g., red.

To maintain clarity and consistency, BDM recommends using CSS-style naming conventions for colors (e.g., “lightgray”).

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.

duration float

Describes for how long this stimulus was displayed after its onset, in seconds.

Range

In seconds

When the stimulus is shown using an animation, duration covers the complete period between the start of the animation and the end of the animation.

Shown as defined in the Stimulus table; also a field of: Option, Input.

episode_index index

Episodes are temporally distinct bins of time (no overlap and discrete). The binning of the time into successive episodes depends on the task; it is mostly used and necessary to group data in cases where two distinct trials occurred at the same time (e.g., dual N-back).

Range

1-based integer index.

evaluation_label enum

There are several labels that can be assigned to a given response to specify what that response means in terms of evaluation within a task. The most general terms are “correct” and “error” (which are already given by the correct variable). There are however more specific sets of terms that may apply in different contexts. For example, in a signal detection task, it is common to use labels from the signal detection theory framework (i.e., “hit”, “miss”, “false alarm”, “correct rejection”). In other contexts, researchers might use terms like “omission” or “commission” errors or even things like “perseveration” error (e.g., in the Wisconsin Card Sorting Test). Note that these terms are not always well defined or exclusive. For example, a “hit” is also a “correct” response and a “false alarm” may be synonymous with “commission error”. Whenever possible use the more specific terms (i.e., always use “hit” rather than “correct” when applicable). Here are a few evaluation labels that are commonly used:

Range

- correct: The response is correct.
- error: The response is incorrect.
- hit: The stimulus was present and the subject correctly responded present.
- miss: The stimulus was present and the subject incorrectly responded absent.
- fa: The stimulus was absent and the subject incorrectly responded present.
- cr: The stimulus was absent and the subject correctly responded absent.
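The four signal detection labels follow from two booleans: whether the stimulus was present and whether the subject responded present. A minimal sketch, with a hypothetical helper name:

```python
def evaluation_label(stimulus_present: bool, responded_present: bool) -> str:
    """Map a detection trial to one of: hit, miss, fa, cr."""
    if stimulus_present:
        return "hit" if responded_present else "miss"
    return "fa" if responded_present else "cr"

print(evaluation_label(True, True))    # hit
print(evaluation_label(False, True))   # fa
```

Under this mapping, the correct variable is TRUE exactly for “hit” and “cr”.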

expected_response_description string

A description of the expected response using the same convention as response_description.

expected_response_option_index integer

The index of the option the subject is expected to choose from the set of options.

When expected_response_option_index = 0, it means that the subject should not respond at all.

Sometimes stimuli serve both as stimuli and as response options, as subjects have to click on a particular stimulus to give their response (e.g., spatial span, odd one out). It is convenient in those cases to use stimulus_position_index to order/index the options (i.e., option_index == stimulus_position_index) and consequently also the responses.

feedback_description string

Lists the different kinds of feedback that were shown on a given trial. When multiple types of feedback were used, feedback_description lists them using ; as a separator. If a given type of feedback was shown multiple times during a trial, that feedback type is listed only once (i.e., feedback_description does NOT represent the sequence of feedbacks). The possible values for feedback_description are:

Range

- none: No feedback was shown.
- expected_response: Feedback indicated what the correct response would have been.
- explanation: Feedback explains why a certain option is the correct one.
- correctness_on_option: Feedback indicates (on the option itself) if the option chosen by the participant was the correct one (e.g., in green), or not (e.g., in red).
- correctness_on_screen: Feedback displayed on the screen center indicates if the response to the current trial was correct or not (e.g., using a green check or a red cross).

This list is not exhaustive, and characterizing feedback in the future will involve more variables (e.g., separating the type of information shown (e.g., correctness) from how it is shown (“on_option” versus “on_screen”)).

Note: The kind of feedback used in a UI to confirm to users that a button has indeed been clicked is not considered “feedback” here.
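Since feedback_description packs several values into one string, analysis code typically needs to split it. A minimal sketch, assuming values are separated by ; with optional surrounding whitespace (the helper name parse_feedback is hypothetical):

```python
def parse_feedback(feedback_description: str) -> list[str]:
    """Split on ';', trim whitespace, and drop empty or repeated entries."""
    kinds = (part.strip() for part in feedback_description.split(";"))
    # dict.fromkeys keeps the first occurrence of each kind, in order.
    return list(dict.fromkeys(kind for kind in kinds if kind))

print(parse_feedback("expected_response;explanation"))
# ['expected_response', 'explanation']
```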

group_name string

Subjects may be assigned to different groups. Typically, different groups will have different experiences within a study.

index index

A 1-based index indicating the stacking order of stimulus components. A stimulus component with a higher index is displayed on top of those with lower values, similar to CSS z-index property.

Range

1-based indices

If the presented options have no specific temporal or spatial order, leave this field empty or assign the same index to all options, e.g., index=1.

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.

index_in_source integer

When a stimulus is picked from a particular set (e.g., “digits1to9”), this index refers to the index within that set.

In addition to index_in_source, stimulus_id can also be used to look up further information about the stimulus in the source.

Shown as defined in the Stimulus table; also a field of: Option.

index_in_trial integer

Refers to individual stimuli within the sequence or set of stimuli shown during a trial.

Shown as defined in the Stimulus table; also a field of: Option.

input_action_type enum

Refers to the type of action the subject performs to give a response. Possible values include (non-exhaustive):

Range

- “mouse-click”
- “mouse-release”
- “key-press”
- “key-release”
- “mouse-drag”
- “touch”
- “swipe”

The type of input_action determines the structure of detailed response data (i.e., mouse-click data is different from key-press data).

input_count integer

The number of inputs (i.e., actions) the user made during the trial.

For mouse-drag, it corresponds to the number of drag points that have been sampled during the drag-and-drop.

input_id PRIMARY KEY

Primary key; each input or click has its own identifier value that is unique within the table.

input_interface_type enum

Refers to the type of interface subjects used to input actions. Possible values include (non-exhaustive):

Range

- keyboard: A keyboard is displayed on the screen.
- buttons: Dedicated buttons on the screen.
- stimulus-button: Stimuli serve as buttons.
- text-field: A text field is displayed on the screen.
- slider: A slider is displayed on the screen.

instrument_id string

The unique identifier of the instrument used for collecting data (e.g., the name of the computer script used to run the test). Unique in the Instrument table and corresponding files in the instruments/ folder.

Range

id in the Instrument table.

Shown as defined in the Response table; also a field of: Instrument.

instrument_repetition integer

The number of times this particular instrument has already been completed anytime in the past by this particular subject in this study. This variable has a value 0 the first time an instrument is used.

Range

0-based

is_object_enabled boolean

Indicates whether the object that was clicked on was enabled (clickable) or not.

job_description string

The more specific description of a job, which gives more information about what the participant sees and has to do. Whereas the job_type typically uses only verbs and adjectives, the job_description also contains nouns (e.g., “recall-digits-forward”, “recall-letters-backward”).

job_repeat enum

Indicates whether this trial’s job is new, a repeat of the previous trial’s job, or a switch to a job seen earlier in this timeline (i.e., specific version of the instrument).

Range

- new: The job has never been seen before by this subject in the current study.
- repeat: The job is the same as the previous trial.
- switch: The job is different from the previous trial but has been seen prior in the timeline.
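These three values can be derived from the ordered list of jobs. The sketch below is an illustration under one assumption: the list passed in covers everything the subject has seen in the relevant scope (the glossary mentions both the timeline and the study), so “seen before” means “appeared earlier in this list”. The function name is hypothetical.

```python
def job_repeat_labels(job_types):
    """Label each trial's job as 'new', 'repeat', or 'switch'."""
    labels = []
    seen = set()
    previous = None
    for job in job_types:
        if job not in seen:
            labels.append("new")
        elif job == previous:
            labels.append("repeat")
        else:
            labels.append("switch")
        seen.add(job)
        previous = job
    return labels

jobs = ["recall-forward", "recall-forward", "recall-backward", "recall-forward"]
print(job_repeat_labels(jobs))  # ['new', 'repeat', 'new', 'switch']
```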

job_type string

The general type of operation the subject needs to perform. The job typically is expressed as a verb (e.g., “recall”, “sort”) and can be the same for different instruments (e.g., Digit Span test and Spatial Span test both have a job of type “recall-forward”).

language string

The language the task was completed in, expressed as a two-letter code from the ISO 639-1 standard.

link url

External link, if any, that provides more information about the instrument, e.g., on Cognitive Atlas.

Permanent links, e.g., DOIs, are preferred over particular websites.

measurement_type enum

Describes the type of measurement implied by Option which in turn has implications on how that data should be processed during analysis; takes a value in:

Range

- “nominal”: Set of unordered labels (e.g., {“Luxembourg”, “France”, “Germany”}).
- “ordinal”: Ordered values without clear distance (e.g., {“a lot”, “a bit”, “not at all”}).
- “interval”: Ordered values with clear distances but no absolute zero (e.g., 10 versus 20 degrees Celsius).
- “ratio”: Values with clear distance metrics and absolute zero (e.g., length in cm).

multitask_type enum

Subjects may be required to perform multiple tasks at the same time. This variable indicates the type of multitasking required.

Range

- Empty: No multitasking, i.e., single-tasking.
- concurrent: There are two independent tasks that need to be completed in parallel.
- compound: The task requires multiple successive stages or involves tasks that are dependent/coupled.

If no multitasking was involved, leave this field blank.

This characterization of multitask_type is rudimentary and will likely evolve in the future.

name string

Name of the toolkit (scene, code, or configuration) that is used to collect the data, e.g., “DS” for software that runs the digit span task in forward OR backward order. The specific parameterization of the instrument is defined by the “Timeline” (e.g., a variant of the instrument called “DS_FORWARD”).

object_id string

A stimulus is defined by a set of features. This variable is used to identify each time the same stimulus features were used.

For example, if the same white digit “3” is shown in a digit span sequence, all those instances would have the same object_id although they would have different ids (as they appeared at different times).

Shown as defined in the Stimulus table; also a field of: Option.

object_name string

The human-readable name of the object that was clicked on (e.g., “sos_box_1_3”).

object_state string

Describes the state the object was in before it was clicked on. The meaning of “state” depends on the particular task (e.g., “new empty”).

object_type enum

Describes the type of object that was clicked on (e.g., “button”).

onset float

Duration between the start of the trial and the appearance of the stimulus, in seconds.

Range

In seconds

Shown as defined in the Stimulus table; also a field of: Option, Input.

option_count integer

The number of options the participant can choose from on a given trial.

option_data_type enum

Describes the type of data this option entails. Possible values include:

Range

- “nominal”: Set of unordered labels (e.g., {“Luxembourg”, “France”, “Germany”}).
- “ordinal”: Ordered values without clear distance (e.g., {“a lot”, “a bit”, “not at all”}).
- “interval”: Ordered values with clear distances but no absolute zero (e.g., 10 versus 20 degrees Celsius).
- “ratio”: Values with clear distance metrics and absolute zero (e.g., length in cm).

option_id integer

A unique identifier for the option (set or generator) used on a given trial.

Shown as defined in the Response table; also a field of: Option, Input, OptionComponent.

option_source string

Refers to the specific generator or set that determined the options on a given trial. Options that stem from the same source have the same data schema and could thus be described in a table named after option_source (i.e., option_source indicates which table contains the full information about the option set).

While there is a stimulus_index_in_source to refer to the particular stimulus that was used, we don’t have an equivalent option_index_in_source since all options are displayed. Instead, we use response_option_index and expected_response_option_index to refer to a particular option within the set of options.

option_source_type enum

A set of options is typically created using a particular procedure/algorithm (“generator”) or is sampled from a particular set (“set”). This variable indicates which of these two applies for the current options.

orientation enum

Indicates the symbol orientation.

Range

- north: bottom to top.
- north_east
- east: left to right.
- south_east
- south: top to bottom.
- south_west
- west: right to left.
- north_west
- free: no specific orientation.

If none of the predefined orientations apply, leave this field empty or use a custom human-readable label. Make sure custom labels are clearly defined in the codebook.

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.

outcome_description string

Describes the observable consequences of the subject’s response (e.g., “the opened box is empty”).

outcome_numeric float

A numeric value describing the observable consequences of the subject’s response (e.g., +3 points).

panel_id string

Identifier of the panel this stimulus is displayed over.

Shown as defined in the Stimulus table; also a field of: Option, OptionComponent.

presentation_id string

In a multitasking setting, a particular instance of a stimulus (e.g., the current letter “A”) may be used by multiple tasks at the same time (e.g., in the dual N-back task). Because these are different trials, they will have different trial_id values and hence different rows in the Stimulus table. We use presentation_id to indicate that a given stimulus is in fact the same instance across those trials.

response_count integer

Each trial contains by definition only one response. However, when response_structure is other than unitary, a response comprises multiple pieces of information (e.g., “3-5-7” could be one response in the digit span task and this response contains three components, namely “3”, “5” and “7”). response_count refers to the number of components that make up a response (not the number of responses within a trial).

response_count is different from input_count; a subject may in some cases change their response multiple times before submitting the final response. In such cases, there would be many more inputs than there are components to the final response.

While we have stimulus_set_size we currently don’t have a response_set_size, but we do have option_count and response_count.

response_datetime datetime

The datetime corresponding to the completion of the response.

response_description string

A description of the participant’s response; typically the description of the option that was chosen.

response_description can be directly compared to expected_response_description.

response_element_index index

Indicates which of the clicks is used and in what order to form the actual response in the response table when response_structure is “sequence” or “set”.

Range

1 to input_count in the corresponding row of the Response table.

This needs to be here rather than in the Option table, because the same option can be clicked multiple times and either serve or not for the response depending on the order of the clicks. For example, in the Digit Span test we could have the response of “3;5;7” on a particular trial. This might correspond to:
- option.description = [“3”, “4”, “delete”, “delete”, “3”, “5”, “7”, “enter”]
- click.response_element_index = [NA, NA, NA, NA, 1, 2, 3, NA]
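The Digit Span example can be replayed in code to show how the final response is rebuilt from the inputs. This is a sketch only: None stands in for NA, and build_response is a hypothetical helper.

```python
option_description = ["3", "4", "delete", "delete", "3", "5", "7", "enter"]
response_element_index = [None, None, None, None, 1, 2, 3, None]  # None = NA

def build_response(descriptions, element_indices, separator=";"):
    """Join the inputs that carry an element index, in index order."""
    used = [
        (index, description)
        for description, index in zip(descriptions, element_indices)
        if index is not None
    ]
    used.sort(key=lambda pair: pair[0])
    return separator.join(description for _, description in used)

print(build_response(option_description, response_element_index))  # 3;5;7
```

The first “3” and the “4” are excluded because they were deleted before the final response was submitted.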

response_id PRIMARY KEY

A unique identifier assigned to responses in temporal order, meaning that larger IDs correspond to more recent responses that occurred later in time. This ID is unique within this table; no two rows share the same value.

This identifier is used by other tables, for example the Stimulus table, which describes in greater detail the sequence of images shown during a trial, their timing, and visual properties. That table will refer to this id in order to link those descriptions (typically multiple lines in the Stimulus table) to a unique row in the Response table.

Shown as defined in the Response table; also a field of: Stimulus, Option, Input.

response_initiation_time float

In some cases (e.g., a reaching movement) it might be useful to encode when a response was initiated.

response_numeric float

A numeric value associated with a particular response; this could be a numeric value entered directly by the subject or the numeric meaning of a selected option (for example, the choice of option “Never” may be associated with the numeric value of 0). Note that this variable describes the subject’s response; it does not describe the value (e.g., correctness or goodness) that is associated with that response.

response_option_index integer

The index of the option the participant chose, starting from 1.

response_option_index = 0 means the subject chose none of the options (e.g., a “no-go” response in a Go/No-go task).

response_index can be directly compared to expected_response_index.

response_index refers to an entry in the Option table (i.e., there is no Response table).

response_skipped boolean

In some cases (e.g., in some questionnaires), subjects have the option to skip a question.

response_structure string

The structure of the response required from the subject. It can take one of the following values:

Range

- unitary: The subject provides a single input (e.g., chooses option same).
- set: The subject provides a set of information, and the order does not matter (e.g., list words that start with the letter A).
- sequence: The subject provides a sequence of information, and the order matters (e.g., a sequence of memorized digits in their order of appearance).

Note that the distinction between set and sequence refers to the importance of order information to evaluate if the response is correct or not; a response with a set structure may unfold over time (each piece of information is given in a particular temporal order) and it may be of scientific interest to take into account that order; however, the order itself is not important within the task itself. For example, in the MOT task one may ask subjects to point to all dots that hide a token. If subjects point to all such dots, they will be correct no matter in which order the dots were clicked.
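
The difference between set and sequence when judging correctness can be sketched as follows. The function name response_matches and the list-of-strings representation are assumptions made for illustration; the sketch also ignores duplicates in set responses, which BDM may or may not intend.

```python
from typing import Sequence


def response_matches(
    response: Sequence[str], expected: Sequence[str], structure: str
) -> bool:
    """Compare a response to the expected response (hypothetical helper)."""
    if structure == "unitary":
        # A single component that must equal the single expected component.
        return len(response) == 1 and list(response) == list(expected)
    if structure == "set":
        # Order is irrelevant; only membership counts.
        return set(response) == set(expected)
    if structure == "sequence":
        # Order matters; compare element by element.
        return list(response) == list(expected)
    raise ValueError(f"unknown response_structure: {structure!r}")


# MOT-like trial: the subject clicks the right dots in an arbitrary order.
clicked = ["dot2", "dot5", "dot1"]
hidden = ["dot1", "dot2", "dot5"]
print(response_matches(clicked, hidden, "set"))       # True
print(response_matches(clicked, hidden, "sequence"))  # False
```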

response_validation_time float

In some cases, subjects may need to press an extra key to validate previous responses. When relevant, this variable may encode this duration.

role enum

Describes the role that the stimulus plays in the trial, e.g., “target”.

Range

- target: A stimulus the agent must process and which should trigger the completion of the response (e.g., classify, reach, memorize) if the agent is doing the task as intended. In some cases (e.g., in a go/no-go task) the correct response to a stimulus is to NOT click the button. In this case, the stimulus that triggered the decision to NOT click the button is still a target.
- non_target: A stimulus the agent must process but which does not trigger the completion of the response (e.g., the first two stimuli in a 2-back test).
- distractor: A stimulus the agent should not process at all (i.e., ignore) and which is unrelated to the correct execution of the task.
- location_cue: A stimulus giving spatial location information that agents could use to improve their performance.
- job_specifier: A stimulus specifying which job the agent should perform.
- stop_signal: A stimulus signaling to agents that they should abort the current action.
- probe: A stimulus indicating which stimulus to respond to.

score float

A numeric value associated with a particular response in a given context. This variable may be used to compute a performance metric or a questionnaire level index (e.g., a well-being score).

session_id integer

When there are multiple sessions, this variable indicates the order of each session (i.e., the first session completed by the subject has session_index = 1 and the second session has session_index = 2, even if the second session is an exact repetition of the first one).

Range

Index of the session within the subject.

We currently don’t use session_name, session_id and session_repetition in this table.

source string

Refers to the specific generator or set the stimulus belongs to.

Stimuli that come from the same source have the same data scheme and could thus be described in a table named after the stimulus_source. stimulus_source indicates which table contains the full information about the stimulus; e.g., “digits1to9”.

One could include a source_count variable here that indicates how many different stimuli there are in the set; but it’s better stored in the table that contains information about that stimulus source.

Shown as defined in the Stimulus table; also a field of: Option.

source_type enum

A stimulus is typically created using a particular procedure/algorithm (“generator”) or is sampled from a particular set (“set”). This variable indicates which of these two applies for the current stimulus.

Range

- set: stimulus is sampled from a fixed set of stimuli.
- generator: stimulus is created using a procedure/algorithm.

Shown as defined in the Stimulus table; also a field of: Option.

stimulus_count integer

The number of stimuli shown to the participant during the trial.

Range

This should match the number of stimuli in stimulus_id.

stimulus_description string

A human readable, compact description of the main aspects of the stimulus. The description for a given stimulus depends on the task but follows a specific template for a given task. Because of this, it looks like the stimulus_description could be “parsed” and “tidied”—however, this is not the intention; parsed/tidied data will be available in other tables; description is for human readability and facilitates the understanding of the data.

In some cases, when stimuli are too complex or can’t be precisely described, a summary of all stimuli is given instead.

stimulus_id integer

A unique identifier for the (unitary, set or sequence of) stimuli presented during a trial; if those exact same stimuli are repeated in a different trial, that trial would have the same value for stimulus_id. stimulus_id may also be used to refer to a specific message or question in a questionnaire.

Shown as defined in the Response table; also a field of: Stimulus, Input, StimulusComponent.

stimulus_index list[index]

Indexes in chronological (or spatial) order the stimuli shown within an instrument (counting one stimulus per response). stimulus_index may for instance be used to refer to the nth question asked within a questionnaire.

Use semicolon-separated indices if more than one stimulus was presented, e.g., 1;2;3.
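
A semicolon-separated cell such as 1;2;3 can be turned back into a list of integers with a few lines of Python. The helper name parse_index_list is an assumption, as is the choice to treat an empty cell as an empty list.

```python
from typing import List


def parse_index_list(cell: str) -> List[int]:
    """Split a semicolon-separated cell into integer indices (hypothetical helper)."""
    # Skip empty fragments so that "" and trailing separators do not fail.
    return [int(part) for part in cell.split(";") if part.strip()]


print(parse_index_list("1;2;3"))  # [1, 2, 3]
print(parse_index_list("4"))      # [4]
```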

stimulus_index_in_source integer

Index of the stimulus within the table referred to by stimulus_source. For example, if stimulus_source == “digit1to9”, stimulus_index_in_source = 1 refers to “1”, while for stimulus_source == “LettersAtoD”, stimulus_index_in_source = 1 refers to “A”.

The fact that a particular stimulus_source is used in a given timeline does not mean that all possible stimuli of that source are displayed to the user. For example, the AX-CPT may use “upper-case-letters” but only use a subset of those letters (e.g., “A”, “B”, “X”, “Y”). Whenever possible, we specify the most relevant/specific set (e.g., “digit1to9” rather than “digit”).
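
The lookup implied by stimulus_source and stimulus_index_in_source can be sketched as below. The in-memory SOURCES dictionary stands in for the per-source tables and is purely hypothetical, as is the function name; the 1-based indexing follows the examples above.

```python
from typing import Dict, List

# Hypothetical stand-in for the tables named after each stimulus_source.
SOURCES: Dict[str, List[str]] = {
    "digit1to9": ["1", "2", "3", "4", "5", "6", "7", "8", "9"],
    "LettersAtoD": ["A", "B", "C", "D"],
}


def lookup_stimulus(stimulus_source: str, stimulus_index_in_source: int) -> str:
    """Resolve a 1-based index within a source to the stimulus it denotes."""
    items = SOURCES[stimulus_source]
    if not 1 <= stimulus_index_in_source <= len(items):
        raise IndexError("stimulus_index_in_source is out of range for this source")
    # Convert from the 1-based glossary convention to Python's 0-based lists.
    return items[stimulus_index_in_source - 1]


print(lookup_stimulus("digit1to9", 1))    # 1
print(lookup_stimulus("LettersAtoD", 1))  # A
```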

stimulus_onset float

Duration between the start of the trial and the appearance of the stimulus, in seconds.

Range

In seconds

stimulus_panel_count integer

The number of panels or screen areas stimuli may appear on during the trial. For example, in a task where stimuli to be compared are presented on the left and right side of the screen, stimulus_panel_count = 2.

stimulus_position_index integer

Refers to discrete positions on the screen the stimulus may appear on. The set and ordering of possible positions depend on the test. Whenever possible, it follows a natural order (left to right, top to bottom), but in free-form layouts, indices are arbitrary.

stimulus_role enum

A stimulus may play different roles within a trial. Below is a list of some possible roles:

Range

- target: A stimulus the subject must process and which should trigger the completion of the response (e.g., classify, reach, memorize) if the subject is doing the task as intended. Note that in some cases (e.g., in a go/no-go task) the correct response to a stimulus is to NOT click the button. In this case, the stimulus that triggered the decision to NOT click the button is still a target.
- non_target: A stimulus the subject must process but which does not trigger the completion of the response (e.g., the first two stimuli in a 2-back test).
- distractor: A stimulus the subject should not process at all (i.e., ignore) and which is unrelated to the correct execution of the task.
- location_cue: A stimulus giving spatial location information that subjects could use to improve their performance.
- job_specifier: A stimulus specifying which job the subject should perform.
- stop_signal: A stimulus signaling to the participant that they should abort their current action.
- probe: A stimulus indicating which stimulus to respond to.

stimulus_set_size integer

The number of different values each presented stimulus could have taken. This value gives an indication of the complexity of the stimulus space. When this number is large, we set this variable to infinity; when for any reason it was not computed, it has a value of NA.

To specify “infinity” in a CSV file we use +Inf and -Inf; these are correctly recognized in R (tidyverse) and Python (pandas) as being valid numbers rather than strings.
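
As a small illustration, the same convention can also be read with Python's standard library alone (used here instead of pandas so that the sketch has no dependencies). The column layout, the helper name parse_set_size, and the mapping of NA to None are assumptions.

```python
import csv
import io
from typing import List, Optional


def parse_set_size(cell: str) -> Optional[float]:
    """Return None for NA, otherwise a float that may be infinite."""
    if cell == "NA":
        return None
    # float() accepts signed spellings such as "+Inf" and "-Inf".
    return float(cell)


# Hypothetical CSV fragment with a finite, an infinite and a missing value.
csv_text = "trial_index,stimulus_set_size\n1,9\n2,+Inf\n3,NA\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
sizes: List[Optional[float]] = [
    parse_set_size(row["stimulus_set_size"]) for row in rows
]
print(sizes)  # [9.0, inf, None]
```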

stimulus_source string

Refers to the specific generator or set the stimulus belongs to. Stimuli that stem from the same source have the same data scheme and could thus be described in a table named after stimulus_source (i.e., stimulus_source indicates which table contains the full information about the stimulus; e.g., “digit1to9”).

stimulus_source_type enum

Range

- set: stimulus is sampled from a fixed set of stimuli.
- generator: stimulus is created using a procedure/algorithm.

stimulus_structure enum

We distinguish three stimulus structures: unitary, set, sequence

Range

- unitary: Only one stimulus is shown, alone.
- set: Many stimuli are shown, either at the same time or not; order does not matter.
- sequence: Multiple stimuli are shown, either at the same time or not; order does matter (order may be indicated by the order of presentation or by a digit for example).

stimulus_structure_source string

Refers to the specific generator used to produce the stimulus_structure (e.g., sequence of digits in a digit span test). When no generator was used, this variable has a value of none.

stimulus_structure_source_type enum

Indicates the type of method used to generate the stimulus_structure (this is relevant when a trial displays a sequence or set of stimuli): none, preset, generator

Range

- none: Used when stimulus_structure == unitary.
- preset: The structure of stimuli is hard coded in a file.
- generator: A procedure was used to generate the stimulus_structure.

stimulus_type enum

BDM distinguishes the following stimulus types: messages and questions

Range

- message: The stimulus is a message shown to subjects (e.g., task instructions).
- question: The stimulus may consist of text, images and/or sounds; they require subjects to make a decision based on the content of the stimulus.

study_name string

The name of the study or experiment.

symbol_count integer

The number of symbols represented in this component.

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.

symbol_layout enum

How the symbols are laid out.

Range

- vertical: along the Y axis.
- horizontal: along the X axis.
- diagonal_top_left
- diagonal_top_right
- square
- ring
- cross
- two_columns

If none of the predefined layouts apply, leave this field empty or use a custom human-readable label. Make sure custom labels are clearly defined in the codebook.

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.

symbol_name string

The human-readable name of the displayed symbol.

Shown as defined in the StimulusComponent table; also a field of: OptionComponent.

task_index index

When multitask_type is not empty, task_index refers to each of the individual tasks. For example, for auditory-visual dual N-back, task_index=1 is the auditory task and task_index=2 is the visual task.

Range

1-based index.

timeline_id string

Timelines are specific parameterizations of an instrument, and their identifiers are unique within the corresponding table for the instrument in the instruments/ folder.

Shown as defined in the Response table; also a field of: Instrument.

timeline_repetition integer

The number of times this particular timeline has already been completed at any time in the past by this particular subject in this study. This variable has a value of 0 the first time a timeline is completed.

Range

0-based
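
A 0-based repetition count of this kind can be derived from a chronological log of completed timelines. The sketch below assumes a single study and a log of (subject, timeline) pairs; the function name and the identifiers in the example are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def timeline_repetitions(completions: Iterable[Tuple[str, str]]) -> List[int]:
    """Return the 0-based repetition count for each completion.

    completions: (subject, timeline_id) pairs in chronological order.
    """
    seen: Counter = Counter()
    repetitions: List[int] = []
    for key in completions:
        # Number of earlier completions of this timeline by this subject.
        repetitions.append(seen[key])
        seen[key] += 1
    return repetitions


log = [
    ("s1", "nback"),
    ("s1", "nback"),
    ("s2", "nback"),
    ("s1", "span"),
    ("s1", "nback"),
]
print(timeline_repetitions(log))  # [0, 1, 0, 0, 2]
```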

transformation_name string

Refers to the specific events-to-trials function used to construct rows of this table from raw events. The transformation (or projection in DDD terminology) embodies the definition of a trial for a particular task. The transformation name refers to code in the form of a function f(trial_state, event) => trial_state, where event is an event that occurred during the performance of the task, and trial_state is the data stored for the trial. The final state of a trial is thus the result of applying a sequence of projections such that trial = f(f(f(initial(), e1), e2), e3).

The transformation/projection encapsulates the domain rules that define a “trial” for a given task. It defines what constitutes a “trial”.
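
Applying f repeatedly to an initial state is a left fold over the event list, which Python expresses with functools.reduce. The sketch below is only illustrative: the event types, the state keys, and the function name project are invented for the example and are not BDM definitions.

```python
from functools import reduce
from typing import Any, Dict, List

TrialState = Dict[str, Any]
Event = Dict[str, Any]


def initial() -> TrialState:
    """Empty trial state before any event has been applied."""
    return {"stimulus_onset": None, "response": None, "event_count": 0}


def project(trial_state: TrialState, event: Event) -> TrialState:
    """f(trial_state, event) => trial_state; returns a new state."""
    new_state = dict(trial_state)  # leave the previous state untouched
    new_state["event_count"] += 1
    if event["type"] == "stimulus_shown":
        new_state["stimulus_onset"] = event["time"]
    elif event["type"] == "key_pressed":
        new_state["response"] = event["key"]
    return new_state


events: List[Event] = [
    {"type": "stimulus_shown", "time": 0.5},
    {"type": "key_pressed", "time": 1.2, "key": "f"},
]
# Equivalent to project(project(initial(), events[0]), events[1]).
trial = reduce(project, events, initial())
print(trial)
# {'stimulus_onset': 0.5, 'response': 'f', 'event_count': 2}
```

Because project returns a new state instead of mutating its input, re-running the fold after new events arrive reproduces the trial from scratch.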

trial_id id

Refers to the trial_index in the Response table and indicates in which trial this stimulus was shown.

Range

trial_id in the Response table of the same agent/session/activity/attempt

trial_index id

Sequential identifier representing the number of times the transformation rule has been applied to the events. It increases with each re-computation of the trial based on updated or newly received events.

Range

Preferably 1-based integer index.

This field emphasizes the order of trials and the alignment with the projection-based definition of trials. A more complete name would be projection_index or projected_trial_index. For brevity, BDM uses trial_id instead.

Shown as defined in the Response table; also a field of: Option.

trial_seed integer or string

Random seed used in the trial (if any).

trial_start_datetime datetime

The datetime at which the first event of the trial occurred.

value float

A numeric value associated with a particular response option; typically indicating the “worth” of a response (e.g., value=1 for the correct response).

version string

Refers to the specific version/build of a particular instrument. We will use a calendar-based versioning system (calver.org; e.g., “v2024.01”).

x_screen integer

X coordinates of the stimulus on the screen in pixels.

In BDM, the preferred position is the center of the object. However, specific implementations of the tasks may use other locations such as the top-left corner. If this is the case, it should be explicitly stated in the codebook.

Shown as defined in the Stimulus table; also a field of: Option, Input, StimulusComponent, OptionComponent.

x_viewport float

X coordinates of the stimulus on the screen expressed as a fraction of the screen width.

Range

0 to 1 (inclusive)

Shown as defined in the Stimulus table; also a field of: Option, Input, StimulusComponent, OptionComponent.

y_screen integer

Y coordinates of the stimulus on the screen in pixels.

Shown as defined in the Stimulus table; also a field of: Option, Input, StimulusComponent, OptionComponent.

y_viewport float

Y coordinates of the stimulus on the screen expressed as a fraction of the screen height.

Range

0 to 1 (inclusive)

Shown as defined in the Stimulus table; also a field of: Option, Input, StimulusComponent, OptionComponent.
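
The relation between the pixel fields and the viewport fields can be sketched as a simple division by the screen dimensions. This assumes both coordinate pairs share the same origin, which the definitions above do not state; the function name and the 1920 x 1080 screen are hypothetical.

```python
from typing import Tuple


def to_viewport(
    x_screen: int, y_screen: int, width_px: int, height_px: int
) -> Tuple[float, float]:
    """Convert pixel coordinates to fractions of the screen width and height."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("screen dimensions must be positive")
    x_viewport = x_screen / width_px
    y_viewport = y_screen / height_px
    return x_viewport, y_viewport


# Hypothetical 1920 x 1080 screen.
print(to_viewport(960, 270, 1920, 1080))  # (0.5, 0.25)
```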

Event fields

actor object

Who or what performed/experienced the event — {objectType, id, name?}, where objectType is one of the actor types below. (Renamed from agent; an Agent is one type of actor — BDM deviation D5.)

attachments array

References to additional files/data associated with the event (stimulus blobs, recording files, timeseries), each with its own metadata. Payloads are not inlined.

authority object

The authority that generated the event (e.g. the client app/developer). Populated by the LRS.

context object

Contextual information (study, studyflow, and the session→activity→runtime→block→trial scoping hierarchy) under context.extensions, keyed by bdm:* extension keys.

object object

What the action was performed on — {objectType, id, name?}, where objectType is one of the object types below.

result object

The outcome of the event (e.g. accuracy, response_time, score). Domain-specific payload lives under result.extensions keyed by bdm:* extension keys.

stored datetime (RFC 9557)

When the event was stored in the LRS. Populated by the LRS.

timestamp datetime (RFC 9557)

When the event occurred, as an ISO 8601 / RFC 9557 datetime with timezone offset.
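
A datetime with an explicit timezone offset can be produced and read back with Python's standard library, as sketched below. The date and the +02:00 offset are arbitrary. RFC 9557 additionally allows a bracketed suffix (for example a time zone name), which this sketch neither writes nor parses.

```python
from datetime import datetime, timedelta, timezone

# Arbitrary example moment with a +02:00 offset.
tz = timezone(timedelta(hours=2))
occurred = datetime(2024, 1, 15, 9, 30, 0, tzinfo=tz)

# Serialize to ISO 8601 with the offset included.
timestamp = occurred.isoformat()
print(timestamp)  # 2024-01-15T09:30:00+02:00

# Parse it back; the offset is preserved on the resulting datetime.
parsed = datetime.fromisoformat(timestamp)
print(parsed.utcoffset())  # 2:00:00
```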

updated datetime (RFC 9557)

When the event was last updated in the LRS. Populated by the LRS.

verb string

The action that occurred, drawn from the canonical verb vocabulary below.

version string

The associated BDM/schema version (e.g. v26.0608). Typically populated by the LRS.
