Specification

Studyflow is a domain-specific language for specifying scientific processes and their associated data. It extends the BPMN 2.0 standard to fit the specific needs of experimental sciences.

Formal definition

A studyflow diagram is a tuple \(S = (N, E, T, \tau, \lambda)\), where \(N\) is a finite set of elements (nodes), \(E \subseteq N \times N\) is the set of sequence flows (directed edges), \(T\) is a set of pre-defined node types (events, activities, gateways, data), \(\tau: N \rightarrow T\) is a typing function that assigns a type to each node, and \(\lambda\) is a labeling function that assigns additional attributes to the nodes (e.g., metadata, triggers, gateway logic, implementation). Together, \(N\) and \(E\) form a directed graph that represents the flow of the study.

\(N\) can be further divided into subsets according to the type that \(\tau\) assigns to each element. For example, \(N_{E} \subseteq N\) represents the set of events (e.g., start and end events), \(N_{A} \subseteq N\) represents the set of activities (e.g., tasks, sub-processes), \(N_{G} \subseteq N\) represents the set of gateways (e.g., randomizers, decision points, parallel splits), and \(N_{D} \subseteq N\) represents the set of data objects, which store and manipulate data within the studyflow. Each subset has its own specific attributes and behaviors defined by the \(\lambda\) function.
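
As an illustration only, the tuple and its type-based subsets can be sketched as a small Python data structure. The names `Studyflow`, `NodeType`, `tau`, `lam`, and `subset` are hypothetical and not part of the specification, and the sketch assumes that nodes are identified by plain strings.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class NodeType(Enum):
    """The set T of pre-defined node types."""
    EVENT = "event"
    ACTIVITY = "activity"
    GATEWAY = "gateway"
    DATA = "data"


@dataclass
class Studyflow:
    """A studyflow S = (N, E, T, tau, lambda); T is the NodeType enum."""
    nodes: set[str]                     # N
    edges: set[tuple[str, str]]         # E, a subset of N x N
    tau: dict[str, NodeType]            # typing function N -> T
    lam: dict[str, dict[str, object]] = field(default_factory=dict)  # attributes

    def __post_init__(self) -> None:
        # Every edge must connect known nodes, and tau must cover all of N.
        for src, dst in self.edges:
            if src not in self.nodes or dst not in self.nodes:
                raise ValueError(f"edge ({src}, {dst}) references an unknown node")
        if set(self.tau) != self.nodes:
            raise ValueError("tau must assign exactly one type to every node")

    def subset(self, node_type: NodeType) -> set[str]:
        """Return the nodes whose assigned type is node_type (e.g., N_E)."""
        return {n for n in self.nodes if self.tau[n] is node_type}


flow = Studyflow(
    nodes={"s", "qs", "gw", "e"},
    edges={("s", "qs"), ("qs", "gw"), ("gw", "e")},
    tau={
        "s": NodeType.EVENT,
        "qs": NodeType.ACTIVITY,
        "gw": NodeType.GATEWAY,
        "e": NodeType.EVENT,
    },
    lam={"qs": {"language": "en"}},
)
```

Here `flow.subset(NodeType.EVENT)` returns the events \(N_E\) (`s` and `e`), and constructing a `Studyflow` whose edges mention an unknown node raises `ValueError`.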

The main components of the \(S\) tuple are described in the BPMN 2.0 specification, and studyflow extends them with additional types and attributes to better suit the needs of experimental studies. More specifically:

  • \(N_A\) (activities) is extended with specific activity types relevant to experimental studies, such as cognitive tests, questionnaires, instructions, rest periods, and video games.
  • \(N_G\) (gateways) is extended with a random gateway type, which allows for random assignment of participants to different branches of the study based on specified probabilities or conditions.
  • \(N_D\) (data objects) can be used to represent data collected during the study, such as participant responses, physiological measurements, or other relevant data. Data objects also support standard data formats (e.g., BIDS, BDM, Psych-DS) and can include attributes for data validation and preprocessing.
  • \(\lambda\) (attributes) is extended to include attributes specific to experimental studies or data analysis, such as metadata (e.g., study name, version), event triggers (e.g., temporal, error), gateway logic (e.g., randomization probabilities, conditional logic), and implementation details (e.g., links to external scripts or software).
  • \(E\) (edges) can include group assignments, indicating which paths participants should follow based on their assigned group.
  • \(S\) can also include design patterns commonly used in experimental studies, such as counterbalancing, recruitment, exception handling, and data quality checks.
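
The probability-based assignment performed by a random gateway could work along the following lines. This is a minimal sketch, not the studyflow implementation: the function name `choose_branch`, the mapping from sequence-flow identifiers to probabilities, and the requirement that the probabilities sum to 1 are all assumptions made for illustration.

```python
from __future__ import annotations

import random


def choose_branch(probabilities: dict[str, float], rng: random.Random) -> str:
    """Pick one outgoing flow id according to the given branch probabilities."""
    if not probabilities:
        raise ValueError("a random gateway needs at least one outgoing branch")
    if any(p < 0 for p in probabilities.values()):
        raise ValueError("probabilities must be non-negative")
    total = sum(probabilities.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities must sum to 1, got {total}")
    flows = list(probabilities)
    weights = [probabilities[f] for f in flows]
    # random.Random.choices performs weighted sampling with replacement.
    return rng.choices(flows, weights=weights, k=1)[0]


# A seeded generator keeps the assignment reproducible across runs.
rng = random.Random(42)
assignments = [choose_branch({"f3": 0.5, "f4": 0.5}, rng) for _ in range(1000)]
```

Passing an explicit `random.Random` instance, rather than using the module-level functions, makes it possible to record the seed alongside the study data so that group assignments can be audited later.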

Grammar

The following grammar describes the structure of a studyflow diagram:

Studyflow EBNF grammar
Definitions       = { Study } ;

Study             = "Study", identifier, { Attribute }, { Element | SequenceFlow } ;

Element           = Event | Activity | Gateway ;

Event             = StartEvent | EndEvent ;

StartEvent        = "StartEvent", identifier, { Attribute } ;
EndEvent          = "EndEvent", identifier, { Attribute } ;

Activity          = "Activity", identifier, { ActivityAttribute }, [ Choreography ] ;
ActivityAttribute = ActivityType | Attribute ;
ActivityType      = "@type", ( "CognitiveTest" | "Questionnaire" | "Instruction" | "Rest" | "Script" | "Manual" ) ;

Choreography      = "choreography", { Attribute }, ParticipantList, [ InitiatingParticipant ] ;

Gateway           = "Gateway", identifier, { GatewayAttribute } ;
GatewayAttribute  = GatewayType | Attribute ;
GatewayType       = "@type", ( "Random" | "Exclusive" | "Complex" ) ;

SequenceFlow      = "SequenceFlow", identifier, { Attribute }, NodeRef, "→", NodeRef ;

MessageFlowList   = { "messageFlow", identifier, { Attribute }, ParticipantRef, "→", ParticipantRef } ;

Attribute             = identifier, value ;
ProcessRef            = identifier ;
NodeRef               = identifier ;
ParticipantRef        = ProcessRef ;
InitiatingParticipant = ParticipantRef ;
ParticipantList       = identifier, { identifier } ;

(* low-level components *)
identifier            = letter, { letter | digit | "_" } ;
value                 = string | number | boolean | identifier ;
number                = [ "-" ], digit, { digit }, [ ".", digit, { digit } ] ;
boolean               = "true" | "false" ;
string = '"', { ? any Unicode character except " or \ ?
  | "\", (
  '"' (* quotation mark *) |
  "\" (* reverse solidus *) |
  "/" (* solidus *) |
  "b" (* backspace *) |
  "f" (* formfeed *) |
  "n" (* newline *) |
  "r" (* carriage return *) |
  "t" (* horizontal tab *) |
  "u", 4 * ? hex digit ?
  ) }, '"';

letter = ? any ASCII letter, "A" to "Z" or "a" to "z" ? ;
digit = ? any decimal digit, "0" to "9" ? ;
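
To make the low-level rules concrete, the following sketch classifies a single token according to the `value` rule. `parse_value` is a hypothetical helper, not part of studyflow. It assumes that booleans take precedence over identifiers (since `true` and `false` also match the `identifier` rule) and that the `string` rule follows JSON string syntax closely enough for Python's `json` module to decode it.

```python
import json
import re

# identifier = letter, { letter | digit | "_" }
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# number = [ "-" ], digit, { digit }, [ ".", digit, { digit } ]
NUMBER = re.compile(r"-?[0-9]+(?:\.[0-9]+)?")


def parse_value(token: str) -> object:
    """Convert one token into a Python value following the `value` rule."""
    if token.startswith('"'):
        # Assumption: strings use JSON escapes, so json.loads can decode them.
        # Malformed strings raise json.JSONDecodeError, a ValueError subclass.
        return json.loads(token)
    if NUMBER.fullmatch(token):
        return float(token) if "." in token else int(token)
    if token in ("true", "false"):
        # Checked before identifiers because both rules match these words.
        return token == "true"
    if IDENTIFIER.fullmatch(token):
        return token
    raise ValueError(f"not a valid value: {token!r}")


examples = [parse_value(t) for t in ('"en"', "5", "-3.25", "true", "ageGroup")]
```

Note that identifiers and strings both come back as Python `str` objects here; a real parser would likely keep them apart, for example by wrapping identifiers in a dedicated type.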

An example studyflow in this formalism is shown below:

Example studyflow
Study exampleStudy
  StartEvent s

  Activity qs
    @type Questionnaire
    language "en"
    text "What is your age?"

  Gateway gw
    @type Random
    condition "ageGroup"

  Activity instr
    @type Instruction
    text "Follow carefully"

  Activity rest
    @type Rest
    duration 5

  EndEvent e

  SequenceFlow f1 s → qs
  SequenceFlow f2 qs → gw
  SequenceFlow f3 gw → instr
  SequenceFlow f4 gw → e
  SequenceFlow f5 instr → rest
  SequenceFlow f6 rest → e

This studyflow can be visualized as an extended BPMN diagram:

The same studyflow visualized and annotated using the Modeler app

This diagram can also be represented in machine-readable formats:

XML/BPMN serialization
<?xml version="1.0" encoding="UTF-8"?>
<bpmn2:definitions
  xmlns:bpmn2="http://www.omg.org/spec/BPMN/20100524/MODEL"
  xmlns:studyflow="http://behaverse.org/schema/studyflow"
  id="example-diagram">
  <studyflow:study id="exampleStudy" isExecutable="false">
    <bpmn2:startEvent id="s" name="s">
      <bpmn2:outgoing>f1</bpmn2:outgoing>
    </bpmn2:startEvent>
    <studyflow:questionnaire id="qs" name="qs" type="studyflow:Questionnaire">
      <bpmn2:incoming>f1</bpmn2:incoming>
      <bpmn2:outgoing>f2</bpmn2:outgoing>
    </studyflow:questionnaire>
    <bpmn2:sequenceFlow id="f1" name="f1" sourceRef="s" targetRef="qs" />
    <studyflow:randomGateway id="gw" name="gw" type="studyflow:RandomGateway">
      <bpmn2:incoming>f2</bpmn2:incoming>
      <bpmn2:outgoing>f3</bpmn2:outgoing>
      <bpmn2:outgoing>f4</bpmn2:outgoing>
    </studyflow:randomGateway>
    <bpmn2:sequenceFlow id="f2" name="f2" sourceRef="qs" targetRef="gw" />
    <studyflow:instruction id="instr" name="instr" type="studyflow:Instruction">
      <bpmn2:incoming>f3</bpmn2:incoming>
      <bpmn2:outgoing>f5</bpmn2:outgoing>
    </studyflow:instruction>
    <bpmn2:sequenceFlow id="f3" name="f3" sourceRef="gw" targetRef="instr" />
    <studyflow:cognitiveTest id="rest" name="rest" instrument="rest" type="studyflow:CognitiveTest">
      <bpmn2:incoming>f5</bpmn2:incoming>
      <bpmn2:outgoing>f6</bpmn2:outgoing>
    </studyflow:cognitiveTest>
    <bpmn2:sequenceFlow id="f5" name="f5" sourceRef="instr" targetRef="rest" />
    <bpmn2:endEvent id="e" name="e">
      <bpmn2:incoming>f6</bpmn2:incoming>
      <bpmn2:incoming>f4</bpmn2:incoming>
    </bpmn2:endEvent>
    <bpmn2:sequenceFlow id="f6" name="f6" sourceRef="rest" targetRef="e" />
    <bpmn2:sequenceFlow id="f4" name="f4" sourceRef="gw" targetRef="e" />
  </studyflow:study>
</bpmn2:definitions>
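
As a rough sketch of how such a file might be consumed, the sequence flows can be recovered with Python's standard `xml.etree.ElementTree` module. The function name `sequence_flows` and the trimmed `SAMPLE` document are illustrative assumptions; only the namespace URI and the `id`, `sourceRef`, and `targetRef` attributes are taken from the serialization above.

```python
from __future__ import annotations

import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:bpmn2 declaration in the serialization.
BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"


def sequence_flows(xml_text: str) -> dict[str, tuple[str, str]]:
    """Map each sequence flow id to its (source, target) node ids."""
    root = ET.fromstring(xml_text)
    flows = {}
    # ElementTree addresses namespaced tags as "{namespace-uri}localname".
    for el in root.iter(f"{{{BPMN_NS}}}sequenceFlow"):
        flows[el.get("id")] = (el.get("sourceRef"), el.get("targetRef"))
    return flows


# A trimmed, hypothetical fragment of the document above, kept short on purpose.
SAMPLE = """
<bpmn2:definitions
  xmlns:bpmn2="http://www.omg.org/spec/BPMN/20100524/MODEL"
  xmlns:studyflow="http://behaverse.org/schema/studyflow">
  <studyflow:study id="exampleStudy">
    <bpmn2:sequenceFlow id="f1" sourceRef="s" targetRef="qs" />
    <bpmn2:sequenceFlow id="f2" sourceRef="qs" targetRef="gw" />
  </studyflow:study>
</bpmn2:definitions>
""".strip()

flows = sequence_flows(SAMPLE)
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`, with the rest of the function unchanged.
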

YAML serialization
# Keys that start with "@" are quoted: "@" cannot begin a plain YAML scalar.
study:
  "@id": exampleStudy
  elements:
    - "@type": bpmn2:StartEvent
      "@id": s
      outgoing: [f1]
    - "@type": studyflow:Questionnaire
      "@id": qs
      attributes:
        - language: en
        - text: What is your age?
      incoming: [f1]
      outgoing: [f2]
    - "@type": studyflow:RandomGateway
      "@id": gw
      attributes:
        - condition: ageGroup
      incoming: [f2]
      outgoing: [f3, f4]
    - "@type": studyflow:Instruction
      "@id": instr
      attributes:
        - text: Follow carefully
      incoming: [f3]
      outgoing: [f5]
    - "@type": studyflow:Rest
      "@id": rest
      attributes:
        - duration: 5
      incoming: [f5]
      outgoing: [f6]
    - "@type": bpmn2:EndEvent
      "@id": e
      incoming: [f4, f6]
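
The YAML form does not list sequence flows explicitly, but they can be reconstructed by pairing each flow id in an `outgoing` list with the same id in an `incoming` list. The sketch below illustrates this on a Python list that mirrors the `elements` section after loading. The function name `derive_flows` is hypothetical, and the sketch assumes every element carries an `@id` key and every flow has exactly one source and one target.

```python
from __future__ import annotations

# Mirrors the "elements" list of the YAML document (attributes omitted).
elements = [
    {"@id": "s", "outgoing": ["f1"]},
    {"@id": "qs", "incoming": ["f1"], "outgoing": ["f2"]},
    {"@id": "gw", "incoming": ["f2"], "outgoing": ["f3", "f4"]},
    {"@id": "instr", "incoming": ["f3"], "outgoing": ["f5"]},
    {"@id": "rest", "incoming": ["f5"], "outgoing": ["f6"]},
    {"@id": "e", "incoming": ["f4", "f6"]},
]


def derive_flows(elements: list[dict]) -> dict[str, tuple[str, str]]:
    """Rebuild flow id -> (source, target) from incoming/outgoing lists."""
    sources: dict[str, str] = {}
    targets: dict[str, str] = {}
    for el in elements:
        for fid in el.get("outgoing", []):
            if fid in sources:
                raise ValueError(f"flow {fid} has more than one source")
            sources[fid] = el["@id"]
        for fid in el.get("incoming", []):
            if fid in targets:
                raise ValueError(f"flow {fid} has more than one target")
            targets[fid] = el["@id"]
    # A flow listed on only one side points at a missing element.
    dangling = set(sources) ^ set(targets)
    if dangling:
        raise ValueError(f"flows without both ends: {sorted(dangling)}")
    return {fid: (sources[fid], targets[fid]) for fid in sorted(sources)}


flows = derive_flows(elements)
```

For the example study this yields six flows, with `f4` mapped to `("gw", "e")`, which matches the `SequenceFlow` declarations in the textual form.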