Flow Results Data Package

A container and data format for describing a collection of interactions or "responses" reported by end-users of a digital system using the Flow Data paradigm. It provides for the open publication, exchange, and analysis of Flow-generated interactions across supporting platforms.
Authors
Mark Boots (Viamo)
Peter Lubell-Doughtie (Ona)
Eduardo Jezierski (InSTEDD)
Gustavo Giráldez (InSTEDD)
Evan Wheeler (UNICEF)
Media Type
TODO: once registered: application/vnd.org.flowinterop.results+json
Version
1.0.0-rc.1
Last updated
2017-09-30
Created
2017-03-31

Language

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119.

Introduction

Within the field of ICT tools for humanitarian and development work (ICT4D), a range of software applications has emerged for digital data collection and mobile engagement with remote populations. These applications employ a diversity of communication channels, such as text messaging, automated voice calls, unstructured supplementary service data (USSD) menus, web-based forms, and in-person surveys using mobile apps for digital data entry. Despite their differences, many of these applications are based on a similar paradigm of "Flow-based" data collection using logical decision trees. Until now, however, exchanging data collected with applications from different vendors has required ad-hoc or one-to-one data mapping software.
An open specification for the exchange of data generated by Flow-like software applications allows organizations to reduce ad-hoc software development effort; accelerate the creation of new interoperable analysis tools for visualization, dashboards, and decision making; share data faster in crisis situations; and reduce risks associated with vendor lock-in.

Scope

The purpose of this specification is to standardize the exchange of data between data collection and data analysis/visualization applications within the ICT4D sector. It focuses on the "results" or "responses" recorded during interactions with end-users through digital channels.
The specification is intended to be relevant to both file-based and API-based data exchange. It is agnostic to the communication channel used for data collection.
The specification is self-sufficient and can be used independently, but it is also designed to complement the Flow Definition specification describing logical flows of mobile engagement content. It is also designed to be flexible enough to describe responses collected with non-Flow-based applications, such as the suite of tools based on the Open Data Kit (ODK) framework, and to create a bridge between established and evolving technologies.

Terminology

Contact: A Contact is an end-user of a digital interactive system, providing input or "responses" to the system. Contacts can be human beings interacting via a channel such as interactive voice response (IVR), SMS, USSD, social media messaging, or a web browser. Contacts might also be non-human entities (such as a waterpoint or school) or automated systems (programmable agents) that do not necessarily represent a human being.
Response: A Response is a single input given by a Contact when prompted during an interaction with a digital interactive system. As an example, answering a multiple choice question asking, "Are you male or female?" by choosing the female option constitutes providing a Response.
Question: A Question is a prompt to the Contact for a Response. When looking at results data, knowing the nature of the question helps in analyzing and visualizing the responses. For example, numeric question responses might be graphed on a scatter plot, while multiple-choice question responses are naturally visualized on a bar chart or pie chart. Often additional information about the question is necessary beyond simply the responses, such as the complete set of choices presented in a multiple-choice question.
Although the terminology of "Questions" is natural when using Flows for survey-like use-cases, it's important to clarify that in general, Questions might not be literal questions. They might represent the output of a lookup, script, or logical analysis within a digital engagement system. The terminology used varies across platforms: Variable, Result, and Column are other common terms. Within this specification, we refer to them uniformly for the sake of convenience as Questions.
(TODO: We agreed that Question isn't the best terminology, but I found that while drafting the spec this made it much more legible and understandable, compared to "Variable" or "Result" which are much less precise.)

Example

A minimal example of a Flow Results data package, stored as files on disk, is given here:
There is a Descriptor file in JSON format, e.g. flow-results-example-1.json:
    {
      "profile": "flow-results-package",
      "name": "flow-results-example-1",
      "flow_results_specification_version": "1.0.0-rc1",
      "created": "2017-06-30 15:35:27+00:00",
      "modified": "2017-06-30 15:38:05+00:00",
      "id": "b03ec84-77fd-4270-813b-0c698943f7ce",
      "title": "A nice title",
      "resources": [
        {
          "path": "data/flow-results-example-1-data.json",
          "name": "flow-results-example-1-data",
          "access_method": "file",
          "mediatype": "application/json",
          "encoding": "utf-8",
          "schema": {
            "language": "eng",
            "fields": [
              {
                "name": "timestamp",
                "title": "Timestamp",
                "type": "datetime"
              },
              {
                "name": "row_id",
                "title": "Row ID",
                "type": "string"
              },
              {
                "name": "contact_id",
                "title": "Contact ID",
                "type": "string"
              },
              {
                "name": "session_id",
                "title": "Session ID",
                "type": "string"
              },
              {
                "name": "question_id",
                "title": "Question ID",
                "type": "string"
              },
              {
                "name": "response",
                "title": "Response",
                "type": "any"
              },
              {
                "name": "response_metadata",
                "title": "Response Metadata",
                "type": "object"
              }
            ],
            "questions": {
              "ae54d3": {
                "type": "multiple_choice",
                "label": "Are you male or female?",
                "type_options": {
                  "choices": [
                    "male",
                    "female",
                    "not identified"
                  ]
                }
              },
              "ae54d7": {
                "type": "multiple_choice",
                "label": "Favorite ice cream flavor?",
                "type_options": {
                  "choices": [
                    "chocolate",
                    "vanilla",
                    "strawberry"
                  ]
                }
              },
              "ae54d8": {
                "type": "numeric",
                "label": "How much do you weigh, in lbs?",
                "type_options": {
                  "range": [
                    1,
                    250
                  ]
                }
              },
              "ae54da": {
                "type": "open",
                "label": "How are you feeling today?",
                "type_options": {}
              },
              "ae54db": {
                "type": "geo_point",
                "label": "Where are you?",
                "type_options": {}
              }
            }
          }
        }
      ]
    }
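As an illustration of how a consumer might read such a descriptor, the following minimal Python sketch parses an abbreviated copy of the example descriptor and lists its column names and declared Questions. The helper name summarize_descriptor is hypothetical and not part of this specification, and the sketch assumes a single resource with an inline schema, as in the example.

```python
import json

# Abbreviated copy of the example descriptor: only two fields and two
# questions are kept to keep the sketch short.
DESCRIPTOR_JSON = """
{
  "profile": "flow-results-package",
  "name": "flow-results-example-1",
  "resources": [
    {
      "path": "data/flow-results-example-1-data.json",
      "schema": {
        "fields": [
          {"name": "timestamp", "title": "Timestamp", "type": "datetime"},
          {"name": "row_id", "title": "Row ID", "type": "string"}
        ],
        "questions": {
          "ae54d3": {
            "type": "multiple_choice",
            "label": "Are you male or female?",
            "type_options": {"choices": ["male", "female", "not identified"]}
          },
          "ae54d8": {
            "type": "numeric",
            "label": "How much do you weigh, in lbs?",
            "type_options": {"range": [1, 250]}
          }
        }
      }
    }
  ]
}
"""


def summarize_descriptor(text):
    """Hypothetical helper: return the column names and a mapping of
    question_id -> (type, label) for the first resource in the descriptor."""
    descriptor = json.loads(text)
    schema = descriptor["resources"][0]["schema"]
    columns = [field["name"] for field in schema["fields"]]
    questions = {
        qid: (question["type"], question["label"])
        for qid, question in schema["questions"].items()
    }
    return columns, questions


columns, questions = summarize_descriptor(DESCRIPTOR_JSON)
print(columns)
for qid, (qtype, label) in questions.items():
    print(qid, qtype, label)
```

Reading the Questions from the descriptor first gives an analysis tool the type and options of each Question before it touches any Response rows.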
Additionally, there is a data file or API endpoint containing the Responses data described by this resource (in this example, data/flow-results-example-1-data.json). It provides all individual Responses in the following compact JSON format:
    [
      [ "2017-05-23T13:35:37.356-04:00", 20394823948, 923842093, 10499221, "ae54d3", "female", {"option_order": ["male","female"]} ],
      [ "2017-05-23T13:35:47.012-04:00", 20394823950, 923842093, 10499221, "ae54d7", "chocolate", {} ],
      [ "2017-05-24T15:15:37.981-04:00", 20394823952, 923842086, 10499224, "ae54d3", "male", {"option_order": ["male","female"]} ],
      [ "2017-05-23T15:16:12.005-04:00", 20394823953, 923842086, 10499224, "ae54d7", "vanilla", {} ],
      [ "2017-05-23T15:16:20.781-04:00", 20394823954, 923842086, 10499224, "ae54d8", 196, {} ],
      [ "2017-05-23T15:16:38.119-04:00", 20394823955, 923842086, 10499224, "ae54da", "I am feeling curious.", {"type": "text", "language": "eng"} ],
      [ "2017-05-23T17:25:12.722-04:00", 20394823956, 923842093, 10499227, "ae54da", "https://myexampleflowserver.org/resources/audio/20394823956.ogg", {"type": "audio", "language": "eng", "format": "audio/ogg"} ],
      [ "2017-05-23T17:25:47.214-04:00", 20394823957, 923842093, 10499227, "ae54db", "[35.678323, -108.25343]", {} ]
    ]
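To work with this compact format, a consumer can pair each row with the field names from the descriptor's schema. The Python sketch below does this for three rows copied from the example and tallies the answers per Question. It assumes that the values in each row follow the order of the "fields" array in the descriptor, as they do in the example. The names FIELD_NAMES, rows_to_records, and tally_choices are illustrative and not part of this specification.

```python
import json
from collections import Counter

# Field names in the order they appear in the example descriptor's
# schema "fields" array (assumed to match the order of values in each row).
FIELD_NAMES = [
    "timestamp", "row_id", "contact_id", "session_id",
    "question_id", "response", "response_metadata",
]

# Three rows copied from the example data file.
ROWS_JSON = """
[
  ["2017-05-23T13:35:37.356-04:00", 20394823948, 923842093, 10499221, "ae54d3", "female", {"option_order": ["male","female"]}],
  ["2017-05-23T13:35:47.012-04:00", 20394823950, 923842093, 10499221, "ae54d7", "chocolate", {}],
  ["2017-05-24T15:15:37.981-04:00", 20394823952, 923842086, 10499224, "ae54d3", "male", {"option_order": ["male","female"]}]
]
"""


def rows_to_records(rows, field_names):
    """Turn each positional row into a dict keyed by field name."""
    return [dict(zip(field_names, row)) for row in rows]


def tally_choices(records, question_ids):
    """Count how often each response was given, per selected question."""
    tallies = {qid: Counter() for qid in question_ids}
    for record in records:
        qid = record["question_id"]
        if qid in tallies:
            tallies[qid][record["response"]] += 1
    return tallies


records = rows_to_records(json.loads(ROWS_JSON), FIELD_NAMES)
tallies = tally_choices(records, ["ae54d3", "ae54d7"])
print(records[0]["response"])
print(dict(tallies["ae54d3"]))
```

Keeping rows positional in the file and attaching names only at read time keeps the data file compact, while the descriptor remains the single place where columns are described.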