How to author items for reporting?

How metadata and variable naming choices affect TAO Insights data

When creating items in TAO, some authoring decisions have a direct impact on the data that will later be extracted and analyzed. Two areas matter especially: item metadata and item variables. Metadata gives analytical context to each item, while variable naming determines how responses will appear in exports and API results. In TAO Insights, item metadata is exposed with item-level result data, and item responses are exposed under the response identifiers defined in authoring.

1. Item metadata: what gives meaning to the results

If a reporting dimension matters later, it should exist as metadata on the item from the start. Typical examples include competency, grade level, domain, difficulty, strand, form, or pilot status. TAO Insights exposes item metadata in result data through item-level metadata fields, which makes it possible to filter, group, and interpret results in a meaningful way.

The main point for authors is simple: without metadata, there is little context for analysis. You may still retrieve scores and responses, but you lose the ability to easily answer questions such as:

How did learners perform by competency?
What is the average success rate by grade level?
Are difficult items behaving differently from easier ones?

So metadata should not be treated as optional decoration. It is part of the reporting model.

Example of a possible metadata schema in TAO Authoring:

And here is how that metadata appears for each item result in the data extracts:

{
  "metadata": {
    "Label": "Luxembourg's neighbors",
    "Measured competency": "Reading Comprehension",
    "Grade": "Grade 8",
    "Subject matter": "English"
  },
  ...
},
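
As a rough illustration of why this matters, the following Python sketch groups item results by one metadata field and computes a success rate per group. Only the "metadata" object follows the extract above. The score and maxScore fields, the sample rows, and the function name success_rate_by are assumptions made for this example, not part of the documented TAO Insights format.

```python
from collections import defaultdict

# Hypothetical item results: only the "metadata" object mirrors the extract;
# "score" and "maxScore" are assumed field names used for illustration.
item_results = [
    {"metadata": {"Measured competency": "Reading Comprehension"}, "score": 1, "maxScore": 1},
    {"metadata": {"Measured competency": "Reading Comprehension"}, "score": 0, "maxScore": 1},
    {"metadata": {"Measured competency": "Vocabulary"}, "score": 1, "maxScore": 1},
    {"metadata": {}, "score": 0, "maxScore": 1},  # item authored without metadata
]


def success_rate_by(results, metadata_key):
    """Return {metadata value: earned points / possible points}."""
    earned = defaultdict(float)
    possible = defaultdict(float)
    for result in results:
        # Items lacking the key are collected in an explicit "(missing)" group
        group = result.get("metadata", {}).get(metadata_key, "(missing)")
        earned[group] += result["score"]
        possible[group] += result["maxScore"]
    # Skip groups with no possible points to avoid dividing by zero
    return {group: earned[group] / possible[group] for group in possible if possible[group]}


rates = success_rate_by(item_results, "Measured competency")
```

Items authored without the metadata key fall into the "(missing)" group, which is exactly the kind of result that cannot be interpreted later.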

2. Response identifiers and choice identifiers: what the result file will actually contain

For item responses, there are two levels to think about.

The first level is the response identifier. In TAO result exports, responses appear under responses.RESPONSE_x, and TAO explicitly states that this value is the response identifier defined for the interaction in authoring. It is editable from the item authoring screen.

The second level is the value stored inside that response. For many closed-ended interactions, that stored value is not the text shown to the learner, but an identifier-based value. TAO’s export reference shows that identifier-type responses are serialized as identifiers, including single, multiple, and ordered values.

That means that for closed-ended items, both the response identifier and the choice identifiers matter.

In practice:

the response identifier tells you which variable to look at
the choice identifier(s) tell you what was actually selected

So for reporting purposes, authors should not only ask “What is my response variable called?” but also “What exact code will be stored if the learner selects this option?”

For example, in this item:

The interaction response identifier is choice_lux_border.

Each choice has its own human-readable identifier, e.g. belgium, germany, austria, denmark.

The result extract will look like this if the test-taker selected both the “Austria” and “Denmark” choices:

{
  "responses": [
    {
      "choice_lux_border": {
        "correct": true,
        "value": "['austria'; 'denmark']"
      }
    }
  ]
}
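
Because a multiple-choice value arrives as a single string, analysts usually need to split it back into individual choice identifiers. Here is a minimal Python sketch, assuming the serialization always looks like the sample above (square brackets, single quotes, semicolon separators). The function name parse_identifier_list is hypothetical, and other cardinalities may be serialized differently.

```python
def parse_identifier_list(value: str) -> list[str]:
    """Split a serialized value such as "['austria'; 'denmark']" into identifiers."""
    text = value.strip()
    # Drop the surrounding square brackets when they are present
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    # Split on ';' and remove whitespace and quotes around each identifier
    parts = (part.strip().strip("'\"") for part in text.split(";"))
    # Discard empty fragments (for example, from an empty selection "[]")
    return [part for part in parts if part]


selected = parse_identifier_list("['austria'; 'denmark']")
```

Once the identifiers are separated, each one can be counted or compared on its own, which is only useful if the identifiers themselves are meaningful.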

3. Why this matters for cross-item analysis

This becomes critical when multiple items are supposed to represent the same thing across forms, grades, or languages.

Sometimes the same item cannot be reused as-is. There may be a translation, a slight wording adaptation for another grade, or a parallel item built for another population. In those cases, analysts will often still want to correlate or reconcile the responses across items.

That only works if the stored values are aligned.

If two items are intended to measure the same construct, but one item stores:

A, B, C

and the other stores:

choice_1, choice_2, choice_3

then the data may still be technically extractable, but reconciliation becomes manual, error-prone, and expensive.

If the items cannot be reused, then at least the choice identifiers should correspond across the different items whenever the underlying meaning is the same. Otherwise, downstream correlation becomes close to impossible.
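
To make the reconciliation cost concrete, here is a Python sketch of the mapping tables an analyst would have to maintain by hand when two parallel items use unrelated choice identifiers. The item IDs, the shared codes opt_1 to opt_3, and the assumption that A corresponds to choice_1 are all hypothetical. In a real project, someone would have to verify each pairing against the item content.

```python
# Hypothetical hand-maintained mapping: one table per item, translating its
# own choice identifiers into a shared code. Every pairing is an assumption
# that must be checked against the actual item content.
TO_SHARED_CODE = {
    "item_form_a": {"A": "opt_1", "B": "opt_2", "C": "opt_3"},
    "item_form_b": {"choice_1": "opt_1", "choice_2": "opt_2", "choice_3": "opt_3"},
}


def reconcile(item_id, stored_value):
    """Translate an item-specific choice identifier into the shared code."""
    try:
        return TO_SHARED_CODE[item_id][stored_value]
    except KeyError:
        # Unknown items or values are reported instead of silently dropped
        raise ValueError(f"No mapping for {stored_value!r} in item {item_id!r}") from None


# Two differently named values only become comparable after translation
same_choice = reconcile("item_form_a", "B") == reconcile("item_form_b", "choice_2")
```

Every new parallel item adds another table to maintain. Aligned choice identifiers make the tables unnecessary.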

This is the core message of the guide: authoring choices become data model choices.

4. Naming strategies for item variables

There is no single perfect naming convention. The right approach depends on whether the priority is technical stability, human readability, or cross-item comparability.

Here are four workable approaches.

TAO Default behavior

TAO uses RESPONSE as the standard response variable in its exported data examples, and generic choice identifiers such as choice_1, choice_2, and so on. The default pattern is therefore functional, but not very meaningful for reporting.

Item ID: item-1
Response ID: RESPONSE

Item content:
Question: Which fraction is equivalent to 1/2?

Choices:

choice_1 → 2/4
choice_2 → 3/4
choice_3 → 1/3
choice_4 → 4/5

Stored response if learner selects 2/4:
responses.RESPONSE = choice_1

A. Purely technical

Item ID: MATH_001
Response ID: MATH_001

Item content:
Question: Which fraction is equivalent to 1/2?

Choices:

MATH_001_A → 2/4
MATH_001_B → 3/4
MATH_001_C → 1/3
MATH_001_D → 4/5

Stored response if learner selects 2/4:
responses.MATH_001 = MATH_001_A

Advantages

easy to generate systematically
low risk of collision
easy to trace back to a specific item

Limit

not very readable for analysts
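
The "easy to generate systematically" point can be sketched in a few lines of Python. The helper below derives the response identifier and the choice identifiers from the item ID, following the MATH_001 pattern shown above. The function name technical_identifiers and the limit of 26 choices are assumptions for this example. It is not a TAO feature.

```python
import string


def technical_identifiers(item_id: str, choice_count: int) -> dict:
    """Build a response identifier and lettered choice identifiers from an item ID."""
    if not 1 <= choice_count <= len(string.ascii_uppercase):
        raise ValueError("choice_count must be between 1 and 26")
    letters = string.ascii_uppercase[:choice_count]
    return {
        # The response identifier simply reuses the item ID
        "response_id": item_id,
        # Choice identifiers are the item ID plus a letter suffix: _A, _B, ...
        "choice_ids": [f"{item_id}_{letter}" for letter in letters],
    }


ids = technical_identifiers("MATH_001", 4)
```

Because every identifier embeds the item ID, collisions are unlikely, but values from two different items never match each other, even when the choices mean the same thing.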

B. Human-readable

Item ID: reading-main-idea-grade4
Response ID: main_idea

Item content:
Question: What is the main idea of the passage?

Choices:

main_idea → The passage explains why bees are important for plants.
supporting_detail → Bees can travel long distances.
irrelevant_detail → Some flowers are yellow.
too_narrow → One farmer noticed fewer apples one year.

Stored response if learner selects the correct answer:
responses.main_idea = main_idea

Best when exports are reviewed directly by analysts or business users.

Advantages

easier to understand in raw exports
easier to explain to customers

Limit

requires discipline to keep names short and consistent

C. Hybrid

Item ID: RDG_G4_014
Response ID: main_idea

Item content:
Question: What is the author’s main message in this text?

Choices:

mi_correct → Recycling helps reduce waste and protect resources.
mi_example → One student started recycling at school.
mi_detail → Paper can be sorted into different bins.
mi_offtopic → Some bins are blue and others are green.

Stored response if learner selects the correct answer:
responses.main_idea = mi_correct

Usually the most practical option.

D. Cross-item anchored

This is the most useful approach when different items should still produce comparable values.

English item

Item ID: ENG_G4_021
Response ID: agreement_level

Item content:
Question: Reading every day helps me understand stories better.

Choices:

strongly_agree → Strongly agree
agree → Agree
disagree → Disagree
strongly_disagree → Strongly disagree

Stored response if learner selects “Agree”:
responses.agreement_level = agree

French item

Item ID: FRA_G4_021
Response ID: agreement_level

Item content:
Question: Lire chaque jour m’aide à mieux comprendre les histoires.

Choices:

strongly_agree → Tout à fait d’accord
agree → D’accord
disagree → Pas d’accord
strongly_disagree → Pas du tout d’accord

Stored response if learner selects “D’accord”:
responses.agreement_level = agree

This is exactly the kind of setup that makes cross-language reconciliation possible, because the displayed text changes, but the stored values remain aligned.
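
With aligned identifiers, pooling the English and French responses needs no translation step. The Python sketch below counts stored values across both items. The flat row layout (item ID, response identifier, stored value) and the sample rows are assumptions about how an analyst might have flattened the extract, not a TAO format.

```python
from collections import Counter

# Hypothetical flattened rows: (item ID, response identifier, stored value)
rows = [
    ("ENG_G4_021", "agreement_level", "agree"),
    ("ENG_G4_021", "agreement_level", "strongly_agree"),
    ("FRA_G4_021", "agreement_level", "agree"),
    ("FRA_G4_021", "agreement_level", "disagree"),
]


def pooled_distribution(rows, response_id):
    """Count stored values for one response identifier across all items."""
    return Counter(value for _item_id, rid, value in rows if rid == response_id)


pooled = pooled_distribution(rows, "agreement_level")
```

The item ID is ignored entirely here, which works only because both items store the same codes for the same meaning.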

5. A practical rule of thumb

A useful way to think about this is:

metadata tells you what the item is about
response identifiers tell you where to find the answer
choice identifiers tell you what answer was given

If reporting will only ever happen item by item, almost any naming scheme can work.

But if there is any possibility that results from several items will later need to be grouped, compared, merged, or correlated, then variable naming must be designed deliberately from the beginning.
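
One way to apply this rule early is to check identifier alignment before items go live. The Python sketch below flags response identifiers whose items do not share the same set of choice identifiers. It assumes that items sharing a response identifier are meant to be comparable, and that an inventory of identifiers per item is available. The dictionary layout and the function name find_misaligned are hypothetical.

```python
def find_misaligned(items):
    """Return {response_id: [item IDs]} where choice identifier sets differ.

    items maps an item ID to {"response_id": str, "choice_ids": [str, ...]}.
    """
    by_response = {}
    for item_id, spec in items.items():
        # Group the choice identifier sets by shared response identifier
        by_response.setdefault(spec["response_id"], {})[item_id] = frozenset(spec["choice_ids"])
    problems = {}
    for response_id, members in by_response.items():
        # More than one distinct set means the items are not aligned
        if len(set(members.values())) > 1:
            problems[response_id] = sorted(members)
    return problems


# Hypothetical inventory of authored items
likert = ["strongly_agree", "agree", "disagree", "strongly_disagree"]
inventory = {
    "ENG_G4_021": {"response_id": "agreement_level", "choice_ids": likert},
    "FRA_G4_021": {"response_id": "agreement_level", "choice_ids": likert},
    "MATH_001": {"response_id": "fraction_half", "choice_ids": ["A", "B", "C", "D"]},
    "MATH_002": {"response_id": "fraction_half",
                 "choice_ids": ["choice_1", "choice_2", "choice_3", "choice_4"]},
}
problems = find_misaligned(inventory)
```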
