Kidoio Methodology

Behavioural Measurement for Children's Digital Content

Version 0.1 · April 2026 · Draft for advisory review

Studio age ratings describe what children's content forbids. They do not describe what it teaches, what behaviour it models, what it rewards, or which developmental band the content is genuinely appropriate for. Kidoio is a measurement layer that produces, for any video intended for child audiences, a structured behavioural profile and a recommended developmental age band — anchored only on frameworks with strong empirical support, with contested frameworks explicitly excluded from the scoring engine.

Contents

  1. Purpose and scope
  2. The measurement gap
  3. Taxonomy v1
  4. Evidence-quality tiering
  5. The scoring engine
  6. Recommended age bands
  7. Limitations
  8. Versioning & governance
  9. Acknowledgements
  10. References

1. Purpose and scope

1.1 What Kidoio measures

Kidoio analyses children's digital content — initially video, with games and interactive media to follow — and produces:

1.2 What Kidoio does not measure

Kidoio is a measurement of content, not children. It does not:

These boundaries are not stylistic; they are load-bearing. Crossing them would put Kidoio in regulatory and methodological territory it is not built for.

1.3 Audience

Three audiences are served simultaneously: platforms (streaming services, content libraries, kids' EdTech) for pre-release content audit and compliance documentation; regulators and policy bodies as measurement infrastructure for emerging child-safety obligations under DSA Article 28, the UK Online Safety Act, COPPA reauthorisation, and child provisions of the EU AI Act; and researchers as structured stimulus metadata for downstream developmental, neuroscientific, and educational research.


2. The measurement gap

Existing classification systems for children's content fall into three categories. Each is necessary; none is sufficient.

Studio and statutory age ratings (PEGI, ESRB, BBFC, MPAA, Common Sense Media age levels) describe forbidden content with reasonable accuracy: graphic violence, sexual content, profanity. They do not describe what behaviour the content models, what reward structure it uses, or how it paces cognitive load. A 7+ animated series can pass its age rating while explicitly rewarding physical aggression with status elevation in every conflict. A 3+ preschool show can pass its age rating while consisting of rapid-cut sequences with near-zero narrative dependency. Both situations occur in widely distributed contemporary children's content.

Editorial reviews (Common Sense Media, IMDb Parents Guide, parent-led blogs) provide context that ratings do not. They are valuable, but slow and expensive per title, and they depend on reviewer judgement that varies between titles and over time. They do not scale to a streaming catalogue of tens of thousands of titles, and they cannot document compliance with regulation that requires platform-level content stewardship.

Trust & Safety tooling (Bark, Qustodio, Net Nanny, Kidas, SuperAwesome) addresses the runtime side: filtering, monitoring child usage, blocking inappropriate apps. These are valuable. They do not analyse what content is teaching — they treat content as opaque.

The missing layer is structural measurement of content itself. What does this video model? What does it reward? How is it paced? For what developmental band is its profile appropriate? Kidoio is that layer.


3. Taxonomy v1

3.1 Behavioural dimensions

Each dimension is described below alongside its Tier 1 anchor: the framework on which the corresponding scoring rule rests.

3.1.1 Empathy and emotion-naming

Observable moments where characters acknowledge another's emotional state, express concern about harm, apologise, or engage in emotional repair. Particular attention to whether emotions are named (vocabulary) and whether mixed or self-conscious emotions are handled appropriately for the content's apparent target band.

Tier 1 anchor: RULER (Brackett, Yale Center for Emotional Intelligence). RULER has cluster RCT evidence in 62 schools / 3,824 students [Hagelskamp et al., 2013] and is listed as evidence-based by CASEL. Gottman emotion coaching reinforces the construct.

Orienting frame in copy only: Selman's perspective-taking stages.

3.1.2 Aggression

Observable instances of physical force, verbal threats, intimidation, and destructive action. Cartoon-stylised aggression and live-action aggression are distinguished, because the latter belongs in the content-maturity layer. Within the behavioural layer, aggression is evaluated on three sub-questions: how much is present, how the narrative reacts to it (rewarded, neutral, punished), and what alternative responses are modelled.

Tier 1 anchors: Bandura's social learning theory (observational acquisition of behaviour is among the most replicated findings in developmental psychology); Tremblay's developmental aggression curve, which calibrates expectations: contrary to common assumption, physical aggression peaks at 24-36 months and then declines.

3.1.3 Cooperation versus dominance

For each conflict, the resolution mode is classified — collaboration, hierarchical authority, domination, or unresolved. The aggregate distribution characterises whether the content is cooperation-dominant, balanced, or dominance-dominant.
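The aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not Kidoio's published rule: the function name `classify_conflict_profile`, the 0.2 margin, and the choice to treat hierarchical authority and unresolved conflicts as neutral are all hypothetical.

```python
from collections import Counter

# The four resolution modes named in the taxonomy.
RESOLUTION_MODES = ("collaboration", "hierarchical_authority", "domination", "unresolved")


def classify_conflict_profile(modes, margin=0.2):
    """Classify a title from the per-conflict resolution modes.

    `margin` is a hypothetical threshold: the gap between the collaboration
    share and the domination share needed to call the content one-sided.
    """
    counts = Counter(modes)
    unknown = set(counts) - set(RESOLUTION_MODES)
    if unknown:
        raise ValueError(f"unknown resolution modes: {sorted(unknown)}")

    total = sum(counts.values())
    if total == 0:
        return "no_conflicts_observed"

    # Assumption: authority-based and unresolved conflicts count toward the
    # total but favour neither side.
    coop_share = counts["collaboration"] / total
    dom_share = counts["domination"] / total

    if coop_share - dom_share >= margin:
        return "cooperation-dominant"
    if dom_share - coop_share >= margin:
        return "dominance-dominant"
    return "balanced"
```

Because the rule works on shares rather than raw counts, a long episode and a short one with the same mix of resolutions receive the same label.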

Tier 1 anchors: Bandura modelling reinforced by Gottman's longitudinal work on conflict patterns and child outcomes.

Orienting frame in copy only: Kohlberg's moral reasoning stages.

3.1.4 Conflict resolution style

The methods by which conflicts conclude: dialogue, hierarchical authority, physical force, avoidance, or no resolution. Particular attention to the presence of repair attempts after rupture — whether characters acknowledge harm and reconnect.

Tier 1 anchors: Gottman emotion coaching and the related "four horsemen" framework (criticism, contempt, defensiveness, stonewalling); Greene's Collaborative & Proactive Solutions [multiple RCTs reducing externalising behaviour, ~50% of treated youth diagnosis-free post-treatment] for the framing of behaviour as lagging skill rather than moral failure.

3.1.5 Reward framing

What behaviour does the narrative reward? Praise, success, status elevation, material gain, social approval — and whether these follow prosocial behaviour, neutral behaviour, or aggressive/dominance-based behaviour. Particular attention to whether reward is for process (effort, strategy, persistence) or for outcome (winning, talent, innate ability).

Tier 1 anchor: Bandura modelling — reinforcement schedules visible to observers shape acquisition.

Drop: Dweck's "growth mindset" intervention construct. Macnamara & Burgoyne (2023) meta-analysis of 63 studies / ~98,000 participants reported overall effect d = 0.05 (n.s. after publication-bias correction); the highest-quality subset returned d = 0.02. The behaviour of praising process is preserved in copy and feature design (it does no harm and aligns with self-determination theory), but mindset science is not cited as a scoring anchor.

3.1.6 Cognitive engagement

The structural characteristics of pacing and narrative dependency. Four sub-signals are evaluated: stimulus density (rate of audiovisual change), repetition, narrative dependency (whether later scenes require understanding of earlier ones), and cognitive demand (markers of cause-effect, intent tracking, problem-solving). A specific structural pattern — high stimulus density combined with low narrative dependency — is labelled as a brain-rot pattern and treated as distinct from intentional simplicity in early-learning content.
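A minimal sketch of how the high-density, low-dependency pattern might be flagged. The signal names, units, and both thresholds are assumptions chosen for illustration; the source does not state how stimulus density or narrative dependency are quantified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PacingSignals:
    cuts_per_minute: float       # hypothetical proxy for stimulus density
    narrative_dependency: float  # hypothetical 0..1 share of scenes that rely on earlier ones


def is_high_density_low_dependency(signals, density_threshold=20.0, dependency_threshold=0.2):
    """Flag the pattern only when BOTH conditions hold.

    Low dependency on its own is not flagged, which keeps intentionally
    simple early-learning content distinct from the pattern.
    """
    dense = signals.cuts_per_minute >= density_threshold
    disconnected = signals.narrative_dependency <= dependency_threshold
    return dense and disconnected
```

For example, `PacingSignals(cuts_per_minute=5.0, narrative_dependency=0.05)` describes slow, repetitive content and is not flagged, while the same dependency at 35 cuts per minute is.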

Tier 1 anchors: Vygotsky's zone of proximal development; AAP (2016, 2023) and WHO (2019) screen-time and digital-media guidance for the precautionary thresholds.

Orienting frames in copy only: Piaget's progression direction (literal stage ages explicitly not used as gating logic, because modern developmental psychology has revised those ages downward and made them more domain-specific). Greenspan's functional emotional milestones as reference frame.

3.1.7 Self-regulation modelling

Whether characters demonstrate strategies for emotion regulation — distraction, reframing, requesting support, naming the feeling — or whether they rely on pure willpower or external restraint.

Tier 1 anchor: Gottman emotion coaching's documented effects on regulatory outcomes.

Drop: Mischel's marshmallow-test predictive narrative. The Watts, Duncan & Quan (2018) replication, using a larger and more representative sample, found the predictive correlation to be roughly half the original size, and it shrank by two thirds once SES, family background, and early cognitive ability were controlled for. The strategies (distraction, reframing) remain valid as feature patterns; the predictive narrative around delay-of-gratification as a marker of future success is excluded from scoring.

3.1.8 Adult-voice / parenting style modelled

Where adult characters are present, what parenting voice is modelled: authoritative (warm + structured + reasoning), authoritarian (controlling, low warmth), permissive (warm, low structure), uninvolved, or shaming. The dimension is descriptive of the adult voice the content offers as an ambient model.

Tier 1 anchor: Baumrind's parenting-styles typology. Authoritative parenting is consistently linked to better child outcomes across multiple cultures and decades.

3.1.9 Body and autonomy framing

For content that addresses bodies, growth, sexuality, or physical autonomy: whether the language is anatomically correct or euphemistic; whether bodily autonomy is represented; whether forced sharing or forced apology is modelled as moral teaching; whether consent is depicted appropriately for the content's apparent band.

Tier 1 anchor: AAP guidance and Planned Parenthood clinical recommendations for age-appropriate body language and consent education.

3.2 Content-maturity layer

A separate layer tracks the kind of content present — distinct from how content models behaviour. Maturity themes: real (live-action) violence, on-screen death, drug use, alcohol and tobacco, sexual content, nudity, profanity, graphic imagery, disturbing themes (torture, abuse, kidnapping), crime glamorisation, political content, religious content. Each instance is timestamped, intensity-graded, and tagged with the age bands for which it is inappropriate.
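One way to picture a maturity-layer record is shown below. The field names, the integer intensity scale, and the helper `themes_flagged_for` are hypothetical; only the three properties (timestamped, intensity-graded, tagged with bands) come from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaturityInstance:
    theme: str                      # e.g. "profanity", "on_screen_death"
    timestamp_s: float              # position in the video, in seconds
    intensity: int                  # hypothetical scale, 1 (mild) to 3 (strong)
    inappropriate_bands: frozenset  # bands for which this instance is tagged inappropriate


def themes_flagged_for(instances, band):
    """List the distinct themes tagged as inappropriate for one band."""
    return sorted({inst.theme for inst in instances if band in inst.inappropriate_bands})
```

The record only documents what is present and for whom it is tagged; it carries no claim about effects on viewers.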

This layer does not attempt to model what such content "does" to viewers. It documents what is present.

3.3 Recommended age band

The recommended developmental age band is one of seven:

Band | Erikson tension (orienting frame) | Approximate age
Infancy | Trust vs. mistrust | 0–18 months
Toddlerhood | Autonomy vs. shame and doubt | 18 months – 3 years
Preschool | Initiative vs. guilt | 3–6 years
Early school | Industry vs. inferiority (early) | 6–9 years
Late childhood | Industry vs. inferiority (deepening) | 9–12 years
Early adolescence | Identity vs. role confusion (early) | 12–15 years
Late adolescence | Identity consolidation; intimacy vs. isolation begins | 15–18 years

Erikson's framework provides the labels; scoring does not depend on Erikson's predictive claims. Bands are also rendered as a single-number equivalent (9+, 12+, 16+) for compatibility with legacy rating systems.


4. Evidence-quality tiering

This section is the crux of the methodology and the principal artefact for advisory review.

4.1 The three-tier evidence framework

Frameworks are tiered by the quality and quantity of empirical evidence supporting their constructs and predictions, not by popularity, theoretical elegance, or familiarity to general audiences. This distinction is load-bearing: several frameworks widely cited in popular parenting culture (growth mindset, popular EQ, the marshmallow test's predictive narrative) have not survived rigorous replication, and Kidoio does not depend on them.

Tier 1 — Strong evidence

Frameworks with multiple RCTs, large meta-analyses, or robust cross-cultural replication. Used as load-bearing anchors in the scoring engine.

Framework | Best supporting evidence | Anchors
Bowlby & Ainsworth — Attachment | Van IJzendoorn & Kroonenberg (1988) meta-analysis, 32 studies, 8 countries; modern cross-cultural replications | Under-3 design principles
CASEL — Social & Emotional Learning | Durlak et al. (2011) meta-analysis: 213 studies, ~270,000 students, Hedges g = 0.27 on academic achievement; 2023 contemporary meta-analyses | Top-level taxonomy structure
Brackett — RULER | Hagelskamp et al. (2013) cluster RCT, 62 schools, 3,824 students; replicated across grades; CASEL-listed | Empathy and emotion-naming
Greene — CPS | Multiple RCTs reducing externalising behaviour; ~50% of treated youth diagnosis-free post-treatment vs 0% waitlist; California EBC listing | Conflict resolution; "lagging skill" framing
Gottman — Emotion coaching | Longitudinal work linking parental meta-emotion to child vagal tone, regulation, peer competence; meta-analyses of emotion-socialisation parenting | Conflict resolution; adult voice; self-regulation
Bandura — Social learning theory | Bobo doll plus replications; observational acquisition is among the most replicated findings in developmental psychology | Aggression; cooperation; reward framing
Tremblay — Developmental aggression | Large longitudinal cohorts; physical-aggression peak at 24-36 months replicates across countries | Calibration of expected aggression for early-childhood content
Vygotsky — ZPD & scaffolding | Decades of educational research; aligned with effective tutoring evidence | Cognitive engagement
Baumrind — Parenting styles | Authoritative parenting consistently linked to better outcomes across cultures and decades | Adult-voice / parenting-style
AAP & WHO — Screen-time | Practice guidelines synthesised from accumulated cohort and experimental evidence | Cognitive thresholds; under-3 content rules

Tier 2 — Moderate evidence

Useful theoretical frames with mixed empirical record. Used as orienting frames in copy and parent-facing language; not used as scoring gates.

Framework | Use in Kidoio
Erikson — Psychosocial stages | Labels for the seven age bands; copy and parent-facing framing
Piaget — Cognitive stages | Direction of progression in narrative description; literal stage ages excluded
Kohlberg — Moral reasoning | Story-design narrative framing for moral content by band
Selman — Perspective-taking | Empathy-content narrative description
Greenspan — DIR / Floortime | Reference frame for early-stage milestones
Siegel & Bryson — Whole-Brain Child | Practitioner moves cited in copy ("connect-then-redirect", "name it to tame it")
Bronfenbrenner — Ecological systems | Design lens for family-context features
Brazelton — Touchpoints | Parent-facing copy on developmental regressions

Tier 3 — Contested or weak evidence

Frameworks whose strongest empirical claims have not survived rigorous replication, or whose constructs have known measurement problems. Not used as scoring anchors. Where useful behavioural patterns from these frameworks survive, the patterns are preserved without the framework branding.

Framework | Issue | What Kidoio retains, if anything
Dweck — Growth mindset | Macnamara & Burgoyne (2023): 63 studies, ~98k participants; overall d = 0.05 (n.s.) after publication-bias correction; d = 0.02 in highest-quality subset | The behaviour of praising process over innate ability, in copy and feature design. Mindset branding is not used.
Mischel — Marshmallow predictive | Watts, Duncan & Quan (2018): predictive correlation halved with larger N; reduced by two thirds with SES controls | The strategies (distraction, reframing) as feature patterns. Delay-of-gratification predictive narrative is not used.
Goleman — popular EQ | Discriminant-validity problems; "mixed EI" measures overlap heavily with personality and general intelligence | The practical taxonomy under CASEL's banner, not Goleman's. "EQ as magic capability" framing is not used.
Haidt / Twenge — smartphones cause teen mental-illness epidemic | Most rigorous large preregistered studies (Odgers, Orben, Przybylski) show small associations indistinguishable in size from mundane correlates | The well-evidenced specifics (no algorithmic feeds <16, no public follower counts <16, no infinite scroll, no pornographic-content exposure <14, no loot-box mechanics). The blanket causal claim is not used.

4.2 Why this matters

A measurement methodology cited by regulators, platforms, and researchers must be defensible against reviewers who know the evidence-quality literature. Anchoring scoring on Tier 3 frameworks would invite avoidable methodological challenge. Anchoring on Tier 1, citing Tier 2 only as orienting language, and explicitly excluding Tier 3 from the scoring engine produces a methodology that survives scrutiny.

This tiering is not static. As the evidence base evolves — new RCTs, new failed replications — frameworks will move between tiers, and the methodology version will be incremented accordingly.


5. The scoring engine

5.1 Architecture

Content analysis flows in three stages:

  1. Extraction. A multimodal foundation model (currently Gemini 1.5/2.5) processes the source content end-to-end and produces a strictly-structured JSON payload enumerating observable instances within each behavioural and maturity dimension. Each instance includes a timestamp, a description, and metadata fields appropriate to the dimension. The model is constrained to enumerate, not to judge: prompts explicitly forbid evaluative or diagnostic language.
  2. Scoring. A deterministic rule engine maps the extraction payload into categorical scores per dimension and an overall recommended age band. Rules are documented, versioned, and traceable: every score derives from a stated rule applied to a stated count or pattern. Scores do not "emerge from a model"; they are computed.
  3. Explanation. A plain-language summary is generated, citing the specific instances that drove each score. The explanation may be model-generated or template-rendered; in both cases it is constrained to refer only to instances present in the extraction payload.
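The three stages can be pictured with a toy example. Everything named here is an assumption made for illustration: the payload shape, the `narrative_reaction` field, the majority rule, and the rule identifier are hypothetical, and the extraction stage is a hard-coded stub rather than a call to any model.

```python
# Stage 1 (stubbed): stands in for the model's structured extraction payload.
payload = {
    "aggression": [
        {"id": "a1", "timestamp_s": 42, "description": "character shoves a rival",
         "narrative_reaction": "rewarded"},
        {"id": "a2", "timestamp_s": 118, "description": "character shouts a threat",
         "narrative_reaction": "punished"},
        {"id": "a3", "timestamp_s": 305, "description": "character breaks a toy",
         "narrative_reaction": "rewarded"},
    ]
}


def score_aggression(payload):
    """Stage 2: a deterministic rule over counts; no model is involved."""
    instances = payload.get("aggression", [])
    rewarded = [inst for inst in instances if inst["narrative_reaction"] == "rewarded"]
    if not instances:
        label = "none_observed"
    elif len(rewarded) * 2 > len(instances):  # hypothetical rule: strict majority rewarded
        label = "mostly_rewarded"
    else:
        label = "not_mostly_rewarded"
    return {
        "dimension": "aggression",
        "label": label,
        "rule": "aggression.rewarded_majority",  # hypothetical rule identifier
        "driving_ids": [inst["id"] for inst in rewarded],
    }


def explain(payload, score):
    """Stage 3: template-rendered summary citing only payload instances.

    Looking instances up by id raises KeyError if a score cites an
    observation that is not in the payload.
    """
    by_id = {inst["id"]: inst for inst in payload.get(score["dimension"], [])}
    cited = [f'{by_id[i]["timestamp_s"]}s {by_id[i]["description"]}' for i in score["driving_ids"]]
    evidence = "; ".join(cited) if cited else "no instances"
    return f'{score["dimension"]}: {score["label"]}. Driven by: {evidence}.'
```

Swapping the extraction model would change only how `payload` is produced; `score_aggression` and `explain` would be untouched, which is the point of the separation.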

This separation between extraction and scoring is load-bearing. The extraction model can be replaced as the foundation-model state of the art evolves; the scoring rules remain stable and continue to produce comparable scores across models.

5.2 Auditability

Every score in a Kidoio profile is traceable to: the taxonomy version under which the dimension was defined; the prompt version used to extract observations from source content; the scoring-rules version applied to those observations; the specific observations (with timestamps) that drove the score; the Tier 1 framework anchoring the rule.
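The five trace elements listed above map naturally onto a single record per score. The sketch below is one possible shape, with hypothetical field names and a hypothetical helper `missing_observations` that checks a trace against the extraction payload.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoreTrace:
    dimension: str
    score: str
    taxonomy_version: str
    prompt_version: str
    scoring_rules_version: str
    observation_ids: tuple  # ids of the timestamped observations that drove the score
    tier1_anchor: str       # framework anchoring the rule


def missing_observations(trace, payload_ids):
    """Return cited observation ids that are absent from the extraction payload.

    An empty list means every observation the score relies on can be inspected.
    """
    known = set(payload_ids)
    return [oid for oid in trace.observation_ids if oid not in known]
```

A reviewer could then open the rule named by the versions and the observations named by the ids, and contest either one.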

A reviewer who disagrees with a score can examine the underlying rule and the underlying observation, and adjudicate at either level. There is no "model said so" component in the scoring layer.

5.3 Determinism

The same extraction payload, scored under the same rule version, will produce the same scores. The extraction step depends on a stochastic foundation model and is therefore not fully deterministic, but its output schema is strict and its output is fully inspectable; reproducibility is empirical (run-twice agreement) rather than guaranteed by construction. This is an acknowledged limitation and a focus of ongoing measurement (see §7).
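Run-twice agreement could be measured in several ways; the source does not say which metric is used. The sketch below assumes one simple option: a Jaccard overlap between the instances of two runs, where instances are matched by dimension and a coarse timestamp bucket. The payload shape, the 5-second bucket, and both function names are hypothetical.

```python
def instance_keys(payload, bucket_s=5.0):
    """Reduce a payload to a set of (dimension, time-bucket) keys."""
    keys = set()
    for dimension, instances in payload.items():
        for inst in instances:
            keys.add((dimension, int(inst["timestamp_s"] // bucket_s)))
    return keys


def run_twice_agreement(run_a, run_b, bucket_s=5.0):
    """Jaccard overlap of two extraction runs: 1.0 means identical key sets."""
    a = instance_keys(run_a, bucket_s)
    b = instance_keys(run_b, bucket_s)
    if not a and not b:
        return 1.0  # two empty runs agree trivially
    return len(a & b) / len(a | b)
```

Fixed buckets are crude: two detections one second apart can land in different buckets at a boundary, so a production metric would more likely use tolerance-based matching.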


6. Recommended age bands

6.1 Derivation

The recommended age band is the lowest band for which the content's joint behavioural and maturity profile is appropriate, where "appropriate" is determined by the band-specific guidance synthesised from Tier 1 anchors and Tier 2 orienting frames as applied to each band.

The derivation is not a simple lookup against the maturity layer. A piece of content with no mature themes can still be recommended for an older band if its cognitive complexity, conflict patterns, or moral framing is mismatched with younger bands. Conversely, content with low cognitive complexity and no problematic behavioural patterns can be recommended for a younger band even if some mature elements (e.g. brief parental conflict shown with repair) are present, because the framing is appropriate for that band.
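Read literally, "the lowest band for which the profile is appropriate" is a first-match search over the ordered bands. The sketch below assumes each finding, behavioural or maturity, arrives as the set of bands it rules out; that representation and the function name `recommend_band` are hypothetical, and the band-specific guidance that would produce those sets is not modelled.

```python
# Bands in developmental order, as listed in the taxonomy.
BANDS = (
    "infancy", "toddlerhood", "preschool", "early_school",
    "late_childhood", "early_adolescence", "late_adolescence",
)


def recommend_band(findings):
    """Return the lowest band not ruled out by any finding, or None.

    `findings` is an iterable of sets; each set holds the bands for which
    one behavioural or maturity finding makes the content inappropriate.
    """
    excluded = set()
    for bands in findings:
        unknown = set(bands) - set(BANDS)
        if unknown:
            raise ValueError(f"unknown bands: {sorted(unknown)}")
        excluded |= set(bands)

    for band in BANDS:
        if band not in excluded:
            return band
    return None  # no band is appropriate under the supplied findings
```

A title with no mature themes but a cognitive-complexity finding that rules out the three youngest bands would come out as `early_school`, while a brief parental conflict shown with repair might contribute an empty set and change nothing.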

6.2 Band-by-band synthesis (summary)

A full per-band reference is published separately. In summary:

6.3 Output presentation

The recommended band is presented as both the band label ("late childhood 9–12 yr") and the single-number equivalent (9+) for legacy compatibility. Where a comparison rating is provided by the user, a comparison line is rendered: "Studio rated 7+; Kidoio recommends 9-12 yr (9+)." No comparison is rendered when no studio rating is provided; the recommendation stands on its own.
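The rendering rule is small enough to show directly. The comparison wording follows the example above; the wording used when no studio rating is supplied is an assumption, as is the function name.

```python
def render_recommendation(age_range, single_number, studio_rating=None):
    """Render the recommendation line, with a comparison only when a studio rating is given."""
    recommendation = f"Kidoio recommends {age_range} ({single_number})."
    if studio_rating is None:
        # Assumed standalone wording; the source only says no comparison is rendered.
        return recommendation
    return f"Studio rated {studio_rating}; {recommendation}"
```

Calling `render_recommendation("9-12 yr", "9+", studio_rating="7+")` reproduces the comparison line quoted above.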


7. Limitations

7.1 Empirical limitations

7.2 Methodological limitations

7.3 What we explicitly do not claim


8. Versioning & governance

8.1 Version pins

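Three version numbers travel with every Kidoio profile: the taxonomy version, the prompt version, and the scoring-rules version (the same three named in §5.2).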

A profile produced under a given version pin remains valid against that version regardless of subsequent changes. Old profiles are not silently re-scored. Where a buyer requires up-to-date scores, content is re-analysed under the current versions and the audit trail records both runs.
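The no-silent-re-scoring rule amounts to an append-only audit trail keyed by version pin. The sketch below assumes the pin consists of the taxonomy, prompt, and scoring-rules versions named in §5.2; the class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VersionPin:
    taxonomy: str
    prompt: str
    scoring_rules: str


@dataclass
class AuditTrail:
    content_id: str
    runs: list = field(default_factory=list)  # (pin, scores) pairs, oldest first

    def record(self, pin, scores):
        """Append a run; earlier runs are never modified or removed."""
        self.runs.append((pin, dict(scores)))

    def scores_under(self, pin):
        """Return the scores recorded under a given pin, or None if absent."""
        for run_pin, scores in self.runs:
            if run_pin == pin:
                return scores
        return None
```

Re-analysing a title under newer versions adds a second entry, so the profile produced under the original pin stays retrievable and unchanged.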

8.2 Governance of changes

Changes to taxonomy, prompts, or scoring rules go through internal review against the Tier 1 / Tier 2 / Tier 3 evidence framework; external advisory review; publication of the change with rationale, version increment, and effective date; and audit-trail update. Backward-incompatible changes that would alter scores under earlier versions require a major taxonomy version bump.


9. Acknowledgements

The Tier 1 / Tier 2 / Tier 3 evidence framework was developed for this methodology drawing on a structured reference compiled by the founder, with synthesis of the contemporary developmental-psychology literature. Errors of synthesis are the founder's; the underlying scholarship is the cited authors'.

This methodology is currently in advisory-review draft. Named advisors will be added in a subsequent revision following formal advisory engagement. Researchers interested in advisory engagement are invited to contact hello@kidoio.com.


10. References

Ainsworth, M., Blehar, M., Waters, E., & Wall, S. (1978). Patterns of Attachment.

Bandura, A. (1977). Social Learning Theory.

Baumrind, D. (1971). Current patterns of parental authority. Developmental Psychology Monographs, 4(1).

Bowlby, J. (1969–1980). Attachment and Loss (vols. 1-3).

Brackett, M. (2019). Permission to Feel.

Durlak, J., Weissberg, R., Dymnicki, A., Taylor, R., & Schellinger, K. (2011). The impact of enhancing students' social and emotional learning: a meta-analysis of school-based universal interventions. Child Development, 82(1), 405-432.

Erikson, E. (1950, 1968). Childhood and Society; Identity: Youth and Crisis. [Cited as orienting frame only.]

Goleman, D. (1995). Emotional Intelligence. [Cited as Tier 3 popular synthesis; not used as scoring anchor.]

Gottman, J. (1997). Raising an Emotionally Intelligent Child.

Greene, R. (1998, 2014). The Explosive Child; Lost at School.

Greenspan, S. (1997). The Growth of the Mind. [Cited as Tier 2 reference frame only.]

Hagelskamp, C., Brackett, M., Rivers, S., & Salovey, P. (2013). Improving classroom quality with the RULER approach. American Journal of Community Psychology, 51(3-4), 530-543.

Haidt, J. (2024). The Anxious Generation. [Precautionary specifics adopted; blanket causal claim not adopted.]

Kohlberg, L. (1981). The Philosophy of Moral Development. [Tier 2 orienting frame only.]

Macnamara, B., & Burgoyne, A. (2023). Do growth mindset interventions impact students' academic achievement? Psychological Bulletin, 149(3-4), 133-173.

Mischel, W., Ebbesen, E., & Zeiss, A. (1972). Cognitive and attentional mechanisms in delay of gratification. Journal of Personality and Social Psychology, 21(2), 204-218. [Replication: Watts, Duncan & Quan 2018.]

Owens, E., Behun, R., Manning, J., & Reid, R. (2012). The impact of internet pornography on adolescents: a review of the research. Sexual Addiction & Compulsivity, 19(1-2), 99-122.

Piaget, J. (1952, 1972). The Origins of Intelligence in Children. [Tier 2; literal stage ages not used.]

Selman, R. (1980). The Growth of Interpersonal Understanding. [Tier 2 only.]

Siegel, D., & Bryson, T. (2011). The Whole-Brain Child. [Tier 2 practitioner synthesis.]

Tremblay, R. E. (2000). The development of aggressive behaviour during childhood. International Journal of Behavioral Development, 24(2), 129-141.

Van IJzendoorn, M., & Kroonenberg, P. (1988). Cross-cultural patterns of attachment: a meta-analysis of the Strange Situation. Child Development, 59(1), 147-156.

Vygotsky, L. (1978). Mind in Society.

Watts, T., Duncan, G., & Quan, H. (2018). Revisiting the marshmallow test. Psychological Science, 29(7), 1159-1177.

American Academy of Pediatrics. (2016, 2023). Media Use in School-Aged Children and Adolescents.

CASEL. Social and Emotional Learning Competency Framework.

World Health Organization. (2019). Guidelines on Physical Activity, Sedentary Behaviour and Sleep for Children Under 5 Years of Age.