Marnie — Analytics

Data source

Production

From

Run type

Analytics

Research

Total Runs

—

Completed

—

Total Tokens

—

Avg Loops

—

Convergence Rate

—

Avg Runtime (seconds)

—

Runs per day (30 days)

By Condition

Condition	Runs	Avg Tokens	Conv.%

By Genre

Genre	Runs

Rows

Columns

Measure

Filter


B2 — Inter-Rater Reliability tracking. Upload sampled critique labels (agent vs. human) to compute Cohen's κ.

Batch ID	n Critiques	Cohen's κ	Agreement Rate	Date

Critique ID	Title	Agent Label	Human Label	Agreement	Rater
Select a batch.
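
For reference, the κ and agreement-rate columns in the B2 table can be reproduced from a batch of paired labels. The sketch below uses the standard Cohen's κ formula; the function name and the example labels are illustrative assumptions, not taken from the dashboard's own code.

```python
from collections import Counter


def cohens_kappa(agent_labels, human_labels):
    """Return Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(agent_labels) != len(human_labels):
        raise ValueError("both raters must label the same critiques")
    n = len(agent_labels)
    if n == 0:
        raise ValueError("at least one labelled critique is required")

    # Observed agreement: share of critiques where both raters chose the same label.
    observed = sum(a == h for a, h in zip(agent_labels, human_labels)) / n

    # Chance agreement: for each label, the product of the two raters' marginal rates.
    agent_counts = Counter(agent_labels)
    human_counts = Counter(human_labels)
    expected = sum(
        agent_counts[label] * human_counts[label] for label in agent_counts
    ) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined here.
        return float("nan")
    return (observed - expected) / (1.0 - expected)


# Hypothetical batch of four critiques (labels are made up for illustration).
agent = ["accept", "accept", "reject", "reject"]
human = ["accept", "accept", "reject", "accept"]
kappa = cohens_kappa(agent, human)
```

In this made-up batch the raters agree on three of four critiques (agreement rate 0.75) and chance agreement is 0.5, so κ is 0.5. κ is lower than the raw agreement rate because it discounts agreement expected by chance.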

B3 — Statistical test results. Enter results from Wilcoxon signed-rank or Mann-Whitney U tests. Results auto-format for paper citation.

Test	Conditions	Metric	Statistic	p-value	Effect (r)	95% CI	n A / n B	Citation
No results yet.
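
The page does not say how the effect size r is derived or what the auto-formatted citation looks like. As a rough sketch only, the code below assumes the common convention r = |z| / √N and an APA-like string; `effect_size_r`, `format_p`, `format_citation`, and the example numbers are all hypothetical.

```python
import math


def effect_size_r(z, n_total):
    """Effect size r from a z-score, using the convention r = |z| / sqrt(N)."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return abs(z) / math.sqrt(n_total)


def format_p(p_value):
    """Format a p-value without a leading zero; very small values become 'p < .001'."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p_value < 0.001:
        return "p < .001"
    return "p = " + f"{p_value:.3f}".lstrip("0")


def format_citation(test, statistic, p_value, effect_r, n_a, n_b):
    """Build a one-line citation string for a nonparametric test result."""
    # Assumed symbols: W for Wilcoxon signed-rank, U for Mann-Whitney U.
    symbols = {"wilcoxon": "W", "mann-whitney": "U"}
    if test not in symbols:
        raise ValueError(f"unsupported test: {test}")
    return (
        f"{symbols[test]} = {statistic:g}, {format_p(p_value)}, "
        f"r = {effect_r:.2f}, n = {n_a}/{n_b}"
    )


# Made-up values, shown only to illustrate the output shape.
line = format_citation("mann-whitney", 112.5, 0.0304, 0.41, 20, 20)
# line == "U = 112.5, p = .030, r = 0.41, n = 20/20"
```

A real formatter would also need the 95% CI column and whatever citation style the target venue requires.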

Datasets

Each dataset is a group of runs sharing a source label. Deleting a dataset removes all its runs.


Add or update runs

Paste a JSON array of run records (or upload a .json file). Existing run_ids are updated; new ones are inserted. Set a dataset label to stamp every imported row with the same source.

Dataset label (optional)

Upload JSON
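
The update-or-insert behaviour described above can be sketched as follows. This is a minimal in-memory illustration, not the dashboard's implementation. Only `run_id` comes from the page; the `source` field name, the `condition` field, the `upsert_runs` helper, and the choice to merge fields into an existing record rather than replace it are assumptions.

```python
import json


def upsert_runs(existing, payload, dataset_label=None):
    """Merge a JSON array of run records into `existing`, keyed by run_id.

    Returns a (inserted, updated) tuple of counts.
    """
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("payload must be a JSON array of run records")

    inserted = updated = 0
    for record in records:
        if not isinstance(record, dict) or "run_id" not in record:
            raise ValueError("every run record needs a run_id")
        row = dict(record)
        if dataset_label:
            # Assumed field name: stamp each imported row with the dataset label.
            row["source"] = dataset_label
        run_id = row["run_id"]
        if run_id in existing:
            # Assumed merge semantics: overwrite only the fields that were supplied.
            existing[run_id].update(row)
            updated += 1
        else:
            existing[run_id] = row
            inserted += 1
    return inserted, updated


# Hypothetical store with one existing run and a two-record import.
store = {"r1": {"run_id": "r1", "condition": "A"}}
payload = '[{"run_id": "r1", "condition": "B"}, {"run_id": "r2", "condition": "A"}]'
counts = upsert_runs(store, payload, dataset_label="pilot")
```

Here `r1` is updated and `r2` is inserted, so `counts` is `(1, 1)`, and both rows carry the `pilot` label.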

Browse & remove

	Run ID	Dataset	Condition	Lead	Manuscript	A	C	I
Search to list runs.

Download a fully self-contained study report (works offline). Source is locked to Testing for study exports.

Run counts by condition (testing)

Completed runs from your admin account. Select one or more runs, choose a run type, and assign them to the analytics dashboard.

Select all

	Run	Date	Loops	Assignment	Actions

Assign selected: