[Request] Seeking arXiv cs.AI endorsement — independent researcher, LLM metacognition benchmark (live Kaggle leaderboard, 8 frontier models, N=69 human panel)

oliveirarct · April 21, 2026, 12:10am

Hey folks!

I’m an independent AI researcher seeking an arXiv endorsement for the cs.AI category (cross-list: cs.CL, cs.LG, stat.ML). This is my first arXiv submission and I don’t have an institutional affiliation, so I need a personal endorsement from someone who has published in a related category.

About the paper

Title: “The Metacognitive Probe: Decomposing LLM Self-Knowledge into Five Measurable Dimensions”

The paper presents a 5-task diagnostic benchmark that decomposes LLM self-knowledge into five separately measurable dimensions: confidence calibration, epistemic vigilance, knowledge boundaries, calibration range, and reasoning-chain validation. Standard benchmarks (MMLU, BIG-Bench, HELM) measure what models know; this instrument measures what models know about what they know.

Headline finding: A 47-point within-model dissociation in Gemini 2.5 Flash — it achieves the panel’s best within-task calibration (T1-CC = 88) but the worst cross-task confidence prediction (T4-CR = 41). Flash reports confidence ≈ 100 on every factoid, including ones it gets wrong. This has direct implications for confidence-gated deployment systems.
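
To make the deployment concern concrete, here is a minimal Python sketch. It is a toy illustration only, not code or data from the benchmark: the `gate` and `error_rate` helpers, the threshold of 90, and both answer lists are hypothetical. It shows why a confidence gate stops filtering anything when a model reports 100 on every answer, right or wrong.

```python
def gate(answers, threshold=90):
    """Split answers into auto-accepted and escalated, by stated confidence.

    Hypothetical helper: `threshold` is an assumed cutoff, not a value from the paper.
    """
    accepted = [a for a in answers if a["confidence"] >= threshold]
    escalated = [a for a in answers if a["confidence"] < threshold]
    return accepted, escalated


def error_rate(answers):
    """Fraction of answers that are wrong (0.0 for an empty list)."""
    if not answers:
        return 0.0
    return sum(not a["correct"] for a in answers) / len(answers)


# Toy data (assumed): a model that says 100 on everything, including its errors.
flat = [{"confidence": 100, "correct": c} for c in (True, True, False, True, False)]

# Toy data (assumed): same right/wrong pattern, but confidence tracks correctness.
graded = [
    {"confidence": 95, "correct": True},
    {"confidence": 92, "correct": True},
    {"confidence": 60, "correct": False},
    {"confidence": 97, "correct": True},
    {"confidence": 55, "correct": False},
]

flat_accepted, flat_escalated = gate(flat)
graded_accepted, graded_escalated = gate(graded)

# Flat confidence: nothing is escalated, so every error passes the gate.
print(len(flat_escalated), error_rate(flat_accepted))
# Graded confidence: the two wrong answers are escalated for review.
print(len(graded_escalated), error_rate(graded_accepted))
```

With the flat-confidence list the gate accepts all five answers, so the accepted set keeps the full error rate. With the graded list the same gate escalates both wrong answers. Real systems would differ in many ways; the sketch only isolates the failure mode described above.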

The benchmark is evaluated on 8 frontier models (Claude Opus/Sonnet, Gemini Pro/Flash, DeepSeek-R1, GLM-5, Qwen 3, Gemma 3) and a human calibration panel (N=69). All code, data, prompts, and scoring rubrics are publicly released.

Verifiable materials

Live Kaggle benchmark: https://www.kaggle.com/benchmarks/rctoliveira/metacognitive-probe-measuring-llm-self-awareness
Google DeepMind Hackathon entry (Measuring Progress Toward AGI — Cognitive Abilities track)
Happy to share the full PDF privately before you decide

Endorsement details

Category: cs.AI (primary), cross-list cs.CL, cs.LG, stat.ML
Endorsement code: I4G6HG
To endorse, the endorser needs to have submitted 3+ papers to any cs.* category on arXiv within the last 5 years

If you’re an active arXiv author in any of these categories and willing to help, I’d really appreciate it. The endorsement takes about 30 seconds — just clicking a link and confirming. I’m happy to send you the paper first if you’d like to review it.

Thanks for your time!

Rafael Oliveira
