Now in Beta

The Safety Standard for Embodied AI

Submit your robot policy. We simulate it, and our VLM judge delivers a verdict, with 94% correlation to human evaluators.

94% VLM Accuracy
3+ Scenarios
<5min Eval Time

How It Works

From submission to verdict in under 5 minutes. No infrastructure to manage.

01. Submit Your Policy

Upload your trained robot policy as a Python file or ZIP archive. We support MuJoCo-based policies.
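As a rough illustration, a single-file policy might look like the sketch below. The submission interface is not documented on this page, so the `Policy` class, the `act` method, the observation layout, and the action dimension are all hypothetical names chosen for the example. The controller itself is a plain proportional rule, not a trained model.

```python
from typing import List, Sequence


class Policy:
    """Hypothetical policy interface; the real submission contract may differ."""

    def __init__(self, action_dim: int = 4, gain: float = 0.5, limit: float = 1.0):
        self.action_dim = action_dim  # assumed number of actuators
        self.gain = gain              # proportional gain
        self.limit = limit            # symmetric clip applied to each command

    def act(self, observation: Sequence[float]) -> List[float]:
        """Map an observation to one clipped command per actuator."""
        action = []
        for i in range(self.action_dim):
            # Treat missing observation entries as zero (an assumption).
            value = observation[i] if i < len(observation) else 0.0
            # Drive each observed value toward zero.
            command = -self.gain * value
            action.append(max(-self.limit, min(self.limit, command)))
        return action
```

Keeping the policy behind one small method like this makes it easy to exercise locally with hand-written observations before uploading.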

02. We Simulate

Your policy runs in our cloud MuJoCo environment. We record video and capture detailed metrics.

03. VLM Judges

Our Vision-Language Model analyzes the video and delivers a verdict with detailed reasoning.

terminal
# Install the CLI
$ pip install botarena
# Submit your policy
$ botarena submit my_policy.py --scenario messy_room_v1
Uploading... Done!
Simulating... 45s
Judging... Done!
Verdict: PASS (confidence: 0.92)
Video: https://botmanifold.com/submissions/abc123
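If you want to use the verdict in a script, the output shown above can be parsed into a small record. This is a minimal sketch that assumes the CLI prints exactly the `Verdict:` and `Video:` lines from the sample session. The `FAIL` label, the `SubmissionResult` type, and the `parse_cli_output` helper are assumptions for illustration and are not part of the `botarena` package.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Patterns mirror the sample session; the real output format may change.
VERDICT_RE = re.compile(r"^Verdict:\s*(PASS|FAIL)\s*\(confidence:\s*([0-9]*\.?[0-9]+)\)$")
VIDEO_RE = re.compile(r"^Video:\s*(\S+)$")


@dataclass
class SubmissionResult:
    verdict: str               # "PASS", or "FAIL" (assumed label)
    confidence: float          # value printed in parentheses
    video_url: Optional[str]   # None if no Video line was printed


def parse_cli_output(text: str) -> SubmissionResult:
    """Extract the verdict, confidence, and video URL from captured CLI output."""
    verdict = None
    confidence = None
    video_url = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        verdict_match = VERDICT_RE.match(line)
        if verdict_match:
            verdict = verdict_match.group(1)
            confidence = float(verdict_match.group(2))
            continue
        video_match = VIDEO_RE.match(line)
        if video_match:
            video_url = video_match.group(1)
    if verdict is None or confidence is None:
        raise ValueError("no Verdict line found in CLI output")
    return SubmissionResult(verdict, confidence, video_url)
```

Raising on a missing verdict line, rather than returning a default, keeps a script from silently treating a failed upload as a result.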

Why BotArena?

The only platform that combines simulation, VLM evaluation, and community competition.

94% Human Correlation

VLM-Powered Judging

Our Vision-Language Model analyzes simulation videos like a human expert. Research shows VLM judges achieve 94% correlation with human evaluators.
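The page does not say how the 94% figure is measured. As one possible reading, the sketch below computes two common statistics over paired pass/fail labels: the plain agreement rate, and the phi coefficient (the Pearson correlation for binary labels). The function names and the 0/1 encoding are assumptions for illustration, not the method behind the quoted number.

```python
from math import sqrt
from typing import Sequence


def _check(vlm: Sequence[int], human: Sequence[int]) -> None:
    if len(vlm) != len(human) or len(vlm) == 0:
        raise ValueError("need two non-empty label lists of equal length")


def agreement_rate(vlm: Sequence[int], human: Sequence[int]) -> float:
    """Fraction of episodes where the VLM verdict equals the human verdict."""
    _check(vlm, human)
    return sum(v == h for v, h in zip(vlm, human)) / len(vlm)


def phi_coefficient(vlm: Sequence[int], human: Sequence[int]) -> float:
    """Pearson correlation between two binary label lists (1 = pass, 0 = fail)."""
    _check(vlm, human)
    pairs = list(zip(vlm, human))
    n11 = sum(1 for v, h in pairs if v == 1 and h == 1)  # both pass
    n10 = sum(1 for v, h in pairs if v == 1 and h == 0)  # VLM pass, human fail
    n01 = sum(1 for v, h in pairs if v == 0 and h == 1)  # VLM fail, human pass
    n00 = sum(1 for v, h in pairs if v == 0 and h == 0)  # both fail
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("correlation is undefined when either list is constant")
    return (n11 * n00 - n10 * n01) / denom
```

The two numbers can differ noticeably: on imbalanced data, a judge that almost always says PASS can reach a high agreement rate while showing little correlation with human verdicts.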

Real-time Rankings

Public Leaderboard

Compete with the community. See how your policy stacks up against others. Track your progress over time.

3+ Scenarios

Multiple Scenarios

Test your policy across diverse challenges: household tasks, manipulation, navigation, and more.

AI-Generated Reports

Detailed Feedback

Get actionable insights. Our VLM explains why your policy passed or failed with specific observations.

<5min Eval Time

Cloud Infrastructure

No GPUs to manage. No MuJoCo installations. Submit from anywhere and get results in minutes.

Free To Start

Open Community

Share results, learn from others, and improve together. Join our Discord to connect with fellow roboticists.

The Platform Vision

BotArena is just the beginning. We're building the verification layer for robot AI.

Now Live

BotArena

The Benchmark

Submit policies, get VLM verdicts, compete on the leaderboard.

Policy submission, VLM judging, Public leaderboard, Video results

Q3 2026

BotRegistry

Hugging Face for Robots

A model registry designed for embodied AI with robot-specific metadata.

URDF metadata, Verification badges, 3D visualization, pip install

Q3 2026

BotCI

GitHub Actions for Robots

Continuous integration that tests your policy on every commit.

GitHub integration, Regression detection, PR comments, Team dashboards

2027

BotCertify

The Safety Standard

Enterprise safety certification for production robot deployments.

Compliance reports, Audit trails, Safety scores, ISO alignment

“When a company deploys a robot to a factory, the insurer asks: What's the BotCertify Score?”

Our vision for 2027 and beyond