Now in Beta

The Safety Standard
for Embodied AI

Submit your robot policy, we simulate it, and our VLM judge delivers a verdict. VLM judges reach 94% correlation with human evaluators.

Try BotArena | View Leaderboard

94%

VLM Accuracy

Scenarios

<5min

Eval Time

How It Works

From submission to verdict in under 5 minutes. No infrastructure to manage.

Submit Your Policy

Upload your trained robot policy as a Python file or ZIP archive. We support MuJoCo-based policies.
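
The page does not spell out the submission interface beyond "a Python file or ZIP archive", so the following is only a sketch of what a single-file `my_policy.py` might look like. The `Policy` class, its `act` method, and the observation layout (joint positions followed by targets) are assumptions made for illustration, not the documented BotArena contract.

```python
"""my_policy.py: a hypothetical single-file policy (the interface is assumed)."""
from typing import List, Sequence


class Policy:
    """Proportional controller that drives each joint toward its target."""

    def __init__(self, gain: float = 2.0, max_action: float = 1.0) -> None:
        self.gain = gain
        self.max_action = max_action

    def act(self, observation: Sequence[float]) -> List[float]:
        # Assumed layout: first half = joint positions, second half = targets.
        if len(observation) % 2 != 0:
            raise ValueError("expected positions followed by targets")
        n = len(observation) // 2
        positions, targets = observation[:n], observation[n:]
        actions = []
        for pos, target in zip(positions, targets):
            command = self.gain * (target - pos)
            # Clip to the assumed actuator range [-max_action, max_action].
            actions.append(max(-self.max_action, min(self.max_action, command)))
        return actions
```

Keeping the policy stateless and free of heavy dependencies, as here, would make it easy to package either as a lone file or inside a ZIP archive alongside trained weights.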

We Simulate

Your policy runs in our cloud MuJoCo environment. We record video and capture detailed metrics.
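
To make "runs the policy and captures metrics" concrete, here is a minimal rollout loop. It is a sketch under stated assumptions: a toy 1-D point mass stands in for the MuJoCo scene, and the `RolloutMetrics` fields are hypothetical names, not the metrics BotArena actually records.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RolloutMetrics:
    steps: int             # simulation steps taken
    final_error: float     # distance to target at the end of the episode
    max_abs_action: float  # largest command magnitude the policy issued
    success: bool          # True if the final error is within tolerance


def rollout(policy: Callable[[List[float]], float],
            target: float = 1.0,
            dt: float = 0.01,
            max_steps: int = 2000,
            tolerance: float = 0.01) -> RolloutMetrics:
    """Run one episode of a toy 1-D point mass (a stand-in for a MuJoCo scene)."""
    position, velocity = 0.0, 0.0
    max_abs_action = 0.0
    steps = 0
    for steps in range(1, max_steps + 1):
        action = policy([position, velocity, target])
        max_abs_action = max(max_abs_action, abs(action))
        # Semi-implicit Euler: update velocity first, then position.
        velocity += action * dt
        position += velocity * dt
        # Stop early once the mass is at rest on the target.
        if abs(position - target) < tolerance and abs(velocity) < tolerance:
            break
    error = abs(position - target)
    return RolloutMetrics(steps, error, max_abs_action, error < tolerance)


def pd_policy(obs: List[float]) -> float:
    """PD controller: pull toward the target, damped by velocity."""
    position, velocity, target = obs
    return 10.0 * (target - position) - 6.0 * velocity


if __name__ == "__main__":
    print(rollout(pd_policy))
```

A real evaluation would replace the two integration lines with a physics step and would also render frames for the video, but the shape of the loop (observe, act, step, log) stays the same.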

VLM Judges

Our Vision-Language Model analyzes the video and delivers a verdict with detailed reasoning.
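
The verdict schema is not published on this page. As a sketch, a verdict with reasoning could be represented and validated as below. The JSON field names (`verdict`, `confidence`, `observations`) and the `FAIL` label are assumptions; only `PASS` and a confidence value appear in the sample session.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Verdict:
    label: str               # "PASS" or "FAIL" (FAIL is assumed)
    confidence: float        # judge confidence in [0, 1]
    observations: List[str]  # free-text reasoning from the judge


def parse_verdict(raw: str) -> Verdict:
    """Validate a judge response encoded as JSON (the schema is hypothetical)."""
    data = json.loads(raw)
    label = str(data["verdict"]).upper()
    if label not in {"PASS", "FAIL"}:
        raise ValueError(f"unexpected verdict label: {label!r}")
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Verdict(label, confidence, list(data.get("observations", [])))
```

Rejecting unknown labels and out-of-range confidences at parse time keeps a malformed judge response from being silently recorded as a result.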

terminal

# Install the CLI
$ pip install botarena

# Submit your policy
$ botarena submit my_policy.py --scenario messy_room_v1
Uploading... Done!
Simulating... 45s
Judging... Done!
Verdict: PASS (confidence: 0.92)
Video: https://botmanifold.com/submissions/abc123
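
For scripted use, the `Verdict:` line from the sample session can be extracted with a regular expression. This sketch assumes the CLI prints exactly the format shown in the sample output. The `FAIL` label and the 0.8 confidence threshold are illustrative assumptions, not documented behavior.

```python
import re
from typing import Optional, Tuple

# Matches lines such as "Verdict: PASS (confidence: 0.92)".
VERDICT_RE = re.compile(
    r"^Verdict:\s*(?P<label>[A-Z]+)\s*\(confidence:\s*(?P<conf>[0-9]*\.?[0-9]+)\)\s*$",
    re.MULTILINE,
)


def parse_cli_output(output: str) -> Optional[Tuple[str, float]]:
    """Return (label, confidence) from captured CLI output, or None if absent."""
    match = VERDICT_RE.search(output)
    if match is None:
        return None
    return match.group("label"), float(match.group("conf"))


def is_confident_pass(output: str, threshold: float = 0.8) -> bool:
    """True only for a PASS verdict at or above the confidence threshold."""
    parsed = parse_cli_output(output)
    return parsed is not None and parsed[0] == "PASS" and parsed[1] >= threshold


sample = """Uploading... Done!
Simulating... 45s
Judging... Done!
Verdict: PASS (confidence: 0.92)
Video: https://botmanifold.com/submissions/abc123
"""
print(parse_cli_output(sample))  # → ('PASS', 0.92)
```

Treating a missing verdict line as "not a pass" makes the check fail closed if the output format ever changes.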

Why BotArena?

The only platform that combines simulation, VLM evaluation, and community competition.

94%

Human Correlation

VLM-Powered Judging

Our Vision-Language Model analyzes simulation videos like a human expert. Research shows VLM judges achieve 94% correlation with human evaluators.
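
The page does not say how the 94% figure is computed. Two common ways to compare binary judge verdicts against human labels are raw agreement and the phi coefficient (Pearson correlation on 0/1 labels). The sketch below shows both. The toy labels are made up for illustration and have no connection to the 94% figure.

```python
import math
from typing import Sequence


def _check(judge: Sequence[int], human: Sequence[int]) -> None:
    if len(judge) != len(human) or len(judge) == 0:
        raise ValueError("need two non-empty label lists of equal length")


def agreement_rate(judge: Sequence[int], human: Sequence[int]) -> float:
    """Fraction of episodes where judge and human give the same 0/1 label."""
    _check(judge, human)
    return sum(j == h for j, h in zip(judge, human)) / len(judge)


def phi_coefficient(judge: Sequence[int], human: Sequence[int]) -> float:
    """Pearson correlation between two binary (0/1) label sequences."""
    _check(judge, human)
    pairs = list(zip(judge, human))
    n11 = pairs.count((1, 1))  # both say PASS
    n00 = pairs.count((0, 0))  # both say FAIL
    n10 = pairs.count((1, 0))  # judge PASS, human FAIL
    n01 = pairs.count((0, 1))  # judge FAIL, human PASS
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        raise ValueError("correlation is undefined when a rater uses one label only")
    return (n11 * n00 - n10 * n01) / denom


# Toy labels (1 = PASS, 0 = FAIL), invented for this example.
judge_labels = [1, 1, 0, 0, 1, 0, 1, 1]
human_labels = [1, 1, 0, 0, 1, 0, 0, 1]
print(agreement_rate(judge_labels, human_labels))  # → 0.875
print(round(phi_coefficient(judge_labels, human_labels), 3))  # → 0.775
```

The two numbers differ on the same data, which is why it matters whether a headline figure refers to agreement or to correlation.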

Real-time

Rankings

Public Leaderboard

Compete with the community. See how your policy stacks up against others. Track your progress over time.

Scenarios

Multiple Scenarios

Test your policy across diverse challenges: household tasks, manipulation, navigation, and more.

AI-Generated

Reports

Detailed Feedback

Get actionable insights. Our VLM explains why your policy passed or failed with specific observations.

<5min

Eval Time

Cloud Infrastructure

No GPUs to manage. No MuJoCo installations. Submit from anywhere and get results in minutes.

Free

To Start

Open Community

Share results, learn from others, and improve together. Join our Discord to connect with fellow roboticists.

Available Scenarios

More scenarios added regularly

Messy Room

Medium

Clean up scattered objects on a table

Object Sorting

Easy

Sort objects by color or category

Precision Placement

Hard

Place objects at exact target locations

The Platform Vision

BotArena is just the beginning. We're building the verification layer for robot AI.

Now Live

BotArena

The Benchmark

Submit policies, get VLM verdicts, compete on the leaderboard.

Policy submission | VLM judging | Public leaderboard | Video results

Q3 2026

BotRegistry

Hugging Face for Robots

A model registry designed for embodied AI with robot-specific metadata.

URDF metadata | Verification badges | 3D visualization | pip install

Q3 2026

BotCI

GitHub Actions for Robots

Continuous integration that tests your policy on every commit.

GitHub integration | Regression detection | PR comments | Team dashboards

2027

BotCertify

The Safety Standard

Enterprise safety certification for production robot deployments.

Compliance reports | Audit trails | Safety scores | ISO alignment

“When a company deploys a robot to a factory, the insurer asks: What's the BotCertify Score?”

Our vision for 2027 and beyond
