UI TARS Desktop
ByteDance's open-source multimodal AI agent that controls your computer, browser, and terminal via vision
Released: 2025-01
Country: China
API: Available
Self-Host: Yes
GitHub Stars: 36,428
Last Reviewed: 2026-06
About UI TARS Desktop
Our Verdict
The open-source answer to computer-use agents. UI TARS Desktop pairs vision-driven GUI control with local deployability and ByteDance's model firepower — a compelling pick for developers who want an auditable, self-hostable agent that operates real software.
Pros & Cons
Pros
- Genuine open-source computer-use agent — an alternative to closed options
- Vision-based, so it generalizes across arbitrary GUIs
- Runs locally on consumer hardware for privacy
- Active ByteDance-backed development, with more than 36,000 GitHub stars
- Spans desktop, browser, and terminal from one stack
Cons
- GUI agents can be slow and occasionally unreliable
- Requires technical setup and a capable GPU for local runs
- Early-stage ecosystem compared to mature automation tools
Who Is It For?
Developers and power users who want an open-source, self-hostable computer-use agent that controls desktop, browser, and terminal via vision
Frequently Asked Questions
What is UI TARS Desktop?
UI TARS Desktop is an open-source multimodal AI agent stack by ByteDance. It uses vision-language models to perceive your screen and operate your computer, browser, and terminal through natural language — a self-hostable 'computer use' agent.
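The perceive-and-act cycle described in this answer can be sketched roughly as follows. This is an illustrative outline only, not UI TARS Desktop's actual code: the `Action` type, the `run_agent` function, the action names, and the three callbacks are all hypothetical stand-ins for a screenshot grabber, a vision-language model call, and an input controller.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One step chosen by the model (hypothetical shape, for illustration)."""
    kind: str            # e.g. "click", "type", or "done" (assumed action names)
    argument: str = ""   # e.g. a target description or the text to type


def run_agent(
    instruction: str,
    capture_screen: Callable[[], bytes],
    propose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 10,
) -> List[Action]:
    """Screenshot -> model proposes an action -> execute, until "done"."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = capture_screen()              # perceive the screen
        action = propose_action(instruction, screenshot, history)
        history.append(action)
        if action.kind == "done":                  # model reports completion
            break
        execute(action)                            # act on the real GUI
    return history


def demo():
    """Drive the loop with a scripted 'model' instead of a real one."""
    scripted = iter([
        Action("click", "search box"),
        Action("type", "weather"),
        Action("done"),
    ])
    executed: List[Action] = []
    history = run_agent(
        "search for the weather",
        capture_screen=lambda: b"",                # placeholder screenshot bytes
        propose_action=lambda instr, shot, hist: next(scripted),
        execute=executed.append,
    )
    return history, executed
```

In a real deployment the callbacks would wrap a screen-capture library, a call to the vision-language model, and a mouse and keyboard controller. The `max_steps` cap is a design choice in this sketch that bounds a run if the model never reports completion.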
Is UI TARS Desktop free?
Yes. It is free and open-source. The UI-TARS-1.5-7B model can run on consumer hardware, so you can self-host the full agent locally. You only pay for compute if you use cloud GPUs or a remote model API.
How does UI TARS differ from Claude Computer Use?
Both are vision-driven computer-use agents. UI TARS is open-source and self-hostable with ByteDance's own vision-language models, whereas Claude Computer Use is a closed, cloud-based capability. UI TARS also spans desktop, browser, and terminal from one stack.
Can UI TARS run locally on my own machine?
Yes. The UI-TARS-1.5-7B model is small enough to run on consumer hardware with a capable GPU, so screenshots and agent execution can stay entirely on your own device for privacy.
Related Agents
Excel
MCP Servers
A Model Context Protocol server for Excel file manipulation
n8n
Productivity Agents
Open-source workflow automation with a visual AI agent builder
AdWeave — Meta Ads
MCP Servers
Meta Ads MCP server with 47 tools for campaigns, creatives, audiences, and insights
Ag2
MCP Servers
AG2 is the open-source Python framework for building, orchestrating, and scaling multi-agent AI systems