UI-TARS is an open-source multimodal “GUI agent” created by ByteDance: a model designed to perceive raw screenshots (or rendered UI frames), reason about what needs to be done, and then interact with graphical user interfaces (GUIs) by clicking, typing, and navigating menus across desktop, browser, mobile, and game environments. Rather than relying on rigid, manually scripted UI automation, UI-TARS uses a single vision-language model (VLM) that integrates perception, reasoning, grounding, and action in one end-to-end framework. It “thinks before acting,” which enables flexible, general-purpose automation of complex, multi-step tasks such as filling forms, downloading files, navigating applications, and controlling in-game actions, all by interpreting the UI as a human would. The project can be deployed locally or remotely, and it offers a foundation for building GUI automation agents that are more robust and adaptable than scripted approaches.
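The perceive, reason, act cycle described above can be illustrated with a minimal Python sketch. This is not UI-TARS code: `Step`, `run_agent`, `scripted_model`, and the callback parameters are hypothetical names chosen for illustration, and the "model" is a fixed script standing in for a real vision-language model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One decision: the reasoning text plus the action it leads to (hypothetical)."""
    thought: str
    action: str          # e.g. "click", "type", or "done"
    argument: str = ""


def run_agent(
    task: str,
    take_screenshot: Callable[[], bytes],
    model: Callable[[str, bytes, List[Step]], Step],
    execute: Callable[[Step], None],
    max_steps: int = 10,
) -> List[Step]:
    """Loop: observe the screen, ask the model for a step, execute it."""
    history: List[Step] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()            # perceive
        step = model(task, screenshot, history)   # reason ("think before acting")
        history.append(step)
        if step.action == "done":                 # the model decides the task is finished
            break
        execute(step)                             # act on the GUI
    return history


def scripted_model(task: str, screenshot: bytes, history: List[Step]) -> Step:
    """Stand-in for a VLM: replays a fixed plan, one step per call."""
    script = [
        Step("The form has a name field; focus it.", "click", "name_field"),
        Step("The field is focused; enter the name.", "type", "Ada"),
        Step("The form is filled in.", "done"),
    ]
    return script[len(history)]


executed: List[Step] = []
history = run_agent("Fill in the name field", lambda: b"", scripted_model, executed.append)
for step in history:
    print(f"{step.action}({step.argument!r}): {step.thought}")
```

In a real deployment the `take_screenshot` and `execute` callbacks would wrap a platform-specific capture and input layer, and `model` would call the served VLM; those pieces are deliberately left out here.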

Features

  • Vision-language model-based GUI agent: perceives raw screenshots and reasons about UI context
  • Unified action space: supports clicks, typing, gestures, and hotkeys across desktop, browser, mobile, and games
  • “Think-then-act” decision-making: performs internal reasoning (task decomposition, planning, reflection) before executing actions
  • Cross-platform GUI control: works across different operating systems, browsers, and application contexts
  • End-to-end automation: capable of carrying out full workflows (forms, downloads, navigation, game controls) without custom scripts per UI
  • Open-source with published inference scripts and models, enabling reproducibility and customization
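As a rough illustration of how a unified action space and "think-then-act" output might fit together, the sketch below parses a model response containing a reasoning line and an action call, then dispatches it to a handler. The `Thought:` / `Action:` text layout, the argument syntax, and the names `ParsedStep`, `parse_step`, and `dispatch` are assumptions made for this example; consult the project's published inference scripts for the actual output format.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Assumed response layout (hypothetical): "Thought: <text>\nAction: name(key=value, ...)"
_STEP_RE = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<name>\w+)\((?P<args>.*)\)\s*$",
    re.DOTALL,
)
# An argument value is either a single-quoted string or a bare token without commas.
_ARG_RE = re.compile(r"(\w+)\s*=\s*('([^']*)'|[^,]+)")


@dataclass
class ParsedStep:
    thought: str
    name: str
    args: Dict[str, str] = field(default_factory=dict)


def parse_step(text: str) -> ParsedStep:
    """Split a model response into its reasoning and a structured action."""
    match = _STEP_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognized model output: {text!r}")
    args: Dict[str, str] = {}
    for key, raw, quoted in _ARG_RE.findall(match.group("args")):
        # Quoted values keep their inner text (commas included); bare values are trimmed.
        args[key] = quoted if raw.startswith("'") else raw.strip()
    return ParsedStep(match.group("thought"), match.group("name"), args)


def dispatch(step: ParsedStep, handlers: Dict[str, Callable[..., Any]]) -> Any:
    """Route an action to the handler registered for the current platform."""
    handler = handlers.get(step.name)
    if handler is None:
        raise KeyError(f"no handler for action {step.name!r}")
    return handler(**step.args)


# Handlers here only record what they would do; a real backend would drive the GUI.
handlers = {
    "click": lambda x, y: ("click", int(x), int(y)),
    "type": lambda text: ("type", text),
}

step = parse_step("Thought: The search box is at the top.\nAction: click(x=120, y=340)")
print(step.thought)
print(dispatch(step, handlers))
```

Keeping the action names platform-neutral and swapping only the handler table is one way the same model output could drive desktop, browser, or mobile backends.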

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2025-12-01