
ByteDance Releases UI-TARS-1.5: An Open-Source Multimodal AI Agent Built upon a Powerful Vision-Language Model
ByteDance has released UI-TARS-1.5, an updated version of its multimodal agent framework focused on graphical user interface (GUI) interaction and game environments. Designed as a vision-language model capable of perceiving screen content and performing interactive […]