VITA

TLDR: VITA-1.5 is an open-source interactive multimodal LLM. It features lower interaction latency, stronger multimodal performance, improved speech processing, and a progressive training strategy. It is reported to perform well on various benchmarks, and it can be both trained and run for inference.

2024-08-10 GitHub