Feb 11, 2025
ByteDance

ByteDance's OmniHuman-1: Transforming Photos into Dynamic Virtual Humans

ByteDance launched OmniHuman-1. It can quickly turn a single photo into a talking, moving virtual human with lip sync, full-body movement and rich expressions, going beyond traditional deepfake tech.

Jan 25, 2025
ByteDance

ByteDance Unveils PaSa: An AI-Powered Academic Paper Search Solution


ByteDance launched PaSa, an intelligent academic paper search agent. It aims to solve the problems of complex query handling in academic research and helps researchers save time in literature retrieval.

Jan 23, 2025
ByteDance

ByteDance Launches Seed Edge for Frontier AGI Research

ByteDance officially launched a research project codenamed "Seed Edge," whose core objective is long-term, foundational research on the frontiers of AGI (Artificial General Intelligence), going beyond pre-training and large-model iteration. Seed Edge has already outlined five major research directions.

Jan 22, 2025
ByteDance

ByteDance Launches Doubao-1.5-Pro, Surpassing GPT-4o in Key Benchmarks


ByteDance released Doubao-1.5-Pro, an advanced AI model built on a sparse MoE architecture that achieves performance comparable to dense models while activating 7x fewer parameters. It outperformed GPT-4o, Claude 3.5 Sonnet, and others on coding, reasoning, and Chinese-language benchmarks. The model also features enhanced visual and voice capabilities, offering cost-effective solutions for developers.

Dec 22, 2024
ByteDance

ByteDance Announces INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations

ByteDance presented INFP, an audio-driven interactive head generation framework for dyadic conversations. Given the dual-track audio of a dyadic conversation and a single portrait image of an arbitrary agent, the framework can dynamically synthesize verbal, non-verbal, and interactive agent videos with lifelike facial expressions and rhythmic head-pose movements. The framework is also lightweight yet powerful, making it practical for instant-communication scenarios such as video conferencing. The name INFP indicates that the method is Interactive, Natural, Flash, and Person-generic.
