Jan 15, 2025
MiniMax

Hailuo AI Launches T2A-01-HD with Voice Cloning and Emotional Synthesis

Hailuo AI has unveiled T2A-01-HD, a groundbreaking text-to-speech model that can clone voices from just 10 seconds of audio. Supporting over 17 languages, it offers a diverse range of customizable voices, including options for emotional tone and speech effects. This innovative tool is set to transform industries like entertainment, education, and marketing by enabling high-quality voice synthesis with minimal input, enhancing user engagement and content creation.

Jan 09, 2025
Alibaba

Qwen team launches Qwen Chat – a Web UI for interacting with Qwen models! 🌟

Chat effortlessly with our flagship model Qwen2.5-Plus, explore vision-language capabilities with Qwen2-VL-Max, dive into reasoning models like QwQ and QVQ, and code with the coding expert Qwen2.5-Coder-32B-Instruct, among other models.


Unitree Technology's Humanoid Robots Shine at 2025 CCTV Spring Festival Gala

Unitree Technology brought its humanoid robots to the 2025 CCTV Spring Festival Gala. In a performance directed by Zhang Yimou, the robots wore red floral jackets and danced Yangko, walking smoothly and interacting seamlessly with the human dancers.

Dec 30, 2024
Kwai

Kling AI API Upgrades: Virtual Try-On V1.5 and Lip Sync Features

Kling AI API has launched significant upgrades, including the Virtual Try-On model V1.5, which now supports both individual clothing items and combinations of 'upper + lower' outfits. This enhancement allows for more realistic fitting experiences. Additionally, the newly opened lip-syncing capability enables perfect synchronization of characters' lip movements with voiceovers, enhancing video content creation for e-commerce and advertising sectors.

Dec 29, 2024
PixVerse

PixVerse V3.5 Launches with Enhanced Speed and Features

AIsphere Technology has launched PixVerse V3.5, now in beta, which significantly reduces video generation times to under 30 seconds. The update enhances motion control and introduces new features for improved semantic understanding and animation quality. With over 12 million users, this upgrade positions PixVerse among the top-tier AI video generation tools, catering to a growing demand in the market.

Dec 25, 2024
DeepSeek

DeepSeek open-sourced DeepSeek-V3-Base, a 685B-parameter model

According to LiveBench results shared on r/LocalLLaMA, DeepSeek V3 is the best open-weight LLM and the second-best non-reasoning LLM, after `gemini-exp-1206`.

Dec 25, 2024
Alibaba

Qwen released QVQ, the first open-weight model for visual reasoning

Building on the foundation of Qwen2-VL-72B, QVQ integrates architectural improvements that enhance cross-modal reasoning. Its open-weight design underscores the team’s commitment to making advanced AI more accessible.

Dec 23, 2024
Hume

Hume AI introduces its frontier speech-language model OCTAVE

OCTAVE is a frontier speech-language model with new emergent capabilities, such as on-the-fly voice and personality creation.

Dec 23, 2024
Unitree

Unitree B2-W Talent Awakening!

One year after mass production kicked off, Unitree’s B2-W Industrial Wheel has been upgraded with more exciting capabilities. Please always use robots in a safe and friendly manner.

Dec 22, 2024
ByteDance

ByteDance announces INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations

We present INFP, an audio-driven interactive head generation framework for dyadic conversations. Given the dual-track audio of a dyadic conversation and a single portrait image of an arbitrary agent, our framework can dynamically synthesize verbal, non-verbal, and interactive agent videos with lifelike facial expressions and rhythmic head pose movements. Additionally, our framework is lightweight yet powerful, making it practical in instant communication scenarios such as video conferencing. INFP denotes that our method is Interactive, Natural, Flash, and Person-generic.
