Latest Open Source Projects

F5-TTS

9,200

@SWivid

TLDR: F5-TTS is a text-to-speech repository built around a Diffusion Transformer with ConvNeXt V2 for faster training and inference. It documents several installation methods, inference options including a Gradio app and a CLI, and training through a Gradio web interface. It also has an evaluation section and acknowledges multiple works. The code is released under the MIT License, while the pre-trained models are under a CC-BY-NC license.

Python

2024-10-08 Github

newsnow

3,000

@ourongxing

TLDR: An elegant news reading application that provides a pleasant reading experience, with features like GitHub login and data synchronization. It supports deployment on Cloudflare Pages, Vercel, and Docker.

elegant news TypeScript

2024-09-23 Github

shortest

4,100

@anti-work

TLDR: An AI-powered, natural-language end-to-end testing framework built on Playwright, with features like Anthropic Claude API integration, GitHub 2FA support, and email validation. It also includes guides for web app and CLI development.

TypeScript

2024-09-18 Github

cookiecutter-uv

633

@fpgmaas

TLDR: A modern cookiecutter template for Python projects that use uv for dependency management.

Python

2024-09-02 Github

llm_engineering

536

@ed-donner

TLDR: Repo to accompany my mastering LLM engineering course

AI Learning LLM Engineering Project-based Learning Jupyter Notebook

2024-08-31 Github

Qwen2-VL

4,300

@QwenLM

TLDR: Qwen2-VL is a vision-language model that understands images and videos of various resolutions and aspect ratios, including multilingual text within images. The models are open-sourced under different licenses, and the repository provides usage examples and benchmarks. It also covers supported quantization methods, known limitations that remain areas for further improvement, deployment options, and a web UI demo.

Python

2024-08-29 Github

potpie

2,070

@potpie-ai

TLDR: Prompt-To-Agent: Create custom engineering agents for your codebase.

Python

2024-08-12 Github

VITA

2,000

@VITA-MLLM

TLDR: VITA-1.5 is an open-source interactive multimodal LLM. It features reduced interaction latency, enhanced multimodal performance, improved speech processing, and a progressive training strategy. The repository reports results on various benchmarks and supports both training and inference.

large-multimodal-models multimodal-large-language-models Python

2024-08-10 Github

story-adapter

699

@jwmao1

TLDR: This repository contains the official implementation of 'Story-Adapter', a training-free and computationally efficient framework for long story visualization. It uses an iterative paradigm and a global reference cross-attention module to improve generation quality for long stories.

diffusion-models generative-art generative-model image-generation storytelling visual-storytelling Python

2024-08-10 Github

multi-agent-orchestrator

3,900

@awslabs

TLDR: The multi-agent-orchestrator is an open-source framework for orchestrating multiple AI agents to handle complex conversations. It features intelligent intent classification, dual language support, flexible agent responses, context management, extensible architecture, and universal deployment. It comes with pre-built agents and classifiers and offers a variety of examples and quick start guides.

agents ai-agents ai-agents-framework anthropic anthropic-claude aws aws-bedrock aws-cdk aws-lambda chatbot framework generative-ai machine-learning openai openaiapi orchestrator python serverless typescript Python

2024-07-23 Github