Latest Open Source Projects

web-ui

4,100

TLDR: This project builds on browser-use and offers a user-friendly WebUI that supports various LLMs, custom browser usage, and persistent browser sessions. It can be installed locally or with Docker, and provides several themes and settings. The changelog lists updates such as DeepSeek-r1 support, a Docker setup, and keeping the browser open between tasks.

Python

2025-01-02 GitHub

youtube

69

@kingluffywang

TLDR: This repository contains scripts for YouTube video processing tasks: audio-to-text conversion, audio-to-subtitle conversion, video resolution conversion for YouTube Shorts, subtitle text processing, and splitting videos into short clips.

Python

2024-12-29 GitHub

AI-reads-books-page-by-page

804

@echohive42

TLDR: This repository contains a script that performs page-by-page analysis of PDF books, extracting knowledge points and generating summaries. It offers features like automated analysis, AI-powered content understanding, interval summaries, and customizable options. The script can be set up by cloning the repository, installing requirements, and configuring constants. It works by setting up directories, loading an existing knowledge base, processing pages, generating summaries, and saving the results.

Python

2024-12-28 GitHub
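The page-by-page flow described in the summary above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `KnowledgeBase`, `extract_points`, `summarize`, `process_book`, and `SUMMARY_INTERVAL` are all hypothetical names, and the two helper functions are plain-text stand-ins for whatever AI calls the real script makes.

```python
from dataclasses import dataclass, field
from typing import Optional

SUMMARY_INTERVAL = 2  # hypothetical constant: write a summary every N pages


@dataclass
class KnowledgeBase:
    """Accumulated knowledge points and interval summaries (hypothetical shape)."""
    points: list = field(default_factory=list)
    summaries: list = field(default_factory=list)


def extract_points(page_text: str) -> list:
    # Stand-in for the AI analysis step: each non-empty line is one knowledge point.
    return [line.strip() for line in page_text.splitlines() if line.strip()]


def summarize(points: list) -> str:
    # Stand-in for the AI summary step: join every point collected so far.
    return "; ".join(points)


def process_book(pages: list, kb: Optional[KnowledgeBase] = None) -> KnowledgeBase:
    # Continue from an existing knowledge base if one is passed in.
    if kb is None:
        kb = KnowledgeBase()
    for page_number, text in enumerate(pages, start=1):
        kb.points.extend(extract_points(text))
        # Produce an interval summary after every SUMMARY_INTERVAL pages.
        if page_number % SUMMARY_INTERVAL == 0:
            kb.summaries.append(summarize(kb.points))
    return kb
```

Passing a previously built `KnowledgeBase` as `kb` mimics the "load an existing knowledge base" step; saving the results to disk is left out of the sketch.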

deepseek-engineer

846

@Doriandarko

TLDR: A coding assistant application that integrates with the DeepSeek API. It can process user conversations, generate JSON responses, read local files, create new files, and apply diff edits. It includes DeepSeek client configuration, data models, helper functions, and an interactive session.

Python

2024-12-26 GitHub
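One simple way to picture "applying a diff edit" is replacing a known snippet of a file with new text. The summary above does not describe the project's actual edit format, so the sketch below is only an assumption: `apply_snippet_edit` and its parameters are hypothetical names, and the exactly-one-match rule is a design choice made here for safety.

```python
def apply_snippet_edit(content: str, original_snippet: str, new_snippet: str) -> str:
    """Replace one exact occurrence of original_snippet in content with new_snippet."""
    if not original_snippet:
        raise ValueError("original_snippet must not be empty")
    count = content.count(original_snippet)
    # Refuse ambiguous or missing matches instead of guessing where to edit.
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return content.replace(original_snippet, new_snippet, 1)
```

Requiring a unique match means a stale or too-short snippet fails loudly rather than silently editing the wrong place.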

Aria-UI

295

@AriaUI

TLDR: Aria-UI is a context-aware, lightweight, and fast model that handles diverse grounding instructions for GUIs and achieves superior performance on agent benchmarks. It can be installed and used with vLLM or Transformers.

Python

2024-12-25 GitHub

pasa

449

@bytedance

TLDR: This repo introduces PaSa, an LLM-powered paper search agent. It can make autonomous decisions for complex scholarly queries. Optimized with reinforcement learning and synthetic data, PaSa outperforms baselines. It has two agents, Crawler and Selector, and uses two datasets. Instructions for quick start, running locally, and training are provided.

research Python

2024-12-23 GitHub

geminiCoder

735

@osanseviero

TLDR: A project that generates small apps from a single prompt, powered by the Gemini API. It is built with the Gemini API, Sandpack, and the Next.js app router with Tailwind. It can be cloned and run locally.

TypeScript

2024-12-20 GitHub

GraphAgent

239

@HKUDS

TLDR: GraphAgent is an automated agent pipeline for predictive and generative tasks. It consists of three key components: a Graph Generator Agent, a Task Planning Agent, and a Task Execution Agent. It can handle real-world data in both structured and unstructured formats and has been shown to be effective in extensive experiments. The repository also provides installation and inference instructions, along with benchmarks and citation information.

agent graph-data language-assistant large-language-models Python

2024-12-18 GitHub

openai-structured-outputs-samples

551

@openai

TLDR: A repository of sample apps demonstrating the use of OpenAI's Structured Outputs feature with Next.js.

TypeScript

2024-12-16 GitHub

ai-gradio

1,000

@AK391

TLDR: A Python package that enables developers to create machine learning apps powered by various AI models and services, such as OpenAI, Gemini, Anthropic's Claude, LumaAI, CrewAI, and xAI's Grok. It supports text chat, voice chat (OpenAI only), video chat (Gemini only), text generation with different models, AI video and image generation with LumaAI, and AI agent teams with CrewAI, among other features.

Python

2024-12-14 GitHub