logocreator

TLDR: An open-source logo generator that creates professional logos in seconds using customizable styles. It uses Flux Pro 1.1 on Together AI for logo generation, Next.js with TypeScript for the app framework, Shadcn and Tailwind for UI components and styling, Upstash Redis for rate limiting, Clerk for authentication, and Plausible and Helicone for analytics and observability. Planned work includes a dashboard with logo history, SVG export, more styles, an image size dropdown, an approximate price display when using a custom Together AI key, reference logo upload, and a showcase of redesigned popular brand logos.

TypeScript
2024-11-06 GitHub

chonkie

TLDR: Chonkie is a lightweight, fast RAG chunking library that offers several chunkers. It keeps its default install minimal and supports multiple tokenizers. It reports a smaller footprint and higher speed than alternative libraries.
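
As a rough illustration of what a chunker does in a RAG pipeline, here is a minimal sketch of fixed-size chunking with overlap. It does not use Chonkie's API; the function name `chunk_tokens`, its parameters, and the whitespace "tokenizer" are all assumptions made for this example.

```python
from typing import List


def chunk_tokens(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into overlapping chunks of at most `chunk_size` tokens.

    Hypothetical helper for illustration only: tokens here are simply
    whitespace-separated words, not the output of a real tokenizer.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(tokens):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_tokens("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap repeats the last few tokens of one chunk at the start of the next, so a sentence that straddles a boundary is still retrievable from at least one chunk.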

2024-11-01 GitHub

Roo-Cline

TLDR: Roo-Cline is a fork of Cline, an autonomous coding agent. It adds experimental features such as drag-and-drop images, sound effects, language selection, and support for various models. It can create and edit files, run commands in the terminal, use the browser, and add custom tools through the Model Context Protocol.

TypeScript
2024-10-31 GitHub

browser-use

TLDR: The browser-use repository provides an easy way to connect AI agents with the browser. It offers vision and HTML extraction, multi-tab management, custom actions, and parallelization of agents. It also collects anonymous usage data to guide improvements.

2024-10-31 GitHub

open-computer-use

TLDR: A secure cloud Linux computer powered by E2B Desktop Sandbox and controlled by open-source LLMs. It supports various LLMs, such as Meta Llama and OS-Atlas, and operates via keyboard, mouse, and shell commands. New LLMs that adhere to the OpenAI API specification can be added easily.

2024-10-31 GitHub

BetterWhisperX

TLDR: A fork of WhisperX with improvements. Provides fast automatic speech recognition with word-level timestamps and speaker diarization. Includes features like batched inference, accurate timestamps using wav2vec2 alignment, and VAD preprocessing.

Python
2024-10-23 GitHub

computer_use_ootb

TLDR: Computer Use OOTB is an out-of-the-box solution for desktop GUI agents that provides both API-based and locally running models. It supports Windows and macOS, does not require Docker, and offers a user-friendly Gradio interface. Major updates include local run functionality, added examples, and support for multiple displays. To start the interface for remote control, users install the prerequisites, clone the repository, install dependencies, and set API keys. It also has advanced settings for the ShowUI model and a roadmap for further improvement.

Python
2024-10-23 GitHub

text-extract-api

TLDR: A tool for converting images, PDFs, and Office documents to Markdown or JSON with high accuracy. It is built with FastAPI and uses Celery for asynchronous tasks and Redis for caching. It supports various OCR strategies, can remove PII, and offers several storage strategies. It comes with a CLI tool, an online demo, and dedicated API clients.

2024-10-23 GitHub

ai-engineering-hub

TLDR:

2024-10-21 GitHub

Sana

3,200
@NVlabs

TLDR: Sana is a text-to-image framework that can efficiently generate high-resolution images up to 4096×4096. Its design includes DC-AE, Linear DiT, a decoder-only text encoder, and efficient training and sampling. Sana is competitive with giant diffusion models while being smaller, faster, and deployable on a laptop GPU.

Python
2024-10-11 GitHub