Latest Open Source Projects

deepclaude

121

deepclaude

TLDR: DeepClaude is a high-performance LLM inference API that combines DeepSeek R1's reasoning with Anthropic Claude's creative and code-generation capabilities. It lists zero latency, security, and high configurability among its features and is open source. Users supply their own API keys and get a unified interface that leverages the strengths of both models.

ai anthropic anthropic-claude api chain-of-thought claude deepseek deepseek-r1 llm rust Rust

2025-01-26 GitHub

browser-use

19,800

@browser-use

browser-use

TLDR: The browser-use repository provides an easy way to connect AI agents with the browser. It offers features like vision and HTML extraction, multi-tab management, custom actions, and parallelization of agents. It also collects anonymous usage data for improvement.

ai-agents ai-tools browser-automation llm python Python

2024-10-31 GitHub

open-computer-use

594

@e2b-dev

open-computer-use

TLDR: A secure cloud Linux computer powered by E2B Desktop Sandbox and controlled by open-source LLMs. It supports various LLMs, such as Meta Llama and OS-Atlas, and operates via keyboard, mouse, and shell commands. New LLMs that adhere to the OpenAI API specification can be added easily.

agent ai anthropic claude computer-use llm Python

2024-10-31 GitHub

text-extract-api

2,100

@CatchTheTornado

text-extract-api

TLDR: A tool for converting images, PDFs, and Office documents to Markdown or JSON with high accuracy. It is built with FastAPI and uses Celery for asynchronous tasks and Redis for caching. It supports several OCR strategies, can remove PII, and offers multiple storage strategies. A CLI tool, an online demo, and dedicated API clients are also available.

anonymization api extract json llm ocr ocr-python pdf pii Python

2024-10-23 GitHub

RAG_Techniques

11,500

@NirDiamant

RAG_Techniques

TLDR: This repository is a comprehensive collection of advanced Retrieval-Augmented Generation (RAG) techniques. It includes various methods for enhancing RAG systems such as query enhancement, context enrichment, advanced retrieval methods, iterative and adaptive techniques, evaluation, explainability, and advanced architectures.

ai langchain llama-index llm llms opeani python rag tutorials Jupyter Notebook

2024-07-13 GitHub

Building-llama3-from-scratch

131

@FareedKhan-dev

Building-llama3-from-scratch

TLDR: This repository contains code to build the LLaMA 3 language model from scratch using Python. It explains the components of LLaMA 3, such as pre-normalization using RMSNorm, the SwiGLU activation function, Rotary Embeddings (RoPE), and the Byte Pair Encoding (BPE) algorithm. The code shows how to tokenize input data, create embeddings for each token, implement attention heads, self-attention, and multi-head attention, apply the SwiGLU activation function, and generate the output.

chatgpt gemini gpt llama-3 llm openai python Jupyter Notebook

2024-05-27 GitHub

firecrawl

23,000

@mendableai

firecrawl

TLDR: Firecrawl is an API service that crawls URLs and converts them into clean markdown or structured data. It offers advanced scraping, crawling, and data extraction capabilities with features like LLM-ready formats, customizability, and more. It also has SDKs for various languages and integrations with multiple frameworks.

ai ai-scraping crawler data html-to-markdown llm markdown rag scraper scraping web-crawler webscraping TypeScript

2024-04-15 GitHub

FlagEmbedding

8,300

@FlagOpen

FlagEmbedding

TLDR: FlagEmbedding focuses on retrieval-augmented LLMs and consists of multiple projects including inference, finetune, evaluation, dataset, tutorials, and research. It offers various embedding and reranker models for different languages and tasks.

embeddings information-retrieval llm retrieval-augmented-generation sentence-embeddings text-semantic-similarity Python

2023-08-02 GitHub

LLMs-from-scratch

38,600

@rasbt

LLMs-from-scratch

TLDR: This repository contains code for developing, pretraining, and finetuning a GPT-like LLM. It is the official code repository for the book 'Build a Large Language Model (From Scratch)'. The code is designed to run on conventional laptops and automatically utilizes GPUs if available. It also includes bonus materials and has specific hardware requirements.

chatgpt gpt large-language-models llm python pytorch Jupyter Notebook

2023-07-23 GitHub

llm-course

44,500

@mlabonne

llm-course

TLDR: This repository contains a comprehensive course on large language models, divided into three parts: LLM Fundamentals, The LLM Scientist, and The LLM Engineer. It includes notebooks and articles related to various aspects of large language models such as fine-tuning, quantization, and building applications.

course large-language-models llm machine-learning roadmap Jupyter Notebook

2023-06-17 GitHub