Generative AI Systems Stack
LLMs / Foundation Models
- OpenAI — GPT-5.4, GPT-5.3, o3-pro
- Anthropic — Claude Opus 4.6, Sonnet 4.6
- Google Gemini — Gemini 3.1 Pro, Gemini 3.1 Flash
- Meta Llama — Llama 4 (Maverick, Scout)
- DeepSeek — DeepSeek R1, DeepSeek-V3
- Mistral — Mistral Large 3, Mistral Small 4
- xAI — Grok 4.20
- Qwen — Qwen 3.6-Plus
AI Frameworks
- LangChain — LLM orchestration and chaining
- LlamaIndex — Data-centric RAG and document processing
- Haystack — Production RAG and search pipelines
- Microsoft Agent Framework — Unified SDK (successor to Semantic Kernel + AutoGen)
- Mastra — TypeScript-first AI agent framework
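The common idea across these frameworks is composing a prompt step, a model call, and an output parser into a single pipeline. A stdlib-only sketch of that pattern; the function names are illustrative, not real framework APIs, and the model is a canned stub.

```python
"""Toy sketch of the prompt -> model -> parser "chain" pattern that
orchestration frameworks formalize. All names are illustrative."""

def prompt_step(question: str) -> str:
    # Fill a template, as a prompt component would.
    return f"Answer concisely: {question}"

def fake_llm(prompt: str) -> str:
    # Stand-in for a model call; returns a canned completion.
    return f"ECHO[{prompt}]"

def parser_step(completion: str) -> dict:
    # Normalize raw model output into structured data.
    return {"text": completion.strip()}

def chain(*steps):
    # Compose steps left to right, piping each output to the next.
    def run(x):
        for step in steps:
            x = step(x)
        return x
    return run

pipeline = chain(prompt_step, fake_llm, parser_step)
result = pipeline("What is RAG?")
```

Real frameworks add streaming, retries, and tracing around this same composition idea.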
AI Coding Assistants
- Cursor — AI-native code editor with Background Agents
- GitHub Copilot — Inline code completion and chat
- Claude Code — CLI-based agentic coding assistant
- Windsurf — AI IDE with agentic flows
- OpenAI Codex — Cloud-based agentic coding with parallel worktrees
- Devin — Autonomous AI software engineer
- Amazon Kiro — Spec-driven AI IDE
Text Embeddings
- OpenAI Embeddings — text-embedding-3-large
- Cohere Embed 4 — Multimodal (text + images)
- Voyage AI — Voyage 4, MoE architecture (acquired by MongoDB)
- Gemini Embedding — Multimodal including native audio
- Mistral Embed — Retrieval-optimized, low-cost
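Whatever the provider, embeddings are compared the same way downstream: cosine similarity between vectors. A stdlib sketch of that comparison; the vectors below are made up, not real model outputs.

```python
"""Cosine similarity, the standard comparison applied to text embeddings.
Vectors here are tiny invented examples, not real embedding outputs."""
import math

def cosine(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

doc = [0.2, 0.7, 0.1]
query_close = [0.25, 0.65, 0.05]   # semantically similar text
query_far = [0.9, 0.05, 0.4]       # unrelated text
```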
Vector Databases
- Pinecone — Fully managed, serverless
- Weaviate — Open-source with hybrid search
- Qdrant — High-performance, Rust-based
- Chroma — Lightweight, developer-friendly
- Milvus — Billion-scale vector search, hot/cold tiering
- pgvector — PostgreSQL extension
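At their core, all of these engines answer the same query: top-k nearest neighbors. The brute-force O(n) version below is just for intuition; production databases replace the linear scan with approximate indexes such as HNSW.

```python
"""Top-k nearest-neighbor search, the core vector-database operation,
shown as a brute-force scan (real engines use ANN indexes instead)."""
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, index, k=2):
    # index: list of (id, vector) pairs; return the k closest ids.
    scored = sorted(index, key=lambda item: l2(query, item[1]))
    return [doc_id for doc_id, _ in scored[:k]]

index = [("a", [0.0, 0.0]), ("b", [1.0, 1.0]), ("c", [0.1, 0.0])]
nearest = top_k([0.08, 0.0], index)
```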
RAG (Retrieval-Augmented Generation)
- Hybrid RAG — Dense vector + sparse keyword search (production baseline)
- Agentic RAG — Autonomous plan-retrieve-reason loops
- Graph RAG — Knowledge graph layer for multi-hop reasoning
- LlamaIndex Workflows — Event-driven RAG pipelines
- LangChain LCEL — Chain-based RAG orchestration
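Hybrid RAG needs a way to merge the dense and sparse result lists. Reciprocal Rank Fusion (RRF) is a common choice; a stdlib sketch with invented document ids:

```python
"""Reciprocal Rank Fusion: combine multiple rankings (e.g. dense vector
and sparse keyword results) into one. Document ids are made up."""

def rrf(rankings, k=60):
    # Each document scores sum(1 / (k + rank)) across the input rankings.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d2"]    # vector-search ranking
sparse = ["d1", "d4", "d3"]   # keyword-search ranking
fused = rrf([dense, sparse])
```

Documents ranked well by both retrievers (here `d1` and `d3`) float to the top, which is why RRF is a common production baseline.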
AI Agents
- LangGraph — Stateful multi-actor workflows
- CrewAI — Role-based multi-agent collaboration
- OpenAI Agents SDK — Production-grade agent orchestration
- Claude Agent SDK — Tool-use agents with constitutional safety
- Google ADK — Agent Development Kit (Python, Java, Go, TS)
- AG2 — Community fork of AutoGen, multi-agent conversations
- Smolagents — Hugging Face’s lightweight agent framework
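Underneath all of these frameworks sits the same loop: the model picks a tool, the runtime executes it, and the observation is fed back until a final answer. A self-contained sketch with a scripted stub in place of a real model:

```python
"""The plan -> act -> observe loop behind agent frameworks. The "model"
is a scripted stub so the example runs without any API."""

def calculator(expr: str) -> str:
    # Toy tool: evaluate simple arithmetic with builtins disabled.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def scripted_model(history):
    # Stub policy: call the calculator once, then answer with its result.
    observations = [turn for turn in history if turn[0] == "observation"]
    if not observations:
        return ("tool", "calculator", "6 * 7")
    return ("final", f"The answer is {observations[-1][1]}")

def run_agent(model, tools, max_steps=5):
    history = []
    for _ in range(max_steps):
        decision = model(history)
        if decision[0] == "final":
            return decision[1]
        _, tool_name, tool_input = decision
        history.append(("observation", tools[tool_name](tool_input)))
    return "step limit reached"

answer = run_agent(scripted_model, TOOLS)
```

Frameworks differ mainly in what they layer on this loop: state graphs (LangGraph), role assignment (CrewAI), or managed handoffs (OpenAI Agents SDK).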
Model Context Protocol (MCP)
- MCP Specification — Open standard for connecting AI to tools and data
- MCP Servers — 10,000+ community servers
- Agentic AI Foundation — Linux Foundation governance (Anthropic, OpenAI, Google, Microsoft)
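MCP messages ride on JSON-RPC 2.0. A sketch of constructing a `tools/call` request, one of the methods the spec defines; the tool name and arguments below are invented for illustration.

```python
"""Building an MCP tools/call request. The envelope follows JSON-RPC 2.0
as used by MCP; the payload values are invented examples."""
import json

def mcp_tools_call(request_id, tool_name, arguments):
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

msg = mcp_tools_call(1, "search_docs", {"query": "vector databases"})
wire = json.dumps(msg)  # what actually crosses the transport
```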
LLM Serving / Inference
- vLLM — PagedAttention, high-throughput serving
- SGLang — Zero-overhead batch scheduler, strong throughput on public benchmarks
- Ollama — Local model serving with Apple MLX support
- TensorRT-LLM — NVIDIA-optimized inference
- TGI — Hugging Face’s production inference server
- LiteLLM — AI gateway/proxy for 100+ LLMs in OpenAI format
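Gateways like LiteLLM work by normalizing every provider behind the OpenAI chat-completions request shape. A sketch of constructing that payload; the model name is a placeholder and no network call is made.

```python
"""Constructing an OpenAI-format chat request, the common interface that
gateways normalize providers to. Model name is a placeholder."""

def chat_request(model, system, user, temperature=0.2):
    # Minimal chat-completions payload; provider-specific params vary.
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": temperature,
    }

payload = chat_request("provider/model-name", "You are terse.", "Define RAG.")
```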
Fine-Tuning
- Unsloth — 2-12x faster fine-tuning, 80% less memory
- Axolotl — Multi-GPU, supports LoRA/QLoRA/SFT/RLHF/GRPO
- LLaMA-Factory — GUI-first, 100+ model support, no-code fine-tuning
- TRL — Hugging Face RL-based alignment (GRPO, PPO, DPO)
- Torchtune — PyTorch-native, deep customization
- PEFT — Parameter-efficient methods (LoRA, QLoRA, adapters)
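The LoRA idea shared by most of these tools: freeze the base weight W and train a low-rank pair (B, A), so the effective weight becomes W + (alpha / r) * B @ A. A tiny pure-Python illustration with made-up matrices:

```python
"""LoRA's effective weight: W + (alpha / r) * B @ A, with W frozen and
only the low-rank factors B and A trained. Matrices here are toy values."""

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    # W: d_out x d_in (frozen); B: d_out x r; A: r x d_in (trained).
    delta = matmul(B, A)
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 frozen weight
B = [[1.0], [0.0]]             # 2x1 trained factor
A = [[0.0, 2.0]]               # 1x2 trained factor, rank r = 1
W_eff = lora_effective_weight(W, A, B, alpha=2.0, r=1)
```

The memory win comes from training only B and A (here 4 values instead of 4, but at real dimensions r << d makes the update a tiny fraction of W).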
Guardrails / Safety
- NeMo Guardrails — NVIDIA’s open-source safety toolkit
- Guardrails AI — Output validation and structuring
- Llama Guard — LLM-based content classification
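Structurally, a guardrail is a check applied to model output before it reaches the user. The toy rule-based validator below shows only that shape; real toolkits use configurable dialogue flows (NeMo Guardrails) or classifier LLMs (Llama Guard), not a keyword list.

```python
"""A toy output validator illustrating the guardrail pattern: inspect
model output, return a verdict. Patterns are illustrative only."""
import re

BLOCKLIST = [r"\bpassword\b", r"\bssn\b"]  # invented example rules

def check_output(text: str) -> dict:
    # Return a verdict plus which rule fired, if any.
    for pattern in BLOCKLIST:
        if re.search(pattern, text, flags=re.IGNORECASE):
            return {"allowed": False, "rule": pattern}
    return {"allowed": True, "rule": None}

verdict = check_output("Your password is hunter2")
```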
LLM Monitoring / Observability
- Langfuse — Open-source LLM observability (acquired by ClickHouse)
- LangSmith — Tracing, cost, and latency tracking
- Arize Phoenix — Open-source drift and bias monitoring
- Weights & Biases Weave — Multi-agent execution traces
- AgentOps — Agentic workflow monitoring
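The primitive these platforms share is the trace span: wrap each LLM call, record latency and metadata, ship it to a backend. A stdlib sketch with an in-memory list standing in for that backend:

```python
"""A minimal trace-span decorator, the core observability primitive.
SPANS stands in for a real export backend."""
import functools
import time

SPANS = []  # in-memory sink standing in for an observability backend

def traced(name):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            SPANS.append({
                "name": name,
                "latency_s": time.perf_counter() - start,
                "output_chars": len(str(result)),
            })
            return result
        return wrapper
    return decorator

@traced("fake_llm_call")
def fake_llm_call(prompt):
    # Stand-in for a real model call.
    return f"response to: {prompt}"

reply = fake_llm_call("hello")
```

Real SDKs add token counts, costs, and nested spans for multi-step agent runs, but the wrap-record-export shape is the same.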
Evaluation and Testing
- RAGAS — RAG-specific evaluation (faithfulness, relevancy, recall)
- DeepEval — 14+ targeted metrics, pytest-like LLM testing
- PromptFoo — Red teaming, security testing, and CI/CD integration
- OpenAI Evals — Open-source evaluation framework
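A stripped-down version of the overlap-style scores these evaluators report: what fraction of answer tokens are supported by the retrieved context. Real RAGAS/DeepEval metrics use LLM judges and claim decomposition rather than raw token overlap, so this is intuition only.

```python
"""A toy faithfulness-style score: fraction of answer tokens present in
the retrieved context. Real evaluators use LLM judges instead."""

def support_ratio(answer: str, context: str) -> float:
    # Fraction of (lowercased) answer tokens that appear in the context.
    answer_tokens = answer.lower().split()
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    hits = sum(1 for tok in answer_tokens if tok in context_tokens)
    return hits / len(answer_tokens)

score = support_ratio("Paris is the capital",
                      "the capital of France is Paris")
```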
