Generative AI Systems Stack
LLMs / Foundation Models
- OpenAI — GPT-5, o3, o4-mini
- Anthropic — Claude Opus 4.6, Sonnet 4.5
- Google Gemini — Gemini 3, Gemini 2.5 Pro
- Meta Llama — Llama 4 (Maverick, Scout)
- DeepSeek — DeepSeek R1
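Despite vendor differences, most of these models are reached through a chat-style API that takes a list of role-tagged messages. A minimal sketch of the common request shape (the model name and helper function are illustrative, not tied to any one provider):

```python
# Sketch of an OpenAI-style chat-completion payload; most providers in
# this list accept this shape or something very close to it.

def build_chat_request(model: str, system: str, user: str,
                       temperature: float = 0.7) -> dict:
    """Assemble a chat request: system prompt, user turn, sampling params."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": temperature,
    }

request = build_chat_request("gpt-5", "You are a concise assistant.",
                             "Summarize RAG in one sentence.")
```

The same payload shape is what the serving tools further down (vLLM, TGI, Ollama) expose locally, which is why clients are largely interchangeable across providers.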
AI Frameworks
- LangChain — LLM orchestration and chaining
- LlamaIndex — Data-centric RAG and document processing
- Haystack — Production RAG and search pipelines
- Semantic Kernel — Microsoft’s enterprise AI SDK
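The idea these frameworks share is composing small steps (prompt formatting, model call, output parsing) into a pipeline. A toy sketch of that chaining pattern in plain Python; `fake_llm` stands in for a real model call:

```python
from typing import Callable

def compose(*steps: Callable) -> Callable:
    """Chain single-argument steps left to right, like an LCEL pipe."""
    def chain(value):
        for step in steps:
            value = step(value)
        return value
    return chain

def format_prompt(question: str) -> str:
    return f"Answer briefly: {question}"

def fake_llm(prompt: str) -> str:
    # Stub; a real chain would call a hosted or local model here.
    return f"[model output for: {prompt}]"

def parse(text: str) -> str:
    return text.strip()

pipeline = compose(format_prompt, fake_llm, parse)
answer = pipeline("What is a vector database?")
```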
AI Coding Assistants
- GitHub Copilot — Inline code completion
- Cursor — AI-native code editor
- Claude Code — CLI-based coding assistant
Text Embeddings
- OpenAI Embeddings — text-embedding-3-large
- Cohere Embed — Multilingual and multimodal
- Mistral Embed — Retrieval-optimized
- Voyage AI — Domain-specific embeddings
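All of these models map text to fixed-length vectors, and semantic similarity is typically scored with cosine similarity. A self-contained sketch with toy 4-dimensional vectors standing in for real model output (e.g. text-embedding-3-large's much higher-dimensional vectors):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors: semantically close texts get nearby embeddings.
cat = [0.9, 0.1, 0.0, 0.2]
kitten = [0.85, 0.15, 0.05, 0.25]
invoice = [0.0, 0.8, 0.6, 0.0]
```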
Vector Databases
- Pinecone — Fully managed, serverless
- Weaviate — Open-source with hybrid search
- Qdrant — High-performance, Rust-based
- Chroma — Lightweight, developer-friendly
- Milvus — Billion-scale vector search
- pgvector — PostgreSQL extension
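At their core, every store in this list answers the same query: given a vector, return the k nearest stored vectors. A brute-force in-memory sketch of that operation (production systems use approximate indexes such as HNSW or IVF to scale past brute force):

```python
import math

def top_k(query: list[float], index: list[tuple[str, list[float]]],
          k: int = 2) -> list[str]:
    """Return ids of the k vectors most similar to `query` (cosine)."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb)
    ranked = sorted(index, key=lambda item: cos(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

index = [("doc1", [1.0, 0.0]), ("doc2", [0.0, 1.0]), ("doc3", [0.9, 0.1])]
results = top_k([1.0, 0.0], index)
```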
RAG (Retrieval-Augmented Generation)
- Agentic RAG — LLM-driven query decomposition
- HiFi-RAG — Multi-stage hierarchical filtering
- Bidirectional RAG — Controlled write-back with grounding checks
- LlamaIndex Workflows — Event-driven RAG pipelines
- LangChain LCEL (LangChain Expression Language) — Chain-based RAG orchestration
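Underneath all of these variants is the same retrieve-then-generate core: fetch the most relevant chunks, then pack them into the prompt as grounding context. A toy sketch using word-overlap as the relevance score (a real pipeline would use embeddings and a vector store):

```python
def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by word overlap with the query; toy retriever."""
    q = set(query.lower().split())
    return sorted(chunks,
                  key=lambda c: len(q & set(c.lower().split())),
                  reverse=True)[:k]

def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """Stuff retrieved chunks into the prompt as grounding context."""
    context = "\n".join(f"- {c}" for c in retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Qdrant is a Rust-based vector database.",
    "LoRA freezes base weights and trains low-rank adapters.",
    "pgvector adds vector search to PostgreSQL.",
]
prompt = build_rag_prompt("Which vector database is written in Rust?", chunks)
```

The agentic variants above differ mainly in who drives this loop: the LLM itself decomposes the query, decides when to retrieve again, and checks grounding before answering.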
AI Agents
- LangGraph — Stateful multi-actor workflows
- CrewAI — Role-based multi-agent collaboration
- AutoGen — Microsoft’s multi-agent conversation framework
- AutoGPT — Autonomous long-running agents
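The loop underneath most of these frameworks is the same: the model picks a tool, the runtime executes it, and the observation is fed back until the model decides to answer. A minimal sketch where `fake_policy` stands in for a real LLM call:

```python
def calculator(expr: str) -> str:
    return str(eval(expr))  # toy tool; never eval untrusted input

TOOLS = {"calculator": calculator}

def fake_policy(history: list) -> tuple:
    """Stub policy: request one tool call, then finish with its result.
    A real agent would ask the LLM to choose the next action here."""
    if not any(step[0] == "observation" for step in history):
        return ("call", "calculator", "6 * 7")
    return ("finish", history[-1][1])

def run_agent(max_steps: int = 5):
    history = []
    for _ in range(max_steps):
        action = fake_policy(history)
        if action[0] == "finish":
            return action[1]
        _, tool, arg = action
        history.append(("observation", TOOLS[tool](arg)))
    return None  # step budget exhausted

result = run_agent()
```

LangGraph makes this loop an explicit state graph; CrewAI and AutoGen instead coordinate several such loops, one per agent role.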
LLM Serving / Inference
- vLLM — PagedAttention, high-throughput serving
- TGI (Text Generation Inference) — Hugging Face’s production inference server
- Ollama — Local model serving
- TensorRT-LLM — NVIDIA-optimized inference
- SGLang — Structured generation for constrained outputs
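vLLM, TGI, and Ollama can all expose an OpenAI-compatible HTTP endpoint, so one client works across local and hosted backends. A sketch of the request a client would send to a local server (the port and model name are illustrative; the actual POST is omitted since it needs a running server):

```python
import json

payload = {
    "model": "llama-4-scout",  # whatever model the local server has loaded
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 64,
    "stream": False,
}
body = json.dumps(payload)
# e.g. POST http://localhost:8000/v1/chat/completions with this JSON body
```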
Fine-Tuning
- Axolotl — Multi-GPU, supports LoRA/QLoRA/SFT/RLHF
- Unsloth — 2-5x faster fine-tuning, 80% less memory
- Torchtune — Deep customization and scalability
- PEFT — Hugging Face parameter-efficient fine-tuning
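The LoRA idea these tools build on: freeze the base weight matrix W and learn a low-rank update B·A, so only r·(d_in + d_out) parameters train instead of d_in·d_out. A toy numeric sketch in plain Python:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r = 4, 1                            # toy sizes; real d is in the thousands
W = [[1.0] * d for _ in range(d)]      # frozen base weight (d x d)
B = [[0.5] for _ in range(d)]          # trainable low-rank factor (d x r)
A = [[0.1] * d]                        # trainable low-rank factor (r x d)

delta = matmul(B, A)                   # rank-r update to W
W_adapted = [[w + dw for w, dw in zip(rw, rd)]
             for rw, rd in zip(W, delta)]

trainable = r * (d + d)                # 8 parameters instead of d*d = 16
```

QLoRA applies the same update on top of a quantized (e.g. 4-bit) base model, which is where the large memory savings come from.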
LLM Monitoring / Observability
- LangSmith — Tracing, cost, and latency tracking
- Weights & Biases Weave — Multi-agent execution traces
- Arize Phoenix — Open-source drift and bias monitoring
- Langfuse — Open-source LLM observability
- AgentOps — Agentic workflow monitoring
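The core of what these tools capture per LLM call is a trace record: inputs, outputs, latency, and token/cost counters. A minimal sketch of that instrumentation as a decorator (the field names are illustrative):

```python
import functools
import time

TRACES = []  # in-memory sink; real tools ship records to a backend

def traced(fn):
    """Record name, latency, and output size for each call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        TRACES.append({
            "name": fn.__name__,
            "latency_s": time.perf_counter() - start,
            "output_chars": len(str(result)),
        })
        return result
    return wrapper

@traced
def fake_llm_call(prompt: str) -> str:
    return f"echo: {prompt}"

fake_llm_call("hi")
```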
Evaluation and Testing
- RAGAS — RAG-specific evaluation (faithfulness, answer relevancy, context recall)
- DeepEval — 60+ metrics, Pytest-like LLM testing
- PromptFoo — Prompt A/B testing via YAML/CLI
- OpenAI Evals — Open-source evaluation framework
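The pattern these frameworks share: run a set of cases, score each output with a metric, and aggregate. A toy harness using exact match as the metric (RAGAS/DeepEval-style metrics such as faithfulness are LLM- or embedding-scored instead; the lambda stands in for a real model):

```python
def exact_match(predicted: str, expected: str) -> float:
    """1.0 if the normalized strings match, else 0.0."""
    return 1.0 if predicted.strip().lower() == expected.strip().lower() else 0.0

def run_eval(cases: list[dict], model) -> float:
    """Score every case and return mean accuracy."""
    scores = [exact_match(model(c["input"]), c["expected"]) for c in cases]
    return sum(scores) / len(scores)

cases = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
]
accuracy = run_eval(cases, model=lambda q: "4" if q == "2+2" else "Paris")
```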
