AI Resources

Generative AI Systems Stack

LLMs / Foundation Models

AI Frameworks

AI Coding Assistants

  • Cursor — AI-native code editor with Background Agents
  • GitHub Copilot — Inline code completion and chat
  • Claude Code — CLI-based agentic coding assistant
  • Windsurf — AI IDE with agentic flows
  • OpenAI Codex — Cloud-based agentic coding with parallel worktrees
  • Devin — Autonomous AI software engineer
  • Kiro (AWS) — Spec-driven AI IDE

Text Embeddings

Vector Databases

  • Pinecone — Fully managed, serverless
  • Weaviate — Open-source with hybrid search
  • Qdrant — High-performance, Rust-based
  • Chroma — Lightweight, developer-friendly
  • Milvus — Billion-scale vector search, hot/cold tiering
  • pgvector — PostgreSQL extension
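At their core, all of these databases answer the same query: given an embedding, return the stored vectors most similar to it. A minimal sketch of that operation in pure Python (exact cosine-similarity scan; the function names and toy index here are illustrative, not any vendor's API):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(index, query, top_k=2):
    """Exact nearest-neighbor search: score every stored vector.
    Real vector databases replace this linear scan with approximate
    indexes (e.g. HNSW, IVF) to stay fast at millions of vectors."""
    scored = [(doc_id, cosine_similarity(vec, query))
              for doc_id, vec in index.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:top_k]

index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.9, 0.1, 0.0],
}
print(search(index, [1.0, 0.0, 0.0]))
```

The products above differ mainly in how they index (graph vs. clustering), where they run (managed vs. embedded vs. Postgres extension), and what they bolt on (filtering, hybrid search, tiering).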

RAG (Retrieval-Augmented Generation)

  • Hybrid RAG — Dense vector + sparse keyword search (production baseline)
  • Agentic RAG — Autonomous plan-retrieve-reason loops
  • Graph RAG — Knowledge graph layer for multi-hop reasoning
  • LlamaIndex Workflows — Event-driven RAG pipelines
  • LangChain LCEL — Chain-based RAG orchestration
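Hybrid RAG needs a way to merge the dense and sparse result lists into one ranking; reciprocal rank fusion (RRF) is a common, model-free choice. A minimal sketch (the toy doc ids are illustrative):

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids with RRF:
    score(d) = sum over lists of 1 / (k + rank(d)), ranks starting at 1.
    k = 60 is the constant from the original RRF paper."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d2"]    # e.g. vector-similarity order
sparse = ["d1", "d4", "d3"]   # e.g. BM25 keyword order
print(reciprocal_rank_fusion([dense, sparse]))
```

Documents that appear high in both lists (here "d1" and "d3") float to the top, which is why RRF works as a production baseline without tuning score scales across retrievers.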

AI Agents

  • LangGraph — Stateful multi-actor workflows
  • CrewAI — Role-based multi-agent collaboration
  • OpenAI Agents SDK — Production-grade agent orchestration
  • Claude Agent SDK — Tool-use agents with constitutional safety
  • Google ADK — Agent Development Kit (Python, Java, Go, TS)
  • AG2 — Community fork of AutoGen, multi-agent conversations
  • Smolagents — Hugging Face’s lightweight agent framework
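Despite their different APIs, all of these frameworks wrap the same core loop: the model proposes an action, the runtime executes a tool, and the observation is fed back until the model emits a final answer. A toy sketch of that loop, with a hard-coded stand-in where a real framework would call an LLM (every name here is hypothetical):

```python
TOOLS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
}

def fake_model(task, history):
    """Stand-in for an LLM: returns the next tool call or a final answer.
    A real agent framework would prompt a model with the task and history."""
    if not history:
        return {"tool": "add", "args": (2, 3)}
    if len(history) == 1:
        return {"tool": "mul", "args": (history[-1], 10)}
    return {"final": history[-1]}

def run_agent(task, max_steps=5):
    """The plan-act-observe loop: ask the model for an action, execute
    the tool, append the observation, repeat until a final answer."""
    history = []
    for _ in range(max_steps):
        action = fake_model(task, history)
        if "final" in action:
            return action["final"]
        result = TOOLS[action["tool"]](*action["args"])
        history.append(result)
    raise RuntimeError("agent did not finish within max_steps")

print(run_agent("compute (2 + 3) * 10"))
```

What the frameworks above add on top is the hard part: state persistence, multi-agent handoffs, streaming, retries, and guardrails around this loop.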

Model Context Protocol (MCP)

LLM Serving / Inference

  • vLLM — PagedAttention, high-throughput serving
  • SGLang — Zero-overhead batch scheduler, strong throughput on public benchmarks
  • Ollama — Local model serving (llama.cpp-based, GPU-accelerated on Apple Silicon via Metal)
  • TensorRT-LLM — NVIDIA-optimized inference
  • TGI — Hugging Face’s production inference server
  • LiteLLM — AI gateway/proxy for 100+ LLMs in OpenAI format
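The throughput gains in servers like vLLM, SGLang, and TGI come largely from continuous batching: finished sequences leave the batch immediately and waiting requests slot in, instead of the whole batch stalling on its longest sequence. A toy scheduling simulation of the idea (no model, just bookkeeping; all names are illustrative):

```python
from collections import deque

def continuous_batching(requests, max_batch=2):
    """Simulate continuous batching. Each request is
    (id, tokens_to_generate); returns the set of running ids at each
    decode step. A freed slot is refilled on the very next step."""
    waiting = deque(requests)
    running = {}          # id -> tokens remaining
    timeline = []
    while waiting or running:
        while waiting and len(running) < max_batch:
            rid, n = waiting.popleft()
            running[rid] = n
        timeline.append(sorted(running))
        for rid in list(running):    # one decode step per running sequence
            running[rid] -= 1
            if running[rid] == 0:
                del running[rid]     # sequence done; slot opens up
    return timeline

print(continuous_batching([("a", 1), ("b", 3), ("c", 2)]))
```

Here "c" starts as soon as "a" finishes, so three requests complete in three steps; static batching would have run "c" only after both "a" and "b" drained.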

Fine-Tuning

  • Unsloth — Faster fine-tuning with lower memory use (vendor-reported ~2x speed, up to 80% less VRAM)
  • Axolotl — Multi-GPU, supports LoRA/QLoRA/SFT/RLHF/GRPO
  • LLaMA-Factory — No-code fine-tuning via web UI, 100+ models supported
  • TRL — Hugging Face RL-based alignment (GRPO, PPO, DPO)
  • Torchtune — PyTorch-native, deep customization
  • PEFT — Parameter-efficient methods (LoRA, QLoRA, adapters)
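The LoRA idea that underpins most of these tools: instead of updating a full d×d weight matrix W, train two thin matrices B (d×r) and A (r×d) with small rank r, and use the effective weight W + (alpha/r)·BA. A minimal numeric sketch in pure Python (list-of-lists matrices; forward pass only, no training step):

```python
def matmul(A, B):
    """Plain list-of-lists matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_forward(W, A, B, x, alpha=2.0, r=1):
    """Apply the LoRA-adapted weight to column vector x:
    (W + (alpha / r) * B @ A) @ x.
    Only A and B would be trained; W stays frozen."""
    delta = matmul(B, A)                 # d x d update from r-rank factors
    scale = alpha / r
    W_eff = [[w + scale * d for w, d in zip(wr, dr)]
             for wr, dr in zip(W, delta)]
    return matmul(W_eff, [[v] for v in x])

# d = 2, r = 1: the adapter adds 2*2 = 4 trainable numbers vs. 4 frozen ones;
# at d = 4096 the saving is what makes LoRA/QLoRA cheap.
print(lora_forward([[1, 0], [0, 1]], [[0, 1]], [[1], [0]], [1, 1]))
```

QLoRA is the same trick with W quantized to 4-bit, which is why the memory savings compound.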

Guardrails / Safety

LLM Monitoring / Observability

Evaluation and Testing

  • RAGAS — RAG-specific evaluation (faithfulness, relevancy, recall)
  • DeepEval — 14+ targeted metrics, Pytest-like LLM testing
  • PromptFoo — Red teaming, security testing, and CI/CD integration
  • OpenAI Evals — Open-source evaluation framework
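The retrieval-side metrics these tools report (recall, precision, relevancy) reduce to simple set arithmetic over ranked results; the LLM-judged metrics like faithfulness layer a grading model on top. A minimal sketch of recall@k (illustrative, not any one framework's API):

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant docs that appear in the top-k retrieved.
    retrieved: ranked list of doc ids; relevant: set of gold doc ids."""
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)

# Two relevant docs, only one surfaced in the top 2 -> recall@2 = 0.5
print(recall_at_k(["d1", "d2", "d3"], {"d1", "d4"}, k=2))
```

In a CI pipeline (the PromptFoo/DeepEval pattern), a metric like this becomes an assertion with a threshold, so a retriever regression fails the build instead of shipping.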