llm-deployment

Here are 21 public repositories matching this topic...

Production-ready checklists and frameworks for deploying LLMs, GenAI models, and AI infrastructure. Covers vLLM, Kubernetes, GPU optimization, observability, compliance, and Day-0 to Day-2 operations.

  • Updated Apr 15, 2026
  • Shell

🧠 A comprehensive toolkit for benchmarking, optimizing, and deploying local Large Language Models. Includes performance testing tools, optimized configurations for CPU/GPU/hybrid setups, and detailed guides to maximize LLM performance on your hardware.

  • Updated Mar 27, 2025
  • Shell

ModelSpec is an open, declarative specification for describing how AI models, especially LLMs, are deployed, served, and operated in production. It captures execution, serving, and orchestration intent to enable validation, reasoning, and automation across modern AI infrastructure.

  • Updated Apr 27, 2026
  • Python

A 6-week hands-on masterclass in production MLOps engineering. Build a file-backed experiment tracker, a containerized model registry, an inference server with dynamic batching, a cached pipeline DAG, automated drift detectors (PSI/KS), and high-performance LLM serving infra (KV-cache/continuous batching) from scratch.

  • Updated May 27, 2026
  • Jupyter Notebook
