efficient-attention

Here are 19 public repositories matching this topic...

thu-ml / SageAttention

[ICLR 2025, ICML 2025, NeurIPS 2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.

cuda triton attention vit quantization video-generation mlsys inference-acceleration efficient-attention llm llm-infra video-generate

Updated Jan 17, 2026
Cuda

lucidrains / ring-attention-pytorch

Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch

attention-mechanism efficient-attention long-context distributed-attention

Updated May 16, 2025
Python

lucidrains / CoLT5-attention

Implementation of the conditionally routed attention in the CoLT5 architecture, in Pytorch

deep-learning routing artificial-intelligence attention-mechanisms efficient-attention

Updated Sep 6, 2024
Python

lucaslingle / transformer_vq

Official implementation of 'Transformer-VQ: Linear-Time Transformers via Vector Quantization'

research deep-learning transformers tpu jax efficient-attention long-context

Updated Dec 4, 2023
Python

jlamprou / Infini-Attention

Efficient Infinite Context Transformers with Infini-attention Pytorch Implementation + QwenMoE Implementation + Training Script + 1M context keypass retrieval

transformer infinite attention efficient-attention llm qwen

Updated May 9, 2024
Python

Ascend-Research / CascadedGaze

The official PyTorch implementation for CascadedGaze: Efficiency in Global Context Extraction for Image Restoration, TMLR'24.

efficiency transformer image-restoration deblurring denoising efficient-attention

Updated Feb 13, 2025
Python

davidsvy / cosformer-pytorch

Unofficial PyTorch implementation of the paper "cosFormer: Rethinking Softmax In Attention".

deep-learning neural-network pytorch artificial-intelligence transformer attention-mechanism iclr efficient-attention iclr2022

Updated Oct 29, 2021
Jupyter Notebook

zhenyi4 / ssa

Official repository for "SSA: Sparse Sparse Attention by Aligning Full and Sparse Attention Outputs in Feature Space"

efficiency pre-training efficient-attention sparse-attention llm

Updated May 7, 2026
Python

HolmesShuan / Compact-Global-Descriptor

Pytorch implementation of "Compact Global Descriptor for Neural Networks" (CGD).

efficient pytorch convolutional-neural-networks attention-mechanism attention-model efficient-attention

Updated Jan 9, 2025
Python

robflynnyh / hydra-linear-attention

Implementation of: Hydra Attention: Efficient Attention with Many Heads (https://arxiv.org/abs/2209.07484)

machine-learning transformers attention linear-attention efficient-attention

Updated Jan 8, 2023
Python

gmlwns2000 / sea-attention

Official Implementation of SEA: Sparse Linear Attention with Estimated Attention Mask (ICLR 2024)

attention linear-attention efficient-attention sea-attention

Updated Jun 20, 2025
Python

MAGICS-LAB / NonparametricHopfield

Nonparametric Modern Hopfield Models

efficient-transformers efficient-attention modern-hopfield-networks modern-hopfield-model efficient-hopfield-models efficient-hopfield-networks

Updated Jan 8, 2024
Jupyter Notebook

nazmiefearmutcu / Prizma

Two small-scale research threads with pre-registered falsifiable bars + adversarial referee audits: Prizma-Seq (a parameter-free quadratic delta-state sequence mixer, an efficient-attention-replacement candidate) and Prizma (backprop-free, fully-local continual learning).

machine-learning research deep-learning pytorch continual-learning efficient-attention sequence-model

Updated Jun 20, 2026
Python

Lanerra / DWARF

O(N) attention with a bounded inference KV cache. D4 Daubechies wavelet field + content-gated Q·K gather at dyadic offsets.

nlp rust machine-learning deep-learning language-modeling pytorch wavelet language-model attention-mechanism wavelet-transform ablation inference-efficiency ablation-study kv-cache linear-attention efficient-attention sparse-attention wavelet-attention

Updated Jun 9, 2026
Python

zengxyyu / HiCI

HiCI: Hierarchical Construction-Integration for Long-Context Attention

memory transformers attention-mechanism hierarchical-attention efficient-attention llm long-context-modeling hici

Updated May 9, 2026
Python

pszemraj / samba-pytorch

Minimal implementation of Samba by Microsoft in PyTorch

language-model ssm pytorch-implementation efficient-attention llm long-context-modeling mamba-state-space-models

Updated Nov 24, 2024
Python

StaryMoon / Mamba2-Unofficial

Unofficial PyTorch reproduction for Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality.

pytorch reproduction state-space-model sequence-modeling efficient-attention mamba2 unofficial-implementation

Updated Jul 2, 2026
Python

YentlCollin / Projet_TabPFN_yentl

Tabular foundation model experiments with learned context sampling and efficient attention

pytorch efficient-attention in-context-learning tabpfn tabular-foundation-models

Updated May 4, 2026
Jupyter Notebook

alosaupending874 / sage

Run local AI models on your machine with a secure, Rust-based inference engine that keeps your data private and provides controlled system access.

android data-science reinforcement-learning ai wordpress-theme pytorch developer-tools vit copilot mlops mlsys inference-acceleration efficient-attention llm video-generate

Updated Jul 3, 2026
