Publications

(2024). BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation. ACL 2024.

(2024). SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs. arXiv.

(2024). Benchmarking and Dissecting the Nvidia Hopper GPU Architecture. IPDPS 2024.
