Publications

(2024). BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation. ACL 2024.

(2024). SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs. arXiv.

(2024). Benchmarking and Dissecting the Nvidia Hopper GPU Architecture. IPDPS 2024.