Zhenmei Shi
Research Scientist at Voyage AI; PhD from University of Wisconsin–Madison
Verified email at cs.wisc.edu
Title · Cited by · Year
SF-Net: Structured feature network for continuous sign language recognition
Z Yang*, Z Shi*, X Shen, YW Tai
arXiv preprint arXiv:1908.01341, 2019
82 · 2019
A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features
Z Shi*, J Wei*, Y Liang
ICLR 2022: International Conference on Learning Representations, 2022
62 · 2022
Deep Online Fused Video Stabilization
Z Shi, F Shi, WS Lai, CK Liang, Y Liang
WACV 2022: Winter Conference on Applications of Computer Vision, 2022
34 · 2022
Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability
Z Xu*, Z Shi*, Y Liang
COLM 2024: Conference on Language Modeling, 2024
33 · 2024
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
Z Shi*, J Chen*, K Li, J Raghuram, X Wu, Y Liang, S Jha
ICLR 2023 (Spotlight): International Conference on Learning Representations, 2023
29 · 2023
Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers
Y Liang*, Z Shi*, Z Song*, Y Zhou*
AFM Workshop @ NeurIPS 2024, 2024
26 · 2024
Towards Few-Shot Adaptation of Foundation Models via Multitask Finetuning
Z Xu, Z Shi, J Wei, F Mu, Y Li, Y Liang
ICLR 2024: International Conference on Learning Representations, 2024
25 · 2024
When and How Does Known Class Help Discover Unknown Ones? Provable Understandings Through Spectral Analysis
Y Sun, Z Shi, Y Liang, Y Li
ICML 2023: International Conference on Machine Learning, 2023
25 · 2023
Fourier Circuits in Neural Networks and Transformers: A Case Study of Modular Arithmetic with Multiple Inputs
C Li*, Y Liang*, Z Shi*, Z Song*, T Zhou*
AISTATS 2025: International Conference on Artificial Intelligence and Statistics, 2025
23* · 2025
Conv-basis: A new paradigm for efficient attention inference and gradient computation in transformers
Y Liang*, H Liu*, Z Shi*, Z Song*, Z Xu*, J Yin*
arXiv preprint arXiv:2405.05219, 2024
23 · 2024
A Graph-Theoretic Framework for Understanding Open-World Semi-Supervised Learning
Y Sun, Z Shi, Y Li
NeurIPS 2023 (Spotlight): Neural Information Processing Systems, 2023
22 · 2023
Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time
Y Liang*, Z Sha*, Z Shi*, Z Song*, Y Zhou*
OPT Workshop @ NeurIPS 2024, 2024
21 · 2024
Beyond Linear Approximations: A Novel Pruning Approach for Attention Matrix
Y Liang*, J Long*, Z Shi*, Z Song*, Y Zhou*
ICLR 2025: International Conference on Learning Representations, 2025
20* · 2025
Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction
Z Shi, Y Ming, XP Nguyen, Y Liang, S Joty
arXiv preprint arXiv:2409.17422, 2024
20 · 2024
Domain generalization via nuclear norm regularization
Z Shi, Y Ming, Y Fan, F Sala, Y Liang
CPAL 2024: Conference on Parsimony and Learning, 179-201, 2024
18 · 2024
A Tighter Complexity Analysis of SparseGPT
X Li*, Y Liang*, Z Shi*, Z Song*
Compression Workshop @ NeurIPS 2024, 2024
17 · 2024
Toward Infinite-Long Prefix in Transformer
Y Liang*, Z Shi*, Z Song*, C Yang*
arXiv preprint arXiv:2406.14036, 2024
17 · 2024
Differential Privacy of Cross-Attention with Provable Guarantee
Y Liang*, Z Shi*, Z Song*, Y Zhou*
SafeGenAi Workshop @ NeurIPS 2024, 2024
16 · 2024
Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective
Y Liang*, Z Shi*, Z Song*, Y Zhou*
arXiv preprint arXiv:2405.16418, 2024
16 · 2024
Attentive walk-aggregating graph neural networks
MF Demirel, S Liu, S Garg, Z Shi, Y Liang
Transactions on Machine Learning Research, 2022
16* · 2022
Articles 1–20