publications
(*) denotes equal contribution.
For a complete list, visit my Google Scholar profile.
2024
- Preprint: Quadratic Gating Functions in Mixture of Experts: A Statistical Insight. arXiv:2410.11222, 2024. Under review.
- Preprint: Understanding Expert Structures on Minimax Parameter Estimation in Contaminated Mixture of Experts. arXiv:2410.12258, 2024. Under review.
- Preprint: Statistical Advantages of Perturbing Cosine Router in Sparse Mixture of Experts. arXiv:2405.14131, 2024. Under review.
- ICML: Improving Computational Complexity in Statistical Models with Local Curvature Information. In International Conference on Machine Learning (ICML), 2024.
- ICML: Is Temperature Sample Efficient for Softmax Gaussian Mixture of Experts? In International Conference on Machine Learning (ICML), 2024.
- ICML: A General Theory for Softmax Gating Multinomial Logistic Mixture of Experts. In International Conference on Machine Learning (ICML), 2024.
- ICLR: Statistical Perspective of Top-K Sparse Softmax Gating Mixture of Experts. In International Conference on Learning Representations (ICLR), 2024.
2022
- NeurIPS: Improving Counterfactual Explanations for Time Series Classification Models in Healthcare Settings. In NeurIPS 2022 Workshop on Learning from Time Series for Health, 2022.