Twelfth Meeting
Presenter
References
- Just, Hoang Anh, Feiyang Kang, Jiachen T. Wang, Yi Zeng, Myeongseob Ko, Ming Jin, and Ruoxi Jia. “LAVA: Data valuation without pre-specified learning algorithms.” In the 11th International Conference on Learning Representations (ICLR), 2023.
- Pruthi, Garima, Frederick Liu, Satyen Kale, and Mukund Sundararajan. “Estimating training data influence by tracing gradient descent.” Advances in Neural Information Processing Systems 33 (2020).
- Thakkar, Megh, Tolga Bolukbasi, Sriram Ganapathy, Shikhar Vashishth, Sarath Chandar, and Partha Talukdar. “Self-influence guided data reweighting for language model pre-training.” In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.
- Wang, Xiao, Weikang Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, and Tao Gui. “Farewell to aimless large-scale pretraining: Influential subset selection for language model.” In Findings of the Association for Computational Linguistics: ACL 2023.