dattri: A Library for Efficient Data Attribution

dattri is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms. You can use dattri to:

- Deploy existing data attribution methods to PyTorch models, e.g., Influence Function, TracIn, RPS, TRAK, …
- Develop new data attribution methods with efficient implementations of low-level utility functions, e.g., Hessian (HVP/IHVP), Fisher Information Matrix (IFVP), random projection, dropout ensembling, …
- Benchmark data attribution methods with standard benchmark settings, e.g., MNIST-10+LR/MLP, CIFAR-10/2+ResNet-9, MAESTRO + Music Transformer, Shakespeare + nanoGPT, …
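For readers new to the topic, the following is a minimal, dependency-free sketch of what a gradient-based attribution score can look like. It scores each training example by the product of its loss gradient with the test example's loss gradient at a single checkpoint, in the spirit of TracIn. It does not use dattri's API: the one-parameter model and the names `grad_squared_error` and `gradient_dot_scores` are hypothetical and chosen only for illustration.

```python
# Illustrative only: a TracIn-style score at one checkpoint for a
# one-parameter least-squares model. None of these names come from dattri.


def grad_squared_error(w, x, y):
    """Gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return (w * x - y) * x


def gradient_dot_scores(w, train_set, test_point, lr=1.0):
    """Score each training example by lr * grad_train * grad_test."""
    x_test, y_test = test_point
    g_test = grad_squared_error(w, x_test, y_test)
    return [lr * grad_squared_error(w, x, y) * g_test for x, y in train_set]


# Toy data (made up): two examples consistent with y = 2x, one that is not.
train_set = [(1.0, 2.0), (2.0, 4.0), (1.0, -3.0)]
test_point = (1.0, 2.0)

scores = gradient_dot_scores(w=1.0, train_set=train_set, test_point=test_point)
for (x, y), score in zip(train_set, scores):
    print(f"train example (x={x}, y={y}): score={score}")
```

Under this sign convention, a positive score means a gradient step on that training example would also lower the test loss, and a negative score means it would raise it. Practical methods extend this idea to full parameter vectors, multiple checkpoints, or curvature information such as inverse-Hessian-vector products.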

See also our [paper](https://arxiv.org/pdf/2410.04555), published in the NeurIPS 2024 Datasets and Benchmarks Track.

Attribution Task and Attributors:

Low-level Utility Functions:

Benchmark:

Benchmark Functions

Indices and tables