Benchmark Functions

dattri.model_util.retrain.retrain_loo(train_func: Callable, dataloader: torch.utils.data.DataLoader, path: str, indices: List[int] | None = None, seed: int | None = None, **kwargs) → None

Retrain the model for Leave-One-Out (LOO) metric.

The retrained model checkpoints and the removed index metadata are saved to the path. The function will call the train_func to retrain the model for each subset dataloader with one index removed.

Parameters:
  • train_func (Callable) –

    The training function that takes a dataloader, and returns the retrained model. Here is an example of a training function:

    ```python
    def train_func(dataloader):
        model = Model()
        optimizer = ...
        criterion = ...
        model.train()
        for inputs, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
        return model
    ```

  • dataloader (torch.utils.data.DataLoader) – The dataloader used for training.

  • indices (List[int]) – The indices to remove from the dataloader. Default is None. None means that each index in the dataloader will be removed in turn.

  • seed (int) – The random seed for the training process. Default is None, which means the training process is not deterministic.

  • **kwargs – The arguments of train_func in addition to dataloader.

  • path (str) –

    The directory to save the retrained models and the removed index metadata. The directory should be organized as

    ```
    /$path
        metadata.yml
        /index_{indices[0]}
            model_weights.pt
        /index_{indices[1]}
            model_weights.pt
        ...
        /index_{indices[n]}
            model_weights.pt

    # metadata.yml
    data = {
        'mode': 'loo',
        'data_length': len(dataloader),
        'train_func': train_func.__name__,
        'indices': indices,
        'map_index_dir': {
            indices[0]: f'./index_{indices[0]}',
            indices[1]: f'./index_{indices[1]}',
            ...
        }
    }
    ```
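A minimal end-to-end sketch of calling retrain_loo. The toy linear model, random data, and output path below are hypothetical, chosen only for illustration:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from dattri.model_util.retrain import retrain_loo

# Hypothetical toy data: 20 samples, 4 features, 2 classes.
train_loader = DataLoader(
    TensorDataset(torch.randn(20, 4), torch.randint(0, 2, (20,))),
    batch_size=4,
)

def train_func(dataloader):
    model = nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for inputs, labels in dataloader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()
    return model

# Retrain once per held-out index; restricting indices keeps the sketch cheap.
retrain_loo(train_func, train_loader, path="./loo_ckpts",
            indices=[0, 1, 2], seed=42)
```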

dattri.model_util.retrain.retrain_lds(train_func: Callable, dataloader: torch.utils.data.DataLoader, path: str, num_subsets: int = 100, subset_ratio: float = 0.5, num_runs_per_subset: int = 1, start_id: int = 0, total_num_subsets: int = 0, seed: int | None = None, **kwargs) → None

Retrain the model for the Linear Datamodeling Score (LDS) metric calculation.

The retrained model checkpoints and the subset data indices metadata will be saved to path. The function will call the train_func to retrain the model for each subset dataloader with a random subset of the data.

Parameters:
  • train_func (Callable) –

    The training function that takes a dataloader, and returns the retrained model. Here is an example of a training function:

    ```python
    def train_func(dataloader, seed=None, **kwargs):
        model = Model()
        optimizer = ...
        criterion = ...
        model.train()
        for inputs, labels in dataloader:
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
        return model
    ```

  • dataloader (torch.utils.data.DataLoader) – The dataloader used for training.

  • path (str) –

    The directory to save the retrained models and the subset index metadata. The directory should be organized as

    ```
    /$path
        metadata.yml
        /0
            model_weights_0.pt
            model_weights_1.pt
            ...
            model_weights_M.pt
            indices.txt
        ...
        /N
            model_weights_0.pt
            model_weights_1.pt
            ...
            model_weights_M.pt
            indices.txt
    ```

    where N is (num_subsets - 1) and M is (num_runs_per_subset - 1).

  • num_subsets (int) – The number of subsets to retrain. Default is 100.

  • subset_ratio (float) – The ratio of the subset to the whole dataset. Default is 0.5.

  • num_runs_per_subset (int) – The number of retraining runs for each subset. Several runs can mitigate the randomness in training. Default is 1.

  • start_id (int) – The starting index for the subset directory. Default is 0. This is useful for parallelizing the retraining process.

  • total_num_subsets (int) – The total number of subsets. Default is 0, which means the total number of subsets is equal to num_subsets. This is useful for parallelizing the retraining process.

  • seed (int) – The random seed for the training process and subset sampling. Default is None, which means the training process and subset sampling are not deterministic.

  • **kwargs – The arguments of train_func in addition to dataloader.

Raises:
  • ValueError – If total_num_subsets is negative.

  • ValueError – If num_subsets does not divide total_num_subsets.
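A hedged sketch of the parallelization pattern enabled by start_id and total_num_subsets: two workers each retrain half of the same 16 subsets into one shared directory. train_func and train_loader are assumed to be defined as in the retrain_loo sketch above, with train_func additionally accepting seed and **kwargs as in the example docstring:

```python
from dattri.model_util.retrain import retrain_lds

# Worker 0 retrains subsets 0-7 (directories ./lds_ckpts/0 ... ./lds_ckpts/7).
retrain_lds(train_func, train_loader, path="./lds_ckpts",
            num_subsets=8, start_id=0, total_num_subsets=16, seed=0)

# Worker 1 (e.g., a separate job) retrains subsets 8-15 into the same path.
retrain_lds(train_func, train_loader, path="./lds_ckpts",
            num_subsets=8, start_id=8, total_num_subsets=16, seed=0)
```

Note that num_subsets (8) must divide total_num_subsets (16), or a ValueError is raised.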

dattri.metric.calculate_loo_ground_truth(target_func: Callable, retrain_dir: str, test_dataloader: torch.utils.data.DataLoader) → Tuple[torch.Tensor, torch.Tensor]

Calculate the ground truth values for the Leave-One-Out (LOO) metric.

The LOO ground truth is obtained directly by computing the difference in target value for each sample in the test dataloader across each model in the retrain directory. The target value is computed by the target function.

Parameters:
  • target_func (Callable) –

    The target function that takes a model and a dataloader and returns the target value. An example of a target function follows:

    ```python
    def target_func(model, dataloader):
        model.eval()
        with torch.no_grad():
            for inputs, labels in dataloader:
                outputs = model(inputs)
                # Do something with the outputs, e.g., calculate the loss.
        return target_value
    ```

  • retrain_dir (str) – The directory containing the retrained models. It should be the directory saved by retrain_loo.

  • test_dataloader (torch.utils.data.DataLoader) – The dataloader where each of the samples is used as the test set.

Returns:

A tuple of two tensors. The first is the LOO ground truth values for each sample in test_dataloader and each model in retrain_dir; it has the shape (num_models, num_test_samples). The second is a tensor indicating the removed index; it has the shape (num_models,).

Return type:

Tuple[torch.Tensor, torch.Tensor]
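A sketch of computing the LOO ground truth from the checkpoints saved by retrain_loo above. The per-sample cross-entropy target and the toy test data are illustrative assumptions, not the only valid choices:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from dattri.metric import calculate_loo_ground_truth

# Hypothetical test data; per the docs, each sample serves as its own test set.
test_loader = DataLoader(
    TensorDataset(torch.randn(5, 4), torch.randint(0, 2, (5,))),
    batch_size=1,
)

def target_func(model, dataloader):
    # Illustrative target: cross-entropy loss on the (single-sample) dataloader.
    criterion = torch.nn.CrossEntropyLoss()
    model.eval()
    with torch.no_grad():
        for inputs, labels in dataloader:
            target_value = criterion(model(inputs), labels)
    return target_value

values, removed_indices = calculate_loo_ground_truth(
    target_func, "./loo_ckpts", test_loader,
)
# values: (num_models, num_test_samples); removed_indices: (num_models,)
```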

dattri.metric.calculate_lds_ground_truth(target_func: Callable, retrain_dir: str, test_dataloader: torch.utils.data.DataLoader) → Tuple[torch.Tensor, torch.Tensor]

Calculate the ground-truth values for the Linear Datamodeling Score (LDS) metric.

Given a target_func, this function calculates the values of the target_func on each sample in test_dataloader, and for each model in retrain_dir. These values will be used as the ground-truth values for the LDS metric.

Parameters:
  • target_func (Callable) –

    The target function that takes the path to a model checkpoint and the test_dataloader, and returns the target values for all test samples in this dataloader. Below is an example of a target function:

    ```python
    def target_func(ckpt_path, dataloader):
        params = torch.load(ckpt_path)
        model.load_state_dict(params)  # assuming model is defined somewhere
        model.eval()
        with torch.no_grad():
            for inputs, labels in dataloader:
                outputs = model(inputs)
                # Do something with the outputs, e.g., calculate the loss.
        return target_values
    ```

    This function should return a tensor of shape (num_test_samples,) where each element is the target value for the corresponding sample.

  • retrain_dir (str) –

    The directory containing the retrained models. It should be the directory saved by retrain_lds. The directory is organized as

    ```
    /$path
        metadata.yml
        /0
            model_weights_0.pt
            model_weights_1.pt
            ...
            model_weights_M.pt
            indices.txt
        ...
        /N
            model_weights_0.pt
            model_weights_1.pt
            ...
            model_weights_M.pt
            indices.txt
    ```

    Additionally, the metadata.yml file includes the following information:

    ```
    {
        'num_subsets': N,
        'num_runs_per_subset': M,
        'subset_dir_map': {
            0: './0',
            ...
        }
    }
    ```

  • test_dataloader (torch.utils.data.DataLoader) – The test dataloader that will be used to calculate the target values.

Returns:

A tuple of two tensors. The first one has the shape (num_subsets, num_test_samples), which contains the values of the target function calculated on all test samples under num_subsets models, each retrained on a subset of the training data. The second tensor has the shape (num_subsets, subset_size), where each row refers to the indices of the training samples used to retrain the model.

Return type:

Tuple[torch.Tensor, torch.Tensor]
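A sketch under the same toy setup as earlier: the checkpoint-loading target function returns one loss per test sample, matching the (num_test_samples,) contract stated above. The nn.Linear(4, 2) architecture is an assumption that must match whatever train_func produced:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

from dattri.metric import calculate_lds_ground_truth

# Hypothetical test data and architecture matching the retraining sketches.
test_loader = DataLoader(
    TensorDataset(torch.randn(5, 4), torch.randint(0, 2, (5,))),
    batch_size=5,
)
model = nn.Linear(4, 2)  # assumed: same architecture as the retrained models

def target_func(ckpt_path, dataloader):
    model.load_state_dict(torch.load(ckpt_path))
    model.eval()
    criterion = nn.CrossEntropyLoss(reduction="none")
    losses = []
    with torch.no_grad():
        for inputs, labels in dataloader:
            losses.append(criterion(model(inputs), labels))
    return torch.cat(losses)  # shape (num_test_samples,)

target_values, sampled_indices = calculate_lds_ground_truth(
    target_func, "./lds_ckpts", test_loader,
)
```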

dattri.benchmark.utils.flip_label(label: np.ndarray | torch.Tensor, label_space: List | np.ndarray | torch.Tensor | None = None, p: float = 0.1) → Tuple[np.ndarray | torch.Tensor, List]

Flip the label of the input label tensor with probability p.

The function will randomly select a new label from the label_space to replace the original label.

Parameters:
  • label (Union[np.ndarray, torch.Tensor]) – The label tensor to be flipped.

  • label_space (Union[list, np.ndarray, torch.Tensor]) – The label space to sample the new label. If None, the label space will be inferred from the unique values in the input label tensor.

  • p (float) – The probability to flip the label.

Returns:

A tuple of two elements. The first element is the flipped label tensor. The second element is the flipped indices.

Return type:

Tuple[Union[np.ndarray, torch.Tensor], list]
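A quick sketch of flip_label; the label counts below are arbitrary:

```python
import torch

from dattri.benchmark.utils import flip_label

labels = torch.randint(0, 10, (1000,))
noisy_labels, flipped_indices = flip_label(labels, p=0.1)

# The label space 0..9 is inferred from the unique values in `labels`;
# roughly 10% of the entries now carry a different label.
assert all(noisy_labels[i] != labels[i] for i in flipped_indices)
```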

dattri.metric.lds(score: torch.Tensor, ground_truth: Tuple[torch.Tensor, torch.Tensor]) → Tuple[torch.Tensor, torch.Tensor]

Calculate the Linear Datamodeling Score (LDS) metric.

The LDS is calculated as the Spearman rank correlation between the predicted scores and the ground truth values for each test sample across all retrained models.

Parameters:
  • score (torch.Tensor) – The data attribution score tensor with the shape (num_train_samples, num_test_samples).

  • ground_truth (Tuple[torch.Tensor, torch.Tensor]) – A tuple of two tensors. The first one has the shape (num_subsets, num_test_samples), which is the ground-truth target values for all test samples under num_subsets models, each retrained on a subset of the training data. The second tensor has the shape (num_subsets, subset_size), where each row refers to the indices of the training samples used to retrain the model.

Returns:

A tuple of two tensors. The first tensor contains the Spearman rank correlation between the predicted scores and the ground truth values for each test sample. The second tensor contains the p-values of the correlation. Both have the shape (num_test_samples,).

Return type:

Tuple[torch.Tensor, torch.Tensor]
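A sketch of wiring the pieces together. Here `score` stands in for the output of whatever attribution method is being evaluated, and ground_truth is assumed to come from calculate_lds_ground_truth over data with matching dimensions:

```python
import torch

from dattri.metric import lds

# Hypothetical attribution scores: one row per training sample,
# one column per test sample. Real scores come from an attribution method.
num_train, num_test = 1000, 100
score = torch.randn(num_train, num_test)

# ground_truth = calculate_lds_ground_truth(target_func, "./lds_ckpts", test_loader)
corr, p_values = lds(score, ground_truth)
print(corr.nanmean())  # average LDS across test samples
```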

dattri.metric.loo_corr(score: torch.Tensor, ground_truth: Tuple[torch.Tensor, torch.Tensor]) → Tuple[torch.Tensor, torch.Tensor]

Calculate the Leave-One-Out (LOO) correlation metric.

The LOO correlation is calculated as the Pearson correlation between the score tensor and the ground truth values.

Parameters:
  • score (torch.Tensor) – The score tensor with the shape (num_train_samples, num_test_samples).

  • ground_truth (Tuple[torch.Tensor, torch.Tensor]) – A tuple of two tensors, as returned by calculate_loo_ground_truth. The first is the LOO ground truth values for each sample in test_dataloader and each model in retrain_dir, with the shape (num_models, num_test_samples). The second is the tensor indicating the removed index, with the shape (num_models,).

Returns:

A tuple containing the LOO correlation metric values and their corresponding p-values. Both tensors have the shape (num_test_samples,).

Return type:

Tuple[torch.Tensor, torch.Tensor]
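A short sketch, assuming target_func, test_loader, and score are defined as in the earlier sketches:

```python
from dattri.metric import calculate_loo_ground_truth, loo_corr

ground_truth = calculate_loo_ground_truth(target_func, "./loo_ckpts", test_loader)
corr, p_values = loo_corr(score, ground_truth)  # both shaped (num_test_samples,)
```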

dattri.metric.mislabel_detection_auc(score: torch.Tensor, ground_truth: torch.Tensor) → Tuple[float, Tuple[torch.Tensor, ...]]

Calculate the AUC using a sorting algorithm.

The function will calculate the false positive rates and true positive rates under different thresholds (number of data inspected), and return them with the calculated AUC (Area Under Curve).

Parameters:
  • score (torch.Tensor) – The self-attribution scores of shape (num_train_samples,).

  • ground_truth (torch.Tensor) – A binary tensor indicating which training samples are noisy (e.g., label-flipped), with the shape (num_train_samples,).

Returns:

A tuple with two items. The first is the AUROC value (float); the second is a tuple of (fpr, tpr, thresholds), analogous to sklearn.metrics.roc_curve (https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html).

Return type:

Tuple[float, Tuple[torch.Tensor, ...]]
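A sketch that builds the binary ground truth from flip_label and evaluates hypothetical self-attribution scores:

```python
import torch

from dattri.benchmark.utils import flip_label
from dattri.metric import mislabel_detection_auc

labels = torch.randint(0, 10, (1000,))
noisy_labels, flipped_indices = flip_label(labels, p=0.1)

# Binary ground truth: 1 marks a flipped (mislabeled) training sample.
is_noisy = torch.zeros(1000)
is_noisy[flipped_indices] = 1.0

# Hypothetical self-attribution scores; higher should mean "more suspicious".
score = torch.randn(1000)

auc, (fpr, tpr, thresholds) = mislabel_detection_auc(score, is_noisy)
print(f"AUROC: {auc:.3f}")  # ~0.5 for random scores, as here
```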

dattri.benchmark.load.load_benchmark(model: str, dataset: str, metric: str, download_path: str = '~/.dattri', redownload: bool = False) → Tuple[Dict[str, Any], Tuple[torch.Tensor, torch.Tensor]]

Load benchmark settings for a given model, dataset, and metric.

Please check https://huggingface.co/datasets/trais-lab/dattri-benchmark to see the supported benchmark settings (model, dataset).

Parameters:
  • model (str) – The model name for the benchmark setting.

  • dataset (str) – The dataset name for the benchmark setting.

  • metric (str) – The metric name for the benchmark setting, which affects the ground truth. Currently only "lds" and "loo" are supported.

  • download_path (str) – The path to download the benchmark files.

  • redownload (bool) – Whether to redownload the benchmark files.

Returns:

A tuple of two elements. The first element is a dictionary containing the attribution inputs; the items are listed as follows:

  • "model": The model instance for the benchmark setting.

  • "models_full": The paths to the pre-trained model checkpoints on the full train dataset, presented as a list of paths (str). The models are trained with the same hyperparameters and dataset; the only difference is the seed for random initialization.

  • "models_half": The paths to the pre-trained model checkpoints on half of the train dataset, presented as a list of paths (str). The models are trained with the same hyperparameters; the difference is the dataset sampling (half sampling) for each model checkpoint.

  • "train_dataset": The path to the training dataset, in the same order as the ground truth's indices.

  • "test_dataset": The path to the testing dataset, in the same order as the ground truth's indices.

  • "loss_func": The loss function for the model training. Normally, this should be the same as the target function.

  • "target_func": The target function for the data attribution. Normally, this should be the same as the loss function.

  • "train_func": The training function for the model. Normally it is not required if the pre-trained model checkpoints are enough for the algorithm you want to benchmark.

The second element is the ground truth for the benchmark; the items are subject to change for each benchmark setting. It can be directly passed to the metric functions defined in dattri.metric. Notably, the ground truth depends on the metric parameter the user specifies.

Return type:

Tuple[Dict[str, Any], Tuple[torch.Tensor, torch.Tensor]]

Raises:

ValueError – If the model or dataset is not supported.
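A sketch of the intended workflow. The "lr" and "mnist" setting names are assumptions for illustration; check the Hugging Face page above for the combinations that actually exist:

```python
from dattri.benchmark.load import load_benchmark
from dattri.metric import lds

# Assumed setting names; see trais-lab/dattri-benchmark for supported ones.
attr_inputs, ground_truth = load_benchmark(
    model="lr", dataset="mnist", metric="lds",
)

model = attr_inputs["model"]
train_dataset = attr_inputs["train_dataset"]
test_dataset = attr_inputs["test_dataset"]

# ... run an attribution method here to produce `score` with shape
# (num_train_samples, num_test_samples), then evaluate it directly:
# corr, p_values = lds(score, ground_truth)
```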