Attributors

This module implements the influence function.

class dattri.algorithm.influence_function.IFAttributor(target_func: Callable, params: dict, ihvp_solver: str = 'explicit', ihvp_kwargs: Dict[str, Any] | None = None, device: str = 'cpu')

Bases: BaseAttributor

Influence function attributor.

__init__(target_func: Callable, params: dict, ihvp_solver: str = 'explicit', ihvp_kwargs: Dict[str, Any] | None = None, device: str = 'cpu') None

Influence function attributor.

Parameters:
  • target_func (Callable) –

    The target function to be attributed. The function can be quite flexible in terms of what is calculated, but it should take the parameters and the dataloader as input. A typical example is as follows:

    ```python
    import torch
    from torch import nn

    # `model` and `flatten_func` are assumed to be in scope already.
    @flatten_func(model)
    def f(params, dataloader):
        loss = nn.CrossEntropyLoss()
        loss_val = 0
        for image, label in dataloader:
            yhat = torch.func.functional_call(model, params, image)
            loss_val += loss(yhat, label)
        return loss_val
    ```

    This example calculates the loss of the model on the dataloader.

  • params (dict) –

    The parameters of the target function. The key is the name of a parameter and the value is the parameter tensor. TODO: This should be changed to support a list of parameters or paths for ensembling and memory efficiency.

  • ihvp_solver (str) – The solver for the inverse-Hessian-vector product (IHVP) calculation. Supported values are “explicit”, “cg”, “arnoldi”, and “lissa”.

  • ihvp_kwargs (Optional[Dict[str, Any]]) – Keyword arguments for the IHVP solver.

  • device (str) – The device to run the attributor. Default is cpu.

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – The dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – The dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor
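To illustrate the returned layout, here is a minimal sketch in plain Python. Lists stand in for tensors, and score_matrix is a hypothetical helper, not part of dattri. It assumes, for illustration only, that each score is an inner product between one training representation and one test representation.

```python
def score_matrix(train_reps, test_reps):
    """Return scores[i][j] = <train_reps[i], test_reps[j]>.

    Row i belongs to the i-th training sample and column j to the j-th
    test sample, which is why both dataloaders need a fixed order.
    """
    return [
        [sum(a * b for a, b in zip(train_row, test_row)) for test_row in test_reps]
        for train_row in train_reps
    ]


# Three training samples and two test samples, two parameters each.
train_reps = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
test_reps = [[1.0, 1.0], [2.0, -1.0]]
scores = score_matrix(train_reps, test_reps)
```

Here scores has three rows and two columns, matching (num_train_samples, num_test_samples). Shuffling either dataloader would silently permute the rows or columns.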

cache(full_train_dataloader: DataLoader) None

Cache the dataset for inverse hessian calculation.

Parameters:

full_train_dataloader (DataLoader) – The dataloader with full training samples for inverse hessian calculation.

class dattri.algorithm.influence_function.IFAttributorArnoldi(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', precompute_data_ratio: float = 1.0, proj_dim: int = 100, max_iter: int = 100, norm_constant: float = 1.0, tol: float = 1e-07, regularization: float = 0.0, seed: int = 0)

Bases: BaseInnerProductAttributor

The inner product attributor with Arnoldi projection transformation.

__init__(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', precompute_data_ratio: float = 1.0, proj_dim: int = 100, max_iter: int = 100, norm_constant: float = 1.0, tol: float = 1e-07, regularization: float = 0.0, seed: int = 0) None

Initialize the Arnoldi projection attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Must be an instance of AttributionTask.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • device (str) – Device to run the attributor on. Default is “cpu”.

  • precompute_data_ratio (float) – Ratio of full training data used to precompute the Arnoldi projector. Default is 1.0.

  • proj_dim (int) – Dimension after projection. Corresponds to the number of top eigenvalues to keep for the Hessian approximation. Default is 100.

  • max_iter (int) – Maximum iterations for Arnoldi Iteration. Default is 100.

  • norm_constant (float) – Constant for the norm of the projected vector. May need to be > 1 for large number of parameters to avoid dividing the projected vector by a very large normalization constant. Default is 1.0.

  • tol (float) – Convergence tolerance. Algorithm stops if the norm of the current basis vector < tol. Default is 1e-7.

  • regularization (float) – Regularization term for Hessian vector product. Adding regularization * I to the Hessian matrix, where I is the identity matrix. Useful for singular or ill-conditioned matrices. Default is 0.0.

  • seed (int) – Random seed for projector. Default is 0.
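The roles of max_iter and tol can be seen in a small Arnoldi iteration. The following is a sketch in plain Python under simplifying assumptions, not the library's implementation: arnoldi and dot are hypothetical helpers, and the matrix is only accessed through a matrix-vector product, as a Hessian would be.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def arnoldi(matvec, start, max_iter=100, tol=1e-7):
    """Build an orthonormal Krylov basis using only matrix-vector products.

    Returns (basis, hess): basis holds the orthonormal vectors and
    hess[j][i] = <basis[i], A basis[j]> are the projected entries.
    """
    norm = math.sqrt(dot(start, start))
    basis = [[x / norm for x in start]]
    hess = []
    for _ in range(max_iter):
        w = matvec(basis[-1])
        column = []
        for q in basis:  # modified Gram-Schmidt against earlier vectors
            h = dot(q, w)
            w = [wi - h * qi for wi, qi in zip(w, q)]
            column.append(h)
        hess.append(column)
        norm = math.sqrt(dot(w, w))
        if norm < tol:  # Krylov subspace is exhausted, stop early
            break
        basis.append([x / norm for x in w])
    return basis, hess


# diag(1, 2, 3) applied to a start vector with no third component:
# the Krylov subspace is two-dimensional, so the loop stops early.
def matvec(v):
    return [1.0 * v[0], 2.0 * v[1], 3.0 * v[2]]


basis, hess = arnoldi(matvec, [1.0, 1.0, 0.0])
```

In a full projector, the eigenpairs of the small projected matrix would then be computed and only the top proj_dim of them kept. That step is omitted here.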

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – Dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – Dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

cache(full_train_dataloader: DataLoader) None

Cache the dataset and pre-calculate the Arnoldi projector.

Parameters:

full_train_dataloader (DataLoader) – Dataloader with full training data.

generate_test_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of test data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial test representations.

The default implementation calculates the gradient of the test loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The test data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the test data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor
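The default representation described above is a per-sample gradient, one row per sample. As a minimal sketch in plain Python, assume a linear model with squared loss; per_sample_grads is a hypothetical helper and lists stand in for tensors.

```python
def per_sample_grads(weights, inputs, targets):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w, one row per sample."""
    reps = []
    for x, y in zip(inputs, targets):
        residual = sum(w * xi for w, xi in zip(weights, x)) - y
        reps.append([residual * xi for xi in x])
    return reps


weights = [1.0, -1.0]
inputs = [[1.0, 2.0], [3.0, 1.0], [0.0, 1.0]]
targets = [0.0, 1.0, -1.0]
reps = per_sample_grads(weights, inputs, targets)
```

The result has shape (batch_size, num_parameters), here (3, 2). The third sample is fit exactly, so its gradient row is zero.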

generate_train_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of train data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial train representations.

The default implementation calculates the gradient of the train loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The train data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the train data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

transform_test_rep(ckpt_idx: int, test_rep: Tensor) Tensor

Transform the test representations via Arnoldi projection.

Parameters:
  • ckpt_idx (int) – Index of the model checkpoints. Used for ensembling different trained model checkpoints.

  • test_rep (torch.Tensor) – Test representations to be transformed. A 2-d tensor with shape (batch_size, num_params).

Returns:

Transformed test representations. A 2-d tensor with shape (batch_size, proj_dim).

Return type:

torch.Tensor

Raises:

ValueError – If the Arnoldi projector has not been cached.

transform_train_rep(ckpt_idx: int, train_rep: Tensor) Tensor

Transform the train representations via Arnoldi projection.

Parameters:
  • ckpt_idx (int) – Index of the model checkpoints. Used for ensembling different trained model checkpoints.

  • train_rep (torch.Tensor) – Train representations to be transformed. A 2-d tensor with shape (batch_size, num_params).

Returns:

Transformed train representations. A 2-d tensor with shape (batch_size, proj_dim).

Return type:

torch.Tensor

Raises:

ValueError – If the Arnoldi projector has not been cached.

class dattri.algorithm.influence_function.IFAttributorCG(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', max_iter: int = 10, tol: float = 1e-07, mode: str = 'rev-rev', regularization: float = 0.0)

Bases: BaseInnerProductAttributor

The inner product attributor with CG inverse hessian transformation.

__init__(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', max_iter: int = 10, tol: float = 1e-07, mode: str = 'rev-rev', regularization: float = 0.0) None

Initialize the CG inverse Hessian attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Must be an instance of AttributionTask.

  • device (str) – Device to run the attributor on. Default is “cpu”.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • max_iter (int) – Maximum number of conjugate gradient (CG) iterations. Default is 10.

  • tol (float) – Convergence tolerance. Algorithm stops if residual norm < tol. Default is 1e-7.

  • mode (str) – Auto-diff mode. Default is “rev-rev”. Options:

    • “rev-rev”: Two reverse-mode auto-diffs. Better compatibility, more memory cost.

    • “rev-fwd”: Reverse-mode + forward-mode. Memory-efficient, less compatible.

  • regularization (float) – Regularization term for Hessian vector product. Adding regularization * I to the Hessian matrix, where I is the identity matrix. Useful for singular or ill-conditioned matrices. Default is 0.0.
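How max_iter, tol and regularization interact can be sketched with a textbook conjugate gradient solver. This is an illustrative plain-Python version, not the library's ihvp_cg: conjugate_gradient and hvp are hypothetical names, and the Hessian is only accessed through Hessian-vector products.

```python
def conjugate_gradient(hvp, b, max_iter=10, tol=1e-7, regularization=0.0):
    """Approximately solve (H + regularization * I) x = b.

    hvp(v) must return the Hessian-vector product H v as a list.
    """
    def matvec(v):
        return [hv + regularization * vi for hv, vi in zip(hvp(v), v)]

    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    p = list(r)
    rs_old = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:  # residual norm below tolerance
            break
        ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# A small symmetric positive definite stand-in for the Hessian.
H = [[4.0, 1.0], [1.0, 3.0]]


def hvp(v):
    return [sum(h * vi for h, vi in zip(row, v)) for row in H]


x = conjugate_gradient(hvp, [1.0, 2.0])
```

For this 2x2 system the solver reaches the solution (1/11, 7/11) within two iterations. A positive regularization shifts every eigenvalue of H upward, which helps when H is singular or ill-conditioned.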

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – Dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – Dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

cache(full_train_dataloader: DataLoader) None

Cache the full training dataloader or precompute and cache more information.

By default, the cache function only caches the full training dataloader. Subclasses may override this function to precompute and cache more information.

Parameters:

full_train_dataloader (torch.utils.data.DataLoader) – Dataloader for the full training data. Ideally, the batch size of the dataloader should be the same as the number of training samples to get the best accuracy for some attributors. Smaller batch size may lead to a less accurate result but lower memory consumption.

generate_test_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of test data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial test representations.

The default implementation calculates the gradient of the test loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The test data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the test data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

generate_train_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of train data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial train representations.

The default implementation calculates the gradient of the train loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The train data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the train data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

transform_test_rep(ckpt_idx: int, test_rep: Tensor) Tensor

Calculate the transformation on the test rep through ihvp_cg.

Parameters:
  • ckpt_idx (int) – Index of the model checkpoints. Used for ensembling different trained model checkpoints.

  • test_rep (torch.Tensor) – Test representations to be transformed. Typically a 2-d tensor with shape (batch_size, num_parameters).

Returns:

Transformed test representations. Typically a 2-d tensor with shape (batch_size, transformed_dimension).

Return type:

torch.Tensor

transform_train_rep(ckpt_idx: int, train_rep: Tensor) Tensor

Transform the train representations.

Inner product attributor calculates the inner product between the (transformed) train representations and test representations. This function calculates the transformation of the train representations. For example, the transformation could be a dimension reduction of the train representations.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • train_rep (torch.Tensor) – The train representations to be transformed. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Returns:

The transformed train representations. Typically, it is a 2-d tensor with the shape of (batch_size, transformed_dimension).

Return type:

torch.Tensor

class dattri.algorithm.influence_function.IFAttributorDataInf(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', regularization: float = 0.0, fim_estimate_data_ratio: float = 1.0)

Bases: BaseInnerProductAttributor

The inner product attributor with DataInf inverse hessian transformation.

__init__(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', regularization: float = 0.0, fim_estimate_data_ratio: float = 1.0) None

Initialize the DataInf inverse Hessian attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Must be an instance of AttributionTask.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • device (str) – Device to run the attributor on. Default is “cpu”.

  • regularization (float) – Regularization term for Hessian vector product. Adding regularization * I to the Hessian matrix, where I is the identity matrix. Useful for singular or ill-conditioned matrices. Default is 0.0.

  • fim_estimate_data_ratio (float) – Ratio of full training data used to approximate the empirical Fisher information matrix. Default is 1.0.
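The closed-form idea usually associated with DataInf can be sketched as follows. This is an assumption about the method, not a description of the library's code. The inverse of the averaged matrix is replaced by the average of per-sample rank-one inverses, and each of those follows from the Sherman-Morrison formula: (g gᵀ + λI)⁻¹ v = (v − g (g·v) / (λ + g·g)) / λ. The helper datainf_ihvp is hypothetical, lists stand in for tensors, and the sketch needs a positive regularization λ.

```python
def datainf_ihvp(grads, v, regularization):
    """Average the closed-form inverses of (g g^T + regularization * I) applied to v."""
    n = len(grads)
    out = [0.0] * len(v)
    for g in grads:
        gv = sum(gi * vi for gi, vi in zip(g, v))
        gg = sum(gi * gi for gi in g)
        coef = gv / (regularization + gg)
        for k in range(len(v)):
            out[k] += (v[k] - coef * g[k]) / (regularization * n)
    return out


# Two per-sample gradients over two parameters.
grads = [[1.0, 0.0], [0.0, 1.0]]
result = datainf_ihvp(grads, [1.0, 1.0], regularization=1.0)
```

No matrix is ever formed or inverted: each sample costs two dot products. In this sketch, fim_estimate_data_ratio would correspond to how many gradients are passed in as grads.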

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – Dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – Dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

cache(full_train_dataloader: DataLoader) None

Cache the dataset and precompute the empirical Fisher information estimate used by the DataInf transformation.

Parameters:

full_train_dataloader (DataLoader) – Dataloader with full training data.

generate_test_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of test data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial test representations.

The default implementation calculates the gradient of the test loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The test data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the test data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

generate_train_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of train data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial train representations.

The default implementation calculates the gradient of the train loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The train data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the train data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

transform_test_rep(ckpt_idx: int, test_rep: Tensor) Tensor

Calculate the transformation on the test representations.

Parameters:
  • ckpt_idx (int) – Index of the model checkpoints. Used for ensembling different trained model checkpoints.

  • test_rep (torch.Tensor) – Test representations to be transformed. Typically a 2-d tensor with shape (batch_size, num_parameters).

Returns:

Transformed test representations. Typically a 2-d tensor with shape (batch_size, transformed_dimension).

Return type:

torch.Tensor

transform_train_rep(ckpt_idx: int, train_rep: Tensor) Tensor

Transform the train representations.

Inner product attributor calculates the inner product between the (transformed) train representations and test representations. This function calculates the transformation of the train representations. For example, the transformation could be a dimension reduction of the train representations.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • train_rep (torch.Tensor) – The train representations to be transformed. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Returns:

The transformed train representations. Typically, it is a 2-d tensor with the shape of (batch_size, transformed_dimension).

Return type:

torch.Tensor

class dattri.algorithm.influence_function.IFAttributorEKFAC(task: AttributionTask, module_name: str | List[str] | None = None, device: str | None = 'cpu', damping: float = 0.0)

Bases: BaseInnerProductAttributor

The inner product attributor with EK-FAC inverse FIM transformation.

__init__(task: AttributionTask, module_name: str | List[str] | None = None, device: str | None = 'cpu', damping: float = 0.0) None

Initialize the EK-FAC inverse FIM attributor.

Parameters:
  • task (AttributionTask) –

    The task to be attributed. Must be an instance of AttributionTask. The loss function for the EK-FAC attributor should return the following:

    • loss: a single tensor of loss. Should be the mean loss over the batch.

    • mask (optional): a tensor of shape (batch_size, t), where 1’s indicate that the IFVP will be estimated on these input positions and 0’s indicate that these positions are irrelevant (e.g. padding tokens).

    t is the number of steps, or sequence length of the input data. If the input data are non-sequential, t should be set to 1. The FIM will be estimated on this function.

  • module_name (Optional[Union[str, List[str]]]) – The name of the module to be used to calculate the train/test representations. If None, all linear modules are used. This should be a string or a list of strings if multiple modules are needed. The name of module should follow the key of model.named_modules(). Default: None.

  • device (str) – Device to run the attributor on. Default is “cpu”.

  • damping (float) – Damping factor used for non-convexity in EK-FAC IFVP calculation. Default is 0.0.

Raises:

ValueError – If there are multiple checkpoints in task.
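The optional mask described for the task's loss function can be built from padded inputs. The following is a minimal sketch in plain Python. The helper build_mask and the pad id 0 are hypothetical, and nested lists stand in for a (batch_size, t) tensor.

```python
def build_mask(token_ids, pad_id):
    """Return a (batch_size, t) mask with 1 for real tokens and 0 for padding."""
    return [[0 if tok == pad_id else 1 for tok in row] for row in token_ids]


# Two sequences padded to t = 4 with a hypothetical pad id of 0.
token_ids = [[5, 7, 9, 0], [3, 0, 0, 0]]
mask = build_mask(token_ids, pad_id=0)

# Non-sequential inputs use t = 1, so every sample gets a single 1.
non_sequential_mask = [[1] for _ in range(3)]
```

Positions marked 0 are the ones the text calls irrelevant, such as padding tokens.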

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – Dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – Dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

cache(full_train_dataloader: DataLoader, max_iter: int | None = None) None

Cache the dataset and statistics for inverse FIM calculation.

Cache the full training dataset as other attributors. Estimate and cache the covariance matrices, eigenvector matrices and corrected eigenvalues based on the samples of training data.

Parameters:
  • full_train_dataloader (DataLoader) – The dataloader with full training samples for inverse FIM calculation.

  • max_iter (int, optional) – The maximum number of batches used for estimating the covariance matrices and lambdas. Defaults to the length of full_train_dataloader.

generate_test_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of test data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial test representations.

The default implementation calculates the gradient of the test loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The test data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the test data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

generate_train_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of train data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial train representations.

The default implementation calculates the gradient of the train loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The train data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the train data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

transform_test_rep(ckpt_idx: int, test_rep: Tensor) Tensor

Calculate the transformation on the test representations.

Parameters:
  • ckpt_idx (int) – Index of the model checkpoints. Used for ensembling different trained model checkpoints.

  • test_rep (torch.Tensor) – Test representations to be transformed. Typically a 2-d tensor with shape (batch_size, num_parameters).

Returns:

Transformed test representations. Typically a 2-d tensor with shape (batch_size, transformed_dimension).

Return type:

torch.Tensor

Raises:

ValueError – If a non-zero ckpt_idx is specified.

transform_train_rep(ckpt_idx: int, train_rep: Tensor) Tensor

Transform the train representations.

Inner product attributor calculates the inner product between the (transformed) train representations and test representations. This function calculates the transformation of the train representations. For example, the transformation could be a dimension reduction of the train representations.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • train_rep (torch.Tensor) – The train representations to be transformed. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Returns:

The transformed train representations. Typically, it is a 2-d tensor with the shape of (batch_size, transformed_dimension).

Return type:

torch.Tensor

class dattri.algorithm.influence_function.IFAttributorExplicit(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', regularization: float = 0.0)

Bases: BaseInnerProductAttributor

The inner product attributor with explicit inverse hessian transformation.

__init__(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', regularization: float = 0.0) None

Initialize the explicit inverse Hessian attributor.

Parameters:
  • task (AttributionTask) – Task to attribute. Must be an instance of AttributionTask.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • device (str) – Device to run the attributor on. Default is “cpu”.

  • regularization (float) – Regularization term added to Hessian matrix. Useful for singular or ill-conditioned Hessian matrices. Added as regularization * I, where I is the identity matrix. Default is 0.0.
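The effect of adding regularization * I before inversion can be shown on a tiny example. This is a sketch in plain Python restricted to a 2x2 Hessian, not the library's ihvp_explicit. The helpers solve_2x2 and explicit_ihvp are hypothetical.

```python
def solve_2x2(a, b):
    """Solve a x = b for a 2x2 matrix via the closed-form inverse."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [
        (a[1][1] * b[0] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def explicit_ihvp(hessian, v, regularization=0.0):
    """Return (H + regularization * I)^{-1} v for a 2x2 Hessian."""
    damped = [
        [hessian[0][0] + regularization, hessian[0][1]],
        [hessian[1][0], hessian[1][1] + regularization],
    ]
    return solve_2x2(damped, v)


# A rank-deficient Hessian: no inverse exists without regularization.
hessian = [[1.0, 1.0], [1.0, 1.0]]
ihvp = explicit_ihvp(hessian, [1.0, 0.0], regularization=0.5)
```

With regularization=0.0 the same call raises ValueError in this sketch, because the Hessian has determinant zero. With 0.5 added to the diagonal the system is well defined.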

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – Dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – Dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

cache(full_train_dataloader: DataLoader) None

Cache the full training dataloader or precompute and cache more information.

By default, the cache function only caches the full training dataloader. Subclasses may override this function to precompute and cache more information.

Parameters:

full_train_dataloader (torch.utils.data.DataLoader) – Dataloader for the full training data. Ideally, the batch size of the dataloader should be the same as the number of training samples to get the best accuracy for some attributors. Smaller batch size may lead to a less accurate result but lower memory consumption.

generate_test_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of test data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial test representations.

The default implementation calculates the gradient of the test loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The test data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the test data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

generate_train_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of train data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial train representations.

The default implementation calculates the gradient of the train loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The train data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the train data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

transform_test_rep(ckpt_idx: int, test_rep: Tensor) Tensor

Calculate the transformation on the test rep through ihvp_explicit.

Parameters:
  • ckpt_idx (int) – Index of model parameters. Used for ensembling.

  • test_rep (torch.Tensor) – Test representations to be transformed. Typically a 2-d tensor with shape (batch_size, num_parameters).

Returns:

Transformed test representations. Typically a 2-d tensor with shape (batch_size, transformed_dimension).

Return type:

torch.Tensor

transform_train_rep(ckpt_idx: int, train_rep: Tensor) Tensor

Transform the train representations.

Inner product attributor calculates the inner product between the (transformed) train representations and test representations. This function calculates the transformation of the train representations. For example, the transformation could be a dimension reduction of the train representations.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • train_rep (torch.Tensor) – The train representations to be transformed. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Returns:

The transformed train representations. Typically, it is a 2-d tensor with the shape of (batch_size, transformed_dimension).

Return type:

torch.Tensor
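
The following sketch shows how the pieces of an inner product attributor fit together: test representations are transformed by an explicit inverse Hessian, then the scores are their inner products with the train representations. It assumes a 2x2 Hessian and untransformed train representations, and it ignores any sign or scaling convention the library may apply. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def invert_2x2(h: Matrix) -> Matrix:
    """Explicit inverse of a 2x2 matrix (stand-in for an explicit IHVP solver)."""
    (a, b), (c, d) = h
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def transform_test(test_rep: Matrix, hessian: Matrix) -> Matrix:
    """Apply H^{-1} to every test row; the shape stays (batch_size, num_parameters)."""
    h_inv = invert_2x2(hessian)
    return [[sum(h_inv[r][k] * row[k] for k in range(len(row)))
             for r in range(len(h_inv))]
            for row in test_rep]


def inner_product_scores(train_rep: Matrix, test_rep: Matrix) -> Matrix:
    """Score matrix with shape (num_train_samples, num_test_samples)."""
    return [[sum(a * b for a, b in zip(tr, te)) for te in test_rep]
            for tr in train_rep]


hessian = [[2.0, 0.0], [0.0, 4.0]]
train_rep = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 training samples
test_rep = [[2.0, 4.0]]                            # 1 test sample
scores = inner_product_scores(train_rep, transform_test(test_rep, hessian))
```

Here `scores` has one row per training sample and one column per test sample, matching the shape documented for attribute.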

class dattri.algorithm.influence_function.IFAttributorLiSSA(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', batch_size: int = 1, num_repeat: int = 1, recursion_depth: int = 5000, damping: float = 0.0, scaling: float = 50.0, mode: str = 'rev-rev')

Bases: BaseInnerProductAttributor

The inner product attributor with LiSSA inverse Hessian transformation.

__init__(task: AttributionTask, layer_name: str | List[str] | None = None, device: str | None = 'cpu', batch_size: int = 1, num_repeat: int = 1, recursion_depth: int = 5000, damping: float = 0.0, scaling: float = 50.0, mode: str = 'rev-rev') None

Initialize the LiSSA inverse Hessian attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Must be an instance of AttributionTask.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • device (str) – Device to run the attributor on. Default is “cpu”.

  • batch_size (int) – Batch size for LiSSA inner loop update. Default is 1.

  • num_repeat (int) – Number of samples of the HVP approximation to average. Default is 1.

  • recursion_depth (int) – Number of recursions to estimate each IHVP sample. Default is 5000.

  • damping (float) – Damping factor for non-convexity in LiSSA IHVP calculation.

  • scaling (float) – Scaling factor for convergence in LiSSA IHVP calculation.

  • mode (str) – Auto-diff mode. Options:
      - “rev-rev”: Two reverse-mode auto-diffs. Better compatibility, more memory cost.
      - “rev-fwd”: Reverse-mode + forward-mode. Memory-efficient, less compatible.
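
To illustrate how recursion_depth, damping, and scaling interact, here is a minimal sketch of one common form of the LiSSA recursion. It is not taken from the dattri source. For determinism it uses the exact Hessian of a toy problem in place of mini-batch Hessian-vector products, which is an assumption of this sketch, and it performs a single repeat. The names `matvec` and `lissa_ihvp` are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def lissa_ihvp(hessian: Matrix, v: Vector, recursion_depth: int = 5000,
               damping: float = 0.0, scaling: float = 50.0) -> Vector:
    """Approximate H^{-1} v with the recursion
        h <- v + (1 - damping) * h - (H h) / scaling
    and return h / scaling.

    The exact Hessian replaces the sampled HVP (assumption of this sketch).
    """
    h = list(v)
    for _ in range(recursion_depth):
        hv = matvec(hessian, h)
        h = [vi + (1.0 - damping) * hi - hvi / scaling
             for vi, hi, hvi in zip(v, h, hv)]
    return [hi / scaling for hi in h]


hessian = [[2.0, 0.0], [0.0, 4.0]]
v = [2.0, 4.0]
approx = lissa_ihvp(hessian, v)   # exact answer is H^{-1} v = [1.0, 1.0]
```

In this form, the fixed point of the recursion is (H + scaling * damping * I)^{-1} v, so a nonzero damping acts like a ridge term. The recursion only converges when scaling is large enough relative to the largest Hessian eigenvalue.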

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – Dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – Dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

cache(full_train_dataloader: DataLoader) None

Cache the full training dataloader or precompute and cache more information.

By default, the cache function only caches the full training dataloader. Subclasses may override this function to precompute and cache more information.

Parameters:

full_train_dataloader (torch.utils.data.DataLoader) – Dataloader for the full training data. Ideally, the batch size of the dataloader should be the same as the number of training samples to get the best accuracy for some attributors. Smaller batch size may lead to a less accurate result but lower memory consumption.

generate_test_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of test data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial test representations.

The default implementation calculates the gradient of the test loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The test data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the test data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

generate_train_rep(ckpt_idx: int, data: Tuple[torch.Tensor, ...]) torch.Tensor

Generate initial representations of train data.

Inner product attributors calculate the inner product between the (transformed) train representations and test representations. This function generates the initial train representations.

The default implementation calculates the gradient of the train loss with respect to the parameter. Subclasses may override this function to calculate something else.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • data (Tuple[Tensor]) – The train data. Typically, this is a tuple of input data and target data but the number of items in this tuple should align with the corresponding argument in the target function. The tensors’ shape follows (1, batch_size, …).

Returns:

The initial representations of the train data. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Return type:

torch.Tensor

static lissa_collate_fn(sampled_input: List[Tensor]) Tuple[Tensor, List[Tuple[Tensor, ...]]]

Collate function for LiSSA.

Parameters:

sampled_input (List[Tensor]) – The sampled input from the dataloader.

Returns:

The collated input for LiSSA.

Return type:

Tuple[Tensor, List[Tuple[Tensor, …]]]

transform_test_rep(ckpt_idx: int, test_rep: Tensor) Tensor

Calculate the transformation on the test rep through ihvp_lissa.

Parameters:
  • ckpt_idx (int) – Index of the model checkpoints. Used for ensembling different trained model checkpoints.

  • test_rep (torch.Tensor) – Test representations to be transformed. Typically a 2-d tensor with shape (batch_size, num_parameters).

Returns:

Transformed test representations. Typically a 2-d tensor with shape (batch_size, transformed_dimension).

Return type:

torch.Tensor

transform_train_rep(ckpt_idx: int, train_rep: Tensor) Tensor

Transform the train representations.

Inner product attributor calculates the inner product between the (transformed) train representations and test representations. This function calculates the transformation of the train representations. For example, the transformation could be a dimension reduction of the train representations.

Parameters:
  • ckpt_idx (int) – The index of the model checkpoints. This index is used for ensembling different trained model checkpoints.

  • train_rep (torch.Tensor) – The train representations to be transformed. Typically, it is a 2-d tensor with the shape of (batch_size, num_parameters).

Returns:

The transformed train representations. Typically, it is a 2-d tensor with the shape of (batch_size, transformed_dimension).

Return type:

torch.Tensor

class dattri.algorithm.tracin.TracInAttributor(task: AttributionTask, weight_list: Tensor, normalized_grad: bool, projector_kwargs: Dict[str, Any] | None = None, layer_name: str | List[str] | None = None, device: str = 'cpu')

Bases: BaseAttributor

TracIn attributor.

__init__(task: AttributionTask, weight_list: Tensor, normalized_grad: bool, projector_kwargs: Dict[str, Any] | None = None, layer_name: str | List[str] | None = None, device: str = 'cpu') None

Initialize the TracIn attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Please refer to the AttributionTask for more details.

  • weight_list (Tensor) – The weight used for the “weighted sum”. For TracIn/CosIn, this will contain a list of learning rates at each ckpt; for Grad-Dot/Grad-Cos, this will be a list of ones.

  • normalized_grad (bool) – Whether to apply normalization to gradients.

  • projector_kwargs (Optional[Dict[str, Any]]) – The keyword arguments for the projector.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • device (str) – The device to run the attributor. Default is cpu.
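
The role of weight_list and normalized_grad can be sketched as a weighted sum of per-checkpoint gradient inner products. The example below is a simplified illustration for one train/test pair with made-up gradients. The mapping of normalized_grad=True to the cosine variants (CosIn, Grad-Cos) is an assumption, and `checkpoint_weighted_score` is a hypothetical helper, not a dattri function.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Vector) -> Vector:
    """Scale to unit L2 norm; zero vectors are returned unchanged."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def checkpoint_weighted_score(train_grads: List[Vector],
                              test_grads: List[Vector],
                              weight_list: List[float],
                              normalized_grad: bool) -> float:
    """Sum over checkpoints c of weight_list[c] * <g_train[c], g_test[c]>."""
    if len(train_grads) != len(weight_list):
        raise ValueError("number of checkpoints and weights do not match")
    total = 0.0
    for g_train, g_test, w in zip(train_grads, test_grads, weight_list):
        if normalized_grad:
            g_train, g_test = normalize(g_train), normalize(g_test)
        total += w * dot(g_train, g_test)
    return total


# Gradients of one train sample and one test sample at two checkpoints.
train_grads = [[3.0, 4.0], [1.0, 0.0]]
test_grads = [[3.0, 4.0], [0.0, 2.0]]
learning_rates = [0.1, 0.01]

tracin = checkpoint_weighted_score(train_grads, test_grads, learning_rates, False)
grad_dot = checkpoint_weighted_score(train_grads, test_grads, [1.0, 1.0], False)
grad_cos = checkpoint_weighted_score(train_grads, test_grads, [1.0, 1.0], True)
```

Passing learning rates as weights gives a TracIn-style score, while passing ones reduces it to a plain sum of gradient inner products.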

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (torch.utils.data.DataLoader) – The dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (torch.utils.data.DataLoader) – The dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Raises:

ValueError – If the lengths of params_list and weight_list do not match.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

Tensor

cache() None

Precompute and cache some values for efficiency.

class dattri.algorithm.trak.TRAKAttributor(task: AttributionTask, correct_probability_func: Callable, projector_kwargs: Dict[str, Any] | None = None, layer_name: str | List[str] | None = None, device: str = 'cpu')

Bases: BaseAttributor

TRAK attributor.

__init__(task: AttributionTask, correct_probability_func: Callable, projector_kwargs: Dict[str, Any] | None = None, layer_name: str | List[str] | None = None, device: str = 'cpu') None

Initialize the TRAK attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Please refer to the AttributionTask for more details.

  • correct_probability_func (Callable) –

    The function to calculate the probability to correctly predict the label of the input data. A typical example is as follows:

    ```python
    # Assumes `torch`, `nn`, `flatten_func`, and `model` are already in scope.
    @flatten_func(model)
    def m(params, image_label_pair):
        image, label = image_label_pair
        image_t = image.unsqueeze(0)
        label_t = label.unsqueeze(0)
        loss = nn.CrossEntropyLoss()
        yhat = torch.func.functional_call(model, params, image_t)
        p = torch.exp(-loss(yhat, label_t))
        return p
    ```

  • projector_kwargs (Optional[Dict[str, Any]], optional) – The kwargs for the random projection. Defaults to None.

  • layer_name (Optional[Union[str, List[str]]]) – The name of the layer to be used to calculate the train/test representations. If None, full parameters are used. This should be a string or a list of strings if multiple layers are needed. The name of layer should follow the key of model.named_parameters(). Default: None.

  • device (str) – The device to run the attributor. Default is “cpu”.
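
The expression exp(-loss) used by correct_probability_func equals the softmax probability of the true class when the loss is the cross-entropy of a single sample. The following sketch checks this identity in plain Python with made-up logits. The helper names are hypothetical and no dattri or torch call is involved.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: List[float], label: int) -> float:
    """Cross-entropy of one sample: log-sum-exp(logits) - logits[label]."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def correct_probability(logits: List[float], label: int) -> float:
    """p = exp(-loss), mirroring the last lines of the docstring example."""
    return math.exp(-cross_entropy(logits, label))


logits = [2.0, 0.5, -1.0]
label = 0
p = correct_probability(logits, label)
```

Because `p` matches `softmax(logits)[label]`, the function can be written in terms of the loss already used for training.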

attribute(test_dataloader: torch.utils.data.DataLoader, train_dataloader: torch.utils.data.DataLoader | None = None) torch.Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (torch.utils.data.DataLoader) – The dataloader for training samples to calculate the influence. If cache is called before attribute, this dataloader can consist of a subset of the full training dataset cached in cache. In this case, only a part of the training set’s influence will be calculated. The dataloader should not be shuffled.

  • test_dataloader (torch.utils.data.DataLoader) – The dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

torch.Tensor

Raises:

ValueError – If train_dataloader is not None while the full training dataloader has already been cached, or if no training dataloader is provided through either cache or attribute.

cache(full_train_dataloader: DataLoader) None

Cache the dataset for gradient calculation.

Parameters:

full_train_dataloader (torch.utils.data.DataLoader) – The dataloader with full training samples for gradient calculation.

class dattri.algorithm.rps.RPSAttributor(task: AttributionTask, final_linear_layer_name: str, normalize_preactivate: bool = False, l2_strength: float = 0.003, epoch: int = 3000, device: str = 'cpu')

Bases: BaseAttributor

Representer point selection attributor.

__init__(task: AttributionTask, final_linear_layer_name: str, normalize_preactivate: bool = False, l2_strength: float = 0.003, epoch: int = 3000, device: str = 'cpu') None

Representer point selection attributor.

Parameters:
  • task (AttributionTask) – The task to be attributed. Please refer to the AttributionTask for more details. Notably, the target_func is required to take a list of pre-activation values (f_i in the paper) and a list of labels as inputs. Typical examples are loss functions such as BCELoss and CELoss. We also assume the model has a final linear layer. RPS will extract the final linear layer’s input and its parameters. The parameters will be used to initialize the l2-finetuning. That is, model_output = linear(second-to-last feature).

  • final_linear_layer_name (str) – The name of the final linear layer in the model.

  • normalize_preactivate (bool) – If set to True, the intermediate layer output will be normalized, so that the value of the output inner product is not affected by the magnitude of individual outputs.

  • l2_strength (float) – The l2 regularization strength used to fine-tune the last layer.

  • epoch (int) – The number of epochs used to fine-tune the last layer.

  • device (str) – The device to run the attributor. Default is cpu.
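
The idea behind representer point selection can be sketched on a toy problem. After l2-regularized fitting of the last layer, the prediction on a test point decomposes into a sum of per-training-sample terms alpha_i * <f_i, f_test>, with alpha_i = -(1 / (2 * l2_strength * n)) * dL/d(pre-activation_i). The example below assumes scalar features and a squared loss so that the fit has a closed form. It is an illustration of the decomposition, not dattri's implementation, and every name in it is hypothetical.

```python
from typing import List


def fit_ridge_1d(feats: List[float], labels: List[float],
                 l2_strength: float) -> float:
    """Closed-form minimizer of (1/n) * sum_i (w*f_i - y_i)^2 + l2_strength * w^2."""
    n = len(feats)
    return sum(f * y for f, y in zip(feats, labels)) / (
        sum(f * f for f in feats) + l2_strength * n)


def representer_values(feats: List[float], labels: List[float],
                       w: float, l2_strength: float) -> List[float]:
    """alpha_i = -(1 / (2 * l2_strength * n)) * dL/d(pre-activation_i).

    For the squared loss, dL/d(pre-activation_i) = 2 * (w * f_i - y_i).
    """
    n = len(feats)
    return [-(2.0 * (w * f - y)) / (2.0 * l2_strength * n)
            for f, y in zip(feats, labels)]


feats = [1.0, 2.0, -1.0]      # second-to-last features (scalars here)
labels = [1.0, 3.0, -2.0]
l2_strength = 0.003
w = fit_ridge_1d(feats, labels, l2_strength)
alphas = representer_values(feats, labels, w, l2_strength)

test_feat = 1.5
scores = [a * f * test_feat for a, f in zip(alphas, feats)]  # one per train sample
prediction = w * test_feat
```

The entries of `scores` sum to `prediction`, which is why each entry can be read as the contribution of one training sample to the test output.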

attribute(train_dataloader: DataLoader, test_dataloader: DataLoader) Tensor

Calculate the influence of the training set on the test set.

Parameters:
  • train_dataloader (DataLoader) – The dataloader for training samples to calculate the influence. It can be a subset of the full training set if cache is called before. A subset means that only a part of the training set’s influence is calculated. The dataloader should not be shuffled.

  • test_dataloader (DataLoader) – The dataloader for test samples to calculate the influence. The dataloader should not be shuffled.

Returns:

The influence of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

Tensor

cache(full_train_dataloader: DataLoader) None

Cache the full dataset for fine-tuning.

Parameters:

full_train_dataloader (DataLoader) – The dataloader with full training samples for the last linear layer fine-tuning.

class dattri.algorithm.data_shapley.KNNShapleyAttributor(k_neighbors: int, task: AttributionTask = None, distance_func: Callable | None = None)

Bases: BaseAttributor

KNN Data Shapley Attributor.

__init__(k_neighbors: int, task: AttributionTask = None, distance_func: Callable | None = None) None

Initialize the KNN Data Shapley attributor.

KNN Data Shapley Valuation is generally dataset-specific. Passing a model is optional and currently can be done in the customizable distance function.

Parameters:
  • k_neighbors (int) – The number of neighbors in KNN model.

  • task (AttributionTask) – The task to be attributed. Used to pass the model and hook information in this attributor. Please refer to the AttributionTask for more details.

  • distance_func (Callable, optional) –

    Customizable function used for distance calculation in KNN. The function can be quite flexible in terms of what is calculated, but it should take two batches of data as input. A typical example is as follows:

    ```python
    def f(batch_x, batch_y):
        coord1 = batch_x[0]
        coord2 = batch_y[0]
        return torch.cdist(coord1, coord2)
    ```

    If not provided, a default Euclidean distance function will be used.

Raises:

NotImplementedError – If task is not None.

attribute(train_dataloader: torch.utils.data.DataLoader, test_dataloader: torch.utils.data.DataLoader, train_labels: List[int] | None = None, test_labels: List[int] | None = None) None

Calculate the KNN shapley values of the training set on each test sample.

Parameters:
  • train_dataloader (torch.utils.data.DataLoader) – The dataloader for training samples to calculate the shapley values. The dataloader should not be shuffled.

  • test_dataloader (torch.utils.data.DataLoader) – The dataloader for test samples to calculate the shapley values. The dataloader should not be shuffled.

  • train_labels (List[int], optional) – The list of training labels, with the same size and order as the training dataloader. If not provided, the last element in each batch from the loader will be used as label.

  • test_labels (List[int], optional) – The list of test labels, with the same size and order as the test dataloader. If not provided, the last element in each batch from the loader will be used as label.

Returns:

The KNN shapley values of the training set on the test set, with the shape of (num_train_samples, num_test_samples).

Return type:

Tensor

Raises:

ValueError – If the lengths of the provided labels and the dataset do not match.
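
For a single test sample, KNN Shapley values can be computed exactly with a recursion over the training samples sorted by distance (the unweighted KNN classifier recursion of Jia et al., 2019). The sketch below implements that recursion in plain Python for precomputed distances. It is a simplified illustration and may differ from dattri's implementation in batching and distance handling. The function name is hypothetical.

```python
from typing import List


def knn_shapley_single_test(distances: List[float],
                            train_labels: List[int],
                            test_label: int,
                            k_neighbors: int) -> List[float]:
    """Shapley value of every training sample for one test sample."""
    n = len(distances)
    order = sorted(range(n), key=lambda i: distances[i])  # nearest first
    match = [1.0 if train_labels[i] == test_label else 0.0 for i in order]

    values_sorted = [0.0] * n
    # Farthest sample: s_N = 1[y_N == y_test] / N.
    values_sorted[n - 1] = match[n - 1] / n
    # Walk from the second-farthest sample towards the nearest one.
    for pos in range(n - 2, -1, -1):
        rank = pos + 1  # 1-based rank of the sample stored at `pos`
        values_sorted[pos] = values_sorted[pos + 1] + (
            (match[pos] - match[pos + 1]) / k_neighbors
            * min(k_neighbors, rank) / rank)

    # Undo the sort so values line up with the original training order.
    values = [0.0] * n
    for pos, idx in enumerate(order):
        values[idx] = values_sorted[pos]
    return values


distances = [0.9, 0.1, 0.5]   # distance of each training sample to the test sample
train_labels = [1, 1, 0]
values = knn_shapley_single_test(distances, train_labels,
                                 test_label=1, k_neighbors=1)
```

The values sum to the utility of the full training set (here 1.0, since the nearest neighbor has the correct label). Stacking one such column per test sample gives the (num_train_samples, num_test_samples) layout described above.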

cache() None

Precompute and cache some values for efficiency.