
Implementation of 'mixup: Beyond Empirical Risk Minimization'. At present it has been tested only on categorical data, where targets are expected to be integer class indices, not one-hot encoded vectors. This callback is meant to be used together with nn_mixup_loss().

Usage

luz_callback_mixup(alpha = 0.4)

Arguments

alpha

parameter for the beta distribution used to sample mixing coefficients
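
As a rough illustration (plain R, not part of luz): the mixing coefficients are drawn from Beta(alpha, alpha), so small alpha values concentrate them near 0 and 1 (mild mixing), while larger values push them toward 0.5 (stronger mixing).

alpha <- 0.4
weight <- rbeta(8, shape1 = alpha, shape2 = alpha)  # one coefficient per pair
weight <- pmax(weight, 1 - weight)                  # see Details
round(weight, 2)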

Value

A luz_callback

Details

Overall, we follow the fastai implementation of mixup (a short sketch of the mixing step follows the list below). Namely,

  • We work with a single dataloader only, randomly mixing two observations from the same batch.

  • We linearly combine losses computed for both targets: loss(output, new_target) = weight * loss(output, target1) + (1-weight) * loss(output, target2)

  • We draw different mixing coefficients for every pair.

  • We replace weight with weight = max(weight, 1-weight) to avoid duplicates.
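
To make the steps above concrete, here is a small sketch of the mixing applied to a single batch. The helper mix_batch(), its variable names, and the 2-d input are purely illustrative assumptions, not the callback's internals.

library(torch)

mix_batch <- function(x, y, alpha = 0.4) {
  n <- x$shape[1]
  perm <- sample(n)                              # random pairing within the batch
  w <- rbeta(n, shape1 = alpha, shape2 = alpha)  # one coefficient per pair
  w <- pmax(w, 1 - w)                            # avoid duplicates
  weight <- torch_tensor(w, dtype = torch_float())
  x_mix <- weight$unsqueeze(2) * x + (1 - weight$unsqueeze(2)) * x[perm, ]
  # nn_mixup_loss() then combines
  # weight * loss(output, y) + (1 - weight) * loss(output, y[perm])
  list(x = x_mix, targets = list(y, y[perm]), weight = weight)
}

mixed <- mix_batch(torch_randn(8, 4), torch_tensor(sample(3, 8, replace = TRUE)))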

Examples

if (torch::torch_is_installed()) {
  mixup_callback <- luz_callback_mixup()
}
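
A minimal end-to-end sketch pairing the callback with nn_mixup_loss(), as the description suggests. The toy data, the linear model, and all hyper-parameters below are placeholders, not part of this page.

if (torch::torch_is_installed()) {
  library(torch)
  library(luz)

  # toy classification data: integer class targets, as the callback expects
  x <- torch_randn(100, 10)
  y <- torch_tensor(sample(3, 100, replace = TRUE), dtype = torch_long())
  dl <- dataloader(tensor_dataset(x, y), batch_size = 25)

  net <- nn_module(
    initialize = function() {
      self$fc <- nn_linear(10, 3)
    },
    forward = function(x) {
      self$fc(x)
    }
  )

  fitted <- net %>%
    setup(
      loss = nn_mixup_loss(nn_cross_entropy_loss()),
      optimizer = optim_adam
    ) %>%
    fit(dl, epochs = 1, callbacks = list(luz_callback_mixup(alpha = 0.4)))
}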