
Implementation of 'mixup: Beyond Empirical Risk Minimization'. Currently tested only for categorical data, where targets are expected to be integers rather than one-hot encoded vectors. This callback is meant to be used together with nn_mixup_loss().

Usage

luz_callback_mixup(alpha = 0.4, ..., run_valid = FALSE, auto_loss = FALSE)

Arguments

alpha

parameter for the beta distribution used to sample mixing coefficients

...

currently unused. Just to force named arguments.

run_valid

Should the callback also run during validation?

auto_loss

Should it automatically modify the loss function? This will wrap the loss function to create the mixup loss. If TRUE, make sure that your loss function does not apply any reduction. If run_valid = FALSE, the loss will be mean-reduced during validation.
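
For instance, a minimal sketch of a suitable unreduced loss; the choice of cross-entropy here is only an illustrative assumption:

# a loss that returns per-observation values, as required when auto_loss = TRUE
loss_fn <- torch::nn_cross_entropy_loss(reduction = "none")
mixup   <- luz_callback_mixup(alpha = 0.4, auto_loss = TRUE)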

Value

A luz_callback

Details

Overall, we follow the implementation described in the fastai documentation. Namely,

  • We work with a single dataloader only, randomly mixing two observations from the same batch.

  • We linearly combine the losses computed for both targets: loss(output, new_target) = weight * loss(output, target1) + (1-weight) * loss(output, target2) (see the sketch after this list).

  • We draw different mixing coefficients for every pair.

  • We replace weight with weight = max(weight, 1-weight) to avoid duplicates.
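
As an illustration of the combination rule above, here is a minimal sketch using plain torch tensors. The tensors, the batch size, and the cross-entropy loss are hypothetical stand-ins chosen for the example, not part of the callback's API:

if (torch::torch_is_installed()) {
  library(torch)

  loss_fn <- nn_cross_entropy_loss(reduction = "none")      # per-observation losses

  output  <- torch_randn(8, 5)                               # batch of 8 observations, 5 classes
  target1 <- torch_tensor(sample(5, 8, replace = TRUE), dtype = torch_long())
  target2 <- target1[sample(8)]                              # random pairing within the same batch

  weight <- rbeta(8, shape1 = 0.4, shape2 = 0.4)             # one mixing coefficient per pair
  weight <- torch_tensor(pmax(weight, 1 - weight))           # avoid duplicated pairs

  mixed  <- weight * loss_fn(output, target1) +
    (1 - weight) * loss_fn(output, target2)
  mixed$mean()
}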

Examples

if (torch::torch_is_installed()) {
mixup_callback <- luz_callback_mixup()
}
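
A more complete sketch of how the callback is typically combined with nn_mixup_loss() in a luz pipeline. The synthetic data, the single-layer model, and the cross-entropy loss below are illustrative assumptions, not part of the callback's API:

if (torch::torch_is_installed()) {
  library(torch)
  library(luz)

  # tiny synthetic classification task: 100 observations, 10 features, 5 classes
  x <- torch_randn(100, 10)
  y <- torch_tensor(sample(5, 100, replace = TRUE), dtype = torch_long())
  train_dl <- dataloader(tensor_dataset(x, y), batch_size = 25)

  net <- nn_module(
    initialize = function() {
      self$fc <- nn_linear(10, 5)
    },
    forward = function(x) {
      self$fc(x)
    }
  )

  fitted <- net %>%
    setup(
      # the base loss must not reduce, so it can be weighted per observation
      loss = nn_mixup_loss(nn_cross_entropy_loss(reduction = "none")),
      optimizer = optim_adam
    ) %>%
    fit(train_dl, epochs = 1, callbacks = list(luz_callback_mixup(alpha = 0.4)))
}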