
Implementation of 'mixup: Beyond Empirical Risk Minimization'. So far it has been tested only on categorical data, where targets are expected to be integers rather than one-hot encoded vectors. This callback is intended to be used together with nn_mixup_loss().


luz_callback_mixup(alpha = 0.4, ..., run_valid = FALSE, auto_loss = FALSE)



alpha
Parameter for the beta distribution used to sample mixing coefficients.


...
Currently unused; only there to force named arguments.


run_valid
Should it run during validation?


auto_loss
Should it automatically modify the loss function? This will wrap the loss function to create the mixup loss. If TRUE, make sure that your loss function does not apply reductions. If run_valid=FALSE, the loss will be mean-reduced during validation.


A luz_callback
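
As a rough illustration of how the returned callback might be combined with nn_mixup_loss(), the sketch below fits a toy classifier for one epoch. The data, the module `net`, and all sizes are hypothetical and chosen only to keep the example small; the whole block is guarded so it does nothing when luz or a torch backend is unavailable.

```r
# Sketch only: runs when both luz and a torch backend are available.
fitted <- NULL
if (requireNamespace("luz", quietly = TRUE) &&
    requireNamespace("torch", quietly = TRUE) &&
    torch::torch_is_installed()) {

  # Hypothetical toy data: 64 observations, 10 features, 3 integer classes
  # (integer targets, not one-hot vectors).
  x <- torch::torch_randn(64, 10)
  y <- torch::torch_tensor(sample(1:3, 64, replace = TRUE),
                           dtype = torch::torch_long())
  train_dl <- torch::dataloader(torch::tensor_dataset(x, y), batch_size = 16)

  # Hypothetical one-layer classifier.
  net <- torch::nn_module(
    initialize = function() {
      self$fc <- torch::nn_linear(10, 3)
    },
    forward = function(x) {
      self$fc(x)
    }
  )

  # Wrap the base loss with nn_mixup_loss() so it can handle mixed targets.
  model <- luz::setup(
    net,
    loss = luz::nn_mixup_loss(torch::nn_cross_entropy_loss()),
    optimizer = torch::optim_adam
  )

  # Pass the callback to fit(); alpha = 0.4 is the documented default.
  fitted <- luz::fit(
    model, train_dl,
    epochs = 1, verbose = FALSE,
    callbacks = list(luz::luz_callback_mixup(alpha = 0.4))
  )
}
```

Here auto_loss is left at FALSE, so the loss is wrapped explicitly; with auto_loss = TRUE the wrapping would presumably be done by the callback instead.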


Overall, we follow the fastai implementation described here. Namely,

  • We work with a single dataloader only, randomly mixing two observations from the same batch.

  • We linearly combine losses computed for both targets: loss(output, new_target) = weight * loss(output, target1) + (1-weight) * loss(output, target2)

  • We draw different mixing coefficients for every pair.

  • We replace weight with weight = max(weight, 1-weight) to avoid duplicates.
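
The steps listed above can be sketched in base R, without torch, to make the arithmetic concrete. This is not the package's code: the toy batch, the uniform `probs` matrix standing in for model output, and the helper `nll()` are hypothetical, and drawing from a symmetric Beta(alpha, alpha) is an assumption taken from the mixup paper.

```r
set.seed(1)
alpha <- 0.4
batch_size <- 4

# Hypothetical toy batch: 4 observations with 3 features, integer targets.
x <- matrix(rnorm(batch_size * 3), nrow = batch_size)
target1 <- c(1L, 2L, 3L, 1L)

# Pair each observation with another one from the same batch.
shuffle <- sample(batch_size)
target2 <- target1[shuffle]

# One mixing coefficient per pair (assumed Beta(alpha, alpha)),
# then weight = max(weight, 1 - weight), which places it in [0.5, 1].
weight <- rbeta(batch_size, alpha, alpha)
weight <- pmax(weight, 1 - weight)

# Mix the inputs row by row: row i is scaled by weight[i].
x_mixed <- weight * x + (1 - weight) * x[shuffle, , drop = FALSE]

# Hypothetical model output: uniform class probabilities for every row.
probs <- matrix(1 / 3, nrow = batch_size, ncol = 3)

# Unreduced negative log-likelihood, one value per observation.
nll <- function(probs, target) {
  -log(probs[cbind(seq_along(target), target)])
}

# Linear combination of the two per-observation losses, reduced at the end.
loss <- weight * nll(probs, target1) + (1 - weight) * nll(probs, target2)
mean_loss <- mean(loss)
```

The per-observation losses are combined before any reduction, which is why the underlying loss function must not apply a reduction of its own.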


if (torch::torch_is_installed()) {
mixup_callback <- luz_callback_mixup()