Normalizing sparse transform (a la softmax).
Arguments
- dim: The dimension along which to apply sparsemax.
- k: The number of largest elements to partial-sort the input over. For optimal performance, k should be slightly larger than the expected number of non-zeros in the solution. If the solution turns out to be more than k-sparse, the function is called recursively with a 2*k schedule. If NULL, a full sort is performed from the beginning.
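To make the role of k concrete, here is a minimal NumPy sketch (not the package's R/torch implementation) of the plain sparsemax projection with alpha = 2 and a full sort; the alpha = 1.5 variant uses a different closed-form threshold, but the support-finding step is analogous. The top-k schedule described above replaces the full descending sort with a partial sort over k elements, doubling k whenever the detected support fills all k slots.

```python
import numpy as np

def sparsemax(z):
    # Plain sparsemax (alpha = 2): Euclidean projection of z onto the
    # probability simplex (Martins & Astudillo, 2016), full-sort version.
    z_sorted = np.sort(z)[::-1]            # descending sort (what top-k avoids)
    cssv = np.cumsum(z_sorted)             # cumulative sums of sorted values
    ks = np.arange(1, len(z) + 1)
    support = ks * z_sorted > cssv - 1.0   # condition 1 + k*z_(k) > sum_{j<=k} z_(j)
    k_support = ks[support][-1]            # size of the support (number of non-zeros)
    tau = (cssv[k_support - 1] - 1.0) / k_support  # threshold
    return np.maximum(z - tau, 0.0)        # zero out everything below tau

p = sparsemax(np.array([2.0, 1.0, 0.1]))   # only the largest entry survives here
```

In the recursive schedule, one would partial-sort only the k largest entries; if `k_support` equals k, the true support may extend past the partial sort, so the computation is redone with 2*k.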
Examples
input <- torch::torch_randn(10, 5, requires_grad = TRUE)
# create a top-3 alpha=1.5 sparsemax along the first input dimension
nn_sparsemax <- sparsemax15(dim=1, k=3)
result <- nn_sparsemax(input)
print(result)
#> torch_tensor
#> 0.0000 0.0000 0.0000 0.0000 1.0000
#> 0.0000 0.2258 0.0000 0.0000 0.0000
#> 0.0000 0.0000 0.0000 0.0460 0.0000
#> 0.0000 0.0000 0.0000 0.8466 0.0000
#> 0.0000 0.0000 0.0000 0.0000 0.0000
#> 0.0000 0.0000 0.6394 0.1074 0.0000
#> 1.0000 0.0623 0.0000 0.0000 0.0000
#> 0.0000 0.0432 0.0000 0.0000 0.0000
#> 0.0000 0.0000 0.0000 0.0000 0.0000
#> 0.0000 0.6688 0.3606 0.0000 0.0000
#> [ CPUFloatType{10,5} ][ grad_fn = <torch::autograd::LanternNode> ]