Normalizing sparse transform (a la softmax).
Arguments
- dim
The dimension along which to apply sparsemax.
- k
The number of largest elements to partial-sort the input over. For optimal performance, k should be slightly bigger than the expected number of non-zeros in the solution. If the solution is more than k-sparse, this function is recursively called with a 2*k schedule. If NULL, full sorting is performed from the beginning.
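The k schedule described above can be illustrated with a minimal base-R sketch for a single numeric vector. This is not the package's implementation: the helper name `entmax15_vec` is hypothetical, a full `sort()` followed by truncation stands in for a true partial sort, and the threshold formula is the standard closed form for the alpha = 1.5 case, assumed here rather than taken from this page.

```r
# Hypothetical helper: alpha = 1.5 sparse transform of one numeric vector,
# using only the k largest entries and doubling k when that is not enough.
entmax15_vec <- function(z, k = NULL) {
  x <- (z - max(z)) / 2                 # shift for stability, scale by (alpha - 1)
  n <- length(x)
  full <- is.null(k) || k >= n          # NULL (or large k) means full sorting
  top <- sort(x, decreasing = TRUE)     # stand-in for a partial sort
  if (!full) top <- top[seq_len(k)]

  rho <- seq_along(top)
  m <- cumsum(top) / rho                # running mean of the largest entries
  m_sq <- cumsum(top^2) / rho           # running mean of their squares
  ss <- rho * (m_sq - m^2)
  delta <- (1 - ss) / rho
  tau <- m - sqrt(pmax(delta, 0))       # candidate thresholds
  support <- sum(tau <= top)            # number of non-zeros found

  # If every inspected entry is in the support, the true support may be
  # larger than k, so retry with 2 * k.
  if (!full && support == k) {
    return(entmax15_vec(z, k = 2 * k))
  }
  pmax(x - tau[support], 0)^2
}

z <- c(2, 1, 0.1, -1)
p_full <- entmax15_vec(z)               # full sorting from the beginning
p_top1 <- entmax15_vec(z, k = 1)        # starts at k = 1, doubles until solved
print(p_full)
print(isTRUE(all.equal(p_full, p_top1)))
```

Both calls should return the same probabilities. A small k only pays off when it already covers the support, since each doubling repeats the threshold computation.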
Examples
input <- torch::torch_randn(10, 5, requires_grad = TRUE)
# create a top-3, alpha = 1.5 sparsemax along the first input dimension
# (dim = 1; torch dimensions are 1-based in R, so each column below sums to 1)
nn_sparsemax <- sparsemax15(dim = 1, k = 3)
result <- nn_sparsemax(input)
print(result)
#> torch_tensor
#> 0.0000 0.0000 0.0000 0.0000 0.4173
#> 0.0000 0.8498 0.0000 0.0000 0.0000
#> 0.0000 0.0314 0.0000 0.0000 0.3079
#> 0.0000 0.0000 0.0000 0.0000 0.2748
#> 0.4977 0.0000 0.1949 0.0000 0.0000
#> 0.0000 0.0384 0.0000 0.0000 0.0000
#> 0.0000 0.0804 0.0000 1.0000 0.0000
#> 0.0000 0.0000 0.0000 0.0000 0.0000
#> 0.5023 0.0000 0.0000 0.0000 0.0000
#> 0.0000 0.0000 0.8051 0.0000 0.0000
#> [ CPUFloatType{10,5} ][ grad_fn = <torch::autograd::LanternNode> ]