Softmax
The softmax function is widely used in machine learning to convert a vector of raw scores (logits) into a probability distribution: the outputs are non-negative and sum to 1.
Given a vector $\mathbf{z} = [z_1, z_2, \ldots, z_n]$, the softmax function is defined as:
$$ \sigma(\mathbf{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{n} e^{z_j}}, \quad i = 1, \ldots, n $$
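The definition can be implemented directly from the formula; a minimal sketch in plain Python (the function name `softmax` and the example input are illustrative):

```python
import math

def softmax(z):
    # Exponentiate each entry, then normalize by the sum,
    # exactly as in the definition above.
    exps = [math.exp(zi) for zi in z]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([5.0, 4.0, -1.0])
print(probs)  # three values that sum to 1, approximately [0.7297, 0.2685, 0.0018]
```

Note that exponentiation preserves the ordering of the inputs, so the largest logit always receives the largest probability.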
Here’s an example in PyTorch:
import torch
import torch.nn.functional as F

logits = torch.tensor([5.0, 4.0, -1.0], dtype=torch.float64)
# dim=0 tells softmax to normalize over this 1-D tensor's only axis
probabilities = F.softmax(logits, dim=0)
print(probabilities)
# tensor([0.7297, 0.2685, 0.0018], dtype=torch.float64)