From 94c1a303096fff556a5696a2cdefc45857591382 Mon Sep 17 00:00:00 2001
From: Karishma Battina <67629745+karishma-battina@users.noreply.github.com>
Date: Mon, 2 Feb 2026 07:12:08 +0000
Subject: [PATCH 1/2] Add tensor softmax operation

---
 .../terms/softmax/softmax.md | 98 +++++++++++++++++++
 1 file changed, 98 insertions(+)
 create mode 100644 content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md

diff --git a/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md b/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md
new file mode 100644
index 00000000000..45624682770
--- /dev/null
+++ b/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md
@@ -0,0 +1,98 @@
+---
+Title: '.softmax()'
+Description: 'Applies the Softmax function to an n-dimensional input Tensor, rescaling elements so they lie in the range [0, 1] and sum to 1.'
+Subjects:
+  - 'Computer Science'
+  - 'Machine Learning'
+Tags:
+  - 'Neural Networks'
+  - 'PyTorch'
+  - 'Tensor'
+CatalogContent:
+  - 'learn-python-3'
+  - 'paths/machine-learning'
+---
+
+The **`.softmax()`** function applies the Softmax mathematical transformation to an input tensor. It is a critical operation in deep learning, particularly for multi-class classification tasks. Softmax converts a vector of raw scores (often called logits) into a probability distribution where each value represents the likelihood of a specific class.
+
+The Softmax function for an element $x_i$ in a vector $x$ is defined as:
+
+$$\text{Softmax}(x_{i}) = \frac{\exp(x_i)}{\sum_{j} \exp(x_j)}$$
+
+By exponentiating the inputs, the function ensures all outputs are non-negative. By dividing by the sum of these exponentials, it ensures that the resulting values sum to exactly 1.
+
+## Syntax
+
+```pseudo
+torch.softmax(input, dim, dtype=None)
+```
+
+**Parameters:**
+
+- `input`: The input tensor containing the raw scores (logits).
+- `dim`: A dimension along which Softmax will be computed. Every slice along dim will sum to 1.
+- `rounding_mode` (Optional): Controls the rounding behavior. Can be `None` (default), `trunc`, or `floor`.
+- `dtype` (Optional): The desired data type of the returned tensor.
+
+**Return value:**
+
+A tensor of the same shape as `input`, with values scaled between 0 and 1.
+
+## Example 1: Basic Softmax on a 1D Tensor
+
+This example shows how to convert a simple 1D tensor of logits into probabilities:
+
+```py
+import torch
+
+# A 1D tensor of raw scores
+logits = torch.tensor([1.0, 2.0, 3.0])
+
+# Apply softmax along the only dimension (0)
+probabilities = torch.softmax(logits, dim=0)
+print("Logits:", logits)
+print("Probabilities:", probabilities)
+print("Sum of probabilities:", probabilities.sum().item())
+```
+
+Here is the output:
+
+```shell
+Logits: tensor([1., 2., 3.])
+Probabilities: tensor([0.0900, 0.2447, 0.6652])
+Sum of probabilities: 1.0
+```
+
+The function converts raw logits into probabilities where the highest input value (3.0) yields the highest probability (~0.66), and the sum of all probabilities equals 1.0.
+
+## Example 2: Softmax on a 2D Tensor
+
+In real-world scenarios, data is processed in batches. For a 2D tensor where rows represent samples and columns represent classes, we usually apply softmax along `dim=1`.
+
+```py
+import torch
+
+# A 2D tensor (2 samples, 3 classes)
+logits = torch.tensor([
+    [2.0, 1.0, 0.1],
+    [1.0, 3.0, 0.2]
+])
+
+# Apply softmax along the class dimension (dim=1)
+probs = torch.softmax(logits, dim=1)
+
+print("Probabilities:\n", probs)
+print("\nSum of each row:", probs.sum(dim=1))
+```
+
+Here is the output:
+
+```shell
+Probabilities:
+ tensor([[0.6590, 0.2424, 0.0986],
+        [0.1131, 0.8360, 0.0508]])
+
+Sum of each row: tensor([1.0000, 1.0000])
+```
+
+By specifying dim=1, the operation is applied independently to each row (sample), ensuring that the class probabilities for each individual sample sum to 1.0.

From 463afff1fdedd7b6a13bf80db8ac0952bbf7c673 Mon Sep 17 00:00:00 2001
From: Mamta Wardhani
Date: Tue, 3 Feb 2026 10:34:56 +0530
Subject: [PATCH 2/2] Update softmax.md for clarity and formatting

---
 .../concepts/tensor-operations/terms/softmax/softmax.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md b/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md
index 45624682770..ba7b78b0fdf 100644
--- a/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md
+++ b/content/pytorch/concepts/tensor-operations/terms/softmax/softmax.md
@@ -31,12 +31,11 @@ torch.softmax(input, dim, dtype=None)
 
 - `input`: The input tensor containing the raw scores (logits).
 - `dim`: A dimension along which Softmax will be computed. Every slice along dim will sum to 1.
-- `rounding_mode` (Optional): Controls the rounding behavior. Can be `None` (default), `trunc`, or `floor`.
 - `dtype` (Optional): The desired data type of the returned tensor.
 
 **Return value:**
 
-A tensor of the same shape as `input`, with values scaled between 0 and 1.
+Returns a tensor of the same shape as `input`, with values scaled between 0 and 1.
 
 ## Example 1: Basic Softmax on a 1D Tensor
 
@@ -67,7 +66,7 @@ The function converts raw logits into probabilities where the highest input valu
 
 ## Example 2: Softmax on a 2D Tensor
 
-In real-world scenarios, data is processed in batches. For a 2D tensor where rows represent samples and columns represent classes, we usually apply softmax along `dim=1`.
+For batched inputs where rows represent samples and columns represent classes, Softmax is typically applied along the class dimension:
 
 ```py
 import torch
@@ -95,4 +94,4 @@
 Sum of each row: tensor([1.0000, 1.0000])
 ```
 
-By specifying dim=1, the operation is applied independently to each row (sample), ensuring that the class probabilities for each individual sample sum to 1.0.
+By specifying `dim=1`, the operation is applied independently to each row (sample), ensuring that the class probabilities for each individual sample sum to 1.0.
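
The entry above also lists an optional `dtype` argument, and the same operation is available as a tensor method. The following minimal sketch (not part of either patch) illustrates both; it assumes a local PyTorch installation, and the names `probs_method` and `probs_double` are illustrative only.

```py
import torch

# Raw scores for a batch of 2 samples and 3 classes (same values as Example 2)
logits = torch.tensor([[2.0, 1.0, 0.1],
                       [1.0, 3.0, 0.2]])

# Method form: Tensor.softmax(dim) mirrors torch.softmax(input, dim)
probs_method = logits.softmax(dim=1)

# dtype casts the input before the operation, so the result is float64 here
probs_double = torch.softmax(logits, dim=1, dtype=torch.float64)

print(probs_method.dtype)    # torch.float32 (default dtype preserved)
print(probs_double.dtype)    # torch.float64
print(torch.allclose(probs_method.double(), probs_double))  # True, up to float32 precision
```

When the probabilities feed into further log-domain computation, `torch.log_softmax()` can be the more numerically stable choice than taking the logarithm of a Softmax result.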