Commit b1efc1b

QuantLLM Version v2.0.0 Completed.
Parent: 83da426

2 files changed: 7 additions, 5 deletions

.gitignore

Lines changed: 1 addition & 0 deletions

```diff
@@ -188,3 +188,4 @@ cython_debug/
 .pypirc
 venv
 upcoming.md
+post.md
```

README.md

Lines changed: 6 additions & 5 deletions

```diff
@@ -112,7 +112,7 @@ Llama 2/3, Mistral, Mixtral, Qwen/Qwen2, Phi-1/2/3, Gemma, Falcon, GPT-NeoX, Sta
 <td>
 
 ### 📦 6 Export Formats
-- **GGUF** - llama.cpp, Ollama, LM Studio
+- **GGUF** - Pure Python export (No binaries!)
 - **ONNX** - ONNX Runtime, TensorRT
 - **SafeTensors** - HuggingFace
 - **MLX** - Apple Silicon
@@ -125,19 +125,20 @@ Llama 2/3, Mistral, Mixtral, Qwen/Qwen2, Phi-1/2/3, Gemma, Falcon, GPT-NeoX, Sta
 <td>
 
 ### 🔧 Zero-Config Smart Defaults
-- Hardware auto-detection (GPU, memory, capabilities)
-- Optimal quantization selection
-- Automatic batch size calculation
+- **SmartConfig Stats Panel** (See size before loading)
+- Hardware auto-detection & optimization
+- Automatic quantization selection
 - Memory-aware loading
 
 </td>
 <td>
 
 ### 💾 Memory Optimizations
+- **Dynamic Padding** (Efficient training)
+- **OOM Prevention** (Expandable segments)
 - Dynamic CPU ↔ GPU offloading
 - Gradient checkpointing
 - CPU optimizer states
-- Layer-wise memory tracking
 
 </td>
 </tr>
```
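The "Pure Python export (No binaries!)" bullet means GGUF files can now be written directly from Python instead of shelling out to llama.cpp's converter binaries. A minimal sketch of that idea using the gguf-py package (`pip install gguf`); whether QuantLLM uses this package internally is an assumption, and the tensor here is a placeholder:

```python
import numpy as np
from gguf import GGUFWriter

# Write a GGUF file purely in Python -- no compiled converter required.
writer = GGUFWriter("model.gguf", "llama")  # architecture is recorded in the header
writer.add_block_count(32)                  # example metadata key (llama.block_count)

# Placeholder tensor; a real exporter would iterate over the model's state dict.
writer.add_tensor("token_embd.weight", np.zeros((32000, 4096), dtype=np.float32))

writer.write_header_to_file()
writer.write_kv_data_to_file()
writer.write_tensors_to_file()
writer.close()
```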

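"Dynamic Padding (Efficient training)" conventionally means padding each batch only to its own longest sequence rather than to a fixed global length, which avoids wasting compute on pad tokens. A sketch with Hugging Face's stock collator; the commit doesn't show QuantLLM's own implementation, so this is illustrative only:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tok = AutoTokenizer.from_pretrained("gpt2")
tok.pad_token = tok.eos_token  # GPT-2 ships without a pad token

collator = DataCollatorWithPadding(tokenizer=tok)
batch = collator([tok("short"), tok("a noticeably longer example sentence")])

# Padded to this batch's max length, not to the model's max context size.
print(batch["input_ids"].shape)
```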
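"OOM Prevention (Expandable segments)" points at a standard PyTorch (2.1+) CUDA allocator option that reduces fragmentation-driven out-of-memory errors by letting allocator segments grow in place. The environment variable below is real PyTorch; that QuantLLM sets it on the user's behalf is an assumption:

```python
import os

# Must be set before the first CUDA allocation in the process.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "expandable_segments:True")

import torch

if torch.cuda.is_available():
    # The caching allocator can now grow an existing segment instead of
    # failing with an OOM when free memory is fragmented.
    x = torch.empty(4096, 4096, device="cuda")
```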