Hugging Face
Prithiv Sakthi
PRO
prithivMLmods
4,646 followers · 3 following
https://linktr.ee/prithivsakthi
prithivMLmods
prithivsakthiur
prithiv-sakthi
AI & ML interests
computer vision, nlp, multimodality - HuggingFace Fellow ML 🤗
Recent Activity
Reacted to lbourdois's post with 🔥 about 10 hours ago:
New blog post! An introduction to a little-known but highly effective model reduction method: Trimming ✂️

We show how to reduce model size (by up to 87.24%) while preserving performance. We applied this technique to 16 different model families across several modalities to illustrate that it works on any architecture (as long as the embedding layer is the last one of the model) and on any modality involving text. From these 16 families, we generated over 5,500 monolingual models in 124 different languages 🌍

Key takeaways from our experiments:
1️⃣ Trimming does not require a GPU. Our models were obtained on a CPU.
2️⃣ This method scales up to at least 4B parameters (we did not test beyond that).
3️⃣ A trimmed model is smaller than the original while preserving its performance. If you observe a slight performance drop, just fine-tune to recover or even surpass the original performance.
4️⃣ For an equivalent compute budget, it is better to trim and then fine-tune than to fine-tune the original model. Since the model is smaller, you can run more epochs or show more data and ultimately get a better model than the original.
5️⃣ Trimming is a competitive alternative to distillation and quantization. E.g., we obtained our alternative to DistilBERT in 9 minutes on CPU vs. 90 hours of GPU for the latter.
6️⃣ Trimming could generate reasoning traces in the language of the trimmed model. This could be an alternative to generating traces in English and then translating them into the desired language.

Many other things (such as how much data is needed, the impact of the database used, the order in which it should be done, etc.) are covered in the blog post!

Blogpost: https://huggingface.co/blog/lbourdois/introduction-to-trimming
Models: https://huggingface.co/spaces/alphaedge-ai/Trimming_models_search
Reacted to lbourdois's post with 🤗 about 10 hours ago (same post as above)
Updated a collection about 10 hours ago: MTP Qwen 3.5/3.6 Stable
prithivMLmods's datasets (126), sorted by recently updated
prithivMLmods/Helios-R-6M • Updated 16 days ago • 6.19M • 407 • 3
prithivMLmods/Demeter-LongCoT-6M • Updated 16 days ago • 6.44M • 311 • 4
prithivMLmods/Open-Omega-Atom-1.5M • Updated 16 days ago • 1.63M • 383 • 6
prithivMLmods/Caption3o-Opt-v3-Tiny • Updated 16 days ago • 27k • 273 • 3
prithivMLmods/OpenWeb383K • Updated 17 days ago • 383k • 34 • 4
prithivMLmods/Receipt-KIE-200 • Updated 27 days ago • 199 • 82 • 1
prithivMLmods/OCR-Markdown-Dense-200x • Updated Apr 21 • 200 • 56 • 1
prithivMLmods/harm_bench • Updated Apr 20 • 4k • 341 • 5
prithivMLmods/Open-Omega-Forge-1M • Updated Mar 9 • 1M • 573 • 7
prithivMLmods/Gacrux-Tiny-1M • Updated Mar 9 • 1.07M • 45 • 4
prithivMLmods/d.HTML • Updated Mar 9 • 110 • 29 • 1
prithivMLmods/LAP2-K-Think-v1.b • Updated Nov 26, 2025 • 380k • 69 • 4
prithivMLmods/Pegasus-Tiny-250K • Updated Nov 26, 2025 • 292k • 74 • 2
prithivMLmods/LAP2-K-Think-v1.a • Updated Nov 25, 2025 • 257k • 116 • 2
prithivMLmods/Caption3o-LongCap-v4 • Updated Sep 15, 2025 • 523k • 253 • 8
prithivMLmods/Caption3o-XL-v4 • Updated Sep 15, 2025 • 52.8k • 828 • 3
prithivMLmods/Turing-Reason-CoT • Updated Sep 15, 2025 • 4.99M • 122 • 6
prithivMLmods/Turing-Reason-CoT-Mini • Updated Sep 15, 2025 • 558k • 87 • 3
prithivMLmods/Gargantua-R1-Compact • Updated Sep 9, 2025 • 6.67M • 924 • 7
prithivMLmods/OpenDoc-Null-6K • Updated Sep 9, 2025 • 6.91k • 84 • 2
prithivMLmods/Caption3o-Opt-v3 • Updated Aug 28, 2025 • 97.1k • 683 • 5
prithivMLmods/Openpdf-Analysis-Recognition • Updated Aug 24, 2025 • 6.91k • 2.55k • 4
prithivMLmods/Demeter-LongCoT-400K • Updated Aug 24, 2025 • 400k • 73 • 2
prithivMLmods/Gargantua-R1-Wee • Updated Aug 9, 2025 • 233k • 474 • 1
prithivMLmods/Atlas-Think-Cot-12M • Updated Jul 25, 2025 • 12.4M • 620 • 2
prithivMLmods/Poseidon-Reasoning-5M • Updated Jul 18, 2025 • 4.99M • 241 • 17
prithivMLmods/Poseidon-Reasoning-Mini-300K • Updated Jul 18, 2025 • 356k • 222 • 6
prithivMLmods/Open-Omega-Explora-2.5M • Updated Jul 17, 2025 • 2.63M • 358 • 3
prithivMLmods/Caption3o-Opt-v2 • Updated Jul 15, 2025 • 10.3k • 59 • 4
prithivMLmods/Corvus-OCR-Caption-Mix • Updated Jul 13, 2025 • 230k • 314 • 6