Hugging Face
Datasets: 6,391
Sort: Trending
Active filters: 10M<n<100M
allenai/OLMoASR-Pool • Viewer • Updated 13 days ago • 16.9M • 53 • 12
nvidia/Nemotron-Post-Training-Dataset-v1 • Viewer • Updated Aug 25, 2025 • 25.7M • 5.6k • 178
OPPOer/X2Edit-Dataset • Viewer • Updated Dec 30, 2025 • 17.6M • 6.44k • 19
domyn/FinReflectKG • Viewer • Updated Oct 6, 2025 • 17.5M • 324 • 16
collectivat/ladino-synthetic-parallel • Viewer • Updated Oct 20, 2025 • 20.6M • 32 • 1
Open-Bee/Honey-Data-15M • Viewer • Updated 23 days ago • 14.8M • 71.7k • 114
mteb/WebFAQRetrieval • Viewer • Updated Oct 19, 2025 • 11.7M • 1.07k • 1
tokyotech-llm/swallow-math-v2 • Viewer • Updated Nov 6, 2025 • 17.4M • 6.26k • 30
tomaztc/wiki_dpr_gemma_embeddings • Viewer • Updated Oct 30, 2025 • 18.6M • 9 • 1
HuggingFaceFW/finepdfs-edu • Viewer • Updated Nov 11, 2025 • 49.5M • 6.52k • 85
ScienceOne-AI/S1-MMAlign • Viewer • Updated 20 days ago • 21.1M • 4.19k • 101
tasksource/SYNTH • Viewer • Updated Jan 27 • 12.1M • 227 • 1
HuggingFaceVLA/community_dataset_v3 • Updated Dec 10, 2025 • 10.1k • 22
Web3Survivor/Survivor • Viewer • Updated Dec 11, 2025 • 49.5M • 43 • 2
reasoning-core/symbolic-pretraining-pile • Viewer • Updated 10 days ago • 16.2M • 254 • 3
nvidia/embed-nemotron-dataset-v1 • Viewer • Updated Jan 12 • 12.8M • 803 • 101
kamalnsr123456/deutsche-bahn-data • Viewer • Updated Jan 18 • 74.2M • 1.53k • 1
philippesaade/Wikidata_Vectors_0.2 • Viewer • Updated 3 days ago • 44.6M • 2.25k • 1
RentoSaijo/NHL_DB • Viewer • Updated 1 day ago • 46.6M • 894 • 2
Shio-Koube/Danbooru-2026-parquet-metadata • Viewer • Updated Jan 30 • 10.6M • 189 • 5
Amshaker/Mobile-O-Pre-Train • Viewer • Updated Feb 24 • 22.8M • 3.59k • 10
lerobot-data-collection/level12_rac_2_2026-02-07 • Viewer • Updated Feb 7 • 13.8M • 1.56k • 2
GildasLeDrogoff/spotify-huge-track-analysis-dataset • Viewer • Updated Feb 11 • 56.3M • 174 • 2
alexroz/CarbonBench • Viewer • Updated Feb 13 • 11.7M • 94 • 1
AdaMLLab/WebTerminal • Viewer • Updated Feb 14 • 63.7M • 222 • 3
everycure/matrix-scores • Viewer • Updated 2 days ago • 39.5M • 46 • 1
aimlresearch2023/ClimbMix10M • Viewer • Updated 27 days ago • 10M • 320 • 1
MIL-UT/Japanese-Medical-VQA-12m • Viewer • Updated 19 days ago • 12.1M • 4.82k • 6
yuyijiong/context_qa_sum_qwen3_synthetic • Viewer • Updated 3 days ago • 22.6M • 287 • 3
BrockMisner/polymarket-crypto-5m-15m • Viewer • Updated 17 days ago • 27M • 135 • 1