LARK-Lab/EnvFactory-SFT-DeepSeekV4Flash
Large Language Models
Attention Amnesia in Hybrid LLMs: When CoT Fine-Tuning Breaks Long-Range Recall, and How to Fix It
EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL