Constraint Tax in Open-Weight LLMs: An Empirical Study of Tool Calling Suppression Under Structured Output Constraints Paper • 2606.25605 • Published 1 day ago
Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning Paper • 2606.24428 • Published 3 days ago
Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding Paper • 2606.21906 • Published 6 days ago
Sleeping Agents FEST-Style Few-Shot RL for Reasoning 🧠 Solve math problems with step‑by‑step reasoning
Sleeping Agents Implicit Memory Conflict Validator 🧠 Evaluate LLM responses for outdated memory conflicts
Sleeping Agents Sudanese CoT Reasoning Benchmark 🧠 Run Sudanese Arabic reasoning benchmark with step-by-step analysis
Sleeping Agents COPSD Sudanese Reasoning Demo 🚀 Compare Sudanese math reasoning with and without English context