-
Inference-time factuality improvement in LLMs: from layer contrasting to deep-thinking tokens
Paper Review · LLMs hallucinate (Huang et al., 2025). They...
-
Out-of-Distribution Detection in Vision-Language Models: A Survey
Paper Review · Vision-Language Models (VLMs) like CLIP have dramatically shifted the landscape of visual understanding. Trained on internet-scale image-text pairs, these models demonstrate remarkable zero-shot generalization, describing objects they have never explicitly seen during training. Yet this generalization comes with an underappreciated fragility: when deployed in the real world, VLMs routinely encounter inputs that bear no resemblance to anything...
-
Reasoning's Razor: When Thinking More Makes Safety Worse
Paper Review · Large Reasoning Models (LRMs) like DeepSeek-R1 and QwQ-32B have become remarkably capable at solving complex problems through extended chain-of-thought. The natural instinct is to apply this power to safety-critical tasks: detecting harmful content, catching hallucinations, flagging policy violations. More reasoning = more accuracy = safer AI, right?
A new paper challenges that intuition head-on. “Reasoning’s Razor”...