Complexity of Classical Acceleration for ℓ1-Regularized PageRank
Kimon Fountoulakis
University of Waterloo, Canada
kimon.fountoulakis@uwaterloo.ca
David Martínez-Rubio
IMDEA Software Institute, Madrid, Spain
david.martinezrubio@imdea.org
February 25, 2026
Abstract
We study the degree-weighted work required to compute ℓ1-regularized PageRank using the standard one-gradient-per-iteration accelerated proximal-gradient method (FISTA). For non-accelerated local methods, the best known worst-case w
Failed to generate LLM review.
Failed to generate research directions.
LUMEN: LONGITUDINAL MULTI-MODAL RADIOLOGY MODEL FOR PROGNOSIS AND DIAGNOSIS
Zhifan Jiang1
Dong Yang2
Vishwesh Nath2
Abhijeet Parida1,3
Nishad P. Kulkarni1
Ziyue Xu2
Daguang Xu2
Syed Muhammad Anwar1,4
Holger R. Roth2
Marius George Linguraru1,4
1 Sheikh Zayed Institute for Pediatric Surgical Innovation,
Children’s National Hospital, Washington DC, USA
2 Nvidia Corporation, Santa Clara, CA, USA
3 ETSI de Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain
4 School of Medicine and Heal
SOM-VQ: Topology-Aware Tokenization for Interactive Generative Models
Alessandro Londei 1 Denise Lanzieri 1 Matteo Benati 1 2
Abstract
Vector-quantized representations enable powerful discrete generative models but lack semantic structure in token space, limiting interpretable human control. We introduce SOM-VQ, a tokenization method that combines vector quantization with Self-Organizing Maps to learn discrete codebooks with explicit low-dimensional topology. Unlike standard VQ-VAE, SOM-
SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery
David Anugraha, Vishakh Padmakumar, Diyi Yang
Stanford University
{davidanu, vishakhp, diyiy}@stanford.edu
February 25, 2026
Abstract
Qualitative insights from user experiences are critical for informing product and policy decisions, but collecting such data at scale is constrained by the time and availability of experts to conduct semi-structured interviews. Recent work has explored using large language models (LLM
Cooperative-Competitive Team Play of Real-World Craft Robots
Rui Zhao1∗, Xihui Li1,2∗, Yizheng Zhang1∗, Yuzhen Liu1∗,
Zhong Zhang1, Yufeng Zhang1, Cheng Zhou1, Zhengyou Zhang1, Lei Han1
Abstract— Multi-agent deep Reinforcement Learning (RL) has made significant progress in developing intelligent game-playing agents in recent years. However, the efficient training of collective robots using multi-agent RL and the transfer of learned policies to real-world applications remain open research questi
As AI "agents" evolve from simple chatbots into autonomous coworkers that handle our emails, medical data, and software code, we are entering a dangerous era of Agent-Mediated Deception. This research reveals a startling "Expert’s Paradox": the more we trust these systems to handle complex tasks, the less likely we are to notice when a hidden attack has turned our trusted AI assistant into a digital double agent. By testing over 300 participants on a high-fidelity simulation platform called HAT-Lab, the authors found that a staggering 91% of users failed to detect stealthy attacks, often because their professional expertise created a "cognitive tunnel" that blinded them to security risks. To combat this, the study moves beyond simple disclaimers, showing that the best defense is "calibrated friction"—smart, interruptive warnings that break our autopilot and force us to regain a healthy, protective skepticism of the algorithms we rely on.
High-stakes reasoning in AI typically requires models to "think" out loud through long chains of thought, which makes them accurate but painfully slow and expensive to run. To solve this, researchers developed Prompt-Level Distillation (PLD), a clever shortcut that moves the complex logic of a giant "Teacher" model directly into the system instructions of a smaller, faster "Student" model. This approach allows compact models like Gemma-3 to perform complex legal and logical reasoning at super-human speeds without any expensive retraining or fine-tuning. By turning a black-box reasoning process into a set of transparent, human-readable instructions, PLD enables smaller AI to match the performance of industry leaders while remaining fast enough for real-time use in law, finance, and mobile devices.
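A minimal sketch of what prompt-level distillation could look like in practice, assuming the teacher's reasoning procedure has already been summarized as explicit rules. The rule text, function name, and prompt wording below are illustrative assumptions, not taken from the paper:

```python
# Sketch of Prompt-Level Distillation (PLD): the teacher's reasoning
# procedure is captured as explicit, human-readable rules and injected
# into the student's system prompt. The rules and helper names here are
# hypothetical, for illustration only.

TEACHER_RULES = [
    "Identify the governing rule or statute before applying it.",
    "List each element the rule requires, then check the facts against each element.",
    "State the conclusion only after every element has been checked.",
]

def build_pld_system_prompt(task: str, rules: list[str]) -> str:
    """Compose a student system prompt that encodes the teacher's procedure."""
    numbered = "\n".join(f"{i}. {r}" for i, r in enumerate(rules, 1))
    return (
        f"You are a fast assistant for {task}.\n"
        "Follow this distilled reasoning procedure exactly:\n"
        f"{numbered}\n"
        "Answer concisely; do not emit a long chain of thought."
    )

prompt = build_pld_system_prompt("legal reasoning", TEACHER_RULES)
print(prompt)
```

The point of the design is that the distilled logic stays transparent: the "weights" of the distillation are the readable rules themselves, so no retraining or fine-tuning of the student is involved.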
Ever wonder if you should keep renting skis or just buy them? This paper tackles the classic "ski rental" dilemma—making a decision today without knowing how long you’ll need it—by using a sophisticated weather-like forecast: a probability distribution instead of a single guess. The authors introduce a clever algorithm that uses these distributional predictions to minimize costs, proving that it remains highly efficient even if the prediction turns out to be wrong. Their main breakthrough is a strategy that doesn’t just perform brilliantly when the forecast is accurate, but also provides a guaranteed safety net if the forecast is a total disaster, all without needing to know the quality of the data beforehand.
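The threshold-picking idea can be sketched under standard ski-rental assumptions (renting costs 1 per day, buying costs B once): given a predicted distribution over the number of ski days, choose the rent-then-buy day that minimizes expected cost. The function names and example distribution are our own, not the paper's algorithm:

```python
# Toy distributional ski rental: rent through day k-1, buy on day k.
# Illustrative sketch only; not the paper's consistency/robustness scheme.

def expected_cost(k: int, B: int, dist: dict[int, float]) -> float:
    """Expected cost of threshold k under a distribution over ski days d:
    pay d if the trip ends before day k, else (k - 1) rentals plus B."""
    return sum(p * (d if d < k else (k - 1) + B) for d, p in dist.items())

def best_threshold(B: int, dist: dict[int, float]) -> int:
    """Threshold minimizing expected cost (k = 1 means buy immediately)."""
    horizon = max(dist) + 1
    return min(range(1, horizon + 1), key=lambda k: expected_cost(k, B, dist))

# Example: the forecast says a short trip is likely, so buying early wastes money.
dist = {2: 0.6, 20: 0.4}   # 60% chance of 2 days, 40% chance of 20 days
k = best_threshold(B=10, dist=dist)
print(k, expected_cost(k, 10, dist))
```

Note this sketch only optimizes the expected cost; the paper's contribution is precisely the extra safety net, a worst-case competitive guarantee that holds even when the forecast is wrong.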
This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Attention-Based SINR Estimation in User-Centric Non-Terrestrial Networks
Bruno De Filippo∗, Alessandro Guidotti∗†, Alessandro Vanelli-Coralli∗
∗Department of Electrical, Electronic, and Information Engineering (DEI), Univ. of Bologna, Bologna, Italy
†National Inter-University Consortium for Telecommunications (CNIT), Bologna, Italy
Standard decision trees often struggle with complex data because they can only split information along one variable at a time, like trying to cut a diamond using only horizontal and vertical strokes. This paper introduces an enhanced "Projection Pursuit" tree classifier that finds the best diagonal angles to separate data groups, offering much-needed flexibility for high-dimensional problems where classes are overlapping or unusually shaped. To prove these upgrades actually work, the researchers developed interactive visual tools and "tours" that allow users to see exactly how the algorithm carves through 2D and 3D space. By consistently outperforming traditional models on dozens of benchmark datasets, this new approach provides a more powerful and interpretable way to navigate the "blind spots" of modern machine learning.
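A toy version of an oblique split search, under our own simplifications: sample random unit directions and keep the projection that best separates two classes by a simple two-class separation index (not the paper's projection pursuit indices), then threshold the projected values.

```python
# Oblique ("projection pursuit" style) split sketch: illustrative only.
import numpy as np

def best_oblique_split(X, y, n_dirs=200, seed=0):
    """Search random unit directions; keep the projection that best
    separates the two classes, plus a midpoint threshold."""
    rng = np.random.default_rng(seed)
    best_score, best_dir, best_thresh = -np.inf, None, None
    for _ in range(n_dirs):
        a = rng.normal(size=X.shape[1])
        a /= np.linalg.norm(a)
        z = X @ a
        z0, z1 = z[y == 0], z[y == 1]
        # Between-class separation relative to within-class spread.
        score = abs(z0.mean() - z1.mean()) / (z0.std() + z1.std() + 1e-12)
        if score > best_score:
            best_score, best_dir = score, a
            best_thresh = (z0.mean() + z1.mean()) / 2.0
    return best_dir, best_thresh

# Two classes separated by a diagonal boundary: axis-aligned splits fail,
# an oblique split recovers it.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
a, thresh = best_oblique_split(X, y)
pred = (X @ a > thresh).astype(int)
acc = max((pred == y).mean(), (pred != y).mean())  # direction sign is arbitrary
print(round(acc, 2))
```

On this diagonal toy problem the best sampled direction lands near (1, 1)/√2 and classifies nearly perfectly, whereas any single-variable split caps out far lower.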
Improving Parametric Knowledge Access in Reasoning Language Models
Melody Ma and John Hewitt
Columbia University
{ym3065, jh5020}@columbia.edu
Abstract
We study reasoning for accessing world knowledge stored in a language model’s parameters. For example, recalling that Canberra is Australia’s capital may benefit from thinking through major cities and the concept of purpose-built capitals. While reasoning language models are trained via reinforcement learning to produce reasoning traces on
SumTablets: A Transliteration Dataset of Sumerian Tablets
Cole Simmons
Stanford University
coles@stanford.edu
Richard Diehl Martinez
University of Cambridge
rd654@cam.ac.uk
Dan Jurafsky
Stanford University
jurafsky@stanford.edu
Abstract
Sumerian transliteration is a conventional system for representing a scholar’s interpretation of a tablet in the Latin script. Thanks to visionary digital Assyriology projects such as ETCSL, CDLI, and Oracc, a large number of Sumerian transliterations have b
Recovered in Translation: Efficient Pipeline for Automated Translation of Benchmarks and Datasets
Hanna Yukhymenko1†, 2, Anton Alexandrov1, Martin Vechev1,2
1INSAIT, Sofia University "St. Kliment Ohridski", 2ETH Zurich
Correspondence: hanna.yukhymenko@insait.ai
§ Code: insait-institute/ritranslation
Benchmarks: insait-institute/multilingual-benchmarks
Abstract
The reliability of multilingual Large Language Model (LLM) evaluation is currently compromised by the inconsistent quality of translate
GUI-Libra: Training Native GUI Agents to Reason and Act with Action-aware Supervision and Partially Verifiable RL
Rui Yang1†, Qianhui Wu2∗, Zhaoyang Wang3†, Hanyang Chen1, Ke Yang1†, Hao Cheng2
Huaxiu Yao3, Baolin Peng2, Huan Zhang1, Jianfeng Gao2, Tong Zhang1
1UIUC,
2Microsoft,
3UNC-Chapel Hill
https://gui-libra.github.io
Abstract
Open-source native GUI agents have made rapid progress in visual grounding and low-level action execution, yet they still lag behind closed-source systems on long-h
Surrogate models for Rock–Fluid Interaction: A Grid-Size-Invariant Approach
Nathalie C. Pinheiroa,∗, Donghu Guoa, Hannah P. Menkeb, Aniket C. Joshia,c, Claire E. Heaneya,d,∗, Ahmed H. ElSheikhb, Christopher C. Paina,d,e
aApplied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College
London, London, SW7 2AZ UK
bInstitute of GeoEnergy Engineering, Heriot-Watt University, Edinburgh, EH14 1AS UK
cDepartment of Civil and Environmental Engineering, Imperial Coll
DYSCO: Dynamic Attention-Scaling Decoding for Long-Context LMs
Xi Ye * 1 Wuwei Zhang * 1 Fangcong Yin 2 Howard Yen 1 Danqi Chen 1
Abstract
Understanding and reasoning over long contexts is a crucial capability for language models (LMs). Although recent models support increasingly long context windows, their accuracy often deteriorates as input length grows. In practice, models often struggle to keep attention aligned with the most relevant context throughout decoding. In this work, we propose
As generative AI continues to grow, many creators have turned to "invisible shields"—imperceptible digital perturbations designed to protect images from being stolen, mimicked, or turned into deepfakes. However, this research reveals a startling vulnerability: common, off-the-shelf AI tools like ChatGPT (GPT-4o) and Stable Diffusion can be easily repurposed as "universal denoisers" to strip away these protections with a simple text prompt. By testing eight different case studies, the authors prove that these widely used generative models actually outperform specialized hacking tools at breaking defenses, often restoring the original image's quality while rendering the security measures useless. This study serves as a wake-up call for the cybersecurity community, demonstrating that current image protection schemes offer a false sense of security and must be reinvented to survive the power of modern AI.
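The attack pattern can be illustrated with a deliberately crude stand-in: a moving-average filter plays the role of the generative denoiser (GPT-4o, Stable Diffusion) and strips a high-frequency protective perturbation, with PSNR to the original measuring how much protection survives. Everything here is a toy assumption, not the paper's setup:

```python
# Toy "universal denoiser" attack: add a protective perturbation, denoise,
# and check that the result is closer to the original than the protected
# copy was, i.e. the "invisible shield" has been largely stripped.
import numpy as np

def psnr(a, b, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a - b) ** 2)
    return 10 * np.log10(peak ** 2 / mse)

# Smooth stand-in "image" and a high-frequency +/-0.05 "shield".
image = np.fromfunction(
    lambda i, j: 0.5 + 0.4 * np.sin(i / 10.0) * np.cos(j / 10.0), (64, 64)
)
rng = np.random.default_rng(0)
perturbation = 0.05 * rng.choice([-1.0, 1.0], size=image.shape)
protected = image + perturbation

# "Denoise" with a 3x3 moving average, a crude low-pass filter.
padded = np.pad(protected, 1, mode="edge")
denoised = sum(
    padded[i:i + 64, j:j + 64] for i in range(3) for j in range(3)
) / 9.0

# Higher PSNR after denoising => the protection was removed, not the image.
print(round(psnr(image, protected), 1), round(psnr(image, denoised), 1))
```

The averaging filter suppresses the zero-mean high-frequency shield far more than it blurs the smooth image, which is the same asymmetry the paper exploits at much greater strength with generative models.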
This paper presents a compelling and concerning finding: the very generative models artists and creators fear are also potent tools for dismantling the defenses they employ. This "convergent threat" is a fertile starting point for future research.
Here are potential research directions and areas for future work.
These are logical next steps that build directly on the paper's methodology and findings.
Expanding the Scope of Attacks:
Characterizing the Attack Surface:
These are more ambitious projects aimed at creating the next generation of defenses that are resilient to the attack vector identified in the paper. The core challenge is to design perturbations that the denoiser either preserves as signal or cannot remove without destroying the image.
Semantic and Style-Space Perturbations:
The paper's attack works because it treats perturbations as high-frequency noise. The next frontier is to design perturbations that are not noise, but meaningful semantic information.
Adversarial Attacks Against the Denoiser:
The paper's attack destroys the defender's utility. A novel defense could aim to destroy the attacker's utility.
Perturbations as Denoising Fixed Points:
The paper showed that simple adversarial training to make a perturbation "denoiser-aware" failed. This points to a more fundamental optimization problem.
One could seek perturbations P that are approximate fixed points of the denoising operator D. The goal would be to solve for P such that D(Image + P) ≈ Image + P. In other words, the denoiser sees the protected image as already "clean" and makes minimal changes, thus preserving the protection. This is a very challenging but potentially very robust defense direction.
Robust Low-Frequency Watermarking:
The paper highlights VINE's low-frequency approach as "promising" but its implementation as "flawed" (vulnerable to cropping due to edge artifacts).
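The fixed-point idea, D(Image + P) ≈ Image + P, can be illustrated with a toy linear "denoiser": projected gradient descent drives a fixed-norm perturbation toward the denoiser's near-fixed-point (low-frequency) subspace. The 1-D setup and the optimizer are our own illustrative choices, not a proposed defense:

```python
# Toy fixed-point perturbation search against a linear smoothing "denoiser".
import numpy as np

n = 128
x = np.sin(np.linspace(0, 2 * np.pi, n, endpoint=False))  # smooth "image"

def D(v):
    """3-tap circular moving average standing in for a denoiser."""
    return (np.roll(v, -1) + v + np.roll(v, 1)) / 3.0

def residual(P):
    """How much the denoiser changes the protected signal x + P."""
    s = x + P
    return np.linalg.norm(D(s) - s)

rng = np.random.default_rng(0)
P = rng.normal(size=n)
P *= 0.1 / np.linalg.norm(P)          # fixed perturbation budget
start = residual(P)

for _ in range(500):
    s = x + P
    r = D(s) - s
    grad = D(r) - r                   # gradient of 0.5*||r||^2 (D symmetric)
    P -= 0.5 * grad
    P *= 0.1 / np.linalg.norm(P)      # project back onto the budget

print(round(start, 4), round(residual(P), 4))
```

Minimizing the residual pushes P into the smoothing operator's near-null directions, i.e. low frequencies, which echoes why the research direction above singles out low-frequency schemes as the more denoiser-resistant family.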
These are gaps or critical questions the paper exposes.
The "Why" of the Black Box: The strongest attacker, GPT-4o, is a closed-source model. It's unclear why its architecture or training makes it so effective. Is it the autoregressive nature, the sheer scale of its training, its multi-modal pre-training, or something else? Research is needed in interpretability for security, aiming to probe and understand the specific mechanisms in foundation models that make them effective at "denoising" to build better defenses.
Forensics of Generative Laundering: The attack can be seen as "laundering" a protected image to remove its safeguards. An unexplored problem is detecting this laundering process. Do images processed by these denoisers have a unique, detectable "fingerprint"? Research could focus on building a classifier that can distinguish between an original clean image, a protected image, and a "laundered" image that has passed through an img2img denoiser. This would be a crucial forensic tool.
The Utility-Security Frontier Under Generative Attacks: The paper effectively invalidates previous assumptions about the trade-off between protection strength and image quality. The unexplored problem is to formally map the new Pareto frontier. For a given level of robustness against a state-of-the-art img2img attacker (e.g., FLUX or GPT-4o), what is the maximum achievable image utility (PSNR, SSIM, BRISQUE)? This creates a new, much harder benchmark for all future protection schemes.
The paper's findings, while presented in a security context, have broader implications.
Positive Applications of "Universal Denoising": The attack itself is a highly effective, blind image restoration technique.
A New Benchmark for Foundation Models: The paper's method can be repurposed as an evaluation metric.
"Immune System" for AI Ecosystems:
LiCQA: A Lightweight Complex Question Answering System
Sourav Saha
Indian Statistical Institute
Kolkata, India
sourav.saha_r@isical.ac.in
Dwaipayan Roy
Indian Institute of Science Education
and Research
Kolkata, India
dwaipayan.roy@iiserkol.ac.in
Mandar Mitra
Indian Statistical Institute
Kolkata, India
mandar@isical.ac.in
Abstract
Over the last twenty years, significant progress has been made in designing and implementing Question Answering (QA) systems. However, addressing complex questions, t
Based on the contributions, methodology, and limitations of "LiCQA: A Lightweight Complex Question Answering System", here are several potential research directions and areas for future work, focusing on actionable and innovative ideas.
These are ideas that build directly on the LiCQA pipeline, improving its individual components or refining its core logic.
The paper found that max-score aggregation (using only the single best-matching sentence) worked best. This suggests that for many complex questions, a single, highly relevant sentence is sufficient. An extension would be to develop an adaptive aggregation strategy. The system could first check the max-score. If it's above a certain confidence threshold, it's used. If not, the system could fall back to a more sophisticated aggregation model (like avg-maxscore or a weighted average) that synthesizes evidence from weaker, distributed signals. This would combine the precision of max-score with the recall of other methods.
Candidate answers are ranked with a combined score (comb-score*). This is an unsupervised heuristic. A direct extension is to replace this with a lightweight, learnable ranking model (e.g., a simple linear model, or LambdaMART). One could create a small, domain-specific dataset of (question, candidate answer, relevance) tuples to train this model, turning LiCQA into a "weakly supervised" system that learns how to best combine different evidence features (e.g., df, max-score, average score, entity prominence) without needing a large, end-to-end training corpus.
These are more transformative ideas that take LiCQA's core philosophy—lightweight, corpus-based, unsupervised—and apply it to new problems or architectures.
+"Brad Pitt" +"Troy", +"Brad Pitt" +"Seven").
This work, by succeeding in some areas, implicitly shines a light on problems that remain unsolved.
The max-score model works when a single sentence contains most of the required context. What if the evidence is spread across a paragraph? E.g., "The film starred Actor X. ... It was directed by Director Y. ... The movie went on to win an Oscar for best picture." Answering "Which Oscar-winning film starred Actor X and was directed by Director Y?" is impossible for LiCQA if no single sentence contains all three elements. The unexplored problem is entity-centric context aggregation, where a system builds a "profile" for an entity by merging information from multiple sentences within a document before scoring, using techniques like co-reference resolution.
The "lightweight, fast, and unsupervised" nature of LiCQA makes it uniquely suited for specific domains where other methods fail.
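The adaptive aggregation strategy suggested above can be sketched in a few lines. The threshold value, the top-k-average fallback, and all names are hypothetical illustrations, not LiCQA's implementation:

```python
# Adaptive aggregation sketch: trust the single best-matching sentence when
# its score clears a confidence threshold, otherwise fall back to averaging
# the top-k sentence scores to synthesize weaker, distributed evidence.

def adaptive_score(sentence_scores: list[float], tau: float = 0.8, k: int = 3) -> float:
    best = max(sentence_scores)
    if best >= tau:                    # high-confidence single sentence
        return best
    top_k = sorted(sentence_scores, reverse=True)[:k]
    return sum(top_k) / len(top_k)     # distributed-evidence fallback

print(adaptive_score([0.9, 0.2, 0.1]))        # confident: max-score wins
print(adaptive_score([0.5, 0.45, 0.4, 0.1]))  # weak best: average top 3
```

This keeps the precision of max-score on easy questions while recovering some recall when evidence is spread across sentences.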
Learning and Naming Subgroups with Exceptional Survival Characteristics
Mhd Jawad Al Rahwanji 1 Sascha Xu 1 Nils Philipp Walter 1 Jilles Vreeken 1
Abstract
In many applications, it is important to identify subpopulations that survive longer or shorter than the rest of the population. In medicine, for example, it allows determining which patients benefit from treatment, and in predictive maintenance, which components are more likely to fail. Existing methods for discovering subgroups with exc
DYNAMIC PERSONALITY ADAPTATION IN LARGE LANGUAGE MODELS VIA STATE MACHINES
PREPRINT
Leon Pielage1,2,
Ole Hätscher3,
Prof. Dr. Mitja Back3,
Prof. Dr. med. Bernhard Marschall4, and
Prof. Dr. Benjamin Risse*1,2
1Institute for Geoinformatics, University of Münster, 48149 Münster, Germany
2Faculty of Mathematics and Computer Science, University of Münster, 48149 Münster, Germany
3Department of Psychology, University of Münster, 48149 Münster, Germany
4Institute of Medical Education and Student Affai