From arxiv.org
Fun with flags: How Compilers Break and Fix Constant-Time Code
Developers rely on constant-time programming to prevent timing side-channel attacks. But these efforts can be undone by compilers, whose optimizations may silently reintroduce leaks. While recent works have measured the extent of such leakage, they leave developers without actionable insights:...
8h ago
From arxiv.org
PDFMathTranslate: Scientific Document Translation Preserving Layouts
Language barriers in scientific documents hinder the diffusion and development of science and technologies. However, prior efforts in translating such documents largely overlooked the information in layouts. To bridge the gap, we introduce PDFMathTranslate, the world's first open-source software...
14h ago
From arxiv.org
RAG-R1: Incentivize the Search and Reasoning Capabilities of LLMs through Multi-query Parallelism
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, while they remain prone to generating hallucinated or outdated responses due to their static internal knowledge. Recent advancements in Retrieval-Augmented Generation (RAG) methods have explored...
14h ago
From arxiv.org
Deep neural networks have an inbuilt Occam's razor
The remarkable performance of overparameterized deep neural networks (DNNs) must arise from an interplay between network architecture, training algorithms, and structure in the data. To disentangle these three components, we apply a Bayesian picture, based on the functions expressed by a DNN, to...
14h ago
From arxiv.org
Entropy stable conservative flux form neural networks
We propose an entropy-stable conservative flux form neural network (CFN) that integrates classical numerical conservation laws into a data-driven framework using the entropy-stable, second-order, and non-oscillatory Kurganov-Tadmor (KT) scheme. The proposed entropy-stable CFN uses slope limiting...
15h ago
From arxiv.org
Recent advances in song identification leverage deep neural networks to learn compact audio fingerprints directly from raw waveforms. While these methods perform well under controlled conditions, their accuracy drops significantly in real-world scenarios where the audio is captured via mobile...
16h ago
From arxiv.org
SARA: Selective and Adaptive Retrieval-augmented Generation with Context Compression
Retrieval-augmented Generation (RAG) extends large language models (LLMs) with external knowledge but faces key challenges: restricted effective context length and redundancy in retrieved documents. Pure compression-based approaches reduce input size but often discard fine-grained details...
16h ago
From arxiv.org
A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models
The widespread deployment of large language models (LLMs) across critical domains has amplified the societal risks posed by algorithmically generated misinformation. Unlike traditional false content, LLM-generated misinformation can be self-reinforcing, highly plausible, and capable of rapid...
16h ago
From arxiv.org
Tile-Based ViT Inference with Visual-Cluster Priors for Zero-Shot Multi-Species Plant Identification
We describe DS@GT's second-place solution to the PlantCLEF 2025 challenge on multi-species plant identification in vegetation quadrat images. Our pipeline combines (i) a fine-tuned Vision Transformer ViTD2PC24All for patch-level inference, (ii) a 4x4 tiling strategy that aligns patch size with...
16h ago
From arxiv.org
PLACE: Prompt Learning for Attributed Community Search
In this paper, we propose PLACE (Prompt Learning for Attributed Community Search), an innovative graph prompt learning framework for ACS. Inspired by prompt-tuning in Natural Language Processing (NLP), where learnable prompt tokens are inserted to contextualize NLP queries, PLACE integrates...
16h ago
From arxiv.org
Beyond Retrieval: Ensembling Cross-Encoders and GPT Rerankers with LLMs for Biomedical QA
Biomedical semantic question answering rooted in information retrieval can play a crucial role in keeping up to date with vast, rapidly evolving and ever-growing biomedical literature. A robust system can help researchers, healthcare professionals and even lay users access relevant knowledge...
16h ago
From arxiv.org
Self-Attentive Sequential Recommendation (SASRec) effectively captures long-term user preferences by applying attention mechanisms to historical interactions. Concurrently, the rise of Large Language Models (LLMs) has motivated research into LLM-based recommendation, which leverages their...
16h ago
From arxiv.org
Exploring LLM Capabilities in Extracting DCAT-Compatible Metadata for Data Cataloging
Efficient data exploration is crucial as data becomes increasingly important for accelerating processes, improving forecasts and developing new business models. Data consumers often spend 25-98% of their time searching for suitable data due to the exponential growth, heterogeneity and...
16h ago
From arxiv.org
Student dropout in distance learning remains a critical challenge, with profound societal and economic consequences. While classical machine learning models leverage structured socio-demographic and behavioral data, they often fail to capture the nuanced emotional and contextual factors embedded...
16h ago
From arxiv.org
Enhancing Learning Path Recommendation via Multi-task Learning
Personalized learning is a student-centered educational approach that adapts content, pace, and assessment to meet each learner's unique needs. As the key technique for implementing personalized learning, learning path recommendation sequentially recommends personalized learning items such as...
17h ago
From arxiv.org
Enhancing the Interpretability of Rule-based Explanations through Information Retrieval
The lack of transparency of data-driven Artificial Intelligence techniques limits their interpretability and acceptance into healthcare decision-making processes. We propose an attribution-based approach to improve the interpretability of Explainable AI-based predictions in the specific context...
18h ago
From arxiv.org
On the Costs and Benefits of Learned Indexing for Dynamic High-Dimensional Data: Extended Version
One of the main challenges within the growing research area of learned indexing is the lack of adaptability to dynamically expanding datasets. This paper explores the dynamization of a static learned index for complex data through operations such as node splitting and broadening, enabling...
18h ago
From arxiv.org
Context. Empathy, a key social skill, is essential for communication and collaboration in software engineering (SE) but remains an under-researched topic. Aims. This study investigates empathy in SE from practitioners' perspectives, aiming to characterize its meaning, identify barriers, discuss practices to overcome...
19h ago
From arxiv.org
A search for pseudoscalar or scalar bosons decaying to a top quark pair ($\mathrm{t\bar{t}}$) in final states with one or two charged leptons is presented. The analyzed proton-proton collision data was recorded at $\sqrt{s}$ = 13 TeV by the CMS experiment at the CERN LHC and corresponds to an...
22h ago
From arxiv.org
StreamDiT: Real-Time Streaming Text-to-Video Generation
Recently, great progress has been achieved in text-to-video (T2V) generation by scaling transformer-based diffusion models to billions of parameters, which can generate high-quality videos. However, existing models typically produce only short clips offline, restricting their use cases in...
on Tue, 8PM
From arxiv.org
EEG signals capture brain activity with high temporal and low spatial resolution, supporting applications such as neurological diagnosis, cognitive monitoring, and brain-computer interfaces. However, effective analysis is hindered by limited labeled data, high dimensionality, and the absence of...
on Tue, 7PM
From arxiv.org
T cell receptor (TCR) repertoires encode critical immunological signatures for autoimmune diseases, yet their clinical application remains limited by sequence sparsity and low witness rates. We developed EAMil, a multi-instance deep learning framework that leverages TCR sequencing data to...
on Tue, 3PM
From arxiv.org
LoSiA: Efficient High-Rank Fine-Tuning via Subnet Localization and Optimization
Parameter-Efficient Fine-Tuning (PEFT) methods, such as LoRA, significantly reduce the number of trainable parameters by introducing low-rank decomposition matrices. However, existing methods perform extensive matrix multiplications in domain specialization tasks, resulting in computational...
on Tue, 2PM
From arxiv.org
As Artificial Intelligence systems evolve from monolithic models to ecosystems of specialized agents, the need for standardized communication protocols becomes increasingly critical. This paper introduces MOD-X (Modular Open Decentralized eXchange), a novel architectural framework proposal for...
on Tue, 2PM
From arxiv.org
Activation Steering for Chain-of-Thought Compression
Large language models (LLMs) excel at complex reasoning when they include intermediate steps, known as "chains of thought" (CoTs). However, these rationales are often overly verbose, even for simple problems, leading to wasted context, increased latency, and higher energy consumption. We observe...
on Tue, 2PM
From arxiv.org
Neural-Network solver of ideal MHD equilibria
We present a novel approach to compute three-dimensional Magnetohydrodynamic equilibria by parametrizing Fourier modes with artificial neural networks and compare it to equilibria computed by conventional solvers. The full nonlinear global force residual across the volume in real space is then...
on Tue, 1PM
From arxiv.org
Efficient Detection of Intermittent Job Failures Using Few-Shot Learning
One of the main challenges developers face in the use of continuous integration (CI) and deployment pipelines is the occurrence of intermittent job failures, which result from unexpected non-deterministic issues (e.g., flaky tests or infrastructure problems) rather than regular code-related...
on Tue, 11AM
From arxiv.org
Empirical Analysis Of Heuristic and Approximation Algorithms for the Mutual-Visibility Problem
The NP-complete mutual-visibility (MV) problem currently lacks empirical analysis on its practical behaviour despite theoretical studies. This paper addresses this gap by implementing and evaluating three distinct algorithms - a direct greedy heuristic, a hypergraph-based approximation, and a...
on Thu, 8AM
From arxiv.org
Calibrating Graph Neural Networks with Wavelet-Aware Temperature Scaling
Graph Neural Networks (GNNs) have demonstrated strong predictive performance on relational data; however, their confidence estimates often misalign with actual predictive correctness, posing significant limitations for deployment in safety-critical settings. While existing graph-aware...
on Jul 1
From arxiv.org
Computational Complexity of Model-Checking Quantum Pushdown Systems
In this paper, we study the problem of model-checking quantum pushdown systems from a computational complexity point of view. We arrive at the following equally important, interesting new results: We first extend the notions of probabilistic pushdown systems and Markov chains...
on Jun 24
From arxiv.org
Instruction Following by Boosting Attention of Large Language Models
Controlling the generation of large language models (LLMs) remains a central challenge to ensure their safe and reliable deployment. While prompt engineering and finetuning are common approaches, recent work has explored latent steering, a lightweight technique that alters LLM internal...
on Jun 17
From arxiv.org
We provide a distillation scaling law that estimates distilled model performance based on a compute budget and its allocation between the student and teacher. Our findings reduce the risks associated with using distillation at scale; compute allocation for both the teacher and student models can...
on Feb 13
From arxiv.org
Salient aspects of the commissioning, calibration, and performance of the CMS silicon strip tracker are discussed, drawing on experience during operation with proton-proton collisions delivered by the CERN LHC. The data were obtained with a variety of luminosities. The operating temperature of...
1h ago
From arxiv.org
Spectroscopy of Free-Floating Planetary-Mass Objects and their disks with JWST
Free-floating planetary-mass objects (FFPMOs) are known to harbor disks at young ages. Here, we present 1-13 $\mu$m spectra for eight young FFPMOs with masses of 5-10 M$_\mathrm{Jup}$ (at ages of 1-5 Myr), using the NIRSpec and MIRI instruments on the James Webb Space Telescope. We derive...
4h ago
From arxiv.org
Is Earendel a Star Cluster?: Metal Poor Globular Cluster Progenitors at $z\sim6$
The strongly-lensed $z\sim 6$ Sunrise galaxy offers an incredible opportunity to investigate star formation in the early universe on parsec or smaller scales. The highly magnified object Earendel within the Sunrise was previously identified as a candidate star or binary due to size constraints...
4h ago
From arxiv.org
Introduction to the China Space Station Telescope (CSST)
The China Space Station Telescope (CSST) is a next-generation Stage-IV sky survey telescope, distinguished by its large field of view (FoV), high image quality, and multi-band observation capabilities. It can simultaneously conduct precise measurements of the Universe by performing multi-color...
4h ago
From arxiv.org
JWST Spectra of Brown Dwarf Candidates in the Orion Nebula Cluster
I present an analysis of archival spectra of 200 sources toward the Orion Nebula Cluster (ONC) that were obtained with the Near-Infrared Spectrograph (NIRSpec) on board the James Webb Space Telescope (JWST). I have used these data to assess cluster membership and measure spectral types for the...
4h ago
From arxiv.org
The interstellar object 3I/ATLAS shows weak cometary activity. Its brightness suggests a maximum radius of ~10 km (A/0.05)^{-1/2} for an asteroid with an albedo A. I show that interstellar objects with that radius would amount to an interstellar mass density that is well above the expected mass...
8h ago
From arxiv.org
News Source Citing Patterns in AI Search Systems
AI-powered search systems are emerging as new information gatekeepers, fundamentally transforming how users access news and information. Despite their growing influence, the citation patterns of these systems remain poorly understood. We address this gap by analyzing data from the AI Search...
10h ago
From arxiv.org
Pre-trained language models (PLMs) are widely used to derive semantic representations from item metadata in recommendation and search. In sequential recommendation, PLMs enhance ID-based embeddings through textual metadata, while in product search, they align item characteristics with user...
10h ago
From arxiv.org
MemOS: A Memory OS for AI System
Large Language Models (LLMs) have become an essential infrastructure for Artificial General Intelligence (AGI), yet their lack of well-defined memory management systems hinders the development of long-context reasoning, continual personalization, and knowledge consistency. Existing models mainly...
10h ago
From arxiv.org
Hierarchical Interaction Summarization and Contrastive Prompting for Explainable Recommendations
Explainable recommendations, which use information about the user, the item, and their interaction to generate an explanation for why the user would interact with the item, are crucial for improving user trust and decision transparency in the recommender system. Existing methods primarily rely on encoding...
10h ago
From arxiv.org
Vector retrieval systems exhibit significant performance variance across queries due to heterogeneous embedding quality. We propose a lightweight framework for predicting retrieval performance at the query level by combining quantization robustness and neighborhood density metrics. Our approach...
10h ago
From arxiv.org
KERAG_R: Knowledge-Enhanced Retrieval-Augmented Generation for Recommendation
Large Language Models (LLMs) have shown strong potential in recommender systems due to their contextual learning and generalisation capabilities. Existing LLM-based recommendation approaches typically formulate the recommendation task using specialised prompts designed to leverage their...
10h ago
From arxiv.org
Unconditional Diffusion for Generative Sequential Recommendation
Diffusion models, known for their generative ability to simulate data creation through noise-adding and denoising processes, have emerged as a promising approach for building generative recommenders. To incorporate user history for personalization, existing methods typically adopt a conditional...
10h ago
From arxiv.org
Information Needs and Practices Supported by ChatGPT
This study considers ChatGPT as an information source, investigating the information needs that people come to ChatGPT with and the information practices that ChatGPT supports, through a qualitative content analysis of 205 user vignettes. The findings show that ChatGPT is used in a range of life...
10h ago
From arxiv.org
Most existing multimodal collaborative filtering recommendation (MCFRec) methods rely heavily on ID features and multimodal content to enhance recommendation performance. However, this paper reveals that ID features are effective but have limited benefits in multimodal collaborative filtering...
10h ago
From arxiv.org
The rapid transformation of the labor market, driven by technological advancements and the digital economy, requires continuous competence development and constant adaptation. In this context, traditional competence management systems lack interoperability, adaptability, and semantic...
10h ago
From arxiv.org
RecRankerEval: A Flexible and Extensible Framework for Top-k LLM-based Recommendation
A recent large language model (LLM)-based recommendation model, called RecRanker, has demonstrated superior performance in the top-k recommendation task compared to other models. In particular, RecRanker samples users via clustering, generates an initial ranking list using an initial...
10h ago
From arxiv.org
SIGIR 2025 -- LiveRAG Challenge Report
The LiveRAG Challenge at SIGIR 2025, held between March and May 2025, provided a competitive platform for advancing Retrieval-Augmented Generation (RAG) technologies. Participants from academia and industry were invited to develop a RAG-based question-answering system using a fixed corpus...
10h ago