Databases

Authors and titles for July 2025

Total of 134 entries; showing entries 51-100
[51] arXiv:2507.12668 [pdf, html, other]
Title: Targeted Mining of Time-Interval Related Patterns
Shuang Liang, Lili Chen, Wensheng Gan, Philip S. Yu, Shengjie Zhao
Comments: Preprint. 8 figures, 4 tables
Subjects: Databases (cs.DB)
[52] arXiv:2507.13710 [pdf, html, other]
Title: SoftPipe: A Soft-Guided Reinforcement Learning Framework for Automated Data Preparation
Jing Chang, Chang Liu, Jinbin Huang, Shuyuan Zheng, Rui Mao, Jianbin Qin
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[53] arXiv:2507.13712 [pdf, html, other]
Title: LLaPipe: LLM-Guided Reinforcement Learning for Automated Data Preparation Pipeline Construction
Jing Chang, Chang Liu, Jinbin Huang, Rui Mao, Jianbin Qin
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[54] arXiv:2507.13757 [pdf, html, other]
Title: Efficient and Scalable Self-Healing Databases Using Meta-Learning and Dependency-Driven Recovery
Joydeep Chandra, Prabal Manhas
Subjects: Databases (cs.DB)
[55] arXiv:2507.13892 [pdf, html, other]
Title: Towards Next Generation Data Engineering Pipelines
Kevin M. Kramer, Valerie Restat, Sebastian Strasser, Uta Störl, Meike Klettke
Subjects: Databases (cs.DB)
[56] arXiv:2507.14101 [pdf, other]
Title: Project-connex Decompositions and Tractability of Aggregate Group-by Conjunctive Queries
Diego Figueira, Cibele Freire
Comments: 34 pages, 5 figures
Subjects: Databases (cs.DB)
[57] arXiv:2507.14376 [pdf, html, other]
Title: Schemora: schema matching via multi-stage recommendation and metadata enrichment using off-the-shelf llms
Osman Erman Gungor, Derak Paulsen, William Kang
Comments: 11 pages
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[58] arXiv:2507.14475 [pdf, html, other]
Title: Towards Temporal Knowledge Graph Alignment in the Wild
Runhao Zhao, Weixin Zeng, Wentao Zhang, Xiang Zhao, Jiuyang Tang, Lei Chen
Comments: 18 pages, 6 figures
Subjects: Databases (cs.DB)
[59] arXiv:2507.14495 [pdf, html, other]
Title: Opening The Black-Box: Explaining Learned Cost Models For Databases
Roman Heinrich, Oleksandr Havrylov, Manisha Luthra, Johannes Wehrstein, Carsten Binnig
Comments: Accepted to VLDB 2025 Demonstration Track
Subjects: Databases (cs.DB)
[60] arXiv:2507.14682 [pdf, html, other]
Title: IDSS, a Novel P2P Relational Data Storage Service
Massimo Cafaro, Italo Epicoco, Marco Pulimeno, Lunodzo J. Mwinuka, Lucas Pereira, Hugo Morais
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[61] arXiv:2507.14813 [pdf, html, other]
Title: Mayura: Exploiting Similarities in Motifs for Temporal Co-Mining
Sanjay Sri Vallabh Singapuram, Ronald Dreslinski, Nishil Talati
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[62] arXiv:2507.17215 [pdf, html, other]
Title: Triadic First-Order Logic Queries in Temporal Networks
Omkar Bhalerao, Yunjie Pan, C. Seshadhri, Nishil Talati
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
[63] arXiv:2507.17507 [pdf, html, other]
Title: Unfolding Data Quality Dimensions in Practice: A Survey
Vasileios Papastergios, Lisa Ehrlinger, Anastasios Gounaris
Subjects: Databases (cs.DB)
[64] arXiv:2507.17647 [pdf, html, other]
Title: SHINE: A Scalable HNSW Index in Disaggregated Memory
Manuel Widmoser, Daniel Kocher, Nikolaus Augsten
Subjects: Databases (cs.DB)
[65] arXiv:2507.17778 [pdf, other]
Title: An advanced AI driven database system
M. Tedeschi, S. Rizwan, C. Shringi, V. Devram Chandgir, S. Belich
Comments: 10 pages, 5 figures, appears in EDULEARN25 Conference Proceedings
Journal-ref: EDULEARN25 Proceedings, pp. 5130-5139, 2025
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
[66] arXiv:2507.18891 [pdf, html, other]
Title: ApproxJoin: Approximate Matching for Efficient Verification in Fuzzy Set Similarity Join
Michael Mandulak, S M Ferdous, Sayan Ghosh, Mahantesh Halappanavar, George Slota
Subjects: Databases (cs.DB)
[67] arXiv:2507.19154 [pdf, html, other]
Title: Big Data Energy Systems: A Survey of Practices and Associated Challenges
Lunodzo J. Mwinuka, Massimo Cafaro, Lucas Pereira, Hugo Morais
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[68] arXiv:2507.19254 [pdf, html, other]
Title: DBMS-LLM Integration Strategies in Industrial and Business Applications: Current Status and Future Challenges
Zhengtong Yan, Gongsheng Yuan, Qingsong Guo, Jiaheng Lu
Subjects: Databases (cs.DB)
[69] arXiv:2507.19329 [pdf, html, other]
Title: Properties for Paths in Graph Databases
Fernando Orejas, Elvira Pino, Renzo Angles, Edelmira Pasarella, Nikos Milonakis
Subjects: Databases (cs.DB); Logic in Computer Science (cs.LO)
[70] arXiv:2507.19802 [pdf, html, other]
Title: CleANN: Efficient Full Dynamism in Graph-based Approximate Nearest Neighbor Search
Ziyu Zhang, Yuanhao Wei, Joshua Engels, Julian Shun
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR)
[71] arXiv:2507.20441 [pdf, html, other]
Title: TIMEST: Temporal Information Motif Estimator Using Sampling Trees
Yunjie Pan, Omkar Bhalerao, C. Seshadhri, Nishil Talati
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR)
[72] arXiv:2507.20671 [pdf, other]
Title: A Functional Data Model and Query Language is All You Need
Jens Dittrich
Subjects: Databases (cs.DB)
[73] arXiv:2507.20815 [pdf, other]
Title: MVIAnalyzer: A Holistic Approach to Analyze Missing Value Imputation
Valerie Restat, Kai Tejkl, Uta Störl
Subjects: Databases (cs.DB)
[74] arXiv:2507.20839 [pdf, other]
Title: Data Cleaning of Data Streams
Valerie Restat, Niklas Rodenhausen, Carina Antonin, Uta Störl
Subjects: Databases (cs.DB)
[75] arXiv:2507.21056 [pdf, html, other]
Title: AI-Driven Generation of Data Contracts in Modern Data Engineering Systems
Harshraj Bhoite
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[76] arXiv:2507.21173 [pdf, other]
Title: Digitalizing Uncertain Information
Chris Partridge, Andrew Mitchell, Andreas Cola
Comments: 9 pages. 2 figures. Conference: Semantic Technology for Intelligence, Defense, and Security (STIDS 2024)
Subjects: Databases (cs.DB)
[77] arXiv:2507.21860 [pdf, other]
Title: Ranking Methods for Skyline Queries
Mickaël Martin-Nevot (AMU), Lotfi Lakhal (AMU)
Subjects: Databases (cs.DB)
[78] arXiv:2507.21989 [pdf, html, other]
Title: Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors
Patrick Iff, Paul Bruegger, Marcin Chrapek, Maciej Besta, Torsten Hoefler
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR)
[79] arXiv:2507.22143 [pdf, html, other]
Title: Compact Answers to Temporal Path Queries
Muhammad Adnan, Diego Calvanese, Julien Corman, Anton Dignös, Werner Nutt, Ognjen Savković
Comments: Extended version of a paper accepted at the ISWC 2025 conference
Subjects: Databases (cs.DB)
[80] arXiv:2507.22305 [pdf, html, other]
Title: Is SHACL Suitable for Data Quality Assessment?
Carolina Cortés, Lisa Ehrlinger, Lorena Etcheverry, Felix Naumann
Comments: 43 pages
Subjects: Databases (cs.DB)
[81] arXiv:2507.22384 [pdf, other]
Title: Scalability, Availability, Reproducibility and Extensibility in Islamic Database Systems
Umar Siddiqui, Habiba Youssef, Adel Sabour, Mohamed Ali
Journal-ref: International Journal on Islamic Applications in Computer Science and Technology, Vol. 9, Issue 3, September 2021, 14-20
Subjects: Databases (cs.DB); Software Engineering (cs.SE)
[82] arXiv:2507.22419 [pdf, html, other]
Title: Systematic Evaluation of Knowledge Graph Repair with Large Language Models
Tung-Wei Lin, Gabe Fierro, Han Li, Tianzhen Hong, Pierluigi Nuzzo, Alberto Sangiovanni-Vincentelli
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[83] arXiv:2507.22701 [pdf, html, other]
Title: SAM: A Stability-Aware Cache Manager for Multi-Tenant Embedded Databases
Haoran Zhang, Decheng Zuo, Yu Yan, Zhiyu Liang, Hongzhi Wang
Comments: 17 pages, 10 figures. An extended version of a paper under review at the VLDB 2026 conference
Subjects: Databases (cs.DB)
[84] arXiv:2507.23084 [pdf, html, other]
Title: AutoIndexer: A Reinforcement Learning-Enhanced Index Advisor Towards Scaling Workloads
Taiyi Wang, Eiko Yoneki
Comments: 14 pages
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[85] arXiv:2507.23499 [pdf, html, other]
Title: Jelly-Patch: a Fast Format for Recording Changes in RDF Datasets
Piotr Sowinski, Kacper Grzymkowski, Anastasiya Danilenka
Comments: Accepted at the International Semantic Web Conference 2025 Posters and Demos, November 2-6, 2025, Nara, Japan
Subjects: Databases (cs.DB)
[86] arXiv:2507.23515 [pdf, other]
Title: DataLens: Enhancing Dataset Discovery via Network Topologies
Anaïs Ollagnier (CRISAM, CNRS, MARIANNE), Aline Menin (WIMMICS, Laboratoire I3S - SPARKS)
Subjects: Databases (cs.DB)
[87] arXiv:2507.00277 (cross-list from cs.DS) [pdf, html, other]
Title: Lazy B-Trees
Casper Moldrup Rysgaard, Sebastian Wild
Comments: MFCS 2025
Subjects: Data Structures and Algorithms (cs.DS); Databases (cs.DB)
[88] arXiv:2507.00938 (cross-list from cs.IR) [pdf, html, other]
Title: WebArXiv: Evaluating Multimodal Agents on Time-Invariant arXiv Tasks
Zihao Sun, Ling Chen
Comments: 10 pages, 9 figures, 4 tables
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Databases (cs.DB)
[89] arXiv:2507.01053 (cross-list from cs.IR) [pdf, html, other]
Title: Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis
Rafi Al Attrach, Pedro Moreira, Rajna Fani, Renato Umeton, Leo Anthony Celi
Comments: 10 pages, 4 figures
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Databases (cs.DB)
[90] arXiv:2507.01267 (cross-list from cs.GT) [pdf, html, other]
Title: Counterfactual Explanation of Shapley Value in Data Coalitions
Michelle Si, Jian Pei
Subjects: Computer Science and Game Theory (cs.GT); Databases (cs.DB)
[91] arXiv:2507.01520 (cross-list from cs.DL) [pdf, other]
Title: A bibliometric analysis on the current situation and hot trends of the impact of microplastics on soil based on CiteSpace
Yiran Zheng, Yue Quan, Su Yan, Xinting Lv, Yuguanmin Cao, Minjie Fu, Mingji Jin
Subjects: Digital Libraries (cs.DL); Databases (cs.DB)
[92] arXiv:2507.01616 (cross-list from cs.IR) [pdf, html, other]
Title: Enhanced Influence-aware Group Recommendation for Online Media Propagation
Chengkun He, Xiangmin Zhou, Chen Wang, Longbing Cao, Jie Shao, Xiaodong Li, Guang Xu, Carrie Jinqiu Hu, Zahir Tari
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Databases (cs.DB)
[93] arXiv:2507.02635 (cross-list from cs.CR) [pdf, html, other]
Title: SAT-BO: Verification Rule Learning and Optimization for Fraud Transaction Detection
Mao Luo, Zhi Wang, Yiwen Huang, Qingyun Zhang, Zhouxing Su, Zhipeng Lv, Wen Hu, Jianguo Li
Subjects: Cryptography and Security (cs.CR); Databases (cs.DB)
[94] arXiv:2507.03410 (cross-list from cs.CL) [pdf, html, other]
Title: Graph Repairs with Large Language Models: An Empirical Study
Hrishikesh Terdalkar, Angela Bonifati, Andrea Mauri
Comments: Accepted to the 8th GRADES-NDA 2025 @ SIGMOD/PODS 2025
Subjects: Computation and Language (cs.CL); Databases (cs.DB); Emerging Technologies (cs.ET)
[95] arXiv:2507.05865 (cross-list from cs.IR) [pdf, html, other]
Title: On the Costs and Benefits of Learned Indexing for Dynamic High-Dimensional Data: Extended Version
Terézia Slanináková, Jaroslav Olha, David Procházka, Matej Antol, Vlastislav Dohnal
Subjects: Information Retrieval (cs.IR); Databases (cs.DB)
[96] arXiv:2507.06008 (cross-list from cs.CR) [pdf, html, other]
Title: The Impact of Event Data Partitioning on Privacy-aware Process Discovery
Jungeun Lim, Stephan A. Fahrenkrog-Petersen, Xixi Lu, Jan Mendling, Minseok Song
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Databases (cs.DB)
[97] arXiv:2507.06107 (cross-list from cs.DC) [pdf, html, other]
Title: A Unified Ontology for Scalable Knowledge Graph-Driven Operational Data Analytics in High-Performance Computing Systems
Junaid Ahmed Khan, Andrea Bartolini
Comments: This paper has been accepted for presentation at the GraphSys'25 workshop during EURO-PAR 2025. It spans 12 pages in single-column format
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Databases (cs.DB)
[98] arXiv:2507.07216 (cross-list from cs.LG) [pdf, html, other]
Title: Bias-Aware Mislabeling Detection via Decoupled Confident Learning
Yunyi Li, Maria De-Arteaga, Maytal Saar-Tsechansky
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Databases (cs.DB); Human-Computer Interaction (cs.HC)
[99] arXiv:2507.08107 (cross-list from cs.CL) [pdf, html, other]
Title: GRASP: Generic Reasoning And SPARQL Generation across Knowledge Graphs
Sebastian Walter, Hannah Bast
Subjects: Computation and Language (cs.CL); Databases (cs.DB); Information Retrieval (cs.IR)
[100] arXiv:2507.10019 (cross-list from stat.CO) [pdf, html, other]
Title: Sampling-Based Estimation of Jaccard Containment and Similarity
Pranav Joshi
Subjects: Computation (stat.CO); Databases (cs.DB); Machine Learning (stat.ML)