Databases

Authors and titles for October 2024

Total of 99 entries : 1-50 51-99
[1] arXiv:2410.00067 [pdf, html, other]
Title: Ranking the Top-K Realizations of Stochastically Known Event Logs
Arvid Lepsien, Marco Pegoraro, Frederik Fonger, Dominic Langhammer, Milda Aleknonytė-Resch, Agnes Koschmider
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[2] arXiv:2410.00596 [pdf, html, other]
Title: Dynamic and Scalable Data Preparation for Object-Centric Process Mining
Lien Bosmans, Jari Peeperkorn, Alexandre Goossens, Giovanni Lugaresi, Johannes De Smedt, Jochen De Weerdt
Subjects: Databases (cs.DB)
[3] arXiv:2410.00846 [pdf, html, other]
Title: Why Are Learned Indexes So Effective but Sometimes Ineffective?
Qiyu Liu, Siyuan Han, Yanlin Qi, Jingshu Peng, Jin Li, Longlong Lin, Lei Chen
Subjects: Databases (cs.DB)
[4] arXiv:2410.00884 [pdf, html, other]
Title: Low-Latency Sliding Window Connectivity
Chao Zhang, Angela Bonifati, Tamer Özsu
Subjects: Databases (cs.DB)
[5] arXiv:2410.01231 [pdf, html, other]
Title: Revisiting the Index Construction of Proximity Graph-Based Approximate Nearest Neighbor Search
Shuo Yang, Jiadong Xie, Yingfan Liu, Jeffrey Xu Yu, Xiyue Gao, Qianru Wang, Yanguo Peng, Jiangtao Cui
Comments: The paper has been accepted by PVLDB 2025
Subjects: Databases (cs.DB)
[6] arXiv:2410.01760 [pdf, html, other]
Title: Competitive Ratio of Online Caching with Predictions: Lower and Upper Bounds
Daniel Skachkov, Denis Ponomaryov, Yuri Dorn, Alexander Demin
Subjects: Databases (cs.DB)
[7] arXiv:2410.01869 [pdf, html, other]
Title: Enhancing LLM Fine-tuning for Text-to-SQLs by SQL Quality Measurement
Shouvon Sarker, Xishuang Dong, Xiangfang Li, Lijun Qian
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
[8] arXiv:2410.01978 [pdf, html, other]
Title: LLM+KG@VLDB'24 Workshop Summary
Arijit Khan, Tianxing Wu, Xi Chen
Comments: accepted at ACM SIGMOD Record 2025
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[9] arXiv:2410.02234 [pdf, html, other]
Title: GORAM: Graph-oriented ORAM for Efficient Ego-centric Queries on Federated Graphs
Xiaoyu Fan, Kun Chen, Jiping Yu, Xiaowei Zhu, Yunyi Chen, Huanchen Zhang, Wei Xu
Subjects: Databases (cs.DB); Data Structures and Algorithms (cs.DS)
[10] arXiv:2410.03411 [pdf, html, other]
Title: Benchmarking the Fidelity and Utility of Synthetic Relational Data
Valter Hudovernik, Martin Jurkovič, Erik Štrumbelj
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[11] arXiv:2410.04349 [pdf, html, other]
Title: HyperBlocker: Accelerating Rule-based Blocking in Entity Resolution using GPUs
Xiaoke Zhu, Min Xie, Ting Deng, Qi Zhang
Comments: In PVLDB 2025
Subjects: Databases (cs.DB)
[12] arXiv:2410.04783 [pdf, html, other]
Title: When GDD meets GNN: A Knowledge-driven Neural Connection for Effective Entity Resolution in Property Graphs
Junwei Hu, Michael Bewong, Selasi Kwashie, Yidi Zhang, Vincent Nofong, John Wondoh, Zaiwen Feng
Subjects: Databases (cs.DB)
[13] arXiv:2410.05091 [pdf, html, other]
Title: DIMS: Distributed Index for Similarity Search in Metric Spaces
Yifan Zhu, Chengyang Luo, Tang Qian, Lu Chen, Yunjun Gao, Baihua Zheng
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[14] arXiv:2410.06010 [pdf, other]
Title: A large collection of bioinformatics question-query pairs over federated knowledge graphs: methodology and applications
Jerven Bolleman, Vincent Emonet, Adrian Altenhoff, Amos Bairoch, Marie-Claude Blatter, Alan Bridge, Severine Duvaud, Elisabeth Gasteiger, Dmitry Kuznetsov, Sebastien Moretti, Pierre-Andre Michel, Anne Morgat, Marco Pagni, Nicole Redaschi, Monique Zahn-Zabal, Tarcisio Mendes de Farias, Ana Claudia Sima
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[15] arXiv:2410.06011 [pdf, html, other]
Title: Large Language Model Enhanced Text-to-SQL Generation: A Survey
Xiaohu Zhu, Qian Li, Lizhen Cui, Yongkang Liu
Comments: 14 pages, 2 figures
Subjects: Databases (cs.DB)
[16] arXiv:2410.06062 [pdf, html, other]
Title: LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs
Vincent Emonet, Jerven Bolleman, Severine Duvaud, Tarcisio Mendes de Farias, Ana Claudia Sima
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[17] arXiv:2410.06526 [pdf, other]
Title: KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
Kaijing Ma, Xinrun Du, Yunran Wang, Haoran Zhang, Zhoufutu Wen, Xingwei Qu, Jian Yang, Jiaheng Liu, Minghao Liu, Xiang Yue, Wenhao Huang, Ge Zhang
Subjects: Databases (cs.DB)
[18] arXiv:2410.07144 [pdf, html, other]
Title: Natural Language Query Engine for Relational Databases using Generative AI
Steve Tueno Fotso
Comments: Artificial Intelligence, Machine Learning, Generative AI, SQL, Relational Database, SQL Correctness
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Software Engineering (cs.SE)
[19] arXiv:2410.07485 [pdf, other]
Title: Gem: Gaussian Mixture Model Embeddings for Numerical Feature Distributions
Hafiz Tayyab Rauf, Alex Bogatu, Norman W. Paton, Andre Freitas
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[20] arXiv:2410.07603 [pdf, other]
Title: An Analysis of XML Compression Efficiency
Christopher James Augeri, Barry E. Mullins, Leemon C. Baird III, Dursun A. Bulutoglu, Rusty O. Baldwin
Comments: 1. test data at this https URL 2. one next step is testing newer compressors, e.g., Brotli, along with Zstandard, which leverages the asymmetric numeral system (ANS) 3. citations at this https URL
Journal-ref: Proceedings of the 2007 workshop on Experimental Computer Science (ExpCS) at ACM FCRC 2007
Subjects: Databases (cs.DB); Information Theory (cs.IT); Performance (cs.PF)
[21] arXiv:2410.07895 [pdf, html, other]
Title: Grid-AR: A Grid-based Booster for Learned Cardinality Estimation and Range Joins
Damjan Gjurovski, Angjela Davitkova, Sebastian Michel
Comments: 13 pages, 6 figures, 9 tables
Subjects: Databases (cs.DB)
[22] arXiv:2410.09244 [pdf, html, other]
Title: Using off-the-shelf LLMs to query enterprise data by progressively revealing ontologies
C. Civili, E. Sherkhonov, R.E.K. Stirewalt
Comments: 5 pages
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[23] arXiv:2410.09441 [pdf, other]
Title: From Text to Databases: attribute grammar as database meta-model
Jacques Chabin, Mirian Halfeld-Ferrari, Nicolas Hiot
Subjects: Databases (cs.DB)
[24] arXiv:2410.09925 [pdf, html, other]
Title: The Case for DBMS Live Patching [Extended Version]
Michael Fruth, Stefanie Scherzinger
Subjects: Databases (cs.DB)
[25] arXiv:2410.10081 [pdf, html, other]
Title: Data Modeling for Connected Data -- A systematic literature review
Veronica Santos
Subjects: Databases (cs.DB)
[26] arXiv:2410.10680 [pdf, html, other]
Title: Evaluating SQL Understanding in Large Language Models
Ananya Rahaman, Anny Zheng, Mostafa Milani, Fei Chiang, Rachel Pottinger
Comments: 12 pages conference submission
Subjects: Databases (cs.DB)
[27] arXiv:2410.10931 [pdf, html, other]
Title: Combining Observational Data and Language for Species Range Estimation
Max Hamilton, Christian Lange, Elijah Cole, Alexander Shepard, Samuel Heinrich, Oisin Mac Aodha, Grant Van Horn, Subhransu Maji
Comments: NeurIPS 2024
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[28] arXiv:2410.11435 [pdf, html, other]
Title: Summarized Causal Explanations For Aggregate Views (Full version)
Brit Youngmann, Michael Cafarella, Amir Gilad, Sudeepa Roy
Subjects: Databases (cs.DB)
[29] arXiv:2410.11457 [pdf, other]
Title: LR-SQL: A Supervised Fine-Tuning Method for Text2SQL Tasks under Low-Resource Scenarios
Wen Wuzhenghong, Zhang Yongpan, Pan Su, Sun Yuwei, Lu Pengwei, Ding Cheng
Comments: 12 pages, 4 figures, submitting to a journal
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR)
[30] arXiv:2410.11853 [pdf, html, other]
Title: GeoLife+: Large-Scale Simulated Trajectory Datasets Calibrated to the GeoLife Dataset
Hossein Amiri, Richard Yang, Andreas Zufle
Comments: Accepted paper at this https URL
Subjects: Databases (cs.DB); Information Retrieval (cs.IR); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
[31] arXiv:2410.12056 [pdf, other]
Title: Utilizing Spatiotemporal Data Analytics to Pinpoint Outage Location
Reddy Mandati, Po-Chen Chen, Vladyslav Anderson, Bishwa Sapkota, Michael Jarrell Warren, Bobby Besharati, Ankush Agarwal, Samuel Johnston III
Subjects: Databases (cs.DB); Computers and Society (cs.CY)
[32] arXiv:2410.12189 [pdf, html, other]
Title: DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
Shreya Shankar, Tristan Chambers, Tarak Shah, Aditya G. Parameswaran, Eugene Wu
Comments: 22 pages, 6 figures, 7 tables
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[33] arXiv:2410.12418 [pdf, html, other]
Title: Privacy-Preserving Synthetically Augmented Knowledge Graphs with Semantic Utility
Luigi Bellomarini, Costanza Catalano, Andrea Coletta, Michela Iezzi, Pierangela Samarati
Comments: 32 pages, 5 figures
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[34] arXiv:2410.12496 [pdf, html, other]
Title: Finding Logic Bugs in Spatial Database Engines via Affine Equivalent Inputs
Wenjing Deng, Qiuyang Mang, Chengyu Zhang, Manuel Rigger
Subjects: Databases (cs.DB); Programming Languages (cs.PL); Software Engineering (cs.SE)
[35] arXiv:2410.12734 [pdf, html, other]
Title: Machine Learning-Augmented Ontology-Based Data Access for Renewable Energy Data
Marco Calautti, Damiano Duranti, Paolo Giorgini
Comments: Paper accepted for publication to the 32nd Symposium on Advanced Database Systems (SEBD) 2024: this https URL
Subjects: Databases (cs.DB)
[36] arXiv:2410.12965 [pdf, html, other]
Title: Realizing a Collaborative RDF Benchmark Suite in Practice
Piotr Sowinski, Maria Ganzha
Comments: Accepted at 24th International Conference on Knowledge Engineering and Knowledge Management (EKAW 2024) as a demo paper
Subjects: Databases (cs.DB)
[37] arXiv:2410.13813 [pdf, html, other]
Title: Meta-Property Graphs: Extending Property Graphs with Metadata Awareness and Reification
Sepehr Sadoughi, Nikolay Yakovets, George Fletcher
Subjects: Databases (cs.DB)
[38] arXiv:2410.13880 [pdf, html, other]
Title: Query Based Construction of Chronic Disease Datasets
Vuong M. Ngo, Geetika Sood, Patricia Kearney, Fionnuala Donohue, Dongyun Nie, Mark Roantree
Comments: White paper for the RECONNECT project, 14 pages, 4 figures and 18 example queries
Subjects: Databases (cs.DB)
[39] arXiv:2410.13884 [pdf, other]
Title: Cartographier des trajectoires maritimes incertaines du XVIII ème siècle
Christine Plumejeaud-Perreau (Migrinter (Poitiers)), Bernard Pradines (Migrinter (Poitiers))
Comments: in French language
Journal-ref: Spatial Analysis and GEOmatics (SAGEO) 2023, Thierry Badard, Jacynthe Pouliot, Matthieu Noucher, Marlène Villanova-Oliver, Jun 2023, Québec, Canada
Subjects: Databases (cs.DB)
[40] arXiv:2410.14066 [pdf, html, other]
Title: Lightweight Correlation-Aware Table Compression
Mihail Stoian, Alexander van Renen, Jan Kobiolka, Ping-Lin Kuo, Josif Grabocka, Andreas Kipf
Comments: Third Table Representation Learning Workshop (TRL @ NeurIPS 2024)
Subjects: Databases (cs.DB); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[41] arXiv:2410.14495 [pdf, html, other]
Title: Towards a Simple and Extensible Standard for Object-Centric Event Data (OCED) -- Core Model, Design Space, and Lessons Learned
Dirk Fahland, Marco Montali, Julian Lebherz, Wil M.P. van der Aalst, Maarten van Asseldonk, Peter Blank, Lien Bosmans, Marcus Brenscheidt, Claudio di Ciccio, Andrea Delgado, Daniel Calegari, Jari Peeperkorn, Eric Verbeek, Lotte Vugs, Moe Thandar Wynn
Comments: 46 pages, 11 figures, report of the OCED working group of the IEEE Taskforce on Process Mining towards the development of a new standard for exchange and storage of object-centric event data
Subjects: Databases (cs.DB)
[42] arXiv:2410.14692 [pdf, other]
Title: Attribute-Based Semantic Type Detection and Data Quality Assessment
Marcelo Valentim Silva, Hannes Herrmann, Valerie Maxville
Comments: 10 pages, 9 tables, sent for approval at BDCAT 2024
Subjects: Databases (cs.DB); Information Retrieval (cs.IR)
[43] arXiv:2410.15547 [pdf, html, other]
Title: Data Cleaning Using Large Language Models
Shuo Zhang, Zezhou Huang, Eugene Wu
Subjects: Databases (cs.DB)
[44] arXiv:2410.15831 [pdf, html, other]
Title: Rethinking State Management in Actor Systems for Cloud-Native Applications
Yijian Liu, Rodrigo Laigner, Yongluan Zhou
Subjects: Databases (cs.DB); Software Engineering (cs.SE)
[45] arXiv:2410.16120 [pdf, other]
Title: Learning SQL from within: integrating database exercises into the database itself
Aristide Grange
Comments: 36 pages
Subjects: Databases (cs.DB)
[46] arXiv:2410.16501 [pdf, html, other]
Title: The Cost of Representation by Subset Repairs
Yuxi Liu, Fangzhu Shen, Kushagra Ghosh, Amir Gilad, Benny Kimelfeld, Sudeepa Roy
Comments: full version, to appear at VLDB25
Subjects: Databases (cs.DB)
[47] arXiv:2410.16720 [pdf, html, other]
Title: NodeOP: Optimizing Node Management for Decentralized Networks
Angela Tsang, Jiankai Sun, Boo Xie, Azeem Khan, Ender Lu, Fletcher Fan, Maggie Wu, Jing Tang
Subjects: Databases (cs.DB); Cryptography and Security (cs.CR)
[48] arXiv:2410.16929 [pdf, html, other]
Title: CUBIT: Concurrent Updatable Bitmap Indexing (Extended Version)
Junchang Wang, Manos Athanassoulis
Subjects: Databases (cs.DB)
[49] arXiv:2410.17134 [pdf, html, other]
Title: TELII: Temporal Event Level Inverted Indexing for Cohort Discovery on a Large Covid-19 EHR Dataset
Yan Huang
Subjects: Databases (cs.DB); Information Retrieval (cs.IR)
[50] arXiv:2410.17465 [pdf, html, other]
Title: Bauplan: zero-copy, scale-up FaaS for data pipelines
Jacopo Tagliabue, Tyler Caraza-Harter, Ciro Greco
Comments: Accepted for the 10th International Workshop on Serverless Computing (pre-print)
Subjects: Databases (cs.DB); Machine Learning (cs.LG); Operating Systems (cs.OS)