Distributed, Parallel, and Cluster Computing

Authors and titles for May 2025

Total of 302 entries : 1-50 51-100 101-150 151-200 201-250 251-300 ... 301-302
[101] arXiv:2505.12242 [pdf, html, other]
Title: ZenFlow: Enabling Stall-Free Offloading Training via Asynchronous Updates
Tingfeng Lan, Yusen Wu, Bin Ma, Zhaoyuan Su, Rui Yang, Tekin Bicer, Masahiro Tanaka, Olatunji Ruwase, Dong Li, Yue Cheng
Comments: 13 pages, 16 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[102] arXiv:2505.12608 [pdf, html, other]
Title: Quantum Modeling of Spatial Contiguity Constraints
Yunhan Chang, Amr Magdy, Federico M. Spedalieri
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[103] arXiv:2505.12658 [pdf, html, other]
Title: HydraInfer: Hybrid Disaggregated Scheduling for Multimodal Large Language Model Serving
Xianzhe Dong, Tongxuan Liu, Yuting Zeng, Liangyu Liu, Yang Liu, Siyu Wu, Yu Wu, Hailong Yang, Ke Zhang, Jing Li
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[104] arXiv:2505.12663 [pdf, html, other]
Title: MTGRBoost: Boosting Large-scale Generative Recommendation Models in Meituan
Yuxiang Wang, Xiao Yan, Chi Ma, Mincong Huang, Xiaoguang Li, Lei Yu, Chuan Liu, Ruidong Han, He Jiang, Bin Yin, Shangyu Chen, Fei Jiang, Xiang Li, Wei Lin, Haowei Han, Bo Du, Jiawei Jiang
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[105] arXiv:2505.12815 [pdf, html, other]
Title: Learning in Chaos: Efficient Autoscaling and Self-healing for Distributed Training at the Edge
Wenjiao Feng, Rongxing Xiao, Zonghang Li, Hongfang Yu, Gang Sun, Long Luo, Mohsen Guizani, Qirong Ho
Comments: 13 pages, 16 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
[106] arXiv:2505.12832 [pdf, html, other]
Title: A Study on Distributed Strategies for Deep Learning Applications in GPU Clusters
Md Sultanul Islam Ovi
Comments: 10 pages, 15 figures, 4 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[107] arXiv:2505.12853 [pdf, other]
Title: Optimization of Hybrid Quantum-Classical Algorithms
Lian Remme, Alexander Weinert, Andre Waschk
Comments: 15 pages, 3 figures, published in IEEE International Conference on Quantum Software QSW 2025
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[108] arXiv:2505.12928 [pdf, html, other]
Title: Minos: Exploiting Cloud Performance Variation with Function-as-a-Service Instance Selection
Trever Schirmer, Valentin Carl, Nils Höller, Tobias Pfandzelter, David Bermbach
Comments: Accepted for Publication at the 13th IEEE International Conference on Cloud Engineering (IC2E 2025)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[109] arXiv:2505.13153 [pdf, html, other]
Title: Prink: $k_s$-Anonymization for Streaming Data in Apache Flink
Philip Groneberg, Saskia Nuñez von Voigt, Thomas Janke, Louis Loechel, Karl Wolf, Elias Grünewald, Frank Pallas
Comments: accepted for ARES 2025
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR); Software Engineering (cs.SE)
[110] arXiv:2505.13160 [pdf, html, other]
Title: eBPF-Based Instrumentation for Generalisable Diagnosis of Performance Degradation
Diogo Landau, Jorge Barbosa, Nishant Saurabh
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[111] arXiv:2505.13955 [pdf, html, other]
Title: Paradigm Shift in Infrastructure Inspection Technology: Leveraging High-performance Imaging and Advanced AI Analytics to Inspect Road Infrastructure
Du Wu, Enzhi Zhang, Isaac Lyngaas, Xiao Wang, Amir Ziabari, Tao Luo, Peng Chen, Kento Sato, Fumiyoshi Shoji, Takaki Hatsui, Kentaro Uesugi, Akira Seo, Yasuhito Sakai, Toshio Endo, Tetsuya Ishikawa, Satoshi Matsuoka, Mohamed Wahib
Comments: Submitting this work to be considered for the Gordon Bell Award in SC25
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[112] arXiv:2505.14065 [pdf, html, other]
Title: Prime Collective Communications Library -- Technical Report
Michael Keiblinger, Mario Sieg, Jack Min Ong, Sami Jaghouar, Johannes Hagemann
Comments: 31 pages, 5 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[113] arXiv:2505.14427 [pdf, html, other]
Title: SkyMemory: A LEO Edge Cache for Transformer Inference Optimization and Scale Out
Thomas Sandholm, Sayandev Mukherjee, Lin Cheng, Bernardo A. Huberman
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[114] arXiv:2505.14507 [pdf, html, other]
Title: Federated prediction for scalable and privacy-preserved knowledge-based planning in radiotherapy
Jingyun Chen, David Horowitz, Yading Yuan
Comments: Under review for publication by the journal of Medical Physics
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Medical Physics (physics.med-ph)
[115] arXiv:2505.14796 [pdf, html, other]
Title: Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs
Melanie Cornelius, Greg Cross, Shilpika Shilpika, Matthew T. Dearing, Zhiling Lan
Comments: 11 pages, 4 tables, 14 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[116] arXiv:2505.14864 [pdf, html, other]
Title: Balanced and Elastic End-to-end Training of Dynamic LLMs
Mohamed Wahib, Muhammed Abdullah Soyturk, Didem Unat
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
[117] arXiv:2505.14914 [pdf, html, other]
Title: Sei Giga
Benjamin Marsh, Steven Landers, Jayendra Jog
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR)
[118] arXiv:2505.15020 [pdf, html, other]
Title: COSMIC: Enabling Full-Stack Co-Design and Optimization of Distributed Machine Learning Systems
Aditi Raju, Jared Ni, William Won, Changhai Man, Srivatsan Krishnan, Srinivas Sridharan, Amir Yazdanbakhsh, Tushar Krishna, Vijay Janapa Reddi
Comments: 11 pages (excluding references), 10 figures, 6 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[119] arXiv:2505.15112 [pdf, html, other]
Title: Parallel Scan on Ascend AI Accelerators
Bartłomiej Wróblewski, Gioele Gottardo, Anastasios Zouzias
Comments: To appear as a conference poster at the 39th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2025)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
[120] arXiv:2505.15122 [pdf, html, other]
Title: Exploring Dynamic Load Balancing Algorithms for Block-Structured Mesh-and-Particle Simulations in AMReX
Amitash Nanda, Md Kamal Hossain Chowdhury, Hannah Ross, Kevin Gott
Comments: 13 pages, 5 figures, Accepted in the ACM Practice and Experience in Advanced Research Computing (PEARC) Conference Series 2025
Journal-ref: Practice and Experience in Advanced Research Computing 2025 (PEARC '25), ACM, New York, NY, Article 5, 9 pages
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[121] arXiv:2505.15171 [pdf, html, other]
Title: Enhancing Cloud Task Scheduling Using a Hybrid Particle Swarm and Grey Wolf Optimization Approach
Raveena Prasad, Aarush Roy, Suchi Kumari
Comments: 10 pages, 5 figures, 1 table
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[122] arXiv:2505.15542 [pdf, html, other]
Title: Hardware-Level QoS Enforcement Features: Technologies, Use Cases, and Research Challenges
Oliver Larsson (1), Thijs Metsch (2), Cristian Klein (1), Erik Elmroth (1), ((1) Umeå University, (2) Intel Corporation)
Comments: 34 pages, 8 figures, 5 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[123] arXiv:2505.15652 [pdf, html, other]
Title: Breaking Barriers for Distributed MIS by Faster Degree Reduction
Seri Khoury, Aaron Schild
Comments: The abstract was shortened and slightly modified to meet Arxiv's requirements
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
[124] arXiv:2505.15654 [pdf, html, other]
Title: Round Elimination via Self-Reduction: Closing Gaps for Distributed Maximal Matching
Seri Khoury, Aaron Schild
Comments: The abstract was shortened and slightly modified to meet Arxiv requirements
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Data Structures and Algorithms (cs.DS)
[125] arXiv:2505.15988 [pdf, html, other]
Title: An Ecosystem of Services for FAIR Computational Workflows
Sean R. Wilkinson, Johan Gustafsson, Finn Bacall, Khalid Belhajjame, Salvador Capella, Jose Maria Fernandez Gonzalez, Jacob Fosso Tande, Luiz Gadelha, Daniel Garijo, Patricia Grubel, Bjorn Grüning, Farah Zaib Khan, Sehrish Kanwal, Simone Leo, Stuart Owen, Luca Pireddu, Line Pouchard, Laura Rodríguez-Navas, Beatriz Serrano-Solano, Stian Soiland-Reyes, Baiba Vilne, Alan Williams, Merridee Ann Wouters, Frederik Coppens, Carole Goble
Comments: 41 pages, 4 figures, 3 tables; to appear as chapter in upcoming book
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[126] arXiv:2505.16139 [pdf, html, other]
Title: On the Runtime of Local Mutual Exclusion for Anonymous Dynamic Networks
Anya Chaturvedi, Joshua J. Daymude, Andréa W. Richa
Comments: 16 pages, 1 table
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[127] arXiv:2505.16280 [pdf, html, other]
Title: Brand: Managing Training Data with Batched Random Access
Yuhao Li, Xuanhua Shi, Yunfei Zhao, Yongluan Zhou, Yusheng Hua, Xuehai Qian
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[128] arXiv:2505.16496 [pdf, html, other]
Title: Minimizing Energy in Reliability and Deadline-Ensured Workflow Scheduling in Cloud
Suvarthi Sarkar, Dhanesh V, Ketan Singh, Aryabartta Sahu
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[129] arXiv:2505.16499 [pdf, html, other]
Title: Smaller, Smarter, Closer: The Edge of Collaborative Generative AI
Roberto Morabito, SiYoung Jang
Comments: This paper has been accepted for publication in IEEE Internet Computing. Upon publication, the copyright will be transferred to IEEE
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI)
[130] arXiv:2505.16502 [pdf, html, other]
Title: Recursive Offloading for LLM Serving in Multi-tier Networks
Zhiyuan Wu, Sheng Sun, Yuwei Wang, Min Liu, Bo Gao, Jinda Lu, Zheming Yang, Tian Wen
Comments: 7 figures, 3 tables
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Networking and Internet Architecture (cs.NI)
[131] arXiv:2505.16508 [pdf, html, other]
Title: Edge-First Language Model Inference: Models, Metrics, and Tradeoffs
SiYoung Jang, Roberto Morabito
Comments: This paper has been accepted for publication and presentation at the 45th IEEE International Conference on Distributed Computing Systems (IEEE ICDCS 2025). The copyright will be transferred to IEEE upon publication in the conference proceedings
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Networking and Internet Architecture (cs.NI); Performance (cs.PF)
[132] arXiv:2505.17548 [pdf, html, other]
Title: H2:Towards Efficient Large-Scale LLM Training on Hyper-Heterogeneous Cluster over 1,000 Chips
Ding Tang, Jiecheng Zhou, Jiakai Hu, Shengwei Li, Huihuang Zheng, Zhilin Pei, Hui Wang, Xingcheng Zhang
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[133] arXiv:2505.17641 [pdf, html, other]
Title: DecLock: A Case of Decoupled Locking for Disaggregated Memory
Hanze Zhang, Ke Cheng, Rong Chen, Xingda Wei, Haibo Chen
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[134] arXiv:2505.17891 [pdf, other]
Title: DAG-based Consensus with Asymmetric Trust [Extended Version]
Ignacio Amores-Sesar, Christian Cachin, Juan Villacis, Luca Zanolini
Comments: Extended version of the article from PODC 25
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[135] arXiv:2505.18013 [pdf, html, other]
Title: DiFache: Efficient and Scalable Caching on Disaggregated Memory using Decentralized Coherence
Hanze Zhang, Kaiming Wang, Rong Chen, Xingda Wei, Haibo Chen
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[136] arXiv:2505.18278 [pdf, html, other]
Title: A Comparative Review of Parallel Exact, Heuristic, Metaheuristic, and Hybrid Optimization Techniques for the Traveling Salesman Problem
Rabab Alkhalifa, Fatima Alkhomayes, Boushra Almazroua, Dana Alhaidan, Maryam Alothman, Jumana Almuhaidib
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[137] arXiv:2505.18357 [pdf, html, other]
Title: CarbonFlex: Enabling Carbon-aware Provisioning and Scheduling for Cloud Clusters
Walid A. Hanafy, Li Wu, David Irwin, Prashant Shenoy
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[138] arXiv:2505.18563 [pdf, html, other]
Title: PacTrain: Pruning and Adaptive Sparse Gradient Compression for Efficient Collective Communication in Distributed Deep Learning
Yisu Wang, Ruilong Wu, Xinjiao Li, Dirk Kutscher
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI)
[139] arXiv:2505.18648 [pdf, other]
Title: TEE is not a Healer: Rollback-Resistant Reliable Storage
Sadegh Keshavarzi, Gregory Chockler, Alexey Gotsman
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[140] arXiv:2505.18681 [pdf, html, other]
Title: EvoSort: A Genetic-Algorithm-Based Adaptive Parallel Sorting Framework for Large-Scale High Performance Computing
Shashank Raj, Kalyanmoy Deb
Comments: Under review at the International Journal of Parallel, Emergent and Distributed Systems
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[141] arXiv:2505.18836 [pdf, html, other]
Title: Distributed Incremental SAT Solving with Mallob: Report and Case Study with Hierarchical Planning
Dominik Schreiber
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Logic in Computer Science (cs.LO)
[142] arXiv:2505.19216 [pdf, html, other]
Title: Grassroots Consensus
Idit Keidar, Andrew Lewis-Pye, Ehud Shapiro
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Cryptography and Security (cs.CR); Data Structures and Algorithms (cs.DS); Networking and Internet Architecture (cs.NI)
[143] arXiv:2505.19467 [pdf, html, other]
Title: GPU acceleration of non-equilibrium Green's function calculation using OpenACC and CUDA FORTRAN
Jia Yin, Khaled Z. Ibrahim, Mauro Del Ben, Jack Deslippe, Yang-hao Chan, Chao Yang
Comments: 14 pages, 20 figures
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[144] arXiv:2505.19739 [pdf, other]
Title: Justin: Hybrid CPU/Memory Elastic Scaling for Distributed Stream Processing
Donatien Schmitz (EPL), Guillaume Rosinosky (LS2N), Etienne Rivière (EPL)
Comments: Artifacts available at this https URL
Journal-ref: DAIS 2025 - 25th International Conference on Distributed Applications and Interoperable Systems, Daniel Balouek; Ibéria Medeiros, Jun 2025, Lille, France. pp.1-17
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[145] arXiv:2505.19880 [pdf, html, other]
Title: Universal Workers: A Vision for Eliminating Cold Starts in Serverless Computing
Saman Akbari, Manfred Hauswirth
Comments: Accepted for publication in 2025 IEEE 18th International Conference on Cloud Computing (CLOUD)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Performance (cs.PF)
[146] arXiv:2505.19989 [pdf, html, other]
Title: From Few to Many Faults: Adaptive Byzantine Agreement with Optimal Communication
Andrei Constantinescu, Marc Dufay, Anton Paramonov, Roger Wattenhofer
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)
[147] arXiv:2505.19995 [pdf, html, other]
Title: Optimizing edge AI models on HPC systems with the edge in the loop
Marcel Aach, Cyril Blanc, Andreas Lintermann, Kurt De Grave
Comments: 13 pages, accepted for oral presentation at Computational Aspects of Deep Learning 2025 (at ISC 2025)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2505.20600 [pdf, html, other]
Title: InstGenIE: Generative Image Editing Made Efficient with Mask-aware Caching and Scheduling
Xiaoxiao Jiang, Suyi Li, Lingyun Yang, Tianyu Feng, Zhipeng Di, Weiyi Lu, Guoxuan Zhu, Xiu Lin, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[149] arXiv:2505.20705 [pdf, other]
Title: Time-Series Learning for Proactive Fault Prediction in Distributed Systems with Deep Neural Structures
Yang Wang, Wenxuan Zhu, Xuehui Quan, Heyi Wang, Chang Liu, Qiyuan Wu
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG)
[150] arXiv:2505.20835 [pdf, html, other]
Title: ECC-SNN: Cost-Effective Edge-Cloud Collaboration for Spiking Neural Networks
Di Yu, Changze Lv, Xin Du, Linshan Jiang, Wentao Tong, Zhenyu Liao, Xiaoqing Zheng, Shuiguang Deng
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC)