Showing 1–15 of 15 results for author: Cherti, M

  1. arXiv:2504.03712  [pdf, other]

    cs.CV cs.AI cs.CE cs.LG

    Scalable heliostat surface predictions from focal spots: Sim-to-Real transfer of inverse Deep Learning Raytracing

    Authors: Jan Lewen, Max Pargmann, Jenia Jitsev, Mehdi Cherti, Robert Pitz-Paal, Daniel Maldonado Quinto

    Abstract: Concentrating Solar Power (CSP) plants are a key technology in the transition toward sustainable energy. A critical factor for their safe and efficient operation is the distribution of concentrated solar flux on the receiver. However, flux distributions from individual heliostats are sensitive to surface imperfections. Measuring these surfaces across many heliostats remains impractical in real-wor…

    Submitted 28 March, 2025; originally announced April 2025.

  2. Application-Driven Exascale: The JUPITER Benchmark Suite

    Authors: Andreas Herten, Sebastian Achilles, Damian Alvarez, Jayesh Badwaik, Eric Behle, Mathis Bode, Thomas Breuer, Daniel Caviedes-Voullième, Mehdi Cherti, Adel Dabah, Salem El Sayed, Wolfgang Frings, Ana Gonzalez-Nicolas, Eric B. Gregory, Kaveh Haghighi Mood, Thorsten Hater, Jenia Jitsev, Chelsea Maria John, Jan H. Meinke, Catrin I. Meyer, Pavel Mezentsev, Jan-Oliver Mirus, Stepan Nassyr, Carolin Penke, Manoel Römmer, et al. (6 additional authors not shown)

    Abstract: Benchmarks are essential in the design of modern HPC installations, as they define key aspects of system components. Beyond synthetic workloads, it is crucial to include real applications that represent user requirements into benchmark suites, to guarantee high usability and widespread adoption of a new system. Given the significant investments in leadership-class supercomputers of the exascale er…

    Submitted 30 August, 2024; originally announced August 2024.

    Comments: To be published in Proceedings of The International Conference for High Performance Computing Networking, Storage, and Analysis (SC '24) (2024)

    ACM Class: B.8.2; C.0; C.5.1; D.1.0; C.4

    Journal ref: 2024 SC24: International Conference for High Performance Computing, Networking, Storage and Analysis SC

  3. arXiv:2408.14471  [pdf, other]

    cs.CV cs.CL cs.LG

    A Practitioner's Guide to Continual Multimodal Pretraining

    Authors: Karsten Roth, Vishaal Udandarao, Sebastian Dziadzio, Ameya Prabhu, Mehdi Cherti, Oriol Vinyals, Olivier Hénaff, Samuel Albanie, Matthias Bethge, Zeynep Akata

    Abstract: Multimodal foundation models serve numerous applications at the intersection of vision and language. Still, despite being pretrained on extensive data, they become outdated over time. To keep models updated, research into continual pretraining mainly explores scenarios with either (1) infrequent, indiscriminate updates on large-scale new data, or (2) frequent, sample-level updates. However, practi…

    Submitted 6 December, 2024; v1 submitted 26 August, 2024; originally announced August 2024.

    Comments: Technical Report. 52 pages. Shorter version published at the NeurIPS 2024 Dataset & Benchmarks track

  4. arXiv:2408.10802  [pdf, other]

    cs.LG cs.AI

    Inverse Deep Learning Ray Tracing for Heliostat Surface Prediction

    Authors: Jan Lewen, Max Pargmann, Mehdi Cherti, Jenia Jitsev, Robert Pitz-Paal, Daniel Maldonado Quinto

    Abstract: Concentrating Solar Power (CSP) plants play a crucial role in the global transition towards sustainable energy. A key factor in ensuring the safe and efficient operation of CSP plants is the distribution of concentrated flux density on the receiver. However, the non-ideal flux density generated by individual heliostats can undermine the safety and efficiency of the power plant. The flux density fr…

    Submitted 20 August, 2024; originally announced August 2024.

  5. arXiv:2406.02061  [pdf, other]

    cs.LG cs.AI cs.CL

    Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models

    Authors: Marianna Nezhurina, Lucia Cipolina-Kun, Mehdi Cherti, Jenia Jitsev

    Abstract: Large Language Models (LLMs) are often described as instances of foundation models that possess strong generalization obeying scaling laws, and therefore transfer robustly across various conditions in few- or zero-shot manner. Such claims rely on standardized benchmarks that suppose to measure generalization and reasoning, where state-of-the-art (SOTA) models score high. We demonstrate here a dram…

    Submitted 4 March, 2025; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: v3.0. Control experiments, further AIW problem versions, testing recent reasoning models. Short version appeared at NeurIPS Scientific Methods for Understanding Deep Learning Workshop (SciDL) 2024, https://openreview.net/forum?id=Mkl7dzjYiW

  6. arXiv:2304.14108  [pdf, other]

    cs.CV cs.CL cs.LG

    DataComp: In search of the next generation of multimodal datasets

    Authors: Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, et al. (9 additional authors not shown)

    Abstract: Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Commo…

    Submitted 20 October, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  7. arXiv:2304.07169  [pdf, other]

    cs.CV cs.LG

    A Comparative Study on Generative Models for High Resolution Solar Observation Imaging

    Authors: Mehdi Cherti, Alexander Czernik, Stefan Kesselheim, Frederic Effenberger, Jenia Jitsev

    Abstract: Solar activity is one of the main drivers of variability in our solar system and the key source of space weather phenomena that affect Earth and near Earth space. The extensive record of high resolution extreme ultraviolet (EUV) observations from the Solar Dynamics Observatory (SDO) offers an unprecedented, very large dataset of solar images. In this work, we make use of this comprehensive dataset…

    Submitted 14 April, 2023; originally announced April 2023.

  8. Reproducible scaling laws for contrastive language-image learning

    Authors: Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, Jenia Jitsev

    Abstract: Scaling up neural networks has led to remarkable performance across a wide range of tasks. Moreover, performance often follows reliable scaling laws as a function of training set size, model size, and compute, which offers valuable guidance as large-scale experiments are becoming increasingly expensive. However, previous work on scaling laws has primarily used private data & models or focused on…

    Submitted 13 July, 2024; v1 submitted 14 December, 2022; originally announced December 2022.

    Comments: CVPR 2023. Version with minor extension. Original: https://openaccess.thecvf.com/content/CVPR2023/html/Cherti_Reproducible_Scaling_Laws_for_Contrastive_Language-Image_Learning_CVPR_2023_paper

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, pp. 2818-2829

  9. arXiv:2210.08402  [pdf, other]

    cs.CV cs.AI cs.LG

    LAION-5B: An open large-scale dataset for training next generation image-text models

    Authors: Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, Jenia Jitsev

    Abstract: Groundbreaking language-vision architectures like CLIP and DALL-E proved the utility of training on large amounts of noisy image-text data, without relying on expensive accurate labels used in standard vision unimodal supervised learning. The resulting models showed capabilities of strong text-guided image generation and transfer to downstream tasks, while performing remarkably at zero-shot classi…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022), Track on Datasets and Benchmarks. OpenReview: https://openreview.net/forum?id=M3Y74vmsMcY

  10. arXiv:2108.11976  [pdf, other]

    cs.DC cs.LG

    JUWELS Booster -- A Supercomputer for Large-Scale AI Research

    Authors: Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel, Thomas Lippert

    Abstract: In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its s…

    Submitted 30 June, 2021; originally announced August 2021.

    Comments: 12 pages, 5 figures. Accepted at ISC 2021, Workshop Deep Learning on Supercomputers. This is a duplicate submission as my previous submission is on hold for several weeks now and my attempts to contact the moderators failed

    Report number: 1234567Dummy

  11. Effect of Pre-Training Scale on Intra- and Inter-Domain Full and Few-Shot Transfer Learning for Natural and Medical X-Ray Chest Images

    Authors: Mehdi Cherti, Jenia Jitsev

    Abstract: Increasing model, data and compute budget scale in the pre-training has been shown to strongly improve model generalization and transfer learning in vast line of work done in language modeling and natural image recognition. However, most studies on the positive effect of larger scale were done in scope of in-domain setting, with source and target data being in close proximity. To study effect of l…

    Submitted 18 December, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

    Comments: Short version published in MedNeurIPS 2021. Long version published in IJCNN 2022. Code: https://github.com/SLAMPAI/large-scale-pretraining-transfer

    Journal ref: 2022 International Joint Conference on Neural Networks (IJCNN), 2022, pp. 1-9

  12. arXiv:1906.11898  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    InsectUp: Crowdsourcing Insect Observations to Assess Demographic Shifts and Improve Classification

    Authors: Léonard Boussioux, Tomás Giro-Larraz, Charles Guille-Escuret, Mehdi Cherti, Balázs Kégl

    Abstract: Insects play such a crucial role in ecosystems that a shift in demography of just a few species can have devastating consequences at environmental, social and economic levels. Despite this, evaluation of insect demography is strongly limited by the difficulty of collecting census data at sufficient scale. We propose a method to gather and leverage observations from bystanders, hikers, and entomolo…

    Submitted 29 January, 2020; v1 submitted 29 May, 2019; originally announced June 2019.

    Comments: Appearing at the International Conference on Machine Learning, AI for Social Good Workshop, Long Beach, United States, 2019. Appearing at the International Conference on Computer Vision, AI for Wildlife Conservation Workshop, Seoul, South Korea, 2019. 5 pages, 6 figures

  13. arXiv:1810.01876  [pdf, other]

    cs.LG stat.ML

    Spurious samples in deep generative models: bug or feature?

    Authors: Balázs Kégl, Mehdi Cherti, Akın Kazakçı

    Abstract: Traditional wisdom in generative modeling literature is that spurious samples that a model can generate are errors and they should be avoided. Recent research, however, has shown interest in studying or even exploiting such samples instead of eliminating them. In this paper, we ask the question whether such samples can be eliminated all together without sacrificing coverage of the generating distr…

    Submitted 3 October, 2018; originally announced October 2018.

  14. arXiv:1705.07099  [pdf, other]

    q-bio.QM cs.LG

    Machine learning for classification and quantification of monoclonal antibody preparations for cancer therapy

    Authors: Laetitia Le, Camille Marini, Alexandre Gramfort, David Nguyen, Mehdi Cherti, Sana Tfaili, Ali Tfayli, Arlette Baillet-Guffroy, Patrice Prognon, Pierre Chaminade, Eric Caudron, Balázs Kégl

    Abstract: Monoclonal antibodies constitute one of the most important strategies to treat patients suffering from cancers such as hematological malignancies and solid tumors. In order to guarantee the quality of those preparations prepared at hospital, quality control has to be developed. The aim of this study was to explore a noninvasive, nondestructive, and rapid analytical method to ensure the quality of…

    Submitted 31 May, 2017; v1 submitted 19 May, 2017; originally announced May 2017.

  15. arXiv:1606.04345  [pdf, other]

    cs.AI

    Digits that are not: Generating new types through deep neural nets

    Authors: Akın Kazakçı, Mehdi Cherti, Balázs Kégl

    Abstract: For an artificial creative agent, an essential driver of the search for novelty is a value function which is often provided by the system designer or users. We argue that an important barrier for progress in creativity research is the inability of these systems to develop their own notion of value for novelty. We propose a notion of knowledge-driven creativity that circumvent the need for an exter…

    Submitted 14 June, 2016; originally announced June 2016.

    Comments: preprint ICCC'16, International Conference on Computational Creativity