Showing 1–18 of 18 results for author: Mullins, D

  1. arXiv:2411.05197  [pdf, other]

    cs.LG

    Hardware and Software Platform Inference

    Authors: Cheng Zhang, Hanna Foerster, Robert D. Mullins, Yiren Zhao, Ilia Shumailov

    Abstract: It is now a common business practice to buy access to large language model (LLM) inference rather than self-host, because of significant upfront hardware infrastructure and energy costs. However, as a buyer, there is no mechanism to verify the authenticity of the advertised service including the serving hardware platform, e.g. that it is actually being served using an NVIDIA H100. Furthermore, the…

    Submitted 7 November, 2024; originally announced November 2024.

  2. arXiv:2406.14963  [pdf, other]

    cs.LG

    Optimised Grouped-Query Attention Mechanism for Transformers

    Authors: Yuang Chen, Cheng Zhang, Xitong Gao, Robert D. Mullins, George A. Constantinides, Yiren Zhao

    Abstract: Grouped-query attention (GQA) has been widely adopted in LLMs to mitigate the complexity of multi-head attention (MHA). To transform an MHA to a GQA, neighbour queries in MHA are evenly split into groups where each group shares the value and key layers. In this work, we propose AsymGQA, an activation-informed approach to asymmetrically grouping an MHA to a GQA for better model performance. Our Asy…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML2024 ES-FoMo-II Workshop

  3. arXiv:2406.14956  [pdf, other]

    cs.LG cs.CL

    Unlocking the Global Synergies in Low-Rank Adapters

    Authors: Zixi Zhang, Cheng Zhang, Xitong Gao, Robert D. Mullins, George A. Constantinides, Yiren Zhao

    Abstract: Low-rank Adaptation (LoRA) has been the de facto parameter-efficient fine-tuning technique for large language models. We present HeteroLoRA, a light-weight search algorithm that leverages zero-cost proxies to allocate the limited LoRA trainable parameters across the model for better fine-tuned performance. In addition to the allocation for the standard LoRA-adapted models, we also demonstrate the ef…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML2024 ES-FoMo-II Workshop

  4. arXiv:2210.02570  [pdf, other]

    cs.LG cs.AI cs.CL

    Revisiting Structured Dropout

    Authors: Yiren Zhao, Oluwatomisin Dada, Xitong Gao, Robert D Mullins

    Abstract: Large neural networks are often overparameterised and prone to overfitting. Dropout is a widely used regularization technique to combat overfitting and improve model generalization. However, unstructured Dropout is not always effective for specific network architectures and this has led to the formation of multiple structured Dropout approaches to improve model performance and, sometimes, reduce t…

    Submitted 5 October, 2022; originally announced October 2022.

  5. arXiv:2210.00641  [pdf, other]

    cs.LG

    DARTFormer: Finding The Best Type Of Attention

    Authors: Jason Ross Brown, Yiren Zhao, Ilia Shumailov, Robert D Mullins

    Abstract: Given the wide and ever-growing range of different efficient Transformer attention mechanisms, it is important to identify which attention is most effective when given a task. In this work, we are also interested in combining different attention types to build heterogeneous Transformers. We first propose a DARTS-like Neural Architecture Search (NAS) method to find the best attention for a given ta…

    Submitted 2 October, 2022; originally announced October 2022.

    ACM Class: I.2.7; I.2.6

  6. arXiv:2210.00640  [pdf, other]

    cs.LG

    Wide Attention Is The Way Forward For Transformers?

    Authors: Jason Ross Brown, Yiren Zhao, Ilia Shumailov, Robert D Mullins

    Abstract: The Transformer is an extremely powerful and prominent deep learning architecture. In this work, we challenge the commonly held belief in deep learning that going deeper is better, and show an alternative design approach that is building wider attention Transformers. We demonstrate that wide single-layer Transformer models can compete with or outperform deeper ones in a variety of Natural Language…

    Submitted 8 November, 2022; v1 submitted 2 October, 2022; originally announced October 2022.

    ACM Class: I.2.7

  7. arXiv:2209.09338  [pdf, other]

    cs.LG

    Revisiting Embeddings for Graph Neural Networks

    Authors: S. Purchase, A. Zhao, R. D. Mullins

    Abstract: Current graph representation learning techniques use Graph Neural Networks (GNNs) to extract features from dataset embeddings. In this work, we examine the quality of these embeddings and assess how changing them can affect the accuracy of GNNs. We explore different embedding extraction techniques for both images and texts, and find that the performance of different GNN architectures is dependent…

    Submitted 29 November, 2022; v1 submitted 19 September, 2022; originally announced September 2022.

  8. E-Scooter Rider Detection and Classification in Dense Urban Environments

    Authors: Shane Gilroy, Darragh Mullins, Edward Jones, Ashkan Parsi, Martin Glavin

    Abstract: Accurate detection and classification of vulnerable road users is a safety-critical requirement for the deployment of autonomous vehicles in heterogeneous traffic. Although similar in physical appearance to pedestrians, e-scooter riders follow distinctly different characteristics of movement and can reach speeds of up to 45 km/h. The challenge of detecting e-scooter riders is exacerbated in urban e…

    Submitted 20 May, 2022; originally announced May 2022.

  9. arXiv:2205.05412  [pdf, other]

    cs.CV

    An Objective Method for Pedestrian Occlusion Level Classification

    Authors: Shane Gilroy, Martin Glavin, Edward Jones, Darragh Mullins

    Abstract: Pedestrian detection is among the most safety-critical features of driver assistance systems for autonomous vehicles. One of the most complex detection challenges is that of partial occlusion, where a target object is only partially available to the sensor due to obstruction by another foreground object. A number of current pedestrian detection benchmarks provide annotation for partial occlusion t…

    Submitted 31 May, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

  10. The Impact of Partial Occlusion on Pedestrian Detectability

    Authors: Shane Gilroy, Darragh Mullins, Edward Jones, Ashkan Parsi, Martin Glavin

    Abstract: Robust detection of vulnerable road users is a safety-critical requirement for the deployment of autonomous vehicles in heterogeneous traffic. One of the most complex outstanding challenges is that of partial occlusion, where a target object is only partially available to the sensor due to obstruction by another foreground object. A number of leading pedestrian detection benchmarks provide annotati…

    Submitted 27 July, 2023; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: This research has been published under the title "Replacing the human driver: An objective benchmark for occluded pedestrian detection" in Biomimetic Intelligence and Robotics https://doi.org/10.1016/j.birob.2023.100115

    Journal ref: Biomimetic Intelligence and Robotics. 2023 Jul 18:100115

  11. arXiv:2101.08730  [pdf, other]

    physics.ins-det nucl-ex

    The design of the n2EDM experiment

    Authors: N. J. Ayres, G. Ban, L. Bienstman, G. Bison, K. Bodek, V. Bondar, T. Bouillaud, E. Chanel, J. Chen, P. -J. Chiu, B. Clément, C. Crawford, M. Daum, B. Dechenaux, C. B. Doorenbos, S. Emmenegger, L. Ferraris-Bouchez, M. Fertl, A. Fratangelo, P. Flaux, D. Goupillière, W. C. Griffith, Z. D. Grujic, P. G. Harris, K. Kirch, et al. (36 additional authors not shown)

    Abstract: We present the design of a next-generation experiment, n2EDM, currently under construction at the ultracold neutron source at the Paul Scherrer Institute (PSI) with the aim of carrying out a high-precision search for an electric dipole moment of the neutron. The project builds on experience gained with the previous apparatus operated at PSI until 2017, and is expected to deliver an order of magnit…

    Submitted 22 January, 2021; v1 submitted 21 January, 2021; originally announced January 2021.

    Journal ref: Eur. Phys. J. C 81, 512 (2021)

  12. arXiv:2007.15957  [pdf, other]

    quant-ph

    Using Reinforcement Learning to Perform Qubit Routing in Quantum Compilers

    Authors: Matteo G. Pozzi, Steven J. Herbert, Akash Sengupta, Robert D. Mullins

    Abstract: "Qubit routing" refers to the task of modifying quantum circuits so that they satisfy the connectivity constraints of a target quantum computer. This involves inserting SWAP gates into the circuit so that the logical gates only ever occur between adjacent physical qubits. The goal is to minimise the circuit depth added by the SWAP gates. In this paper, we propose a qubit routing procedure that u…

    Submitted 31 July, 2020; originally announced July 2020.

    Comments: 13 pages, 12 figures

  13. If your P value looks too good to be true, it probably is: Communicating reproducibility and variability in cell biology

    Authors: Samuel J. Lord, Katrina B. Velle, R. Dyche Mullins, Lillian K. Fritz-Laylin

    Abstract: The cell biology literature is littered with erroneously tiny P values, often the result of evaluating individual cells as independent samples. Because readers use P values and error bars to infer whether a reported difference would likely recur if the experiment were repeated, the sample size N used for statistical tests should actually be the number of times an experiment is performed, not the n…

    Submitted 20 December, 2019; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Modified Figure 1A to use the identical dataset as B-C. Included tutorial for making plots in R, Python, and Excel. Replaced on comparing biological vs technical replicates with expanded explanation of population sampling. Included discussion of estimation statistics and forest plots as a reasonable alternative to P values. Clarified the benefits of the P value, despite its flaws

    Journal ref: J. Cell. Biol. 219 (2020) e202001064

  14. Characterizing a Dramatic $ΔV\sim-9$ Flare on an Ultracool Dwarf Found by the ASAS-SN Survey

    Authors: Sarah J. Schmidt, Jose L. Prieto, K. Z. Stanek, Benjamin J. Shappee, Nidia Morrell, Daniella C. Bardalez Gagliuffi, C. S. Kochanek, J. Jencson, T. W-S. Holoien, U. Basu, John F. Beacom, D. M. Szczygiel, G. Pojmanski, J. Brimacombe, M. Dubberley, M. Elphick, S. Foale, E. Hawkins, D. Mullins, W. Rosing, R. Ross, Z. Walker

    Abstract: We analyze a $ΔV\sim-9$ magnitude flare on the newly identified M8 dwarf SDSS J022116.84+194020.4 (hereafter SDSSJ0221) detected as part of the All-Sky Automated Survey for Supernovae (ASAS-SN). Using infrared and optical spectra, we confirm that SDSSJ0221 is a relatively nearby (d$\sim$76 pc) M8 dwarf with strong quiescent H$α$ emission. Based on kinematics and the absence of features consistent…

    Submitted 21 November, 2013; v1 submitted 16 October, 2013; originally announced October 2013.

    Comments: Updated version in response to referee report. 6 pages, 3 figures, 2 tables. Submitted to ApJL. For a brief video explaining this paper, see http://youtu.be/uue8G0NnjJU

  15. arXiv:1310.2241  [pdf, ps, other]

    astro-ph.HE astro-ph.CO

    The Man Behind the Curtain: X-rays Drive the UV through NIR Variability in the 2013 AGN Outburst in NGC 2617

    Authors: B. J. Shappee, J. L. Prieto, D. Grupe, C. S. Kochanek, K. Z. Stanek, G. De Rosa, S. Mathur, Y. Zu, B. M. Peterson, R. W. Pogge, S. Komossa, M. Im, J. Jencson, T. W-S. Holoien, U. Basu, J. F. Beacom, D. M. Szczygiel, J. Brimacombe, S. Adams, A. Campillay, C. Choi, C. Contreras, M. Dietrich, M. Dubberley, M. Elphick, et al. (22 additional authors not shown)

    Abstract: After the All-Sky Automated Survey for SuperNovae (ASAS-SN) discovered a significant brightening of the inner region of NGC 2617, we began a ~70 day photometric and spectroscopic monitoring campaign from the X-ray through near-infrared (NIR) wavelengths. We report that NGC 2617 went through a dramatic outburst, during which its X-ray flux increased by over an order of magnitude followed by an incr…

    Submitted 26 June, 2014; v1 submitted 8 October, 2013; originally announced October 2013.

    Comments: 36 pages, 11 figures, 3 Tables. Accepted for publication in ApJ. Spectroscopic and photometric data presented in this submission are included as ancillary files. To see a video of the Swift UV/optical data see http://youtu.be/XuG6uhx-zs4 For a brief video explaining this paper, see http://youtu.be/W4RXTNHCh-g

  16. arXiv:1305.2437  [pdf, ps, other]

    astro-ph.IM

    Las Cumbres Observatory Global Telescope Network

    Authors: T. M. Brown, N. Baliber, F. B. Bianco, M. Bowman, B. Burleson, P. Conway, M. Crellin, É. Depagne, J. De Vera, B. Dilday, D. Dragomir, M. Dubberley, J. D. Eastman, M. Elphick, M. Falarski, S. Foale, M. Ford, B. J. Fulton, J. Garza, E. L. Gomez, M. Graham, R. Greene, B. Haldeman, E. Hawkins, B. Haworth, et al. (30 additional authors not shown)

    Abstract: Las Cumbres Observatory Global Telescope (LCOGT) is a young organization dedicated to time-domain observations at optical and (potentially) near-IR wavelengths. To this end, LCOGT is constructing a world-wide network of telescopes, including the two 2m Faulkes telescopes, as many as 17 x 1m telescopes, and as many as 23 x 40cm telescopes. These telescopes initially will be outfitted for imaging an…

    Submitted 29 July, 2013; v1 submitted 10 May, 2013; originally announced May 2013.

    Comments: 59 pages, 9 figures, 4 tables. AAS Latex v5.2. Accepted for publication in Pub. Astr. Soc. Pacific

  17. Quaternionic Formulation of the Dirac Equation

    Authors: Don Colladay, Patrick McDonald, David Mullins

    Abstract: The Dirac equation with Lorentz violation involves additional coefficients and yields a fourth-order polynomial that must be solved to yield the dispersion relation. The conventional method of taking the determinant of $4\times 4$ matrices of complex numbers often yields unwieldy dispersion relations. By using quaternions, the Dirac equation may be reduced to $2 \times 2$ form in which the structu…

    Submitted 6 August, 2010; originally announced August 2010.

    Comments: Presented at the Fifth Meeting on CPT and Lorentz Symmetry, Bloomington, Indiana, June 28-July 2, 2010

  18. Factoring the Dispersion Relation in the Presence of Lorentz Violation

    Authors: Don Colladay, Patrick McDonald, David Mullins

    Abstract: We produce an explicit formula for the dispersion relation for the Dirac Equation in the Standard Model Extension (SME) in the presence of Lorentz violation. Our expression is obtained using novel techniques which exploit the algebra of quaternions. The dispersion relation is found to conveniently factor in two special cases that each involve a mutually exclusive set of non-vanishing Lorentz-v…

    Submitted 21 January, 2010; originally announced January 2010.

    Comments: 15 pages