-
Energy-Aware Workflow Execution: An Overview of Techniques for Saving Energy and Emissions in Scientific Compute Clusters
Authors:
Lauritz Thamsen,
Yehia Elkhatib,
Paul Harvey,
Syed Waqar Nabi,
Jeremy Singer,
Wim Vanderbauwhede
Abstract:
Scientific research in many fields routinely requires the analysis of large datasets, and scientists often employ workflow systems to leverage clusters of computers for their data analysis. However, due to their size and scale, these workflow applications can have a considerable environmental footprint in terms of compute resource use, energy consumption, and carbon emissions. Mitigating this is critical in light of climate change and the urgent need to reduce carbon emissions.
In this chapter, we exemplify the problem by estimating the carbon footprint of three real-world scientific workflows from different scientific domains. We then describe techniques for reducing the energy consumption and, thereby, carbon footprint of individual workflow tasks and entire workflow applications, such as using energy-efficient heterogeneous architectures, generating optimised code, scaling processor voltages and frequencies, consolidating workloads on shared cluster nodes, and scheduling workloads for optimised energy efficiency.
Submitted 4 June, 2025;
originally announced June 2025.
-
The Development of Reflective Practice on a Work-Based Software Engineering Program: A Longitudinal Study
Authors:
Matthew Barr,
Syed Waqar Nabi,
Oana Andrei
Abstract:
This study examines the development of reflective practice among students on a four-year work-based Software Engineering program. Using two established models of reflection - Boud et al.'s Model of Reflective Process and Bain et al.'s 5R Framework for Reflection - we analyse a series of reflective assignments submitted by students over four years. Our longitudinal analysis reveals clear trends in how students' reflective abilities evolve over the course of the program. We find that more sophisticated forms of reflection, such as integration of knowledge, appropriation of skills, and reconstruction of practice, increase markedly in prevalence in later years. The complementary nature of workplace experience and university study is highlighted in students' reflections, demonstrating a key benefit of the work-based learning approach. By the final year, all students demonstrate the ability to reconstruct their experiences to inform future practice. Our findings provide insight into how reflective practice develops in Software Engineering education and suggest potential value in incorporating more structured reflection into traditional degree programs. The study also reveals instances of meta-reflection, where students reflect on the value of reflection itself, indicating a deep engagement with the reflective process. While acknowledging limitations, this work offers a unique longitudinal perspective on the development of reflective practice in work-based Software Engineering education.
Submitted 1 May, 2025; v1 submitted 29 April, 2025;
originally announced April 2025.
-
Towards Robust Legal Reasoning: Harnessing Logical LLMs in Law
Authors:
Manuj Kant,
Sareh Nabi,
Manav Kant,
Roland Scharrer,
Megan Ma,
Marzieh Nabi
Abstract:
Legal services rely heavily on text processing. While large language models (LLMs) show promise, their application in legal contexts demands higher accuracy, repeatability, and transparency. Logic programs, by encoding legal concepts as structured rules and facts, offer reliable automation, but require sophisticated text extraction. We propose a neuro-symbolic approach that integrates LLMs' natural language understanding with logic-based reasoning to address these limitations.
As a legal document case study, we applied neuro-symbolic AI to coverage-related queries in insurance contracts using both closed and open-source LLMs. While LLMs have improved in legal reasoning, they still lack the accuracy and consistency required for complex contract analysis. In our analysis, we tested three methodologies to evaluate whether a specific claim is covered under a contract: a vanilla LLM, an unguided approach that leverages LLMs to encode both the contract and the claim, and a guided approach that uses a framework for the LLM to encode the contract. We demonstrated the promising capabilities of LLM + Logic in the guided approach.
Submitted 24 February, 2025;
originally announced February 2025.
-
Dynamic Loop Fusion in High-Level Synthesis
Authors:
Robert Szafarczyk,
Syed Waqar Nabi,
Wim Vanderbauwhede
Abstract:
Dynamic High-Level Synthesis (HLS) uses additional hardware to perform memory disambiguation at runtime, increasing loop throughput in irregular codes compared to static HLS. However, most irregular codes consist of multiple sibling loops, which currently have to be executed sequentially by all HLS tools. Static HLS performs loop fusion only on regular codes, while dynamic HLS relies on loops with dependencies to run to completion before the next loop starts.
We present dynamic loop fusion for HLS, a compiler/hardware co-design approach that enables multiple loops to run in parallel, even if they contain unpredictable memory dependencies. Our only requirement is that memory addresses are monotonically non-decreasing in inner loops. We present a novel program-order schedule for HLS, inspired by polyhedral compilers, that together with our address monotonicity analysis enables dynamic memory disambiguation that does not require searching of address histories and sequential loop execution. Our evaluation shows an average speedup of 14$\times$ over static and 4$\times$ over dynamic HLS.
Submitted 24 January, 2025;
originally announced January 2025.
-
Compiler Support for Speculation in Decoupled Access/Execute Architectures
Authors:
Robert Szafarczyk,
Syed Waqar Nabi,
Wim Vanderbauwhede
Abstract:
Irregular codes are bottlenecked by memory and communication latency. Decoupled access/execute (DAE) is a common technique to tackle this problem. It relies on the compiler to separate memory address generation from the rest of the program; however, such a separation is not always possible due to control and data dependencies between the access and execute slices, resulting in a loss of decoupling.
In this paper, we present compiler support for speculation in DAE architectures that preserves decoupling in the face of control dependencies. We speculate memory requests in the access slice and poison mis-speculations in the execute slice without the need for replays or synchronization. Our transformation works on arbitrary, reducible control flow and is proven to preserve sequential consistency. We show that our approach applies to a wide range of architectural work on CPU/GPU prefetchers, CGRAs, and accelerators, enabling DAE on a wider range of codes than before.
Submitted 23 January, 2025;
originally announced January 2025.
-
Red is Sus: Automated Identification of Low-Quality Service Availability Claims in the US National Broadband Map
Authors:
Syed Tauhidun Nabi,
Zhuowei Wen,
Brooke Ritter,
Shaddi Hasan
Abstract:
The FCC's National Broadband Map aspires to provide an unprecedented view into broadband availability in the US. However, this map, which also determines eligibility for public grant funding, relies on self-reported data from service providers that in turn have incentives to strategically misrepresent their coverage. In this paper, we develop an approach for automatically identifying these low-quality service claims in the National Broadband Map. To do this, we develop a novel dataset of broadband availability consisting of 750k observations from more than 900 US ISPs, derived from a combination of regulatory data and crowdsourced speed tests. Using this dataset, we develop a model to classify the accuracy of service provider regulatory filings and achieve AUCs over 0.98 for unseen examples. Our approach provides an effective technique to enable policymakers, civil society, and the public to identify portions of the National Broadband Map that are likely to have integrity challenges.
Submitted 16 October, 2024; v1 submitted 11 October, 2024;
originally announced October 2024.
-
Continuum excitations in a spin-supersolid on a triangular lattice
Authors:
M. Zhu,
V. Romerio,
N. Steiger,
S. D. Nabi,
N. Murai,
S. Ohira-Kawamura,
K. Yu. Povarov,
Y. Skourski,
R. Sibille,
L. Keller,
Z. Yan,
S. Gvasaliya,
A. Zheludev
Abstract:
Magnetic and thermodynamic measurements, neutron diffraction, and inelastic neutron scattering are used to study spin correlations in the easy-axis XXZ triangular lattice magnet K2Co(SeO3)2. Despite the presence of quasi-2D "supersolid" magnetic order, the low-energy excitation spectrum contains no sharp modes and is instead a broad and structured multi-particle continuum. Applying a weak magnetic field drives the system into an m = 1/3 fractional magnetization plateau phase and restores sharp spin wave modes. To some extent, the behavior at zero field can be understood in terms of spin wave decay. However, the presence of clear excitation minima at the M-points of the Brillouin zone suggests that the spinon language may provide a more adequate description, and signals a possible proximity to a Dirac spin liquid state.
Submitted 20 August, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
Magnetic field-induced phases and spin Hamiltonian in Cs2CoBr4
Authors:
L. Facheris,
S. D. Nabi,
K. Yu. Povarov,
Z. Yan,
A. Glezer Moshe,
U. Nagel,
T. Rõõm,
A. Podlesnyak,
E. Ressouche,
K. Beauvois,
J. R. Stewart,
P. Manuel,
D. Khalyavin,
F. Orlandi,
A. Zheludev
Abstract:
Magnetic structures and spin excitations are studied across the phase diagram of the geometrically frustrated S = 3/2 quantum antiferromagnet Cs2CoBr4 in magnetic fields applied along the magnetic easy axis, using neutron diffraction, inelastic neutron scattering and THz absorption spectroscopy. The data are analyzed, where appropriate, using extended SU(4) linear spin wave theory. A minimal magnetic Hamiltonian is proposed based on measurements in the high-field polarized state. It deviates considerably from the previously considered models. Additional dilatometry experiments highlight the importance of magnetoelastic coupling in this system.
Submitted 14 March, 2024; v1 submitted 17 November, 2023;
originally announced November 2023.
-
A High-Frequency Load-Store Queue with Speculative Allocations for High-Level Synthesis
Authors:
Robert Szafarczyk,
Syed Waqar Nabi,
Wim Vanderbauwhede
Abstract:
Dynamically scheduled high-level synthesis (HLS) enables the use of load-store queues (LSQs), which can disambiguate data hazards at circuit runtime, increasing throughput in codes with unpredictable memory accesses. However, the increased throughput comes at the price of lower clock frequency and higher resource usage compared to statically scheduled circuits without LSQs. The lower frequency often nullifies any throughput improvements over static scheduling, while the resource usage becomes prohibitively expensive with large queue sizes. This paper presents a method for achieving dynamically scheduled memory operations in HLS without a significant increase in clock period or resource usage. We present a novel LSQ based on shift registers, enabled by the opportunity to specialize queue sizes to a target code in HLS. We show a method to speculatively allocate addresses to our LSQ, significantly increasing pipeline parallelism in codes that could not benefit from an LSQ before. In stark contrast to traditional load value speculation, we do not require pipeline replays and have no overhead on misspeculation. On a set of benchmarks with data hazards, our approach achieves an average speedup of 11$\times$ against static HLS and 5$\times$ against dynamic HLS that uses a state-of-the-art LSQ from previous work. Our LSQ also uses several times fewer resources, scaling to queues with hundreds of entries, and supports both on-chip and off-chip memory.
Submitted 14 November, 2023;
originally announced November 2023.
-
AI (r)evolution -- where are we heading? Thoughts about the future of music and sound technologies in the era of deep learning
Authors:
Giovanni Bindi,
Nils Demerlé,
Rodrigo Diaz,
David Genova,
Aliénor Golvet,
Ben Hayes,
Jiawen Huang,
Lele Liu,
Vincent Martos,
Sarah Nabi,
Teresa Pelinski,
Lenny Renault,
Saurjya Sarkar,
Pedro Sarmento,
Cyrus Vahidi,
Lewis Wolstanholme,
Yixiao Zhang,
Axel Roebel,
Nick Bryan-Kinns,
Jean-Louis Giavitto,
Mathieu Barthet
Abstract:
Artificial Intelligence (AI) technologies such as deep learning are evolving very quickly, bringing many changes to our everyday lives. To explore the future impact and potential of AI in the field of music and sound technologies, a doctoral day was held between Queen Mary University of London (QMUL, UK) and Sciences et Technologies de la Musique et du Son (STMS, France). Prompt questions about current trends in AI and music were generated by academics from QMUL and STMS. Students from the two institutions then debated these questions. This report presents a summary of the student debates on the topics of: Data, Impact, and the Environment; Responsible Innovation and Creative Practice; Creativity and Bias; and From Tools to the Singularity. The students represent the future generation of AI and music researchers. The academics represent the incumbent establishment. The student debates reported here capture visions, dreams, concerns, uncertainties, and contentious issues for the future of AI and music as the establishment is rightfully challenged by the next generation.
Submitted 20 September, 2023;
originally announced October 2023.
-
Compiler Discovered Dynamic Scheduling of Irregular Code in High-Level Synthesis
Authors:
Robert Szafarczyk,
Syed Waqar Nabi,
Wim Vanderbauwhede
Abstract:
Dynamically scheduled high-level synthesis (HLS) achieves higher throughput than static HLS for codes with unpredictable memory accesses and control flow. However, excessive dataflow scheduling results in circuits that use more resources and have a slower critical path, even when only a part of the circuit exhibits dynamic behavior. Recent work has shown that marking parts of a dataflow circuit for static scheduling can save resources and improve performance (hybrid scheduling), but the dynamic part of the circuit still bottlenecks the critical path. We propose instead to selectively introduce dynamic scheduling into static HLS. This paper presents an algorithm for identifying code regions amenable to dynamic scheduling and shows a methodology for introducing dynamically scheduled basic blocks, loops, and memory operations into static HLS. Our algorithm is informed by modulo-scheduling and can be integrated into any modulo-scheduled HLS tool. On a set of ten benchmarks, we show that our approach achieves on average an up to 3.7$\times$ and 3$\times$ speedup against dynamic and hybrid scheduling, respectively, with an area overhead of 1.3$\times$ and frequency degradation of 0.74$\times$ when compared to static HLS.
Submitted 29 August, 2023;
originally announced August 2023.
-
Advancing Ad Auction Realism: Practical Insights & Modeling Implications
Authors:
Ming Chen,
Sareh Nabi,
Marciano Siniscalchi
Abstract:
Contemporary real-world online ad auctions differ from canonical models [Edelman et al., 2007; Varian, 2009] in at least four ways: (1) values and click-through rates can depend upon users' search queries, but advertisers can only partially "tune" their bids to specific queries; (2) advertisers do not know the number, identity, and precise value distribution of competing bidders; (3) advertisers only receive partial, aggregated feedback; and (4) payment rules are only partially known to bidders. These features make it virtually impossible to fully characterize equilibrium bidding behavior. This paper shows that, nevertheless, one can still gain useful insight into modern ad auctions by modeling advertisers as agents governed by an adversarial bandit algorithm, independent of auction mechanism intricacies. To demonstrate our approach, we first simulate "soft-floor" auctions [Zeithammer, 2019], a complex, real-world pricing rule for which no complete equilibrium characterization is known. We find that (i) when values and click-through rates are query-dependent, soft floors can improve revenues relative to standard auction formats even if bidder types are drawn from the same distribution; and (ii) with distributional asymmetries that reflect relevant real-world scenarios, soft floors yield lower revenues than suitably chosen reserve prices, even restricting attention to a single query. We then demonstrate how to infer advertiser value distributions from observed bids for a variety of pricing rules, and illustrate our approach with aggregate data from an e-commerce website.
Submitted 9 April, 2024; v1 submitted 21 July, 2023;
originally announced July 2023.
-
MESOB: Balancing Equilibria & Social Optimality
Authors:
Xin Guo,
Lihong Li,
Sareh Nabi,
Rabih Salhab,
Junzi Zhang
Abstract:
Motivated by bid recommendation in online ad auctions, this paper considers a general class of multi-level and multi-agent games, with two major characteristics: one is a large number of anonymous agents, and the other is the intricate interplay between competition and cooperation. To model such complex systems, we propose a novel and tractable bi-objective optimization formulation with mean-field approximation, called MESOB (Mean-field Equilibria & Social Optimality Balancing), as well as an associated occupation measure optimization (OMO) method called MESOB-OMO to solve it. MESOB-OMO enables obtaining approximately Pareto efficient solutions in terms of the dual objectives of competition and cooperation in MESOB, and in particular allows for Nash equilibrium selection and social equalization in an asymptotic manner. We apply MESOB-OMO to bid recommendation in a simulated pay-per-click ad auction. Experiments demonstrate its efficacy in balancing the interests of different parties and in handling the competitive nature of bidders, as well as its advantages over baselines that only consider either the competitive or the cooperative aspects.
Submitted 15 July, 2023;
originally announced July 2023.
-
Continuous descriptor-based control for deep audio synthesis
Authors:
Ninon Devis,
Nils Demerlé,
Sarah Nabi,
David Genova,
Philippe Esling
Abstract:
Despite significant advances in deep models for music generation, the use of these techniques remains restricted to expert users. Before being democratized among musicians, generative models must first provide expressive control over the generation, as this conditions the integration of deep generative models in creative workflows. In this paper, we tackle this issue by introducing a deep generative audio model providing expressive and continuous descriptor-based control, while remaining lightweight enough to be embedded in a hardware synthesizer. We enforce the controllability of real-time generation by explicitly removing salient musical features in the latent space using an adversarial confusion criterion. User-specified features are then reintroduced as additional conditioning information, allowing for continuous control of the generation, akin to a synthesizer knob. We assess the performance of our method on a wide variety of sounds including instrumental, percussive and speech recordings while providing both timbre and attributes transfer, allowing new ways of generating sounds.
Submitted 27 February, 2023;
originally announced February 2023.
-
Confinement of fractional excitations in a triangular lattice antiferromagnet
Authors:
L. Facheris,
S. D. Nabi,
A. Glezer Moshe,
U. Nagel,
T. Rõõm,
K. Yu. Povarov,
J. R. Stewart,
Z. Yan,
A. Zheludev
Abstract:
High-resolution neutron and THz spectroscopies are used to study the magnetic excitation spectrum of Cs$_2$CoBr$_4$, a distorted-triangular-lattice antiferromagnet with nearly XY-type anisotropy. What was previously thought of as a broad excitation continuum [Phys. Rev. Lett. 129, 087201 (2022)] is shown to be a series of dispersive bound states reminiscent of "Zeeman ladders" in quasi-one-dimensional Ising systems. At wave vectors where inter-chain interactions cancel at the Mean Field level, they can indeed be interpreted as bound finite-width kinks in individual chains. Elsewhere in the Brillouin zone their true two-dimensional structure and propagation are revealed.
Submitted 23 June, 2023; v1 submitted 31 January, 2023;
originally announced January 2023.
-
Data-driven control of COVID-19 in buildings: a reinforcement-learning approach
Authors:
Ashkan Haji Hosseinloo,
Saleh Nabi,
Anette Hosoi,
Munther A. Dahleh
Abstract:
In addition to its public health crisis, the COVID-19 pandemic has led to the shutdown and closure of workplaces with an estimated total cost of more than $16 trillion. Given the long hours an average person spends in buildings and indoor environments, this research article proposes data-driven control strategies to design optimal indoor airflow to minimize the exposure of occupants to viral pathogens in built environments. A general control framework is put forward for designing an optimal velocity field, and proximal policy optimization, a reinforcement learning algorithm, is employed to solve the control problem in a data-driven fashion. The same framework is used for optimal placement of disinfectants to neutralize the viral pathogens as an alternative to the airflow design when the latter is practically infeasible or hard to implement. We show, via simulation experiments, that the control agent learns the optimal policy in both scenarios within a reasonable time. The proposed data-driven control framework in this study will have significant societal and economic benefits by setting the foundation for an improved methodology in designing case-specific infection control guidelines that can be realized by affordable ventilation devices and disinfectants.
Submitted 27 December, 2022;
originally announced December 2022.
-
Physics-Informed Koopman Network
Authors:
Yuying Liu,
Aleksei Sholokhov,
Hassan Mansour,
Saleh Nabi
Abstract:
Koopman operator theory is receiving increased attention due to its promise to linearize nonlinear dynamics. Neural networks that are developed to represent Koopman operators have shown great success thanks to their ability to approximate arbitrarily complex functions. However, despite their great potential, they typically require large training data-sets either from measurements of a real system or from high-fidelity simulations. In this work, we propose a novel architecture inspired by physics-informed neural networks, which leverage automatic differentiation to impose the underlying physical laws via soft penalty constraints during model training. We demonstrate that it not only reduces the need for large training data-sets, but also maintains high effectiveness in approximating Koopman eigenfunctions.
Submitted 17 November, 2022;
originally announced November 2022.
-
Critical and Topological Phases of Dimerized Kitaev Chain in Presence of Quasiperiodic Potential
Authors:
Shilpi Roy,
Sk Noor Nabi,
Saurabh Basu
Abstract:
We investigate localization and topological properties of a dimerized Kitaev chain with p-wave superconducting correlations and a quasiperiodically modulated chemical potential. With regard to the localization studies, we demonstrate the existence of distinct phases, such as the extended phase, the critical (intermediate) phase, and the localized phase, that arise due to the competition between the dimerization and the onsite quasiperiodic potential. Most interestingly, the critical phase comprises two different mobility edges that are found to exist between the extended and the localized phases, and between the critical (multifractal) and localized phases. We perform our analysis employing the inverse and the normalized participation ratios, fractal dimension, and the level spacing. Subsequently, a finite-size analysis is done to support our findings. Furthermore, we study the topological properties of the zero-energy edge modes by computing the real-space winding number and the number of Majorana zero modes present in the system. We specifically illustrate that our model exhibits a phase transition from a topologically trivial to a non-trivial phase (topological Anderson phase) beyond a critical dimerization strength under the influence of the quasiperiodic potential strength. Finally, in the presence of a large potential, we demonstrate that the system undergoes yet another transition from the topologically non-trivial to an Anderson localized phase. Thus, we believe that our results will aid exploration of fundamentally different physics pertaining to the critical and the topological Anderson phases.
△ Less
Submitted 13 September, 2022;
originally announced September 2022.
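The abstract above names the inverse and normalized participation ratios as localization diagnostics but gives no formulas. The following is a minimal sketch, assuming the commonly used definitions IPR = sum_i |psi_i|^4 and NPR = 1 / (L * IPR) for a normalized state on L sites. The function name `participation_ratios` is hypothetical and this is not the authors' code.

```python
import numpy as np

def participation_ratios(psi):
    """Return (IPR, NPR) for a single-particle state psi on L lattice sites.

    Assumed definitions (not taken from the paper):
      IPR = sum_i |psi_i|^4   (order 1/L if extended, order 1 if localized)
      NPR = 1 / (L * IPR)     (order 1 if extended, order 1/L if localized)
    """
    psi = np.asarray(psi, dtype=complex)
    prob = np.abs(psi) ** 2
    prob = prob / prob.sum()          # normalize so the probabilities sum to 1
    ipr = float(np.sum(prob ** 2))
    npr = 1.0 / (prob.size * ipr)
    return ipr, npr

# A uniform (fully extended) state and a single-site (fully localized) state.
L = 100
extended = np.ones(L)
localized = np.zeros(L)
localized[0] = 1.0
print(participation_ratios(extended))
print(participation_ratios(localized))
```

Under these definitions an extended state has vanishing IPR and finite NPR as L grows, a localized state has the reverse, and a critical state would be expected to sit between the two.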
-
Spin Density Wave versus Fractional Magnetization Plateau in a Triangular Antiferromagnet
Authors:
L. Facheris,
K. Yu. Povarov,
S. D. Nabi,
D. G. Mazzone,
J. Lass,
B. Roessli,
E. Ressouche,
Z. Yan,
S. Gvasaliya,
A. Zheludev
Abstract:
We report an excellent realization of the highly non-classical incommensurate spin-density wave (SDW) state in the quantum frustrated antiferromagnetic insulator Cs$_2$CoBr$_4$. In contrast to the well-known Ising spin chain case, here the SDW is stabilized by virtue of competing planar in-chain anisotropies and frustrated interchain exchange. Adjacent to the SDW phase is a broad $m = 1/3$ magnetization plateau that can be seen as a commensurate locking of the SDW state into the up-up-down (UUD) spin structure. This represents the first example of long-sought SDW-UUD transition in triangular-type quantum magnets.
Submitted 18 August, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Phase Properties of Interacting Bosons in Presence of Quasiperiodic and Random Disorder
Authors:
Sk Noor Nabi,
Shilpi Roy,
Saurabh Basu
Abstract:
Motivated by two different types of disorder that occur ubiquitously in quantum systems, namely the random and the quasiperiodic (QP) disorder, we have performed a systematic comparison of the emerging phase properties corresponding to these two cases for a system of interacting bosons in a two-dimensional square lattice. Such a comparison is imperative as a random disorder at each lattice site is completely uncorrelated, while a quasiperiodic disorder is deterministic in nature. Using a site-decoupled mean-field approximation followed by a percolation analysis on a Bose-Hubbard model, several different phases are realized, such as the familiar Bose-glass (BG), Mott insulator (MI), and superfluid (SF) phases; additionally, we observe a mixed phase, specific to the QP disorder, which we call the QM phase. Incidentally, the QP disorder stabilizes the BG phase more efficiently than random disorder does. Further, we have employed a finite-size scaling analysis to characterize various phase transitions by computing the critical transition points and the corresponding critical exponents. The results show that for both types of disorder, the transition from the BG phase to the SF phase belongs to the same universality class. However, the QM to SF transition for the QP disorder has different critical exponents, thereby hinting at the involvement of a different universality class therein. The critical exponents that characterize the various phase transitions occurring as a function of the disorder strength are found to be in good agreement with the quantum Monte Carlo results available in the literature.
Submitted 28 March, 2022;
originally announced March 2022.
-
Optimal control of PDEs using physics-informed neural networks
Authors:
Saviz Mowlavi,
Saleh Nabi
Abstract:
Physics-informed neural networks (PINNs) have recently become a popular method for solving forward and inverse problems governed by partial differential equations (PDEs). By incorporating the residual of the PDE into the loss function of a neural network-based surrogate model for the unknown state, PINNs can seamlessly blend measurement data with physical constraints. Here, we extend this framework to PDE-constrained optimal control problems, for which the governing PDE is fully known and the goal is to find a control variable that minimizes a desired cost objective. We provide a set of guidelines for obtaining a good optimal control solution; first by selecting an appropriate PINN architecture and training parameters based on a forward problem, second by choosing the best value for a critical scalar weight in the loss function using a simple but effective two-step line search strategy. We then validate the performance of the PINN framework by comparing it to adjoint-based nonlinear optimal control, which performs gradient descent on the discretized control variable while satisfying the discretized PDE. This comparison is carried out on several distributed control examples based on the Laplace, Burgers, Kuramoto-Sivashinsky, and Navier-Stokes equations. Finally, we discuss the advantages and caveats of using the PINN and adjoint-based approaches for solving optimal control problems constrained by nonlinear PDEs.
Submitted 3 November, 2022; v1 submitted 18 November, 2021;
originally announced November 2021.
-
Quasiparticle trapping by orbital effect in a hybrid superconducting-semiconducting circuit
Authors:
Willemijn Uilhoorn,
James G. Kroll,
Arno Bargerbos,
Syed D. Nabi,
Chung-Kai Yang,
Peter Krogstrup,
Leo P. Kouwenhoven,
Angela Kou,
Gijs de Lange
Abstract:
The tunneling of quasiparticles (QPs) across Josephson junctions (JJs) detrimentally affects the coherence of superconducting and charge-parity qubits, and is shown to occur more frequently in magnetic fields. Here we demonstrate the parity lifetime to survive in excess of 50$\,\mathrm{\mu}$s in magnetic fields up to 1$\,$T, utilising a semiconducting nanowire transmon to detect QP tunneling in real time. We exploit gate-tunable QP filters and find magnetic-field-enhanced parity lifetimes, consistent with increased QP trapping by the ungated nanowire due to orbital effects. Our findings highlight the importance of QP trap engineering for building magnetic-field compatible hybrid superconducting circuits.
Submitted 23 May, 2021;
originally announced May 2021.
-
Bayesian Meta-Prior Learning Using Empirical Bayes
Authors:
Sareh Nabi,
Houssam Nassif,
Joseph Hong,
Hamed Mamani,
Guido Imbens
Abstract:
Adding domain knowledge to a learning system is known to improve results. In multi-parameter Bayesian frameworks, such knowledge is incorporated as a prior. On the other hand, various model parameters can have different learning rates in real-world problems, especially with skewed data. Two often-faced challenges in Operation Management and Management Science applications are the absence of informative priors, and the inability to control parameter learning rates. In this study, we propose a hierarchical Empirical Bayes approach that addresses both challenges, and that can generalize to any Bayesian framework. Our method learns empirical meta-priors from the data itself and uses them to decouple the learning rates of first-order and second-order features (or any other given feature grouping) in a Generalized Linear Model. As the first-order features are likely to have a more pronounced effect on the outcome, focusing on learning first-order weights first is likely to improve performance and convergence time. Our Empirical Bayes method clamps features in each group together and uses the deployed model's observed data to empirically compute a hierarchical prior in hindsight. We report theoretical results for the unbiasedness, strong consistency, and optimal frequentist cumulative regret properties of our meta-prior variance estimator. We apply our method to a standard supervised learning optimization problem, as well as an online combinatorial optimization problem in a contextual bandit setting implemented in an Amazon production system. Both during simulations and live experiments, our method shows marked improvements, especially in cases of small traffic. Our findings are promising, as optimizing over sparse data is often a challenge.
Submitted 12 July, 2021; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Towards Automatic Transformation of Legacy Scientific Code into OpenCL for Optimal Performance on FPGAs
Authors:
Wim Vanderbauwhede,
Syed Waqar Nabi
Abstract:
There is a large body of legacy scientific code written in languages like Fortran that is not optimised to get the best performance out of heterogeneous acceleration devices like GPUs and FPGAs, and manually porting such code into parallel language frameworks like OpenCL requires considerable effort. We are working towards developing a turn-key, self-optimising compiler for accelerating scientific applications that can automatically transform legacy code into a solution for heterogeneous targets. In this paper we focus on FPGAs as the acceleration devices, and carry out our discussion in the context of the OpenCL programming framework. We show a route to automatic creation of kernels which are optimised for execution in a "streaming" fashion, which gives optimal performance on FPGAs. We use a 2D shallow-water model as an illustration; specifically, we show how the use of channels to communicate directly between peer kernels and the use of on-chip memory to create stencil buffers can lead to significant performance improvements. Our results show better FPGA performance against a baseline CPU implementation, and better energy-efficiency against both CPU and GPU implementations.
Submitted 24 January, 2019; v1 submitted 21 December, 2018;
originally announced January 2019.
-
Effects of an attractive three body interaction on a spin-1 Bose Hubbard model
Authors:
Sk Noor Nabi,
Saurabh Basu
Abstract:
We study the effects of an attractive three body interaction potential on a spin-1 ultracold Bose gas using a mean field approach (MFA). For an antiferromagnetic (AF) interaction, the third MI lobe is predominantly affected, where it completely engulfs the second and the fourth MI lobes at large values of the interaction strength. However, no significant change is observed beyond the fourth MI lobe. The formation of the spin singlet (nematic) MI phase and the different orders of the phase transitions to the SF phase have been carefully scrutinized with the help of the spin eigenvalues and the spin nematic order parameter. In the ferromagnetic case, the phase diagram shows features similar to those of a scalar Bose gas. We have compared our results on the MFA phase diagrams for both types of the interaction potential via a perturbation expansion in both cases.
Submitted 28 June, 2018;
originally announced June 2018.
-
Reinforcement Learning with Function-Valued Action Spaces for Partial Differential Equation Control
Authors:
Yangchen Pan,
Amir-massoud Farahmand,
Martha White,
Saleh Nabi,
Piyush Grover,
Daniel Nikovski
Abstract:
Recent work has shown that reinforcement learning (RL) is a promising approach to control dynamical systems described by partial differential equations (PDE). This paper shows how to use RL to tackle more general PDE control problems that have continuous high-dimensional action spaces with spatial relationship among action dimensions. In particular, we propose the concept of action descriptors, which encode regularities among spatially-extended action dimensions and enable the agent to control high-dimensional action PDEs. We provide theoretical evidence suggesting that this approach can be more sample efficient compared to a conventional approach that treats each action dimension separately and does not explicitly exploit the spatial regularity of the action space. The action descriptor approach is then used within the deep deterministic policy gradient algorithm. Experiments on two PDE control problems, with up to 256-dimensional continuous actions, show the advantage of the proposed approach over the conventional one.
Submitted 12 June, 2018;
originally announced June 2018.
-
Reduced-order modeling of fully turbulent buoyancy-driven flows using the Green's function method
Authors:
M. A. Khodkar,
Pedram Hassanzadeh,
Saleh Nabi,
Piyush Grover
Abstract:
A One-Dimensional (1D) Reduced-Order Model (ROM) has been developed for a 3D Rayleigh-Bénard convection system in the turbulent regime with Rayleigh number $\mathrm{Ra}=10^6$. The state vector of the 1D ROM is horizontally averaged temperature. Using the Green's Function (GRF) method, which involves applying many localized, weak forcings to the system one at a time and calculating the responses using long-time averaged Direct Numerical Simulations (DNS), the system's Linear Response Function (LRF) has been computed. Another matrix, called the Eddy Flux Matrix (EFM), that relates changes in the divergence of vertical eddy heat fluxes to changes in the state vector, has also been calculated. Using various tests, it is shown that the LRF and EFM can accurately predict the time-mean responses of temperature and eddy heat flux to external forcings, and that the LRF can well predict the forcing needed to change the mean flow in a specified way (inverse problem). The non-normality of the LRF is discussed and its eigen/singular vectors are compared with the leading Proper Orthogonal Decomposition (POD) modes of the DNS data. Furthermore, it is shown that if the LRF and EFM are simply scaled by the square-root of Rayleigh number, they perform equally well for flows at other $\mathrm{Ra}$, at least in the investigated range of $ 5 \times 10^5 \le \mathrm{Ra} \le 1.25 \times 10^6$. The GRF method can be applied to develop 1D or 3D ROMs for any turbulent flow, and the calculated LRF and EFM can help with better analyzing and controlling the nonlinear system.
Submitted 4 December, 2018; v1 submitted 3 May, 2018;
originally announced May 2018.
-
Quantum phases of a spin-1 ultracold Bose gas with three body interactions
Authors:
Sk Noor Nabi,
Saurabh Basu
Abstract:
We study the effects of both a repulsive and an attractive three body interaction potential on a spin-1 ultracold Bose gas using a mean field approach (MFA). For an antiferromagnetic (AF) interaction, we have found the existence of the odd-even asymmetry in the Mott insulating (MI) lobes in the presence of both the repulsive two and three body interactions. In the case of a purely three body repulsive interaction, the higher order MI lobes stabilize against the superfluid phase. However, the spin nematic (singlet) formation is restricted up to the first (second) MI lobes for the former, while neither asymmetry nor spin nematic (singlet) formation is observed for the latter case. The results are confirmed after carefully scrutinizing the spin eigenvalue and the spin nematic order parameter for both cases. On the other hand, for an attractive three body interaction, the third MI lobe is predominantly affected, where it completely engulfs the second and the fourth MI lobes at large values of the interaction strength. However, no significant change is observed beyond the fourth MI lobe. In the ferromagnetic case, the phase diagram shows features similar to those of a scalar Bose gas. We have compared our results on the MFA phase diagrams for both types of the interaction potential via a perturbation expansion in both cases.
Submitted 12 November, 2017;
originally announced November 2017.
-
Spin-1 Bose Hubbard model with nearest neighbour extended interaction
Authors:
Sk Noor Nabi,
Saurabh Basu
Abstract:
We have studied a spinor (F = 1) Bose gas in the presence of the density-density interaction through the mean field approach and perturbation theory for either sign of the spin dependent interaction, namely the antiferromagnetic (AF) and the ferromagnetic cases. In the AF case, the charge density wave (CDW) phase appears to be sandwiched between the Mott insulating (MI) and the supersolid phases for small values of the extended interaction strength. But the CDW phase completely occupies the MI lobe when the extended interaction strength is larger than a certain critical value related to the width of the MI lobes, and hence opens up the possibilities of spin singlet and nematic CDW insulating phases. In the ferromagnetic case, the phase diagram shows features similar to those of the AF case and is in complete agreement with that of a spin-0 Bose gas. The perturbation expansion calculations nicely corroborate the mean field phase results in both these cases. Further, we extend our calculations to the presence of a harmonic confinement and obtain the momentum distribution profile, which is related to the absorption spectra, in order to distinguish between different phases.
Submitted 1 May, 2017;
originally announced May 2017.
-
Spin-1 bosons in an external magnetic field and a three body interaction potential
Authors:
Sk Noor Nabi,
Saurabh Basu
Abstract:
We perform a thorough study of the effect of an external magnetic field on a spin-1 ultracold Bose gas via a mean field approach for both signs of the spin dependent interaction. In contrast to some of the earlier studies, the magnetic field in our work is included through both the hopping frequencies (via Peierls coupling) and the Zeeman interaction, thereby facilitating an exploration of the competition between the two. The phase diagrams in the antiferromagnetic case show that the Mott insulating (MI) phase with even particle occupancies is stable at low magnetic fields. At higher magnetic fields, due to a competition between the hopping and the Zeeman interaction terms, the latter tries to destabilize the MI phase by suppressing the formation of singlet pairs, while the former tends to stabilize the MI phase. In the ferromagnetic case, the MI lobes become more stable with increasing flux strengths. Further, on including a three body interaction potential in order to ascertain its role in the phase diagram, we find that in the absence of the magnetic field, the MI lobes become more stable compared to the superfluid (SF) phase and the location of the transition point for the MI-SF phase increases with increasing three body interaction strength. A strong coupling perturbative calculation has also been done to provide a comparison with our mean field phase diagrams. Lastly, with inclusion of the external field, the insulating phases are found to be further stabilized by the three body interaction potential.
Submitted 14 June, 2016; v1 submitted 9 June, 2016;
originally announced June 2016.
-
Sparse sensing and DMD based identification of flow regimes and bifurcations in complex flows
Authors:
Boris Kramer,
Piyush Grover,
Petros Boufounos,
Mouhacine Benosman,
Saleh Nabi
Abstract:
We present a sparse sensing framework based on Dynamic Mode Decomposition (DMD) to identify flow regimes and bifurcations in large-scale thermo-fluid systems. Motivated by real-time sensing and control of thermal-fluid flows in buildings and equipment, we apply this method to a Direct Numerical Simulation (DNS) data set of a 2D laterally heated cavity. The resulting flow solutions can be divided into several regimes, ranging from steady to chaotic flow. The DMD modes and eigenvalues capture the main temporal and spatial scales in the dynamics belonging to different regimes. Our proposed classification method is data-driven, robust w.r.t measurement noise, and exploits the dynamics extracted from the DMD method. Namely, we construct an augmented DMD basis, with "built-in" dynamics, given by the DMD eigenvalues. This allows us to employ a short time-series of data from sensors, to more robustly classify flow regimes, particularly in the presence of measurement noise. We also exploit the incoherence exhibited among the data generated by different regimes, which persists even if the number of measurements is small compared to the dimension of the DNS data. The data-driven regime identification algorithm can enable robust low-order modeling of flows for state estimation and control.
Submitted 22 August, 2016; v1 submitted 9 October, 2015;
originally announced October 2015.
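The abstract above builds on Dynamic Mode Decomposition, which extracts modes and eigenvalues from a sequence of snapshots. The following is a minimal sketch of the standard SVD-based ("exact") DMD recipe, assuming snapshots are stored as columns of a matrix and a truncation rank `r` is chosen by hand. The function name `dmd` is hypothetical, and this is not the augmented-basis regime classifier described in the paper.

```python
import numpy as np

def dmd(X, r):
    """Rank-r exact DMD of a snapshot matrix X (one snapshot per column).

    Fits a linear map A with X[:, k+1] ~ A @ X[:, k] and returns the
    eigenvalues and modes of its rank-r approximation.
    """
    X1, X2 = X[:, :-1], X[:, 1:]
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)
    U, s, Vh = U[:, :r], s[:r], Vh[:r]
    # X2 @ V @ S^{-1}: dividing by s scales column j by 1 / s[j].
    X2_V_Sinv = X2 @ Vh.conj().T / s
    A_tilde = U.conj().T @ X2_V_Sinv      # A projected onto the leading r POD modes
    eigvals, W = np.linalg.eig(A_tilde)
    modes = X2_V_Sinv @ W                 # exact DMD modes
    return eigvals, modes

# Toy data: snapshots of x_{k+1} = A x_k with known eigenvalues 0.9 and 0.5.
A = np.diag([0.9, 0.5])
X = np.stack([np.linalg.matrix_power(A, k) @ np.ones(2) for k in range(10)], axis=1)
eigvals, modes = dmd(X, r=2)
print(np.sort(eigvals.real))
```

On this noise-free toy system the recovered eigenvalues should match those of `A`. With noisy sensor data the choice of `r` matters, which is one reason the abstract stresses robustness to measurement noise.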
-
Percolation analysis of a disordered spinor Bose gas
Authors:
Sk Noor Nabi,
Saurabh Basu
Abstract:
We study the effects of an on-site disorder potential in a gas of spinor (spin-1) ultracold atoms loaded in an optical lattice corresponding to both ferromagnetic and antiferromagnetic spin dependent interactions. Starting with a disordered spinor Bose-Hubbard model (SBHM) on a two dimensional square lattice, we observe the appearance of a Bose glass phase using the fraction of the lattice sites having finite superfluid order parameter and non integer local densities as an indicator. A precise distinction between three different types of phases namely, superfluid (SF), Mott insulator (MI) and Bose glass (BG) is done via a percolation analysis thereby demonstrating that a reliable enumeration of phases is possible at particular values of the parameters of the SBHM. Finally we present the phase diagram based on the above information for both antiferromagnetic and ferromagnetic interactions.
Submitted 20 April, 2015;
originally announced April 2015.
-
A Reconfigurable Vector Instruction Processor for Accelerating a Convection Parametrization Model on FPGAs
Authors:
Syed Waqar Nabi,
Saji N. Hameed,
Wim Vanderbauwhede
Abstract:
High Performance Computing (HPC) platforms allow scientists to model computationally intensive algorithms. HPC clusters increasingly use General-Purpose Graphics Processing Units (GPGPUs) as accelerators; FPGAs provide an attractive alternative to GPGPUs for use as co-processors, but they are still far from being mainstream due to a number of challenges faced when using FPGA-based platforms. Our research aims to make FPGA-based high performance computing more accessible to the scientific community. In this work we present the results of investigating the acceleration of a particular atmospheric model, Flexpart, on FPGAs. We focus on accelerating the most computationally intensive kernel from this model. The key contribution of our work is the architectural exploration we undertook to arrive at a solution that best exploits the parallelism available in the legacy code, and is also convenient to program, so that eventually the compilation of high-level legacy code to our architecture can be fully automated. We present the three different types of architecture, comparing their resource utilization and performance, and propose that an architecture where there are a number of computational cores, each built along the lines of a vector instruction processor, works best in this particular scenario, and is a promising candidate for a generic FPGA-based platform for scientific computation. We also present the results of experiments done with various configuration parameters of the proposed architecture, to show its utility in adapting to a range of scientific applications.
Submitted 17 April, 2015;
originally announced April 2015.
-
An Intermediate Language and Estimator for Automated Design Space Exploration on FPGAs
Authors:
Syed Waqar Nabi,
Wim Vanderbauwhede
Abstract:
We present the TyTra-IR, a new intermediate language intended as a compilation target for high-level language compilers and a front-end for HDL code generators. We develop the requirements of this new language based on the design-space of FPGAs that it should be able to express and the estimation-space in which each configuration from the design-space should be mappable in an automated design flow. We use a simple kernel to illustrate multiple configurations using the semantics of TyTra-IR. The key novelty of this work is the cost model for resource-costs and throughput for different configurations of interest for a particular kernel. Through the realistic example of a Successive Over-Relaxation kernel implemented both in TyTra-IR and HDL, we demonstrate both the expressiveness of the IR and the accuracy of our cost model.
Submitted 17 April, 2015;
originally announced April 2015.
-
Paraglide: Interactive Parameter Space Partitioning for Computer Simulations
Authors:
Steven Bergner,
Michael Sedlmair,
Sareh Nabi,
Ahmed Saad,
Torsten Möller
Abstract:
In this paper we introduce paraglide, a visualization system designed for interactive exploration of parameter spaces of multi-variate simulation models. To get the right parameter configuration, model developers frequently have to go back and forth between setting parameters and qualitatively judging the outcomes of their model. During this process, they build up a grounded understanding of the parameter effects in order to pick the right setting. Current state-of-the-art tools and practices, however, fail to provide a systematic way of exploring these parameter spaces, making informed decisions about parameter settings a tedious and workload-intensive task. Paraglide endeavors to overcome this shortcoming by assisting the sampling of the parameter space and the discovery of qualitatively different model outcomes. This results in a decomposition of the model parameter space into regions of distinct behaviour. We developed paraglide in close collaboration with experts from three different domains, who all were involved in developing new models for their domain. We first analyzed current practices of six domain experts and derived a set of design requirements, then engaged in a longitudinal user-centered design process, and finally conducted three in-depth case studies underlining the usefulness of our approach.
Submitted 24 October, 2011;
originally announced October 2011.
-
Effect of strain on the stability and electronic properties of ferrimagnetic Fe$_{2-x}$Ti$_x$O$_3$ heterostructures from correlated band theory
Authors:
Hasan Sadat Nabi,
Rossitza Pentcheva
Abstract:
Based on density functional theory (DFT) calculations including an on-site Hubbard $U$ term we investigate the effect of substrate-induced strain on the properties of ferrimagnetic Fe$_2$O$_3$-FeTiO$_3$ solid solutions and heterostructures. While the charge compensation mechanism through formation of a mixed Fe$^{2+}$, Fe$^{3+}$ contact layer is unaffected, strain can be used to tune the electronic properties of the system, e.g. by changing the position of impurity levels in the band gap. Straining hematite/ilmenite films at the lateral parameters of Al$_{2}$O$_{3}$(0001), commonly used as a substrate, is found to be energetically unfavorable as compared to films on Fe$_{2}$O$_{3}$(0001) or FeTiO$_{3}$(0001)-substrates.
Submitted 2 October, 2009;
originally announced October 2009.
-
Interface magnetism in Fe2O3/FeTiO3-heterostructures
Authors:
Rossitza Pentcheva,
Hasan Sadat Nabi
Abstract:
To resolve the microscopic origin of magnetism in the Fe2O3/FeTiO3-system, we have performed density functional theory calculations taking into account on-site Coulomb repulsion. By varying systematically the concentration, distribution and charge state of Ti in a hematite host, we compile a phase diagram of the stability with respect to the end members and find a clear preference to form layered arrangements as opposed to solid solutions. The charge mismatch at the interface is accommodated through Ti4+ and a disproportionation in the Fe contact layer into Fe2+, Fe3+, leading to uncompensated moments in the contact layer and giving first theoretical evidence for the lamellar magnetism hypothesis. This interface magnetism is associated with impurity levels in the band gap showing halfmetallic behavior and making Fe2O3/FeTiO3 heterostructures prospective materials for spintronics applications.
Submitted 17 April, 2008;
originally announced April 2008.