-
ParsiNorm: A Persian Toolkit for Speech Processing Normalization
Authors:
Romina Oji,
Seyedeh Fatemeh Razavi,
Sajjad Abdi Dehsorkh,
Alireza Hariri,
Hadi Asheri,
Reshad Hosseini
Abstract:
In general, speech processing models consist of a language model along with an acoustic model. Regardless of the language model's complexity and variants, three critical pre-processing steps are needed in language models: cleaning, normalization, and tokenization. Among these steps, normalization is essential for format unification in purely textual applications. However, for embedded language models in speech processing modules, normalization is not limited to format unification; it must also convert each readable symbol, number, etc., to the way it is pronounced. To the best of our knowledge, there are no Persian normalization toolkits for embedded language models in speech processing modules, so in this paper we propose an open-source normalization toolkit for text processing in speech applications. Briefly, we consider different kinds of readable Persian text, such as symbols (common currencies, #, @, URL, etc.), numbers (date, time, phone number, national code, etc.), and so on. Comparison with other available Persian textual normalization tools indicates the superiority of the proposed method in speech processing. Also, comparing the model's performance for one of the proposed functions (sentence separation) with other common natural language libraries such as HAZM and Parsivar indicates the proper performance of the proposed method. In addition, its evaluation on some Persian Wikipedia data confirms the proper performance of the proposed method.
Submitted 15 December, 2021; v1 submitted 1 November, 2021;
originally announced November 2021.
-
Graph Generative Models for Fast Detector Simulations in High Energy Physics
Authors:
Ali Hariri,
Darya Dyachkova,
Sergei Gleyzer
Abstract:
Accurate and fast simulation of particle physics processes is crucial for the high-energy physics community. Simulating particle interactions with detectors is both time-consuming and computationally expensive. With a proton-proton collision energy of 13 TeV, the Large Hadron Collider is uniquely positioned to detect and measure the rare phenomena that can shape our knowledge of new interactions. The High-Luminosity Large Hadron Collider (HL-LHC) upgrade will put a significant strain on the computing infrastructure due to the increased event rate and levels of pile-up. Simulation of high-energy physics collisions needs to be significantly faster without sacrificing physics accuracy. Machine learning approaches can offer faster solutions while maintaining a high level of fidelity. We discuss a graph generative model that provides effective reconstruction of LHC events, paving the way for full detector-level fast simulation for the HL-LHC.
Submitted 24 August, 2021; v1 submitted 4 April, 2021;
originally announced April 2021.
-
Deep Learning Improves Contrast in Low-Fluence Photoacoustic Imaging
Authors:
Ali Hariri,
Kamran Alipour,
Yash Mantri,
Jurgen P. Schulze,
Jesse V. Jokerst
Abstract:
Low-fluence illumination sources can facilitate the clinical transition of photoacoustic imaging because they are rugged, portable, affordable, and safe. However, these sources also decrease image quality due to their low fluence. Here, we propose a denoising method using a multi-level wavelet-convolutional neural network to map low-fluence illumination source images to their corresponding high-fluence excitation maps. Quantitative and qualitative results show a significant potential to remove the background noise and preserve the structures of the target. Substantial improvements of up to 2.20, 2.25, and 4.3-fold for PSNR, SSIM, and CNR metrics were observed, respectively. We also observed enhanced contrast (up to 1.76-fold) in an in vivo application using our proposed methods. We suggest that this tool can improve the value of such sources in photoacoustic imaging.
Submitted 19 April, 2020;
originally announced April 2020.
-
Arabs and Atheism: Religious Discussions in the Arab Twittersphere
Authors:
Youssef Al Hariri,
Walid Magdy,
Maria Wolters
Abstract:
Most previous research on online discussions of atheism has focused on atheism within a Christian context. In contrast, discussions about atheism in the Arab world and from an Islamic background are relatively poorly studied. An added complication is that open atheism is against the law in some Arab countries, which may further restrict atheist activity on social media. In this work, we explore atheistic discussion in the Arab Twittersphere. We identify four relevant categories of Twitter users according to the content they post: atheistic, theistic, tanweeri (religious renewal), and other. We characterise the typical content posted by these four sets of users and their social networks, paying particular attention to the topics discussed and the interaction among them. Our findings have implications for the study of religious and spiritual discourse on social media and provide a better cross-cultural understanding of relevant aspects.
Submitted 21 August, 2019;
originally announced August 2019.
-
A High Dynamic Range 3-Moduli-Set with Efficient Reverse Converter
Authors:
Arash Hariri,
K. Navi,
Reza Rastegar
Abstract:
Residue Number System (RNS) is a valuable tool for fast and parallel arithmetic. It has wide application in digital signal processing, fault-tolerant systems, etc. In this work, we introduce the 3-moduli set {2^n, 2^{2n}-1, 2^{2n}+1} and propose its residue-to-binary converter using the Chinese Remainder Theorem. We present its simple hardware implementation, which mainly includes one Carry Save Adder (CSA) and a Modular Adder (MA). We compare the performance and area utilization of our reverse converter to the reverse converters of the moduli sets {2^n-1, 2^n, 2^n+1, 2^{2n}+1} and {2^n-1, 2^n, 2^n+1, 2^n-2^{(n+1)/2}+1, 2^n+2^{(n+1)/2}+1} that have the same dynamic range, and we demonstrate that our architecture is better in terms of performance and area utilization. Also, we show that our reverse converter is faster than the reverse converter of {2^n-1, 2^n, 2^n+1} for dynamic ranges such as 8-bit, 16-bit, 32-bit and 64-bit; however, it requires more area.
Submitted 8 January, 2009;
originally announced January 2009.
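As an illustration of the residue-to-binary (reverse) conversion mentioned in the abstract above, the following is a minimal Python sketch of textbook Chinese Remainder Theorem reconstruction for the moduli set {2^n, 2^{2n}-1, 2^{2n}+1}. It is a software illustration only and does not model the paper's CSA and modular-adder hardware architecture. The function names (`moduli_set`, `to_residues`, `crt_reverse`) are hypothetical and chosen for this example.

```python
from math import prod


def moduli_set(n):
    """Return the 3-moduli set (2^n, 2^(2n) - 1, 2^(2n) + 1) for a positive integer n."""
    return (2**n, 2**(2 * n) - 1, 2**(2 * n) + 1)


def to_residues(x, moduli):
    """Forward conversion: represent the integer x by its residue modulo each modulus."""
    return tuple(x % m for m in moduli)


def crt_reverse(residues, moduli):
    """Reverse conversion via the Chinese Remainder Theorem.

    Assumes the moduli are pairwise coprime, which holds for this set:
    2^n is a power of two, while 2^(2n) - 1 and 2^(2n) + 1 are odd and differ by 2.
    """
    M = prod(moduli)  # dynamic range: 2^n * (2^(4n) - 1)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        # pow(Mi, -1, m) is the modular inverse of Mi modulo m (Python 3.8+).
        total += r * Mi * pow(Mi, -1, m)
    return total % M


if __name__ == "__main__":
    mods = moduli_set(2)            # (4, 15, 17), dynamic range 1020
    res = to_residues(1000, mods)   # (0, 10, 14)
    print(mods, res, crt_reverse(res, mods))
```

Any integer in the range 0 to 2^n(2^{4n}-1) - 1 round-trips through `to_residues` and `crt_reverse` unchanged, which is the property a hardware reverse converter must also satisfy.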
-
A Step Forward in Studying the Compact Genetic Algorithm
Authors:
Reza Rastegar,
Arash Hariri
Abstract:
The compact Genetic Algorithm (cGA) is an Estimation of Distribution Algorithm that generates the offspring population according to the estimated probabilistic model of the parent population instead of using traditional recombination and mutation operators. The cGA only needs a small amount of memory; therefore, it may be quite useful in memory-constrained applications. This paper introduces a theoretical framework for studying the cGA from the convergence point of view, in which we model the cGA by a Markov process and approximate its behavior using an Ordinary Differential Equation (ODE). Then, we prove that the corresponding ODE converges to local optima and stays there. Consequently, we conclude that the cGA will converge to the local optima of the function to be optimized.
Submitted 6 January, 2009;
originally announced January 2009.
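For readers unfamiliar with the cGA described in the abstract above, the following is a minimal Python sketch of the commonly described binary cGA: a probability vector stands in for the population, two individuals are sampled and compared, and the vector is shifted toward the winner by 1/pop_size at each position where the two differ. This is an assumption-laden illustration, not the paper's Markov or ODE analysis. The function name `compact_ga`, its parameters, and the OneMax fitness used in the usage example are hypothetical choices for this example.

```python
import random


def compact_ga(fitness, num_bits, pop_size=50, max_iters=100_000, seed=0):
    """Run a basic binary compact GA and return the final probability vector.

    The probability of bit i being 1 is stored as an integer count,
    p_i = counts[i] / pop_size, so each update is an exact step of 1/pop_size.
    """
    rng = random.Random(seed)
    counts = [pop_size // 2] * num_bits  # start near p_i = 0.5

    def sample():
        # Draw one individual from the current probabilistic model.
        return [1 if rng.random() < c / pop_size else 0 for c in counts]

    for _ in range(max_iters):
        # Stop once every probability has reached 0 or 1.
        if all(c in (0, pop_size) for c in counts):
            break
        a, b = sample(), sample()
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(num_bits):
            if winner[i] != loser[i]:
                # Shift p_i toward the winner's bit value.
                counts[i] += 1 if winner[i] == 1 else -1
    return [c / pop_size for c in counts]


if __name__ == "__main__":
    # OneMax: fitness is the number of ones in the bit string.
    print(compact_ga(sum, num_bits=16, pop_size=50, seed=1))
```

Only the vector of counts is stored, never an explicit population, which reflects the small memory footprint the abstract highlights.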