-
Probing Speaker-specific Features in Speaker Representations
Authors:
Aemon Yat Fei Chiu,
Paco Kei Ching Fung,
Roger Tsz Yeung Li,
Jingyu Li,
Tan Lee
Abstract:
This study explores speaker-specific features encoded in speaker embeddings and intermediate layers of speech self-supervised learning (SSL) models. By utilising a probing method, we analyse features such as pitch, tempo, and energy across prominent speaker embedding models and speech SSL models, including HuBERT, WavLM, and Wav2vec 2.0. The results reveal that speaker embeddings like CAM++ excel in energy classification, while speech SSL models demonstrate superior performance across multiple features due to their hierarchical feature encoding. Intermediate layers effectively capture a mix of acoustic and para-linguistic information, with deeper layers refining these representations. This investigation provides insights into model design and highlights the potential of these representations for downstream applications, such as speaker verification and text-to-speech synthesis, while laying the groundwork for exploring additional features and advanced probing methods.
Submitted 9 January, 2025;
originally announced January 2025.
-
An Investigation of Reprogramming for Cross-Language Adaptation in Speaker Verification Systems
Authors:
Jingyu Li,
Aemon Yat Fei Chiu,
Tan Lee
Abstract:
Language mismatch is among the most common and challenging domain mismatches in deploying speaker verification (SV) systems. Adversarial reprogramming has shown promising results in cross-language adaptation for SV. The reprogramming is implemented by padding learnable parameters on the two sides of input speech signals. In this paper, we investigate the relationship between the number of padded parameters and the performance of the reprogrammed models. Sufficient experiments are conducted with different scales of SV models and datasets. The results demonstrate that reprogramming consistently improves the performance of cross-language SV, while the improvement is saturated or even degraded when using larger padding lengths. The performance is mainly determined by the capacity of the original SV models instead of the number of padded parameters. The SV models with larger scales have higher upper bounds in performance and can endure longer padding without performance degradation.
Submitted 18 November, 2024;
originally announced November 2024.
-
Multi-line AI-assisted Code Authoring
Authors:
Omer Dunay,
Daniel Cheng,
Adam Tait,
Parth Thakkar,
Peter C Rigby,
Andy Chiu,
Imad Ahmad,
Arun Ganesan,
Chandra Maddila,
Vijayaraghavan Murali,
Ali Tayyebi,
Nachiappan Nagappan
Abstract:
CodeCompose is an AI-assisted code authoring tool powered by large language models (LLMs) that provides inline suggestions to tens of thousands of developers at Meta. In this paper, we present how we scaled the product from displaying single-line suggestions to multi-line suggestions. This evolution required us to overcome several unique challenges in improving the usability of these suggestions for developers.
First, we discuss how multi-line suggestions can have a 'jarring' effect, as the LLM's suggestions constantly move around the developer's existing code, which would otherwise result in decreased productivity and satisfaction.
Second, multi-line suggestions take significantly longer to generate; hence we present several innovative investments we made to reduce the perceived latency for users. These model-hosting optimizations sped up multi-line suggestion latency by 2.5x.
Finally, we conduct experiments on tens of thousands of engineers to understand how multi-line suggestions impact the user experience and contrast this with single-line suggestions. Our experiments reveal that (i) multi-line suggestions account for 42% of total characters accepted (despite accounting for only 16% of displayed suggestions), and (ii) multi-line suggestions almost doubled the percentage of keystrokes saved for users, from 9% to 17%. Multi-line CodeCompose has been rolled out to all engineers at Meta, and less than 1% of engineers have opted out of multi-line suggestions.
Submitted 6 February, 2024;
originally announced February 2024.
-
Graph Sparsifications using Neural Network Assisted Monte Carlo Tree Search
Authors:
Alvin Chiu,
Mithun Ghosh,
Reyan Ahmed,
Kwang-Sung Jun,
Stephen Kobourov,
Michael T. Goodrich
Abstract:
Graph neural networks have been successful for machine learning, as well as for combinatorial and graph problems such as the Subgraph Isomorphism Problem and the Traveling Salesman Problem. We describe an approach for computing graph sparsifiers by combining a graph neural network and Monte Carlo Tree Search. We first train a graph neural network that takes as input a partial solution and proposes a new node to be added as output. This neural network is then used in a Monte Carlo search to compute a sparsifier. The proposed method consistently outperforms several standard approximation algorithms on different types of graphs and often finds the optimal solution.
Submitted 16 November, 2023;
originally announced November 2023.
-
Causal Panel Analysis under Parallel Trends: Lessons from a Large Reanalysis Study
Authors:
Albert Chiu,
Xingchen Lan,
Ziyi Liu,
Yiqing Xu
Abstract:
Two-way fixed effects (TWFE) models are widely used in political science to establish causality, but recent methodological discussions highlight their limitations under heterogeneous treatment effects (HTE) and violations of the parallel trends (PT) assumption. This growing literature has introduced numerous new estimators and procedures, causing confusion among researchers about the reliability of existing results and best practices. To address these concerns, we replicated and reanalyzed 49 studies from leading journals using TWFE models for observational panel data with binary treatments. Using six HTE-robust estimators, diagnostic tests, and sensitivity analyses, we find: (i) HTE-robust estimators yield qualitatively similar but highly variable results; (ii) while a few studies show clear signs of PT violations, many lack evidence to support this assumption; and (iii) many studies are underpowered when accounting for HTE and potential PT violations. We emphasize the importance of strong research designs and rigorous validation of key identifying assumptions.
Submitted 6 June, 2025; v1 submitted 27 September, 2023;
originally announced September 2023.
-
Manipulating Weights to Improve Stress-Graph Drawings of 3-Connected Planar Graphs
Authors:
Alvin Chiu,
David Eppstein,
Michael T. Goodrich
Abstract:
We study methods to manipulate weights in stress-graph embeddings to improve convex straight-line planar drawings of 3-connected planar graphs. Stress-graph embeddings are weighted versions of Tutte embeddings, where solving a linear system places vertices at a minimum-energy configuration for a system of springs. A major drawback of the unweighted Tutte embedding is that it often results in drawings with exponential area. We present a number of approaches for choosing better weights. One approach constructs weights (in linear time) that uniformly spread all vertices in a chosen direction, such as parallel to the $x$- or $y$-axis. A second approach morphs $x$- and $y$-spread drawings to produce a more aesthetically pleasing and uncluttered drawing. We further explore a "kaleidoscope" paradigm for this $xy$-morph approach, where we rotate the coordinate axes so as to find the best spreads and morphs. A third approach chooses the weight of each edge according to its depth in a spanning tree rooted at the outer vertices, such as a Schnyder wood or BFS tree, in order to pull vertices closer to the boundary.
Submitted 30 August, 2023; v1 submitted 19 July, 2023;
originally announced July 2023.
-
Drop in the hard pulsed fraction and a candidate cyclotron line in IGR J16320-4751 seen by NuSTAR
Authors:
Arash Bodaghee,
Alan J. -L. Chiu,
John A. Tomsick,
Varun Bhalerao,
Eugenio Bottacini,
Maica Clavel,
Cody Cox,
Felix Fürst,
Matthew J. Middleton,
Farid Rahoui,
Jerome Rodriguez,
Pat Romano,
Joern Wilms
Abstract:
We report on a timing and spectral analysis of a 50-ks NuSTAR observation of IGR J16320-4751 (= AX J1631.9-4752), a high-mass X-ray binary hosting a slowly-rotating neutron star. In this observation from 2015, the spin period was 1,308.8 ± 0.4 s, giving a period derivative $dP/dt \sim 2 \times 10^{-8}$ s s$^{-1}$ when compared with the period measured in 2004. In addition, the pulsed fraction decreased as a function of energy, as opposed to the constant trend that was seen previously. This suggests a change in the accretion geometry of the system during the intervening 11 years. The phase-averaged spectra were fit with the typical model for accreting pulsars: a power law with an exponential cutoff. This left positive residuals at 6.4 keV attributable to the known iron K-alpha line, as well as negative residuals around 14 keV from a candidate cyclotron line detected at a significance of 5-sigma. We found no significant differences in the spectral parameters across the spin period, other than the expected changes in flux and component normalizations. A flare lasting around 5 ks was captured during the first half of the observation, during which the X-ray emission hardened and the local column density decreased. Finally, the binary orbital period was refined to 8.9912 ± 0.0078 d thanks to Swift/BAT monitoring data from 2005-2022.
Submitted 11 May, 2023;
originally announced May 2023.
-
A Fine-grained Chinese Software Privacy Policy Dataset for Sequence Labeling and Regulation Compliant Identification
Authors:
Kaifa Zhao,
Le Yu,
Shiyao Zhou,
Jing Li,
Xiapu Luo,
Yat Fei Aemon Chiu,
Yutong Liu
Abstract:
Privacy protection is receiving great attention both in law and in user awareness. To protect user privacy, countries enact laws and regulations requiring software privacy policies to regulate their behavior. However, privacy policies are written in natural languages with many legal terms and software jargon that prevent users from understanding and even reading them. It is desirable to use NLP techniques to analyze privacy policies to help users understand them. Furthermore, existing datasets ignore legal requirements and are limited to English. In this paper, we construct the first Chinese privacy policy dataset, namely CA4P-483, to facilitate the sequence labeling tasks and regulation compliance identification between privacy policies and software. Our dataset includes 483 Chinese Android application privacy policies, over 11K sentences, and 52K fine-grained annotations. We evaluate families of robust and representative baseline models on our dataset. Based on baseline performance, we provide findings and potential research directions on our dataset. Finally, we investigate the potential applications of CA4P-483 combining regulation requirements and program analysis.
Submitted 4 December, 2022;
originally announced December 2022.
-
Deep learning models for predicting RNA degradation via dual crowdsourcing
Authors:
Hannah K. Wayment-Steele,
Wipapat Kladwang,
Andrew M. Watkins,
Do Soon Kim,
Bojan Tunguz,
Walter Reade,
Maggie Demkin,
Jonathan Romano,
Roger Wellington-Oguri,
John J. Nicol,
Jiayang Gao,
Kazuki Onodera,
Kazuki Fujikawa,
Hanfei Mao,
Gilles Vandewiele,
Michele Tinti,
Bram Steenwinckel,
Takuya Ito,
Taiga Noumi,
Shujun He,
Keiichiro Ishi,
Youhan Lee,
Fatih Öztürk,
Anthony Chiu,
Emin Öztürk
, et al. (4 additional authors not shown)
Abstract:
Messenger RNA-based medicines hold immense potential, as evidenced by their rapid deployment as COVID-19 vaccines. However, worldwide distribution of mRNA molecules has been limited by their thermostability, which is fundamentally limited by the intrinsic instability of RNA molecules to a chemical degradation reaction called in-line hydrolysis. Predicting the degradation of an RNA molecule is a key task in designing more stable RNA-based therapeutics. Here, we describe a crowdsourced machine learning competition ("Stanford OpenVaccine") on Kaggle, involving single-nucleotide resolution measurements on 6043 102-130-nucleotide diverse RNA constructs that were themselves solicited through crowdsourcing on the RNA design platform Eterna. The entire experiment was completed in less than 6 months, and 41% of nucleotide-level predictions from the winning model were within experimental error of the ground truth measurement. Furthermore, these models generalized to blindly predicting orthogonal degradation data on much longer mRNA molecules (504-1588 nucleotides) with improved accuracy compared to previously published models. Top teams integrated natural language processing architectures and data augmentation techniques with predictions from previous dynamic programming models for RNA secondary structure. These results indicate that such models are capable of representing in-line hydrolysis with excellent accuracy, supporting their use for designing stabilized messenger RNAs. The integration of two crowdsourcing platforms, one for data set creation and another for machine learning, may be fruitful for other urgent problems that demand scientific discovery on rapid timescales.
Submitted 22 April, 2022; v1 submitted 14 October, 2021;
originally announced October 2021.
-
Core-Periphery Structure in Directed Networks
Authors:
Andrew Elliott,
Angus Chiu,
Marya Bazzi,
Gesine Reinert,
Mihai Cucuringu
Abstract:
While studies of meso-scale structures in networks often focus on community structure, core-periphery structures can reveal new insights. This structure typically consists of a well-connected core and a periphery that is well connected to the core but sparsely connected internally. Most studies of core-periphery structure focus on undirected networks.
We propose a generalisation of core-periphery structure to directed networks. Our approach yields a family of core-periphery block model formulations in which core and periphery sets are edge-direction dependent. We mainly focus on a particular core-periphery structure consisting of two core sets and two periphery sets, which we motivate empirically.
To detect this directed core-periphery structure we propose four different methods, with different trade-offs between computational complexity and accuracy. We assess these methods on three benchmarks and compare them to four standard methods. On simulated data, the proposed methods match or outperform the standard methods. Applying our methods to three empirical networks (a political blogs network, a faculty hiring network, and a trade network) illustrates that this directed core-periphery structure can offer novel insights about the underlying dataset.
Submitted 2 December, 2019;
originally announced December 2019.
-
Counting locally flat-foldable origami configurations via 3-coloring graphs
Authors:
Alvin Chiu,
William Hoganson,
Thomas C. Hull,
Sylvia Wu
Abstract:
Origami, where two-dimensional sheets are folded into complex structures, is proving to be rich with combinatorial and geometric structure, most of which remains to be fully understood. In this paper we consider flat origami, where the sheet of material is folded into a two-dimensional object, and consider the mountain (convex) and valley (concave) creases made by such foldings, called an MV assignment of the crease pattern. We establish a method to, given a flat-foldable crease pattern $C$ under certain conditions, create a planar graph $C^*$ whose 3-colorings are in one-to-one correspondence with the locally-valid MV assignments of $C$. This reduces the general, unsolved problem of enumerating locally-valid MV assignments to the enumeration of 3-colorings of graphs.
Submitted 2 October, 2019;
originally announced October 2019.
-
Religious Festivals and Influenza
Authors:
Alice P. Y. Chiu,
Qianying Lin,
Daihai He
Abstract:
Objectives: Influenza outbreaks have been widely studied. However, the patterns between influenza and religious festivals remained unexplored. This study examined the patterns of influenza and Hanukkah in Israel, and that of influenza and Hajj in Bahrain, Egypt, Iraq, Jordan, Oman and Qatar. Method: Influenza surveillance data of these seven countries from 2009 to 2017 were downloaded from the FluNet of the World Health Organization. Secondary data were collected for the countries' population, and the dates of Hajj and Hanukkah. We aggregated the weekly influenza A and B laboratory confirmations for each country over the study period. Weekly influenza A patterns and religious festival dates were further explored across the study period. Results: We found that influenza A peaks closely followed Hanukkah in Israel in six out of seven years from 2010 to 2017. Aggregated influenza A peaks of the other six Middle East countries also occurred right after Hajj every year during the study period. Conclusions: We predict that unless a new influenza strain emerges, such influenza patterns are likely to persist in future years. Our results suggest that the optimal timing of mass influenza vaccination should take into consideration the dates of these religious festivals.
Submitted 24 October, 2017;
originally announced October 2017.
-
A Data-driven Approach Towards Human-robot Collaborative Problem Solving in a Shared Space
Authors:
Michael Wollowski,
Carlotta Berry,
Ryder Winck,
Alan Jern,
David Voltmer,
Alan Chiu,
Yosi Shibberu
Abstract:
We are developing a system for human-robot communication that enables people to communicate with robots in a natural way and is focused on solving problems in a shared space. Our strategy for developing this system is fundamentally data-driven: we use data from multiple input sources and train key components with various machine learning techniques. We developed a web application that is collecting data on how two humans communicate to accomplish a task, as well as a mobile laboratory that is instrumented to collect data on how two humans communicate to accomplish a task in a physically shared space. The data from these systems will be used to train and fine-tune the second stage of our system, in which the robot will be simulated through software. A physical robot will be used in the final stage of our project. We describe these instruments, a test-suite and performance metrics designed to evaluate and automate the data gathering process as well as evaluate an initial data set.
Submitted 30 September, 2017;
originally announced October 2017.
-
Patterns of Influenza Vaccination Coverage in the United States from 2009 to 2015
Authors:
Alice P. Y. Chiu,
Duo Yu,
Jonathan Dushoff,
Daihai He
Abstract:
Background: Globally, influenza is a major cause of morbidity, hospitalization and mortality. Influenza vaccination has shown substantial protective effectiveness in the United States. We investigated state-level patterns of coverage rates of seasonal and pandemic influenza vaccination, among the overall population in the U.S. and specifically among children and the elderly, from 2009/10 to 2014/15, and associations with ecological factors.
Methods and Findings: We obtained state-level influenza vaccination coverage rates from national surveys, and state-level socio-demographic and health data from a variety of sources. We employed a retrospective ecological study design, and used mixed-model regression to determine the levels of ecological association of the state-level vaccination rates with these factors, both with and without region as a factor for the three populations. We found that health-care access is positively and significantly associated with mean influenza vaccination coverage rates across all populations and models. We also found that prevalence of asthma in adults is negatively and significantly associated with mean influenza vaccination coverage rates in the elderly populations.
Conclusions: Health-care access has a robust, positive association with state-level vaccination rates across different populations. This highlights a potential population-level advantage of expanding health-care access.
Submitted 13 March, 2017;
originally announced March 2017.
-
Increasing Trends of Guillain-Barré Syndrome (GBS) and Dengue in Hong Kong
Authors:
Xiujuan Tang,
Shi Zhao,
Alice P. Y. Chiu,
Xin Wang,
Lin Yang,
Daihai He
Abstract:
Background: Guillain-Barré Syndrome (GBS) is a common type of severe acute paralytic neuropathy and is associated with other virus infections such as dengue fever and Zika. This study investigates the relationship between GBS, dengue, local meteorological factors in Hong Kong and global climatic factors from January 2000 to June 2016.
Methods: The correlations between GBS, dengue, Multivariate El Niño Southern Oscillation Index (MEI) and local meteorological data were explored by the Spearman Rank correlations and cross-correlations between these time series. Poisson regression models were fitted to identify nonlinear associations between MEI and dengue. Cross wavelet analysis was applied to infer potential non-stationary oscillating associations among MEI, dengue and GBS.
Findings: An increasing trend was found for both GBS cases and imported dengue cases in Hong Kong. We found a weak but statistically significant negative correlation between GBS and local meteorological factors. MEI explained over 12% of dengue's variations from Poisson regression models. Wavelet analyses showed that there is a possible non-stationary oscillating association between dengue and GBS from 2005 to 2015 in Hong Kong. Our study has led to an improved understanding of the timing and relationship between GBS, dengue and MEI.
Submitted 13 March, 2017;
originally announced March 2017.
-
Effects of Reactive Social Distancing on the 1918 Influenza Pandemic
Authors:
Duo Yu,
Qianying Lin,
Alice PY Chiu,
Daihai He
Abstract:
The 1918 influenza pandemic was characterized by multiple epidemic waves. We investigated reactive social distancing, a form of behavioral response, and its effect on the multiple influenza waves in the United Kingdom. Two forms of reactive social distancing have been used in previous studies: Power function, which is a function of the proportion of recent influenza mortality in a population, and Hill function, which is a function of the actual number of recent influenza mortality. Using a simple epidemic model with a Power function and one common set of parameters, we provided a good model fit for the observed multiple epidemic waves in London boroughs, Birmingham and Liverpool. Our approach is different from previous studies where separate models are fitted to each city. We then applied the model parameters obtained from fitting these three cities to all 334 administrative units in England and Wales, incorporating the population sizes of individual administrative units. We computed the Pearson's correlation between the observed and simulated data for each administrative unit. We achieved a median correlation of 0.636, indicating that our model predictions perform reasonably well. Our modelling approach, which requires a reduced number of parameters, resulted in a computational efficiency gain without over-fitting the model. Our work has both scientific and public health significance.
Submitted 12 March, 2017;
originally announced March 2017.
-
Spatio-temporal patterns of influenza B proportions
Authors:
Daihai He,
Alice PY Chiu,
Qianying Lin,
Duo Yu
Abstract:
We study the spatio-temporal patterns of the proportion of influenza B out of laboratory confirmations of both influenza A and B, with data from 139 countries and regions downloaded from the FluNet compiled by the World Health Organization, from January 2006 to October 2015, excluding 2009. We restricted our analysis to 34 countries that reported more than 2000 confirmations for each of types A and B over the study period. We find that Pearson's correlation is 0.669 between effective distance from Mexico and influenza B proportion among the countries from January 2006 to October 2015. In the United States, influenza B proportion in the pre-pandemic period (2003-2008) negatively correlated with that in the post-pandemic era (2010-2015) at the regional level. Our study limitations are the country-level variations in both surveillance methods and testing policies. Influenza B proportion displayed wide variations over the study period. Our findings suggest that even after excluding 2009's data, the influenza pandemic still has an evident impact on the relative burden of the two influenza types. Future studies could examine whether there are other additional factors. This study has potential implications in prioritizing public health control measures.
Submitted 26 January, 2016;
originally announced January 2016.
-
Lensed galaxies in CANDELS
Authors:
Asantha Cooray,
Hai Fu,
Jae Calanog,
J. L. Wardlow,
A. Chiu,
Sam Kim,
Joseph Smidt,
V. Acquaviva,
H. C. Ferguson,
S. M. Faber,
A. Galametz,
N. A. Grogin,
W. Hartley,
D. Kocevski,
A. Koekemoer,
D. C. Koo,
R. A. Lucas,
L. Moustakas,
J. A. Newman
Abstract:
We present results from a search for gravitationally lensed galaxies present in the Hubble Space Telescope (HST)/Wide Field Camera-3 (WFC3) images of the Cosmic Assembly Near-IR Deep Extragalactic Legacy Survey (CANDELS). We present one bona fide lens system in UDS and two compact lens candidates in the GOODS-S field. The lensing system in UDS involves two background galaxies, one at z=1.847 lensed to an arc and a counterimage, and the second at a photometric redshift of z=2.32^{+0.10}_{-0.06} lensed to a double image. We reconstruct the lensed sources in the source plane and find in each of the two cases the sources can be separated into a pair of galaxies. The sources responsible for the arc are compact with effective radii of 0.3 to 0.4 kpc in WFC3 J_{125}-band and a total stellar mass and a star-formation rate of 2.1_{-0.4}^{+2.4} × 10^7 M_sun and 2.3_{-1.7}^{+0.6} M_sun yr^{-1}, respectively. The abnormally high H_{160}-band flux of this source is likely due to OIII emission lines with a rest-frame equivalent width of about 700 Angstroms for OIII 5007 Angstroms. The sources responsible for the double image have corresponding values of about 0.4 to 0.5 kpc, 1.4_{-0.8}^{+1.9} × 10^9 M_sun, and 8.7_{-7.0}^{+11.1} M_sun yr^{-1}. Once completed, CANDELS is expected to contain about 15 lensing systems and will allow statistical studies on both lensing mass profiles and z ~ 2 lensed galaxies.
Submitted 17 October, 2011;
originally announced October 2011.
-
Combining WMAP and SDSS Quasar Data on Reionization Constrains Cosmological Parameters and the Star Formation Efficiency
Authors:
Weihsueh A. Chiu,
Xiaohui Fan,
Jeremiah P. Ostriker
Abstract:
We present constraints on cosmological and star formation parameters based on combining observations of the Wilkinson Microwave Anisotropy Probe (WMAP) and high-redshift quasars from the Sloan Digital Sky Survey (SDSS). We use a semi-analytic model for reionization (Chiu and Ostriker 2000) that takes into account a number of important physical processes both within collapsing halos and in the intergalactic medium. Assuming that the efficiency of producing UV photons per baryon is constant, we derive a constraint of the form sigma_8 Omega_0^0.5~0.33 in a flat, Lambda-dominated universe with h=0.72, n=0.99, and Omega_b h^2=0.024. However, the calculated optical depth to electron scattering of tau_es~0.06 is well below the value found by WMAP of 0.17+/-(0.04~0.07) (Spergel et al. 2003). Since the WMAP constraints on tau_es are somewhat degenerate with the value of the spectral index n, we then permit the primordial spectral index n to float and fix Omega_0 h^2=0.14, while normalizing the power spectrum using WMAP. In addition, we allow the UV-efficiency to have time-dependence. Combining the WMAP constraints with the quasar transmission data, our analysis then favors a model with tau_es=0.11^{+0.02}_{-0.03}, n=0.96^{+0.02}_{-0.03}, implying sigma_8=0.83^{+0.03}_{-0.05} (95% confidence), and an effective UV-efficiency that was at least ~10x greater at z >> 6. These results indicate that the quasar and WMAP observations are consistent. If future observations confirm an optical depth to electron scattering tau_es~0.1, then it would appear that no more "exotic" sources of UV-photons, such as mini-quasars or AGNs, are necessary; but our analysis indicates that a determination of tau_es>~0.17 would require a more radical solution.
Submitted 12 April, 2003;
originally announced April 2003.
-
The Expected Mass Function for Low Mass Galaxies in a CDM Cosmology: Is There a Problem?
Authors:
Weihsueh A. Chiu,
Nickolay Y. Gnedin,
Jeremiah P. Ostriker
Abstract:
It is well known that the mass function for _halos_ in CDM cosmology is a relatively steep power law for low masses, possibly too steep to be consistent with observations. But how steep is the _galaxy_ mass function? We have analyzed the stellar and gas mass functions of the first massive luminous objects formed in a ΛCDM universe, as calculated in the numerical simulation described in Gnedin (2000ab). We found that while the dark matter mass function is steep, the stellar and gas mass functions are flatter for low mass objects. The stellar mass function is consistently flat at the low mass end. Moreover, while the gas mass function follows the dark matter mass function until reionization at z~7, between z=7 and z=4, the gas mass function also flattens considerably at the low mass end. At z=4, the gas and stellar mass functions are fit by a Schechter function with α ~ -1.2 +/- 0.1, significantly shallower than the dark matter halo mass function and consistent with some recent observations. The baryonic mass functions are shallower because (a) the dark matter halo mass function is consistent with the Press-Schechter formulation at low masses, n(M) ∝ M^-2, and (b) heating/cooling and ionization processes appear to cause baryons to collect in halos with the relationship M_b ∝ M_d^4 at low masses. Combining (a) and (b) gives n(M_b) ∝ M_b^-5/4, comparable to the simulation results. Thus, the well known observational fact that low mass galaxies are underabundant as compared to expectations from numerical dark matter simulations or Press-Schechter modeling of CDM universes emerges naturally from these results, implying that perhaps no ``new physics'' beyond the standard model is needed.
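The step from (a) and (b) to the baryonic slope can be written out as a short change of variables. This is a sketch, not the authors' own derivation: it assumes that n(M) denotes the number density of objects per unit mass interval, and that the power-law relations in (a) and (b) hold exactly over the low-mass range.

```latex
% Assumption: n(M) is a number density per unit mass interval, so that
% n(M_b)\,dM_b = n(M_d)\,dM_d for objects mapped one-to-one from M_d to M_b.
\begin{align}
  n(M_d) &\propto M_d^{-2}, \qquad M_b \propto M_d^{4}
  \;\Longrightarrow\; M_d \propto M_b^{1/4}, \\
  n(M_b) &= n(M_d)\,\left|\frac{dM_d}{dM_b}\right|
  \propto \left(M_b^{1/4}\right)^{-2} M_b^{-3/4}
  = M_b^{-1/2}\, M_b^{-3/4}
  = M_b^{-5/4}.
\end{align}
```

Under these assumptions the exponent -5/4 is close to the Schechter slope of about -1.2 quoted for the simulation.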
Submitted 22 March, 2001;
originally announced March 2001.
-
A Semi-Analytic Model for Cosmological Reheating and Reionization Due to the Gravitational Collapse of Structure
Authors:
Weihsueh A. Chiu,
Jeremiah P. Ostriker
Abstract:
We present a semi-analytic model for the thermal and ionization history of the universe at 1000 >~ z >~ 3. This model incorporates much of the essential physics included in full-scale hydrodynamical simulations, such as (1) gravitational collapse and virialization; (2) star/quasar formation and subsequent ionizing radiation; (3) heating and cooling; (4) atomic and molecular physics of hydrogen; and (5) the feedback relationships between these processes. In addition, we model the process of reheating and reionization using two separate phases, self-consistently calculating the filling factor of each phase. Thus radiative transfer is treated more accurately than simulations to date have done: we allow to lowest order for both the inhomogeneity of the sources and the sinks of radiation. After calibrating and checking the results of this model against a hydrodynamical simulation, we apply our model to a variety of Gaussian and non-Gaussian CDM-dominated cosmological models. Our major conclusions include: (1) the epoch of reheating depends most strongly on the power spectrum of density fluctuations at small scales; (2) because of the effects of gas clumping, full reionization occurs at z ~ 10 in all models; (3) the CMBR polarization and the stars and quasars to baryons ratio are strong potential discriminants between different assumed power spectra; (4) the formation of galactic spheroids may be regulated by the evolution of reheating through feedback, so that the Jeans mass tracks the non-linear mass scale; and (5) the evolution of the bias of luminous objects can potentially discriminate strongly between Gaussian and non-Gaussian PDFs.
Submitted 16 July, 1999;
originally announced July 1999.
-
Stellar Atmospheres Near an AGN: The Importance of Radiation Pressure from Trapped Lyman-alpha Photons
Authors:
Weihsueh A. Chiu,
B. T. Draine
Abstract:
We derive an analytic expression for the intensity of resonance-line radiation ``trapped'' in a semi-infinite medium. Given a source function and destruction probability per scattering, the radiation pressure due to trapped photons can be calculated by numerically integrating over analytic functions. We apply this formalism to a plane-parallel model stellar atmosphere to calculate the radiation pressure due to Lyman-alpha photons produced following absorption of UV and X-rays from an AGN. For low surface gravity stars near the AGN (g~10 cm/sec^2, r~0.25 pc), we find that the pressure due to Lyman-alpha photons becomes an appreciable fraction of that required for hydrostatic support. If the broad emission line emitting gas in AGNs and QSOs consists of stellar outflows, it may be driven, in part, by Lyman-alpha pressure.
Submitted 17 March, 1998;
originally announced March 1998.
-
Using Cluster Abundances and Peculiar Velocities to Test the Gaussianity of the Cosmological Density Field
Authors:
Weihsueh A. Chiu,
Jeremiah P. Ostriker,
Michael A. Strauss
Abstract:
(Abridged) By comparing the frequency of typical events with that of unusual events, one can test whether the cosmological density distribution function is consistent with the normally made assumption of Gaussianity. To this end, we compare the consistency of the tail-inferred (from clusters) and measured values (from large-scale flows) of the rms level of mass fluctuations for two distribution functions: a Gaussian, and a texture (positively-skewed) PDF. Averaging the recent large-scale flow measurements, we find that observations of the rms and the tail at the 10 h^-1 Mpc scale disfavor a texture PDF at ~1.5 sigma in all cases. However, taking only the most recent measurement of the rms, that from Willick et al. (1997b), the comparison disfavors textures for low Omega_0=0.3, and disfavors Gaussian models if Omega_0=1 (again at ~1.5 sigma). Predictions for evolution of high temperature clusters can also be made for the models considered, and strongly disfavor Omega_0=1 in Gaussian models and marginally disfavor Omega_0=1 in texture models. Only Omega_0=0.3 Gaussian models are consistent with all the data considered.
Submitted 27 August, 1997;
originally announced August 1997.