Search | arXiv e-print repository

arXiv:2007.13826 [pdf, other]

Large Scale Subject Category Classification of Scholarly Papers with Deep Attentive Neural Networks

Authors: Bharath Kandimalla, Shaurya Rohatgi, Jian Wu, C Lee Giles

Abstract: Subject categories of scholarly papers generally refer to the knowledge domain(s) to which the papers belong, examples being computer science or physics. Subject category information can be used for building faceted search for digital library search engines. This can significantly assist users in narrowing down their search space of relevant documents. Unfortunately, many academic papers do not have such information as part of their metadata. Existing methods for solving this task usually focus on unsupervised learning that often relies on citation networks. However, a complete list of papers citing the current paper may not be readily available. In particular, new papers that have few or no citations cannot be classified using such methods. Here, we propose a deep attentive neural network (DANN) that classifies scholarly papers using only their abstracts. The network is trained using 9 million abstracts from Web of Science (WoS). We also use the WoS schema that covers 104 subject categories. The proposed network consists of two bi-directional recurrent neural networks followed by an attention layer. We compare our model against baselines by varying the architecture and text representation. Our best model achieves a micro-F1 measure of 0.76, with F1 of individual subject categories ranging from 0.50 to 0.95. The results showed the importance of retraining word embedding models to maximize the vocabulary overlap and the effectiveness of the attention mechanism. The combination of word vectors with TFIDF outperforms character and sentence level embedding models.
We discuss imbalanced samples and overlapping categories and suggest possible strategies for mitigation. We also determine the subject category distribution in CiteSeerX by classifying a random sample of one million academic papers.

Submitted 27 July, 2020; originally announced July 2020.

Comments: submitted to "Frontiers Mining Scientific Papers Volume II: Knowledge Discovery and Data Exploitation"
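Note: the abstract above reports a micro-F1 of 0.76 alongside per-category F1 scores. The following is a minimal sketch of how those two quantities differ, assuming single-label predictions; the function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_report(true_labels, pred_labels):
    """Return (micro_f1, per_class_f1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    classes = set(true_labels) | set(pred_labels)
    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in classes}
    # Micro-averaging pools the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, per_class


# Toy data, illustrative only.
true = ["cs", "cs", "physics", "physics", "math"]
pred = ["cs", "physics", "physics", "physics", "cs"]
micro, per_class = f1_report(true, pred)
```

Because micro-averaging pools counts, frequent categories dominate the score, which is one reason per-category F1 is usually reported next to it when samples are imbalanced.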

arXiv:2006.11470 [pdf, other]

doi 10.3847/1538-4357/ab9ebe

Direct Measurement of the Solar-Wind Taylor Microscale using MMS Turbulence Campaign Data

Authors: Riddhi Bandyopadhyay, William H. Matthaeus, Alexandros Chasapis, Christopher T. Russell, Robert J. Strangeway, Roy B. Torbert, Barbara L. Giles, Daniel J. Gershman, Craig J. Pollock, James L. Burch

Abstract: Using the novel Magnetospheric Multiscale (MMS) mission data accumulated during the 2019 MMS Solar Wind Turbulence Campaign, we calculate the Taylor microscale $(λ_{\mathrm{T}})$ of the turbulent magnetic field in the solar wind. The Taylor microscale represents the onset of dissipative processes in classical turbulence theory. An accurate estimation of the Taylor scale from spacecraft data is, however, usually difficult due to low time cadence, the effect of time decorrelation, and other factors. Previous reports were either based entirely on the Taylor frozen-in approximation, which conflates time dependence, or obtained using multiple datasets, which introduces sample-to-sample variation of plasma parameters, or made with inter-spacecraft distances larger than in the present study. The unique configuration of linear formation with logarithmic spacing of the 4 MMS spacecraft during the campaign enables a direct evaluation of $λ_{\mathrm{T}}$ from a single dataset, independent of the Taylor frozen-in approximation. A value of $λ_{\mathrm{T}} \approx 7000 \, \mathrm{km}$ is obtained, which is about 3 times larger than the previous estimates.

Submitted 19 June, 2020; originally announced June 2020.

Comments: Accepted for publication in the Astrophysical Journal

arXiv:2006.10316 [pdf, other]

doi 10.1063/5.0098625

Interplay of Turbulence and Proton-Microinstability Growth in Space Plasmas

Authors: Riddhi Bandyopadhyay, Ramiz A. Qudsi, William H. Matthaeus, Tulasi N. Parashar, Bennett A. Maruca, S. Peter Gary, Vadim Roytershteyn, Alexandros Chasapis, Barbara L. Giles, Daniel J. Gershman, Craig J. Pollock, Christopher T. Russell, Robert J. Strangeway, Roy B. Torbert, Thomas E. Moore, James L. Burch

Abstract: Numerous prior studies have shown that as proton beta increases, a narrower range of proton temperature anisotropy values is observed. This effect has often been ascribed to the actions of kinetic microinstabilities because the distribution of observational data aligns with contours of constant instability growth rates in the beta-anisotropy plane. However, the linear Vlasov theory of instabilities assumes a uniform background in which perturbations grow. The established success of linear-microinstability theories suggests that the conditions in regions of extreme temperature anisotropy may remain uniform for a long enough time so that the instabilities have the chance to grow to sufficient amplitude. Turbulence, on the other hand, is intrinsically non-uniform and non-linear. Thin current sheets and other coherent structures generated in a turbulent plasma may destroy the uniformity fast enough. It is therefore not a priori obvious whether the presence of intermittency and coherent structures favors or disfavors instabilities. To address this question, we examined the statistical distribution of growth rates associated with proton temperature-anisotropy driven microinstabilities and local nonlinear time scales in turbulent plasmas. Linear growth rates are, on average, substantially less than the local nonlinear rates.
However, at the regions of extreme values of temperature anisotropy, near the "edges" of the populated part of the proton temperature anisotropy-parallel beta plane, the instability growth rates are comparable to or faster than the turbulence time scales. These results provide a possible answer to the question as to why the linear theory appears to work in limiting plasma excursions in anisotropy and plasma beta.

Submitted 21 September, 2022; v1 submitted 18 June, 2020; originally announced June 2020.

Comments: Accepted for publication in Physics of Plasmas

arXiv:2006.03651 [pdf, other]

A provably stable neural network Turing Machine

Authors: John Stogin, Ankur Mali, C Lee Giles

Abstract: We introduce a neural stack architecture, including a differentiable parametrized stack operator that approximates stack push and pop operations for suitable choices of parameters that explicitly represents a stack. We prove the stability of this stack architecture: after arbitrarily many stack operations, the state of the neural stack still closely resembles the state of the discrete stack. Using the neural stack with a recurrent neural network, we introduce a neural network Pushdown Automaton (nnPDA) and prove that nnPDA with finite/bounded neurons and time can simulate any PDA. Furthermore, we extend our construction and propose a new architecture, the neural state Turing Machine (nnTM). We prove that a differentiable nnTM with bounded neurons can simulate a Turing Machine (TM) in real time. Just like the neural stack, these architectures are also stable. Finally, we extend our construction to show that the differentiable nnTM is equivalent to a Universal Turing Machine (UTM) and can simulate any TM with only seven finite/bounded precision neurons. This work provides a new theoretical bound for the computational capability of bounded precision RNNs augmented with memory.

Submitted 18 September, 2022; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: 28 pages, 2 figures

arXiv:2005.09232 [pdf, other]

doi 10.1103/PhysRevLett.124.255101

Statistics of Kinetic Dissipation in Earth's Magnetosheath -- MMS Observations

Authors: Riddhi Bandyopadhyay, William H. Matthaeus, Tulasi N. Parashar, Yan Yang, Alexandros Chasapis, Barbara L. Giles, Daniel J. Gershman, Craig J. Pollock, Christopher T. Russell, Robert J. Strangeway, Roy B. Torbert, Thomas E. Moore, James L. Burch

Abstract: A familiar problem in space and astrophysical plasmas is to understand how dissipation and heating occur. These effects are often attributed to the cascade of broadband turbulence which transports energy from large scale reservoirs to small scale kinetic degrees of freedom. When collisions are infrequent, local thermodynamic equilibrium is not established. In this case the final stage of energy conversion becomes more complex than in the fluid case, and both pressure-dilatation and pressure strain interactions (Pi-D $\equiv -Π_{ij} D_{ij}$) become relevant and potentially important. Pi-D in plasma turbulence has been studied so far primarily using simulations. The present study provides a statistical analysis of Pi-D in the Earth's magnetosheath using the unique measurement capabilities of the Magnetospheric Multiscale (MMS) mission. We find that the statistics of Pi-D in this naturally occurring plasma environment exhibit strong resemblance to previously established fully kinetic simulation results. The conversion of energy is concentrated in space and occurs near intense current sheets, but not within them. This supports recent suggestions that the chain of energy transfer channels involves regional, rather than pointwise, correlations.

Submitted 19 May, 2020; originally announced May 2020.

Comments: Accepted for publication in Physical Review Letters

arXiv:2005.02367 [pdf, other]

CODA-19: Using a Non-Expert Crowd to Annotate Research Aspects on 10,000+ Abstracts in the COVID-19 Open Research Dataset

Authors: Ting-Hao 'Kenneth' Huang, Chieh-Yang Huang, Chien-Kuang Cornelia Ding, Yen-Chia Hsu, C. Lee Giles

Abstract: This paper introduces CODA-19, a human-annotated dataset that codes the Background, Purpose, Method, Finding/Contribution, and Other sections of 10,966 English abstracts in the COVID-19 Open Research Dataset. CODA-19 was created by 248 crowd workers from Amazon Mechanical Turk within 10 days, and achieved labeling quality comparable to that of experts. Each abstract was annotated by nine different workers, and the final labels were acquired by majority vote. The inter-annotator agreement (Cohen's kappa) between the crowd and the biomedical expert (0.741) is comparable to inter-expert agreement (0.788). CODA-19's labels have an accuracy of 82.2% when compared to the biomedical expert's labels, while the accuracy between experts was 85.0%. Reliable human annotations help scientists access and integrate the rapidly accelerating coronavirus literature, and also serve as the battery of AI/NLP research, but obtaining expert annotations can be slow. We demonstrated that a non-expert crowd can be rapidly employed at scale to join the fight against COVID-19.

Submitted 17 September, 2020; v1 submitted 5 May, 2020; originally announced May 2020.

Comments: Accepted by the NLP COVID-19 Workshop at ACL 2020. (The data, code, and model are available at: https://github.com/windx0303/CODA-19)
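Note: the CODA-19 abstract above mentions two standard procedures, majority voting over nine worker labels and Cohen's kappa between two annotators. The sketch below illustrates both on toy labels; the function names and example data are hypothetical and are not taken from the CODA-19 code base.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def cohen_kappa(a, b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two equal-length label lists."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators always use the same single label
    return (observed - expected) / (1 - expected)


# Nine hypothetical worker labels for one text segment.
workers = ["method"] * 5 + ["finding"] * 4
final_label = majority_vote(workers)

# Toy crowd-versus-expert comparison over four segments.
crowd = ["background", "method", "method", "finding"]
expert = ["background", "method", "finding", "finding"]
kappa = cohen_kappa(crowd, expert)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why the abstract quotes it separately from the accuracy figures.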

arXiv:2004.11131 [pdf, other]

doi 10.18653/v1/2021.acl-long.532

Privacy at Scale: Introducing the PrivaSeer Corpus of Web Privacy Policies

Authors: Mukund Srinath, Shomir Wilson, C. Lee Giles

Abstract: Organisations disclose their privacy practices by posting privacy policies on their website. Even though users often care about their digital privacy, they often don't read privacy policies since they require a significant investment in time and effort. Although natural language processing can help in privacy policy understanding, there has been a lack of large scale privacy policy corpora that could be used to analyse, understand, and simplify privacy policies. Thus, we create PrivaSeer, a corpus of over one million English language website privacy policies, which is significantly larger than any previously available corpus. We design a corpus creation pipeline which consists of crawling the web followed by filtering documents using language detection, document classification, duplicate and near-duplicate removal, and content extraction. We investigate the composition of the corpus, show results from readability tests, document similarity, and keyphrase extraction, and explore the corpus through topic modeling.

Submitted 30 March, 2024; v1 submitted 23 April, 2020; originally announced April 2020.

Journal ref: Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021

arXiv:2004.07623 [pdf, other]

Recognizing Long Grammatical Sequences Using Recurrent Networks Augmented With An External Differentiable Stack

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Clyde Lee Giles

Abstract: Recurrent neural networks (RNNs) are a widely used deep architecture for sequence modeling, generation, and prediction. Despite success in applications such as machine translation and voice recognition, these stateful models have several critical shortcomings. Specifically, RNNs generalize poorly over very long sequences, which limits their applicability to many important temporal processing and time series forecasting problems. For example, RNNs struggle to recognize complex context free languages (CFLs), never reaching 100% accuracy on training. One way to address these shortcomings is to couple an RNN with an external, differentiable memory structure, such as a stack. However, differentiable memories in prior work have neither been extensively studied on CFLs nor tested on sequences longer than those seen in training. The few efforts that have studied them have shown that continuous differentiable memory structures yield poor generalization for complex CFLs, making the RNN less interpretable. In this paper, we improve the memory-augmented RNN with important architectural and state updating mechanisms that ensure that the model learns to properly balance the use of its latent states with external memory. Our improved RNN models exhibit better generalization performance and are able to classify long strings generated by complex hierarchical context free grammars (CFGs). We evaluate our models on CFGs, including the Dyck languages, as well as on the Penn Treebank language modelling task, and achieve stable, robust performance across these benchmarks.
Furthermore, we show that only our memory-augmented networks are capable of retaining memory for a longer duration, up to strings of length 160.

Submitted 22 April, 2020; v1 submitted 4 April, 2020; originally announced April 2020.

Comments: 14 pages, 10 tables

arXiv:2004.07199 [pdf, other]

MMS SITL Ground Loop: Automating the burst data selection process

Authors: Matthew R. Argall, Colin Small, Samantha Piatt, Liam Breen, Marek Petrik, Kim Kokkonen, Julie Barnum, Kristopher Larsen, Frederick D. Wilder, Mitsuo Oka, William R. Paterson, Roy B. Torbert, Robert E. Ergun, Tai Phan, Barbara L. Giles, James L. Burch

Abstract: Global-scale energy flow throughout Earth's magnetosphere (MSP) is catalyzed by processes that occur at Earth's magnetopause (MP). Magnetic reconnection is one process responsible for solar wind entry into and global convection within the MSP, and the MP location, orientation, and motion have an impact on the dynamics. Statistical studies that focus on these and other MP phenomena and characteristics inherently require MP identification in their event search criteria, a task that can be automated using machine learning. We introduce a Long Short-Term Memory (LSTM) Recurrent Neural Network model to detect MP crossings and assist studies of energy transfer into the MSP. As its first application, the LSTM has been implemented into the operational data stream of the Magnetospheric Multiscale (MMS) mission. MMS focuses on the electron diffusion region of reconnection, where electron dynamics break magnetic field lines and plasma is energized. MMS employs automated burst triggers onboard the spacecraft and a Scientist-in-the-Loop (SITL) on the ground to select intervals likely to contain diffusion regions. Only low-resolution data is available to the SITL, which is insufficient to resolve electron dynamics. A strategy for the SITL, then, is to select all MP crossings. Of all 219 SITL selections classified as MP crossings during the first five months of model operations, the model predicted 166 (76%) of them, and of all 360 model predictions, 257 (71%) were selected by the SITL.
Most predictions that were not classified as MP crossings by the SITL were still MP-like; the intervals contained mixed magnetosheath and magnetospheric plasmas. The LSTM model and its predictions are public to ease the burden of arduous event searches involving the MP, including those for electron diffusion regions (EDRs). For MMS, this helps reduce mission operation costs by consolidating manual classification processes into automated routines.

Submitted 20 July, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

Comments: 21 pages, 8 figures, 3 tables, submitted to Frontiers: Space Science
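Note: the two percentages quoted in the abstract above follow directly from the stated counts. A short arithmetic check, with hypothetical variable names; reading the two ratios as recall-like and precision-like is an interpretation here, not the authors' wording.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Counts quoted in the abstract.
sitl_mp_selections = 219      # SITL selections classified as MP crossings
sitl_found_by_model = 166     # of those, how many the model predicted
model_predictions = 360       # all model predictions
model_selected_by_sitl = 257  # of those, how many the SITL selected

# Share of SITL-identified crossings that the model also found (recall-like).
recall_like = percent(sitl_found_by_model, sitl_mp_selections)
# Share of model predictions that the SITL also selected (precision-like).
precision_like = percent(model_selected_by_sitl, model_predictions)
print(recall_like, precision_like)  # prints "76 71"
```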

arXiv:2004.06164 [pdf, other]

doi 10.3847/1538-4357/ab89ad

Intermittency and Ion Temperature-Anisotropy Instabilities: Simulation and Magnetosheath Observation

Authors: Ramiz A. Qudsi, Riddhi Bandyopadhyay, Bennett A. Maruca, Tulasi N. Parashar, William H. Matthaeus, Alexandros Chasapis, S. Peter Gary, Barbara L. Giles, Daniel J. Gershman, Craig J. Pollock, Robert J. Strangeway, Roy B. Torbert, Thomas E. Moore, James L. Burch

Abstract: Weakly collisional space plasmas are rarely in local thermal equilibrium and often exhibit non-Maxwellian electron and ion velocity distributions that lead to the growth of microinstabilities, that is, enhanced electric and magnetic fields at relatively short wavelengths. These instabilities play an active role in the evolution of space plasmas, as does ubiquitous broadband turbulence induced by turbulent structures. This study compares certain properties of a 2.5-dimensional Particle-In-Cell (PIC) simulation for the forward cascade of Alfvenic turbulence in a collisionless plasma against the same properties of turbulence observed by the Magnetospheric Multiscale Mission spacecraft in the terrestrial magnetosheath. The PIC simulation is of decaying turbulence which develops both coherent structures and anisotropic ion velocity distributions with the potential to drive kinetic scale instabilities. The uniform background magnetic field points perpendicular to the plane of the simulation. Growth rates are computed from linear theory using the ion temperature anisotropies and ion beta values for both the simulation and the observations. Both the simulation and the observations show that strong anisotropies and growth rates occur highly intermittently in the plasma, and the simulation further shows that such anisotropies preferentially occur near current sheets. This suggests that, though microinstabilities may affect the plasma globally, they act locally and develop in response to extreme temperature anisotropies generated by turbulent structures.
Further studies will be necessary to understand why there is an apparent correlation between linear instability theory and strongly intermittent turbulence.

Submitted 13 April, 2020; originally announced April 2020.

arXiv:2003.03010 [pdf, other]

doi 10.1029/2020JA027985

Multi-scale coupling during magnetopause reconnection: the interface between the electron and ion diffusion regions

Authors: K. J. Genestreti, Y. -H. Liu, T. -D. Phan, R. E. Denton, R. B. Torbert, J. L. Burch, J. M. Webster, S. Wang, K. J. Trattner, M. R. Argall, L. -J. Chen, S. A. Fuselier, N. Ahmadi, R. E. Ergun, B. L. Giles, C. T. Russell, R. J. Strangeway, S. Eriksson

Abstract: Magnetospheric Multiscale (MMS) encountered the primary low-latitude magnetopause reconnection site when the inter-spacecraft separation exceeded the upstream ion inertial length. Classical signatures of the ion diffusion region (IDR), including a sub-ion-Alfvénic de-magnetized ion exhaust, a super-ion-Alfvénic magnetized electron exhaust, and Hall electromagnetic fields, are identified. The opening angle between the magnetopause and magnetospheric separatrix is $30^\circ\pm5^\circ$. The exhaust preferentially expands sunward, displacing the magnetosheath. Intense pileup of reconnected magnetic flux occurs between the magnetosheath separatrix and the magnetopause in a narrow channel intermediate between the ion and electron scales. The strength of the pileup (normalized values of 0.3-0.5) is consistent with the large angle at which the magnetopause is inclined relative to the overall reconnection coordinates. MMS-4, which was two ion inertial lengths closer to the X-line than the other three spacecraft, observed intense electron-dominated currents and kinetic-to-electromagnetic-field energy conversion within the pileup. MMS-1, 2, and 3 observed neither the intense currents nor the particle-to-field energy conversion but did observe the pileup, indicating that the edge of the generation region was contained within the tetrahedron. Comparisons with particle-in-cell simulations reveal that the electron currents and large inclination angle of the magnetopause are interconnected features of the asymmetric Hall effect.
Between the separatrix and the magnetopause, high-density inflowing magnetosheath electrons brake and turn into the outflow direction, imparting energy to the normal magnetic field and generating the pileup. The findings indicate that electron dynamics are likely an important influence on the magnetic field structure within the ion diffusion region.

Submitted 9 July, 2020; v1 submitted 5 March, 2020; originally announced March 2020.

Comments: Submitted to the Journal of Geophysical Research: Space Physics

arXiv:2002.06787 [pdf, ps, other]

doi 10.1103/PhysRevLett.124.065101

Observational Evidence for Stochastic Shock Drift Acceleration of Electrons at the Earth's Bow Shock

Authors: T. Amano, T. Katou, N. Kitamura, M. Oka, Y. Matsumoto, M. Hoshino, Y. Saito, S. Yokota, B. L. Giles, W. R. Paterson, C. T. Russell, O. Le Contel, R. E. Ergun, P. -A. Lindqvist, D. L. Turner, J. F. Fennell, J. B. Blake

Abstract: The first-order Fermi acceleration of electrons requires an injection of electrons into a mildly relativistic energy range. However, the mechanism of injection has remained a puzzle both in theory and observation. We present direct evidence for a novel stochastic shock drift acceleration theory for the injection obtained with Magnetospheric Multiscale (MMS) observations at Earth's bow shock. The theoretical model can explain electron acceleration to mildly relativistic energies at high-speed astrophysical shocks, which may provide a solution to the long-standing issue of electron injection.

Submitted 17 February, 2020; originally announced February 2020.

Comments: 7 pages, 4 figures. Published in PRL

Journal ref: Phys. Rev. Lett., 124, 065101, 2020

arXiv:2002.03911 [pdf, other]

Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment

Authors: Alexander Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles

Abstract: Training deep neural networks on large-scale datasets requires significant hardware resources whose costs (even on cloud platforms) put them out of reach of smaller organizations, groups, and individuals. Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. Furthermore, it requires researchers to continually develop various tricks, such as specialized weight initializations and activation functions, in order to ensure a stable parameter optimization. Our goal is to seek an effective, neuro-biologically-plausible alternative to backprop that can be used to train deep networks. In this paper, we propose a gradient-free learning procedure, recursive local representation alignment, for training large-scale neural architectures. Experiments with residual networks on CIFAR-10 and the large benchmark, ImageNet, show that our algorithm generalizes as well as backprop while converging sooner due to weight updates that are parallelizable and computationally less demanding. This is empirical evidence that a backprop-free algorithm can scale up to larger datasets.

Submitted 18 September, 2020; v1 submitted 10 February, 2020; originally announced February 2020.

Comments: Further revised submission -- main description of rec-LRA revamped and architecture-agnostic pseudo-code moved to appendix with additional results/derivation updates

arXiv:1912.09046 [pdf, other]

doi 10.3847/2041-8213/ab846e

In situ Measurement of Curvature of Magnetic Field in Turbulent Space Plasmas: A Statistical Study

Authors: Riddhi Bandyopadhyay, Yan Yang, William H. Matthaeus, Alexandros Chasapis, Tulasi N. Parashar, Christopher T. Russell, Robert J. Strangeway, Roy B. Torbert, Barbara L. Giles, Daniel J. Gershman, Craig J. Pollock, Thomas E. Moore, James L. Burch

Abstract: Using in situ data accumulated in the turbulent magnetosheath by the Magnetospheric Multiscale (MMS) Mission, we report a statistical study of magnetic field curvature and discuss its role in turbulent space plasmas. Consistent with previous simulation results, the Probability Distribution Function (PDF) of the curvature is shown to have distinct power-law tails for both high and low value limits. We find that the magnetic-field-line curvature is intermittently distributed in space. High curvature values reside near weak magnetic-field regions, while low curvature values are correlated with small magnitude of the force acting normal to the field lines. A simple statistical treatment provides an explanation for the observed curvature distribution. This novel statistical characterization of magnetic curvature in space plasma provides a starting point for assessing, in a turbulence context, the applicability and impact of particle energization processes, such as curvature drift, that rely on this fundamental quantity.

Submitted 29 March, 2020; v1 submitted 19 December, 2019; originally announced December 2019.

Comments: Accepted for Publication in the Astrophysical Journal Letters

arXiv:1912.04115 [pdf, other]

Query Auto Completion for Math Formula Search

Authors: Shaurya Rohatgi, Wei Zhong, Richard Zanibbi, Jian Wu, C. Lee Giles

Abstract: Query Auto Completion (QAC) is among the most appealing features of a web search engine. It helps users formulate queries quickly with less effort. Although there has been much effort in this area for text, to the best of our knowledge there is little work on mathematical formula auto completion. In this paper, we implement 5 existing QAC methods on mathematical formulae and evaluate them on the NTCIR-12 MathIR task dataset. We report the efficiency of retrieved results using Mean Reciprocal Rank (MRR) and Mean Average Precision (MAP). Our study indicates that the Finite State Transducer outperforms other QAC models with an MRR score of $0.642$.

Submitted 9 December, 2019; originally announced December 2019.

arXiv:1912.00839 [pdf, other]

Automatic Generation of Headlines for Online Math Questions

Authors: Ke Yuan, Dafang He, Zhuoren Jiang, Liangcai Gao, Zhi Tang, C. Lee Giles

Abstract: Mathematical equations are an important part of dissemination and communication of scientific information. Students, however, often feel challenged in reading and understanding math content and equations. With the development of the Web, students are posting their math questions online. Nevertheless, constructing a concise math headline that gives a good description of the posted detailed math question is nontrivial. In this study, we explore a novel summarization task denoted as geNerating A concise Math hEadline from a detailed math question (NAME). Compared to conventional summarization tasks, this task has two extra and essential constraints: 1) Detailed math questions consist of text and math equations which require a unified framework to jointly model textual and mathematical information; 2) Unlike text, math equations contain semantic and structural features, and both of them should be captured together. To address these issues, we propose MathSum, a novel summarization model which utilizes a pointer mechanism combined with a multi-head attention mechanism for mathematical representation augmentation. The pointer mechanism can either copy textual tokens or math tokens from source questions in order to generate math headlines. The multi-head attention mechanism is designed to enrich the representation of math equations by modeling and integrating both its semantic and structural features.
For evaluation, we collect and make available two sets of real-world detailed math questions along with human-written math headlines, namely EXEQ-300k and OFEQ-10k. Experimental results demonstrate that our model (MathSum) significantly outperforms state-of-the-art models for both the EXEQ-300k and OFEQ-10k datasets.

Submitted 27 November, 2019; originally announced December 2019.

Journal ref: AAA2020

arXiv:1911.08478 [pdf, other]

Sibling Neural Estimators: Improving Iterative Image Decoding with Gradient Communication

Authors: Ankur Mali, Alexander G. Ororbia, Clyde Lee Giles

Abstract: For lossy image compression, we develop a neural-based system which learns a nonlinear estimator for decoding from quantized representations. The system links two recurrent networks that "help" each other reconstruct the same target image patches using complementary portions of spatial context that communicate via gradient signals. This dual agent system builds upon prior work that proposed the iterative refinement algorithm for recurrent neural network (RNN)-based decoding, which improved image reconstruction compared to standard decoding techniques. Our approach, which works with any encoder, neural or non-neural, progressively reduces image patch reconstruction error over a fixed number of steps. Experiments with variants of RNN memory cells, with and without future information, find that our model consistently creates lower distortion images of higher perceptual quality compared to other approaches. Specifically, on the Kodak Lossless True Color Image Suite, we observe as much as a 1.64 decibel (dB) gain over JPEG, a 1.46 dB gain over JPEG 2000, a 1.34 dB gain over the GOOG neural baseline, 0.36 over E2E (a modern competitive neural compression model), and 0.37 over a single iterative neural decoder.

Submitted 19 November, 2019; originally announced November 2019.

Comments: 11 Pages, 2 figures, 1 Table

arXiv:1911.04644 [pdf, other]

Connecting First and Second Order Recurrent Networks with Deterministic Finite Automata

Authors: Qinglong Wang, Kaixuan Zhang, Xue Liu, C. Lee Giles

Abstract: We propose an approach that connects recurrent networks with different orders of hidden interaction with regular grammars of different levels of complexity. We argue that the correspondence between recurrent networks and formal computational models gives understanding to the analysis of the complicated behaviors of recurrent networks. We introduce an entropy value that categorizes all regular grammars into three classes with different levels of complexity, and show that several existing recurrent networks match grammars from either all or partial classes. As such, the differences between regular grammars reveal the different properties of these models. We also provide a unification of all investigated recurrent networks. Our evaluation shows that the unified recurrent network has improved performance in learning grammars, and demonstrates comparable performance on a real-world dataset with more complicated models.

Submitted 11 November, 2019; originally announced November 2019.

arXiv:1910.06509 [pdf, other]

Shapley Homology: Topological Analysis of Sample Influence for Neural Networks

Authors: Kaixuan Zhang, Qinglong Wang, Xue Liu, C. Lee Giles

Abstract: Data samples collected for training machine learning models are typically assumed to be independent and identically distributed (iid). Recent research has demonstrated that this assumption can be problematic as it simplifies the manifold of structured data. This has motivated different research areas such as data poisoning, model improvement, and explanation of machine learning models. In this work, we study the influence of a sample on determining the intrinsic topological features of its underlying manifold. We propose the Shapley Homology framework, which provides a quantitative metric for the influence of a sample on the homology of a simplicial complex. By interpreting the influence as a probability measure, we further define an entropy which reflects the complexity of the data manifold. Our empirical studies show that when using the 0-dimensional homology, on neighboring graphs, samples with higher influence scores have more impact on the accuracy of neural networks for determining the graph connectivity and on several regular grammars whose higher entropy values imply more difficulty in being learned.

Submitted 14 October, 2019; originally announced October 2019.

arXiv:1909.11255 [pdf]

doi 10.1029/2019GL085542

A New Method of 3D Magnetic Field Reconstruction

Authors: R. B. Torbert, I. Dors, M. R. Argall, K. J. Genestreti, J. L. Burch, C. J. Farrugia, T. G. Forbes, B. L. Giles, R. J. Strangeway

Abstract: A method is described to model the magnetic field in the vicinity of constellations of multiple satellites using field and plasma current measurements. This quadratic model has the properties that the divergence is zero everywhere and matches the measured values of the magnetic field and its curl (current) at each spacecraft, and thus extends the linear curlometer method to second order. It is able to predict the topology of the field lines near magnetic structures, such as near reconnecting regions or flux ropes, and allows a tracking of the motion of these structures relative to the spacecraft constellation. Comparisons to PIC simulations estimate the model accuracy. Reconstructions of two electron diffusion regions show the expected field line structure. The model can be applied to other small-scale phenomena (bow shock, waves of commensurate wavelength), and can be modified to also reconstruct the electric field, allowing tracing of particle trajectories.

Submitted 24 September, 2019; originally announced September 2019.

Comments: 27 pages; 7 figures; 1 table

arXiv:1909.05233 [pdf, other]

The Neural State Pushdown Automata

Authors: Ankur Mali, Alexander Ororbia, C. Lee Giles

Abstract: In order to learn complex grammars, recurrent neural networks (RNNs) require sufficient computational resources to ensure correct grammar recognition. A widely-used approach to expand model capacity would be to couple an RNN to an external memory stack. Here, we introduce a "neural state" pushdown automaton (NSPDA), which consists of a digital stack, instead of an analog one, that is coupled to a neural network state machine. We empirically show its effectiveness in recognizing various context-free grammars (CFGs). First, we develop the underlying mechanics of the proposed higher order recurrent network and its manipulation of a stack as well as how to stably program its underlying pushdown automaton (PDA) to achieve desired finite-state network dynamics. Next, we introduce a noise regularization scheme for higher-order (tensor) networks, to our knowledge the first of its kind, and design an algorithm for improved incremental learning. Finally, we design a method for inserting grammar rules into an NSPDA and empirically show that this prior knowledge improves its training convergence time by an order of magnitude and, in some cases, leads to better generalization. The NSPDA is also compared to a classical analog stack neural network pushdown automaton (NNPDA) as well as a wide array of first and second-order RNNs with and without external memory, trained using different learning algorithms.
Our results show that, for Dyck(2) languages, prior rule-based knowledge is critical for optimization convergence and for ensuring generalization to longer sequences at test time. We observe that many RNNs with and without memory, but no prior knowledge, fail to converge and generalize poorly on CFGs.

Submitted 19 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

Comments: 10 pages, 7 Table, 1 figure

arXiv:1907.06802 [pdf, other]

doi 10.1103/PhysRevLett.124.225101

In-situ observation of Hall Magnetohydrodynamic Cascade in Space Plasma

Authors: Riddhi Bandyopadhyay, Luca Sorriso-Valvo, Alexandros Chasapis, Petr Hellinger, William H. Matthaeus, Andrea Verdini, Simone Landi, Luca Franci, Lorenzo Matteini, Barbara L. Giles, Daniel J. Gershman, Craig J. Pollock, Christopher T. Russell, Robert J. Strangeway, Roy B. Torbert, Thomas E. Moore, James L. Burch

Abstract: We present estimates of the turbulent energy cascade rate, derived from a Hall-MHD third-order law. We compute the contribution from the Hall term and the MHD term to the energy flux. We use MMS data accumulated in the magnetosheath and the solar wind, and compare the results with previously established simulation results. We find that in observation, the MHD contribution is dominant at inertial scales, as in the simulations, but the Hall term becomes significant in observations at larger scales than in the simulations. Possible reasons are offered for this unanticipated result.

Submitted 2 May, 2020; v1 submitted 15 July, 2019; originally announced July 2019.

Comments: Accepted for Publication in Physical Review Letters

Journal ref: Phys. Rev. Lett. 124, 225101 (2020)

arXiv:1906.08470 [pdf, other]

Cleaning Noisy and Heterogeneous Metadata for Record Linking Across Scholarly Big Datasets

Authors: Athar Sefid, Jian Wu, Allen C. Ge, Jing Zhao, Lu Liu, Cornelia Caragea, Prasenjit Mitra, C. Lee Giles

Abstract: Automatically extracted metadata from scholarly documents in PDF formats is usually noisy and heterogeneous, often containing incomplete fields and erroneous values. One common way of cleaning metadata is to use a bibliographic reference dataset. The challenge is to match records between corpora with high precision. The existing solution which is based on information retrieval and string similarity on titles works well only if the titles are cleaned. We introduce a system designed to match scholarly document entities with noisy metadata against a reference dataset. The blocking function uses the classic BM25 algorithm to find the matching candidates from the reference data that has been indexed by ElasticSearch. The core components use supervised methods which combine features extracted from all available metadata fields. The system also leverages available citation information to match entities. The combination of metadata and citation achieves high accuracy that significantly outperforms the baseline method on the same test dataset. We apply this system to match the database of CiteSeerX against Web of Science, PubMed, and DBLP. This method will be deployed in the CiteSeerX system to clean metadata and link records to other scholarly big datasets.

Submitted 20 June, 2019; originally announced June 2019.

arXiv:1905.10696 [pdf, other]

Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting

Authors: Alexander Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles

Abstract: In lifelong learning systems based on artificial neural networks, one of the biggest obstacles is the inability to retain old knowledge as new information is encountered. This phenomenon is known as catastrophic forgetting. In this paper, we propose a new kind of connectionist architecture, the Sequential Neural Coding Network, that is robust to forgetting when learning from streams of data points and, unlike networks of today, does not learn via the popular back-propagation of errors. Grounded in the neurocognitive theory of predictive processing, our model adapts synapses in a biologically-plausible fashion while another neural system learns to direct and control this cortex-like structure, mimicking some of the task-executive control functionality of the basal ganglia. In our experiments, we demonstrate that our self-organizing system experiences significantly less forgetting compared to standard neural models, outperforming a swath of previously proposed methods, including rehearsal/data buffer-based methods, on both standard (SplitMNIST, Split Fashion MNIST, etc.) and custom benchmarks even though it is trained in a stream-like fashion. Our work offers evidence that emulating mechanisms in real neuronal systems, e.g., local learning, lateral competition, can yield new directions and possibilities for tackling the grand challenge of lifelong machine learning.

Submitted 14 August, 2022; v1 submitted 25 May, 2019; originally announced May 2019.

Comments: Updated revision, additional baseline results, and expanded appendix (includes derivation from total discrepancy/variational free energy)

arXiv:1905.09466 [pdf, ps, other]

doi 10.1017/S0022377820000021

On the deviation from Maxwellian of the ion velocity distribution functions in the turbulent magnetosheath

Authors: Silvia Perri, D. Perrone, E. Yordanova, L. Sorriso-Valvo, W. R. Paterson, D. J. Gershman, B. L. Giles, C. J. Pollock, J. C. Dorelli, L. A. Avanov, B. Lavraud, Y. Saito, R. Nakamura, D. Fischer, W. Baumjohann, F. Plaschke, Y. Narita, W. Magnes, C. T. Russell, R. J. Strangeway, O. Le Contel, Y. Khotyaintsev, F. Valentini

Abstract: The degree of deviation from the thermodynamic equilibrium in the ion velocity distribution functions (VDFs), measured by the Magnetospheric Multiscale (MMS) mission in the Earth's turbulent magnetosheath, is quantitatively investigated. Taking advantage of MMS ion data, having a resolution never reached before in space missions, and of the comparison with Vlasov-Maxwell simulations, this analysis aims at relating any deviation from Maxwellian equilibrium to typical plasma parameters. Correlations of the non-Maxwellian features with plasma quantities such as electric fields, ion temperature, current density and ion vorticity are very similar in both magnetosheath data and numerical experiments, and suggest that distortions in the ion VDFs occur close to (but not exactly at) peaks in current density and ion temperature. Similar results have also been found during a magnetopause crossing by MMS. This work could help clarify the origin of distortion of the ion VDFs in space plasmas.

Submitted 22 May, 2019; originally announced May 2019.

arXiv:1903.01876 [pdf, other]

doi 10.1103/PhysRevE.99.043204

In situ spacecraft observations of a structured electron diffusion region during magnetopause reconnection

Authors: Giulia Cozzani, Alessandro Retinò, Francesco Califano, Alexandra Alexandrova, Olivier Le Contel, Yuri Khotyaintsev, Andris Vaivads, Huishan Fu, Filomena Catapano, Hugo Breuillard, Narges Ahmadi, Per-Arne Lindqvist, Robert E. Ergun, Robert B. Torbert, Barbara L. Giles, Christopher T. Russell, Rumi Nakamura, Stephen Fuselier, Barry H. Mauk, Thomas Moore, James L. Burch

Abstract: The Electron Diffusion Region (EDR) is the region where magnetic reconnection is initiated and electrons are energized. Because of experimental difficulties, the structure of the EDR is still poorly understood. A key question is whether the EDR has a homogeneous or patchy structure. Here we report Magnetospheric MultiScale (MMS) novel spacecraft observations providing evidence of inhomogeneous current densities and energy conversion over a few electron inertial lengths within an EDR at the terrestrial magnetopause, suggesting that the EDR can be rather structured. These inhomogeneities are revealed through multi-point measurements because the spacecraft separation is comparable to a few electron inertial lengths, allowing the entire MMS tetrahedron to be within the EDR most of the time. These observations are consistent with recent high-resolution and low-noise kinetic simulations.

Submitted 5 March, 2019; originally announced March 2019.

Journal ref: Phys. Rev. E 99, 043204 (2019)

arXiv:1901.03869 [pdf, other]

doi 10.3847/2041-8213/aafe0d

Magnetospheric Multiscale Observation of Kinetic Signatures in the Alfvén Vortex

Authors: Tieyan Wang, Olga Alexandrova, Denise Perrone, Malcolm Dunlop, Xiangcheng Dong, Robert Bingham, Yu. V. Khotyaintsev, C. T. Russell, B. L. Giles, R. B. Torbert, R. E. Ergun, J. L. Burch

Abstract: The Alfvén vortex is a multi-scale nonlinear structure which contributes to intermittency of turbulence. Despite previous explorations mostly on the spatial properties of the Alfvén vortex (i.e., scale, orientation, and motion), the plasma characteristics within the Alfvén vortex are unknown. Moreover, the connection between the plasma energization and the Alfvén vortex still remains unclear. Based on high resolution in-situ measurement from the Magnetospheric Multiscale (MMS) mission, we report for the first time, distinctive plasma features within an Alfvén vortex. This Alfvén vortex is identified to be two-dimensional ($k_{\bot} \gg k_{\|}$) quasi-monopole with a radius of ~10 proton gyroscales. Its magnetic fluctuations $δB_{\bot}$ are anti-correlated with velocity fluctuations $δV_{\bot}$, thus the parallel current density $j_{\|}$ and flow vorticity $ω_{\|}$ are anti-aligned. In different parts of the vortex (i.e., edge, middle, center), the ion and electron temperatures are found to be quite different and they behave in the reverse trend: the ion temperature variations are correlated with $j_{\|}$, while the electron temperature variations are correlated with $ω_{\|}$. Furthermore, the temperature anisotropies, together with the non-Maxwellian kinetic effects, exhibit strong enhancement at peaks of $|ω_{\|}| (|j_{\|}|)$ within the vortex. Comparisons between observations and numerical/theoretical results are made. In addition, the energy-conversion channels and the compressibility associated with the Alfvén vortex are discussed.
These results may help to understand the link between coherent vortex structures and the kinetic processes, which determines how turbulence energy dissipates in the weakly-collisional space plasmas.

Submitted 12 January, 2019; originally announced January 2019.

arXiv:1901.01076 [pdf, other]

doi 10.1029/2018GL081804

Observations of Magnetic Reconnection in the Transition Region of Quasi-Parallel Shocks

Authors: I. Gingell, S. J. Schwartz, J. P. Eastwood, J. E. Stawarz, J. L. Burch, R. E. Ergun, S. Fuselier, D. J. Gershman, B. L. Giles, Y. V. Khotyaintsev, B. Lavraud, P. -A. Lindqvist, W. R. Paterson, T. D. Phan, C. T. Russell, R. J. Strangeway, R. B. Torbert, F. Wilder

Abstract: Using observations of Earth's bow shock by the Magnetospheric Multiscale mission, we show for the first time that active magnetic reconnection is occurring at current sheets embedded within the quasi-parallel shock's transition layer. We observe an electron jet and heating but no ion response, suggesting we have observed an electron-only mode. The lack of ion response is consistent with simulations showing reconnection onset on sub-ion timescales. We also discuss the impact of electron heating in shocks via reconnection.

Submitted 4 January, 2019; originally announced January 2019.

arXiv:1811.06029 [pdf, other]

Verification of Recurrent Neural Networks Through Rule Extraction

Authors: Qinglong Wang, Kaixuan Zhang, Xue Liu, C. Lee Giles

Abstract: The verification problem for neural networks is verifying whether a neural network will suffer from adversarial samples, or approximating the maximal allowed scale of adversarial perturbation that can be endured. While most prior work contributes to verifying feed-forward networks, little has been explored for verifying recurrent networks. This is due to the existence of a more rigorous constraint on the perturbation space for sequential data, and the lack of a proper metric for measuring the perturbation. In this work, we address these challenges by proposing a metric which measures the distance between strings, and use deterministic finite automata (DFA) to represent a rigorous oracle which examines if the generated adversarial samples violate certain constraints on a perturbation. More specifically, we empirically show that certain recurrent networks allow relatively stable DFA extraction. As such, DFAs extracted from these recurrent networks can serve as a surrogate oracle for when the ground truth DFA is unknown. We apply our verification mechanism to several widely used recurrent networks on a set of the Tomita grammars. The results demonstrate that only a few models remain robust against adversarial samples. In addition, we show that for grammars with different levels of complexity, there is also a difference in the difficulty of robust learning of these grammars.

Submitted 14 November, 2018; originally announced November 2018.

arXiv:1810.07411 [pdf, other]

Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations

Authors: Alexander Ororbia, Ankur Mali, C. Lee Giles, Daniel Kifer

Abstract: Temporal models based on recurrent neural networks have proven to be quite powerful in a wide variety of applications. However, training these models often relies on back-propagation through time, which entails unfolding the network over many time steps, making the process of conducting credit assignment considerably more challenging. Furthermore, the nature of back-propagation itself does not permit the use of non-differentiable activation functions and is inherently sequential, making parallelization of the underlying training process difficult. Here, we propose the Parallel Temporal Neural Coding Network (P-TNCN), a biologically inspired model trained by the learning algorithm we call Local Representation Alignment. It aims to resolve the difficulties and problems that plague recurrent networks trained by back-propagation through time. The architecture requires neither unrolling in time nor the derivatives of its internal activation functions. We compare our model and learning procedure to other back-propagation through time alternatives (which also tend to be computationally expensive), including real-time recurrent learning, echo state networks, and unbiased online recurrent optimization. We show that it outperforms these on sequence modeling benchmarks such as Bouncing MNIST, a new benchmark we denote as Bouncing NotMNIST, and Penn Treebank. Notably, our approach can in some instances outperform full back-propagation through time as well as variants such as sparse attentive back-tracking.
Significantly, the hidden unit correction phase of P-TNCN allows it to adapt to new datasets even if its synaptic weights are held fixed (zero-shot adaptation) and facilitates retention of prior generative knowledge when faced with a task sequence. We present results that show the P-TNCN's ability to conduct zero-shot adaptation and online continual sequence modeling.

Submitted 10 August, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

Comments: Important revisions made throughout (additional items/results added, including a complexity analysis)

arXiv:1809.06932 [pdf]

doi 10.1126/science.aat2998

Electron-Scale Dynamics of the Diffusion Region during Symmetric Magnetic Reconnection in Space

Authors: R. B. Torbert, J. L. Burch, T. D. Phan, M. Hesse, M. R. Argall, J. Shuster, R. E. Ergun, L. Alm, R. Nakamura, K. Genestreti, D. J. Gershman, W. R. Paterson, D. L. Turner, I. Cohen, B. L. Giles, C. J. Pollock, S. Wang, L. -J. Chen, Julia Stawarz, J. P. Eastwood, K. - J. Hwang, C. Farrugia, I. Dors, H. Vaith, C. Mouikis , et al. (24 additional authors not shown)

Abstract: Magnetic reconnection is an energy conversion process important in many astrophysical contexts including the Earth's magnetosphere, where the process can be investigated in-situ. Here we present the first encounter of a reconnection site by NASA's Magnetospheric Multiscale (MMS) spacecraft in the magnetotail, where reconnection involves symmetric inflow conditions. The unprecedented electron-scale plasma measurements revealed (1) super-Alfvenic electron jets reaching 20,000 km/s, (2) electron meandering motion and acceleration by the electric field, producing multiple crescent-shaped structures, and (3) spatial dimensions of the electron diffusion region implying a reconnection rate of 0.1-0.2. The well-structured multiple layers of electron populations indicate that, despite the presence of turbulence near the reconnection site, the key electron dynamics appear to be largely laminar.

Submitted 18 September, 2018; originally announced September 2018.

Comments: 4 pages, 3 figures, and supplementary material

arXiv:1809.03050 [pdf, other]

TextContourNet: a Flexible and Effective Framework for Improving Scene Text Detection Architecture with a Multi-task Cascade

Authors: Dafang He, Xiao Yang, Daniel Kifer, C. Lee Giles

Abstract: We study the problem of extracting text instance contour information from images and use it to assist scene text detection. We propose a novel and effective framework for this and experimentally demonstrate that (1) a CNN can be effectively used to extract instance-level text contours from natural images, and (2) the extracted contour information can be used for better scene text detection. We propose two ways of learning the contour task together with scene text detection: (1) as an auxiliary task and (2) as a multi-task cascade. Extensive experiments with different benchmark datasets demonstrate that both designs improve the performance of a state-of-the-art scene text detector and that the multi-task cascade design achieves the best performance.

Submitted 2 December, 2018; v1 submitted 9 September, 2018; originally announced September 2018.

Comments: 9 pages(including references); WACV 2019

arXiv:1809.03036 [pdf, ps, other]

A Neural Temporal Model for Human Motion Prediction

Authors: Anand Gopalakrishnan, Ankur Mali, Dan Kifer, C. Lee Giles, Alexander G. Ororbia

Abstract: We propose novel neural temporal models for predicting and synthesizing human motion, achieving state-of-the-art performance in modeling long-term motion trajectories while being competitive with prior work in short-term prediction and requiring significantly less computation. Key aspects of our proposed system include: 1) a novel, two-level processing architecture that aids in generating planned trajectories, 2) a simple set of easily computable features that integrate derivative information, and 3) a novel multi-objective loss function that helps the model to slowly progress from simple next-step prediction to the harder task of multi-step, closed-loop prediction. Our results demonstrate that these innovations improve the modeling of long-term motion trajectories. Finally, we propose a novel metric, called Normalized Power Spectrum Similarity (NPSS), to evaluate the long-term predictive ability of motion synthesis models, complementing the popular mean-squared error (MSE) measure of Euler joint angles over time. We conduct a user study to determine if the proposed NPSS correlates with human evaluation of long-term motion more strongly than MSE and find that it indeed does. We release code and additional results (visualizations) for this paper at: https://github.com/cr7anand/neural_temporal_models

Submitted 22 November, 2019; v1 submitted 9 September, 2018; originally announced September 2018.

Comments: Accepted to CVPR 2019

Journal ref: In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 12116-12125. 2019

arXiv:1809.02033 [pdf, other]

doi 10.1103/PhysRevLett.121.265101

Kinetic range spectral features of cross-helicity using MMS

Authors: Tulasi N. Parashar, Alexandros Chasapis, Riddhi Bandyopadhyay, Rohit Chhiber, W. H. Matthaeus, B. Maruca, M. A. Shay, J. L. Burch, T. E. Moore, B. L. Giles, D. J. Gershman, C. J. Pollock, R. B. Torbert, C. T. Russell, R. J. Strangeway, Vadim Roytershteyn

Abstract: We study spectral features of ion velocity and magnetic field correlations in the solar wind and in the magnetosheath using data from the Magnetospheric Multi-Scale (MMS) spacecraft. High resolution MMS observations enable the study of the transition of these correlations from their magnetofluid character at larger scales into the sub-proton kinetic range, previously unstudied in spacecraft data. Cross-helicity, angular alignment and energy partitioning are examined over a suitable range of scales, employing measurements based on the Taylor frozen-in approximation as well as direct two-spacecraft correlation measurements. The results demonstrate signatures of alignment at large scales. As kinetic scales are approached, the alignment between v and b is destroyed by demagnetization of protons.

Submitted 6 September, 2018; originally announced September 2018.

Journal ref: Phys. Rev. Lett. 121, 265101 (2018)

arXiv:1808.03603 [pdf, other]

doi 10.1029/2018JA025711

How accurately can we measure the reconnection rate $E_M$ for the MMS diffusion region event of 2017-07-11?

Authors: Kevin J. Genestreti, Takuma Nakamura, Rumi Nakamura, Richard E. Denton, Roy B. Torbert, James L. Burch, Ferdinand Plaschke, Stephen A. Fuselier, Robert E. Ergun, Barbara L. Giles, Christopher T. Russell

Abstract: We investigate the accuracy with which the reconnection electric field $E_M$ can be determined from in-situ plasma data. We study the magnetotail electron diffusion region observed by NASA's Magnetospheric Multiscale (MMS) on 2017-07-11 at 22:34 UT and focus on the very large errors in $E_M$ that result from errors in an $LMN$ boundary-normal coordinate system. We determine several $LMN$ coordinate systems for this MMS event using several different methods. We use these $M$ axes to estimate $E_M$. We find some consensus that the reconnection rate was roughly $E_M$=3.2 mV/m $\pm$ 0.06 mV/m, which corresponds to a normalized reconnection rate of $0.18\pm0.035$. Minimum variance analysis of the electron velocity (MVA-$v_e$), MVA of $E$, minimization of Faraday residue, and an adjusted version of the maximum directional derivative of the magnetic field (MDD-$B$) technique all produce reasonably similar coordinate axes. We use virtual MMS data from a particle-in-cell simulation of this event to estimate the errors in the coordinate axes and reconnection rate associated with MVA-$v_e$ and MDD-$B$. The $L$ and $M$ directions are most reliably determined by MVA-$v_e$ when the spacecraft observes a clear electron jet reversal. When the magnetic field data has errors as small as 0.5\% of the background field strength, the $M$ direction obtained by the MDD-$B$ technique may be off by as much as 35$^\circ$. The normal direction is most accurately obtained by MDD-$B$.
Overall, we find that these techniques were able to identify $E_M$ from the virtual data within error bars $\geq$20\%.

Submitted 10 August, 2018; originally announced August 2018.

Comments: Submitted to JGR - Space Physics

arXiv:1807.06140 [pdf, other]

doi 10.3847/1538-4357/aade93

Solar Wind Turbulence Studies using MMS Fast Plasma Investigation Data

Authors: Riddhi Bandyopadhyay, A. Chasapis, R. Chhiber, T. N. Parashar, B. A. Maruca, W. H. Matthaeus, S. J. Schwartz, S. Eriksson, O. LeContel, H. Breuillard, J. L. Burch, T. E. Moore, C. J. Pollock, B. L. Giles, W. R. Paterson, J. Dorelli, D. J. Gershman, R. B. Torbert, C. T. Russell, R. J. Strangeway

Abstract: Studies of solar wind turbulence traditionally employ high-resolution magnetic field data, but high-resolution measurements of ion and electron moments have been possible only recently. We report the first turbulence studies of ion and electron velocity moments accumulated in pristine solar wind by the Fast Plasma Investigation instrument onboard the Magnetospheric Multiscale (MMS) Mission. Use of these data is made possible by a novel implementation of a frequency domain Hampel filter, described herein. After presenting procedures for processing of the data, we discuss statistical properties of solar wind turbulence extending into the kinetic range. Magnetic field fluctuations dominate electron and ion velocity fluctuation spectra throughout the energy-containing and inertial ranges. However, a multi-spacecraft analysis indicates that at scales shorter than the ion-inertial length, electron velocity fluctuations become larger than ion velocity and magnetic field fluctuations. The kurtosis of ion velocity peaks around a few ion-inertial lengths and returns to a near-Gaussian value at sub-ion scales.

Submitted 2 September, 2018; v1 submitted 16 July, 2018; originally announced July 2018.

Comments: Accepted for publication in The Astrophysical Journal Supplement

arXiv:1806.04275 [pdf, ps, other]

doi 10.3847/1538-4357/aade93

Incompressive Energy Transfer in the Earth's Magnetosheath: Magnetospheric Multiscale Observations

Authors: Riddhi Bandyopadhyay, A. Chasapis, R. Chhiber, T. N. Parashar, W. H. Matthaeus, M. A. Shay, B. A. Maruca, J. L. Burch, T. E. Moore, C. J. Pollock, B. L. Giles, W. R. Paterson, J. Dorelli, D. J. Gershman, R. B. Torbert, C. T. Russell, R. J. Strangeway

Abstract: Using observational data from the Magnetospheric Multiscale (MMS) Mission in the Earth's magnetosheath, we estimate the energy cascade rate using different techniques within the framework of incompressible magnetohydrodynamic (MHD) turbulence. At the energy-containing scale, the energy budget is controlled by the von Kármán decay law. Inertial range cascade is estimated by fitting a linear scaling to the mixed third-order structure function. Finally, we use a multi-spacecraft technique to estimate the Kolmogorov-Yaglom-like cascade rate in the kinetic range, well below the ion inertial length scale. We find that the inertial range cascade rate is almost equal to the one predicted by the von Kármán law at the energy-containing scale, while the cascade rate evaluated at the kinetic scale is somewhat lower, as anticipated in theory \citep{Yang2017PoP}. Further, in agreement with a recent study \citep{Hadid2018PRL}, we find that the incompressive cascade rate in the Earth's magnetosheath is about $1000$ times larger than the cascade rate in the pristine solar wind.

Submitted 29 August, 2018; v1 submitted 11 June, 2018; originally announced June 2018.

Comments: Accepted for publication in Astrophysical Journal

arXiv:1804.08588 [pdf, other]

Large Scale Scene Text Verification with Guided Attention

Authors: Dafang He, Yeqing Li, Alexander Gorban, Derrall Heath, Julian Ibarz, Qian Yu, Daniel Kifer, C. Lee Giles

Abstract: Many tasks are related to determining if a particular text string exists in an image. In this work, we propose a new framework that learns this task in an end-to-end way. The framework takes an image and a text string as input and then outputs the probability of the text string being present in the image. This is the first end-to-end framework that learns such relationships between text and images in the scene text area. The framework does not require explicit scene text detection or recognition, and thus no bounding box annotations are needed for it. It is also the first work in the scene text area that tackles such a weakly labeled problem. Based on this framework, we developed a model called Guided Attention. Our designed model achieves much better results than several state-of-the-art scene text reading based solutions for a challenging Street View Business Matching task. The task tries to find correct business names for storefront images, and the dataset we collected for it is substantially larger and more challenging than existing scene text datasets. This new real-world task provides a new perspective for studying scene text related problems. We also demonstrate the uniqueness of our task via a comparison between our problem and a typical Visual Question Answering problem.

Submitted 18 November, 2018; v1 submitted 23 April, 2018; originally announced April 2018.

Comments: 18 pages, ACCV 2019

arXiv:1804.08058 [pdf, other]

Adversarial Training for Community Question Answer Selection Based on Multi-scale Matching

Authors: Xiao Yang, Madian Khabsa, Miaosen Wang, Wei Wang, Ahmed Awadallah, Daniel Kifer, C. Lee Giles

Abstract: Community-based question answering (CQA) websites represent an important source of information. As a result, the problem of matching the most valuable answers to their corresponding questions has become an increasingly popular research topic. We frame this task as a binary (relevant/irrelevant) classification problem, and present an adversarial training framework to alleviate the label imbalance issue. We employ a generative model to iteratively sample a subset of challenging negative samples to fool our classification model. Both models are alternately optimized using the REINFORCE algorithm. The proposed method differs from previous ones, in which negative samples in the training set are directly used or uniformly down-sampled. Further, we propose using Multi-scale Matching, which explicitly inspects the correlation between words and n-grams at different levels of granularity. We evaluate the proposed method on the SemEval 2016 and SemEval 2017 datasets and achieve state-of-the-art or similar performance.

Submitted 16 November, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

arXiv:1803.05863 [pdf, other]

Learned Neural Iterative Decoding for Lossy Image Compression Systems

Authors: Alexander G. Ororbia, Ankur Mali, Jian Wu, Scott O'Connell, David Miller, C. Lee Giles

Abstract: For lossy image compression systems, we develop an algorithm, iterative refinement, to improve the decoder's reconstruction compared to standard decoding techniques. Specifically, we propose a recurrent neural network approach for nonlinear, iterative decoding. Our decoder, which works with any encoder, employs self-connected memory units that make use of causal and non-causal spatial context information to progressively reduce reconstruction error over a fixed number of steps. We experiment with variants of our estimator and find that iterative refinement consistently creates lower distortion images of higher perceptual quality compared to other approaches. Specifically, on the Kodak Lossless True Color Image Suite, we observe as much as a 0.871 decibel (dB) gain over JPEG, a 1.095 dB gain over JPEG 2000, and a 0.971 dB gain over a competitive neural model.

Submitted 10 November, 2018; v1 submitted 15 March, 2018; originally announced March 2018.

Comments: Vastly updated version, now includes JP2

arXiv:1803.01834 [pdf, other]

Conducting Credit Assignment by Aligning Local Representations

Authors: Alexander G. Ororbia, Ankur Mali, Daniel Kifer, C. Lee Giles

Abstract: Using back-propagation and its variants to train deep networks is often problematic for new users. Issues such as exploding gradients, vanishing gradients, and high sensitivity to weight initialization strategies often make networks difficult to train, especially when users are experimenting with new architectures. Here, we present Local Representation Alignment (LRA), a training procedure that is much less sensitive to bad initializations, does not require modifications to the network architecture, and can be adapted to networks with highly nonlinear and discrete-valued activation functions. Furthermore, we show that one variation of LRA can start with a null initialization of network weights and still successfully train networks with a wide variety of nonlinearities, including tanh, ReLU-6, softplus, signum and others that may draw their inspiration from biology. A comprehensive set of experiments on MNIST and the much harder Fashion MNIST data sets show that LRA can be used to train networks robustly and effectively, succeeding even when back-propagation fails and outperforming other alternative learning algorithms, such as target propagation and feedback alignment.

Submitted 12 July, 2018; v1 submitted 5 March, 2018; originally announced March 2018.

Comments: Full document revision/overhaul, new results/analyses, new diagrams, addition of appendices

arXiv:1803.01054 [pdf]

doi 10.1103/PhysRevB.100.134402

Long lifetime of thermally-excited magnons in bulk yttrium iron garnet

Authors: John S. Jamison, Zihao Yang, Brandon L. Giles, Jack T. Brangham, Guanzhong Wu, P. Chris Hammel, Fengyuan Yang, Roberto C. Myers

Abstract: Spin currents are generated within the bulk of magnetic materials due to heat flow, an effect called intrinsic spin-Seebeck. This bulk bosonic spin current consists of a diffusing thermal magnon cloud, parametrized by the magnon chemical potential ($μ_{m}$), with a diffusion length of several microns in yttrium iron garnet (YIG). Transient opto-thermal measurements of the spin-Seebeck effect (SSE) as a function of temperature reveal the time evolution of $μ_{m}$ due to intrinsic SSE in YIG. The interface SSE develops at times < 2 ns while the intrinsic SSE signal continues to evolve at times > 500 $μ$s, dominating the temperature dependence of SSE in bulk YIG. Time-dependent SSE data are fit to a multi-temperature model of coupled spin/heat transport using finite element method (FEM), where the magnon spin lifetime ($τ$) and magnon-phonon thermalization time ($τ_{mp}$) are used as fit parameters. From 300 K to 4 K, $τ_{mp}$ varies from 1 to 10 ns, whereas $τ$ varies from 2 to 60 $μ$s with the spin lifetime peaking at 90 K. At low temperature, a reduction in $τ$ is observed consistent with impurity relaxation reported in ferromagnetic resonance measurements. These results demonstrate that the thermal magnon cloud in YIG contains extremely low frequency magnons (~10 GHz) providing spectral insight to the microscopic scattering processes involved in magnon spin/heat diffusion.

Submitted 9 September, 2019; v1 submitted 2 March, 2018; originally announced March 2018.

Comments: 35 pages and 17 figures

Journal ref: Phys. Rev. B 100, 134402 (2019)

arXiv:1801.06481 [pdf, other]

Active Learning of Strict Partial Orders: A Case Study on Concept Prerequisite Relations

Authors: Chen Liang, Jianbo Ye, Han Zhao, Bart Pursel, C. Lee Giles

Abstract: A strict partial order is a mathematical structure commonly seen in relational data. One obstacle to extracting this type of relation at scale is the lack of large-scale labels for building effective data-driven solutions. We develop an active learning framework for mining such relations subject to a strict order. Our approach incorporates relational reasoning not only in finding new unlabeled pairs whose labels can be deduced from an existing label set, but also in devising new query strategies that consider the relational structure of labels. Our experiments on concept prerequisite relations show that our proposed framework can substantially improve classification performance with the same query budget compared to other baseline approaches.

Submitted 19 January, 2018; originally announced January 2018.

Comments: 12 pages
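
Illustrative note on the abstract above: the idea of deducing labels for unlabeled pairs from an existing label set can be sketched with the two defining properties of a strict partial order, transitivity and asymmetry. The sketch below is not the authors' framework; the function names (`transitive_closure`, `deduce_labels`) and the example concepts are hypothetical and chosen only for illustration.

```python
from itertools import product


def transitive_closure(pairs):
    """Return the transitive closure of (a, b) pairs meaning 'a precedes b'."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow safely inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def deduce_labels(positive_pairs, candidates):
    """Label the candidate pairs that follow from the known positive pairs.

    True  -> the relation holds by transitivity.
    False -> the relation cannot hold (asymmetry or irreflexivity).
    Candidates that cannot be decided are left out of the result.
    """
    closure = transitive_closure(positive_pairs)
    deduced = {}
    for a, b in candidates:
        if (a, b) in closure:
            deduced[(a, b)] = True
        elif (b, a) in closure or a == b:
            deduced[(a, b)] = False
    return deduced


# Hypothetical prerequisite labels: (x, y) means "x is a prerequisite of y".
known = {("algebra", "calculus"), ("calculus", "differential equations")}
queries = [
    ("algebra", "differential equations"),
    ("differential equations", "algebra"),
    ("algebra", "statistics"),
]
result = deduce_labels(known, queries)
```

Under these assumptions, the first query is labeled True, the second False, and the third stays undecided, so only the third would still need to be sent to an annotator.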

arXiv:1801.05420 [pdf, other]

A Comparative Study of Rule Extraction for Recurrent Neural Networks

Authors: Qinglong Wang, Kaixuan Zhang, Alexander G. Ororbia II, Xinyu Xing, Xue Liu, C. Lee Giles

Abstract: Understanding recurrent networks through rule extraction has a long history. This has taken on new interest due to the need for interpreting or verifying neural networks. One basic form for representing stateful rules is deterministic finite automata (DFA). Previous research shows that extracting DFAs from trained second-order recurrent networks is not only possible but also relatively stable. Recently, several new types of recurrent networks with more complicated architectures have been introduced. These handle challenging learning tasks usually involving sequential data. However, it remains an open problem whether DFAs can be adequately extracted from these models. Specifically, it is not clear how DFA extraction will be affected when applied to different recurrent networks trained on data sets with different levels of complexity. Here, we investigate DFA extraction on several widely adopted recurrent networks that are trained to learn a set of seven regular Tomita grammars. We first formally analyze the complexity of Tomita grammars and categorize these grammars according to that complexity. Then we empirically evaluate different recurrent networks for their performance of DFA extraction on all Tomita grammars. Our experiments show that for most recurrent networks, their extraction performance decreases as the complexity of the underlying grammar increases. On grammars of lower complexity, most recurrent networks obtain desirable extraction performance.
For grammars with the highest level of complexity, several complicated models fail, and only certain recurrent networks achieve satisfactory extraction performance.

Submitted 14 November, 2018; v1 submitted 15 January, 2018; originally announced January 2018.
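
Illustrative note on the abstract above: a DFA is just a transition table, a start state, and a set of accepting states. The sketch below encodes two of the Tomita grammars as they are commonly stated in the literature (grammar 1 as the language `1*`, grammar 4 as binary strings with no `000` substring). It shows the target objects of extraction only, not the extraction procedure studied in the paper; the `DFA` class and state numbering are assumptions made for illustration.

```python
class DFA:
    """A deterministic finite automaton over the alphabet {'0', '1'}."""

    def __init__(self, transitions, start, accepting):
        self.transitions = transitions  # maps (state, symbol) -> next state
        self.start = start
        self.accepting = accepting

    def accepts(self, string):
        """Run the automaton on the string and report whether it accepts."""
        state = self.start
        for symbol in string:
            state = self.transitions[(state, symbol)]
        return state in self.accepting


# Tomita grammar 1 (commonly stated as 1*): any '0' leads to a dead state.
tomita1 = DFA(
    transitions={(0, "1"): 0, (0, "0"): 1, (1, "0"): 1, (1, "1"): 1},
    start=0,
    accepting={0},
)

# Tomita grammar 4 (commonly stated as "no '000' substring"):
# states 0-2 count trailing zeros; state 3 is the dead state.
tomita4 = DFA(
    transitions={
        (0, "0"): 1, (0, "1"): 0,
        (1, "0"): 2, (1, "1"): 0,
        (2, "0"): 3, (2, "1"): 0,
        (3, "0"): 3, (3, "1"): 3,
    },
    start=0,
    accepting={0, 1, 2},
)
```

For example, `tomita4.accepts("0010010")` is true because no run of zeros reaches length three, while `tomita4.accepts("10001")` is false.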

arXiv:1801.01316 [pdf, other]

Text Extraction and Retrieval from Smartphone Screenshots: Building a Repository for Life in Media

Authors: Agnese Chiatti, Mu Jung Cho, Anupriya Gagneja, Xiao Yang, Miriam Brinberg, Katie Roehrick, Sagnik Ray Choudhury, Nilam Ram, Byron Reeves, C. Lee Giles

Abstract: Daily engagement in life experiences is increasingly interwoven with mobile device use. Screen capture at the scale of seconds is being used in behavioral studies and to implement "just-in-time" health interventions. The increasing psychological breadth of digital information will continue to make the actual screens that people view a preferred if not required source of data about life experiences. Effective and efficient Information Extraction and Retrieval from digital screenshots is a crucial prerequisite to successful use of screen data. In this paper, we present the experimental workflow we exploited to: (i) pre-process a unique collection of screen captures, (ii) extract unstructured text embedded in the images, (iii) organize image text and metadata based on a structured schema, (iv) index the resulting document collection, and (v) allow for Image Retrieval through a dedicated vertical search engine application. The adopted procedure integrates different open source libraries for traditional image processing, Optical Character Recognition (OCR), and Image Retrieval. Our aim is to assess whether and how state-of-the-art methodologies can be applied to this novel data set. We show how combining OpenCV-based pre-processing modules with a Long short-term memory (LSTM) based release of Tesseract OCR, without ad hoc training, led to a 74% character-level accuracy of the extracted text.
Further, we used the processed repository as a baseline for a dedicated Image Retrieval system, for immediate use and application by behavioral and prevention scientists. We discuss issues of Text Information Extraction and Retrieval that are particular to the screenshot image case and suggest important future work.

Submitted 4 January, 2018; originally announced January 2018.

arXiv:1712.09866 [pdf]

doi 10.1029/2018JA025245

Magnetospheric Multiscale Dayside Reconnection Electron Diffusion Region Events

Authors: J. M. Webster, J. L. Burch, P. H. Reiff, D. B. Graham, R. B. Torbert, R. E. Ergun, A. G. Daou, S. Y. Sazykin, A. Marshall, R. C. Allen, L. -J. Chen, S. Wang, T. D. Phan, K. J. Genestreti, B. L. Giles, T. E. Moore, S. A. Fuselier, G. Cozzani, C. T. Russell, S. Eriksson, A. C. Rager, J. M. Broll, K. Goodrich, F. Wilder

Abstract: We have used the high-resolution data of the Magnetospheric Multiscale (MMS) mission dayside phase to identify twenty-one previously unreported encounters with the electron diffusion region (EDR), as evidenced by electron agyrotropy, ion jet reversals, and j dot E greater than 0. Three of the new EDR encounters, which occurred within a one-minute-long interval on November 23rd, 2016, are analyzed in detail. These events, which resulted from a relatively low and oscillating magnetopause velocity, contained large electric fields (several tens to hundreds of milliVolts per meter), crescent-shaped electron velocity phase space densities, large currents (greater than 2 microAmperes per square meter), and Ohmic heating of the plasma (near or exceeding 10 nanoWatts per cubic meter). Because of the slow in-and-out motion of the magnetopause, two of these events show the unprecedented mixture of perpendicular and parallel crescents, indicating the first breaking and reconnecting of solar wind and magnetospheric field lines. An extended list of thirty-two EDR or near-EDR events is also included, and demonstrates a wide variety of observed plasma behavior inside and surrounding the reconnection site.

Submitted 28 December, 2017; originally announced December 2017.

arXiv:1712.05697 [pdf]

doi 10.1002/2017GL076809

Localized Oscillatory Dissipation in Magnetopause Reconnection

Authors: J. L. Burch, R. E. Ergun, P. A. Cassak, J. M. Webster, R. B. Torbert, B. L. Giles, J. C. Dorelli, A. C. Rager, K. -J. Hwang, T. D. Phan, K. J. Genestreti, R. C. Allen, L. -J. Chen, S. Wang, D. Gershman, O. Le Contel, C. T. Russell, R. J. Strangeway, F. D. Wilder, D. B. Graham, M. Hesse, J. F. Drake, M. Swisdak, L. M. Price, M. A. Shay, et al. (4 additional authors not shown)

Abstract: Data from the NASA Magnetospheric Multiscale (MMS) mission are used to investigate asymmetric magnetic reconnection at the dayside boundary between the Earth's magnetosphere and the solar wind (the magnetopause). High-resolution measurements of plasmas, electric and magnetic fields, and waves are used to identify highly localized (~15 electron Debye lengths) standing wave structures with large electric-field amplitudes (up to 100 mV/m). These wave structures are associated with spatially oscillatory dissipation, which appears as alternatingly positive and negative values of J dot E (dissipation). For small guide magnetic fields the wave structures occur in the electron stagnation region at the magnetosphere edge of the EDR. For larger guide fields the structures also occur near the reconnection x-line. This difference is explained in terms of channels for the out-of-plane current (agyrotropic electrons at the stagnation point and guide-field-aligned electrons at the x-line).

Submitted 13 December, 2017; originally announced December 2017.

arXiv:1712.01804 [pdf, other]

doi 10.1038/s41586-018-0315-8

Hot Streaks in Artistic, Cultural, and Scientific Careers

Authors: Lu Liu, Yang Wang, Roberta Sinatra, C. Lee Giles, Chaoming Song, Dashun Wang

Abstract: The hot streak, loosely defined as winning begets more winnings, highlights a specific period during which an individual's performance is substantially higher than her typical performance. While widely debated in sports, gambling, and financial markets over the past several decades, little is known about whether hot streaks apply to individual careers. Here, building on rich literature on lifecycle of creativity, we collected large-scale career histories of individual artists, movie directors and scientists, tracing the artworks, movies, and scientific publications they produced. We find that, across all three domains, hit works within a career show a high degree of temporal regularity, each career being characterized by bursts of high-impact works occurring in sequence. We demonstrate that these observations can be explained by a simple hot-streak model we developed, allowing us to probe quantitatively the hot streak phenomenon governing individual careers, which we find to be remarkably universal across diverse domains we analyzed: The hot streaks are ubiquitous yet unique across different careers. While the vast majority of individuals have at least one hot streak, hot streaks are most likely to occur only once. The hot streak emerges randomly within an individual's sequence of works, is temporally localized, and is unassociated with any detectable change in productivity. We show that, since works produced during hot streaks garner significantly more impact, the uncovered hot streaks fundamentally drive the collective impact of an individual, ignoring which leads us to systematically over- or under-estimate the future impact of a career. These results not only deepen our quantitative understanding of patterns governing individual ingenuity and success, they may also have implications for decisions and policies involving predicting and nurturing individuals with lasting impact.

Submitted 16 June, 2018; v1 submitted 5 December, 2017; originally announced December 2017.

arXiv:1711.11542 [pdf, other]

Learning to Adapt by Minimizing Discrepancy

Authors: Alexander G. Ororbia II, Patrick Haffner, David Reitter, C. Lee Giles

Abstract: We explore whether useful temporal neural generative models can be learned from sequential data without back-propagation through time. We investigate the viability of a more neurocognitively-grounded approach in the context of unsupervised generative modeling of sequences. Specifically, we build on the concept of predictive coding, which has gained influence in cognitive science, in a neural framework. To do so we develop a novel architecture, the Temporal Neural Coding Network, and its learning algorithm, Discrepancy Reduction. The underlying directed generative model is fully recurrent, meaning that it employs structural feedback connections and temporal feedback connections, yielding information propagation cycles that create local learning signals. This facilitates a unified bottom-up and top-down approach for information transfer inside the architecture. Our proposed algorithm shows promise on the bouncing balls generative modeling problem. Further experiments could be conducted to explore the strengths and weaknesses of our approach.

Submitted 30 November, 2017; originally announced November 2017.

Comments: Note: Additional experiments in support of this paper are still running (updates will be made as they are completed)

arXiv:1711.08262 [pdf, other]

doi 10.1002/2017JA025019

MMS observation of asymmetric reconnection supported by 3-D electron pressure divergence

Authors: Kevin J. Genestreti, Ali Varsani, Jim L. Burch, Paul A. Cassak, Roy B. Torbert, Rumi Nakamura, Robert E. Ergun, Tai D. Phan, Sergio Toledo-Redondo, Michael Hesse, Shan Wang, Barbara L. Giles, Chris T. Russell, Zoltan Vörös, Kyoung-Joo Kim, Jonathan P. Eastwood, Benoit Lavraud, C. Philippe Escoubet, Robert C. Fear, Yuri Khotyaintsev, Takuma Nakamura, James M. Webster, Wolfgang Baumjohann

Abstract: We identify a dayside electron diffusion region (EDR) encountered by the Magnetospheric Multiscale (MMS) mission and estimate the terms in generalized Ohm's law that controlled energy conversion near the X-point. MMS crossed the moderate-shear (130 degrees) magnetopause southward of the exact X-point. MMS likely entered the magnetopause far from the X-point, outside the EDR, as the size of the reconnection layer was less than but comparable to the magnetosheath proton gyro-radius, and also as anisotropic gyrotropic "outflow" crescent electron distributions were observed. MMS then approached the X-point, where all four spacecraft simultaneously observed signatures of the EDR, e.g., an intense out-of-plane electron current, moderate electron agyrotropy, intense electron anisotropy, non-ideal electric fields, non-ideal energy conversion, etc. We find that the electric field associated with the non-ideal energy conversion is (a) well described by the sum of the electron inertial and pressure divergence terms in generalized Ohm's law though (b) the pressure divergence term dominates the inertial term by roughly a factor of 5:1, (c) both the gyrotropic and agyrotropic pressure forces contribute to energy conversion at the X-point, and (d) both out-of-the-reconnection-plane gradients (d/dM) and in-plane gradients (d/dL,N) in the pressure tensor contribute to energy conversion near the X-point. This indicates that this EDR had some electron-scale structure in the out-of-plane direction during the time when (and at the location where) the reconnection site was observed.

Submitted 5 January, 2018; v1 submitted 22 November, 2017; originally announced November 2017.

Comments: Submitted to JGR Space Physics (November 2017), resubmitted after 1st round of minor revisions (January 2018)

Showing 51–100 of 129 results for author: Giles, L