Search | arXiv e-print repository

arXiv:2505.19173 [pdf, other]

Investigating Pedagogical Teacher and Student LLM Agents: Genetic Adaptation Meets Retrieval Augmented Generation Across Learning Style

Authors: Debdeep Sanyal, Agniva Maiti, Umakanta Maharana, Dhruv Kumar, Ankur Mali, C. Lee Giles, Murari Mandal

Abstract: Effective teaching requires adapting instructional strategies to accommodate the diverse cognitive and behavioral profiles of students, a persistent challenge in education and teacher training. While Large Language Models (LLMs) offer promise as tools to simulate such complex pedagogical environments, current simulation frameworks are limited in two key respects: (1) they often reduce students to static knowledge profiles, and (2) they lack adaptive mechanisms for modeling teachers who evolve their strategies in response to student feedback. To address these gaps, we introduce a novel simulation framework that integrates LLM-based heterogeneous student agents with a self-optimizing teacher agent. The teacher agent's pedagogical policy is dynamically evolved using a genetic algorithm, allowing it to discover and refine effective teaching strategies based on the aggregate performance of diverse learners. In addition, we propose Persona-RAG, a Retrieval Augmented Generation module that enables student agents to retrieve knowledge tailored to their individual learning styles. Persona-RAG preserves the retrieval accuracy of standard RAG baselines while enhancing personalization, an essential factor in modeling realistic educational scenarios. Through extensive experiments, we demonstrate how our framework supports the emergence of distinct and interpretable teaching patterns when interacting with varied student populations. Our results highlight the potential of LLM-driven simulations to inform adaptive teaching practices and provide a testbed for training human educators in controlled, data-driven environments.

Submitted 25 May, 2025; originally announced May 2025.

Comments: 38 Pages

arXiv:2501.19353 [pdf, other]

Do Large Multimodal Models Solve Caption Generation for Scientific Figures? Lessons Learned from SciCap Challenge 2023

Authors: Ting-Yao E. Hsu, Yi-Li Hsu, Shaurya Rohatgi, Chieh-Yang Huang, Ho Yin Sam Ng, Ryan Rossi, Sungchul Kim, Tong Yu, Lun-Wei Ku, C. Lee Giles, Ting-Hao K. Huang

Abstract: Since the SciCap dataset's launch in 2021, the research community has made significant progress in generating captions for scientific figures in scholarly articles. In 2023, the first SciCap Challenge took place, inviting global teams to use an expanded SciCap dataset to develop models for captioning diverse figure types across various academic fields. At the same time, text generation models advanced quickly, with many powerful pre-trained large multimodal models (LMMs) emerging that showed impressive capabilities in various vision-and-language tasks. This paper presents an overview of the first SciCap Challenge and details the performance of various models on its data, capturing a snapshot of the field's state. We found that professional editors overwhelmingly preferred figure captions generated by GPT-4V over those from all other models and even the original captions written by authors. Following this key finding, we conducted detailed analyses to answer this question: Have advanced LMMs solved the task of generating captions for scientific figures?

Submitted 18 February, 2025; v1 submitted 31 January, 2025; originally announced January 2025.

Comments: Accepted to TACL 2025

arXiv:2501.02552 [pdf, other]

Multi-LLM Collaborative Caption Generation in Scientific Documents

Authors: Jaeyoung Kim, Jongho Lee, Hong-Jun Choi, Ting-Yao Hsu, Chieh-Yang Huang, Sungchul Kim, Ryan Rossi, Tong Yu, Clyde Lee Giles, Ting-Hao 'Kenneth' Huang, Sungchul Choi

Abstract: Scientific figure captioning is a complex task that requires generating contextually appropriate descriptions of visual content. However, existing methods often fall short by utilizing incomplete information, treating the task solely as either an image-to-text or text summarization problem. This limitation hinders the generation of high-quality captions that fully capture the necessary details. Moreover, existing data sourced from arXiv papers contain low-quality captions, posing significant challenges for training large language models (LLMs). In this paper, we introduce a framework called Multi-LLM Collaborative Figure Caption Generation (MLBCAP) to address these challenges by leveraging specialized LLMs for distinct sub-tasks. Our approach unfolds in three key modules: (Quality Assessment) We utilize multimodal LLMs to assess the quality of training data, enabling the filtration of low-quality captions. (Diverse Caption Generation) We then employ a strategy of fine-tuning/prompting multiple LLMs on the captioning task to generate candidate captions. (Judgment) Lastly, we prompt a prominent LLM to select the highest quality caption from the candidates, followed by refining any remaining inaccuracies. Human evaluations demonstrate that informative captions produced by our approach rank better than human-written captions, highlighting its effectiveness. Our code is available at https://github.com/teamreboott/MLBCAP

Submitted 5 January, 2025; originally announced January 2025.

Comments: Accepted to AAAI 2025 AI4Research Workshop

arXiv:2411.12649 [pdf, other]

PseudoSeer: a Search Engine for Pseudocode

Authors: Levent Toksoz, Mukund Srinath, Gang Tan, C. Lee Giles

Abstract: A novel pseudocode search engine is designed to facilitate efficient retrieval and search of academic papers containing pseudocode. By leveraging Elasticsearch, the system enables users to search across various facets of a paper, such as the title, abstract, author information, and LaTeX code snippets, while supporting advanced features like combined facet searches and exact-match queries for more targeted results. A description of the data acquisition process is provided, with arXiv as the primary data source, along with methods for data extraction and text-based indexing, highlighting how different data elements are stored and optimized for search. A weighted BM25-based ranking algorithm is used by the search engine, and factors considered when prioritizing search results for both single and combined facet searches are described. We explain how each facet is weighted in a combined search. Several search engine results pages are displayed. Finally, a brief overview of future work is given, and a potential evaluation methodology for assessing the effectiveness and performance of the search engine is described.
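
The weighted, BM25-based ranking over several facets mentioned in this abstract can be illustrated with a small sketch. This is not PseudoSeer's implementation: the scoring function is plain Okapi BM25 with a Lucene-style idf, the parameters `k1` and `b` are common defaults, and the facet names and weights (`title`, `abstract`, 2.0, 1.0) are hypothetical values chosen only for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against `query_terms` (Okapi BM25)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs if n_docs else 0.0
    # Document frequency: in how many documents each query term occurs.
    df = {t: sum(1 for d in docs if t in d) for t in set(query_terms)}
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query_terms:
            if tf[t] == 0:
                continue  # term absent from this document, contributes nothing
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def combined_facet_scores(query, papers, weights):
    """Weighted sum of per-facet BM25 scores; `weights` maps facet name to weight."""
    terms = query.lower().split()
    totals = [0.0] * len(papers)
    for facet, weight in weights.items():
        docs = [p.get(facet, "").lower().split() for p in papers]
        for i, s in enumerate(bm25_scores(terms, docs)):
            totals[i] += weight * s
    return totals


# Hypothetical toy corpus and weights (assumptions, not taken from the paper).
papers = [
    {"title": "dijkstra shortest path", "abstract": "graph search"},
    {"title": "sorting networks", "abstract": "dijkstra is mentioned in passing here"},
]
totals = combined_facet_scores("dijkstra", papers, {"title": 2.0, "abstract": 1.0})
```

With these toy weights, a match in the title counts for more than a match in the abstract, so the first paper ranks above the second.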

Submitted 19 November, 2024; originally announced November 2024.

arXiv:2410.03118 [pdf, other]

Precision, Stability, and Generalization: A Comprehensive Assessment of RNNs learnability capability for Classifying Counter and Dyck Languages

Authors: Neisarg Dave, Daniel Kifer, Lee Giles, Ankur Mali

Abstract: This study investigates the learnability of Recurrent Neural Networks (RNNs) in classifying structured formal languages, focusing on counter and Dyck languages. Traditionally, both first-order (LSTM) and second-order (O2RNN) RNNs have been considered effective for such tasks, primarily based on their theoretical expressiveness within the Chomsky hierarchy. However, our research challenges this notion by demonstrating that RNNs primarily operate as state machines, where their linguistic capabilities are heavily influenced by the precision of their embeddings and the strategies used for sampling negative examples. Our experiments revealed that performance declines significantly as the structural similarity between positive and negative examples increases. Remarkably, even a basic single-layer classifier using RNN embeddings performed better than chance. To evaluate generalization, we trained models on strings up to a length of 40 and tested them on strings from lengths 41 to 500, using 10 unique seeds to ensure statistical robustness. Stability comparisons between LSTM and O2RNN models showed that O2RNNs generally offer greater stability across various scenarios. We further explore the impact of different initialization strategies revealing that our hypothesis is consistent with various RNNs. Overall, this research questions established beliefs about RNNs' computational capabilities, highlighting the importance of data structure and sampling techniques in assessing neural networks' potential for language classification tasks. It emphasizes that stronger constraints on expressivity are crucial for understanding true learnability, as mere expressivity does not capture the essence of learning.
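
The length-based generalization protocol described in this abstract (train on strings of length up to 40, test on lengths 41 to 500) can be sketched as follows. This is an illustrative assumption, not the authors' code: the membership check covers only Dyck-1 (a single bracket pair), and the function names `is_dyck1` and `split_by_length` are hypothetical.

```python
def is_dyck1(s):
    """Return True if `s` is a balanced string over the single bracket pair '(' and ')'."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a closing bracket with no matching opener
                return False
        else:
            return False  # symbol outside the Dyck-1 alphabet
    return depth == 0


def split_by_length(strings, max_train_len=40, max_test_len=500):
    """Split strings into a short training set and a longer held-out test set."""
    train = [s for s in strings if len(s) <= max_train_len]
    test = [s for s in strings if max_train_len < len(s) <= max_test_len]
    return train, test


# Hypothetical toy data: nested brackets of increasing depth.
samples = ["(" * n + ")" * n for n in (1, 5, 20, 21, 250, 251)]
train, test = split_by_length(samples)
```

Here strings of length 2, 10, and 40 land in the training set, lengths 42 and 500 land in the test set, and the length-502 string is discarded.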

Submitted 3 October, 2024; originally announced October 2024.

Comments: 21 pages, 5 figures, 5 tables

arXiv:2408.09176 [pdf, other]

Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making

Authors: Siyu Wu, Alessandro Oltramari, Jonathan Francis, C. Lee Giles, Frank E. Ritter

Abstract: Resolving the dichotomy between the human-like yet constrained reasoning processes of Cognitive Architectures and the broad but often noisy inference behavior of Large Language Models (LLMs) remains a challenging but exciting pursuit, for enabling reliable machine reasoning capabilities in production systems. Because Cognitive Architectures are famously developed for the purpose of modeling the internal mechanisms of human cognitive decision-making at a computational level, new investigations consider the goal of informing LLMs with the knowledge necessary for replicating such processes, e.g., guided perception, memory, goal-setting, and action. Previous approaches that use LLMs for grounded decision-making struggle with complex reasoning tasks that require slower, deliberate cognition over fast and intuitive inference -- reporting issues related to the lack of sufficient grounding, as in hallucination. To resolve these challenges, we introduce LLM-ACTR, a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making by integrating the ACT-R Cognitive Architecture with LLMs. Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations, injects this information into trainable LLM adapter layers, and fine-tunes the LLMs for downstream prediction. Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability of our approach, compared to LLM-only baselines that leverage chain-of-thought reasoning strategies.

Submitted 17 August, 2024; originally announced August 2024.

Comments: 20 pages, 8 figures, 2 tables

arXiv:2406.04635 [pdf, other]

Scaling Automatic Extraction of Pseudocode

Authors: Levent Toksoz, Gang Tan, C. Lee Giles

Abstract: Pseudocode in a scholarly paper provides a concise way to express the algorithms implemented therein. Pseudocode can also be thought of as an intermediary representation that helps bridge the gap between programming languages and natural languages. Having access to a large collection of pseudocode can provide various benefits ranging from enhancing algorithmic understanding, facilitating further algorithmic design, to empowering NLP or computer vision based models for tasks such as automated code generation and optical character recognition (OCR). We have created a large pseudocode collection by extracting nearly 320,000 pseudocode examples from arXiv papers. This process involved scanning over $2.2$ million scholarly papers, with 1,000 of them being manually inspected and labeled. Our approach encompasses an extraction mechanism tailored to optimize the coverage and a validation mechanism based on random sampling to check its accuracy and reliability, given the inherent heterogeneity of the collection. In addition, we offer insights into common pseudocode structures, supported by clustering and statistical analyses. Notably, these analyses indicate an exponential-like growth in the usage of pseudocodes, highlighting their increasing significance.

Submitted 7 June, 2024; originally announced June 2024.

arXiv:2405.13209 [pdf, other]

Investigating Symbolic Capabilities of Large Language Models

Authors: Neisarg Dave, Daniel Kifer, C. Lee Giles, Ankur Mali

Abstract: Prompting techniques have significantly enhanced the capabilities of Large Language Models (LLMs) across various complex tasks, including reasoning, planning, and solving math word problems. However, most research has predominantly focused on language-based reasoning and word problems, often overlooking the potential of LLMs in handling symbol-based calculations and reasoning. This study aims to bridge this gap by rigorously evaluating LLMs on a series of symbolic tasks, such as addition, multiplication, modulus arithmetic, numerical precision, and symbolic counting. Our analysis encompasses eight LLMs, including four enterprise-grade and four open-source models, of which three have been pre-trained on mathematical tasks. The assessment framework is anchored in Chomsky's Hierarchy, providing a robust measure of the computational abilities of these models. The evaluation employs minimally explained prompts alongside the zero-shot Chain of Thoughts technique, allowing models to navigate the solution process autonomously. The findings reveal a significant decline in LLMs' performance on context-free and context-sensitive symbolic tasks as the complexity, represented by the number of symbols, increases. Notably, even the fine-tuned GPT3.5 exhibits only marginal improvements, mirroring the performance trends observed in other models. Across the board, all models demonstrated a limited generalization ability on these symbol-intensive tasks. This research underscores LLMs' challenges with increasing symbolic complexity and highlights the need for specialized training, memory and architectural adjustments to enhance their proficiency in symbol-based reasoning tasks.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.11003 [pdf, other]

Out-of-plane Parallel Current in the Diffusion Regions: The Interaction Between Diffusion Region Systems and their Impact on the Outer EDR

Authors: Jason M. H. Beedle, Daniel J. Gershman, Vadim M. Uritsky, Jason R. Shuster, Tai D. Phan, Barbara L. Giles, Kevin J. Genestreti, Roy B. Torbert

Abstract: Dayside magnetic reconnection allows for the transfer of the solar wind's energy into Earth's magnetosphere. This process takes place in electron diffusion regions (EDRs) embedded in ion diffusion regions (IDRs), which form in the magnetopause boundary's current sheet. A significant out-of-plane parallel current contribution in the diffusion regions was reported in Beedle et al. 2023. In order to understand the underlying structure of this parallel current, we compared EDR statistical results with a 2.5D Particle-In-Cell (PIC) simulation. From this comparison, we identified out-of-plane parallel current signatures as defining features of the outer EDR and IDR. This significant out-of-plane parallel current indicates the interaction of the IDR and EDR systems, and provides implications for not only understanding energy dissipation in the diffusion regions, but also determining the location of the outer EDR.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2403.17784 [pdf, other]

doi 10.1145/3613905.3650738

SciCapenter: Supporting Caption Composition for Scientific Figures with Machine-Generated Captions and Ratings

Authors: Ting-Yao Hsu, Chieh-Yang Huang, Shih-Hong Huang, Ryan Rossi, Sungchul Kim, Tong Yu, C. Lee Giles, Ting-Hao K. Huang

Abstract: Crafting effective captions for figures is important. Readers heavily depend on these captions to grasp the figure's message. However, despite a well-developed set of AI technologies for figures and captions, these have rarely been tested for usefulness in aiding caption writing. This paper introduces SciCapenter, an interactive system that puts together cutting-edge AI technologies for scientific figure captions to aid caption composition. SciCapenter generates a variety of captions for each figure in a scholarly article, providing scores and a comprehensive checklist to assess caption quality across multiple critical aspects, such as helpfulness, OCR mention, key takeaways, and visual properties reference. Users can directly edit captions in SciCapenter, resubmit for revised evaluations, and iteratively refine them. A user study with Ph.D. students indicates that SciCapenter significantly lowers the cognitive load of caption writing. Participants' feedback further offers valuable design insights for future systems aiming to enhance caption writing.

Submitted 26 March, 2024; originally announced March 2024.

Comments: CHI EA '24: Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems

arXiv:2402.11006 [pdf, other]

Automated Detection and Analysis of Data Practices Using A Real-World Corpus

Authors: Mukund Srinath, Pranav Venkit, Maria Badillo, Florian Schaub, C. Lee Giles, Shomir Wilson

Abstract: Privacy policies are crucial for informing users about data practices, yet their length and complexity often deter users from reading them. In this paper, we propose an automated approach to identify and visualize data practices within privacy policies at different levels of detail. Leveraging crowd-sourced annotations from the ToS;DR platform, we experiment with various methods to match policy excerpts with predefined data practice descriptions. We further conduct a case study to evaluate our approach on a real-world policy, demonstrating its effectiveness in simplifying complex policies. Experiments show that our approach accurately matches data practice descriptions with policy excerpts, facilitating the presentation of simplified privacy information to users.

Submitted 16 February, 2024; originally announced February 2024.

arXiv:2402.02627 [pdf, other]

Stability Analysis of Various Symbolic Rule Extraction Methods from Recurrent Neural Network

Authors: Neisarg Dave, Daniel Kifer, C. Lee Giles, Ankur Mali

Abstract: This paper analyzes two competing rule extraction methodologies: quantization and equivalence query. We trained $3600$ RNN models, extracting $18000$ DFA with a quantization approach (k-means and SOM) and $3600$ DFA by equivalence query ($L^{*}$) methods across $10$ initialization seeds. We sampled the datasets from $7$ Tomita and $4$ Dyck grammars and trained them on $4$ RNN cells: LSTM, GRU, O2RNN, and MIRNN. The observations from our experiments establish the superior performance of O2RNN and quantization-based rule extraction over others. $L^{*}$, primarily proposed for regular grammars, performs similarly to quantization methods for Tomita languages when neural networks are perfectly trained. However, for partially trained RNNs, $L^{*}$ shows instability in the number of states in DFA, e.g., for Tomita 5 and Tomita 6 languages, $L^{*}$ produced more than $100$ states. In contrast, quantization methods result in rules with a number of states very close to the ground truth DFA. Among RNN cells, O2RNN produces stable DFA consistently compared to other cells. For Dyck Languages, we observe that although GRU outperforms other RNNs in network performance, the DFA extracted by O2RNN has higher performance and better stability. The stability is computed as the standard deviation of accuracy on test sets on networks trained across $10$ seeds. On Dyck Languages, quantization methods outperformed $L^{*}$ with better stability in accuracy and the number of states. $L^{*}$ often showed instability in accuracy in the order of $16\% - 22\%$ for GRU and MIRNN while deviation for quantization methods varied in $5\% - 15\%$. In many instances with LSTM and GRU, DFAs extracted by $L^{*}$ even failed to beat chance accuracy ($50\%$), while those extracted by quantization method had standard deviation in the $7\%-17\%$ range. For O2RNN, both rule extraction methods had deviation in the $0.5\% - 3\%$ range.
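
The stability measure named in this abstract, the standard deviation of test accuracy over networks trained with different seeds, can be sketched in a few lines. The abstract does not say whether the sample or population standard deviation is used; the sketch below assumes the sample form, and the accuracy values are made up for illustration.

```python
from statistics import mean, stdev


def stability(accuracies):
    """Return (mean, sample standard deviation) of per-seed test accuracies."""
    return mean(accuracies), stdev(accuracies)


# Hypothetical per-seed accuracies for two extraction runs (not results from the paper).
stable_run = [0.90, 0.90, 0.90]
unstable_run = [0.50, 0.70]

stable_mean, stable_dev = stability(stable_run)
unstable_mean, unstable_dev = stability(unstable_run)
```

A smaller deviation means the extracted DFA's accuracy varies less from seed to seed, which is the sense in which the abstract calls a method more stable.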

Submitted 4 February, 2024; originally announced February 2024.

arXiv:2312.15627 [pdf]

Ultrahigh electrostrain in Pb-free piezoceramics: Effect of bending

Authors: Gobinda Das Adhikary, John Daniels, Luke Giles, Rajeev Ranjan

Abstract: Recently, several reports showing ultra-high electrostrain (> 1 %) have appeared in Pb-free piezoceramics. However, there is a lack of clarity on the nature of the ultrahigh strain. Here, we demonstrate that the ultrahigh strain is a consequence of bending of the disc. We show that the propensity for bending arises from the difference in the response magnitude of the grains at the positive and negative surfaces of the piezoceramic when the field is applied.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 8 pages 4 figures

arXiv:2311.05411 [pdf, other]

Multi-scale observation of magnetotail reconnection onset: 2. microscopic dynamics

Authors: K. J. Genestreti, C. Farrugia, S. Lu, S. K. Vines, P. H. Reiff, T. -D. Phan, D. N. Baker, T. W. Leonard, J. L. Burch, S. T. Bingham, I. J. Cohen, J. R. Shuster, D. J. Gershman, C. G. Mouikis, A. T. Rogers, R. B. Torbert, K. J. Trattner, J. M. Webster, L. -J. Chen, B. L. Giles, N. Ahmadi, R. E. Ergun, C. T. Russell, R. J. Strangeway, R. Nakamura, et al. (1 additional author not shown)

Abstract: We analyze the local dynamics of magnetotail reconnection onset using Magnetospheric Multiscale (MMS) data. In conjunction with MMS, the macroscopic dynamics of this event were captured by a number of other ground and space-based observatories, as is reported in a companion paper. We find that the local dynamics of the onset were characterized by the rapid thinning of the cross-tail current sheet below the ion inertial scale, accompanied by the growth of flapping waves and the subsequent onset of electron tearing. Multiple kinetic-scale magnetic islands were detected coincident with the growth of an initially sub-Alfvénic, demagnetized tailward ion exhaust. The onset and rapid enhancement of parallel electron inflow at the exhaust boundary was a remote signature of the intensification of reconnection Earthward of the spacecraft. Two secondary reconnection sites are found embedded within the exhaust from a primary X-line. The primary X-line was designated as such on the basis that (1) while multiple jet reversals were observed in the current sheet, only one reversal of the electron inflow was observed at the high-latitude exhaust boundary, (2) the reconnection electric field was roughly 5 times larger at the primary X-line than the secondary X-lines, and (3) energetic electron fluxes increased and transitioned from anti-field-aligned to isotropic during the primary X-line crossing, indicating a change in magnetic topology. The results are consistent with the idea that a primary X-line mediates the reconnection of lobe magnetic field lines and accelerates electrons more efficiently than its secondary X-line counterparts.

Submitted 9 November, 2023; originally announced November 2023.

Comments: In press, JGR Space Physics, JGRA58162

arXiv:2311.05405 [pdf, other]

Multi-scale observation of magnetotail reconnection onset: 1. macroscopic dynamics

Authors: K. J. Genestreti, C. Farrugia, S. Lu, S. K. Vines, P. H. Reiff, T. -D. Phan, D. N. Baker, T. W. Leonard, J. L. Burch, S. T. Bingham, I. J. Cohen, J. R. Shuster, D. J. Gershman, C. G. Mouikis, A. T. Rogers, R. B. Torbert, K. J. Trattner, J. M. Webster, L. -J. Chen, B. L. Giles, N. Ahmadi, R. E. Ergun, C. T. Russell, R. J. Strangeway, R. Nakamura

Abstract: We analyze a magnetotail reconnection onset event on 3 July 2017 that was observed under otherwise quiescent magnetospheric conditions by a fortuitous conjunction of six space and ground-based observatories. The study investigates the large-scale coupling of the solar wind - magnetosphere system that precipitated the onset of the magnetotail reconnection, focusing on the processes that thinned and stretched the cross-tail current layer in the absence of significant flux loading during a two-hour-long preconditioning phase. It is demonstrated with data in the (1) upstream solar wind, (2) at the low-latitude magnetopause, (3) in the high-latitude polar cap, and (4) in the magnetotail that the typical picture of solar wind-driven current sheet thinning via flux loading does not appear relevant for this particular event. We find that the current sheet thinning was, instead, initiated by a transient solar wind pressure pulse and that the current sheet thinning continued even as the magnetotail and solar wind pressures decreased. We suggest that field line curvature induced scattering (observed by Magnetospheric Multiscale (MMS)) and precipitation (observed by Defense Meteorological Satellite Program (DMSP)) of high-energy thermal protons may have evacuated plasma sheet thermal energy, which may require a thinning of the plasma sheet to preserve pressure equilibrium with the solar wind.

Submitted 9 November, 2023; originally announced November 2023.

Comments: In press, JGR space physics, JGRA58161

arXiv:2310.15405 [pdf, other]

GPT-4 as an Effective Zero-Shot Evaluator for Scientific Figure Captions

Authors: Ting-Yao Hsu, Chieh-Yang Huang, Ryan Rossi, Sungchul Kim, C. Lee Giles, Ting-Hao K. Huang

Abstract: There is growing interest in systems that generate captions for scientific figures. However, assessing these systems' output poses a significant challenge. Human evaluation requires academic expertise and is costly, while automatic evaluation depends on often low-quality author-written captions. This paper investigates using large language models (LLMs) as a cost-effective, reference-free method for evaluating figure captions. We first constructed SCICAP-EVAL, a human evaluation dataset that contains human judgments for 3,600 scientific figure captions, both original and machine-made, for 600 arXiv figures. We then prompted LLMs like GPT-4 and GPT-3 to score (1-6) each caption based on its potential to aid reader understanding, given relevant context such as figure-mentioning paragraphs. Results show that GPT-4, used as a zero-shot evaluator, outperformed all other models and even surpassed assessments made by Computer Science and Informatics undergraduates, achieving a Kendall correlation score of 0.401 with Ph.D. students' rankings.
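
The Kendall correlation reported in this abstract measures how well two sets of scores agree in their ordering. The sketch below computes the tie-corrected tau-b variant in pure Python. The abstract does not state which Kendall variant was used, so tau-b is an assumption here (it suits 1-6 scores, which produce many ties), and the example scores are invented.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b between two equal-length score lists, with tie correction."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0:
            ties_x += 1  # pair tied in x (possibly tied in y as well)
        if dy == 0:
            ties_y += 1  # pair tied in y (possibly tied in x as well)
        if dx != 0 and dy != 0:
            if dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    denom = math.sqrt((n_pairs - ties_x) * (n_pairs - ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Hypothetical scores: a model's 1-6 ratings versus human ratings for four captions.
model_scores = [1, 2, 2, 3]
human_scores = [1, 2, 3, 3]
tau = kendall_tau_b(model_scores, human_scores)
```

A value of 1 means the two lists order every untied pair the same way, -1 means they order every pair oppositely, and values near 0 indicate little agreement.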

Submitted 23 October, 2023; originally announced October 2023.

Comments: To Appear in EMNLP 2023 Findings

arXiv:2309.14691 [pdf, other]

On the Computational Complexity and Formal Hierarchy of Second Order Recurrent Neural Networks

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Artificial neural networks (ANNs) with recurrence and self-attention have been shown to be Turing-complete (TC). However, existing work has shown that these ANNs require multiple turns or unbounded computation time, even with unbounded precision in weights, in order to recognize TC grammars. Moreover, under constraints such as fixed or bounded precision neurons and time, ANNs without memory are shown to struggle to recognize even context-free languages. In this work, we extend the theoretical foundation for the $2^{nd}$-order recurrent network ($2^{nd}$ RNN) and prove there exists a class of $2^{nd}$ RNNs that is Turing-complete with bounded time. This model is capable of directly encoding a transition table into its recurrent weights, enabling bounded time computation, and is interpretable by design. We also demonstrate that $2^{nd}$-order RNNs, without memory, under bounded weights and time constraints, outperform modern-day models such as vanilla RNNs and gated recurrent units in recognizing regular grammars. We provide an upper bound and a stability analysis on the maximum number of neurons required by $2^{nd}$-order RNNs to recognize any class of regular grammar. Extensive experiments on the Tomita grammars support our findings, demonstrating the importance of tensor connections in crafting computationally efficient RNNs. Finally, we show $2^{nd}$-order RNNs are also interpretable by extraction and can extract state machines with higher success rates as compared to first-order RNNs.
Our results extend the theoretical foundations of RNNs and offer promising avenues for future explainable AI research.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 12 pages, 5 tables, 1 figure

arXiv:2309.14690 [pdf, ps, other]

On the Tensor Representation and Algebraic Homomorphism of the Neural State Turing Machine

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Recurrent neural networks (RNNs) and transformers have been shown to be Turing-complete, but this result assumes infinite precision in their hidden representations, positional encodings for transformers, and unbounded computation time in general. In practical applications, however, it is crucial to have real-time models that can recognize Turing-complete grammars in a single pass. To address this issue and to better understand the true computational power of artificial neural networks (ANNs), we introduce a new class of recurrent models called the neural state Turing machine (NSTM). The NSTM has bounded weights and finite-precision connections and can simulate any Turing Machine in real-time. In contrast to prior work that assumes unbounded time and precision in weights to demonstrate equivalence with TMs, we prove that a $13$-neuron bounded tensor RNN, coupled with third-order synapses, can model any TM class in real-time. Furthermore, under the Markov assumption, we provide a new theoretical bound for a non-recurrent network augmented with memory, showing that a tensor feedforward network with $25$th-order finite precision weights is equivalent to a universal TM.

Submitted 26 September, 2023; originally announced September 2023.

Comments: 14 pages, 7 tables

arXiv:2306.09370 [pdf, other]

Differentiating EDRs from the Background Magnetopause Current Sheet: A Statistical Study

Authors: Jason M. H. Beedle, Daniel J. Gershman, Vadim M. Uritsky, Tai D. Phan, Barbara L. Giles

Abstract: The solar wind is a continuous outflow of charged particles from the Sun's atmosphere into the solar system. At Earth, the solar wind's outward pressure is balanced by the Earth's magnetic field in a boundary layer known as the magnetopause. Plasma density and temperature differences across the boundary layer generate the Chapman-Ferraro current which supports the magnetopause. Along the dayside magnetopause, magnetic reconnection can occur in electron diffusion regions (EDRs) embedded into the larger ion diffusion regions (IDRs). These diffusion regions form when opposing magnetic field lines in the solar wind and Earth's magnetic field merge, releasing magnetic energy into the surrounding plasma. While previous studies have given us a general understanding of the structure of the diffusion regions, we still do not have a good grasp of how they are statistically differentiated from the non-diffusion region magnetopause. By investigating 251 magnetopause crossings from NASA's Magnetospheric Multiscale (MMS) Mission, we demonstrate that EDR magnetopause crossings show current densities an order of magnitude higher than regular magnetopause crossings - crossings that either passed through the reconnection exhausts or through the non-reconnecting magnetopause, providing a baseline for the magnetopause current sheet under a wide range of driving conditions. Significant current signatures parallel to the local magnetic field in EDR crossings are also identified, which is in contrast to the dominantly perpendicular current found in the regular magnetopause.
Additionally, we show that the ion velocity along the magnetopause is highly correlated with a crossing's location, indicating the presence of magnetosheath flows inside the magnetopause.

Submitted 18 September, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

arXiv:2305.14520 [pdf, other]

Three-dimensional energy transfer in space plasma turbulence from multipoint measurement

Authors: Francesco Pecora, Sergio Servidio, Yan Yang, William H. Matthaeus, Alexandros Chasapis, Antonella Greco, Daniel J. Gershman, Barbara L. Giles, James L. Burch

Abstract: A novel multispacecraft technique applied to Magnetospheric Multiscale (MMS) mission data collected in the Earth's magnetosheath enables evaluation of the energy cascade rate solving the full Yaglom's equation in a turbulent space plasma. The method differs from existing approaches in that (i) it is inherently three-dimensional; (ii) it provides a statistically significant number of estimates from a single data stream; and (iii) it allows for a direct visualization of energy flux in turbulent plasmas. This new technique will ultimately provide a realistic, comprehensive picture of the turbulence process in plasmas.

Submitted 23 May, 2023; originally announced May 2023.

arXiv:2303.00866 [pdf, other]

A prototype hybrid prediction market for estimating replicability of published work

Authors: Tatiana Chakravorti, Robert Fraleigh, Timothy Fritton, Michael McLaughlin, Vaibhav Singh, Christopher Griffin, Anthony Kwasnica, David Pennock, C. Lee Giles, Sarah Rajtmajer

Abstract: We present a prototype hybrid prediction market and demonstrate the avenue it represents for meaningful human-AI collaboration. We build on prior work proposing artificial prediction markets as a novel machine-learning algorithm. In an artificial prediction market, trained AI agents buy and sell outcomes of future events. Classification decisions can be framed as outcomes of future events, and accordingly, the price of an asset corresponding to a given classification outcome can be taken as a proxy for the confidence of the system in that decision. By embedding human participants in these markets alongside bot traders, we can bring together insights from both. In this paper, we detail pilot studies with prototype hybrid markets for the prediction of replication study outcomes. We highlight challenges and opportunities, share insights from semi-structured interviews with hybrid market participants, and outline a vision for ongoing and future work.

Submitted 1 March, 2023; originally announced March 2023.

arXiv:2302.12324 [pdf, other]

Summaries as Captions: Generating Figure Captions for Scientific Documents with Automated Text Summarization

Authors: Chieh-Yang Huang, Ting-Yao Hsu, Ryan Rossi, Ani Nenkova, Sungchul Kim, Gromit Yeuk-Yin Chan, Eunyee Koh, Clyde Lee Giles, Ting-Hao 'Kenneth' Huang

Abstract: Good figure captions help paper readers understand complex scientific figures. Unfortunately, even published papers often have poorly written captions. Automatic caption generation could aid paper writers by providing good starting captions that can be refined for better quality. Prior work often treated figure caption generation as a vision-to-language task. In this paper, we show that it can be more effectively tackled as a text summarization task in scientific documents. We fine-tuned PEGASUS, a pre-trained abstractive summarization model, to specifically summarize figure-referencing paragraphs (e.g., "Figure 3 shows...") into figure captions. Experiments on large-scale arXiv figures show that our method outperforms prior vision methods in both automatic and human evaluations. We further conducted an in-depth investigation focused on two key challenges: (i) the common presence of low-quality author-written captions and (ii) the lack of clear standards for good captions. Our code and data are available at: https://github.com/Crowd-AI-Lab/Generating-Figure-Captions-as-a-Text-Summarization-Task.

Submitted 11 August, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

Comments: Accepted by INLG-2023

arXiv:2302.00634 [pdf, other]

doi 10.1093/mnras/stad2232

Relaxation of the turbulent magnetosheath

Authors: Francesco Pecora, Yan Yang, Alexandros Chasapis, Sergio Servidio, Manuel Cuesta, Sohom Roy, Rohit Chhiber, Riddhi Bandyopadhyay, D. J. Gershman, B. L. Giles, J. L. Burch, William H. Matthaeus

Abstract: In turbulence, nonlinear terms drive energy transfer from large-scale eddies into small scales through the so-called energy cascade. Turbulence often relaxes toward states that minimize energy; typically these states are considered globally. However, turbulence can also relax toward local quasi-equilibrium states, creating patches or cells where the magnitude of nonlinearity is reduced and energy cascade is impaired. We show, for the first time, compelling observational evidence that this "cellularization" of turbulence can occur due to local relaxation in a strongly turbulent natural environment such as the Earth's magnetosheath.

Submitted 1 February, 2023; originally announced February 2023.

arXiv:2301.12293 [pdf, other]

ACL-Fig: A Dataset for Scientific Figure Classification

Authors: Zeba Karishma, Shaurya Rohatgi, Kavya Shrinivas Puranik, Jian Wu, C. Lee Giles

Abstract: Most existing large-scale academic search engines are built to retrieve text-based information. However, there are no large-scale retrieval services for scientific figures and tables. One challenge for such services is understanding scientific figures' semantics, such as their types and purposes. A key obstacle is the need for datasets containing annotated scientific figures and tables, which can then be used for classification, question-answering, and auto-captioning. Here, we develop a pipeline that extracts figures and tables from the scientific literature and a deep-learning-based framework that classifies scientific figures using visual features. Using this pipeline, we built the first large-scale automatically annotated corpus, ACL-Fig, consisting of 112,052 scientific figures extracted from ~56K research papers in the ACL Anthology. The ACL-Fig-Pilot dataset contains 1,671 manually labeled scientific figures belonging to 19 categories. The dataset is accessible at https://huggingface.co/datasets/citeseerx/ACL-fig under a CC BY-NC license.

Submitted 28 January, 2023; originally announced January 2023.

Comments: 6 pages, 4 figures, accepted by the AAAI-23 Workshop on Scientific Document Understanding

arXiv:2211.16590 [pdf, other]

Artificial prediction markets present a novel opportunity for human-AI collaboration

Authors: Tatiana Chakravorti, Vaibhav Singh, Sarah Rajtmajer, Michael McLaughlin, Robert Fraleigh, Christopher Griffin, Anthony Kwasnica, David Pennock, C. Lee Giles

Abstract: Despite high-profile successes in the field of Artificial Intelligence, machine-driven technologies still suffer important limitations, particularly for complex tasks where creativity, planning, common sense, intuition, or learning from limited data is required. These limitations motivate effective methods for human-machine collaboration. Our work makes two primary contributions. We thoroughly experiment with an artificial prediction market model to understand the effects of market parameters on model performance for benchmark classification tasks. We then demonstrate, through simulation, the impact of exogenous agents in the market, where these exogenous agents represent primitive human behaviors. This work lays the foundation for a novel set of hybrid human-AI machine learning algorithms.

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2208.12671 [pdf]

doi 10.1029/2021JA029518

Thin current sheet behind the dipolarization front

Authors: Nakamura, R., Baumjohann, W., Nakamura, T. K. M., Panov, E. V., Schmid, D., Varsani, A., S. Apatenkov, V. A. Sergeev, J. Birn, T. Nagai, C. Gabrielse, M. Andre, J. L. Burch, C. Carr, I. S. Dandouras, C. P. Escoubet, A. N. Fazakerley, et al. (4 additional authors not shown)

Abstract: We report a unique conjugate observation of fast flows and associated current sheet disturbances in the near-Earth magnetotail by MMS (Magnetospheric Multiscale) and Cluster preceding a positive bay onset of a small substorm at ~14:10 UT, Sep. 8, 2018. MMS and Cluster were both located at X ~-14 RE. A dipolarization front (DF) of a localized fast flow was detected by Cluster and MMS, separated in the dawn-dusk direction by ~4 RE, almost simultaneously. Adiabatic electron acceleration signatures revealed from comparison of the energy spectra confirm that both spacecraft encountered the same DF. We analyzed the change in the current sheet structure based on multi-scale multi-point data analysis. The current sheet thickened during the passage of DF, yet, temporally thinned subsequently associated with another flow enhancement centered more on the dawnward side of the initial flow. MMS and Cluster observed intense perpendicular and parallel current in the off-equatorial region mainly during this interval of the current sheet thinning. Maximum field-aligned currents both at MMS and Cluster are directed tailward. Detailed analysis of MMS data showed that the intense field-aligned currents consisted of multiple small-scale intense current layers accompanied by enhanced Hall-currents in the dawn-dusk flow-shear region. We suggest that the current sheet thinning is related to the flow bouncing process and/or to the expansion/activation of reconnection.
Based on these mesoscale and small-scale multipoint observations, 3D evolution of the flow and current-sheet disturbances was inferred preceding the development of a substorm current wedge.

Submitted 26 August, 2022; originally announced August 2022.

Journal ref: Journal of Geophysical Research: Space Physics, 126, e2021JA029518, 2021

arXiv:2207.09029 [pdf, other]

doi 10.1029/2022JA030360

Tens to hundreds of keV electron precipitation driven by kinetic Alfvén waves during an electron injection

Authors: Y. Shen, A. V. Artemyev, X. -J. Zhang, V. Angelopoulos, I. Vasko, D. Turner, E. Tsai, C. Wilkins, J. Weygand, C. T. Russell, R. E. Ergun, B. L. Giles

Abstract: Electron injections are critical processes associated with magnetospheric substorms, which deposit significant electron energy into the ionosphere. Although wave scattering of $<$10 keV electrons during injections has been well studied, the link between magnetotail electron injections and energetic ($\geq$100 keV) electron precipitation remains elusive. Using conjugate observations between the ELFIN and Magnetospheric Multiscale (MMS) missions, we present evidence of tens to hundreds of keV electron precipitation to the ionosphere potentially driven by kinetic Alfvén waves (KAWs) associated with magnetotail electron injections and magnetic field gradients. Test particle simulations adapted to observations show that dipolarization-front magnetic field gradients and associated $\nabla B$ drifts allow Doppler-shifted Landau resonances between the injected electrons and KAWs, producing electron spatial scattering across the front which results in pitch-angle decreases and subsequent precipitation. Test particle results show that such KAW-driven precipitation can account for ELFIN observations below $\sim$300 keV.

Submitted 18 July, 2022; originally announced July 2022.

Comments: 25 pages, 5 figures, with supporting information, the manuscript has been accepted for publication by JGR space physics

arXiv:2203.13879 [pdf, other]

doi 10.1063/5.0090275

On the origin of "patchy" energy conversion in electron diffusion regions

Authors: Kevin J. Genestreti, Xiaocan Li, Yi-Hsin Liu, James L. Burch, Roy B. Torbert, Stephen A. Fuselier, Takuma Nakamura, Barbara L. Giles, Daniel J. Gershman, Robert E. Ergun, Christopher T. Russell, Robert J. Strangeway

Abstract: During magnetic reconnection, field lines interconnect in electron diffusion regions (EDRs). In some EDRs the reconnection and energy conversion rates are controlled by a steady out-of-plane electric field. In other EDRs the energy conversion rate $\vec{J}\cdot\vec{E}'$ is "patchy", with electron-scale large-amplitude positive and negative peaks. We investigate 22 EDRs observed by NASA's Magnetospheric Multiscale (MMS) mission in a wide range of conditions to determine the cause of patchy $\vec{J}\cdot\vec{E}'$. The patchiness of the energy conversion is quantified and correlated with seven parameters describing various aspects of the asymptotic inflow regions that affect the structure, stability, and efficiency of reconnection. We find that (1) neither the guide field strength nor the asymmetries in the inflow ion pressure, electron pressure, reconnecting magnetic field strength, and number density are well correlated with the patchiness of the EDR energy conversion, (2) the out-of-plane axes of the 22 EDRs are typically fairly well aligned with the "preferred" axes, which bisect the time-averaged inflow magnetic fields and maximize the reconnection rate, and (3) the time-variability in the upstream magnetic field direction is best correlated with the patchiness of the EDR $\vec{J}\cdot\vec{E}'$. A 3-d fully-kinetic simulation of reconnection with a non-uniform inflow magnetic field is analyzed; the variation in the magnetic field generates secondary X-lines, which develop to maximize the reconnection rate for the time-varying inflow magnetic field.
The results suggest that magnetopause reconnection, for which the inflow magnetic field direction is often highly variable, may commonly be patchy in space, at least at the electron scale.

Submitted 25 March, 2022; originally announced March 2022.

Comments: 31 pages, 6 figures, submitted to Physics of Plasmas

arXiv:2201.11795 [pdf, other]

Neural JPEG: End-to-End Image Compression Leveraging a Standard JPEG Encoder-Decoder

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Recent advances in deep learning have led to superhuman performance across a variety of applications. Recently, these methods have been successfully employed to improve the rate-distortion performance in the task of image compression. However, current methods either use additional post-processing blocks on the decoder end to improve compression or propose an end-to-end compression scheme based on heuristics. For the majority of these, the trained deep neural networks (DNNs) are not compatible with standard encoders and would be difficult to deploy on personal computers and cellphones. In light of this, we propose a system that learns to improve the encoding performance by enhancing its internal neural representations on both the encoder and decoder ends, an approach we call Neural JPEG. We propose frequency domain pre-editing and post-editing methods to optimize the distribution of the DCT coefficients at both encoder and decoder ends in order to improve the standard compression (JPEG) method. Moreover, we design and integrate a scheme for jointly learning quantization tables within this hybrid neural compression framework. Experiments demonstrate that our approach successfully improves the rate-distortion performance over JPEG across various quality metrics, such as PSNR and MS-SSIM, and generates visually appealing images with better color retention quality.

Submitted 31 January, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

Comments: Accepted in DCC 2022, 11 pages

arXiv:2201.11782 [pdf, other]

An Empirical Analysis of Recurrent Learning Algorithms In Neural Lossy Image Compression Systems

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, Lee Giles

Abstract: Recent advances in deep learning have resulted in image compression algorithms that outperform JPEG and JPEG 2000 on the standard Kodak benchmark. However, they are slow to train (due to backprop-through-time) and, to the best of our knowledge, have not been systematically evaluated on a large variety of datasets. In this paper, we perform the first large-scale comparison of recent state-of-the-art hybrid neural compression algorithms, while exploring the effects of alternative training strategies (when applicable). The hybrid recurrent neural decoder is a former state-of-the-art model (recently overtaken by a Google model) that can be trained using backprop-through-time (BPTT) or with alternative algorithms like sparse attentive backtracking (SAB), unbiased online recurrent optimization (UORO), and real-time recurrent learning (RTRL). We compare these training alternatives along with the Google models (GOOG and E2E) on 6 benchmark datasets. Surprisingly, we found that the model trained with SAB performs better (outperforming even BPTT), resulting in faster convergence and a better peak signal-to-noise ratio.

Submitted 27 January, 2022; originally announced January 2022.

Comments: Accepted at DCC 2021, 15 pages

arXiv:2201.08495 [pdf, other]

SciBERTSUM: Extractive Summarization for Scientific Documents

Authors: Athar Sefid, C Lee Giles

Abstract: The summarization literature focuses on the summarization of news articles. The news articles in the CNN-DailyMail dataset are relatively short documents with about 30 sentences per document on average. We introduce SciBERTSUM, our summarization framework designed for the summarization of long documents like scientific papers with more than 500 sentences. SciBERTSUM extends BERTSUM to long documents by 1) adding a section embedding layer to include section information in the sentence vector and 2) applying a sparse attention mechanism where each sentence attends locally to nearby sentences and only a small number of sentences attend globally to all other sentences. We used slides generated by the authors of scientific papers as reference summaries since they contain the technical details from the paper. The results show the superiority of our model in terms of ROUGE scores.

Submitted 20 January, 2022; originally announced January 2022.

arXiv:2201.06924 [pdf, other]

A Synthetic Prediction Market for Estimating Confidence in Published Work

Authors: Sarah Rajtmajer, Christopher Griffin, Jian Wu, Robert Fraleigh, Laxmaan Balaji, Anna Squicciarini, Anthony Kwasnica, David Pennock, Michael McLaughlin, Timothy Fritton, Nishanth Nakshatri, Arjun Menon, Sai Ajay Modukuri, Rajal Nivargi, Xin Wei, C. Lee Giles

Abstract: Explainably estimating confidence in published scholarly work offers an opportunity for faster and more robust scientific progress. We develop a synthetic prediction market to assess the credibility of published claims in the social and behavioral sciences literature. We demonstrate our system and detail our findings using a collection of known replication projects. We suggest that this work lays the foundation for a research agenda that creatively uses AI for peer review.

Submitted 23 December, 2021; originally announced January 2022.

arXiv:2201.06091 [pdf, other]

Parallel transmit PUlse design for Saturation Homogeneity (PUSH) for Magnetization Transfer imaging at 7T

Authors: David Leitão, Raphael Tomi-Tricot, Pip Bridgen, Tom Wilkinson, Patrick Liebig, Rene Gumbrecht, Dieter Ritter, Sharon L. Giles, Ana Baburamani, Jan Sedlacik, Joseph V. Hajnal, Shaihan J. Malik

Abstract: Purpose: This work proposes a novel RF pulse design for parallel transmit (pTx) systems to obtain uniform saturation of semisolid magnetization for Magnetization Transfer (MT) contrast in the presence of transmit field ($B_1^+$) inhomogeneities. The semisolid magnetization is usually modeled as being purely longitudinal, with the applied $B_1^+$ field saturating but not rotating its magnetization; thus, standard pTx pulse design methods do not apply. Theory and Methods: Pulse design for Saturation Homogeneity (PUSH) optimizes pTx RF pulses by considering uniformity of root-mean-squared $B_1^+$, $B_1^{rms}$, which relates to the rate of semisolid saturation. Here we considered designs consisting of a small number of spatially non-selective sub-pulses optimized over either a single 2D plane or 3D. Simulations and in vivo experiments on a 7T Terra system with an 8-TX Nova head coil in 5 subjects were carried out to study the homogenization of $B_1^{rms}$ and of the MT contrast by acquiring MT ratio maps. Results: Simulations and in vivo experiments showed up to 6 and 2 times more uniform $B_1^{rms}$ compared to circular polarized (CP) mode for 2D and 3D optimizations, respectively. This translated into 4 and 1.25 times more uniform MT contrast, consistently for all subjects, where 2 sub-pulses were enough for the implementation and coil used. Conclusion: The proposed PUSH method obtains more uniform and higher MT contrast than CP mode within the same SAR budget.

Submitted 16 January, 2022; originally announced January 2022.

Comments: 18 pages, 9 figures. Code available at: https://github.com/mriphysics/PUSH

arXiv:2112.00215 [pdf, other]

doi 10.1029/2021SW002933

Impact angle control of local intense d$B$/d$t$ variations during shock-induced substorms

Authors: Denny M. Oliveira, James M. Weygand, Eftyhia Zesta, Chigomezyo M. Ngwira, Michael D. Hartinger, Zhonghua Xu, Barbara L. Giles, Dan J. Gershman, Marcos V. D. Silveira, Vitor M. Souza

Abstract: The impact of interplanetary shocks on the magnetosphere can trigger magnetic substorms that intensify auroral electrojet currents. These currents enhance ground magnetic field perturbations (d$B$/d$t$), which in turn generate geomagnetically induced currents (GICs) that can be detrimental to power transmission infrastructure. We perform a comparative study of d$B$/d$t$ variations in response to two similarly strong shocks, but with one being nearly frontal, and the other, highly inclined. Multi-instrument analyses by the Time History of Events and Macroscale Interactions during Substorms (THEMIS) and Los Alamos National Laboratory spacecraft show that nightside substorm-time energetic particle injections are more intense and occur faster in the case of the nearly head-on impact. The same trend is observed in d$B$/d$t$ variations recorded by THEMIS ground magnetometers. THEMIS all-sky imager data show a fast and clear poleward auroral expansion in the first case, which does not clearly occur in the second case. Strong field-aligned currents computed with the spherical elementary current system (SECS) technique occur in both cases, but the current variations resulting from the inclined shock impact are weaker and slower compared to the nearly frontal case. SECS analyses also reveal that geographic areas with d$B$/d$t$ surpassing the thresholds 1.5 and 5 nT/s, usually linked to high-risk GICs, are larger and occur earlier due to the symmetric compression caused by the nearly head-on impact. These results, with profound space weather implications, suggest that shock impact angles affect the geospace driving conditions and the location and intensity of the subsequent d$B$/d$t$ variations during substorm activity.

Submitted 30 November, 2021; originally announced December 2021.

Comments: 44 pages, 18 figures, 3 tables

Journal ref: Published in Space Weather, 2021

arXiv:2111.06329 [pdf, other]

doi 10.1029/2021GL097547

A Systematic Look at the Temperature Gradient Contribution to the Dayside Magnetopause Current

Authors: Jason M. H. Beedle, David J. Gershman, Vadim M. Uritsky, Tai D. Phan, Barbara L. Giles

Abstract: Magnetopause diamagnetic currents arise from density and temperature driven pressure gradients across the boundary layer. While theoretically recognized, the temperature contributions to the magnetopause current system have not yet been systematically studied. To bridge this gap, we used a database of Magnetospheric Multiscale (MMS) magnetopause crossings to analyze diamagnetic current densities and their contributions across the dayside and flank magnetopause. Our results indicate that the ion temperature gradient component makes up to 37% of the ion diamagnetic current density along the magnetopause and typically opposes the classical Chapman-Ferraro current direction, interfering destructively with the density gradient component, thus lowering the total diamagnetic current density. This effect is most pronounced on the flank magnetopause. The electron diamagnetic current was found to be 5 to 14 times weaker than the ion diamagnetic current on average.

Submitted 15 February, 2022; v1 submitted 3 October, 2021; originally announced November 2021.

arXiv:2111.03118 [pdf, other]

doi 10.1063/5.0071015

Energy Dissipation in Turbulent Reconnection

Authors: R. Bandyopadhyay, A. Chasapis, W. H. Matthaeus, T. N. Parashar, C. C. Haggerty, M. A. Shay, D. J. Gershman, B. L. Giles, J. L. Burch

Abstract: We study the nature of pressure-strain interaction at reconnection sites, detected by NASA's Magnetospheric Multiscale (MMS) Mission. We employ data from a series of published case studies, including a large-scale reconnection event at the magnetopause, three small-scale reconnection events at the magnetosheath current sheets, and one example of the recently discovered electron-only reconnection. In all instances, we find that the pressure-strain shows signature of conversion into (or from) internal energy at the reconnection site. The electron heating rate is larger than the ion heating rate and the compressive heating is dominant over the incompressive heating rate in all cases considered. The magnitude of thermal energy conversion rate is close to the electromagnetic energy conversion rate in the reconnection region. Although in most cases the pressure-strain interaction indicates that the particle internal energy is increasing, in one case the internal energy is decreasing. These observations indicate that the pressure-strain interaction can be used as an independent measure of energy conversion and dynamics in reconnection regions, in particular independent of measures based on the electromagnetic work. Finally, we explore a selected reconnection site in a turbulent Particle-in-Cell (PIC) simulation which further supports the observational results.

Submitted 4 November, 2021; originally announced November 2021.

Comments: The following article has been accepted by Physics of Plasmas

arXiv:2110.11624 [pdf, other]

SciCap: Generating Captions for Scientific Figures

Authors: Ting-Yao Hsu, C. Lee Giles, Ting-Hao 'Kenneth' Huang

Abstract: Researchers use figures to communicate rich, complex information in scientific papers. The captions of these figures are critical to conveying effective messages. However, low-quality figure captions commonly occur in scientific articles and may decrease understanding. In this paper, we propose an end-to-end neural framework to automatically generate informative, high-quality captions for scientific figures. To this end, we introduce SCICAP, a large-scale figure-caption dataset based on computer science arXiv papers published between 2010 and 2020. After pre-processing - including figure-type classification, sub-figure identification, text normalization, and caption text selection - SCICAP contained more than two million figures extracted from over 290,000 papers. We then established baseline models that caption graph plots, the dominant (19.2%) figure type. The experimental results showed both opportunities and steep challenges of generating captions for scientific figures.

Submitted 25 October, 2021; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: To Appear in EMNLP 2021 Findings. The dataset is available at: https://github.com/tingyaohsu/SciCap

arXiv:2106.03246 [pdf, other]

Extractive Research Slide Generation Using Windowed Labeling Ranking

Authors: Athar Sefid, Jian Wu, Prasenjit Mitra, Lee Giles

Abstract: Presentation slides describing the content of scientific and technical papers are an efficient and effective way to present that work. However, manually generating presentation slides is labor intensive. We propose a method to automatically generate slides for scientific papers based on a corpus of 5000 paper-slide pairs compiled from conference proceedings websites. The sentence labeling module of our method is based on SummaRuNNer, a neural sequence model for extractive summarization. Instead of ranking sentences based on semantic similarities in the whole document, our algorithm measures importance and novelty of sentences by combining semantic and lexical features within a sentence window. Our method outperforms several baseline methods including SummaRuNNer by a significant margin in terms of ROUGE score.

Submitted 6 June, 2021; originally announced June 2021.

Journal ref: NAACL/Proceedings of the Second Workshop on Scholarly Document Processing 2021

arXiv:2105.14931 [pdf, other]

doi 10.1007/978-3-030-86549-8_32

Document Domain Randomization for Deep Learning Document Layout Extraction

Authors: Meng Ling, Jian Chen, Torsten Möller, Petra Isenberg, Tobias Isenberg, Michael Sedlmair, Robert S. Laramee, Han-Wei Shen, Jian Wu, C. Lee Giles

Abstract: We present document domain randomization (DDR), the first successful transfer of convolutional neural networks (CNNs) trained only on graphically rendered pseudo-paper pages to real-world document segmentation. DDR renders pseudo-document pages by modeling randomized textual and non-textual contents of interest, with user-defined layout and font styles to support joint learning of fine-grained classes. We demonstrate competitive results using our DDR approach to extract nine document classes from the benchmark CS-150 and papers published in two domains, namely annual meetings of Association for Computational Linguistics (ACL) and IEEE Visualization (VIS). We compare DDR to conditions of style mismatch, fewer or more noisy samples that are more easily obtained in the real world. We show that high-fidelity semantic information is not necessary to label semantic classes but style mismatch between train and test can lower model accuracy. Using smaller training samples had a slightly detrimental effect. Finally, network models still achieved high test accuracy when correct labels are diluted towards confusing labels; this behavior holds across several classes.

Submitted 20 May, 2021; originally announced May 2021.

Comments: Main paper to appear in ICDAR 2021 (16th International Conference on Document Analysis and Recognition). This version contains additional materials. The associated test data is hosted on IEEE Data Port: http://doi.org/10.21227/326q-bf39

Journal ref: International Conference on Document Analysis and Recognition (ICDAR), 2021

arXiv:2104.09403 [pdf, other]

OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas

Authors: Shivansh Rao, Vikas Kumar, Daniel Kifer, Lee Giles, Ankur Mali

Abstract: Given a single RGB panorama, the goal of 3D layout reconstruction is to estimate the room layout by predicting the corners, floor boundary, and ceiling boundary. A common approach has been to use standard convolutional networks to predict the corners and boundaries, followed by post-processing to generate the 3D layout. However, the space-varying distortions in panoramic images are not compatible with the translational equivariance property of standard convolutions, thus degrading performance. Instead, we propose to use spherical convolutions. The resulting network, which we call OmniLayout, performs convolutions directly on the sphere surface, sampling according to inverse equirectangular projection, and is hence invariant to equirectangular distortions. Using a new evaluation metric, we show that our network reduces the error in the heavily distorted regions (near the poles) by approximately 25% when compared to standard convolutional networks. Experimental results show that OmniLayout outperforms the state-of-the-art by approximately 4% on two different benchmark datasets (PanoContext and Stanford 2D-3D). Code is available at https://github.com/rshivansh/OmniLayout.

Submitted 19 April, 2021; originally announced April 2021.

Comments: Accepted at CVPR, OmniCV Workshop. 10 Pages, 9 Figures, 6 Tables

arXiv:2104.04580 [pdf, other]

Predicting the Reproducibility of Social and Behavioral Science Papers Using Supervised Learning Models

Authors: Jian Wu, Rajal Nivargi, Sree Sai Teja Lanka, Arjun Manoj Menon, Sai Ajay Modukuri, Nishanth Nakshatri, Xin Wei, Zhuoer Wang, James Caverlee, Sarah M. Rajtmajer, C. Lee Giles

Abstract: In recent years, significant effort has been invested in verifying the reproducibility and robustness of research claims in social and behavioral sciences (SBS), much of which has involved resource-intensive replication projects. In this paper, we investigate prediction of the reproducibility of SBS papers using machine learning methods based on a set of features. We propose a framework that extracts five types of features from scholarly work that can be used to support assessments of reproducibility of published research claims. Bibliometric features, venue features, and author features are collected from public APIs or extracted using open source machine learning libraries with customized parsers. Statistical features, such as p-values, are extracted by recognizing patterns in the body text. Semantic features, such as funding information, are obtained from public APIs or are extracted using natural language processing models. We analyze pairwise correlations between individual features and their importance for predicting a set of human-assessed ground truth labels. In doing so, we identify a subset of 9 top features that play relatively more important roles in predicting the reproducibility of SBS papers in our corpus. Results are verified by comparing performances of 10 supervised predictive classifiers trained on different sets of features.

Submitted 21 October, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: 17 pages, 8 figures

arXiv:2104.02899 [pdf, other]

Recognizing and Verifying Mathematical Equations using Multiplicative Differential Neural Units

Authors: Ankur Mali, Alexander Ororbia, Daniel Kifer, C. Lee Giles

Abstract: Automated mathematical reasoning is a challenging problem that requires an agent to learn algebraic patterns that contain long-range dependencies. Two particular tasks that test this type of reasoning are (1) mathematical equation verification, which requires determining whether trigonometric and linear algebraic statements are valid identities or not, and (2) equation completion, which entails filling in a blank within an expression to make it true. Solving these tasks with deep learning requires that the neural model learn how to manipulate and compose various algebraic symbols, carrying this ability over to previously unseen expressions. Artificial neural networks, including recurrent networks and transformers, struggle to generalize on these kinds of difficult compositional problems, often exhibiting poor extrapolation performance. In contrast, recursive neural networks (recursive-NNs) are, theoretically, capable of achieving better extrapolation due to their tree-like design but are difficult to optimize as the depth of their underlying tree structure increases. To overcome this issue, we extend recursive-NNs to utilize multiplicative, higher-order synaptic connections and, furthermore, to learn to dynamically control and manipulate an external memory. We argue that this key modification gives the neural system the ability to capture powerful transition functions for each possible input. We demonstrate the effectiveness of our proposed higher-order, memory-augmented recursive-NN models on two challenging mathematical equation tasks, showing improved extrapolation, stable performance, and faster convergence. Our models achieve a 1.53% average improvement over current state-of-the-art methods in equation verification and achieve a 2.22% Top-1 average accuracy and 2.96% Top-5 average accuracy for equation completion.

Submitted 6 April, 2021; originally announced April 2021.

arXiv:2103.01256 [pdf]

doi 10.1038/s41467-021-25477-8

Understanding the onset of hot streaks across artistic, cultural, and scientific careers

Authors: Lu Liu, Nima Dehmamy, Jillian Chown, C. Lee Giles, Dashun Wang

Abstract: Hot streaks dominate the main impact of creative careers. Despite their ubiquitous nature across a wide range of creative domains, it remains unclear if there is any regularity underlying the beginning of hot streaks. Here, we develop computational methods using deep learning and network science and apply them to novel, large-scale datasets tracing the career outputs of artists, film directors, and scientists, allowing us to build high-dimensional representations of the artworks, films, and scientific publications they produce. By examining individuals' career trajectories within the underlying creative space, we find that across all three domains, individuals tend to explore diverse styles or topics before their hot streak, but become notably more focused in what they work on after the hot streak begins. Crucially, we find that hot streaks are associated with neither exploration nor exploitation behavior in isolation, but a particular sequence of exploration followed by exploitation, where the transition from exploration to exploitation closely traces the onset of a hot streak. Overall, these results unveil one of the first identifiable regularities underlying the onset of hot streaks, which appears universal across diverse creative domains. They suggest that a sequential view of creative strategies that balances experimentation and implementation may be particularly powerful for producing long-lasting contributions, which may have broad implications for identifying and nurturing creative talents.

Submitted 1 March, 2021; originally announced March 2021.

arXiv:2101.01787 [pdf, other]

Design and Analysis of a Synthetic Prediction Market using Dynamic Convex Sets

Authors: Nishanth Nakshatri, Arjun Menon, C. Lee Giles, Sarah Rajtmajer, Christopher Griffin

Abstract: We present a synthetic prediction market whose agent purchase logic is defined using a sigmoid transformation of a convex semi-algebraic set defined in feature space. Asset prices are determined by a logarithmic scoring market rule. Time varying asset prices affect the structure of the semi-algebraic sets leading to time-varying agent purchase rules. We show that under certain assumptions on the underlying geometry, the resulting synthetic prediction market can be used to arbitrarily closely approximate a binary function defined on a set of input data. We also provide sufficient conditions for market convergence and show that under certain instances markets can exhibit limit cycles in asset spot price. We provide an evolutionary algorithm for training agent parameters to allow a market to model the distribution of a given data set and illustrate the market approximation using two open source data sets. Results are compared to standard machine learning methods.

Submitted 5 January, 2021; originally announced January 2021.

Comments: 17 pages, 7 figures
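
Note on the entry above: the abstract states only that asset prices follow "a logarithmic scoring market rule". As a minimal sketch, assuming this refers to Hanson's logarithmic market scoring rule (LMSR) with cost function C(q) = b log Σ_i exp(q_i / b), the snippet below shows how spot prices and trade costs follow from outstanding share quantities. The function names and the liquidity parameter `b` are illustrative assumptions, not taken from the paper.

```python
import math


def lmsr_cost(quantities, b=100.0):
    """LMSR cost function C(q) = b * log(sum_i exp(q_i / b)).

    `quantities` holds the outstanding shares per asset; `b` is an
    assumed liquidity parameter (not specified in the abstract).
    """
    m = max(q / b for q in quantities)  # shift by the max for numerical stability
    return b * (m + math.log(sum(math.exp(q / b - m) for q in quantities)))


def lmsr_prices(quantities, b=100.0):
    """Spot prices p_i = exp(q_i / b) / sum_j exp(q_j / b); they sum to 1."""
    m = max(q / b for q in quantities)
    weights = [math.exp(q / b - m) for q in quantities]
    total = sum(weights)
    return [w / total for w in weights]


def trade_cost(quantities, asset, shares, b=100.0):
    """Amount an agent pays to buy `shares` of `asset`: C(q_after) - C(q_before)."""
    after = list(quantities)
    after[asset] += shares
    return lmsr_cost(after, b) - lmsr_cost(quantities, b)
```

Under this rule, buying an asset raises its own spot price and lowers the others, which is one way time-varying prices could feed back into agent purchase rules as the abstract describes.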

arXiv:2012.07565 [pdf, other]

Automating Document Classification with Distant Supervision to Increase the Efficiency of Systematic Reviews

Authors: Xiaoxiao Li, Rabah Al-Zaidy, Amy Zhang, Stefan Baral, Le Bao, C. Lee Giles

Abstract: Objective: Systematic reviews of scholarly documents often provide complete and exhaustive summaries of literature relevant to a research question. However, well-done systematic reviews are expensive, time-demanding, and labor-intensive. Here, we propose an automatic document classification approach to significantly reduce the effort in reviewing documents. Methods: We first describe a manual document classification procedure that is used to curate a pertinent training dataset and then propose three classifiers: a keyword-guided method, a cluster analysis-based refined method, and a random forest approach that utilizes a large set of feature tokens. As an example, this approach is used to identify documents studying female sex workers that are assumed to contain content relevant to either HIV or violence. We compare the performance of the three classifiers by cross-validation and conduct a sensitivity analysis on the portion of data utilized in training the model. Results: The random forest approach provides the highest area under the curve (AUC) for both receiver operating characteristic (ROC) and precision/recall (PR). Analyses of precision and recall suggest that random forest could facilitate manually reviewing 20% of the articles while containing 80% of the relevant cases. Finally, we found a good classifier could be obtained by using a relatively small training sample size. Conclusions: In sum, the automated procedure of document classification presented here could improve both the precision and efficiency of systematic reviews, as well as facilitating live reviews, where reviews are updated regularly.

Submitted 9 December, 2020; originally announced December 2020.

arXiv:2012.03397 [pdf, other]

doi 10.1109/BigData50022.2020.9377796

Modeling Updates of Scholarly Webpages Using Archived Data

Authors: Yasith Jayawardana, Alexander C. Nwala, Gavindya Jayawardena, Jian Wu, Sampath Jayarathna, Michael L. Nelson, C. Lee Giles

Abstract: The vastness of the web imposes a prohibitive cost on building large-scale search engines with limited resources. Crawl frontiers thus need to be optimized to improve the coverage and freshness of crawled content. In this paper, we propose an approach for modeling the dynamics of change in the web using archived copies of webpages. To evaluate its utility, we conduct a preliminary study on the scholarly web using 19,977 seed URLs of authors' homepages obtained from their Google Scholar profiles. We first obtain archived copies of these webpages from the Internet Archive (IA), and estimate when their actual updates occurred. Next, we apply maximum likelihood to estimate their mean update frequency ($λ$) values. Our evaluation shows that $λ$ values derived from a short history of archived data provide a good estimate for the true update frequency in the short-term, and that our method provides better estimations of updates at a fraction of resources compared to the baseline models. Based on this, we demonstrate the utility of archived data to optimize the crawling strategy of web crawlers, and uncover important challenges that inspire future research directions.

Submitted 6 December, 2020; originally announced December 2020.

Comments: 12 pages, 2 appendix pages, 18 figures, to be published in Proceedings of IEEE Big Data 2020 - 5th Computational Archival Science (CAS) Workshop
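
Note on the entry above: the abstract mentions estimating a mean update frequency λ by maximum likelihood but does not give the estimator. As a minimal sketch, assuming page updates follow a homogeneous Poisson process and every update time inside the observation window is known, the maximum-likelihood estimate reduces to the number of updates divided by the window length. The paper works from sparse archived snapshots, so its actual estimator may well differ; the function name and inputs here are hypothetical.

```python
def poisson_rate_mle(update_times, window_start, window_end):
    """Maximum-likelihood estimate of a Poisson update rate.

    Assumes a homogeneous Poisson process with fully observed update
    times (an illustrative simplification). Times and window bounds
    share one unit, e.g. days, so the result is updates per that unit.
    """
    duration = window_end - window_start
    if duration <= 0:
        raise ValueError("window_end must be greater than window_start")
    # Count only the updates that fall inside the observation window.
    n_updates = sum(1 for t in update_times if window_start <= t <= window_end)
    return n_updates / duration


# Usage: four estimated updates over a 30-day window.
rate = poisson_rate_mle([1, 4, 9, 20], window_start=0, window_end=30)
mean_gap = 1 / rate if rate > 0 else float("inf")  # expected time between updates
```

A crawler could, for instance, schedule revisits at roughly `mean_gap` intervals per page, which is one simple way an estimated λ might inform a crawl strategy.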

arXiv:2012.02641 [pdf, other]

doi 10.3847/1538-4357/abce5a

In situ evidence of ion acceleration between consecutive reconnection jet fronts

Authors: Filomena Catapano, Alessandro Retino, Gaetano Zimbardo, Alexandra Alexandrova, Ian J. Cohen, Drew L. Turner, Olivier Le Contel, Giulia Cozzani, Silvia Perri, Antonella Greco, Hugo Breuillard, Dominique Delcourt, Laurent Mirioni, Yuri Khotyaintsev, Andris Vaivads, Barbara L. Giles, Barry H. Mauk, Stephen A. Fuselier, Roy B. Torbert, Christopher T. Russell, Per A. Lindqvist, Robert E. Ergun, Thomas Moore, James L. Burch

Abstract: Processes driven by unsteady reconnection can efficiently accelerate particles in many astrophysical plasmas. One example is the reconnection jet fronts in an outflow region. We present evidence of suprathermal ion acceleration between two consecutive reconnection jet fronts observed by the Magnetospheric Multiscale mission in the terrestrial magnetotail. An earthward propagating jet is approached by a second faster jet. Between the jets, the thermal ions are mostly perpendicular to the magnetic field, are trapped and are gradually accelerated in the parallel direction up to 150 keV. Observations suggest that ions are predominantly accelerated by a Fermi-like mechanism in the contracting magnetic bottle formed between the two jet fronts. The ion acceleration mechanism is presumably efficient in other environments where jet fronts produced by variable rates of reconnection are common and where the interaction of multiple jet fronts can also develop a turbulent environment, e.g. in stellar and solar eruptions.

Submitted 30 November, 2020; originally announced December 2020.

arXiv:2010.01782 [pdf, other]

doi 10.1093/mnrasl/slaa171

Observation of Inertial-range Energy Cascade within a Reconnection Jet in Earth's Magnetotail

Authors: Riddhi Bandyopadhyay, Alexandros Chasapis, D. J. Gershman, B. L. Giles, C. T. Russell, R. J. Strangeway, O. Le Contel, M. R. Argall, J. L. Burch

Abstract: Earth's magnetotail region provides a unique environment to study plasma turbulence. We investigate the turbulence developed in an exhaust produced by magnetic reconnection at the terrestrial magnetotail region. Magnetic and velocity spectra show broad-band fluctuations corresponding to the inertial range, with Kolmogorov $-5/3$ scaling, indicative of a well-developed turbulent cascade. We examine the mixed, third-order structure functions, and obtain a linear scaling in the inertial range. This linear scaling of the third-order structure functions implies a scale-invariant cascade of energy through the inertial range. A Politano-Pouquet third-order analysis gives an estimate of the incompressive energy transfer rate of $\sim 10^{7}~\mathrm{J\,kg^{-1}\,s^{-1}}$. This is four orders of magnitude higher than the values typically measured in 1 AU solar wind, suggesting that the turbulence cascade plays an important role as a pathway of energy dissipation during reconnection events in the tail region.

Submitted 5 October, 2020; originally announced October 2020.

Comments: Accepted for publication in MNRAS

arXiv:2009.03079 [pdf, ps, other]

doi 10.1029/2020JA027854

Estimation of the electron density from spacecraft potential during high frequency electric field fluctuations

Authors: O. W. Roberts, R. Nakamura, K. Torkar, D. B. Graham, D. J. Gershman, J. C. Holmes, A. Varsani, C. P. Escoubet, Z. Vörös, S. Wellenzohn, Y. Khotyaintsev, R. E. Ergun, B. L. Giles

Abstract: Spacecraft potential has often been used to infer electron density with much higher time resolution than is typically possible with plasma instruments. However, recently two studies by Torkar et al. 2017 and Graham et al. 2018 have shown that external electric fields can also have an effect on the spacecraft potential by enhancing photoelectron escape from the surface. Consequently, should the electron density derived from the spacecraft potential be used during an event with a large electric field, the estimation would be contaminated and the user would see the effects of the electric field rather than density perturbations. The goal of this paper is to propose a method to remove the electric field effects to allow the density derived from spacecraft potential to be used even during large amplitude wave events such as Langmuir waves or upper hybrid waves.

Submitted 7 September, 2020; originally announced September 2020.

Comments: Published in JGR Space Physics

Journal ref: Journal of Geophysical Research: Space Physics, 125, e2020JA027854

arXiv:2008.11290 [pdf, other]

Extractive Summarizer for Scholarly Articles

Authors: Athar Sefid, Clyde Lee Giles, Prasenjit Mitra

Abstract: We introduce an extractive method that summarizes long scientific papers. Our model uses presentation slides provided by the authors of the papers as the gold summary standard to label the sentences. The sentences are ranked based on their novelty and their importance as estimated by deep neural networks. Our window-based extractive labeling of sentences results in an improvement of at least 4 ROUGE1-Recall points.

Submitted 25 August, 2020; originally announced August 2020.

Showing 1–50 of 129 results for author: Giles, L