-
Learning Malware Representation based on Execution Sequences
Authors:
Yi-Ting Huang,
Ting-Yi Chen,
Yeali S. Sun,
Meng Chang Chen
Abstract:
Malware analysis has been extensively investigated as the number and types of malware have increased dramatically. However, most previous studies use end-to-end systems to detect whether a sample is malicious, or to identify its malware family. In this paper, we propose a neural network framework composed of an embedder, an encoder, and a filter to learn malware representations from characteristic execution sequences for malware family classification. The embedder uses BERT and Sent2Vec, state-of-the-art embedding modules, to capture relations within a single API call and among consecutive API calls in an execution trace. The encoder comprises gated recurrent units (GRU) to preserve the ordinal position of API calls and a self-attention mechanism for comparing intra-relations among different positions of API calls. The filter identifies representative API calls to build the malware representation. We conduct broad experiments to determine the influence of individual framework components. The results show that the proposed framework outperforms the baselines, and also demonstrate that considering Sent2Vec to learn complete API call embeddings and GRU to explicitly preserve ordinal information yields more information and thus significant improvements. Also, the proposed approach effectively classifies new malicious execution traces on the basis of similarities with previously collected families.
Submitted 4 February, 2021; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Bringing personalized learning into computer-aided question generation
Authors:
Yi-Ting Huang,
Meng Chang Chen,
Yeali S. Sun
Abstract:
This paper proposes a novel statistical method of ability estimation based on acquisition distributions for personalized computer-aided question generation. This method captures learning outcomes over time and provides a flexible measurement based on the acquisition distributions instead of precalibration. Compared to previous studies, the proposed method is robust, especially when a student's ability is unknown. The results from the empirical data show that the estimated abilities match the actual abilities of learners, and the pretest and post-test of the experimental group show significant improvement. These results suggest that this method can serve as the ability estimation for a personalized computer-aided testing environment.
Submitted 29 August, 2018;
originally announced August 2018.
-
Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes
Authors:
Yi-Ting Huang,
Meng Chang Chen,
Yeali S. Sun
Abstract:
In the last several years, the field of computer-assisted language learning has increasingly focused on computer-aided question generation. However, this approach often provides test takers with an exhaustive number of questions that are not designed for any specific testing purpose. In this work, we present a personalized computer-aided question generation system that generates multiple-choice questions at various difficulty levels and of various types, including vocabulary, grammar, and reading comprehension. To address the weaknesses of test takers, it selects questions depending on an estimated proficiency level and the unclear concepts behind incorrect responses. The results show that students using the personalized automatic quiz generation corrected their mistakes more frequently than those using only computer-aided question generation. Moreover, these students demonstrated the most progress between the pretest and post-test and correctly answered more difficult questions. Finally, we investigated the personalizing strategy and found that a student could make significant progress if the proposed system offered vocabulary questions at his or her proficiency level, and grammar and reading comprehension questions at a level lower than his or her proficiency level.
Submitted 29 August, 2018;
originally announced August 2018.
-
Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers
Authors:
Yi-Ting Huang,
Meng Chang Chen,
Yeali S. Sun
Abstract:
In recent years, the number of people studying English as a second language (ESL) has surpassed the number of native speakers. Recent work has demonstrated the success of providing personalized content based on reading difficulty in applications such as information retrieval and summarization. However, almost all prior studies of reading difficulty are designed for native speakers, rather than non-native readers. In this study, we investigate various features for ESL readers by conducting a linear regression to estimate the reading level of English language sources. This estimation is based not only on the complexity of lexical and syntactic features, but also on several novel concepts, including the age of word and grammar acquisition from several sources, word sense from WordNet, and the implicit relation between sentences. By employing the Bayesian Information Criterion (BIC) to select the optimal model, we find that the combination of the number of words, the age of word acquisition, and the height of the parse tree generates better results than competing models. Thus, our results show that the proposed second-language reading difficulty estimation outperforms other first-language reading difficulty estimations.
Submitted 29 August, 2018;
originally announced August 2018.
-
Virtual Machine Introspection Based Malware Behavior Profiling and Family Grouping
Authors:
Shun-Wen Hsiao,
Yeali S. Sun,
Meng Chang Chen
Abstract:
The proliferation of malware has been attributed to the alteration of a handful of original malware source codes. Malware altered from the same origin shares some intrinsic behaviors and forms a malware family. Promptly identifying the family of a malware sample when it is first seen on the Internet can provide useful clues to mitigate the threat. In this paper, a malware profiler (VMP) is proposed to profile the execution behaviors of malware by leveraging the virtual machine introspection (VMI) technique. The VMP inserts plug-ins inside the virtual machine monitor (VMM) to record the invoked API calls with their input parameters and return values as the profile of the malware. Jaccard distance, a popular similarity measure, and a phylogenetic tree construction method are adopted to discover malware families. The studies of malware profiles show that malware samples from the same family are very similar to each other and distinct from other malware families as well as from benign software. This paper also examines VMP against existing anti-malware detection engines and some well-known malware grouping methods to compare the quality of their malware family constructions. A peer voting approach is proposed, and the results show that VMP is better than almost all of the compared anti-malware engines and comparable to the fine-tuned text-mining approach and high-order N-gram approaches. We also establish a malware profiling website based on VMP for malware research.
Submitted 4 May, 2017;
originally announced May 2017.
-
Giant spin-phonon-electronic coupling in a 5d oxide
Authors:
S. Calder,
J. H. Lee,
M. B. Stone,
M. D. Lumsden,
J. C. Lang,
M. Feygenson,
Y. G. Shi,
Y. S. Sun,
Y. Tsujimoto,
K. Yamaura,
A. D. Christianson
Abstract:
Enhanced coupling of material properties offers new fundamental insights and routes to multifunctional devices. In this context 5d oxides provide new paradigms of cooperative interactions driving novel emergent behavior. This is exemplified in 5d osmates that host a metal-insulator transition (MIT) driven by magnetic order. Here we consider the most robust case, the 5d perovskite NaOsO3, and reveal a giant coupling between spin and phonon through a frequency shift of Δω = 40 cm⁻¹, the largest measured in any material. We identify the dominant octahedral breathing mode and show isosymmetry with spin ordering which induces dynamic charge disproportionation that sheds new light on the MIT. The occurrence of the dramatic spin-phonon-electronic coupling in NaOsO3 is due to a property common to all 5d materials: the large spatial extent of the 5d ion. This allows magnetism to couple to phonons on an unprecedented scale and consequently offers multiple new routes to enhanced coupled phenomena.
Submitted 9 January, 2015;
originally announced January 2015.
-
Magnetically driven metal-insulator transition in NaOsO3
Authors:
S. Calder,
V. O. Garlea,
D. F. McMorrow,
M. D. Lumsden,
M. B. Stone,
J. C. Lang,
J. -W. Kim,
J. A. Schlueter,
Y. G. Shi,
K. Yamaura,
Y. S. Sun,
Y. Tsujimoto,
A. D. Christianson
Abstract:
The metal-insulator transition (MIT) is one of the most dramatic manifestations of electron correlations in materials. Various mechanisms producing MITs have been extensively considered, including the Mott (electron localization via Coulomb repulsion), Anderson (localization via disorder) and Peierls (localization via distortion of a periodic 1D lattice) mechanisms. One additional route to a MIT proposed by Slater, in which long-range magnetic order in a three-dimensional system drives the MIT, has received relatively little attention. Using neutron and X-ray scattering we show that the MIT in NaOsO3 is coincident with the onset of long-range commensurate three-dimensional magnetic order. Whilst candidate materials have been suggested, our experimental methodology allows the first definitive demonstration of the long-predicted Slater MIT. We discuss our results in the light of recent reports of a Mott spin-orbit insulating state in other 5d oxides.
Submitted 7 February, 2012;
originally announced February 2012.