Search | arXiv e-print repository

doi 10.1109/IPCCC59868.2024.10850192

Enhanced Outsourced and Secure Inference for Tall Sparse Decision Trees

Authors: Andrew Quijano, Spyros T. Halkidis, Kevin Gallagher, Kemal Akkaya, Nikolaos Samaras

Abstract: A decision tree is an easy-to-understand tool that has been widely used for classification tasks. On the one hand, due to privacy concerns, there has been an urgent need to create privacy-preserving classifiers that conceal the user's input from the classifier. On the other hand, with the rise of cloud computing, data owners are keen to reduce risk by outsourcing their model, but want security guarantees that third parties cannot steal their decision tree model. To address these issues, Joye and Salehi introduced a theoretical protocol that efficiently evaluates decision trees while maintaining privacy by leveraging their comparison protocol that is resistant to timing attacks. However, their approach was not only inefficient but also prone to side-channel attacks. Therefore, in this paper, we propose a new decision tree inference protocol in which the model is shared and evaluated among multiple entities. We partition our decision tree model by each level to be stored in a new entity we refer to as a "level-site." Utilizing this approach, we were able to gain improved average run time for classifier evaluation for a non-complete tree, while also having strong mitigations against side-channel attacks.
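The level-by-level partitioning described in this abstract can be illustrated with a plaintext sketch. It is not the paper's protocol: there is no encryption or secure comparison here, and the names `Node`, `partition_by_level`, and `evaluate` are hypothetical. It only shows how a tree might be split so that each "level-site" holds the nodes of one depth, and why a tall sparse tree can finish early when a leaf is reached at a shallow level.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Node:
    # Internal nodes set feature/threshold/left/right; leaves set only label.
    feature: Optional[int] = None
    threshold: Optional[float] = None
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[str] = None


def partition_by_level(root: Node) -> List[List[Node]]:
    """Group nodes by depth; each inner list stands in for one level-site."""
    levels, frontier = [], [root]
    while frontier:
        levels.append(frontier)
        next_frontier = []
        for node in frontier:
            if node.label is None:  # only internal nodes have children
                next_frontier.extend([node.left, node.right])
        frontier = next_frontier
    return levels


def evaluate(levels: List[List[Node]], x: Sequence[float]) -> Tuple[str, int]:
    """Walk the level-sites in order; return (label, number of sites visited)."""
    index = 0
    for depth, site in enumerate(levels):
        node = site[index]
        if node.label is not None:
            return node.label, depth + 1  # stop as soon as a leaf is reached
        # Children are stored in pairs, in the order of the internal nodes
        # of this level, so count the internal nodes that precede this one.
        internal_before = sum(1 for n in site[:index] if n.label is None)
        go_right = x[node.feature] > node.threshold
        index = 2 * internal_before + (1 if go_right else 0)
    raise ValueError("tree ended without reaching a leaf")
```

In this sketch only an index is passed from one level to the next, and an input that reaches a shallow leaf never touches the deeper sites, which is one plausible reading of the improved average run time claimed for non-complete trees.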

Submitted 4 May, 2025; originally announced May 2025.

Journal ref: IEEE International Performance Computing and Communications Conference (2024) 1-6

arXiv:2107.09948 [pdf, other]

A Statistical Model of Word Rank Evolution

Authors: Alex John Quijano, Rick Dale, Suzanne Sindi

Abstract: The availability of large linguistic data sets enables data-driven approaches to study linguistic change. The Google Books corpus unigram frequency data set is used to investigate the word rank dynamics in eight languages. We observed the rank changes of the unigrams from 1900 to 2008 and compared it to a Wright-Fisher inspired model that we developed for our analysis. The model simulates a neutral evolutionary process with the restriction of having no disappearing and added words. This work explains the mathematical framework of the model - written as a Markov Chain with multinomial transition probabilities - to show how frequencies of words change in time. From our observations in the data and our model, word rank stability shows two types of characteristics: (1) the increase/decrease in ranks are monotonic, or (2) the rank stays the same. Based on our model, high-ranked words tend to be more stable while low-ranked words tend to be more volatile. Some words change in ranks in two ways: (a) by an accumulation of small increasing/decreasing rank changes in time and (b) by shocks of increase/decrease in ranks. Most words in all of the languages we have looked at are rank stable, but not as stable as a neutral model would predict. The stopwords and Swadesh words are observed to be rank stable across eight languages indicating linguistic conformity in established languages. These signatures suggest unigram frequencies in all languages have changed in a manner inconsistent with a purely neutral evolutionary process.
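A Markov chain with multinomial transitions, as mentioned in this abstract, can be sketched in a few lines. This is an illustration only, not the authors' model: the function names are hypothetical, and the way the "no disappearing words" restriction is enforced (reserving one token per word before the multinomial draw) is an assumption made for the example.

```python
import random
from collections import Counter
from typing import Dict


def wright_fisher_step(counts: Dict[str, int], rng: random.Random) -> Dict[str, int]:
    """Resample one generation of word counts from the current frequencies."""
    vocab = list(counts)
    total = sum(counts.values())
    # Assumption: keep one token per word so the vocabulary never shrinks;
    # this requires total >= len(vocab).
    free_tokens = total - len(vocab)
    weights = [counts[word] for word in vocab]
    # Multinomial draw: each free token picks a word with probability
    # proportional to that word's current count.
    draws = Counter(rng.choices(vocab, weights=weights, k=free_tokens))
    return {word: 1 + draws[word] for word in vocab}


def ranks(counts: Dict[str, int]) -> Dict[str, int]:
    """Rank 1 is the most frequent word; ties are broken alphabetically."""
    ordered = sorted(counts, key=lambda word: (-counts[word], word))
    return {word: position + 1 for position, word in enumerate(ordered)}
```

Calling `wright_fisher_step` repeatedly and recording `ranks` at each step gives a neutral baseline of rank trajectories against which observed rank changes could be compared.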

Submitted 14 February, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: This manuscript - with 31 pages (main), 10 figures (main), 24 pages (supplementary), and 19 figures (supplementary) - is a manuscript for a journal research article

arXiv:2107.02559 [pdf, other]

doi 10.1109/TAP.2022.3142310

Radiation and Scattering of EM Waves in Large Plasmas Around Objects in Hypersonic Flight

Authors: A. Scarabosio, J. L. Araque Quijano, J. Tobon, M. Righero, G. Giordanengo, D. D'Ambrosio, L. Walpot, G. Vecchi

Abstract: Hypersonic flight regime is conventionally defined for Mach larger than 5; in these conditions, the flying object becomes enveloped in a plasma. This plasma is densest in thin surface layers, but in typical situations of interest it impacts electromagnetic wave propagation in an electrically large volume. We address this problem with a hybrid approach. We employ Equivalence Theorem to separate the inhomogeneous plasma region from the surrounding free space via an equivalent (Huygens) surface, and the Eikonal approximation to Maxwell equations in the large inhomogeneous region for obtaining equivalent currents on the separating surface. Then, we obtain the scattered field via (exact) free space radiation of these surface equivalent currents. The method is extensively tested against reference results and then applied to a real-life re-entry vehicle with full 3D plasma computed via Computational Fluid Dynamic (CFD) simulations. We address both scattering (RCS) from the entire vehicle and radiation from the on-board antennas. From our results, significant radio link path losses can be associated with plasma spatial variations (gradients) and collisional losses, to an extent that matches well the usually perceived blackout in crossing layers in cutoff. Furthermore, we find good agreement with existing literature concerning significant alterations of the radar response (RCS) due to the plasma envelope.

Submitted 6 July, 2021; originally announced July 2021.

arXiv:2101.06326 [pdf, other]

Grid Search Hyperparameter Benchmarking of BERT, ALBERT, and LongFormer on DuoRC

Authors: Alex John Quijano, Sam Nguyen, Juanita Ordonez

Abstract: The purpose of this project is to evaluate three language models named BERT, ALBERT, and LongFormer on the Question Answering dataset called DuoRC. The language model task has two inputs, a question, and a context. The context is a paragraph or an entire document while the output is the answer based on the context. The goal is to perform grid search hyperparameter fine-tuning using DuoRC. Pretrained weights of the models are taken from the Huggingface library. Different sets of hyperparameters are used to fine-tune the models using two versions of DuoRC which are the SelfRC and the ParaphraseRC. The results show that the ALBERT (pretrained using the SQuAD1 dataset) has an F1 score of 76.4 and an accuracy score of 68.52 after fine-tuning on the SelfRC dataset. The Longformer model (pretrained using the SQuAD and SelfRC datasets) has an F1 score of 52.58 and an accuracy score of 46.60 after fine-tuning on the ParaphraseRC dataset. The current results outperformed the results from the previous model by DuoRC.
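Grid search, the tuning strategy named in this abstract, tries every combination of the candidate hyperparameter values and keeps the best-scoring one. The sketch below shows the generic loop only. The grid values and the `toy_score` function are hypothetical stand-ins; in the setting described above, the scoring step would be a full fine-tuning run followed by an F1 evaluation.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple


def grid_search(
    grid: Dict[str, List[Any]],
    evaluate: Callable[[Dict[str, Any]], float],
) -> Tuple[Dict[str, Any], float]:
    """Score every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:  # the first combination wins on ties
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: Dict[str, Any]) -> float:
    # Hypothetical stand-in for "fine-tune, then measure F1 on a dev set".
    penalty = abs(params["learning_rate"] - 3e-5) * 1e5
    return 1.0 - penalty + 0.01 * params["batch_size"]


# Hypothetical grid; these values are not taken from the report.
example_grid = {"learning_rate": [1e-5, 3e-5, 5e-5], "batch_size": [8, 16]}
best_params, best_score = grid_search(example_grid, toy_score)
```

The cost grows with the product of the list lengths (six runs for the grid above), which is why grids for large models are usually kept small.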

Submitted 29 March, 2021; v1 submitted 15 January, 2021; originally announced January 2021.

Comments: 9 pages, 2 figures, 2 tables

Report number: LLNL-TR-817729

arXiv:2009.12413 [pdf, other]

doi 10.1109/TCBB.2020.3040910

Maximum Covering Subtrees for Phylogenetic Networks

Authors: Nathan Davidov, Amanda Hernandez, Justin Jian, Patrick McKenna, K. A. Medlin, Roadra Mojumder, Megan Owen, Andrew Quijano, Amanda Rodriguez, Katherine St. John, Katherine Thai, Meliza Uraga

Abstract: Tree-based phylogenetic networks, which may be roughly defined as leaf-labeled networks built by adding arcs only between the original tree edges, have elegant properties for modeling evolutionary histories. We answer an open question of Francis, Semple, and Steel about the complexity of determining how far a phylogenetic network is from being tree-based, including non-binary phylogenetic networks. We show that finding a phylogenetic tree covering the maximum number of nodes in a phylogenetic network can be computed in polynomial time via an encoding into a minimum-cost maximum flow problem.

Submitted 24 November, 2020; v1 submitted 25 September, 2020; originally announced September 2020.

arXiv:2008.11612 [pdf, other]

doi 10.1109/MASSW.2019.00017

Server-side Fingerprint-Based Indoor Localization Using Encrypted Sorting

Authors: Andrew Quijano, Kemal Akkaya

Abstract: GPS signals, the main origin of navigation, are not functional in indoor environments. Therefore, Wi-Fi access points have started to be increasingly used for localization and tracking inside the buildings by relying on a fingerprint-based approach. However, with these types of approaches, several concerns regarding the privacy of the users have arisen. Malicious individuals can determine a client's daily habits and activities by simply analyzing their wireless signals. While there are already efforts to incorporate privacy into the existing fingerprint-based approaches, they are limited to the characteristics of the homomorphic cryptographic schemes they employed. In this paper, we propose to enhance the performance of these approaches by exploiting another homomorphic algorithm, namely DGK, with its unique encrypted sorting capability and thus pushing most of the computations to the server side. We developed an Android app and tested our system within a Columbia University dormitory. Compared to existing systems, the results indicated that more power savings can be achieved at the client side and DGK can be a viable option with more powerful server computation capabilities.

Submitted 26 August, 2020; originally announced August 2020.

Comments: This paper was presented at the IEEE MASS REU workshop, 2019, Monterey, CA

Journal ref: 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW), pp. 53-57

Showing 1–6 of 6 results for author: Quijano, A