-
GME: GPU-based Microarchitectural Extensions to Accelerate Homomorphic Encryption
Authors:
Kaustubh Shivdikar,
Yuhui Bao,
Rashmi Agrawal,
Michael Shen,
Gilbert Jonatan,
Evelio Mora,
Alexander Ingare,
Neal Livesay,
José L. Abellán,
John Kim,
Ajay Joshi,
David Kaeli
Abstract:
Fully Homomorphic Encryption (FHE) enables the processing of encrypted data without decrypting it. FHE has garnered significant attention over the past decade as it supports secure outsourcing of data processing to remote cloud services. Despite its promise of strong data privacy and security guarantees, FHE introduces a slowdown of up to five orders of magnitude as compared to the same computation using plaintext data. This overhead is presently a major barrier to the commercial adoption of FHE.
In this work, we leverage GPUs to accelerate FHE, capitalizing on a well-established GPU ecosystem available in the cloud. We propose GME, which combines three key microarchitectural extensions along with a compile-time optimization to the current AMD CDNA GPU architecture. First, GME integrates a lightweight on-chip compute unit (CU)-side hierarchical interconnect to retain ciphertext in cache across FHE kernels, thus eliminating redundant memory transactions. Second, to tackle compute bottlenecks, GME introduces special MOD-units that provide native custom hardware support for modular reduction operations, one of the most commonly executed sets of operations in FHE. Third, by integrating the MOD-unit with our novel pipelined $64$-bit integer arithmetic cores (WMAC-units), GME further accelerates FHE workloads by $19\%$. Finally, we propose a Locality-Aware Block Scheduler (LABS) that exploits the temporal locality available in FHE primitive blocks. Incorporating these microarchitectural features and compiler optimizations, we create a synergistic approach achieving average speedups of $796\times$, $14.2\times$, and $2.3\times$ over Intel Xeon CPU, NVIDIA V100 GPU, and Xilinx FPGA implementations, respectively.
Submitted 19 September, 2023;
originally announced September 2023.
-
Accelerating Polynomial Multiplication for Homomorphic Encryption on GPUs
Authors:
Kaustubh Shivdikar,
Gilbert Jonatan,
Evelio Mora,
Neal Livesay,
Rashmi Agrawal,
Ajay Joshi,
Jose Abellan,
John Kim,
David Kaeli
Abstract:
Homomorphic Encryption (HE) enables users to securely outsource both the storage and computation of sensitive data to untrusted servers. Not only does HE offer an attractive solution for security in cloud systems, but lattice-based HE systems are also believed to be resistant to attacks by quantum computers. However, current HE implementations suffer from prohibitively high latency. For lattice-based HE to become viable for real-world systems, the key bottlenecks, particularly polynomial multiplication, must be made highly efficient.
In this paper, we present a characterization of GPU-based implementations of polynomial multiplication. We begin with a survey of modular reduction techniques and analyze several variants of the widely used Barrett modular reduction algorithm. We then propose a modular reduction variant optimized for 64-bit integer words on the GPU, obtaining a 1.8x speedup over existing comparable implementations. Next, we explore the following GPU-specific improvements for polynomial multiplication targeted at optimizing latency and throughput: 1) We present a 2D mixed-radix, multi-block implementation of NTT that results in a 1.85x average speedup over the previous state-of-the-art. 2) We explore shared memory optimizations aimed at reducing redundant memory accesses, further improving speedups by 1.2x. 3) Finally, we fuse the Hadamard product with neighboring stages of the NTT, reducing the twiddle factor memory footprint by 50%. By combining our NTT optimizations, we achieve an overall speedup of 123.13x and 2.37x over the previous state-of-the-art CPU and GPU implementations of NTT kernels, respectively.
Submitted 2 September, 2022;
originally announced September 2022.
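For readers unfamiliar with the two building blocks named in the abstract above, the following is a minimal CPU-side sketch of textbook Barrett reduction and a radix-2 NTT used for polynomial multiplication. It is not the paper's GPU implementation: the prime `Q = 7681`, the helper names (`barrett_setup`, `find_root`, `poly_mul`), the recursive radix-2 structure, and the use of a zero-padded cyclic transform (rather than the negacyclic variant common in HE) are all assumptions chosen for illustration.

```python
Q = 7681  # assumed small NTT-friendly prime: Q - 1 = 2^9 * 15


def barrett_setup(q):
    """Precompute k = bit length of q and mu = floor(4^k / q)."""
    k = q.bit_length()
    return k, (1 << (2 * k)) // q


def barrett_reduce(x, q, k, mu):
    """Return x mod q for 0 <= x < q*q using multiplies and shifts only."""
    t = (x * mu) >> (2 * k)   # estimate of floor(x / q), low by at most 1
    r = x - t * q             # therefore 0 <= r < 2q
    if r >= q:
        r -= q
    return r


K, MU = barrett_setup(Q)


def mulmod(a, b):
    """Modular product of two residues already reduced mod Q."""
    return barrett_reduce(a * b, Q, K, MU)


def find_root(n, q):
    """Find a primitive n-th root of unity mod q (n a power of two >= 2)."""
    for g in range(2, q):
        w = pow(g, (q - 1) // n, q)
        if pow(w, n // 2, q) == q - 1:  # w^(n/2) = -1, so w has order exactly n
            return w
    raise ValueError("no primitive root of unity of that order")


def ntt(a, w):
    """Recursive radix-2 decimation-in-time NTT of a (length a power of two)."""
    n = len(a)
    if n == 1:
        return a[:]
    w2 = mulmod(w, w)
    even, odd = ntt(a[0::2], w2), ntt(a[1::2], w2)
    out, t = [0] * n, 1
    for i in range(n // 2):
        v = mulmod(t, odd[i])             # twiddle factor times odd half
        out[i] = (even[i] + v) % Q        # butterfly: sum
        out[i + n // 2] = (even[i] - v) % Q  # butterfly: difference
        t = mulmod(t, w)
    return out


def poly_mul(a, b):
    """Multiply two coefficient lists mod Q via NTT, pointwise product, inverse NTT."""
    size = len(a) + len(b) - 1
    n = 2
    while n < size:
        n *= 2
    pa = [x % Q for x in a] + [0] * (n - len(a))
    pb = [x % Q for x in b] + [0] * (n - len(b))
    w = find_root(n, Q)
    fc = [mulmod(x, y) for x, y in zip(ntt(pa, w), ntt(pb, w))]  # Hadamard product
    c = ntt(fc, pow(w, Q - 2, Q))      # inverse transform uses w^-1
    n_inv = pow(n, Q - 2, Q)           # and a final scaling by n^-1
    return [mulmod(x, n_inv) for x in c][:size]
```

Under these assumptions, `poly_mul([1, 2, 3, 4], [5, 6, 7, 8])` computes the product of 1 + 2x + 3x² + 4x³ and 5 + 6x + 7x² + 8x³ with coefficients reduced mod `Q`. The sketch only shows where modular reduction sits inside each butterfly; it says nothing about the paper's 2D mixed-radix layout or shared-memory optimizations.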
-
Learning Program Representations for Food Images and Cooking Recipes
Authors:
Dim P. Papadopoulos,
Enrique Mora,
Nadiia Chepurko,
Kuan Wei Huang,
Ferda Ofli,
Antonio Torralba
Abstract:
In this paper, we are interested in modeling a how-to instructional procedure, such as a cooking recipe, with a meaningful and rich high-level representation. Specifically, we propose to represent cooking recipes and food images as cooking programs. Programs provide a structured representation of the task, capturing cooking semantics and sequential relationships of actions in the form of a graph. This allows them to be easily manipulated by users and executed by agents. To this end, we build a model that is trained to learn a joint embedding between recipes and food images via self-supervision and jointly generate a program from this embedding as a sequence. To validate our idea, we crowdsource programs for cooking recipes and show that: (a) projecting the image-recipe embeddings into programs leads to better cross-modal retrieval results; (b) generating programs from images leads to better recognition results compared to predicting raw cooking instructions; and (c) we can generate food images by manipulating programs via optimizing the latent code of a GAN. Code, data, and models are available online.
Submitted 30 March, 2022;
originally announced March 2022.
-
Fractal dimension analysis for automatic morphological galaxy classification
Authors:
Jorge de la Calleja,
Elsa M. de la Calleja,
Hugo Jair Escalante
Abstract:
In this report we present experimental results using the \emph{Hausdorff-Besicovitch} fractal dimension for performing morphological galaxy classification. The fractal dimension is a topological, structural and spatial property that gives us information about the space where an object lives. We have calculated the fractal dimension value of the main types of galaxies: ellipticals, spirals and irregulars; and we use it as a feature for classifying them. Also, we have performed an image analysis process in order to standardize the galaxy images, and we have used principal component analysis to obtain the main attributes in the images. Galaxy classification was performed using machine learning algorithms: C4.5, k-nearest neighbors, random forest and support vector machines. Preliminary experimental results using 10-fold cross-validation show that fractal dimension helps to improve classification, with over 88 per cent accuracy for elliptical galaxies, 100 per cent accuracy for spiral galaxies and over 40 per cent for irregular galaxies.
Submitted 22 June, 2017;
originally announced June 2017.
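The abstract above uses a fractal dimension as an image feature. As a rough illustration of how such a number can be estimated from a binary image, here is a sketch of the box-counting estimator, a common numerical stand-in for the Hausdorff-Besicovitch dimension. Whether the authors used this estimator is not stated in the abstract; the function names and the Sierpinski-carpet test shape are assumptions made for the example.

```python
import math


def box_count(pixels, size):
    """Number of size-by-size grid boxes containing at least one set pixel."""
    return len({(x // size, y // size) for x, y in pixels})


def box_counting_dimension(pixels, sizes):
    """Least-squares slope of log N(s) against log(1/s) over the given box sizes."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def sierpinski_carpet(level):
    """Set of pixel coordinates of a Sierpinski carpet on a 3^level grid."""
    side = 3 ** level

    def filled(x, y):
        # A pixel is removed if x and y share a base-3 digit position equal to 1.
        while x or y:
            if x % 3 == 1 and y % 3 == 1:
                return False
            x //= 3
            y //= 3
        return True

    return {(x, y) for x in range(side) for y in range(side) if filled(x, y)}
```

On a level-4 carpet with box sizes `[1, 3, 9, 27]`, the estimate matches the known value log 8 / log 3 (about 1.893), while a filled square gives 2. For a real galaxy image, the pixel set would come from thresholding the standardized image, and the estimate would depend on the threshold and on the range of box sizes chosen.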
-
Bragg-Williams approximation for the dynamics of prey-predator biological associations
Authors:
E. M. De la Calleja,
J. L. Carrillo,
I. Santamaría-Holek
Abstract:
The dynamics of an association of interactive biological species is studied theoretically. We explore a mean field approximation to describe the temporal evolution of an ecological system with the basic prey-predator interspecies relation, as well as an approximation to introduce time correlations in the dynamics. We start by discussing the solution of the Volterra-Lotka model in a mean field approximation based on an analogy with the Weiss solution to the Ising model for ferromagnetic materials. In order to explore the effects of long-range time correlations, we describe the time evolution of the system within a kind of Bragg-Williams approximation. This approach allows us to evaluate a characteristic lifetime of the ecosystem. This quantity could be very useful for discussing the time evolution of the system under a wide diversity of environmental conditions, which is not usually considered. We discuss the general trends of the temporal evolution of the association with some data from real ecosystems.
Submitted 28 April, 2016; v1 submitted 11 March, 2016;
originally announced March 2016.
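As background for the abstract above, the sketch below integrates the classical two-species Lotka-Volterra equations, dx/dt = αx − βxy and dy/dt = δxy − γy, with a fourth-order Runge-Kutta step. This is the standard textbook model only; it does not implement the mean-field or Bragg-Williams treatment the paper describes. The parameter values and function names are assumptions picked for illustration.

```python
import math


def lotka_volterra(x, y, alpha, beta, delta, gamma):
    """Right-hand side: prey x grows and is eaten, predator y feeds and dies."""
    return alpha * x - beta * x * y, delta * x * y - gamma * y


def rk4_step(x, y, dt, params):
    """Advance (x, y) by one classical fourth-order Runge-Kutta step."""
    k1x, k1y = lotka_volterra(x, y, *params)
    k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, *params)
    k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, *params)
    k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y, *params)
    x_new = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
    return x_new, y_new


def conserved(x, y, alpha, beta, delta, gamma):
    """First integral of the classical model; constant along exact trajectories."""
    return delta * x - gamma * math.log(x) + beta * y - alpha * math.log(y)


def simulate(x0, y0, params, dt=0.001, steps=10000):
    """Return the list of (prey, predator) states visited from (x0, y0)."""
    traj = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt, params)
        traj.append((x, y))
    return traj
```

A call such as `simulate(10.0, 5.0, (1.0, 0.1, 0.075, 1.5))` traces a closed orbit around the coexistence point (γ/δ, α/β) = (20, 10). Checking that `conserved` stays nearly constant along the trajectory is a simple way to validate the integrator before adding any correlation terms.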
-
Order-Fractal transition in abstract paintings
Authors:
E. M. De la Calleja,
F. Cervantes,
J. De la Calleja
Abstract:
We report the degree of order of twenty-two of Jackson Pollock's paintings using the \emph{Hausdorff-Besicovitch fractal dimension}. Through the maximum value of each multi-fractal spectrum, the artworks are classified by the year in which they were painted. It has been reported that Pollock's paintings are fractal and that this fractality increased in his latest works. However, our results show that the fractal dimensions of the paintings lie in a range with values close to two. We identify this behavior as a fractal-order transition. Based on the study of disorder-order transitions in physical systems, we interpret the fractal-order transition through the dark paint strokes in Pollock's paintings, as structured lines following a power law measured by fractal dimension. We obtain self-similarity in some specific Pollock paintings, which reveals an important dependence on the scale of observation. We also characterize, by its fractal spectrum, the painting called \emph{Teri's Find}. We obtained similar spectra for \emph{Teri's Find} and Pollock's \emph{Number 5}, suggesting that fractal dimension cannot be completely rejected as a quantitative parameter to authenticate this kind of artwork.
Submitted 12 April, 2016; v1 submitted 22 October, 2015;
originally announced October 2015.