-
Hacking, The Lazy Way: LLM Augmented Pentesting
Authors:
Dhruva Goyal,
Sitaraman Subramanian,
Aditya Peela,
Nisha P. Shetty
Abstract:
In our research, we introduce a new concept called "LLM Augmented Pentesting" demonstrated with a tool named "Pentest Copilot," that revolutionizes the field of ethical hacking by integrating Large Language Models (LLMs) into penetration testing workflows, leveraging the advanced GPT-4-turbo model. Our approach focuses on overcoming the traditional resistance to automation in penetration testing by employing LLMs to automate specific sub-tasks while ensuring a comprehensive understanding of the overall testing process.
Pentest Copilot showcases remarkable proficiency in tasks such as utilizing testing tools, interpreting outputs, and suggesting follow-up actions, efficiently bridging the gap between automated systems and human expertise. By integrating a "chain of thought" mechanism, Pentest Copilot optimizes token usage and enhances decision-making processes, leading to more accurate and context-aware outputs. Additionally, our implementation of Retrieval-Augmented Generation (RAG) minimizes hallucinations and ensures the tool remains aligned with the latest cybersecurity techniques and knowledge. We also highlight a unique infrastructure system that supports in-browser penetration testing, providing a robust platform for cybersecurity professionals. Our findings demonstrate that LLM Augmented Pentesting can not only significantly enhance task completion rates in penetration testing but also effectively address real-world challenges, marking a substantial advancement in the cybersecurity domain.
Submitted 19 May, 2025; v1 submitted 14 September, 2024;
originally announced September 2024.
-
GANs Settle Scores!
Authors:
Siddarth Asokan,
Nishanth Shetty,
Aadithya Srikanth,
Chandra Sekhar Seelamantula
Abstract:
Generative adversarial networks (GANs) comprise a generator, trained to learn the underlying distribution of the desired data, and a discriminator, trained to distinguish real samples from those output by the generator. A majority of GAN literature focuses on understanding the optimality of the discriminator through integral probability metric (IPM) or divergence-based analysis. In this paper, we propose a unified approach to analyzing the generator optimization through a variational approach. In $f$-divergence-minimizing GANs, we show that the optimal generator is the one that matches the score of its output distribution with that of the data distribution, while in IPM GANs, we show that this optimal generator matches score-like functions, involving the flow-field of the kernel associated with a chosen IPM constraint space. Further, the IPM-GAN optimization can be seen as one of smoothed score-matching, where the scores of the data and the generator distributions are convolved with the kernel associated with the constraint. The proposed approach serves to unify score-based training and existing GAN flavors, leveraging results from normalizing flows, while also providing explanations for empirical phenomena such as the stability of non-saturating GAN losses. Based on these results, we propose novel alternatives to $f$-GAN and IPM-GAN training based on score and flow matching, and discriminator-guided Langevin sampling.
Submitted 2 June, 2023;
originally announced June 2023.
-
Kernel Bi-Linear Modeling for Reconstructing Data on Manifolds: The Dynamic-MRI Case
Authors:
Gaurav N. Shetty,
Konstantinos Slavakis,
Ukash Nakarmi,
Gesualdo Scutari,
Leslie Ying
Abstract:
This paper establishes a kernel-based framework for reconstructing data on manifolds, tailored to fit the dynamic-(d)MRI-data recovery problem. The proposed methodology exploits simple tangent-space geometries of manifolds in reproducing kernel Hilbert spaces and follows classical kernel-approximation arguments to form the data-recovery task as a bi-linear inverse problem. Departing from mainstream approaches, the proposed methodology uses no training data, employs no graph Laplacian matrix to penalize the optimization task, uses no costly (kernel) pre-imaging step to map feature points back to the input space, and utilizes complex-valued kernel functions to account for k-space data. The framework is validated on synthetically generated dMRI data, where comparisons against state-of-the-art schemes highlight the rich potential of the proposed approach in data-recovery problems.
Submitted 26 February, 2020;
originally announced February 2020.
-
Augmenting Variational Autoencoders with Sparse Labels: A Unified Framework for Unsupervised, Semi-(un)supervised, and Supervised Learning
Authors:
Felix Berkhahn,
Richard Keys,
Wajih Ouertani,
Nikhil Shetty,
Dominik Geißler
Abstract:
We present a new flavor of Variational Autoencoder (VAE) that interpolates seamlessly between unsupervised, semi-supervised and fully supervised learning domains. We show that unlabeled datapoints not only boost unsupervised tasks, but also the classification performance. Vice versa, every label not only improves classification, but also unsupervised tasks. The proposed architecture is simple: A classification layer is connected to the topmost encoder layer, and then combined with the resampled latent layer for the decoder. The usual evidence lower bound (ELBO) loss is supplemented with a supervised loss target on this classification layer that is only applied for labeled datapoints. This simplicity allows for extending any existing VAE model to our proposed semi-supervised framework with minimal effort. In the context of classification, we found that this approach even outperforms a direct supervised setup.
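The loss described in this abstract (the usual ELBO loss plus a supervised term applied only to labeled datapoints) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `semi_supervised_loss`, the `weight` parameter, the use of cross-entropy as the supervised term, and the averaging over the batch are all assumptions, and the per-datapoint ELBO losses and class probabilities are taken as already computed by some VAE.

```python
import math

def semi_supervised_loss(elbo_losses, class_probs, labels, weight=1.0):
    """Mean ELBO loss plus a cross-entropy term on labeled datapoints only.

    elbo_losses: per-datapoint ELBO losses (assumed precomputed).
    class_probs: per-datapoint class probabilities from the classification layer.
    labels:      class index per datapoint, or None when the datapoint is unlabeled.
    weight:      hypothetical weighting of the supervised term (an assumption).
    """
    total = 0.0
    for elbo, probs, label in zip(elbo_losses, class_probs, labels):
        total += elbo  # every datapoint contributes its ELBO loss
        if label is not None:
            # supervised term is skipped entirely for unlabeled datapoints
            total += weight * -math.log(probs[label])
    return total / len(elbo_losses)

# Toy batch, invented for illustration: one labeled and one unlabeled datapoint.
loss = semi_supervised_loss(
    elbo_losses=[1.0, 2.0],
    class_probs=[[0.5, 0.5], [0.9, 0.1]],
    labels=[0, None],
)
```

With every label set to `None`, the function reduces to the plain mean ELBO loss, which corresponds to the unsupervised end of the interpolation the abstract describes.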
Submitted 14 November, 2019; v1 submitted 8 August, 2019;
originally announced August 2019.
-
Bi-Linear Modeling of Data Manifolds for Dynamic-MRI Recovery
Authors:
Gaurav N. Shetty,
Konstantinos Slavakis,
Abhishek Bose,
Ukash Nakarmi,
Gesualdo Scutari,
Leslie Ying
Abstract:
This paper puts forth a novel bi-linear modeling framework for data recovery via manifold-learning and sparse-approximation arguments and considers its application to dynamic magnetic-resonance imaging (dMRI). Each temporal-domain MR image is viewed as a point that lies on or close to a smooth manifold, and landmark points are identified to describe the point cloud concisely. To facilitate computations, a dimensionality reduction module generates low-dimensional/compressed renditions of the landmark points. Recovery of the high-fidelity MRI data is realized by solving a non-convex minimization task for the linear decompression operator and those affine combinations of landmark points which locally approximate the latent manifold geometry. An algorithm with guaranteed convergence to stationary solutions of the non-convex minimization task is also provided. The aforementioned framework exploits the underlying spatio-temporal patterns and geometry of the acquired data without any prior training on external data or information. Extensive numerical results on simulated as well as real cardiac-cine and perfusion MRI data illustrate noteworthy improvements of the advocated machine-learning framework over state-of-the-art reconstruction techniques.
Submitted 11 June, 2019; v1 submitted 26 December, 2018;
originally announced December 2018.
-
On the Shoulders of Giants: The Growing Impact of Older Articles
Authors:
Alex Verstak,
Anurag Acharya,
Helder Suzuki,
Sean Henderson,
Mikhail Iakhiaev,
Cliff Chiung Yu Lin,
Namit Shetty
Abstract:
In this paper, we examine the evolution of the impact of older scholarly articles. We attempt to answer four questions. First, how often are older articles cited and how has this changed over time? Second, how does the impact of older articles vary across different research fields? Third, is the change in the impact of older articles accelerating or slowing down? Fourth, are these trends different for much older articles?
To answer these questions, we studied citations from articles published in 1990-2013. We computed the fraction of citations to older articles from articles published each year as the measure of impact. We considered articles that were published at least 10 years before the citing article as older articles. We computed these numbers for 261 subject categories and 9 broad areas of research. Finally, we repeated the computation for two other definitions of older articles, 15 years and older and 20 years and older.
There are three conclusions from our study. First, the impact of older articles has grown substantially over 1990-2013. In 2013, 36% of citations were to articles that are at least 10 years old; this fraction has grown 28% since 1990. The fraction of older citations increased over 1990-2013 for 7 out of 9 broad areas and 231 out of 261 subject categories. Second, the increase over the second half (2002-2013) was double the increase in the first half (1990-2001).
Third, the trend of a growing impact of older articles also holds for even older articles. In 2013, 21% of citations were to articles >= 15 years old with an increase of 30% since 1990 and 13% of citations were to articles >= 20 years old with an increase of 36%.
Now that finding and reading relevant older articles is about as easy as finding and reading recently published articles, significant advances aren't getting lost on the shelves and are influencing work worldwide for years after.
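The impact measure described in this abstract (the fraction of citations, per citing year, that go to articles at least 10, 15, or 20 years older than the citing article) can be sketched as below. This is an illustrative reading of the abstract only: the function name `older_citation_fraction`, the `(citing_year, cited_year)` input format, and the toy data are assumptions, not the authors' pipeline or dataset.

```python
from collections import defaultdict

def older_citation_fraction(citations, min_age=10):
    """Map each citing year to the fraction of its citations to older articles.

    citations: iterable of (citing_year, cited_year) pairs (hypothetical format).
    min_age:   a cited article counts as "older" when it was published at least
               this many years before the citing article.
    """
    totals = defaultdict(int)
    older = defaultdict(int)
    for citing_year, cited_year in citations:
        totals[citing_year] += 1
        if citing_year - cited_year >= min_age:
            older[citing_year] += 1
    return {year: older[year] / totals[year] for year in totals}

# Toy citation pairs, invented for illustration only.
toy = [
    (2013, 2012), (2013, 2003), (2013, 1990), (2013, 2010),
    (1990, 1985), (1990, 1975),
]
fractions_10 = older_citation_fraction(toy)              # 10 years and older
fractions_15 = older_citation_fraction(toy, min_age=15)  # 15 years and older
fractions_20 = older_citation_fraction(toy, min_age=20)  # 20 years and older
```

Repeating the call with `min_age=15` and `min_age=20` mirrors the two alternative definitions of older articles mentioned in the abstract.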
Submitted 2 November, 2014;
originally announced November 2014.
-
Rise of the Rest: The Growing Impact of Non-Elite Journals
Authors:
Anurag Acharya,
Alex Verstak,
Helder Suzuki,
Sean Henderson,
Mikhail Iakhiaev,
Cliff Chiung Yu Lin,
Namit Shetty
Abstract:
In this paper, we examine the evolution of the impact of non-elite journals. We attempt to answer two questions. First, what fraction of the top-cited articles are published in non-elite journals and how has this changed over time? Second, what fraction of the total citations are to non-elite journals and how has this changed over time?
We studied citations to articles published in 1995-2013. We computed the 10 most-cited journals and the 1000 most-cited articles each year for all 261 subject categories in Scholar Metrics. We marked the 10 most-cited journals in a category as the elite journals for the category and the rest as non-elite.
There are two conclusions from our study. First, the fraction of top-cited articles published in non-elite journals increased steadily over 1995-2013. While the elite journals still publish a substantial fraction of high-impact articles, many more authors of well-regarded papers in diverse research fields are choosing other venues.
The number of top-1000 papers published in non-elite journals for the representative subject category went from 149 in 1995 to 245 in 2013, a growth of 64%. Looking at broad research areas, 4 out of 9 areas saw at least one-third of the top-cited articles published in non-elite journals in 2013. For 6 out of 9 areas, the fraction of top-cited papers published in non-elite journals for the representative subject category grew by 45% or more.
Second, now that finding and reading relevant articles in non-elite journals is about as easy as finding and reading articles in elite journals, researchers are increasingly building on and citing work published everywhere. Considering citations to all articles, the percentage of citations to articles in non-elite journals went from 27% in 1995 to 47% in 2013. Six out of nine broad areas had at least 50% of citations going to articles published in non-elite journals in 2013.
Submitted 8 October, 2014;
originally announced October 2014.
-
Agent Based Intelligent Alert System for Smart-Phones
Authors:
Sandeep Venkatesh,
Shreyas Balakuntala,
Rajarajeswari S,
Nytika N Shetty,
Namratha Shetty,
Neha Sudhakar
Abstract:
The paper deals with the design of an agent that modifies and enhances the various alert systems in smartphones. The actions of the agent include sorting notifications in line with human thinking, helping the user to have a safe conversation, assisting in tracing the reachability status of the caller when needed, informing the user about notifications in situations such as a drained battery, and smartly alerting the user in situations such as sleeping. The agent uses information gathered from a survey to modify existing alert methods and produce alerts that abide by human cognitive responses.
Submitted 25 May, 2013;
originally announced May 2013.