Search | arXiv e-print repository

Quark: Real-time, High-resolution, and General Neural View Synthesis

Authors: John Flynn, Michael Broxton, Lukas Murmann, Lucy Chai, Matthew DuVall, Clément Godard, Kathryn Heal, Srinivas Kaza, Stephen Lombardi, Xuan Luo, Supreeth Achar, Kira Prabhu, Tiancheng Sun, Lynn Tsai, Ryan Overbeck

Abstract: We present a novel neural algorithm for performing high-quality, high-resolution, real-time novel view synthesis. From a sparse set of input RGB images or video streams, our network both reconstructs the 3D scene and renders novel views at 1080p resolution at 30fps on an NVIDIA A100. Our feed-forward network generalizes across a wide variety of datasets and scenes and produces state-of-the-art quality for a real-time method. Our quality approaches, and in some cases surpasses, the quality of some of the top offline methods. In order to achieve these results we use a novel combination of several key concepts, and tie them together into a cohesive and effective algorithm. We build on previous works that represent the scene using semi-transparent layers and use an iterative learned render-and-refine approach to improve those layers. Instead of flat layers, our method reconstructs layered depth maps (LDMs) that efficiently represent scenes with complex depth and occlusions. The iterative update steps are embedded in a multi-scale, UNet-style architecture to perform as much compute as possible at reduced resolution. Within each update step, to better aggregate the information from multiple input views, we use a specialized Transformer-based network component. This allows the majority of the per-input image processing to be performed in the input image space, as opposed to layer space, further increasing efficiency. Finally, due to the real-time nature of our reconstruction and rendering, we dynamically create and discard the internal 3D geometry for each frame, generating the LDM for each view. Taken together, this produces a novel and effective algorithm for view synthesis. Through extensive evaluation, we demonstrate that we achieve state-of-the-art quality at real-time rates. Project page: https://quark-3d.github.io/

Submitted 25 November, 2024; originally announced November 2024.

Comments: SIGGRAPH Asia 2024 camera ready version; project page https://quark-3d.github.io/

arXiv:2410.15605 [pdf, other]

Deep Active Learning with Manifold-preserving Trajectory Sampling

Authors: Yingrui Ji, Vijaya Sindhoori Kaza, Nishanth Artham, Tianyang Wang

Abstract: Active learning (AL) optimizes the selection of unlabeled data for annotation (labeling), aiming to enhance model performance while minimizing labeling effort. The key question in AL is which unlabeled data should be selected for annotation. Existing deep AL methods arguably suffer from bias incurred by labeled data, which makes up a much smaller percentage than unlabeled data in the AL context. We observe that this issue is severe across different types of data, such as vision and non-vision data. To address this issue, we propose a novel method, namely Manifold-Preserving Trajectory Sampling (MPTS), aiming to enforce the feature space learned from labeled data to represent a more accurate manifold. By doing so, we expect to effectively correct the bias incurred by labeled data, which can cause a biased selection of unlabeled data. Despite its focus on the manifold, the proposed method can be conveniently implemented by performing distribution mapping with MMD (Maximum Mean Discrepancies). Extensive experiments on various vision and non-vision benchmark datasets demonstrate the superiority of our method. Our source code can be found here.

Submitted 20 October, 2024; originally announced October 2024.
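
The MMD mentioned in the abstract above compares two sample sets through a kernel. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian (RBF) kernel with a hand-picked bandwidth and uses the biased squared-MMD estimate. The names `rbf_kernel`, `mmd_squared`, and the toy feature vectors are hypothetical and chosen only for illustration.

```python
import math


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors.

    The kernel choice and bandwidth are assumptions, not taken from the paper.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared Maximum Mean Discrepancy between xs and ys."""

    def mean_kernel(a, b):
        # Average kernel value over all pairs drawn from the two sample sets.
        total = sum(rbf_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy feature vectors (hypothetical): a small labeled set and two unlabeled sets.
labeled = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
unlabeled_near = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]
unlabeled_far = [[3.0, 3.1], [2.9, 3.0], [3.1, 2.8]]

print("near:", mmd_squared(labeled, unlabeled_near))
print("far: ", mmd_squared(labeled, unlabeled_far))
```

A small value indicates that the two sample sets have similar distributions under the chosen kernel, which is the sense in which MMD can serve as a distribution-matching objective between labeled and unlabeled features.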

arXiv:2303.13508 [pdf, other]

DreamBooth3D: Subject-Driven Text-to-3D Generation

Authors: Amit Raj, Srinivas Kaza, Ben Poole, Michael Niemeyer, Nataniel Ruiz, Ben Mildenhall, Shiran Zada, Kfir Aberman, Michael Rubinstein, Jonathan Barron, Yuanzhen Li, Varun Jampani

Abstract: We present DreamBooth3D, an approach to personalize text-to-3D generative models from as few as 3-6 casually captured images of a subject. Our approach combines recent advances in personalizing text-to-image models (DreamBooth) with text-to-3D generation (DreamFusion). We find that naively combining these methods fails to yield satisfactory subject-specific 3D assets due to personalized text-to-image models overfitting to the input viewpoints of the subject. We overcome this through a 3-stage optimization strategy where we jointly leverage the 3D consistency of neural radiance fields together with the personalization capability of text-to-image models. Our method can produce high-quality, subject-specific 3D assets with text-driven modifications such as novel poses, colors and attributes that are not seen in any of the input images of the subject.

Submitted 27 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: Project page at https://dreambooth3d.github.io/; video summary at https://youtu.be/kKVDrbfvOoA

arXiv:2107.03248 [pdf, other]

DER Forecast using Privacy Preserving Federated Learning

Authors: Venkatesh Venkataramanan, Sridevi Kaza, Anuradha M. Annaswamy

Abstract: With increasing penetration of Distributed Energy Resources (DERs) at the grid edge, including renewable generation, flexible loads, and storage, accurate prediction of distributed generation and consumption at the consumer level becomes important. However, DER prediction based on the transmission of customer-level data, either repeatedly or in large amounts, is not feasible due to privacy concerns. In this paper, a distributed machine learning approach, Federated Learning (FL), is proposed to carry out DER forecasting using a network of IoT nodes, each of which transmits a model of the consumption and generation patterns without revealing consumer data. We consider a simulation study which includes 1000 DERs, and show that our method preserves consumer privacy while still leading to an accurate forecast. We also evaluate grid-specific performance metrics such as load swings and load curtailment and show that our FL algorithm leads to satisfactory performance. Simulations are also performed on the Pecan Street dataset to demonstrate the validity of the proposed approach on real data.

Submitted 7 July, 2021; originally announced July 2021.
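
The abstract above describes nodes that transmit a model instead of raw consumer data. One common way to combine such models is a size-weighted average of their parameters (FedAvg-style aggregation). The sketch below assumes that scheme; the abstract does not state which aggregation rule the paper uses, and `federated_average` and the toy numbers are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Combine per-client parameter vectors into one global vector.

    Each client's vector is weighted by its number of local samples.
    This FedAvg-style rule is an assumption, not taken from the paper.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical IoT nodes, each sending only model parameters and a sample count.
node_models = [[1.0, 2.0], [3.0, 4.0]]
node_sample_counts = [1, 3]

global_model = federated_average(node_models, node_sample_counts)
print(global_model)
```

Only the parameter vectors and sample counts leave each node in this sketch; the underlying consumption and generation records stay local.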

arXiv:1309.7266 [pdf]

Evaluating Link-Based Techniques for Detecting Fake Pharmacy Websites

Authors: Ahmed Abbasi, Siddharth Kaza, F. Mariam Zahedi

Abstract: Fake online pharmacies have become increasingly pervasive, constituting over 90% of online pharmacy websites. There is a need for fake website detection techniques capable of identifying fake online pharmacy websites with a high degree of accuracy. In this study, we compared several well-known link-based detection techniques on a large-scale test bed with the hyperlink graph encompassing over 80 million links between 15.5 million web pages, including 1.2 million known legitimate and fake pharmacy pages. We found that the QoC and QoL class propagation algorithms achieved an accuracy of over 90% on our dataset. The results revealed that algorithms that incorporate dual class propagation as well as inlink and outlink information, on page-level or site-level graphs, are better suited for detecting fake pharmacy websites. In addition, site-level analysis yielded significantly better results than page-level analysis for most algorithms evaluated.

Submitted 27 September, 2013; originally announced September 2013.

Comments: Abbasi, A., Kaza, S., and Zahedi, F. M. "Evaluating Link-Based Techniques for Detecting Fake Pharmacy Websites," In Proceedings of the 19th Annual Workshop on Information Technologies and Systems, Phoenix, Arizona, December 14-15, 2009

Showing 1–5 of 5 results for author: Kaza, S