-
Textured Gaussians for Enhanced 3D Scene Appearance Modeling
Authors:
Brian Chao,
Hung-Yu Tseng,
Lorenzo Porzi,
Chen Gao,
Tuotuo Li,
Qinbo Li,
Ayush Saraf,
Jia-Bin Huang,
Johannes Kopf,
Gordon Wetzstein,
Changil Kim
Abstract:
3D Gaussian Splatting (3DGS) has recently emerged as a state-of-the-art 3D reconstruction and rendering technique due to its high-quality results and fast training and rendering time. However, pixels covered by the same Gaussian are always shaded in the same color up to a Gaussian falloff scaling factor. Furthermore, the finest geometric detail any individual Gaussian can represent is a simple ellipsoid. These properties of 3DGS greatly limit the expressivity of individual Gaussian primitives. To address these issues, we draw inspiration from texture and alpha mapping in traditional graphics and integrate it with 3DGS. Specifically, we propose a new generalized Gaussian appearance representation that augments each Gaussian with alpha (A), RGB, or RGBA texture maps to model spatially varying color and opacity across the extent of each Gaussian. As such, each Gaussian can represent a richer set of texture patterns and geometric structures, instead of just a single color and ellipsoid as in naive Gaussian Splatting. Surprisingly, we found that the expressivity of Gaussians can be greatly improved by using alpha-only texture maps, and further augmenting Gaussians with RGB texture maps achieves the highest expressivity. We validate our method on a wide variety of standard benchmark datasets and our own custom captures at both the object and scene levels. We demonstrate image quality improvements over existing methods while using a similar or lower number of Gaussians.
Submitted 28 May, 2025; v1 submitted 27 November, 2024;
originally announced November 2024.
-
Fair Interest Rates Are Impossible for Lending Pools: Results from Options Pricing
Authors:
Joe Halpern,
Rafael Pass,
Aditya Saraf
Abstract:
Cryptocurrency lending pools are services that allow lenders to pool together assets in one cryptocurrency and loan it out to borrowers who provide collateral worth more (than the loan) in a separate cryptocurrency. Borrowers can repay their loans to reclaim their collateral unless their loan was liquidated, which happens when the value of the collateral dips significantly. Interest rates for these pools are currently set via supply and demand heuristics, which have several downsides, including inefficiency, inflexibility, and being vulnerable to manipulation. Here, we reduce lending pools to options, and then use ideas from options pricing to search for fair interest rates for lending pools. In a simplified model where the loans have a fixed duration and can only be repaid at the end of the term, we obtain analytical pricing results. We then consider a more realistic model, where loans can be repaid dynamically and without expiry. Our main theoretical contribution is to show that fair interest rates do not exist in this setting. We then show that impossibility results generalize even to models of lending pools which have no obvious reduction to options. To address these negative results, we introduce a model of lending pools with fixed fees, and model the ability of borrowers to top-up their loans to reduce the risk of liquidation. As a proof of concept, we use simulations to show how our model's predicted interest rates compare to interest rates in practice.
Submitted 29 October, 2024; v1 submitted 14 October, 2024;
originally announced October 2024.
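The fixed-term model in the abstract above can be illustrated with a small payoff calculation. This is only a sketch of one natural reading, not the paper's actual reduction: it assumes the borrower reclaims the collateral at the end of the term only when it is worth more than the repayment, which gives the payoff shape of a European call option on the collateral. The function name `borrower_payoff` and its parameters are hypothetical.

```python
def borrower_payoff(collateral_value: float, loan: float, rate: float) -> float:
    """Borrower's payoff at the end of a fixed-term loan (illustrative assumption).

    The borrower repays loan * (1 + rate) to reclaim the collateral only if the
    collateral is worth more than that repayment; otherwise the borrower walks
    away. This is max(S - K, 0), the payoff of a call option with strike K.
    """
    repayment = loan * (1.0 + rate)  # plays the role of the strike price K
    return max(collateral_value - repayment, 0.0)


if __name__ == "__main__":
    # Collateral ends above the repayment: the borrower repays and keeps the surplus.
    print(borrower_payoff(150.0, 100.0, 0.25))
    # Collateral ends below the repayment: the borrower defaults and the payoff is zero.
    print(borrower_payoff(100.0, 100.0, 0.25))
```

Under this reading, choosing a fair interest rate amounts to choosing the strike of the option, which is why options-pricing tools become applicable.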
-
Doppelgänger's Watch: A Split Objective Approach to Large Language Models
Authors:
Shervin Ghasemlou,
Ashish Katiyar,
Aparajita Saraf,
Seungwhan Moon,
Mangesh Pujari,
Pinar Donmez,
Babak Damavandi,
Anuj Kumar
Abstract:
In this paper, we investigate the problem of "generation supervision" in large language models, and present a novel bicameral architecture to separate supervision signals from their core capability, helpfulness. Doppelgänger, a new module parallel to the underlying language model, supervises the generation of each token, and learns to concurrently predict the supervision score(s) of the sequences up to and including each token. In this work, we present the theoretical findings, and leave the report on experimental results to a forthcoming publication.
Submitted 9 September, 2024;
originally announced September 2024.
-
The Llama 3 Herd of Models
Authors:
Aaron Grattafiori,
Abhimanyu Dubey,
Abhinav Jauhri,
Abhinav Pandey,
Abhishek Kadian,
Ahmad Al-Dahle,
Aiesha Letman,
Akhil Mathur,
Alan Schelten,
Alex Vaughan,
Amy Yang,
Angela Fan,
Anirudh Goyal,
Anthony Hartshorn,
Aobo Yang,
Archi Mitra,
Archie Sravankumar,
Artem Korenev,
Arthur Hinsvark,
Arun Rao,
Aston Zhang,
Aurelien Rodriguez,
Austen Gregerson,
Ava Spataru,
Baptiste Roziere
, et al. (536 additional authors not shown)
Abstract:
Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical evaluation of Llama 3. We find that Llama 3 delivers comparable quality to leading language models such as GPT-4 on a plethora of tasks. We publicly release Llama 3, including pre-trained and post-trained versions of the 405B parameter language model and our Llama Guard 3 model for input and output safety. The paper also presents the results of experiments in which we integrate image, video, and speech capabilities into Llama 3 via a compositional approach. We observe this approach performs competitively with the state-of-the-art on image, video, and speech recognition tasks. The resulting models are not yet being broadly released as they are still under development.
Submitted 23 November, 2024; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Stability evaluation of approximate Riemann solvers using the direct Lyapunov method
Authors:
Aishwarjya Gogoi,
Jadav Chandra Mandal,
Amitabh Saraf
Abstract:
The paper presents a new approach to evaluating the stability of approximate Riemann solvers based on the direct Lyapunov method. The methodology offers a detailed understanding of the origins of numerical shock instability in approximate Riemann solvers. The pressure perturbation feeding the density and transverse momentum perturbations is identified as the cause of the numerical shock instabilities in the complete approximate Riemann solvers, while the magnitude of the numerical shock instabilities is found to be proportional to the magnitude of the pressure perturbations. A shock-stable HLLEM scheme is proposed based on the insights obtained from this analysis. A set of numerical test cases is solved to show that the proposed scheme is free from the numerical shock instability problems of the original HLLEM scheme at high Mach numbers.
Submitted 26 March, 2024;
originally announced March 2024.
-
Chunking Tasks for Present-Biased Agents
Authors:
Joe Halpern,
Aditya Saraf
Abstract:
Everyone puts things off sometimes. How can we combat this tendency to procrastinate? A well-known technique used by instructors is to break up a large project into more manageable chunks. But how should this be done best? Here we study the process of chunking using the graph-theoretic model of present bias introduced by Kleinberg and Oren (2014). We first analyze how to optimally chunk single edges within a task graph, given a limited number of chunks. We show that for edges on the shortest path, the optimal chunking makes initial chunks easy and later chunks progressively harder. For edges not on the shortest path, optimal chunking is significantly more complex, but we provide an efficient algorithm that chunks the edge optimally. We then use our optimal edge-chunking algorithm to optimally chunk task graphs. We show that with a linear number of chunks on each edge, the biased agent's cost can be exponentially lowered, to within a constant factor of the true cheapest path. Finally, we extend our model to the case where a task designer must chunk a graph for multiple types of agents simultaneously. The problem grows significantly more complex with even two types of agents, but we provide optimal graph chunking algorithms for two types. Our work highlights the efficacy of chunking as a means to combat present bias.
Submitted 22 September, 2023;
originally announced September 2023.
-
OmnimatteRF: Robust Omnimatte with 3D Background Modeling
Authors:
Geng Lin,
Chen Gao,
Jia-Bin Huang,
Changil Kim,
Yipeng Wang,
Matthias Zwicker,
Ayush Saraf
Abstract:
Video matting has broad applications, from adding interesting effects to casually captured movies to assisting video production professionals. Matting with associated effects such as shadows and reflections has also attracted increasing research activity, and methods like Omnimatte have been proposed to separate dynamic foreground objects of interest into their own layers. However, prior works represent video backgrounds as 2D image layers, limiting their capacity to express more complicated scenes, thus hindering application to real-world videos. In this paper, we propose a novel video matting method, OmnimatteRF, that combines dynamic 2D foreground layers and a 3D background model. The 2D layers preserve the details of the subjects, while the 3D background robustly reconstructs scenes in real-world videos. Extensive experiments demonstrate that our method reconstructs scenes with better quality on various videos.
Submitted 14 September, 2023;
originally announced September 2023.
-
Transforming Breast Cancer Diagnosis: Towards Real-Time Ultrasound to Mammogram Conversion for Cost-Effective Diagnosis
Authors:
Sahar Almahfouz Nasser,
Ashutosh Sharma,
Anmol Saraf,
Amruta Mahendra Parulekar,
Purvi Haria,
Amit Sethi
Abstract:
Ultrasound (US) imaging is better suited for intraoperative settings because it is real-time and more portable than other imaging techniques, such as mammography. However, US images are characterized by lower spatial resolution and noise-like artifacts. This research aims to address these limitations by providing surgeons with mammogram-like image quality in real-time from noisy US images. Unlike previous approaches for improving US image quality, which aim to reduce artifacts by treating them as speckle noise, we recognize their value as informative wave interference patterns (WIP). To achieve this, we utilize the Stride software to numerically solve the forward model, generating ultrasound images from mammogram images by solving wave equations. Additionally, we leverage the power of domain adaptation to enhance the realism of the simulated ultrasound images. Then, we utilize generative adversarial networks (GANs) to tackle the inverse problem of generating mammogram-quality images from ultrasound images. The resultant images have considerably more discernible details than the original US images.
Submitted 10 August, 2023;
originally announced August 2023.
-
Robust Dynamic Radiance Fields
Authors:
Yu-Lun Liu,
Chen Gao,
Andreas Meuleman,
Hung-Yu Tseng,
Ayush Saraf,
Changil Kim,
Yung-Yu Chuang,
Johannes Kopf,
Jia-Bin Huang
Abstract:
Dynamic radiance field reconstruction methods aim to model the time-varying structure and appearance of a dynamic scene. Existing methods, however, assume that accurate camera poses can be reliably estimated by Structure from Motion (SfM) algorithms. These methods, thus, are unreliable as SfM algorithms often fail or produce erroneous poses on challenging videos with highly dynamic objects, poorly textured surfaces, and rotating camera motion. We address this robustness issue by jointly estimating the static and dynamic radiance fields along with the camera parameters (poses and focal length). We demonstrate the robustness of our approach via extensive quantitative and qualitative experiments. Our results show favorable performance over the state-of-the-art dynamic view synthesis methods.
Submitted 21 March, 2023; v1 submitted 5 January, 2023;
originally announced January 2023.
-
Prediction of topological phases in metastable ferromagnetic MPX$_3$ monolayers
Authors:
Natalya Sheremetyeva,
Ilyoun Na,
Anay Saraf,
Sinéad M. Griffin,
Geoffroy Hautier
Abstract:
Density functional theory calculations are carried out to study the electronic and topological properties of $M$P$X_3$ ($M$ = Mn, Fe, Co, Ni, and $X$ = S, Se) monolayers in the ferromagnetic (FM) metastable magnetic state. We find that FM MnPSe$_3$ monolayers host topological semimetal signatures that are gapped out when spin-orbit coupling (SOC) is included. These findings are supported by explicit calculations of the Berry curvature and the Chern number. The choice of the Hubbard-$U$ parameter to describe the $d$-electrons is thoroughly discussed, as well as the influence of using a hybrid-functional approach. The presence of band inversions and the associated topological features are found to be formalism-dependent. Nevertheless, routes to achieve the topological phase via the application of external biaxial strain are demonstrated. Within the hybrid-functional picture, topological band structures are recovered under a pressure of 15% (17 GPa). The present work provides a potential avenue for uncovering new topological phases in metastable ferromagnetic phases.
Submitted 7 December, 2022; v1 submitted 5 December, 2022;
originally announced December 2022.
-
IMU2CLIP: Multimodal Contrastive Learning for IMU Motion Sensors from Egocentric Videos and Text
Authors:
Seungwhan Moon,
Andrea Madotto,
Zhaojiang Lin,
Alireza Dirafzoon,
Aparajita Saraf,
Amy Bearman,
Babak Damavandi
Abstract:
We present IMU2CLIP, a novel pre-training approach to align Inertial Measurement Unit (IMU) motion sensor recordings with video and text, by projecting them into the joint representation space of Contrastive Language-Image Pre-training (CLIP). The proposed approach allows IMU2CLIP to translate human motions (as measured by IMU sensors) into their corresponding textual descriptions and videos -- while preserving the transitivity across these modalities.
We explore several new IMU-based applications that IMU2CLIP enables, such as motion-based media retrieval and natural language reasoning tasks with motion data. In addition, we show that IMU2CLIP can significantly improve the downstream performance when fine-tuned for each application (e.g. activity recognition), demonstrating the universal usage of IMU2CLIP as a new pre-trained resource. Our code will be made publicly available.
Submitted 25 October, 2022;
originally announced October 2022.
-
AMICO: Amodal Instance Composition
Authors:
Peiye Zhuang,
Jia-Bin Huang,
Ayush Saraf,
Xuejian Rong,
Changil Kim,
Denis Demandolx
Abstract:
Image composition aims to blend multiple objects to form a harmonized image. Existing approaches often assume precisely segmented and intact objects. Such assumptions, however, are hard to satisfy in unconstrained scenarios. We present Amodal Instance Composition for compositing imperfect -- potentially incomplete and/or coarsely segmented -- objects onto a target image. We first develop object shape prediction and content completion modules to synthesize the amodal contents. We then propose a neural composition model to blend the objects seamlessly. Our primary technical novelty lies in using separate foreground/background representations and blending mask prediction to alleviate segmentation errors. Our results show state-of-the-art performance on public COCOA and KINS benchmarks and attain favorable visual results across diverse scenes. We demonstrate various image composition applications such as object insertion and de-occlusion.
Submitted 11 October, 2022;
originally announced October 2022.
-
Optimal Local Bayesian Differential Privacy over Markov Chains
Authors:
Darshan Chakrabarti,
Jie Gao,
Aditya Saraf,
Grant Schoenebeck,
Fang-Yi Yu
Abstract:
In the literature of data privacy, differential privacy is the most popular model. An algorithm is differentially private if its outputs with and without any individual's data are indistinguishable. In this paper, we focus on data generated from a Markov chain and argue that Bayesian differential privacy (BDP) offers more meaningful guarantees in this context. Our main theoretical contribution is providing a mechanism for achieving BDP when data is drawn from a binary Markov chain. We improve on the state-of-the-art BDP mechanism and show that our mechanism provides the optimal noise-privacy tradeoffs for any local mechanism up to negligible factors. We also briefly discuss a non-local mechanism which adds correlated noise. Lastly, we perform experiments on synthetic data that detail when DP is insufficient, and experiments on real data to show that our privacy guarantees are robust to underlying distributions that are not simple Markov chains.
Submitted 22 June, 2022;
originally announced June 2022.
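For context on what a local mechanism over binary data looks like, the sketch below implements classic randomized response, the standard epsilon-local differential privacy baseline for a single bit. It is not the paper's BDP mechanism: it treats each bit independently and ignores the Markov-chain correlations that, according to the abstract, make plain DP guarantees less meaningful. The names `keep_probability` and `randomized_response` are illustrative.

```python
import math
import random


def keep_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under epsilon-local DP.

    With p = e^eps / (1 + e^eps), the likelihood ratio between the two
    possible inputs for any output is p / (1 - p) = e^eps.
    """
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability p, otherwise report its flip."""
    return bit if rng.random() < keep_probability(epsilon) else 1 - bit


if __name__ == "__main__":
    rng = random.Random(0)
    sequence = [0, 0, 1, 1, 1, 0]  # e.g. one realization of a binary chain
    print([randomized_response(b, 1.0, rng) for b in sequence])
```

Smaller values of `epsilon` push the keep probability toward 0.5, which adds more noise and gives stronger privacy for each individual bit.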
-
Dynamic View Synthesis from Dynamic Monocular Video
Authors:
Chen Gao,
Ayush Saraf,
Johannes Kopf,
Jia-Bin Huang
Abstract:
We present an algorithm for generating novel views at arbitrary viewpoints and any input time step given a monocular video of a dynamic scene. Our work builds upon recent advances in neural implicit representation and uses continuous and differentiable functions for modeling the time-varying structure and the appearance of the scene. We jointly train a time-invariant static NeRF and a time-varying dynamic NeRF, and learn how to blend the results in an unsupervised manner. However, learning this implicit function from a single video is highly ill-posed (with infinitely many solutions that match the input video). To resolve the ambiguity, we introduce regularization losses to encourage a more physically plausible solution. We show extensive quantitative and qualitative results of dynamic view synthesis from casually captured videos.
Submitted 13 May, 2021;
originally announced May 2021.
-
Competition Alleviates Present Bias in Task Completion
Authors:
Aditya Saraf,
Anna R. Karlin,
Jamie Morgenstern
Abstract:
We build upon recent work [Kleinberg and Oren, 2014, Kleinberg et al., 2016, 2017] that considers present biased agents, who place more weight on costs they must incur now than costs they will incur in the future. They consider a graph theoretic model where agents must complete a task and show that present biased agents can take exponentially more expensive paths than optimal. We propose a theoretical model that adds competition into the mix -- two agents compete to finish a task first. We show that, in a wide range of settings, a small amount of competition can alleviate the harms of present bias. This can help explain why biased agents may not perform so poorly in naturally competitive settings, and can guide task designers on how to protect present biased agents from harm. Our work thus paints a more positive picture than much of the existing literature on present bias.
Submitted 13 January, 2022; v1 submitted 28 September, 2020;
originally announced September 2020.
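The graph-theoretic model of present bias that this abstract builds on can be sketched in a few lines. The sketch covers only the single-agent baseline of Kleinberg and Oren, not the competitive two-agent model the paper proposes, and it assumes the task graph is a directed acyclic graph with a reachable target. At each node the agent multiplies the cost of the immediate edge by a bias factor, adds the true cheapest remaining cost, and re-plans after every step. The graph `TASK_GRAPH` and the function names are made up for illustration.

```python
import heapq


def distances_to_target(graph, target):
    """True cheapest cost from each node to the target (Dijkstra on reversed edges)."""
    reverse = {}
    for u, edges in graph.items():
        for v, cost in edges.items():
            reverse.setdefault(v, {})[u] = cost
    dist = {target: 0}
    heap = [(0, target)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, cost in reverse.get(v, {}).items():
            if d + cost < dist.get(u, float("inf")):
                dist[u] = d + cost
                heapq.heappush(heap, (dist[u], u))
    return dist


def biased_walk(graph, source, target, bias):
    """Path and true total cost of an agent that inflates the immediate edge by `bias`."""
    dist = distances_to_target(graph, target)
    path, total, node = [source], 0, source
    while node != target:
        # Perceived cost: bias * (cost incurred now) + (true cost of the rest).
        _, nxt = min(
            (bias * cost + dist[w], w)
            for w, cost in graph[node].items()
            if w in dist
        )
        total += graph[node][nxt]
        path.append(nxt)
        node = nxt
    return path, total


# Hypothetical task graph: finish now for 4, or take a cheap first step and pay 5 later.
TASK_GRAPH = {"s": {"t": 4, "a": 1}, "a": {"t": 5}, "t": {}}

if __name__ == "__main__":
    print(biased_walk(TASK_GRAPH, "s", "t", 1))  # unbiased agent
    print(biased_walk(TASK_GRAPH, "s", "t", 3))  # present-biased agent
```

In this toy graph the unbiased agent pays 4, while the agent with bias 3 is drawn to the cheap first step and ends up paying 6 in total.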
-
Robust Bayesianism and Likelihoodism
Authors:
Conor Mayo-Wilson,
Aditya Saraf
Abstract:
We defend a new theory of statistical evidence, which we call Robust Bayesianism (RB). We prove that, under widely accepted assumptions, RB entails the law of likelihood [Royall, 1997], the likelihood principle [Berger and Wolpert, 1988], and a variety of other widely-accepted "statistical principles", e.g., the sufficiency principle [Birnbaum, 1962, 1972] and stopping-rule principle [Berger and Wolpert, 1988]. The main technical contribution of this paper is to extend some of those results to a qualitative framework in which experimenters are justified only in making comparative, non-numerical judgments of the form "A given B is more likely than C given D."
Submitted 15 October, 2022; v1 submitted 8 September, 2020;
originally announced September 2020.
-
Flow-edge Guided Video Completion
Authors:
Chen Gao,
Ayush Saraf,
Jia-Bin Huang,
Johannes Kopf
Abstract:
We present a new flow-based video completion algorithm. Previous flow completion methods are often unable to retain the sharpness of motion boundaries. Our method first extracts and completes motion edges, and then uses them to guide piecewise-smooth flow completion with sharp edges. Existing methods propagate colors among local flow connections between adjacent frames. However, not all missing regions in a video can be reached in this way because the motion boundaries form impenetrable barriers. Our method alleviates this problem by introducing non-local flow connections to temporally distant frames, enabling propagating video content over motion boundaries. We validate our approach on the DAVIS dataset. Both visual and quantitative results show that our method compares favorably against the state-of-the-art algorithms.
Submitted 3 September, 2020;
originally announced September 2020.
-
One Shot 3D Photography
Authors:
Johannes Kopf,
Kevin Matzen,
Suhib Alsisan,
Ocean Quigley,
Francis Ge,
Yangming Chong,
Josh Patterson,
Jan-Michael Frahm,
Shu Wu,
Matthew Yu,
Peizhao Zhang,
Zijian He,
Peter Vajda,
Ayush Saraf,
Michael Cohen
Abstract:
3D photography is a new medium that allows viewers to more fully experience a captured moment. In this work, we refer to a 3D photo as one that displays parallax induced by moving the viewpoint (as opposed to a stereo pair with a fixed viewpoint). 3D photos are static in time, like traditional photos, but are displayed with interactive parallax on mobile or desktop screens, as well as on Virtual Reality devices, where viewing it also includes stereo. We present an end-to-end system for creating and viewing 3D photos, and the algorithmic and design choices therein. Our 3D photos are captured in a single shot and processed directly on a mobile device. The method starts by estimating depth from the 2D input image using a new monocular depth estimation network that is optimized for mobile devices. It performs competitively to the state-of-the-art, but has lower latency and peak memory consumption and uses an order of magnitude fewer parameters. The resulting depth is lifted to a layered depth image, and new geometry is synthesized in parallax regions. We synthesize color texture and structures in the parallax regions as well, using an inpainting network, also optimized for mobile devices, on the LDI directly. Finally, we convert the result into a mesh-based representation that can be efficiently transmitted and rendered even on low-end devices and over poor network connections. Altogether, the processing takes just a few seconds on a mobile device, and the result can be instantly viewed and shared. We perform extensive quantitative evaluation to validate our system and compare its new components against the current state-of-the-art.
Submitted 1 September, 2020; v1 submitted 27 August, 2020;
originally announced August 2020.
-
Blockchain Meets Database: Design and Implementation of a Blockchain Relational Database
Authors:
Senthil Nathan,
Chander Govindarajan,
Adarsh Saraf,
Manish Sethi,
Praveen Jayachandran
Abstract:
In this paper, we design and implement the first-ever decentralized replicated relational database with blockchain properties that we term blockchain relational database. We highlight several similarities between features provided by blockchain platforms and a replicated relational database, although they are conceptually different, primarily in their trust model. Motivated by this, we leverage the rich features, decades of research and optimization, and available tooling in relational databases to build a blockchain relational database. We consider a permissioned blockchain model of known, but mutually distrustful organizations each operating their own database instance that are replicas of one another. The replicas execute transactions independently and engage in decentralized consensus to determine the commit order for transactions. We design two approaches, the first where the commit order for transactions is agreed upon prior to executing them, and the second where transactions are executed without prior knowledge of the commit order while the ordering happens in parallel. We leverage serializable snapshot isolation (SSI) to guarantee that the replicas across nodes remain consistent and respect the ordering determined by consensus, and devise a new variant of SSI based on block height for the latter approach. We implement our system on PostgreSQL and present detailed performance experiments analyzing both approaches.
Submitted 31 May, 2019; v1 submitted 5 March, 2019;
originally announced March 2019.