-
Cost-Effective, Low Latency Vector Search with Azure Cosmos DB
Authors:
Nitish Upreti,
Krishnan Sundaram,
Hari Sudan Sundar,
Samer Boshra,
Balachandar Perumalswamy,
Shivam Atri,
Martin Chisholm,
Revti Raman Singh,
Greg Yang,
Subramanyam Pattipaka,
Tamara Hass,
Nitesh Dudhey,
James Codella,
Mark Hildebrand,
Magdalen Manohar,
Jack Moffitt,
Haiyang Xu,
Naren Datha,
Suryansh Gupta,
Ravishankar Krishnaswamy,
Prashant Gupta,
Abhishek Sahu,
Ritika Mor,
Santosh Kulkarni,
Hemeswari Varada,
et al. (11 additional authors not shown)
Abstract:
Vector indexing enables semantic search over diverse corpora and has become an important interface to databases for both users and AI agents. Efficient vector search requires deep optimizations in database systems. This has motivated a new class of specialized vector databases that optimize for vector search quality and cost. Instead, we argue that a scalable, high-performance, and cost-efficient vector search system can be built inside a cloud-native operational database like Azure Cosmos DB while leveraging the benefits of a distributed database such as high availability, durability, and scale. We do this by deeply integrating DiskANN, a state-of-the-art vector indexing library, inside Azure Cosmos DB NoSQL. This system uses a single vector index per partition, stored in existing index trees and kept in sync with the underlying data. It supports query latency under 20 ms over an index spanning 10 million vectors, has stable recall over updates, and offers nearly 15x and 41x lower query cost than the Zilliz and Pinecone serverless enterprise products, respectively. It also scales out to billions of vectors via automatic partitioning. This convergent design presents a point in favor of integrating vector indices into operational databases in the context of recent debates on specialized vector databases, and offers a template for vector indexing in other databases.
Submitted 9 May, 2025;
originally announced May 2025.
-
Forest Covers and Bounded Forest Covers
Authors:
Daya Ram Gaur,
Barun Gorain,
Shaswati Patra,
Rishi Ranjan Singh
Abstract:
We study approximation algorithms for the forest cover and bounded forest cover problems. A probabilistic $(2+ε)$-approximation algorithm for the forest cover problem is given using the method of dual fitting. A deterministic algorithm with a 2-approximation ratio that rounds the optimal solution to a linear program is given next. The 2-approximation for the forest cover problem is then used to give a 6-approximation for the bounded forest cover problem. The use of the probabilistic method to develop the $(2+ε)$-approximation algorithm may be of independent interest.
Submitted 25 November, 2024;
originally announced November 2024.
-
Centrality Measures: A Tool to Identify Key Actors in Social Networks
Authors:
Rishi Ranjan Singh
Abstract:
Experts from several disciplines have been widely using centrality measures for analyzing large as well as complex networks. These measures rank nodes/edges in networks by quantifying a notion of the importance of nodes/edges. Ranking aids in identifying important and crucial actors in networks. In this chapter, we summarize some of the centrality measures that are extensively applied for mining social network data. We also discuss various directions of research related to these measures.
Submitted 3 November, 2020;
originally announced November 2020.
-
EBBIOT: A Low-complexity Tracking Algorithm for Surveillance in IoVT Using Stationary Neuromorphic Vision Sensors
Authors:
Jyotibdha Acharya,
Andres Ussa Caycedo,
Vandana Reddy Padala,
Rishi Raj Sidhu Singh,
Garrick Orchard,
Bharath Ramesh,
Arindam Basu
Abstract:
In this paper, we present EBBIOT, a novel paradigm for object tracking using stationary neuromorphic vision sensors in low-power sensor nodes for the Internet of Video Things (IoVT). In contrast to fully event-based tracking or fully frame-based approaches, we propose a mixed approach in which we create event-based binary images (EBBI) that can use memory-efficient noise filtering algorithms. We exploit the motion-triggering aspect of neuromorphic sensors to generate region proposals based on event density counts with >1000X less memory and computation compared to frame-based approaches. We also propose a simple overlap-based tracker (OT) with prediction-based handling of occlusion. Our overall approach requires 7X less memory and 3X less computation than conventional noise filtering and event-based mean shift (EBMS) tracking. Finally, we show that our approach results in significantly higher precision and recall compared to the EBMS approach as well as a Kalman filter tracker when evaluated over 1.1 hours of traffic recordings at two different locations.
Submitted 4 October, 2019;
originally announced October 2019.
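The idea of building a binary image from events and proposing regions from event density counts can be illustrated with a minimal sketch. The abstract does not spell out the exact procedure, so everything below is an assumption made for illustration: the function names (`events_to_binary_image`, `runs_above`, `propose_regions`), the use of row and column histograms, and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
def events_to_binary_image(events, width, height):
    """Mark a pixel 1 if at least one event fired there in the frame window."""
    image = [[0] * width for _ in range(height)]
    for x, y in events:
        if 0 <= x < width and 0 <= y < height:
            image[y][x] = 1
    return image


def runs_above(hist, threshold):
    """Return inclusive (start, end) index pairs of maximal runs with hist >= threshold."""
    runs, start = [], None
    for i, count in enumerate(hist):
        if count >= threshold and start is None:
            start = i
        elif count < threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(hist) - 1))
    return runs


def propose_regions(image, threshold=1):
    """Propose boxes (x0, y0, x1, y1) from column and row event-count histograms.

    Assumed scheme: project the binary image onto both axes, find runs of
    dense columns and rows, and keep each column-run/row-run intersection
    that actually contains at least one active pixel.
    """
    col_hist = [sum(col) for col in zip(*image)]
    row_hist = [sum(row) for row in image]
    boxes = []
    for x0, x1 in runs_above(col_hist, threshold):
        for y0, y1 in runs_above(row_hist, threshold):
            occupied = any(
                image[y][x]
                for y in range(y0, y1 + 1)
                for x in range(x0, x1 + 1)
            )
            if occupied:
                boxes.append((x0, y0, x1, y1))
    return boxes
```

For example, `propose_regions(events_to_binary_image([(1, 1), (2, 1), (1, 2), (6, 5), (7, 5)], 8, 6))` yields one box around each of the two event clusters. Only two one-dimensional histograms are stored on top of the one-bit-per-pixel image, which is the kind of memory saving the abstract alludes to.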
-
An Efficient Heuristic for Betweenness-Ordering
Authors:
Rishi Ranjan Singh,
Shubham Chaudhary,
Manas Agarwal
Abstract:
Centrality measures, long popular among sociologists and psychologists, have recently seen broad and increasing application across several disciplines. Among the plethora of application-specific definitions available in the literature for ranking vertices, closeness centrality, betweenness centrality, and eigenvector centrality (PageRank) have been the most important and widely applied. Networks in which information, signals, or commodities flow along the edges surround us. Betweenness centrality is a handy tool for analyzing such systems, but computing betweenness is a daunting task in large networks. In this paper, we propose an efficient heuristic to determine the betweenness-ordering of $k$ vertices (where $k$ is much smaller than the total number of vertices) without computing their exact betweenness indices. The algorithm is based on a non-uniform node sampling model, which is developed from an analysis of Erdős-Rényi graphs. We apply our approach to find the betweenness-ordering of vertices in several synthetic and real-world graphs. The proposed heuristic produces a very efficient ordering even when it runs in time linear in the number of edges. We compare our method with the available techniques in the literature and show that our method produces a more efficient ordering than the currently known methods.
Submitted 22 March, 2017; v1 submitted 23 September, 2014;
originally announced September 2014.
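Ordering a few vertices by betweenness without computing exact indices can be sketched with sampled sources. The sketch below is not the authors' heuristic: it samples pivot vertices uniformly at random, whereas the abstract describes a non-uniform sampling model. It uses Brandes' dependency accumulation on an unweighted graph, and the function names and the `num_pivots` and `seed` parameters are hypothetical.

```python
import random
from collections import deque


def single_source_dependencies(adj, s):
    """Brandes' accumulation: dependency of source s on every vertex (unweighted graph)."""
    sigma = {v: 0 for v in adj}    # number of shortest s-v paths
    dist = {v: -1 for v in adj}    # BFS distance from s, -1 if unseen
    preds = {v: [] for v in adj}   # shortest-path predecessors
    sigma[s], dist[s] = 1, 0
    order = []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if dist[w] < 0:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    delta = {v: 0.0 for v in adj}
    # Walk vertices in order of non-increasing distance from s.
    for w in reversed(order):
        for v in preds[w]:
            delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
    delta[s] = 0.0  # a source does not contribute to its own betweenness
    return delta


def betweenness_ordering(adj, targets, num_pivots, seed=0):
    """Rank `targets` by betweenness estimated from uniformly sampled pivots."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    pivots = rng.sample(nodes, min(num_pivots, len(nodes)))
    score = {t: 0.0 for t in targets}
    for s in pivots:
        delta = single_source_dependencies(adj, s)
        for t in targets:
            score[t] += delta[t]
    return sorted(targets, key=lambda t: score[t], reverse=True)
```

Each pivot costs one breadth-first search, so the total work is proportional to the number of pivots times the number of edges. When every vertex is used as a pivot, the scores equal the exact betweenness values counted over ordered pairs.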
-
Navigability on Networks: A Graph Theoretic Perspective
Authors:
Rishi Ranjan Singh,
Shreyas Balakuntala,
Sudarshan Iyengar
Abstract:
Human navigation has been of interest to psychologists and cognitive scientists for the past few decades. Only recently was the study of human navigational strategies taken up with a network-analytic approach, prompted mainly by Milgram's small-world experiment. We briefly review the work in this direction and provide answers to the algorithmic questions raised by the previous study. It is noted that humans have a tendency to navigate using centers of the network; such paths are called center-strategic-paths. We show that the problem of finding a center-strategic-path is an easy one. We provide a polynomial-time algorithm to find a center-strategic-path between a given pair of nodes. We apply our finding to empirically check navigability on synthetic networks and analyze a few special types of graphs.
Submitted 7 May, 2013; v1 submitted 15 April, 2013;
originally announced April 2013.
-
Approaches for user profile Investigation in Orkut Social Network
Authors:
Rajni Ranjan Singh,
Deepak Singh Tomar
Abstract:
The Internet has become a large and rich repository of information about us as individuals. Anything from user profile information to the friend links a user subscribes to reflects the social interactions the user has in the real world. Social networking has created new ways to communicate and share information. Social networking websites are used regularly by millions of people, and it now seems that social networking will be an enduring part of everyday life. Social networks such as Orkut, Bebo, MySpace, Flickr, Facebook, Friendster, and LinkedIn have attracted millions of internet users who are involved in blogging, participatory book reviewing, personal networking, and photo sharing. Social network services are increasingly being used in legal and criminal investigations. Information posted on sites such as Orkut and Facebook has been used by police, probation, and university officials to prosecute users of those sites. In some situations, content posted on social networking websites has been used in court. In the proposed work, the degree of closeness is identified by link-weight approaches, and information matrices are generated and matched on the basis of similarity in user profile information. The proposed technique is useful for investigating a user profile and calculating the closeness or interaction between users.
Submitted 5 December, 2009;
originally announced December 2009.