-
Function-Correcting $b$-symbol Codes for Locally $(λ, ρ,b)$-Functions
Authors:
Gyanendra K. Verma,
Anamika Singh,
Abhay Kumar Singh
Abstract:
The family of functions plays a central role in the design and effectiveness of function-correcting codes. By focusing on a well-defined family of functions, function-correcting codes can be constructed with minimal length while still ensuring full error detection or correction within that family. In this work, we explore the concept of locally $(λ,ρ)$-functions for $b$-symbol read channels and investigate the redundancy of the corresponding function-correcting $b$-symbol codes (FCBSC) by introducing the notion of locally $(λ,ρ,b)$-functions. First, we discuss the possible values of $λ$ and $ρ$ for which any function can be considered a locally $(λ,ρ)$-function in the $b$-symbol metric. The findings improve some known results in the Hamming metric and present several new results in the $b$-symbol metric. Then we investigate the redundancy of $(f,t)$-FCBSC for locally $(λ,ρ,b)$-functions. We establish a recurrence relation between the optimal redundancy of $(f,t)$-function-correcting codes for the $(b+1)$-read and $b$-read channels. We establish an upper bound on the redundancy of $(f,t)$-function-correcting $b$-symbol codes for general locally $(λ,ρ,b)$-functions by linking it to the minimum achievable length of $b$-symbol error-correcting codes and traditional Hamming-metric codes, given a fixed number of codewords and a specified minimum distance. We derive some explicit upper bounds on the redundancy of $(f,t)$-function-correcting $b$-symbol codes for $ρ=2t$. Moreover, for the case where $b=1$, we show that a locally $(3,2t,1)$-function achieves the optimal redundancy of $3t$. Additionally, we explicitly investigate locality and redundancy for the $b$-symbol weight distribution function for $b\geq 1$.
Submitted 23 May, 2025; v1 submitted 14 May, 2025;
originally announced May 2025.
-
Power Flow Approximations for Multiphase Distribution Networks using Gaussian Processes
Authors:
Daniel Glover,
Parikshit Pareek,
Deepjyoti Deka,
Anamika Dubey
Abstract:
Learning-based approaches are increasingly leveraged to manage and coordinate the operation of grid-edge resources in active power distribution networks. Among these, model-based techniques stand out for their superior data efficiency and robustness compared to model-free methods. However, effective model learning requires a learning-based approximator for the underlying power flow model. This study extends existing work by introducing a data-driven power flow method based on Gaussian Processes (GPs) to approximate the multiphase power flow model, by mapping net load injections to nodal voltages. Simulation results using the IEEE 123-bus and 8500-node distribution test feeders demonstrate that the trained GP model can reliably predict the nonlinear power flow solutions with minimal training data. We also conduct a comparative analysis of the training efficiency and testing performance of the proposed GP-based power flow approximator against a deep neural network-based approximator, highlighting the advantages of our data-efficient approach. Results over realistic operating conditions show that despite an 85% reduction in the training sample size (corresponding to a 92.8% improvement in training time), GP models produce a 99.9% relative reduction in mean absolute error compared to the baselines of deep neural networks.
Submitted 29 April, 2025;
originally announced April 2025.
-
Energy-Based Reward Models for Robust Language Model Alignment
Authors:
Anamika Lochab,
Ruqi Zhang
Abstract:
Reward models (RMs) are essential for aligning Large Language Models (LLMs) with human preferences. However, they often struggle with capturing complex human preferences and generalizing to unseen data. To address these challenges, we introduce Energy-Based Reward Model (EBRM), a lightweight post-hoc refinement framework that enhances RM robustness and generalization. EBRM models the reward distribution explicitly, capturing uncertainty in human preferences and mitigating the impact of noisy or misaligned annotations. It achieves this through conflict-aware data filtering, label-noise-aware contrastive training, and hybrid initialization. Notably, EBRM enhances RMs without retraining, making it computationally efficient and adaptable across different models and tasks. Empirical evaluations on RM benchmarks demonstrate significant improvements in both robustness and generalization, achieving up to a 5.97% improvement in safety-critical alignment tasks compared to standard RMs. Furthermore, reinforcement learning experiments confirm that our refined rewards enhance alignment quality, effectively delaying reward hacking. These results demonstrate our approach as a scalable and effective enhancement for existing RMs and alignment pipelines. The code is available at EBRM.
Submitted 17 April, 2025;
originally announced April 2025.
-
Function-Correcting Codes for b-Symbol Read Channels
Authors:
Anamika Singh,
Abhay Kumar Singh,
Eitan Yaakobi
Abstract:
Function-correcting codes are an innovative class of codes that are designed to protect a function evaluation of the data against errors or corruptions. Due to their usefulness in machine learning applications and archival data storage, where preserving the integrity of computation is crucial, Lenz et al. recently introduced function-correcting codes for binary symmetric channels to safeguard function evaluation against errors. Xia et al. expanded this concept to symbol-pair read channels over binary fields. The current paper further advances the theory by developing function-correcting codes for b-symbol read channels over finite fields. We introduce the idea of irregular b-symbol distance codes and establish bounds on their performance over finite fields. This concept helps in understanding the behavior of function-correcting codes in more complex settings. We also present a graphical approach to the problem of constructing function-correcting b-symbol codes. Furthermore, we apply these general concepts to specific classes of functions and compare the redundancy of function-correcting b-symbol codes with classical b-symbol codes. Our findings demonstrate that function-correcting b-symbol codes achieve lower redundancy while maintaining reliability.
Submitted 17 March, 2025;
originally announced March 2025.
-
Adaptive Random Fourier Features Training Stabilized By Resampling With Applications in Image Regression
Authors:
Aku Kammonen,
Anamika Pandey,
Erik von Schwerin,
Raúl Tempone
Abstract:
This paper presents an enhanced adaptive random Fourier features (ARFF) training algorithm for shallow neural networks, building upon the work introduced in "Adaptive Random Fourier Features with Metropolis Sampling", Kammonen et al., Foundations of Data Science, 2(3):309-332, 2020. This improved method uses a particle filter-type resampling technique to stabilize the training process and reduce the sensitivity to parameter choices. The Metropolis test can also be omitted when resampling is used, reducing the number of hyperparameters by one and reducing the computational cost per iteration compared to the ARFF method. We present comprehensive numerical experiments demonstrating the efficacy of the proposed algorithm in function regression tasks as a stand-alone method and as a pretraining step before gradient-based optimization, using the Adam optimizer. Furthermore, we apply the proposed algorithm to a simple image regression problem, illustrating its utility in sampling frequencies for the random Fourier features (RFF) layer of coordinate-based multilayer perceptrons. In this context, we use the proposed algorithm to sample the parameters of the RFF layer in an automated manner.
Submitted 29 April, 2025; v1 submitted 8 October, 2024;
originally announced October 2024.
-
Cascade Reward Sampling for Efficient Decoding-Time Alignment
Authors:
Bolian Li,
Yifan Wang,
Anamika Lochab,
Ananth Grama,
Ruqi Zhang
Abstract:
Aligning large language models (LLMs) with human preferences is essential for their applications. Recently, decoding-time alignment has emerged as an effective plug-and-play technique that avoids fine-tuning model parameters. This approach retains the general utility of pretrained LLMs but often suffers from significant inefficiencies during decoding, primarily due to wasted token generation and excessive reward evaluations. To address these challenges, we introduce Cascade Reward Sampling (CARDS) to resolve both efficiency bottlenecks in decoding-time alignment. Specifically, we develop a segment-level rejection sampling algorithm that minimizes redundant computations of both LLMs and reward models (RMs). Central to CARDS is an uncertainty-based segmentation mechanism, which ensures the accuracy of RM evaluations on incomplete segments. Furthermore, we provide a detailed analysis of reward scores on segments to elucidate the improved alignment performance. Experimental results demonstrate that CARDS significantly improves decoding efficiency, alignment quality, and general utility compared to existing decoding-time alignment methods, achieving approximately a 70% reduction in decoding time and over 90% win-ties in utility and safety benchmarks.
Submitted 31 March, 2025; v1 submitted 24 June, 2024;
originally announced June 2024.
-
AutoLCZ: Towards Automatized Local Climate Zone Mapping from Rule-Based Remote Sensing
Authors:
Chenying Liu,
Hunsoo Song,
Anamika Shreevastava,
Conrad M Albrecht
Abstract:
Local climate zones (LCZs) established a standard classification system to categorize the landscape universe for improved urban climate studies. Existing LCZ mapping is guided by human interaction with geographic information systems (GIS) or modelled from remote sensing (RS) data. GIS-based methods do not scale to large areas. However, RS-based methods leverage machine learning techniques to automatize LCZ classification from RS. Yet, RS-based methods require huge amounts of manual labels for training.
We propose a novel LCZ mapping framework, termed AutoLCZ, to extract the LCZ classification features from high-resolution RS modalities. We study the definition of numerical rules designed to mimic the LCZ definitions. Those rules model geometric and surface cover properties from LiDAR data. Correspondingly, we enable LCZ classification from RS data in a GIS-based scheme. The proposed AutoLCZ method has the potential to reduce the human labor needed to acquire accurate metadata. At the same time, AutoLCZ sheds light on the physical interpretability of RS-based methods. In a proof-of-concept for New York City (NYC), we leverage airborne LiDAR surveys to model 4 LCZ features to distinguish 10 LCZ types. The results indicate the potential of AutoLCZ as a promising avenue for large-scale LCZ mapping from RS data.
Submitted 22 May, 2024;
originally announced May 2024.
-
"Hey..! This medicine made me sick": Sentiment Analysis of User-Generated Drug Reviews using Machine Learning Techniques
Authors:
Abhiram B. Nair,
Abhinand K.,
Anamika U.,
Denil Tom Jaison,
Ajitha V.,
V. S. Anoop
Abstract:
Sentiment analysis has become increasingly important in healthcare, especially in the biomedical and pharmaceutical fields. The data generated by the general public on the effectiveness, side effects, and adverse drug reactions are goldmines for different agencies and medicine producers to understand the concerns and reactions of people. Despite the challenge of obtaining datasets on drug-related problems, sentiment analysis on this topic would be a significant boon to the field. This project proposes a drug review classification system that classifies user reviews on a particular drug into different classes, such as positive, negative, and neutral. This approach uses a dataset that is collected from publicly available sources containing drug reviews, such as drugs.com. The collected data is manually labeled and verified to ensure that the labels are correct. Three pre-trained language models, BERT, SciBERT, and BioBERT, are used to obtain embeddings, which are then used as features for different machine learning classifiers such as decision trees, support vector machines, and random forests, as well as deep learning algorithms such as recurrent neural networks. The performance of these classifiers is quantified using precision, recall, and F1-score, and the results show that the proposed approaches are useful in analyzing the sentiments of people on different drugs.
Submitted 9 April, 2024;
originally announced April 2024.
-
Deployment of Advanced and Intelligent Logistics Vehicles with Enhanced Tracking and Security Features
Authors:
Iqtiar Md Siddique,
Selim Molla,
MD Rakib Hasan,
Anamika Ahmed Siddique
Abstract:
This study focuses on the implementation of modern and intelligent logistics vehicles equipped with advanced tracking and security features. In response to the evolving landscape of logistics management, the proposed system integrates cutting-edge technologies to enhance efficiency and ensure the security of the entire logistics process. The core component of this implementation is the incorporation of state-of-the-art tracking mechanisms, enabling real-time monitoring of vehicle locations and movements. Furthermore, the system addresses the paramount concern of security by introducing advanced security measures. Through the utilization of sophisticated tracking technologies and security protocols, the proposed logistics vehicles aim to safeguard both customer and provider data. The implementation includes the integration of QR code concepts, creating a binary image system that conceals sensitive information and ensures access only to authorized users. In addition to tracking and security, the study delves into the realm of information mining, employing techniques such as classification, clustering, and recommendation to extract meaningful patterns from vast datasets. Collaborative filtering techniques are incorporated to enhance customer experience by recommending services based on user preferences and historical data. This abstract encapsulates the comprehensive approach of deploying modern logistics vehicles, emphasizing their intelligence through advanced tracking, robust security measures, and data-driven insights. The proposed system aims to revolutionize logistics management, providing a seamless and secure experience for both customers and service providers in the dynamic logistics landscape.
Submitted 18 February, 2024;
originally announced February 2024.
-
Comparing Spectral Bias and Robustness For Two-Layer Neural Networks: SGD vs Adaptive Random Fourier Features
Authors:
Aku Kammonen,
Lisi Liang,
Anamika Pandey,
Raúl Tempone
Abstract:
We present experimental results highlighting two key differences resulting from the choice of training algorithm for two-layer neural networks. The spectral bias of neural networks is well known, while the spectral bias dependence on the choice of training algorithm is less studied. Our experiments demonstrate that an adaptive random Fourier features algorithm (ARFF) can yield a spectral bias closer to zero compared to the stochastic gradient descent optimizer (SGD). Additionally, we train two identically structured classifiers, employing SGD and ARFF, to the same accuracy levels and empirically assess their robustness against adversarial noise attacks.
Submitted 31 January, 2024;
originally announced February 2024.
-
Deconstructing written rules and hierarchy in peer produced software communities
Authors:
Mahasweta Chakraborti,
Beril Bulat,
Qiankun Zhong,
Anamika Sen,
Seth Frey
Abstract:
We employ recent advances in computational institutional analysis and NLP to investigate the systems of authority that are reflected in the written policy documents of the ASF. Our study, which seeks to decipher the effective similarities or departures of the ASF model from conventional software companies, reveals evidence of both flat and bureaucratic governance in a peer production setup, suggesting a complicated relationship between business-based theories of administrative hierarchy and foundational principles of the OSS movement.
Submitted 16 June, 2022;
originally announced June 2022.
-
Reinforcement Learning for Battery Energy Storage Dispatch augmented with Model-based Optimizer
Authors:
Gayathri Krishnamoorthy,
Anamika Dubey
Abstract:
Reinforcement learning has been found useful in solving optimal power flow (OPF) problems in electric power distribution systems. However, the use of largely model-free reinforcement learning algorithms that completely ignore the physics-based modeling of the power grid compromises the optimizer performance and poses scalability challenges. This paper proposes a novel approach to synergistically combine the physics-based models with learning-based algorithms using imitation learning to solve distribution-level OPF problems. Specifically, we propose imitation learning based improvements in deep reinforcement learning (DRL) methods to solve the OPF problem for a specific case of battery storage dispatch in the power distribution systems. The proposed imitation learning algorithm uses the approximate optimal solutions obtained from a linearized model-based OPF solver to provide a good initial policy for the DRL algorithms while improving the training efficiency. The effectiveness of the proposed approach is demonstrated using IEEE 34-bus and 123-bus distribution feeders with numerous distribution-level battery storage systems.
Submitted 2 September, 2021;
originally announced September 2021.
-
Signature Verification using Geometrical Features and Artificial Neural Network Classifier
Authors:
Anamika Jain,
Satish Kumar Singh,
Krishna Pratap Singh
Abstract:
Signature verification has been one of the major researched areas in the field of computer vision. Many financial and legal organizations use signature verification for access control and authentication. Signature images are not rich in texture; however, they contain much vital geometrical information. Through this work, we have proposed a signature verification methodology that is simple yet effective. The technique presented in this paper harnesses the geometrical features of a signature image, such as center, isolated points, connected components, etc., and, with the power of an Artificial Neural Network (ANN) classifier, classifies the signature image based on its geometrical features. The publicly available datasets MCYT and BHSig260 (which contains signature images in two regional languages, Bengali and Hindi) have been used in this paper to test the effectiveness of the proposed method. We have obtained a lower Equal Error Rate (EER) on the MCYT 100 dataset and higher accuracy on the BHSig260 dataset.
Submitted 4 August, 2021;
originally announced August 2021.
-
Analyzing Offline Social Engagements: An Empirical Study of Meetup Events Related to Software Development
Authors:
Abhishek Sharma,
Gede Artha Azriadi Prana,
Anamika Sawhney,
Nachiappan Nagappan,
David Lo
Abstract:
Software developers use a variety of social media channels and tools in order to keep themselves up to date, collaborate with other developers, and find projects to contribute to. Meetup is one such social medium used by software developers to organize community gatherings. Liu et al. characterized Meetup as an event-based social network (EBSN) which contains valuable offline social interactions in addition to online interactions. Recently, Storey et al. found that Meetup was one of the social channels used by developers. In this work, we investigate in detail the dynamics of Meetup groups and events related to software development, which has not been done in any of the previous works.
First, we identified 6,317 Meetup groups related to software development and extracted 185,758 events organized by them. Then we took a statistically significant sample of 452 events on which we performed open coding, based on which we were able to develop 9 categories of events (8 main categories + Others). Next, we did a popularity analysis of the categories of events and found that Talks by Domain Experts, Hands-on Sessions, and Open Discussions are the most popular categories of events organized by Meetup groups related to software development. Our findings show that the more popular categories are those where developers can learn and gain knowledge. In a diversity analysis of Meetup groups, we found that 19.82% of the members on average are female, which is a larger proportion than the numbers reported in previous studies on other social media. From a broader software development community point of view, information from this new forum can be valuable for identifying and understanding emerging topics and the associations among them, which can help identify future trends as well as current best practices.
Submitted 16 December, 2019;
originally announced December 2019.
-
Investigating Ortega Hypothesis in Q&A portals: An Analysis of StackOverflow
Authors:
Anamika Chhabra,
S. R. S. Iyengar
Abstract:
The Ortega Hypothesis considers masses, i.e., a large number of average people who are not specially qualified, as being instrumental in any system's progress. This hypothesis has been reasonably examined in the scientific domain, where it has been supported by a few works while refuted by many others, resulting in no clear consensus. So far, the hypothesis has been explored only in the scientific domain and has hardly been examined in other fields. Given the large-scale collaboration facilitated by modern Q&A portals, where a crowd with a diverse skill-set contributes, an investigation of this hypothesis becomes necessary for informed policy-making. In this work, we investigate the research question inspired by the Ortega Hypothesis in StackOverflow, where we examine the contribution made by masses and check whether the system may continue to function well even in their absence. The results point towards the importance of masses in Q&A portals for the little but useful contribution that they provide. The insights obtained from the study may help in devising informed incentivization policies enabling better utilization of the potential of the users.
Submitted 6 November, 2019;
originally announced November 2019.
-
Social Network Based Substance Abuse Prevention via Network Modification (A Preliminary Study)
Authors:
Aida Rahmattalabi,
Anamika Barman Adhikari,
Phebe Vayanos,
Milind Tambe,
Eric Rice,
Robin Baker
Abstract:
Substance use and abuse is a significant public health problem in the United States. Group-based intervention programs offer a promising means of preventing and reducing substance abuse. While effective, unfortunately, inappropriate intervention groups can result in an increase in deviant behaviors among participants, a process known as deviancy training. This paper investigates the problem of optimizing the social influence related to the deviant behavior via careful construction of the intervention groups. We propose a Mixed Integer Optimization formulation that decides on the intervention groups, captures the impact of the groups on the structure of the social network, and models the impact of these changes on behavior propagation. In addition, we propose a scalable hybrid meta-heuristic algorithm that combines Mixed Integer Programming and Large Neighborhood Search to find near-optimal network partitions. Our algorithm is packaged in the form of GUIDE, an AI-based decision aid that recommends intervention groups. Being the first quantitative decision aid of this kind, GUIDE is able to assist practitioners, in particular social workers, in three key areas: (a) GUIDE proposes near-optimal solutions that are shown, via extensive simulations, to significantly improve over the traditional qualitative practices for forming intervention groups; (b) GUIDE is able to identify circumstances when an intervention will lead to deviancy training, thus saving time, money, and effort; (c) GUIDE can evaluate current strategies of group formation and discard strategies that will lead to deviancy training. In developing GUIDE, we are primarily interested in substance use interventions among homeless youth as a high risk and vulnerable population. GUIDE is developed in collaboration with Urban Peak, a homeless-youth serving organization in Denver, CO, and is under preparation for deployment.
Submitted 31 January, 2019;
originally announced February 2019.
-
Toward SATVAM: An IoT Network for Air Quality Monitoring
Authors:
Rashmi Ballamajalu,
Srijith Nair,
Shayal Chhabra,
Sumit K Monga,
Anand SVR,
Malati Hegde,
Yogesh Simmhan,
Anamika Sharma,
Chandan M Choudhary,
Ronak Sutaria,
Rajesh Zele,
Sachchida N. Tripathi
Abstract:
Air pollution is ranked as the second most serious risk for public health in India after malnutrition. The lack of spatially and temporally distributed air quality information prevents a scientific study on its impact on human health and on the national economy. In this paper, we present our initial efforts toward SATVAM, Streaming Analytics over Temporal Variables for Air quality Monitoring, that aims to address this gap. We introduce the multi-disciplinary, multi-institutional project and some of the key IoT technologies used. These cut across hardware integration of gas sensors with a wireless mote packaging, design of the wireless sensor network using 6LoWPAN and RPL, and integration with a cloud backend for data acquisition and analysis. The outcome of our initial deployment will inform an improved design that will enable affordable and manageable monitoring at the city scale. This should lead to data-driven policies for urban air quality management.
Submitted 19 November, 2018;
originally announced November 2018.
-
Capturing Knowledge Triggering in Collaborative Settings
Authors:
Anamika Chhabra,
S. R. Sudarshan Iyengar
Abstract:
In collaborative knowledge building settings, the existing knowledge in the system is perceived to set the stage for the manifestation of more knowledge, a phenomenon termed triggering. Although the literature points to a few theories supporting the existence of this phenomenon, these have never been validated in real collaborative environments, which calls their general prevalence into question. In this work, we provide a mechanized way to observe the presence of triggering in knowledge building environments. We implement the method on the most-edited articles of Wikipedia and show how the existing factoids lead to the inclusion of more factoids in these articles. The proposed technique may also be used in other collaborative knowledge building settings. The insights obtained from the study will help portal designers build portals that enable optimal triggering.
Submitted 2 September, 2018;
originally announced September 2018.
-
The Saga of KPR: Theoretical and Experimental developments
Authors:
Kiran Sharma,
Anamika,
Anindya S. Chakrabarti,
Anirban Chakraborti,
Sujoy Chakravarty
Abstract:
In this article, we present a brief account of the origin of the Kolkata Paise Restaurant (KPR) problem and an overview of recent developments on it. The problem can serve as a prototype for a broader class of resource allocation problems in the presence of a large number of competing agents, typically studied using coordination and anti-coordination games. We discuss the KPR and its several extensions, as well as its applications to many economic and social phenomena. We end the article with some discussion of our ongoing experimental analysis of the same problem. We demonstrate that this provides an interesting picture of how people analyze complex situations, and design their strategies or react to them.
Submitted 18 December, 2017;
originally announced December 2017.
-
How Does Knowledge Come By?
Authors:
Anamika Chhabra,
S. R. S. Iyengar
Abstract:
Although the amount of knowledge that humans possess has been gradually increasing, we still do not know the procedure and conditions that lead to the creation of new knowledge. An understanding of the modus operandi for the creation of knowledge may help accelerate the existing pace of knowledge building. The existing literature in the domain highlights our state of ignorance regarding various aspects of the process of knowledge building. The reason behind it has been our inability to acquire the underlying data of this complex process. However, the present time shows great promise for improvement in the knowledge building domain, owing to the availability of several online knowledge building portals. In this report, we emphasise that these portals act as prototypes for the universal knowledge building process. The analysis of big data obtained from these portals may equip knowledge building researchers with much-needed meta-knowledge.
Submitted 19 May, 2017;
originally announced May 2017.
-
Ideal Composition of a Group for Maximal Knowledge Building in Crowdsourced Environments
Authors:
Anamika Chhabra,
S. R. S. Iyengar,
Jaspal Singh Saini
Abstract:
Crowdsourcing has revolutionized the process of knowledge building on the web. Wikipedia and StackOverflow are witness to this rising development. However, the dynamics behind the process of crowdsourcing in the domain of knowledge building remain relatively unexplored. It has been observed that an ecosystem exists in collaborative knowledge building environments (KBEs), which places the users of a KBE into various categories based on their expertise. Classical cognitive theories indicate that triggering among knowledge units is one of the most important reasons behind accelerated knowledge building in collaborative KBEs. We use the concept of ecosystem and the triggering phenomenon to highlight the necessity of the right mix of users in a KBE. We provide a hill-climbing-based algorithm that gives the ideal mixture of users in a KBE, given the amount of triggering that takes place among the users of various categories. The study will help portal designers build suitable crowdsourced environments accordingly.
Submitted 11 May, 2016; v1 submitted 28 October, 2015;
originally announced October 2015.
-
A Framework for Textbook Enhancement and Learning using Crowdsourced Annotations
Authors:
Anamika Chhabra,
S. R. S. Iyengar,
Poonam Saini,
Rajesh Shreedhar Bhat
Abstract:
Despite significant improvement in educational aids in terms of an effective teaching-learning process, most of the educational content available to students is less than optimal in terms of being up-to-date, exhaustive and easy to understand. There is a need to iteratively improve the educational material based on the feedback collected from the students' learning experience. This can be achieved by observing the students' interactions with the content, and then having the authors modify it based on this feedback. Hence, we aim to facilitate and promote communication between the communities of authors, instructors and students in order to gradually improve the educational material. Such a system will also help in students' learning process by encouraging student-to-student teaching. In support of these objectives, we provide the framework of a platform named Crowdsourced Annotation System (CAS), where people from these communities can collaborate and benefit from each other. We use the concept of in-context annotations, through which students can add their comments about the given text while learning it. An experiment was conducted with 60 students who tried to learn an article of a textbook by annotating it for four days. According to the results of the experiment, most of the students were highly satisfied with the use of CAS. They stated that the system is extremely useful for learning and that they would like to use it for learning other concepts in future.
Submitted 11 August, 2015; v1 submitted 20 March, 2015;
originally announced March 2015.
-
Ecosystem: A Characteristic Of Crowdsourced Environments
Authors:
Anamika Chhabra,
S. R. S. Iyengar,
Poonam Saini,
Rajesh Shreedhar Bhat,
Vijay Kumar
Abstract:
The phenomenal success of certain crowdsourced online platforms, such as Wikipedia, is credited to their ability to tap the crowd's potential to collaboratively build knowledge. While it is well known that the crowd's collective wisdom surpasses the cumulative individual expertise, little is understood about the dynamics of knowledge building in a crowdsourced environment. A proper understanding of these dynamics would enable better design of such environments to solicit knowledge from the crowd. Our experiment on crowdsourced systems based on annotations shows that an important reason for the rapid knowledge building in such environments is variance in expertise. First, we used as our test bed a customized Crowdsourced Annotation System (CAS), which provides a group of users the facility to annotate a given document while trying to understand it. Our results showed the presence of different genres of proficiency amongst the users of an annotation system. We observed that the ecosystem in the crowdsourced annotation system comprised mainly four categories of contributors, namely Probers, Solvers, Articulators and Explorers. We inferred from our experiment that knowledge garnering mainly happens due to the synergetic interaction across these categories. Further, we conducted an analysis on the dataset of Wikipedia and Stack Overflow and noticed the presence of the ecosystem in these portals as well. From this study, we claim that the ecosystem is a universal characteristic of all crowdsourced portals.
Submitted 27 August, 2015; v1 submitted 24 February, 2015;
originally announced February 2015.
-
A Framework for Picture Extraction on Search Engine Improved and Meaningful Result
Authors:
Anamika Sharma
Abstract:
Searching is an important tool for information gathering. If information is in the form of a picture, then it plays a major role in enabling quick action and is easy to memorize. It is a human tendency to retain pictures more than text. The complexity and variety of queries can cause variation in results, leading users either to learn something new or to become confused. This paper presents the development of a framework that focuses on resource identification for users, so that they can get faster access to accurate and concise results on time, together with an analysis of the change that is evident as the scenario changes from text to picture retrieval. This paper also provides a glimpse of how to get accurate picture information in advanced and extended technology searching frameworks. New challenges and design techniques for picture retrieval systems are also suggested in this paper.
Submitted 8 December, 2011;
originally announced December 2011.