-
Handloom Design Generation Using Generative Networks
Authors:
Rajat Kanti Bhattacharjee,
Meghali Nandi,
Amrit Jha,
Gunajit Kalita,
Ferdous Ahmed Barbhuiya
Abstract:
This paper proposes deep learning techniques for generating clothing designs, focusing on handloom fabric, and discusses the associated challenges and applications. The capability of generative neural network models to understand and synthesize artistic designs has not yet been well explored. In this work, multiple methods are employed, incorporating current state-of-the-art generative models and style transfer algorithms, to study and observe their performance on the task. The results are then evaluated through user scores. This work also provides a new dataset, NeuralLoom, for the design generation task.
Submitted 20 May, 2025;
originally announced May 2025.
-
Through the Telco Lens: A Countrywide Empirical Study of Cellular Handovers
Authors:
Michail Kalntis,
José Suárez-Varela,
Jesús Omaña Iglesias,
Anup Kiran Bhattacharjee,
George Iosifidis,
Fernando A. Kuipers,
Andra Lutu
Abstract:
Cellular networks rely on handovers (HOs) as a fundamental element to enable seamless connectivity for mobile users. A comprehensive analysis of HOs can be achieved through data from Mobile Network Operators (MNOs); however, the vast majority of studies employ data from measurement campaigns within confined areas and with limited end-user devices, thereby providing only a partial view of HOs. This paper presents the first countrywide analysis of HO performance, from the perspective of a top-tier MNO in a European country. We collect traffic from approximately 40M users for 4 weeks and study the impact of the radio access technologies (RATs), device types, and manufacturers on HOs across the country. We characterize the geo-temporal dynamics of horizontal (intra-RAT) and vertical (inter-RATs) HOs, at the district level and at millisecond granularity, and leverage open datasets from the country's official census office to associate our findings with the population. We further delve into the frequency, duration, and causes of HO failures, and model them using statistical tools. Our study offers unique insights into mobility management, highlighting the heterogeneity of the network and devices, and their effect on HOs.
Submitted 29 November, 2024;
originally announced November 2024.
-
Hierarchical Clustering using Reversible Binary Cellular Automata for High-Dimensional Data
Authors:
Baby C. J.,
Kamalika Bhattacharjee
Abstract:
This work proposes a hierarchical clustering algorithm for high-dimensional datasets using the cyclic space of reversible finite cellular automata. In cellular automaton (CA) based clustering, if two objects belong to the same cycle, they are closely related and considered part of the same cluster. However, if a high-dimensional dataset is clustered using the cycles of one CA, closely related objects may belong to different cycles. This paper identifies the relationship between objects in two different cycles based on the median of all elements in each cycle so that they can be grouped in the next stage. Further, to minimize the number of intermediate clusters, which in turn reduces the computational cost, a rule selection strategy is adopted to find the best rules based on information propagation and cycle structure. After encoding the dataset using frequency-based encoding such that consecutive data elements maintain a minimum Hamming distance in encoded form, our proposed clustering algorithm iterates over three stages to finally cluster the data elements into the desired number of clusters given by the user. This algorithm can be applied to various fields, including healthcare, sports, chemical research, agriculture, etc. When verified over standard benchmark datasets with various performance metrics, our algorithm is on par with existing algorithms that have quadratic time complexity.
Submitted 5 August, 2024;
originally announced August 2024.
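The cycle-as-cluster idea in the abstract above can be illustrated with a minimal sketch. It assumes an elementary (binary, three-neighbourhood) CA under periodic boundary conditions and uses rule 150 on 4 cells purely as an example of a reversible CA; the function names `step` and `cycle_clusters` are hypothetical, and the paper's encoding, rule selection, and three-stage merging are not reproduced here.

```python
from itertools import product


def step(config, rule):
    """Apply an elementary CA rule once under periodic boundary conditions."""
    n = len(config)
    return tuple(
        (rule >> (4 * config[(i - 1) % n] + 2 * config[i] + config[(i + 1) % n])) & 1
        for i in range(n)
    )


def cycle_clusters(rule, n):
    """Partition all 2**n configurations into cycles.

    Assumption: the CA is reversible for this n, so every configuration
    lies on exactly one cycle and each cycle can serve as one cluster.
    """
    seen, clusters = set(), []
    for start in product((0, 1), repeat=n):
        if start in seen:
            continue
        cycle, c = [], start
        while c not in seen:  # follow the trajectory until it closes
            seen.add(c)
            cycle.append(c)
            c = step(c, rule)
        clusters.append(cycle)
    return clusters


# Illustrative choice: rule 150 on 4 cells.
clusters = cycle_clusters(150, 4)
```

Two encoded objects would fall into the same cluster exactly when one is reachable from the other under repeated application of the rule.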
-
Who should I trust? A Visual Analytics Approach for Comparing Net Load Forecasting Models
Authors:
Kaustav Bhattacharjee,
Soumya Kundu,
Indrasis Chakraborty,
Aritra Dasgupta
Abstract:
Net load forecasting is crucial for energy planning and facilitating informed decision-making regarding trade and load distributions. However, evaluating forecasting models' performance against benchmark models remains challenging, thereby impeding experts' trust in the model's performance. In this context, there is a demand for technological interventions that allow scientists to compare models across various timeframes and solar penetration levels. This paper introduces a visual analytics-based application designed to compare the performance of deep-learning-based net load forecasting models with other models for probabilistic net load forecasting. This application employs carefully selected visual analytic interventions, enabling users to discern differences in model performance across different solar penetration levels, dataset resolutions, and hours of the day over multiple months. We also present observations made using our application through a case study, demonstrating the effectiveness of visualizations in aiding scientists in making informed decisions and enhancing trust in net load forecasting models.
Submitted 30 July, 2024;
originally announced July 2024.
-
Gödel Number based Clustering Algorithm with Decimal First Degree Cellular Automata
Authors:
Vicky Vikrant,
Narodia Parth P,
Kamalika Bhattacharjee
Abstract:
In this paper, a decimal first degree cellular automata (FDCA) based clustering algorithm is proposed where clusters are created based on reachability. Cyclic spaces are created, and configurations which are in the same cycle are treated as the same cluster. Here, real-life data objects are encoded into decimal strings using Gödel number based encoding. The benefit of the scheme is that it reduces the encoded string length while maintaining the features' properties. Candidate CA rules are identified based on theoretical criteria such as self-replication and information flow. An iterative algorithm is developed to generate the desired number of clusters over three stages. The results of the clustering are evaluated based on benchmark clustering metrics such as the Silhouette score, Davies-Bouldin index, Calinski-Harabasz index and Dunn index. In comparison with the existing state-of-the-art clustering algorithms, our proposed algorithm gives better performance.
Submitted 8 May, 2024;
originally announced May 2024.
-
Forte: An Interactive Visual Analytic Tool for Trust-Augmented Net Load Forecasting
Authors:
Kaustav Bhattacharjee,
Soumya Kundu,
Indrasis Chakraborty,
Aritra Dasgupta
Abstract:
Accurate net load forecasting is vital for energy planning, aiding decisions on trade and load distribution. However, assessing the performance of forecasting models across diverse input variables, like temperature and humidity, remains challenging, particularly for eliciting a high degree of trust in the model outcomes. In this context, there is a growing need for data-driven technological interventions to aid scientists in comprehending how models react to both noisy and clean input variables, thus shedding light on complex behaviors and fostering confidence in the outcomes. In this paper, we present Forte, a visual analytics-based application to explore deep probabilistic net load forecasting models across various input variables and understand the error rates for different scenarios. With carefully designed visual interventions, this web-based interface empowers scientists to derive insights about model performance by simulating diverse scenarios, facilitating an informed decision-making process. We discuss observations made using Forte and demonstrate the effectiveness of visualization techniques to provide valuable insights into the correlation between weather inputs and net load forecasts, ultimately advancing grid capabilities by improving trust in forecasting models.
Submitted 10 November, 2023;
originally announced November 2023.
-
Document-Level Supervision for Multi-Aspect Sentiment Analysis Without Fine-grained Labels
Authors:
Kasturi Bhattacharjee,
Rashmi Gangadharaiah
Abstract:
Aspect-based sentiment analysis (ABSA) is a widely studied topic, most often trained through supervision from human annotations of opinionated texts. These fine-grained annotations include identifying aspects towards which a user expresses their sentiment, and their associated polarities (aspect-based sentiments). Such fine-grained annotations can be expensive and often infeasible to obtain in real-world settings. There is, however, an abundance of scenarios where user-generated text contains an overall sentiment, such as a rating of 1-5 in user reviews or user-generated feedback, which may be leveraged for this task. In this paper, we propose a VAE-based topic modeling approach that performs ABSA using document-level supervision and without requiring fine-grained labels for either aspects or sentiments. Our approach allows for the detection of multiple aspects in a document, thereby allowing for the possibility of reasoning about how sentiment expressed through multiple aspects comes together to form an observable overall document-level sentiment. We demonstrate results on two benchmark datasets from two different domains, significantly outperforming a state-of-the-art baseline.
Submitted 10 October, 2023;
originally announced October 2023.
-
TRIVEA: Transparent Ranking Interpretation using Visual Explanation of Black-Box Algorithmic Rankers
Authors:
Jun Yuan,
Kaustav Bhattacharjee,
Akm Zahirul Islam,
Aritra Dasgupta
Abstract:
Ranking schemes drive many real-world decisions, like, where to study, whom to hire, what to buy, etc. Many of these decisions often come with high consequences. For example, a university can be deemed less prestigious if not featured in a top-k list, and consumers might not even explore products that do not get recommended to buyers. At the heart of most of these decisions are opaque ranking schemes, which dictate the ordering of data entities, but their internal logic is inaccessible or proprietary. Drawing inferences about the ranking differences is like a guessing game to the stakeholders, like, the rankees (i.e., the entities who are ranked, like product companies) and the decision-makers (i.e., who use the rankings, like buyers). In this paper, we aim to enable transparency in ranking interpretation by using algorithmic rankers that learn from available data and by enabling human reasoning about the learned ranking differences using explainable AI (XAI) methods. To realize this aim, we leverage the exploration-explanation paradigm of human-data interaction to let human stakeholders explore subsets and groupings of complex multi-attribute ranking data using visual explanations of model fit and attribute influence on rankings. We realize this explanation paradigm for transparent ranking interpretation in TRIVEA, a visual analytic system that is fueled by: i) visualizations of model fit derived from algorithmic rankers that learn the associations between attributes and rankings from available data and ii) visual explanations derived from XAI methods that help abstract important patterns, like, the relative influence of attributes in different ranking ranges. Using TRIVEA, end users not trained in data science have the agency to transparently reason about the global and local behavior of the rankings without the need to open black-box ranking models and develop confidence in the resulting attribute-based inferences. 
We demonstrate the efficacy of TRIVEA using multiple usage scenarios and subjective feedback from researchers with diverse domain expertise.
Keywords: Visual Analytics, Learning-to-Rank, Explainable ML, Ranking
Submitted 28 August, 2023;
originally announced August 2023.
-
Power to the Data Defenders: Human-Centered Disclosure Risk Calibration of Open Data
Authors:
Kaustav Bhattacharjee,
Aritra Dasgupta
Abstract:
The open data ecosystem is susceptible to vulnerabilities due to disclosure risks. Though the datasets are anonymized during release, the prevalence of the release-and-forget model makes the data defenders blind to privacy issues arising after the dataset release. One such issue can be the disclosure risks in the presence of newly released datasets which may compromise the privacy of the data subjects of the anonymous open datasets. In this paper, we first examine some of these pitfalls through the examples we observed during a red teaming exercise and then envision other possible vulnerabilities in this context. We also discuss proactive risk monitoring, including developing a collection of highly susceptible open datasets and a visual analytic workflow that empowers data defenders towards undertaking dynamic risk calibration strategies.
Submitted 21 April, 2023;
originally announced April 2023.
-
PRIVEE: A Visual Analytic Workflow for Proactive Privacy Risk Inspection of Open Data
Authors:
Kaustav Bhattacharjee,
Akm Islam,
Jaideep Vaidya,
Aritra Dasgupta
Abstract:
Open data sets that contain personal information are susceptible to adversarial attacks even when anonymized. By performing low-cost joins on multiple datasets with shared attributes, malicious users of open data portals might get access to information that violates individuals' privacy. However, open data sets are primarily published using a release-and-forget model, whereby data owners and custodians have little to no cognizance of these privacy risks. We address this critical gap by developing a visual analytic solution that enables data defenders to gain awareness about the disclosure risks in local, joinable data neighborhoods. The solution is derived through a design study with data privacy researchers, where we initially play the role of a red team and engage in an ethical data hacking exercise based on privacy attack scenarios. We use this problem and domain characterization to develop a set of visual analytic interventions as a defense mechanism and realize them in PRIVEE, a visual risk inspection workflow that acts as a proactive monitor for data defenders. PRIVEE uses a combination of risk scores and associated interactive visualizations to let data defenders explore vulnerable joins and interpret risks at multiple levels of data granularity. We demonstrate how PRIVEE can help emulate the attack strategies and diagnose disclosure risks through two case studies with data privacy experts.
Submitted 12 August, 2022;
originally announced August 2022.
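The "low-cost join on shared attributes" risk described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the two toy datasets, the attribute names, and the function `unique_joins` are all made up here, and the sketch does not reproduce PRIVEE's risk scores or workflow. It flags a record as exposed when exactly one record in the other dataset shares its values on the join attributes.

```python
def unique_joins(left, right, keys):
    """Return merged records where a left row matches exactly one right row."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    linked = []
    for row in left:
        matches = index.get(tuple(row[k] for k in keys), [])
        if len(matches) == 1:  # a unique match links the two records
            linked.append({**row, **matches[0]})
    return linked


# Hypothetical toy data: an "anonymized" dataset and a second open dataset.
health = [
    {"zip": "07101", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "07102", "birth_year": 1975, "sex": "M", "diagnosis": "flu"},
]
registry = [
    {"zip": "07101", "birth_year": 1980, "sex": "F", "name": "Resident A"},
    {"zip": "07102", "birth_year": 1975, "sex": "M", "name": "Resident B"},
    {"zip": "07102", "birth_year": 1975, "sex": "M", "name": "Resident C"},
]

exposed = unique_joins(health, registry, keys=("zip", "birth_year", "sex"))
```

In this toy case only the first health record is linked, because the second one matches two registry rows and so remains ambiguous.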
-
Affinity Classification Problem by Stochastic Cellular Automata
Authors:
Kamalika Bhattacharjee,
Subrata Paul,
Sukanta Das
Abstract:
This work introduces a new problem, named the affinity classification problem, which is a generalization of the density classification problem. To solve this problem, we introduce temporally stochastic cellular automata, where two rules are stochastically applied in each step on all cells of the automata. Our model is defined on a 2-dimensional grid having affection capability. We show that this model can be used in several applications, such as modeling self-healing systems.
Submitted 12 July, 2022;
originally announced July 2022.
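The notion of a temporally stochastic CA in the abstract above, where one of two rules is chosen at random at each time step and applied to all cells, can be sketched as below. This is a simplified one-dimensional illustration, not the paper's 2-dimensional model: the elementary rule numbers, the probability `p`, and the names `step` and `stochastic_run` are assumptions made for the example.

```python
import random


def step(config, rule):
    """Apply an elementary CA rule once under periodic boundary conditions."""
    n = len(config)
    return tuple(
        (rule >> (4 * config[(i - 1) % n] + 2 * config[i] + config[(i + 1) % n])) & 1
        for i in range(n)
    )


def stochastic_run(config, f, g, p, steps, rng):
    """At every time step apply rule f with probability p, otherwise rule g.

    The chosen rule is applied to all cells simultaneously, so the
    randomness is in time only, not in space.
    """
    for _ in range(steps):
        rule = f if rng.random() < p else g
        config = step(config, rule)
    return config


rng = random.Random(0)  # fixed seed so a run can be repeated
start = (1, 0, 0, 1, 0)
final = stochastic_run(start, f=150, g=204, p=0.5, steps=10, rng=rng)
```

Setting `p` to 1.0 or 0.0 recovers the ordinary deterministic CA for rule `f` or rule `g` respectively.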
-
Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction
Authors:
Elsbeth Turcan,
Shuai Wang,
Rishita Anubhai,
Kasturi Bhattacharjee,
Yaser Al-Onaizan,
Smaranda Muresan
Abstract:
Detecting what emotions are expressed in text is a well-studied problem in natural language processing. However, research on finer grained emotion analysis such as what causes an emotion is still in its infancy. We present solutions that tackle both emotion recognition and emotion cause detection in a joint fashion. Considering that common-sense knowledge plays an important role in understanding implicitly expressed emotions and the reasons for those emotions, we propose novel methods that combine common-sense knowledge via adapted knowledge models with multi-task learning to perform joint emotion classification and emotion cause tagging. We show performance improvement on both tasks when including common-sense reasoning and a multitask framework. We provide a thorough analysis to gain insights into model performance.
Submitted 17 June, 2021;
originally announced June 2021.
-
HIT: A Hierarchically Fused Deep Attention Network for Robust Code-mixed Language Representation
Authors:
Ayan Sengupta,
Sourabh Kumar Bhattacharjee,
Tanmoy Chakraborty,
Md Shad Akhtar
Abstract:
Understanding the linguistics and morphology of resource-scarce code-mixed texts remains a key challenge in text processing. Although word embeddings come in handy to support downstream tasks for low-resource languages, there is ample scope for improving the quality of language representation, particularly for code-mixed languages. In this paper, we propose HIT, a robust representation learning method for code-mixed texts. HIT is a hierarchical transformer-based framework that captures the semantic relationship among words and hierarchically learns the sentence-level semantics using a fused attention mechanism. HIT incorporates two attention modules, a multi-headed self-attention and an outer product attention module, and computes their weighted sum to obtain the attention weights. Our evaluation of HIT on one European (Spanish) and five Indic (Hindi, Bengali, Tamil, Telugu, and Malayalam) languages across four NLP tasks on eleven datasets suggests significant performance improvement against various state-of-the-art systems. We further show the adaptability of the learned representation across tasks in a transfer learning setup (with and without fine-tuning).
Submitted 30 May, 2021;
originally announced May 2021.
-
To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
Authors:
Kasturi Bhattacharjee,
Miguel Ballesteros,
Rishita Anubhai,
Smaranda Muresan,
Jie Ma,
Faisal Ladhak,
Yaser Al-Onaizan
Abstract:
Leveraging large amounts of unlabeled data using Transformer-like architectures, like BERT, has gained popularity in recent times owing to their effectiveness in learning general representations that can then be further fine-tuned for downstream tasks with much success. However, training these models can be costly from both an economic and an environmental standpoint. In this work, we investigate how to effectively use unlabeled data by exploring the task-specific semi-supervised approach, Cross-View Training (CVT), and comparing it with task-agnostic BERT in multiple settings that include domain- and task-relevant English data. CVT uses a much lighter model architecture, and we show that it achieves similar performance to BERT on a set of sequence tagging tasks, with lower financial and environmental impact.
Submitted 27 October, 2020;
originally announced October 2020.
-
Cellular Automata: Reversibility, Semi-reversibility and Randomness
Authors:
Kamalika Bhattacharjee
Abstract:
In this dissertation, we study two of the global properties of 1-dimensional cellular automata (CAs) under periodic boundary condition, namely, reversibility and randomness. To address reversibility of finite CAs, we develop a mathematical tool, named reachability tree, which can efficiently characterize those CAs. A decision algorithm is proposed using a minimized reachability tree, which takes a CA rule and size n as input and verifies whether the CA is reversible for that n. To decide reversibility of a finite CA, we need to know both the rule and the CA size. However, for infinite CAs, reversibility is decided based on the local rule only. Therefore, these two cases appear to diverge. This dissertation aims to construct a bridge between these two cases. To do so, reversibility of CAs is redefined and the notion of semi-reversible CAs is introduced. Hence, we propose a new classification of finite CAs: (1) reversible CAs, (2) semi-reversible CAs and (3) strictly irreversible CAs. Finally, the relation between reversibility of finite and infinite CAs is established. This dissertation also explores CAs as a source of randomness and builds pseudo-random number generators (PRNGs) based on CAs. We identify a list of properties for a CA to be a good source of randomness. Two heuristic algorithms are proposed to synthesize candidate (decimal) CAs which have great potential as PRNGs. Two schemes are developed to use these CAs as window-based PRNGs: (1) as decimal number generators and (2) as binary number generators. We empirically observe that, in comparison to the best PRNG SFMT19937-64, the average performance of our proposed PRNGs is slightly better. Hence, our decimal CA based PRNGs are among the best PRNGs today.
Submitted 8 November, 2019;
originally announced November 2019.
-
Neural Word Decomposition Models for Abusive Language Detection
Authors:
Sravan Babu Bodapati,
Spandana Gella,
Kasturi Bhattacharjee,
Yaser Al-Onaizan
Abstract:
User generated text on social media often suffers from a lot of undesired characteristics including hatespeech, abusive language, insults etc. that are targeted to attack or abuse a specific group of people. Often such text is written differently compared to traditional text such as news, involving either explicit mention of abusive words, obfuscated words and typological errors, or implicit abuse, i.e., indicating or targeting negative stereotypes. Thus, processing this text poses several robustness challenges when we apply natural language processing techniques developed for traditional text. For example, using word or token based models to process such text can treat two spelling variants of a word as two different words. Following recent work, we analyze how character, subword and byte pair encoding (BPE) models can help address some of the challenges posed by user generated text. In our work, we analyze the effectiveness of each of the above techniques, and compare and contrast various word decomposition techniques when used in combination with others. We experiment with finetuning large pretrained language models, and demonstrate their robustness to domain shift by studying Wikipedia attack, toxicity and Twitter hatespeech datasets.
Submitted 2 October, 2019;
originally announced October 2019.
-
On Finite $1$-Dimensional Cellular Automata: Reversibility and Semi-reversibility
Authors:
Kamalika Bhattacharjee,
Sukanta Das
Abstract:
Reversibility of a one-dimensional finite cellular automaton (CA) is dependent on lattice size. A finite CA can be reversible for a set of lattice sizes. On the other hand, reversibility of an infinite CA, which is decided by exploring the rule only, is different in kind from that of a finite CA. Can we, however, link the reversibility of finite CAs to that of infinite CAs? In order to address this issue, we introduce a new notion, named semi-reversibility. We classify the CAs into three types with respect to the reversibility property -- reversible, semi-reversible and strictly irreversible. A tool, the reachability tree, has been used to decide the reversibility class of any CA. Finally, the relation among the existing cases of reversibility is established.
Submitted 14 March, 2019;
originally announced March 2019.
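The dependence of reversibility on lattice size noted in the abstract above can be checked by brute force for small lattices. The sketch below is not the paper's reachability-tree method; it simply tests whether the global map of an elementary CA under periodic boundary conditions is a bijection, and the names `step` and `is_reversible` are hypothetical. Rule 150 (the XOR of a cell and its two neighbours) is used as the example because it is reversible exactly when the lattice size is not a multiple of 3.

```python
from itertools import product


def step(config, rule):
    """Apply an elementary CA rule once under periodic boundary conditions."""
    n = len(config)
    return tuple(
        (rule >> (4 * config[(i - 1) % n] + 2 * config[i] + config[(i + 1) % n])) & 1
        for i in range(n)
    )


def is_reversible(rule, n):
    """Brute force: the CA is reversible for size n iff its global map is a bijection."""
    images = {step(c, rule) for c in product((0, 1), repeat=n)}
    return len(images) == 2 ** n


# Reversibility of rule 150 for lattice sizes 3..8.
by_size = {n: is_reversible(150, n) for n in range(3, 9)}
```

The cost is exponential in `n`, so this is only suitable for sanity checks on small lattices.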
-
Graph based Question Answering System
Authors:
Piyush Mital,
Saurabh Agarwal,
Bhargavi Neti,
Yashodhara Haribhakta,
Vibhavari Kamble,
Krishnanjan Bhattacharjee,
Debashri Das,
Swati Mehta,
Ajai Kumar
Abstract:
In today's digital age, in the dawning era of big data analytics, it is not the information but the linking of information through entities and actions which defines the discourse. Any textual data, whether available on the Internet or off-line (like newspaper data, Wikipedia dump, etc.), is basically connected information which cannot be treated in isolation for its full semantics. There is a need for an automated retrieval process with proper information extraction to structure the data for relevant and fast text analytics. The first big challenge is the conversion of unstructured textual data to structured data. Unlike other databases, graph databases handle relationships and connections elegantly. Our project aims at developing a graph-based information extraction and retrieval system.
Submitted 5 December, 2018;
originally announced December 2018.
-
A Search for Good Pseudo-random Number Generators : Survey and Empirical Studies
Authors:
Kamalika Bhattacharjee,
Krishnendu Maity,
Sukanta Das
Abstract:
In today's world, several applications demand numbers which appear random but are generated by a background algorithm; that is, pseudo-random numbers. Since the late $19^{th}$ century, researchers have been working on pseudo-random number generators (PRNGs). Several PRNGs continue to be developed, each one claiming to be better than the previous ones. In this scenario, this paper aims to verify the claims of so-called good generators and rank the existing generators based on strong empirical tests on the same platform. To do this, the genre of PRNGs developed so far has been explored and classified into three groups -- linear congruential generator based, linear feedback shift register based and cellular automata based. From each group, well-known generators have been chosen for empirical testing. Two types of empirical testing have been done on each PRNG -- blind statistical tests with the Diehard battery of tests, the TestU01 library and the NIST statistical test-suite, and graphical tests (lattice test and space-time diagram test). Finally, the selected $29$ PRNGs are divided into $24$ groups and are ranked according to their overall performance in all empirical tests.
Submitted 3 November, 2018;
originally announced November 2018.
-
A Survey of Cellular Automata: Types, Dynamics, Non-uniformity and Applications
Authors:
Kamalika Bhattacharjee,
Nazma Naskar,
Souvik Roy,
Sukanta Das
Abstract:
Cellular automata (CAs) are dynamical systems which exhibit complex global behavior from simple local interaction and computation. Since the inception of the cellular automaton (CA) by von Neumann in the 1950s, it has attracted the attention of researchers from various backgrounds and fields for modelling different physical, natural as well as real-life phenomena. Classically, CAs are uniform. However, non-uniformity has also been introduced in update pattern, lattice structure, neighborhood dependency and local rule. In this survey, we tour the various types of CAs introduced to date, the different characterization tools, and the global behaviors of CAs, like universality, reversibility, dynamics etc. Special attention is given to non-uniformity in CAs and especially to non-uniform elementary CAs, which have been very useful in solving several real-life problems.
Submitted 8 May, 2018; v1 submitted 8 July, 2016;
originally announced July 2016.
-
Stochastic kinetics reveal imperative role of anisotropic interfacial tension to determine morphology and evolution of nucleated droplets in nematogenic films
Authors:
Amit Kumar Bhattacharjee
Abstract:
For isotropic fluids, classical nucleation theory predicts the nucleation rate, barrier height and critical droplet size by accounting for the competition between bulk energy and interfacial tension. The nucleation process in liquid crystals is less understood. We numerically investigate nucleation in monolayered nematogenic films using a mesoscopic framework; in particular, we study the morphology and kinetic pathway in spontaneous formation and growth of droplets of the stable phase in the metastable background. The parameter $κ$ that quantifies the anisotropic elastic energy plays a central role in determining the geometric structure of the droplets. Noncircular nematic droplets with homogeneous director orientation are nucleated in a background of supercooled isotropic phase for small $κ$. For large $κ$, noncircular droplets with integer topological charge, accompanied by a biaxial ring at the outer surface, are nucleated. The isotropic droplet shape in a superheated nematic background is found to depend on $κ$ in a similar way. Identical growth laws are found in the two cases, although an unusual two-stage mechanism is observed in the nucleation of isotropic droplets. Temporal distributions of successive events indicate the relevance of long-ranged elasticity-mediated interactions within the isotropic domains. Implications for a theoretical description of nucleation in anisotropic fluids are discussed.
Submitted 30 July, 2017; v1 submitted 17 March, 2016;
originally announced March 2016.
-
Tag Me Maybe: Perceptions of Public Targeted Sharing on Facebook
Authors:
Saiph Savage,
Andres Monroy-Hernandez,
Kasturi Bhattacharjee,
Tobias Hollerer
Abstract:
Social network sites allow users to publicly tag people in their posts. These tagged posts allow users to share to both the general public and a targeted audience, dynamically assembled via notifications that alert the people mentioned. We investigate people's perceptions of this mixed sharing mode through a qualitative study with 120 participants. We found that individuals like this sharing modality as they believe it strengthens their relationships. Individuals also report using tags to have more control of Facebook's ranking algorithm, and to expose one another to novel information and people. This work helps us understand people's complex relationships with the algorithms that mediate their interactions with one another. We conclude by discussing the design implications of these findings.
Submitted 3 September, 2015;
originally announced September 2015.
-
Reversibility of d-State Finite Cellular Automata
Authors:
Kamalika Bhattacharjee,
Sukanta Das
Abstract:
This paper investigates reversibility properties of 1-dimensional 3-neighborhood d-state finite cellular automata (CAs) of length n under periodic boundary condition. A tool named reachability tree has been developed from de Bruijn graph which represents all possible reachable configurations of an n-cell CA. This tool has been used to test reversibility of CAs. We have identified a large set of reversible CAs using this tool by following some greedy strategies.
Submitted 8 May, 2018; v1 submitted 4 February, 2015;
originally announced February 2015.