Search | arXiv e-print repository

Spectral Clustering for Crowdsourcing with Inherently Distinct Task Types

Authors: Saptarshi Mandal, Seo Taek Kong, Dimitrios Katselis, R. Srikant

Abstract: The Dawid-Skene model is the most widely assumed model in the analysis of crowdsourcing algorithms that estimate ground-truth labels from noisy worker responses. In this work, we are motivated by crowdsourcing applications where workers have distinct skill sets and their accuracy additionally depends on a task's type. While weighted majority vote (WMV) with a single weight vector for each worker achieves the optimal label estimation error in the Dawid-Skene model, we show that different weights for different types are necessary for a multi-type model. Focusing on the case where there are two types of tasks, we propose a spectral method to partition tasks into two groups that cluster tasks by type. Our analysis reveals that task types can be perfectly recovered if the number of workers $n$ scales logarithmically with the number of tasks $d$. Any algorithm designed for the Dawid-Skene model can then be applied independently to each type to infer the labels. Numerical experiments show how clustering tasks by type before estimating ground-truth labels enhances the performance of crowdsourcing algorithms in practical applications.

Submitted 9 August, 2024; v1 submitted 14 February, 2023; originally announced February 2023.

arXiv:2207.09054 [pdf, other]

doi 10.1109/TAP.2019.2938704

Towards a Low-SWaP 1024-beam Digital Array: A 32-beam Sub-system at 5.8 GHz

Authors: Arjuna Madanayake, Viduneth Ariyarathna, Suresh Madishetty, Sravan Pulipati, R. J. Cintra, Diego Coelho, Raíza Oliveira, Fábio M. Bayer, Leonid Belostotski, Soumyajit Mandal, Theodore S. Rappaport

Abstract: Millimeter wave communications require multibeam beamforming in order to utilize wireless channels that suffer from obstructions, path loss, and multi-path effects. Digital multibeam beamforming has maximum degrees of freedom compared to analog phased arrays. However, circuit complexity and power consumption are important constraints for digital multibeam systems. A low-complexity digital computing architecture is proposed for a multiplication-free 32-point linear transform that approximates multiple simultaneous RF beams similar to a discrete Fourier transform (DFT). Arithmetic complexity due to multiplication is reduced from the FFT complexity of $\mathcal{O}(N\: \log N)$ for DFT realizations, down to zero, thus yielding a 46% and 55% reduction in chip area and dynamic power consumption, respectively, for the $N=32$ case considered. The paper describes the proposed 32-point DFT approximation targeting a 1024-beams using a 2D array, and shows the multiplierless approximation and its mapping to a 32-beam sub-system consisting of 5.8 GHz antennas that can be used for generating 1024 digital beams without multiplications. Real-time beam computation is achieved using a Xilinx FPGA at 120 MHz bandwidth per beam. Theoretical beam performance is compared with measured RF patterns from both a fixed-point FFT as well as the proposed multiplier-free algorithm and are in good agreement.

Submitted 29 May, 2024; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: 22 pages, 8 figures, 3 tables. Fixed typo in Table 1

Journal ref: IEEE Transactions on Antennas and Propagation, v. 68, n. 2, Feb. 2020

arXiv:2207.05866 [pdf, ps, other]

doi 10.1109/ACCESS.2020.2994550

Fast Radix-32 Approximate DFTs for 1024-Beam Digital RF Beamforming

Authors: A. Madanayake, R. J. Cintra, N. Akram, V. Ariyarathna, S. Mandal, V. A. Coutinho, F. M. Bayer, D. Coelho, T. S. Rappaport

Abstract: The discrete Fourier transform (DFT) is widely employed for multi-beam digital beamforming. The DFT can be efficiently implemented through the use of fast Fourier transform (FFT) algorithms, thus reducing chip area, power consumption, processing time, and consumption of other hardware resources. This paper proposes three new hybrid DFT 1024-point DFT approximations and their respective fast algorithms. These approximate DFT (ADFT) algorithms have significantly reduced circuit complexity and power consumption compared to traditional FFT approaches while trading off a subtle loss in computational precision which is acceptable for digital beamforming applications in RF antenna implementations. ADFT algorithms have not been introduced for beamforming beyond $N = 32$, but this paper anticipates the need for massively large adaptive arrays for future 5G and 6G systems. Digital CMOS circuit designs for the ADFTs show the resulting improvements in both circuit complexity and power consumption metrics. Simulation results show similar or lower critical path delay with up to 48.5% lower chip area compared to a standard Cooley-Tukey FFT. The time-area and dynamic power metrics are reduced up to 66.0%. The 1024-point ADFT beamformers produce signal-to-noise ratio (SNR) gains between 29.2--30.1 dB, which is a loss of $\le$ 0.9 dB SNR gain compared to exact 1024-point DFT beamformers (worst case) realizable at using an FFT.

Submitted 12 July, 2022; originally announced July 2022.

Comments: 21 pages, 8 figures, 5 tables. The factorization shown in Section 2 is fixed in this version

Journal ref: IEEE Access, vol. 8, 2020

arXiv:2203.13210 [pdf, other]

doi 10.1177/09622802221106720

A comparison of two frameworks for multi-state modelling, applied to outcomes after hospital admissions with COVID-19

Authors: Christopher Jackson, Brian Tom, Peter Kirwan, Sema Mandal, Shaun Seaman, Kevin Kunzmann, Anne Presanis, Daniela De Angelis

Abstract: We compare two multi-state modelling frameworks that can be used to represent dates of events following hospital admission for people infected during an epidemic. The methods are applied to data from people admitted to hospital with COVID-19, to estimate the probability of admission to ICU, the probability of death in hospital for patients before and after ICU admission, the lengths of stay in hospital, and how all these vary with age and gender. One modelling framework is based on defining transition-specific hazard functions for competing risks. A less commonly used framework defines partially-latent subpopulations who will experience each subsequent event, and uses a mixture model to estimate the probability that an individual will experience each event, and the distribution of the time to the event given that it occurs. We compare the advantages and disadvantages of these two frameworks, in the context of the COVID-19 example. The issues include the interpretation of the model parameters, the computational efficiency of estimating the quantities of interest, implementation in software and assessing goodness of fit. In the example, we find that some groups appear to be at very low risk of some events, in particular ICU admission, and these are best represented by using "cure-rate" models to define transition-specific hazards. We provide general-purpose software to implement all the models we describe in the "flexsurv" R package, which allows arbitrarily-flexible distributions to be used to represent the cause-specific hazards or times to events.

Submitted 24 March, 2022; originally announced March 2022.

Journal ref: Statistical Methods in Medical Research (2022)

arXiv:2112.10661 [pdf, other]

doi 10.1038/s41467-022-32458-y

Trends in COVID-19 hospital outcomes in England before and after vaccine introduction, a cohort study

Authors: Peter Kirwan, Andre Charlett, Paul Birrell, Suzanne Elgohari, Russell Hope, Sema Mandal, Daniela De Angelis, Anne Presanis

Abstract: Widespread vaccination campaigns have changed the landscape for COVID-19, vastly altering symptoms and reducing morbidity and mortality. We estimate trends in mortality by month of admission and vaccination status among those hospitalised with COVID-19 in England between March 2020 to September 2021, controlling for demographic factors and hospital load. Among 259,727 hospitalised COVID-19 cases, 51,948 (20.0%) experienced mortality in hospital. Hospitalised fatality risk ranged from 40.3% (95% confidence interval 39.4-41.3%) in March 2020 to 8.1% (7.2-9.0%) in June 2021. Older individuals and those with multiple co-morbidities were more likely to die or else experienced longer stays prior to discharge. Compared to unvaccinated people, the hazard of hospitalised mortality was 0.71 (0.67-0.77) with a first vaccine dose, and 0.56 (0.52-0.61) with a second vaccine dose. Compared to hospital load at 0-20% of the busiest week, the hazard of hospitalised mortality during periods of peak load (90-100%), was 1.23 (1.12-1.34). The prognosis for people hospitalised with COVID-19 in England has varied substantially throughout the pandemic and according to case-mix, vaccination, and hospital load. Our estimates provide an indication for demands on hospital resources, and the relationship between hospital burden and outcomes.

Submitted 3 August, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

arXiv:2103.04867 [pdf]

Trends in risks of severe events and lengths of stay for COVID-19 hospitalisations in England over the pre-vaccination era: results from the Public Health England SARI-Watch surveillance scheme

Authors: Peter D. Kirwan, Suzanne Elgohari, Christopher H. Jackson, Brian D. M. Tom, Sema Mandal, Daniela De Angelis, Anne M. Presanis

Abstract: Background: Trends in hospitalised case-fatality risk (HFR), risk of intensive care unit (ICU) admission and lengths of stay for patients hospitalised for COVID-19 in England over the pre-vaccination era are unknown. Methods: Data on hospital and ICU admissions with COVID-19 at 31 NHS trusts in England were collected by Public Health England's Severe Acute Respiratory Infections surveillance system and linked to death information. We applied parametric multi-state mixture models, accounting for censored outcomes and regressing risks and times between events on month of admission, geography, and baseline characteristics. Findings: 20,785 adults were admitted with COVID-19 in 2020. Between March and June/July/August estimated HFR reduced from 31.9% (95% confidence interval 30.3-33.5%) to 10.9% (9.4-12.7%), then rose steadily from 21.6% (18.4-25.5%) in September to 25.7% (23.0-29.2%) in December, with steeper increases among older patients, those with multi-morbidity and outside London/South of England. ICU admission risk reduced from 13.9% (12.8-15.2%) in March to 6.2% (5.3-7.1%) in May, rising to a high of 14.2% (11.1-17.2%) in September. Median length of stay in non-critical care increased during 2020, from 6.6 to 12.3 days for those dying, and from 6.1 to 9.3 days for those discharged. Interpretation: Initial improvements in patient outcomes, corresponding to developments in clinical practice, were not sustained throughout 2020, with HFR in December approaching the levels seen at the start of the pandemic, whilst median hospital stays have lengthened. The role of increased transmission, new variants, case-mix and hospital pressures in increasing COVID-19 severity requires urgent further investigation.

Submitted 22 March, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

Comments: 45 pages, 12 figures

arXiv:2010.07204 [pdf, other]

Incorporating survival data into case-control studies with incident and prevalent cases

Authors: Soutrik Mandal, Jing Qin, Ruth M. Pfeiffer

Abstract: Typically, case-control studies to estimate odds-ratios associating risk factors with disease incidence from logistic regression only include cases with newly diagnosed disease. Recently proposed methods allow incorporating information on prevalent cases, individuals who survived from disease diagnosis to sampling, into cross-sectionally sampled case-control studies under parametric assumptions for the survival time after diagnosis. Here we propose and study methods to additionally use prospectively observed survival times from prevalent and incident cases to adjust logistic models for the time between disease diagnosis and sampling, the backward time, for prevalent cases. This adjustment yields unbiased odds-ratio estimates from case-control studies that include prevalent cases. We propose a computationally simple two-step generalized method-of-moments estimation procedure. First, we estimate the survival distribution based on a semi-parametric Cox model using an expectation-maximization algorithm that yields fully efficient estimates and accommodates left truncation for the prevalent cases and right censoring. Then, we use the estimated survival distribution in an extension of the logistic model to three groups (controls, incident and prevalent cases), to accommodate the survival bias in prevalent cases. In simulations, when the amount of censoring was modest, odds-ratios from the two-step procedure were equally efficient as those estimated by jointly optimizing the logistic and survival data likelihoods under parametric assumptions. Even with 90% censoring they were as efficient as estimates obtained using only cross-sectionally available information under parametric assumptions. This indicates that utilizing prospective survival data from the cases lessens model dependency and improves precision of association estimates for case-control studies with prevalent cases.

Submitted 16 October, 2020; v1 submitted 14 October, 2020; originally announced October 2020.

arXiv:1908.08783 [pdf, other]

Learning Fitness Functions for Machine Programming

Authors: Shantanu Mandal, Todd A. Anderson, Javier S. Turek, Justin Gottschlich, Shengtian Zhou, Abdullah Muzahid

Abstract: The problem of automatic software generation is known as Machine Programming. In this work, we propose a framework based on genetic algorithms to solve this problem. Although genetic algorithms have been used successfully for many problems, one criticism is that hand-crafting its fitness function, the test that aims to effectively guide its evolution, can be notably challenging. Our framework presents a novel approach to learn the fitness function using neural networks to predict values of ideal fitness functions. We also augment the evolutionary process with a minimally intrusive search heuristic. This heuristic improves the framework's ability to discover correct programs from ones that are approximately correct and does so with negligible computational overhead. We compare our approach with several state-of-the-art program synthesis methods and demonstrate that it finds more correct programs with fewer candidate program generations.

Submitted 23 January, 2021; v1 submitted 22 August, 2019; originally announced August 2019.

Journal ref: Proceedings of Machine Learning and Systems (MLSys), 3 (2021), 139-155

arXiv:1812.11118 [pdf, other]

doi 10.1073/pnas.1903070116

Reconciling modern machine learning practice and the bias-variance trade-off

Authors: Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal

Abstract: Breakthroughs in machine learning are rapidly changing science and society, yet our fundamental understanding of this technology has lagged far behind. Indeed, one of the central tenets of the field, the bias-variance trade-off, appears to be at odds with the observed behavior of methods used in the modern machine learning practice. The bias-variance trade-off implies that a model should balance under-fitting and over-fitting: rich enough to express underlying structure in data, simple enough to avoid fitting spurious patterns. However, in the modern practice, very rich models such as neural networks are trained to exactly fit (i.e., interpolate) the data. Classically, such models would be considered over-fit, and yet they often obtain high accuracy on test data. This apparent contradiction has raised questions about the mathematical foundations of machine learning and their relevance to practitioners. In this paper, we reconcile the classical understanding and the modern practice within a unified performance curve. This "double descent" curve subsumes the textbook U-shaped bias-variance trade-off curve by showing how increasing model capacity beyond the point of interpolation results in improved performance. We provide evidence for the existence and ubiquity of double descent for a wide spectrum of models and datasets, and we posit a mechanism for its emergence. This connection between the performance and the structure of machine learning models delineates the limits of classical analyses, and has implications for both the theory and practice of machine learning.

Submitted 10 September, 2019; v1 submitted 28 December, 2018; originally announced December 2018.

arXiv:1810.12770 [pdf, ps, other]

Explicit Feedbacks Meet with Implicit Feedbacks : A Combined Approach for Recommendation System

Authors: Supriyo Mandal, Abyayananda Maiti

Abstract: Recommender systems recommend items more accurately by analyzing users' potential interest on different brands' items. In conjunction with users' rating similarity, the presence of users' implicit feedbacks like clicking items, viewing items specifications, watching videos etc. have been proved to be helpful for learning users' embedding, that helps better rating prediction of users. Most existing recommender systems focus on modeling of ratings and implicit feedbacks ignoring users' explicit feedbacks. Explicit feedbacks can be used to validate the reliability of the particular users and can be used to learn about the users' characteristic. Users' characteristic mean what type of reviewers they are. In this paper, we explore three different models for recommendation with more accuracy focusing on users' explicit feedbacks and implicit feedbacks. First one is RHC-PMF that predicts users' rating more accurately based on user's three explicit feedbacks (rating, helpfulness score and centrality) and second one is RV-PMF, where user's implicit feedback (view relationship) is considered. Last one is RHCV-PMF, where both type of feedbacks are considered. In this model users' explicit feedbacks' similarity indicate the similarity of their reliability and characteristic and implicit feedback's similarity indicates their preference similarity. Extensive experiments on real world dataset, i.e. Amazon.com online review dataset shows that our models perform better compare to base-line models in term of users' rating prediction. RHCV-PMF model also performs better rating prediction compare to baseline models for cold start users and cold start items.

Submitted 29 October, 2018; originally announced October 2018.

Comments: 12 pages. Accepted in Complex Networks, 2018

arXiv:1802.01396 [pdf, other]

To understand deep learning we need to understand kernel learning

Authors: Mikhail Belkin, Siyuan Ma, Soumik Mandal

Abstract: Generalization performance of classifiers in deep learning has recently become a subject of intense study. Deep models, typically over-parametrized, tend to fit the training data exactly. Despite this "overfitting", they perform well on test data, a phenomenon not yet fully understood. The first point of our paper is that strong performance of overfitted classifiers is not a unique feature of deep learning. Using six real-world and two synthetic datasets, we establish experimentally that kernel machines trained to have zero classification or near zero regression error perform very well on test data, even when the labels are corrupted with a high level of noise. We proceed to give a lower bound on the norm of zero loss solutions for smooth kernels, showing that they increase nearly exponentially with data size. We point out that this is difficult to reconcile with the existing generalization bounds. Moreover, none of the bounds produce non-trivial results for interpolating solutions. Second, we show experimentally that (non-smooth) Laplacian kernels easily fit random labels, a finding that parallels results for ReLU neural networks. In contrast, fitting noisy data requires many more epochs for smooth Gaussian kernels. Similar performance of overfitted Laplacian and Gaussian classifiers on test, suggests that generalization is tied to the properties of the kernel function rather than the optimization process. Certain key phenomena of deep learning are manifested similarly in kernel methods in the modern "overfitted" regime. The combination of the experimental and theoretical results presented in this paper indicates a need for new theoretical ideas for understanding properties of classical kernel methods. We argue that progress on understanding deep learning will be difficult until more tractable "shallow" kernel methods are better understood.

Submitted 14 June, 2018; v1 submitted 5 February, 2018; originally announced February 2018.

Showing 1–11 of 11 results for author: Mandal, S