-
Classification of integers based on residue classes via modern deep learning algorithms
Authors:
Da Wu,
Jingye Yang,
Mian Umair Ahsan,
Kai Wang
Abstract:
Judging whether an integer can be divided by prime numbers such as 2 or 3 may appear trivial to human beings, but can be less straightforward for computers. Here, we tested multiple deep learning architectures and feature engineering approaches on classifying integers based on their residues when divided by small prime numbers. We found that the ability of classification critically depends on the feature space. We also evaluated Automated Machine Learning (AutoML) platforms from Amazon, Google and Microsoft, and found that they failed on this task without appropriately engineered features. Furthermore, we introduced a method that utilizes linear regression on Fourier series basis vectors, and demonstrated its effectiveness. Finally, we evaluated Large Language Models (LLMs) such as GPT-4, GPT-J, LLaMA and Falcon, and demonstrated their failures. In conclusion, feature engineering remains an important task to improve performance and increase interpretability of machine-learning models, even in the era of AutoML and LLMs.
Submitted 8 September, 2023; v1 submitted 3 April, 2023;
originally announced April 2023.
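The abstract above mentions linear regression on Fourier series basis vectors. The following is a minimal sketch of that idea, not the authors' implementation: it assumes NumPy, a modulus `p = 7`, and hypothetical helper names (`fourier_features`, `fit_residue_classifier`, `predict_residue`). Each integer n is mapped to cos(2πkn/p) and sin(2πkn/p) for k = 0..p-1, and a least-squares fit regresses one-hot residue labels on those features. Because the indicator of each residue class is an exact linear combination of these basis functions, the fit generalizes to integers outside the training range.

```python
import numpy as np

def fourier_features(n, p):
    """Map integers n to [cos(2*pi*k*n/p), sin(2*pi*k*n/p)] for k = 0..p-1."""
    n = np.asarray(n, dtype=np.float64)[:, None]
    k = np.arange(p)[None, :]
    angles = 2.0 * np.pi * k * n / p
    return np.hstack([np.cos(angles), np.sin(angles)])

def fit_residue_classifier(train_n, p):
    """Least-squares regression of one-hot residue labels on Fourier features."""
    X = fourier_features(train_n, p)
    Y = np.eye(p)[np.asarray(train_n) % p]  # ground-truth one-hot labels
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def predict_residue(n, W, p):
    """Predict the residue class as the argmax of the regression outputs."""
    return np.argmax(fourier_features(n, p) @ W, axis=1)

# Illustrative setup (assumed values): train on small integers, test on larger ones.
rng = np.random.default_rng(0)
p = 7
train_n = rng.integers(0, 10_000, size=500)
test_n = rng.integers(10_000, 20_000, size=500)

W = fit_residue_classifier(train_n, p)
accuracy = np.mean(predict_residue(test_n, W, p) == test_n % p)
print(f"held-out accuracy for mod {p}: {accuracy:.3f}")
```

The sketch relies on the raw integer alone carrying no linearly usable signal about its residue, whereas the periodic features make the classes linearly separable; this is one concrete reading of the abstract's claim that performance depends critically on the feature space.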
-
A Comprehensive Review of Modern Object Segmentation Approaches
Authors:
Yuanbo Wang,
Unaiza Ahsan,
Hanyan Li,
Matthew Hagen
Abstract:
Image segmentation is the task of associating pixels in an image with their respective object class labels. It has a wide range of applications in many industries including healthcare, transportation, robotics, fashion, home improvement, and tourism. Many deep learning-based approaches have been developed for image-level object recognition and pixel-level scene understanding, with the latter requiring a much denser annotation of scenes with a large set of objects. Extensions of image segmentation tasks include 3D and video segmentation, where units of voxels, point clouds, and video frames are classified into different objects. We use "Object Segmentation" to refer to the union of these segmentation tasks. In this monograph, we investigate both traditional and modern object segmentation approaches, comparing their strengths, weaknesses, and utilities. We examine in detail the wide range of deep learning-based segmentation techniques developed in recent years, provide a review of the widely used datasets and evaluation metrics, and discuss potential future research directions.
Submitted 13 January, 2023;
originally announced January 2023.
-
Deep Learning-based Online Alternative Product Recommendations at Scale
Authors:
Mingming Guo,
Nian Yan,
Xiquan Cui,
San He Wu,
Unaiza Ahsan,
Rebecca West,
Khalifeh Al Jadda
Abstract:
Alternative recommender systems are critical for ecommerce companies. They guide customers through a massive product catalog and help them find the right products among an overwhelming number of options. However, it is a non-trivial task to recommend alternative products that fit customer needs. In this paper, we use both textual product information (e.g. product titles and descriptions) and customer behavior data to recommend alternative products. Our results show that the coverage, recall, and precision of alternative product recommendations are significantly improved in offline evaluations. The final A/B test shows that our algorithm increases the conversion rate by 12 percent in a statistically significant way. In order to better capture the semantic meaning of product information, we build a Siamese Network with Bidirectional LSTM to learn product embeddings. In order to learn a similarity space that better matches the preference of real customers, we use co-compared data from historical customer behavior as labels to train the network. In addition, we use NMSLIB to accelerate the computationally expensive kNN computation for millions of products so that the alternative recommendation is able to scale across the entire catalog of a major ecommerce site.
Submitted 15 April, 2021;
originally announced April 2021.
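The retrieval step described in the abstract above (nearest neighbors over learned product embeddings) can be illustrated with a small sketch. It is an assumption-laden stand-in: it uses exact brute-force cosine similarity in NumPy rather than NMSLIB's approximate search, the embeddings are made-up 2-D vectors rather than Siamese network outputs, and `top_k_cosine` is a hypothetical helper name.

```python
import numpy as np

def top_k_cosine(embeddings, query_index, k):
    """Return indices of the k items most similar to the query by cosine similarity."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # guard against zero vectors
    scores = unit @ unit[query_index]
    scores[query_index] = -np.inf  # never recommend the query product itself
    return np.argsort(-scores)[:k].tolist()

# Toy product embeddings (illustrative values only).
embeddings = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.7, 0.7],
])
neighbors = top_k_cosine(embeddings, query_index=0, k=2)
print(neighbors)
```

Brute-force search costs time proportional to the catalog size per query, which is why an approximate index becomes attractive once the catalog reaches millions of products.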
-
Interpretable Methods for Identifying Product Variants
Authors:
Rebecca West,
Khalifeh Al Jadda,
Unaiza Ahsan,
Huiming Qu,
Xiquan Cui
Abstract:
For e-commerce companies with large product selections, the organization and grouping of products in meaningful ways is important for creating great customer shopping experiences and cultivating an authoritative brand image. One important way of grouping products is to identify a family of product variants, where the variants are mostly the same with slight and yet distinct differences (e.g. color or pack size). In this paper, we introduce a novel approach to identifying product variants. It combines both constrained clustering and tailored NLP techniques (e.g. extraction of product family name from unstructured product title and identification of products with similar model numbers) to achieve superior performance compared with an existing baseline using a vanilla classification approach. In addition, we design the algorithm to meet certain business criteria, including meeting high accuracy requirements on a wide range of categories (e.g. appliances, decor, tools, and building materials, etc.) as well as prioritizing the interpretability of the model to make it accessible and understandable to all business partners.
Submitted 12 April, 2021;
originally announced April 2021.
-
Video Jigsaw: Unsupervised Learning of Spatiotemporal Context for Video Action Recognition
Authors:
Unaiza Ahsan,
Rishi Madhok,
Irfan Essa
Abstract:
We propose a self-supervised learning method to jointly reason about spatial and temporal context for video recognition. Recent self-supervised approaches have used spatial context [9, 34] as well as temporal coherency [32] but a combination of the two requires extensive preprocessing such as tracking objects through millions of video frames [59] or computing optical flow to determine frame regions with high motion [30]. We propose to combine spatial and temporal context in one self-supervised framework without any heavy preprocessing. We divide multiple video frames into grids of patches and train a network to solve jigsaw puzzles on these patches from multiple frames. So the network is trained to correctly identify the position of a patch within a video frame as well as the position of a patch over time. We also propose a novel permutation strategy that outperforms random permutations while significantly reducing computational and memory constraints. We use our trained network for transfer learning tasks such as video activity recognition and demonstrate the strength of our approach on two benchmark video action recognition datasets without using a single frame from these datasets for unsupervised pretraining of our proposed video jigsaw network.
Submitted 22 August, 2018;
originally announced August 2018.
-
DiscrimNet: Semi-Supervised Action Recognition from Videos using Generative Adversarial Networks
Authors:
Unaiza Ahsan,
Chen Sun,
Irfan Essa
Abstract:
We propose an action recognition framework using Generative Adversarial Networks. Our model involves training a deep convolutional generative adversarial network (DCGAN) using a large video activity dataset without label information. Then we use the trained discriminator from the GAN model as an unsupervised pre-training step and fine-tune the trained discriminator model on a labeled dataset to recognize human activities. We determine good network architectural and hyperparameter settings for using the discriminator from DCGAN as a trained model to learn useful representations for action recognition. Our semi-supervised framework using only appearance information achieves superior or comparable performance to the current state-of-the-art semi-supervised action recognition methods on two challenging video activity datasets: UCF101 and HMDB51.
Submitted 22 January, 2018;
originally announced January 2018.
-
Complex Event Recognition from Images with Few Training Examples
Authors:
Unaiza Ahsan,
Chen Sun,
James Hays,
Irfan Essa
Abstract:
We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use that to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions and event sub-types, leading to a discriminative and compact representation for event images. Web images are obtained for each discovered event concept and we use (pretrained) CNN features to train concept classifiers. Extensive experiments on challenging event datasets demonstrate that our proposed method outperforms several baselines using deep CNN features directly in classifying images into events with limited training examples. We also demonstrate that our method achieves the best overall accuracy on a dataset with unseen event categories using a single training example.
Submitted 17 January, 2017;
originally announced January 2017.
-
Refugee Resettlement Housing Scout
Authors:
Unaiza Ahsan,
Oleksandra Sopova,
Wes Stayton,
Bistra Dilkina
Abstract:
According to the United Nations High Commissioner for Refugees (UNHCR), there are 65.3 million forcibly displaced people in the world today, 21.5 million of them being refugees. This has created an unprecedented refugee crisis, prompting countries to accept and resettle refugee families. Diverse agencies are helping refugees coming to the US to resettle and start their new lives in the country. One of the first and most challenging steps of this process is to find affordable housing that also meets a suite of additional constraints and priorities. These include being within a mile of public transportation and near schools, faith centers and international grocery stores. We detail an interactive data-driven web-based tool, which incorporates most of the needed information in one consolidated platform. The tool searches, filters and displays a list of possible housing locations, and allows dynamic prioritization based on user-specified importance weights on the diverse criteria. The platform was created in partnership with New American Pathways, a nonprofit that supports refugee resettlement in metro Atlanta, but exemplifies a methodology that can help many other organizations with similar goals.
Submitted 28 September, 2016;
originally announced September 2016.
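The prioritization by user-specified importance weights mentioned in the abstract above could be realized in many ways; one simple possibility is a normalized weighted sum, sketched below. Everything here is hypothetical: the `Listing` class, the `rank_listings` function, the criterion names, and the scores in [0, 1] are illustrative assumptions, not details of the actual tool.

```python
from dataclasses import dataclass, field

@dataclass
class Listing:
    """A candidate housing location with per-criterion scores in [0, 1]."""
    name: str
    scores: dict = field(default_factory=dict)

def rank_listings(listings, weights):
    """Sort listings by the weighted average of their criterion scores, best first."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one criterion needs a positive weight")

    def weighted_score(listing):
        # Missing criteria count as 0 so incomplete listings are not favored.
        return sum(w * listing.scores.get(c, 0.0) for c, w in weights.items()) / total

    return sorted(listings, key=weighted_score, reverse=True)

# Made-up listings and criteria for illustration.
listings = [
    Listing("A", {"transit": 1.0, "schools": 0.2, "grocery": 0.5}),
    Listing("B", {"transit": 0.4, "schools": 0.9, "grocery": 0.6}),
]

transit_first = [l.name for l in rank_listings(listings, {"transit": 3, "schools": 1, "grocery": 1})]
schools_first = [l.name for l in rank_listings(listings, {"transit": 1, "schools": 3, "grocery": 1})]
print(transit_first, schools_first)
```

Changing only the weights reorders the same listings, which is the behavior the abstract calls dynamic prioritization.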