A Systematic Review on Long-Tailed Learning
Authors:
Chongsheng Zhang,
George Almpanidis,
Gaojuan Fan,
Binquan Deng,
Yanbo Zhang,
Ji Liu,
Aouaidjia Kamel,
Paolo Soda,
João Gama
Abstract:
Long-tailed data is a special type of multi-class imbalanced data with a very large number of minority/tail classes that have a very significant combined influence. Long-tailed learning aims to build high-performance models on datasets with long-tailed distributions, which can identify all the classes with high accuracy, in particular the minority/tail classes. It is a cutting-edge research direction that has attracted a remarkable amount of research effort in the past few years. In this paper, we present a comprehensive survey of the latest advances in long-tailed visual learning. We first propose a new taxonomy for long-tailed learning, which consists of eight dimensions: data balancing, neural architecture, feature enrichment, logits adjustment, loss function, bells and whistles, network optimization, and post hoc processing techniques. Based on our proposed taxonomy, we present a systematic review of long-tailed learning methods, discussing their commonalities and alignable differences. We also analyze the differences between imbalance learning and long-tailed learning approaches. Finally, we discuss prospects and future directions in this field.
Submitted 1 August, 2024;
originally announced August 2024.
QnAMaker: Data to Bot in 2 Minutes
Authors:
Parag Agrawal,
Tulasi Menon,
Aya Kamel,
Michel Naim,
Chaikesh Chouragade,
Gurvinder Singh,
Rohan Kulkarni,
Anshuman Suri,
Sahithi Katakam,
Vineet Pratik,
Prakul Bansal,
Simerpreet Kaur,
Neha Rajput,
Anand Duggal,
Achraf Chalabi,
Prashant Choudhari,
Reddy Satti,
Niranjan Nayak
Abstract:
Having a bot for seamless conversations is a much-desired feature that products and services today seek for their websites and mobile apps. These bots help reduce traffic received by human support significantly by handling frequent and directly answerable known questions. Many such services have huge reference documents such as FAQ pages, which makes it hard for users to browse through this data. A conversation layer over such raw data can lower traffic to human support by a great margin. We demonstrate QnAMaker, a service that creates a conversational layer over semi-structured data such as FAQ pages, product manuals, and support documents. QnAMaker is the popular choice for Extraction and Question-Answering as a service and is used by over 15,000 bots in production. It is also used by search interfaces and not just bots.
Submitted 18 March, 2020;
originally announced March 2020.