-
ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding
Authors:
Xingyu Fu,
Minqian Liu,
Zhengyuan Yang,
John Corring,
Yijuan Lu,
Jianwei Yang,
Dan Roth,
Dinei Florencio,
Cha Zhang
Abstract:
Structured image understanding, such as interpreting tables and charts, requires strategically refocusing across various structures and texts within an image, forming a reasoning sequence to arrive at the final answer. However, current multimodal large language models (LLMs) lack this multihop selective attention capability. In this work, we introduce ReFocus, a simple yet effective framework that…
▽ More
Structured image understanding, such as interpreting tables and charts, requires strategically refocusing across various structures and texts within an image, forming a reasoning sequence to arrive at the final answer. However, current multimodal large language models (LLMs) lack this multihop selective attention capability. In this work, we introduce ReFocus, a simple yet effective framework that equips multimodal LLMs with the ability to generate "visual thoughts" by performing visual editing on the input image through code, shifting and refining their visual focuses. Specifically, ReFocus enables multimodal LLMs to generate Python codes to call tools and modify the input image, sequentially drawing boxes, highlighting sections, and masking out areas, thereby enhancing the visual reasoning process. We experiment upon a wide range of structured image understanding tasks involving tables and charts. ReFocus largely improves performance on all tasks over GPT-4o without visual editing, yielding an average gain of 11.0% on table tasks and 6.8% on chart tasks. We present an in-depth analysis of the effects of different visual edits, and reasons why ReFocus can improve the performance without introducing additional information. Further, we collect a 14k training set using ReFocus, and prove that such visual chain-of-thought with intermediate information offers a better supervision than standard VQA data, reaching a 8.0% average gain over the same model trained with QA pairs and 2.6% over CoT.
△ Less
Submitted 9 January, 2025;
originally announced January 2025.
-
From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
Authors:
Li Sun,
Florian Luisier,
Kayhan Batmanghelich,
Dinei Florencio,
Cha Zhang
Abstract:
Current state-of-the-art models for natural language understanding require a preprocessing step to convert raw text into discrete tokens. This process known as tokenization relies on a pre-built vocabulary of words or sub-word morphemes. This fixed vocabulary limits the model's robustness to spelling errors and its capacity to adapt to new domains. In this work, we introduce a novel open-vocabular…
▽ More
Current state-of-the-art models for natural language understanding require a preprocessing step to convert raw text into discrete tokens. This process known as tokenization relies on a pre-built vocabulary of words or sub-word morphemes. This fixed vocabulary limits the model's robustness to spelling errors and its capacity to adapt to new domains. In this work, we introduce a novel open-vocabulary language model that adopts a hierarchical two-level approach: one at the word level and another at the sequence level. Concretely, we design an intra-word module that uses a shallow Transformer architecture to learn word representations from their characters, and a deep inter-word Transformer module that contextualizes each word representation by attending to the entire word sequence. Our model thus directly operates on character sequences with explicit awareness of word boundaries, but without biased sub-word or word-level vocabulary. Experiments on various downstream tasks show that our method outperforms strong baselines. We also demonstrate that our hierarchical model is robust to textual corruption and domain shift.
△ Less
Submitted 29 May, 2023; v1 submitted 23 May, 2023;
originally announced May 2023.
-
Diffusion-based Document Layout Generation
Authors:
Liu He,
Yijuan Lu,
John Corring,
Dinei Florencio,
Cha Zhang
Abstract:
We develop a diffusion-based approach for various document layout sequence generation. Layout sequences specify the contents of a document design in an explicit format. Our novel diffusion-based approach works in the sequence domain rather than the image domain in order to permit more complex and realistic layouts. We also introduce a new metric, Document Earth Mover's Distance (Doc-EMD). By consi…
▽ More
We develop a diffusion-based approach for various document layout sequence generation. Layout sequences specify the contents of a document design in an explicit format. Our novel diffusion-based approach works in the sequence domain rather than the image domain in order to permit more complex and realistic layouts. We also introduce a new metric, Document Earth Mover's Distance (Doc-EMD). By considering similarity between heterogeneous categories document designs, we handle the shortcomings of prior document metrics that only evaluate the same category of layouts. Our empirical analysis shows that our diffusion-based approach is comparable to or outperforming other previous methods for layout generation across various document datasets. Moreover, our metric is capable of differentiating documents better than previous metrics for specific cases.
△ Less
Submitted 19 March, 2023;
originally announced March 2023.
-
Understanding Long Documents with Different Position-Aware Attentions
Authors:
Hai Pham,
Guoxin Wang,
Yijuan Lu,
Dinei Florencio,
Cha Zhang
Abstract:
Despite several successes in document understanding, the practical task for long document understanding is largely under-explored due to several challenges in computation and how to efficiently absorb long multimodal input. Most current transformer-based approaches only deal with short documents and employ solely textual information for attention due to its prohibitive computation and memory limit…
▽ More
Despite several successes in document understanding, the practical task for long document understanding is largely under-explored due to several challenges in computation and how to efficiently absorb long multimodal input. Most current transformer-based approaches only deal with short documents and employ solely textual information for attention due to its prohibitive computation and memory limit. To address those issues in long document understanding, we explore different approaches in handling 1D and new 2D position-aware attention with essentially shortened context. Experimental results show that our proposed models have advantages for this task based on various evaluation metrics. Furthermore, our model makes changes only to the attention and thus can be easily adapted to any transformer-based architecture.
△ Less
Submitted 17 August, 2022;
originally announced August 2022.
-
Improving Structured Text Recognition with Regular Expression Biasing
Authors:
Baoguang Shi,
Wenfeng Cheng,
Yijuan Lu,
Cha Zhang,
Dinei Florencio
Abstract:
We study the problem of recognizing structured text, i.e. text that follows certain formats, and propose to improve the recognition accuracy of structured text by specifying regular expressions (regexes) for biasing. A biased recognizer recognizes text that matches the specified regexes with significantly improved accuracy, at the cost of a generally small degradation on other text. The biasing is…
▽ More
We study the problem of recognizing structured text, i.e. text that follows certain formats, and propose to improve the recognition accuracy of structured text by specifying regular expressions (regexes) for biasing. A biased recognizer recognizes text that matches the specified regexes with significantly improved accuracy, at the cost of a generally small degradation on other text. The biasing is realized by modeling regexes as a Weighted Finite-State Transducer (WFST) and injecting it into the decoder via dynamic replacement. A single hyperparameter controls the biasing strength. The method is useful for recognizing text lines with known formats or containing words from a domain vocabulary. Examples include driver license numbers, drug names in prescriptions, etc. We demonstrate the efficacy of regex biasing on datasets of printed and handwritten structured text and measures its side effects.
△ Less
Submitted 10 November, 2021;
originally announced November 2021.
-
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Authors:
Minghao Li,
Tengchao Lv,
Jingye Chen,
Lei Cui,
Yijuan Lu,
Dinei Florencio,
Cha Zhang,
Zhoujun Li,
Furu Wei
Abstract:
Text recognition is a long-standing research problem for document digitalization. Existing approaches are usually built based on CNN for image understanding and RNN for char-level text generation. In addition, another language model is usually needed to improve the overall accuracy as a post-processing step. In this paper, we propose an end-to-end text recognition approach with pre-trained image T…
▽ More
Text recognition is a long-standing research problem for document digitalization. Existing approaches are usually built based on CNN for image understanding and RNN for char-level text generation. In addition, another language model is usually needed to improve the overall accuracy as a post-processing step. In this paper, we propose an end-to-end text recognition approach with pre-trained image Transformer and text Transformer models, namely TrOCR, which leverages the Transformer architecture for both image understanding and wordpiece-level text generation. The TrOCR model is simple but effective, and can be pre-trained with large-scale synthetic data and fine-tuned with human-labeled datasets. Experiments show that the TrOCR model outperforms the current state-of-the-art models on the printed, handwritten and scene text recognition tasks. The TrOCR models and code are publicly available at \url{https://aka.ms/trocr}.
△ Less
Submitted 6 September, 2022; v1 submitted 21 September, 2021;
originally announced September 2021.
-
LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding
Authors:
Yiheng Xu,
Tengchao Lv,
Lei Cui,
Guoxin Wang,
Yijuan Lu,
Dinei Florencio,
Cha Zhang,
Furu Wei
Abstract:
Multimodal pre-training with text, layout, and image has achieved SOTA performance for visually-rich document understanding tasks recently, which demonstrates the great potential for joint learning across different modalities. In this paper, we present LayoutXLM, a multimodal pre-trained model for multilingual document understanding, which aims to bridge the language barriers for visually-rich doc…
▽ More
Multimodal pre-training with text, layout, and image has achieved SOTA performance for visually-rich document understanding tasks recently, which demonstrates the great potential for joint learning across different modalities. In this paper, we present LayoutXLM, a multimodal pre-trained model for multilingual document understanding, which aims to bridge the language barriers for visually-rich document understanding. To accurately evaluate LayoutXLM, we also introduce a multilingual form understanding benchmark dataset named XFUND, which includes form understanding samples in 7 languages (Chinese, Japanese, Spanish, French, Italian, German, Portuguese), and key-value pairs are manually labeled for each language. Experiment results show that the LayoutXLM model has significantly outperformed the existing SOTA cross-lingual pre-trained models on the XFUND dataset. The pre-trained LayoutXLM model and the XFUND dataset are publicly available at https://aka.ms/layoutxlm.
△ Less
Submitted 9 September, 2021; v1 submitted 18 April, 2021;
originally announced April 2021.
-
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Authors:
Yang Xu,
Yiheng Xu,
Tengchao Lv,
Lei Cui,
Furu Wei,
Guoxin Wang,
Yijuan Lu,
Dinei Florencio,
Cha Zhang,
Wanxiang Che,
Min Zhang,
Lidong Zhou
Abstract:
Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents. We propose LayoutLMv2 architecture with new pre-training tasks to model the interaction among text, layout, and image in a single multi-modal framework. Specifically, with a…
▽ More
Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents. We propose LayoutLMv2 architecture with new pre-training tasks to model the interaction among text, layout, and image in a single multi-modal framework. Specifically, with a two-stream multi-modal Transformer encoder, LayoutLMv2 uses not only the existing masked visual-language modeling task but also the new text-image alignment and text-image matching tasks, which make it better capture the cross-modality interaction in the pre-training stage. Meanwhile, it also integrates a spatial-aware self-attention mechanism into the Transformer architecture so that the model can fully understand the relative positional relationship among different text blocks. Experiment results show that LayoutLMv2 outperforms LayoutLM by a large margin and achieves new state-of-the-art results on a wide variety of downstream visually-rich document understanding tasks, including FUNSD (0.7895 $\to$ 0.8420), CORD (0.9493 $\to$ 0.9601), SROIE (0.9524 $\to$ 0.9781), Kleister-NDA (0.8340 $\to$ 0.8520), RVL-CDIP (0.9443 $\to$ 0.9564), and DocVQA (0.7295 $\to$ 0.8672). We made our model and code publicly available at \url{https://aka.ms/layoutlmv2}.
△ Less
Submitted 9 January, 2022; v1 submitted 29 December, 2020;
originally announced December 2020.
-
TAP: Text-Aware Pre-training for Text-VQA and Text-Caption
Authors:
Zhengyuan Yang,
Yijuan Lu,
Jianfeng Wang,
Xi Yin,
Dinei Florencio,
Lijuan Wang,
Cha Zhang,
Lei Zhang,
Jiebo Luo
Abstract:
In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and Text-Caption tasks. These two tasks aim at reading and understanding scene text in images for question answering and image caption generation, respectively. In contrast to the conventional vision-language pre-training that fails to capture scene text and its relationship with the visual and text modalities, TAP explicitly inc…
▽ More
In this paper, we propose Text-Aware Pre-training (TAP) for Text-VQA and Text-Caption tasks. These two tasks aim at reading and understanding scene text in images for question answering and image caption generation, respectively. In contrast to the conventional vision-language pre-training that fails to capture scene text and its relationship with the visual and text modalities, TAP explicitly incorporates scene text (generated from OCR engines) in pre-training. With three pre-training tasks, including masked language modeling (MLM), image-text (contrastive) matching (ITM), and relative (spatial) position prediction (RPP), TAP effectively helps the model learn a better aligned representation among the three modalities: text word, visual object, and scene text. Due to this aligned representation learning, even pre-trained on the same downstream task dataset, TAP already boosts the absolute accuracy on the TextVQA dataset by +5.4%, compared with a non-TAP baseline. To further improve the performance, we build a large-scale dataset based on the Conceptual Caption dataset, named OCR-CC, which contains 1.4 million scene text-related image-text pairs. Pre-trained on this OCR-CC dataset, our approach outperforms the state of the art by large margins on multiple tasks, i.e., +8.3% accuracy on TextVQA, +8.6% accuracy on ST-VQA, and +10.2 CIDEr score on TextCaps.
△ Less
Submitted 8 December, 2020;
originally announced December 2020.
-
RePr: Improved Training of Convolutional Filters
Authors:
Aaditya Prakash,
James Storer,
Dinei Florencio,
Cha Zhang
Abstract:
A well-trained Convolutional Neural Network can easily be pruned without significant loss of performance. This is because of unnecessary overlap in the features captured by the network's filters. Innovations in network architecture such as skip/dense connections and Inception units have mitigated this problem to some extent, but these improvements come with increased computation and memory require…
▽ More
A well-trained Convolutional Neural Network can easily be pruned without significant loss of performance. This is because of unnecessary overlap in the features captured by the network's filters. Innovations in network architecture such as skip/dense connections and Inception units have mitigated this problem to some extent, but these improvements come with increased computation and memory requirements at run-time. We attempt to address this problem from another angle - not by changing the network structure but by altering the training method. We show that by temporarily pruning and then restoring a subset of the model's filters, and repeating this process cyclically, overlap in the learned features is reduced, producing improved generalization. We show that the existing model-pruning criteria are not optimal for selecting filters to prune in this context and introduce inter-filter orthogonality as the ranking criteria to determine under-expressive filters. Our method is applicable both to vanilla convolutional networks and more complex modern architectures, and improves the performance across a variety of tasks, especially when applied to smaller networks.
△ Less
Submitted 25 February, 2019; v1 submitted 18 November, 2018;
originally announced November 2018.
-
A Fusion Framework for Camouflaged Moving Foreground Detection in the Wavelet Domain
Authors:
Shuai Li,
Dinei Florencio,
Wanqing Li,
Yaqin Zhao,
Chris Cook
Abstract:
Detecting camouflaged moving foreground objects has been known to be difficult due to the similarity between the foreground objects and the background. Conventional methods cannot distinguish the foreground from background due to the small differences between them and thus suffer from under-detection of the camouflaged foreground objects. In this paper, we present a fusion framework to address thi…
▽ More
Detecting camouflaged moving foreground objects has been known to be difficult due to the similarity between the foreground objects and the background. Conventional methods cannot distinguish the foreground from background due to the small differences between them and thus suffer from under-detection of the camouflaged foreground objects. In this paper, we present a fusion framework to address this problem in the wavelet domain. We first show that the small differences in the image domain can be highlighted in certain wavelet bands. Then the likelihood of each wavelet coefficient being foreground is estimated by formulating foreground and background models for each wavelet band. The proposed framework effectively aggregates the likelihoods from different wavelet bands based on the characteristics of the wavelet transform. Experimental results demonstrated that the proposed method significantly outperformed existing methods in detecting camouflaged foreground objects. Specifically, the average F-measure for the proposed algorithm was 0.87, compared to 0.71 to 0.8 for the other state-of-the-art methods.
△ Less
Submitted 16 April, 2018;
originally announced April 2018.
-
Deep Learning Based Speech Beamforming
Authors:
Kaizhi Qian,
Yang Zhang,
Shiyu Chang,
Xuesong Yang,
Dinei Florencio,
Mark Hasegawa-Johnson
Abstract:
Multi-channel speech enhancement with ad-hoc sensors has been a challenging task. Speech model guided beamforming algorithms are able to recover natural sounding speech, but the speech models tend to be oversimplified or the inference would otherwise be too complicated. On the other hand, deep learning based enhancement approaches are able to learn complicated speech distributions and perform effi…
▽ More
Multi-channel speech enhancement with ad-hoc sensors has been a challenging task. Speech model guided beamforming algorithms are able to recover natural sounding speech, but the speech models tend to be oversimplified or the inference would otherwise be too complicated. On the other hand, deep learning based enhancement approaches are able to learn complicated speech distributions and perform efficient inference, but they are unable to deal with variable number of input channels. Also, deep learning approaches introduce a lot of errors, particularly in the presence of unseen noise types and settings. We have therefore proposed an enhancement framework called DEEPBEAM, which combines the two complementary classes of algorithms. DEEPBEAM introduces a beamforming filter to produce natural sounding speech, but the filter coefficients are determined with the help of a monaural speech enhancement neural network. Experiments on synthetic and real-world data show that DEEPBEAM is able to produce clean, dry and natural sounding speech, and is robust against unseen noise.
△ Less
Submitted 14 February, 2018;
originally announced February 2018.
-
Foreground Detection in Camouflaged Scenes
Authors:
Shuai Li,
Dinei Florencio,
Yaqin Zhao,
Chris Cook,
Wanqing Li
Abstract:
Foreground detection has been widely studied for decades due to its importance in many practical applications. Most of the existing methods assume foreground and background show visually distinct characteristics and thus the foreground can be detected once a good background model is obtained. However, there are many situations where this is not the case. Of particular interest in video surveillanc…
▽ More
Foreground detection has been widely studied for decades due to its importance in many practical applications. Most of the existing methods assume foreground and background show visually distinct characteristics and thus the foreground can be detected once a good background model is obtained. However, there are many situations where this is not the case. Of particular interest in video surveillance is the camouflage case. For example, an active attacker camouflages by intentionally wearing clothes that are visually similar to the background. In such cases, even given a decent background model, it is not trivial to detect foreground objects. This paper proposes a texture guided weighted voting (TGWV) method which can efficiently detect foreground objects in camouflaged scenes. The proposed method employs the stationary wavelet transform to decompose the image into frequency bands. We show that the small and hardly noticeable differences between foreground and background in the image domain can be effectively captured in certain wavelet frequency bands. To make the final foreground decision, a weighted voting scheme is developed based on intensity and texture of all the wavelet bands with weights carefully designed. Experimental results demonstrate that the proposed method achieves superior performance compared to the current state-of-the-art results.
△ Less
Submitted 11 July, 2017;
originally announced July 2017.
-
Joint Denoising / Compression of Image Contours via Shape Prior and Context Tree
Authors:
Amin Zheng,
Gene Cheung,
Dinei Florencio
Abstract:
With the advent of depth sensing technologies, the extraction of object contours in images---a common and important pre-processing step for later higher-level computer vision tasks like object detection and human action recognition---has become easier. However, acquisition noise in captured depth images means that detected contours suffer from unavoidable errors. In this paper, we propose to joint…
▽ More
With the advent of depth sensing technologies, the extraction of object contours in images---a common and important pre-processing step for later higher-level computer vision tasks like object detection and human action recognition---has become easier. However, acquisition noise in captured depth images means that detected contours suffer from unavoidable errors. In this paper, we propose to jointly denoise and compress detected contours in an image for bandwidth-constrained transmission to a client, who can then carry out aforementioned application-specific tasks using the decoded contours as input. We first prove theoretically that in general a joint denoising / compression approach can outperform a separate two-stage approach that first denoises then encodes contours lossily. Adopting a joint approach, we first propose a burst error model that models typical errors encountered in an observed string y of directional edges. We then formulate a rate-constrained maximum a posteriori (MAP) problem that trades off the posterior probability p(x'|y) of an estimated string x' given y with its code rate R(x'). We design a dynamic programming (DP) algorithm that solves the posed problem optimally, and propose a compact context representation called total suffix tree (TST) that can reduce complexity of the algorithm dramatically. Experimental results show that our joint denoising / compression scheme outperformed a competing separate scheme in rate-distortion performance noticeably.
△ Less
Submitted 30 April, 2017;
originally announced May 2017.
-
Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks
Authors:
Anurag Kumar,
Dinei Florencio
Abstract:
In this paper we consider the problem of speech enhancement in real-world like conditions where multiple noises can simultaneously corrupt speech. Most of the current literature on speech enhancement focus primarily on presence of single noise in corrupted speech which is far from real-world environments. Specifically, we deal with improving speech quality in office environment where multiple stat…
▽ More
In this paper we consider the problem of speech enhancement in real-world like conditions where multiple noises can simultaneously corrupt speech. Most of the current literature on speech enhancement focus primarily on presence of single noise in corrupted speech which is far from real-world environments. Specifically, we deal with improving speech quality in office environment where multiple stationary as well as non-stationary noises can be simultaneously present in speech. We propose several strategies based on Deep Neural Networks (DNN) for speech enhancement in these scenarios. We also investigate a DNN training strategy based on psychoacoustic models from speech coding for enhancement of noisy speech
△ Less
Submitted 9 May, 2016;
originally announced May 2016.
-
Context Tree based Image Contour Coding using A Geometric Prior
Authors:
Amin Zheng,
Gene Cheung,
Dinei Florencio
Abstract:
If object contours in images are coded efficiently as side information, then they can facilitate advanced image / video coding techniques, such as graph Fourier transform coding or motion prediction of arbitrarily shaped pixel blocks. In this paper, we study the problem of lossless and lossy compression of detected contours in images. Specifically, we first convert a detected object contour compos…
▽ More
If object contours in images are coded efficiently as side information, then they can facilitate advanced image / video coding techniques, such as graph Fourier transform coding or motion prediction of arbitrarily shaped pixel blocks. In this paper, we study the problem of lossless and lossy compression of detected contours in images. Specifically, we first convert a detected object contour composed of contiguous between-pixel edges to a sequence of directional symbols drawn from a small alphabet. To encode the symbol sequence using arithmetic coding, we compute an optimal variable-length context tree (VCT) $\mathcal{T}$ via a maximum a posterior (MAP) formulation to estimate symbols' conditional probabilities. MAP prevents us from overfitting given a small training set $\mathcal{X}$ of past symbol sequences by identifying a VCT $\mathcal{T}$ that achieves a high likelihood $P(\mathcal{X}|\mathcal{T})$ of observing $\mathcal{X}$ given $\mathcal{T}$, and a large geometric prior $P(\mathcal{T})$ stating that image contours are more often straight than curvy. For the lossy case, we design efficient dynamic programming (DP) algorithms that optimally trade off coding rate of an approximate contour $\hat{\mathbf{x}}$ given a VCT $\mathcal{T}$ with two notions of distortion of $\hat{\mathbf{x}}$ with respect to the original contour $\mathbf{x}$. To reduce the size of the DP tables, a total suffix tree is derived from a given VCT $\mathcal{T}$ for compact table entry indexing, reducing complexity. Experimental results show that for lossless contour coding, our proposed algorithm outperforms state-of-the-art context-based schemes consistently for both small and large training datasets. For lossy contour coding, our algorithms outperform comparable schemes in the literature in rate-distortion performance.
△ Less
Submitted 27 April, 2016;
originally announced April 2016.
-
Precision Enhancement of 3D Surfaces from Multiple Compressed Depth Maps
Authors:
Pengfei Wan,
Gene Cheung,
Philip A. Chou,
Dinei Florencio,
Cha Zhang,
Oscar C. Au
Abstract:
In texture-plus-depth representation of a 3D scene, depth maps from different camera viewpoints are typically lossily compressed via the classical transform coding / coefficient quantization paradigm. In this paper we propose to reduce distortion of the decoded depth maps due to quantization. The key observation is that depth maps from different viewpoints constitute multiple descriptions (MD) of…
▽ More
In texture-plus-depth representation of a 3D scene, depth maps from different camera viewpoints are typically lossily compressed via the classical transform coding / coefficient quantization paradigm. In this paper we propose to reduce distortion of the decoded depth maps due to quantization. The key observation is that depth maps from different viewpoints constitute multiple descriptions (MD) of the same 3D scene. Considering the MD jointly, we perform a POCS-like iterative procedure to project a reconstructed signal from one depth map to the other and back, so that the converged depth maps have higher precision than the original quantized versions.
△ Less
Submitted 24 February, 2014;
originally announced May 2014.