Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction

Rector-Brooks, Jarrid; Hasan, Mohsin; Peng, Zhangzhi; Quinn, Zachary; Liu, Chenghao; Mittal, Sarthak; Dziri, Nouha; Bronstein, Michael; Bengio, Yoshua; Chatterjee, Pranam; Tong, Alexander; Bose, Avishek Joey


arXiv:2410.08134 (cs)

[Submitted on 10 Oct 2024]



Abstract: Generative modeling of discrete data underlies important applications ranging from text-based agents like ChatGPT to the design of the very building blocks of life in protein sequences. However, application domains need to exert control over the generated data by steering the generative process, typically via RLHF, to satisfy a specified property, reward, or affinity metric. In this paper, we study the problem of steering Masked Diffusion Models (MDMs), a recent class of discrete diffusion models that offer a compelling alternative to traditional autoregressive models. We introduce Discrete Denoising Posterior Prediction (DDPP), a novel framework that casts the task of steering pre-trained MDMs as a problem of probabilistic inference by learning to sample from a target Bayesian posterior. Our DDPP framework leads to a family of three novel objectives that are all simulation-free, and thus scalable, while applying to general non-differentiable reward functions. Empirically, we instantiate DDPP by steering MDMs to perform class-conditional pixel-level image modeling, RLHF-based alignment of MDMs using text-based rewards, and finetuning protein language models to generate more diverse secondary structures and shorter proteins. We substantiate our designs via wet-lab validation, where we observe transient expression of reward-optimized protein sequences.
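
As a rough illustration of what "sampling from a target Bayesian posterior" can mean, the sketch below assumes the target takes the common reward-tilted form, target(x) ∝ prior(x) · reward(x), where the prior plays the role of the pre-trained model. The abstract does not state this formula, so it is an assumption here, and the function name `tilted_posterior`, the three toy sequences, and all numbers are hypothetical. It is not the DDPP training procedure, only the distribution such a procedure would aim for on a toy space.

```python
# Toy illustration only: a three-sequence space with made-up numbers.
# Assumption (not stated in the abstract): the target posterior is the
# reward-tilted prior, target(x) ∝ prior(x) * reward(x).

def tilted_posterior(prior, reward):
    """Return the normalized product prior(x) * reward(x) over a finite set."""
    # The reward is only evaluated, never differentiated.
    weights = {x: prior[x] * reward[x] for x in prior}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("reward must be positive on at least one supported x")
    return {x: w / total for x, w in weights.items()}


# Hypothetical pre-trained model probabilities and reward values.
prior = {"AAA": 0.5, "AAB": 0.3, "ABB": 0.2}
reward = {"AAA": 1.0, "AAB": 2.0, "ABB": 4.0}

posterior = tilted_posterior(prior, reward)
for sequence, probability in posterior.items():
    print(f"{sequence}: {probability:.3f}")
```

Exact normalization like this is only feasible when every sequence can be enumerated. For realistic sequence lengths the normalizing sum is intractable, which is why the steering task is posed as learning to sample from the posterior rather than computing it.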

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2410.08134 [cs.LG]
	(or arXiv:2410.08134v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2410.08134

Submission history

From: Jarrid Rector-Brooks
[v1] Thu, 10 Oct 2024 17:18:30 UTC (21,562 KB)
