Showing 1–3 of 3 results for author: Brade, S

  1. arXiv:2504.05106  [pdf, other]

    cs.HC cs.AI cs.LG

    SpeakEasy: Enhancing Text-to-Speech Interactions for Expressive Content Creation

    Authors: Stephen Brade, Sam Anderson, Rithesh Kumar, Zeyu Jin, Anh Truong

    Abstract: Novice content creators often invest significant time recording expressive speech for social media videos. While recent advancements in text-to-speech (TTS) technology can generate highly realistic speech in various languages and accents, many struggle with unintuitive or overly granular TTS interfaces. We propose simplifying TTS generation by allowing users to specify high-level context alongside…

    Submitted 7 April, 2025; originally announced April 2025.

  2. arXiv:2312.04690  [pdf, other]

    cs.HC cs.AI cs.SD eess.AS

    SynthScribe: Deep Multimodal Tools for Synthesizer Sound Retrieval and Exploration

    Authors: Stephen Brade, Bryan Wang, Mauricio Sousa, Gregory Lee Newsome, Sageev Oore, Tovi Grossman

    Abstract: Synthesizers are powerful tools that allow musicians to create dynamic and original sounds. Existing commercial interfaces for synthesizers typically require musicians to interact with complex low-level parameters or to manage large libraries of premade sounds. To address these challenges, we implement SynthScribe -- a fullstack system that uses multimodal deep learning to let users express their…

    Submitted 20 February, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  3. arXiv:2304.09337  [pdf, other]

    cs.HC cs.AI cs.MM

    Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models

    Authors: Stephen Brade, Bryan Wang, Mauricio Sousa, Sageev Oore, Tovi Grossman

    Abstract: Text-to-image generative models have demonstrated remarkable capabilities in generating high-quality images based on textual prompts. However, crafting prompts that accurately capture the user's creative intent remains challenging. It often involves laborious trial-and-error procedures to ensure that the model interprets the prompts in alignment with the user's intention. To address the challenges…

    Submitted 18 April, 2023; originally announced April 2023.