Showing 1–11 of 11 results for author: Yan, J N

  1. arXiv:2505.19441

    cs.HC cs.AI cs.CY cs.LG

    Fairness Practices in Industry: A Case Study in Machine Learning Teams Building Recommender Systems

    Authors: Jing Nathan Yan, Junxiong Wang, Jeffrey M. Rzeszotarski, Allison Koenecke

    Abstract: The rapid proliferation of recommender systems necessitates robust fairness practices to address inherent biases. Assessing fairness, though, is challenging due to constantly evolving metrics and best practices. This paper analyzes how industry practitioners perceive and incorporate these changing fairness standards in their workflows. Through semi-structured interviews with 11 practitioners from…

    Submitted 25 May, 2025; originally announced May 2025.

  2. arXiv:2409.12181

    cs.CL cs.LG

    A Controlled Study on Long Context Extension and Generalization in LLMs

    Authors: Yi Lu, Jing Nathan Yan, Songlin Yang, Justin T. Chiu, Siyu Ren, Fei Yuan, Wenting Zhao, Zhiyong Wu, Alexander M. Rush

    Abstract: Broad textual understanding and in-context learning require language models that utilize full document contexts. Due to the implementation challenges associated with directly training long-context models, many methods have been proposed for extending models to handle long contexts. However, owing to differences in data and model classes, it has been challenging to compare these approaches, leading…

    Submitted 23 September, 2024; v1 submitted 18 September, 2024; originally announced September 2024.

  3. arXiv:2405.18132

    cs.CV

    EG4D: Explicit Generation of 4D Object without Score Distillation

    Authors: Qi Sun, Zhiyang Guo, Ziyu Wan, Jing Nathan Yan, Shengming Yin, Wengang Zhou, Jing Liao, Houqiang Li

    Abstract: In recent years, the increasing demand for dynamic 3D assets in design and gaming applications has given rise to powerful generative pipelines capable of synthesizing high-quality 4D objects. Previous methods generally rely on score distillation sampling (SDS) algorithm to infer the unseen views and motion of 4D objects, thus leading to unsatisfactory results with defects like over-saturation and…

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2401.13660

    cs.CL cs.LG

    MambaByte: Token-free Selective State Space Model

    Authors: Junxiong Wang, Tushaar Gangavarapu, Jing Nathan Yan, Alexander M. Rush

    Abstract: Token-free language models learn directly from raw bytes and remove the inductive bias of subword tokenization. Operating on bytes, however, results in significantly longer sequences. In this setting, standard autoregressive Transformers scale poorly as the effective memory required grows with sequence length. The recent development of the Mamba state space model (SSM) offers an appealing alternat…

    Submitted 9 August, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: Published at COLM 2024

  5. arXiv:2311.18257

    cs.CV cs.LG

    Diffusion Models Without Attention

    Authors: Jing Nathan Yan, Jiatao Gu, Alexander M. Rush

    Abstract: In recent advancements in high-fidelity image generation, Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a key player. However, their application at high resolutions presents significant computational challenges. Current methods, such as patchifying, expedite processes in UNet and Transformer architectures but at the expense of representational capacity. Addressing this, we intro…

    Submitted 30 November, 2023; originally announced November 2023.

  6. arXiv:2311.08390

    cs.CL

    Predicting Text Preference Via Structured Comparative Reasoning

    Authors: Jing Nathan Yan, Tianqi Liu, Justin T Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Yao Zhao, Charu Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky

    Abstract: Comparative reasoning plays a crucial role in text preference prediction; however, large language models (LLMs) often demonstrate inconsistencies in their reasoning. While approaches like Chain-of-Thought improve accuracy in many other settings, they struggle to consistently distinguish the similarities and differences of complex texts. We introduce SC, a prompting approach that predicts text pref…

    Submitted 1 July, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

  7. arXiv:2311.07099

    cs.CL cs.AI

    Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning

    Authors: Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky

    Abstract: Large language models (LLMs) have shown remarkable capabilities in various natural language understanding tasks. With only a few demonstration examples, these LLMs can quickly adapt to target tasks without expensive gradient updates. Common strategies to boost such 'in-context' learning ability are to ensemble multiple model decoded results and require the model to generate an explanation along wi…

    Submitted 13 November, 2023; originally announced November 2023.

  8. arXiv:2306.01069

    cs.CL cs.AI cs.IR

    TimelineQA: A Benchmark for Question Answering over Timelines

    Authors: Wang-Chiew Tan, Jane Dwivedi-Yu, Yuliang Li, Lambert Mathias, Marzieh Saeidi, Jing Nathan Yan, Alon Y. Halevy

    Abstract: Lifelogs are descriptions of experiences that a person had during their life. Lifelogs are created by fusing data from the multitude of digital services, such as online photos, maps, shopping and content streaming services. Question answering over lifelogs can offer personal assistants a critical resource when they try to provide advice in context. However, obtaining answers to questions over life…

    Submitted 1 June, 2023; originally announced June 2023.

  9. arXiv:2212.10544

    cs.CL cs.LG

    Pretraining Without Attention

    Authors: Junxiong Wang, Jing Nathan Yan, Albert Gu, Alexander M. Rush

    Abstract: Transformers have been essential to pretraining success in NLP. While other architectures have been used, downstream accuracy is either significantly worse, or requires attention layers to match standard benchmarks such as GLUE. This work explores pretraining without attention by using recent advances in sequence routing based on state-space models (SSMs). Our proposed model, Bidirectional Gated S…

    Submitted 8 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  10. DataPrep.EDA: Task-Centric Exploratory Data Analysis for Statistical Modeling in Python

    Authors: Jinglin Peng, Weiyuan Wu, Brandon Lockhart, Song Bian, Jing Nathan Yan, Linghao Xu, Zhixuan Chi, Jeffrey Rzeszotarski, Jiannan Wang

    Abstract: Exploratory Data Analysis (EDA) is a crucial step in any data science project. However, existing Python libraries fall short in supporting data scientists to complete common EDA tasks for statistical modeling. Their API design is either too low level, which is optimized for plotting rather than EDA, or too high level, which is hard to specify more fine-grained EDA tasks. In response, we propose Da…

    Submitted 10 April, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

  11. arXiv:1902.09711

    cs.DB

    Detecting Data Errors with Statistical Constraints

    Authors: Jing Nathan Yan, Oliver Schulte, Jiannan Wang, Reynold Cheng

    Abstract: A powerful approach to detecting erroneous data is to check which potentially dirty data records are incompatible with a user's domain knowledge. Previous approaches allow the user to specify domain knowledge in the form of logical constraints (e.g., functional dependency and denial constraints). We extend the constraint-based approach by introducing a novel class of statistical constraints (SCs).…

    Submitted 25 February, 2019; originally announced February 2019.