Showing 1–4 of 4 results for author: Dlabal, J

  1. arXiv:2410.16512  [pdf, other]

    cs.CV

    TIPS: Text-Image Pretraining with Spatial awareness

    Authors: Kevis-Kokitsi Maninis, Kaifeng Chen, Soham Ghosh, Arjun Karpur, Koert Chen, Ye Xia, Bingyi Cao, Daniel Salz, Guangxing Han, Jan Dlabal, Dan Gnanapragasam, Mojtaba Seyedhosseini, Howard Zhou, Andre Araujo

    Abstract: While image-text representation learning has become very popular in recent years, existing models tend to lack spatial awareness and have limited direct applicability for dense understanding tasks. For this reason, self-supervised image-only pretraining is still the go-to method for many dense vision applications (e.g. depth estimation, semantic segmentation), despite the lack of explicit supervis…

    Submitted 7 March, 2025; v1 submitted 21 October, 2024; originally announced October 2024.

    Comments: ICLR2025 camera-ready + appendix

  2. arXiv:2403.02626  [pdf, other]

    cs.CV cs.LG

    Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use

    Authors: Imad Eddine Toubal, Aditya Avinash, Neil Gordon Alldrin, Jan Dlabal, Wenlei Zhou, Enming Luo, Otilia Stretcu, Hao Xiong, Chun-Ta Lu, Howard Zhou, Ranjay Krishna, Ariel Fuxman, Tom Duerig

    Abstract: From content moderation to wildlife conservation, the number of applications that require models to recognize nuanced or subjective visual concepts is growing. Traditionally, developing classifiers for such concepts requires substantial manual effort measured in hours, days, or even months to identify and annotate data needed for training. Even with recently proposed Agile Modeling techniques, whi…

    Submitted 19 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

  3. arXiv:2105.12849  [pdf, ps, other]

    cs.LG

    CARLS: Cross-platform Asynchronous Representation Learning System

    Authors: Chun-Ta Lu, Yun Zeng, Da-Cheng Juan, Yicheng Fan, Zhe Li, Jan Dlabal, Yi-Ting Chen, Arjun Gopalan, Allan Heydon, Chun-Sung Ferng, Reah Miyara, Ariel Fuxman, Futang Peng, Zhen Li, Tom Duerig, Andrew Tomkins

    Abstract: In this work, we propose CARLS, a novel framework for augmenting the capacity of existing deep learning frameworks by enabling multiple components -- model trainers, knowledge makers and knowledge banks -- to concertedly work together in an asynchronous fashion across hardware platforms. The proposed CARLS is particularly suitable for learning paradigms where model training benefits from additiona…

    Submitted 26 May, 2021; originally announced May 2021.

  4. arXiv:2005.07289  [pdf, other]

    cs.CV cs.LG

    Taskology: Utilizing Task Relations at Scale

    Authors: Yao Lu, Sören Pirk, Jan Dlabal, Anthony Brohan, Ankita Pasad, Zhao Chen, Vincent Casser, Anelia Angelova, Ariel Gordon

    Abstract: Many computer vision tasks address the problem of scene understanding and are naturally interrelated e.g. object classification, detection, scene segmentation, depth estimation, etc. We show that we can leverage the inherent relationships among collections of tasks, as they are trained jointly, supervising each other through their known relationships via consistency losses. Furthermore, explicitly…

    Submitted 17 March, 2021; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: IEEE Conference on Computer Vision and Pattern Recognition, 2021