Search | arXiv e-print repository

arXiv:2401.00128 [pdf]

Quantifying intra-tumoral genetic heterogeneity of glioblastoma toward precision medicine using MRI and a data-inclusive machine learning algorithm

Authors: Lujia Wang, Hairong Wang, Fulvio D'Angelo, Lee Curtin, Christopher P. Sereduk, Gustavo De Leon, Kyle W. Singleton, Javier Urcuyo, Andrea Hawkins-Daarud, Pamela R. Jackson, Chandan Krishna, Richard S. Zimmerman, Devi P. Patra, Bernard R. Bendok, Kris A. Smith, Peter Nakaji, Kliment Donev, Leslie C. Baxter, Maciej M. Mrugała, Michele Ceccarelli, Antonio Iavarone, Kristin R. Swanson, Nhan L. Tran, Leland S. Hu, Jing Li

Abstract: Glioblastoma (GBM) is one of the most aggressive and lethal human cancers. Intra-tumoral genetic heterogeneity poses a significant challenge for treatment. Biopsy is invasive, which motivates the development of non-invasive, MRI-based machine learning (ML) models to quantify intra-tumoral genetic heterogeneity for each patient. This capability holds great promise for enabling better therapeutic selection to improve patient outcomes. We proposed a novel Weakly Supervised Ordinal Support Vector Machine (WSO-SVM) to predict regional genetic alteration status within each GBM tumor using MRI. WSO-SVM was applied to a unique dataset of 318 image-localized biopsies with spatially matched multiparametric MRI from 74 GBM patients. The model was trained to predict the regional genetic alteration of three GBM driver genes (EGFR, PDGFRA, and PTEN) based on features extracted from the corresponding region of five MRI contrast images. For comparison, a variety of existing ML algorithms were also applied. The classification accuracy of each gene was compared between the different algorithms. The SHapley Additive exPlanations (SHAP) method was further applied to compute contribution scores of different contrast images. Finally, the trained WSO-SVM was used to generate prediction maps within the tumoral area of each patient to help visualize the intra-tumoral genetic heterogeneity.
This study demonstrated the feasibility of using MRI and WSO-SVM to enable non-invasive prediction of intra-tumoral regional genetic alteration for each GBM patient, which can inform future adaptive therapies for individualized oncology.

Submitted 29 December, 2023; originally announced January 2024.

Comments: 36 pages, 8 figures, 3 tables

arXiv:2312.06721 [pdf, other]

Understanding Physical Dynamics with Counterfactual World Modeling

Authors: Rahul Venkatesh, Honglin Chen, Kevin Feigelis, Daniel M. Bear, Khaled Jedoui, Klemen Kotar, Felix Binder, Wanhee Lee, Sherry Liu, Kevin A. Smith, Judith E. Fan, Daniel L. K. Yamins

Abstract: The ability to understand physical dynamics is critical for agents to act in the world. Here, we use Counterfactual World Modeling (CWM) to extract vision structures for dynamics understanding. CWM uses a temporally-factored masking policy for masked prediction of video data without annotations. This policy enables highly effective "counterfactual prompting" of the predictor, allowing a spectrum of visual structures to be extracted from a single pre-trained predictor without finetuning on annotated datasets. We demonstrate that these structures are useful for physical dynamics understanding, allowing CWM to achieve the state-of-the-art performance on the Physion benchmark.

Submitted 22 July, 2024; v1 submitted 10 December, 2023; originally announced December 2023.

Comments: ECCV 2024. Project page at: https://neuroailab.github.io/cwm-physics/

arXiv:2306.15668 [pdf, other]

Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

Authors: Hsiao-Yu Tung, Mingyu Ding, Zhenfang Chen, Daniel Bear, Chuang Gan, Joshua B. Tenenbaum, Daniel LK Yamins, Judith E Fan, Kevin A. Smith

Abstract: General physical scene understanding requires more than simply localizing and recognizing objects -- it requires knowledge that objects can have different latent properties (e.g., mass or elasticity), and that those properties affect the outcome of physical events. While there has been great progress in physical and video prediction models in recent years, benchmarks to test their performance typically do not require an understanding that objects have individual physical properties, or at best test only those properties that are directly observable (e.g., size or color). This work proposes a novel dataset and benchmark, termed Physion++, that rigorously evaluates visual physical prediction in artificial systems under circumstances where those predictions rely on accurate estimates of the latent physical properties of objects in the scene. Specifically, we test scenarios where accurate prediction relies on estimates of properties such as mass, friction, elasticity, and deformability, and where the values of those properties can only be inferred by observing how objects move and interact with other objects or fluids. We evaluate the performance of a number of state-of-the-art prediction models that span a variety of levels of learning vs. built-in knowledge, and compare that performance to a set of human predictions.
We find that models that have been trained using standard regimes and datasets do not spontaneously learn to make inferences about latent properties, but also that models that encode objectness and physical states tend to make better predictions. However, there is still a huge gap between all models and human performance, and all models' predictions correlate poorly with those made by humans, suggesting that no state-of-the-art model is learning to make physical predictions in a human-like way. Project page: https://dingmyu.github.io/physion_v2/

Submitted 1 November, 2023; v1 submitted 27 June, 2023; originally announced June 2023.

Comments: Accepted by NeurIPS 2023 Datasets and Benchmarks Track

arXiv:2212.09993 [pdf, other]

Are Deep Neural Networks SMARTer than Second Graders?

Authors: Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin A. Smith, Joshua B. Tenenbaum

Abstract: Recent times have witnessed an increasing number of applications of deep neural networks towards solving tasks that require superior cognitive abilities, e.g., playing Go, generating art, ChatGPT, etc. Such dramatic progress raises the question: how generalizable are neural networks in solving problems that demand broad skills? To answer this question, we propose SMART: a Simple Multimodal Algorithmic Reasoning Task and the associated SMART-101 dataset, for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed specifically for children in the 6--8 age group. Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and their solution needs a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning, among others. To scale our dataset towards training deep neural networks, we programmatically generate entirely new instances for each puzzle, while retaining their solution algorithm. To benchmark performances on SMART-101, we propose a vision and language meta-learning model using varied state-of-the-art backbones. Our experiments reveal that while powerful deep models offer reasonable performances on puzzles in a supervised setting, they are not better than random accuracy when analyzed for generalization. We also evaluate the recent ChatGPT and other large language models on a subset of SMART-101 and find that while these models show convincing reasoning abilities, the answers are often incorrect.

Submitted 11 September, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

Comments: Extended version of CVPR 2023 paper. For the SMART-101 dataset, see http://smartdataset.github.io/smart101

arXiv:2210.12521 [pdf, other]

H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions

Authors: Kei Ota, Hsiao-Yu Tung, Kevin A. Smith, Anoop Cherian, Tim K. Marks, Alan Sullivan, Asako Kanezaki, Joshua B. Tenenbaum

Abstract: The world is filled with articulated objects that are difficult to determine how to use from vision alone, e.g., a door might open inwards or outwards. Humans handle these objects with strategic trial-and-error: first pushing a door then pulling if that doesn't work. We enable these capabilities in autonomous agents by proposing "Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR), a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation. We compare our model with existing work in manipulating objects after a handful of exploration actions, on the PartNet-Mobility dataset. We further propose a novel PuzzleBoxes benchmark that contains locked boxes that require multiple steps to solve. We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework, despite using zero training data. We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.

Submitted 22 October, 2022; originally announced October 2022.

arXiv:2102.12321 [pdf, other]

AGENT: A Benchmark for Core Psychological Reasoning

Authors: Tianmin Shu, Abhishek Bhandwaldar, Chuang Gan, Kevin A. Smith, Shari Liu, Dan Gutfreund, Elizabeth Spelke, Joshua B. Tenenbaum, Tomer D. Ullman

Abstract: For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life. Intuitive psychology, the ability to reason about hidden mental variables that drive observable actions, comes naturally to people: even pre-verbal infants can tell agents from objects, expecting agents to act efficiently to achieve goals given constraints. Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning. Inspired by cognitive development studies on intuitive psychology, we present a benchmark consisting of a large dataset of procedurally generated 3D animations, AGENT (Action, Goal, Efficiency, coNstraint, uTility), structured around four scenarios (goal preferences, action efficiency, unobserved constraints, and cost-reward trade-offs) that probe key concepts of core intuitive psychology. We validate AGENT with human-ratings, propose an evaluation protocol emphasizing generalization, and compare two strong baselines built on Bayesian inverse planning and a Theory of Mind neural network. Our results suggest that to pass the designed tests of core intuitive psychology at human levels, a model must acquire or have built-in representations of how agents plan, combining utility computations and core knowledge of objects and physics.

Submitted 25 July, 2021; v1 submitted 24 February, 2021; originally announced February 2021.

Comments: ICML 2021, 12 pages, 7 figures

arXiv:2011.07193 [pdf, other]

Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation

Authors: Kei Ota, Devesh K. Jha, Diego Romeres, Jeroen van Baar, Kevin A. Smith, Takayuki Semitsu, Tomoaki Oiki, Alan Sullivan, Daniel Nikovski, Joshua B. Tenenbaum

Abstract: Humans quickly solve tasks in novel systems with complex dynamics, without requiring much interaction. While deep reinforcement learning algorithms have achieved tremendous success in many complex tasks, these algorithms need a large number of samples to learn meaningful policies. In this paper, we present a task for navigating a marble to the center of a circular maze. While this system is very intuitive and easy for humans to solve, it can be very difficult and inefficient for standard reinforcement learning algorithms to learn meaningful policies. We present a model that learns to move a marble in the complex environment within minutes of interacting with the real system. Learning consists of initializing a physics engine with parameters estimated using data from the real system. The error in the physics engine is then corrected using Gaussian process regression, which is used to model the residual between real observations and physics engine simulations. The physics engine augmented with the residual model is then used to control the marble in the maze environment using a model-predictive feedback over a receding horizon. To the best of our knowledge, this is the first time that a hybrid model consisting of a full physics engine along with a statistical function approximator has been used to control a complex physical system in real-time using nonlinear model-predictive control (NMPC).

Submitted 15 February, 2021; v1 submitted 13 November, 2020; originally announced November 2020.

Comments: Under submission
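
Illustrative note (not part of the arXiv record): the abstract above describes correcting a physics engine with a Gaussian process fitted to the residual between real observations and simulation. The following is a minimal one-dimensional sketch of that idea only, not the authors' implementation. The functions `simulate` and `real_system`, the squared-exponential kernel, the training inputs, and the noise level are all assumptions chosen for illustration; the paper's state space, kernel, and controller are not reproduced here.

```python
import math

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return math.exp(-((a - b) ** 2) / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_residual_gp(xs, residuals, noise=1e-6, length=1.0):
    """Return the GP posterior mean function for the residuals."""
    K = [[rbf(a, b, length) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, residuals)  # (K + noise * I)^-1 r
    def mean(x):
        return sum(w * rbf(x, xi, length) for w, xi in zip(alpha, xs))
    return mean

# Hypothetical stand-ins: an imperfect simulator and the "real" system.
def simulate(x):
    return 2.0 * x

def real_system(x):
    return 2.0 * x + 0.5 * math.sin(x)

train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
residuals = [real_system(x) - simulate(x) for x in train_x]
residual_mean = fit_residual_gp(train_x, residuals)

def augmented(x):
    """Simulator prediction plus the learned residual correction."""
    return simulate(x) + residual_mean(x)
```

In this toy setting, `augmented` closely matches `real_system` at the training inputs because the residual model nearly interpolates them. A controller such as the NMPC mentioned in the abstract would then plan against `augmented` rather than `simulate`.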

arXiv:1911.11180 [pdf, other]

doi 10.1038/s41467-019-13066-9

Infrared nano-spectroscopy of ferroelastic domain walls in hybrid improper ferroelectric Ca$_3$Ti$_2$O$_7$

Authors: K. A. Smith, E. A. Nowadnick, S. Fan, O. Khatib, S. J. Lim, B. Gao, N. C. Harms, S. N. Neal, J. K. Kirkland, M. C. Martin, C. J. Won, M. B. Raschke, S. -W. Cheong, C. J. Fennie, G. L. Carr, H. A. Bechtel, J. L. Musfeldt

Abstract: Ferroic materials are well known to exhibit heterogeneity in the form of domain walls. Understanding the properties of these boundaries is crucial for controlling functionality with external stimuli and for realizing their potential for ultra-low power memory and logic devices as well as novel computing architectures. In this work, we employ synchrotron-based near-field infrared nano-spectroscopy to reveal the vibrational properties of ferroelastic (90$^\circ$ ferroelectric) domain walls in the hybrid improper ferroelectric Ca$_3$Ti$_2$O$_7$. By locally mapping the Ti-O stretching and Ti-O-Ti bending modes, we reveal how structural order parameters rotate across a wall. Thus, we link observed near-field amplitude changes to underlying structural modulations and test ferroelectric switching models against real space measurements of local structure. This initiative opens the door to broadband infrared nano-imaging of heterogeneity in ferroics.

Submitted 25 November, 2019; originally announced November 2019.

Journal ref: Nature Communications 10, 5235 (2019)

arXiv:1907.09620 [pdf, other]

doi 10.1073/pnas.1912341117

Rapid trial-and-error learning with simulation supports flexible tool use and physical reasoning

Authors: Kelsey R. Allen, Kevin A. Smith, Joshua B. Tenenbaum

Abstract: Many animals, and an increasing number of artificial agents, display sophisticated capabilities to perceive and manipulate objects. But human beings remain distinctive in their capacity for flexible, creative tool use -- using objects in new ways to act on the world, achieve a goal, or solve a problem. To study this type of general physical problem solving, we introduce the Virtual Tools game. In this game, people solve a large range of challenging physical puzzles in just a handful of attempts. We propose that the flexibility of human physical problem solving rests on an ability to imagine the effects of hypothesized actions, while the efficiency of human search arises from rich action priors which are updated via observations of the world. We instantiate these components in the "Sample, Simulate, Update" (SSUP) model and show that it captures human performance across 30 levels of the Virtual Tools game. More broadly, this model provides a mechanism for explaining how people condense general physical knowledge into actionable, task-specific plans to achieve flexible and efficient physical problem-solving.

Submitted 29 June, 2020; v1 submitted 22 July, 2019; originally announced July 2019.

Comments: This manuscript is in press at PNAS. It is an extended version of a paper "Rapid Trial-and-Error Learning in Physical Problem Solving" accepted for oral presentation at the 41st Annual Meeting of the Cognitive Science Society (2019). It represents ongoing work on the part of the authors

arXiv:1705.09701 [pdf, other]

SMORE: A Cold Data Object Store for SMR Drives (Extended Version)

Authors: Peter Macko, Xiongzi Ge, John Haskins Jr., James Kelley, David Slik, Keith A. Smith, Maxim G. Smith

Abstract: Shingled magnetic recording (SMR) increases the capacity of magnetic hard drives, but it requires that each zone of a disk be written sequentially and erased in bulk. This makes SMR a good fit for workloads dominated by large data objects with limited churn. To explore this possibility, we have developed SMORE, an object storage system designed to reliably and efficiently store large, seldom-changing data objects on an array of host-managed or host-aware SMR disks. SMORE uses a log-structured approach to accommodate the constraint that all writes to an SMR drive must be sequential within large shingled zones. It stripes data across zones on separate disks, using erasure coding to protect against drive failure. A separate garbage collection thread reclaims space by migrating live data out of the emptiest zones so that they can be trimmed and reused. An index stored on flash and backed up to the SMR drives maps object identifiers to on-disk locations. SMORE interleaves log records with object data within SMR zones to enable index recovery after a system crash (or failure of the flash device) without any additional logging mechanism. SMORE achieves full disk bandwidth when ingesting data, with a variety of object sizes, and when reading large objects. Read performance declines for smaller object sizes where inter-object seek time dominates. With a worst-case pattern of random deletions, SMORE has a write amplification (not counting RAID parity) of less than 2.0 at 80% occupancy.
By taking an index snapshot every two hours, SMORE recovers from crashes in less than a minute. More frequent snapshots allow faster recovery.

Submitted 26 May, 2017; originally announced May 2017.

Comments: 13 pages, 8 figures, full version of 6 page paper published at MSST 2017
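
Illustrative note (not part of the arXiv record): the SMORE abstract describes a garbage collector that migrates live data out of the emptiest zones so they can be trimmed and reused. The sketch below is a toy model of that greedy policy under stated assumptions, not SMORE's code. The `Zone` class, the `collect_emptiest` helper, the 100-byte zone capacity, and the object names are hypothetical. Striping, erasure coding, and the flash index are not modeled. Write amplification is computed with one common definition, (user bytes + migrated bytes) / user bytes, which the abstract does not spell out.

```python
from dataclasses import dataclass, field

@dataclass
class Zone:
    capacity: int
    live: dict = field(default_factory=dict)  # object id -> size in bytes
    written: int = 0  # bytes appended since the last trim (live + dead)

    def append(self, obj_id, size):
        """Sequential append; a zone is never overwritten in place."""
        if self.written + size > self.capacity:
            raise ValueError("zone full")
        self.live[obj_id] = size
        self.written += size

    def delete(self, obj_id):
        """Deleting only drops the live reference; the space stays dead."""
        self.live.pop(obj_id)

    def live_bytes(self):
        return sum(self.live.values())

    def trim(self):
        """Erase the whole zone in bulk so it can be rewritten."""
        self.live.clear()
        self.written = 0

def collect_emptiest(zones, target):
    """Migrate live data from the zone with the least live data into
    `target`, trim the victim, and return the number of bytes migrated."""
    candidates = [z for z in zones if z is not target and z.written > 0]
    if not candidates:
        return 0
    victim = min(candidates, key=Zone.live_bytes)
    moved = 0
    for obj_id, size in list(victim.live.items()):
        target.append(obj_id, size)
        moved += size
    victim.trim()
    return moved

# Toy workload: two full zones, one object deleted, one spare zone.
zones = [Zone(100), Zone(100)]
spare = Zone(100)
zones[0].append("x", 40)
zones[0].append("y", 60)
zones[1].append("p", 50)
zones[1].append("q", 50)
zones[0].delete("y")

user_bytes = 200
moved = collect_emptiest(zones + [spare], spare)
write_amp = (user_bytes + moved) / user_bytes
```

Here the collector picks the first zone (40 live bytes versus 100), rewrites those 40 bytes into the spare zone, and frees the whole victim zone, giving a write amplification of 1.2 for this toy workload.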

arXiv:0802.2950 [pdf]

Looking Beyond Content: Skill development for engineers

Authors: Edward F. Redish, Karl A. Smith

Abstract: Current concerns over reforming engineering education have focused attention on helping students develop skills and an adaptive expertise. Phenomenological guidelines for instruction along these lines can be understood as arising out of an emerging theory of thinking and learning built on results in the neural, cognitive, and behavioral sciences. We outline this framework and consider some of its implications for one example: developing a more detailed understanding of the specific skill of using mathematics in modeling physical situations. This approach provides theoretical underpinnings for some best-practice instructional methods designed to help students develop this skill and provides guidance for further research in the area.

Submitted 20 February, 2008; originally announced February 2008.

Comments: 20 pages

Journal ref: Journal of Engineering Education 97, 295-307 (July 2008)

Showing 1–11 of 11 results for author: Smith, K A