-
Quantifying intra-tumoral genetic heterogeneity of glioblastoma toward precision medicine using MRI and a data-inclusive machine learning algorithm
Authors:
Lujia Wang,
Hairong Wang,
Fulvio D'Angelo,
Lee Curtin,
Christopher P. Sereduk,
Gustavo De Leon,
Kyle W. Singleton,
Javier Urcuyo,
Andrea Hawkins-Daarud,
Pamela R. Jackson,
Chandan Krishna,
Richard S. Zimmerman,
Devi P. Patra,
Bernard R. Bendok,
Kris A. Smith,
Peter Nakaji,
Kliment Donev,
Leslie C. Baxter,
Maciej M. Mrugała,
Michele Ceccarelli,
Antonio Iavarone,
Kristin R. Swanson,
Nhan L. Tran,
Leland S. Hu,
Jing Li
Abstract:
Glioblastoma (GBM) is one of the most aggressive and lethal human cancers. Intra-tumoral genetic heterogeneity poses a significant challenge for treatment. Biopsy is invasive, which motivates the development of non-invasive, MRI-based machine learning (ML) models to quantify intra-tumoral genetic heterogeneity for each patient. This capability holds great promise for enabling better therapeutic selection to improve patient outcomes. We proposed a novel Weakly Supervised Ordinal Support Vector Machine (WSO-SVM) to predict regional genetic alteration status within each GBM tumor using MRI. WSO-SVM was applied to a unique dataset of 318 image-localized biopsies with spatially matched multiparametric MRI from 74 GBM patients. The model was trained to predict the regional genetic alteration of three GBM driver genes (EGFR, PDGFRA, and PTEN) based on features extracted from the corresponding region of five MRI contrast images. For comparison, a variety of existing ML algorithms were also applied. The classification accuracy of each gene was compared between the different algorithms. The SHapley Additive exPlanations (SHAP) method was further applied to compute contribution scores of different contrast images. Finally, the trained WSO-SVM was used to generate prediction maps within the tumoral area of each patient to help visualize the intra-tumoral genetic heterogeneity. This study demonstrated the feasibility of using MRI and WSO-SVM to enable non-invasive prediction of intra-tumoral regional genetic alteration for each GBM patient, which can inform future adaptive therapies for individualized oncology.
Submitted 29 December, 2023;
originally announced January 2024.
-
Understanding Physical Dynamics with Counterfactual World Modeling
Authors:
Rahul Venkatesh,
Honglin Chen,
Kevin Feigelis,
Daniel M. Bear,
Khaled Jedoui,
Klemen Kotar,
Felix Binder,
Wanhee Lee,
Sherry Liu,
Kevin A. Smith,
Judith E. Fan,
Daniel L. K. Yamins
Abstract:
The ability to understand physical dynamics is critical for agents to act in the world. Here, we use Counterfactual World Modeling (CWM) to extract vision structures for dynamics understanding. CWM uses a temporally-factored masking policy for masked prediction of video data without annotations. This policy enables highly effective "counterfactual prompting" of the predictor, allowing a spectrum of visual structures to be extracted from a single pre-trained predictor without finetuning on annotated datasets. We demonstrate that these structures are useful for physical dynamics understanding, allowing CWM to achieve state-of-the-art performance on the Physion benchmark.
Submitted 22 July, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
-
Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties
Authors:
Hsiao-Yu Tung,
Mingyu Ding,
Zhenfang Chen,
Daniel Bear,
Chuang Gan,
Joshua B. Tenenbaum,
Daniel LK Yamins,
Judith E Fan,
Kevin A. Smith
Abstract:
General physical scene understanding requires more than simply localizing and recognizing objects -- it requires knowledge that objects can have different latent properties (e.g., mass or elasticity), and that those properties affect the outcome of physical events. While there has been great progress in physical and video prediction models in recent years, benchmarks to test their performance typically do not require an understanding that objects have individual physical properties, or at best test only those properties that are directly observable (e.g., size or color). This work proposes a novel dataset and benchmark, termed Physion++, that rigorously evaluates visual physical prediction in artificial systems under circumstances where those predictions rely on accurate estimates of the latent physical properties of objects in the scene. Specifically, we test scenarios where accurate prediction relies on estimates of properties such as mass, friction, elasticity, and deformability, and where the values of those properties can only be inferred by observing how objects move and interact with other objects or fluids. We evaluate the performance of a number of state-of-the-art prediction models that span a variety of levels of learning vs. built-in knowledge, and compare that performance to a set of human predictions. We find that models that have been trained using standard regimes and datasets do not spontaneously learn to make inferences about latent properties, but also that models that encode objectness and physical states tend to make better predictions. However, there is still a huge gap between all models and human performance, and all models' predictions correlate poorly with those made by humans, suggesting that no state-of-the-art model is learning to make physical predictions in a human-like way. Project page: https://dingmyu.github.io/physion_v2/
Submitted 1 November, 2023; v1 submitted 27 June, 2023;
originally announced June 2023.
-
Are Deep Neural Networks SMARTer than Second Graders?
Authors:
Anoop Cherian,
Kuan-Chuan Peng,
Suhas Lohit,
Kevin A. Smith,
Joshua B. Tenenbaum
Abstract:
Recent times have witnessed an increasing number of applications of deep neural networks towards solving tasks that require superior cognitive abilities, e.g., playing Go, generating art, ChatGPT, etc. Such dramatic progress raises the question: how generalizable are neural networks in solving problems that demand broad skills? To answer this question, we propose SMART: a Simple Multimodal Algorithmic Reasoning Task and the associated SMART-101 dataset, for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed specifically for children in the 6--8 age group. Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and their solution needs a mix of several elementary skills, including arithmetic, algebra, and spatial reasoning, among others. To scale our dataset towards training deep neural networks, we programmatically generate entirely new instances for each puzzle, while retaining their solution algorithm. To benchmark performances on SMART-101, we propose a vision and language meta-learning model using varied state-of-the-art backbones. Our experiments reveal that while powerful deep models offer reasonable performances on puzzles in a supervised setting, they are not better than random accuracy when analyzed for generalization. We also evaluate the recent ChatGPT and other large language models on a subset of SMART-101 and find that while these models show convincing reasoning abilities, the answers are often incorrect.
Submitted 11 September, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions
Authors:
Kei Ota,
Hsiao-Yu Tung,
Kevin A. Smith,
Anoop Cherian,
Tim K. Marks,
Alan Sullivan,
Asako Kanezaki,
Joshua B. Tenenbaum
Abstract:
The world is filled with articulated objects that are difficult to determine how to use from vision alone, e.g., a door might open inwards or outwards. Humans handle these objects with strategic trial-and-error: first pushing a door then pulling if that doesn't work. We enable these capabilities in autonomous agents by proposing "Hypothesize, Simulate, Act, Update, and Repeat" (H-SAUR), a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation. We compare our model with existing work in manipulating objects after a handful of exploration actions, on the PartNet-Mobility dataset. We further propose a novel PuzzleBoxes benchmark that contains locked boxes that require multiple steps to solve. We show that the proposed model significantly outperforms the current state-of-the-art articulated object manipulation framework, despite using zero training data. We further improve the test-time efficiency of H-SAUR by integrating a learned prior from learning-based vision models.
Submitted 22 October, 2022;
originally announced October 2022.
-
AGENT: A Benchmark for Core Psychological Reasoning
Authors:
Tianmin Shu,
Abhishek Bhandwaldar,
Chuang Gan,
Kevin A. Smith,
Shari Liu,
Dan Gutfreund,
Elizabeth Spelke,
Joshua B. Tenenbaum,
Tomer D. Ullman
Abstract:
For machine agents to successfully interact with humans in real-world settings, they will need to develop an understanding of human mental life. Intuitive psychology, the ability to reason about hidden mental variables that drive observable actions, comes naturally to people: even pre-verbal infants can tell agents from objects, expecting agents to act efficiently to achieve goals given constraints. Despite recent interest in machine agents that reason about other agents, it is not clear if such agents learn or hold the core psychology principles that drive human reasoning. Inspired by cognitive development studies on intuitive psychology, we present a benchmark consisting of a large dataset of procedurally generated 3D animations, AGENT (Action, Goal, Efficiency, coNstraint, uTility), structured around four scenarios (goal preferences, action efficiency, unobserved constraints, and cost-reward trade-offs) that probe key concepts of core intuitive psychology. We validate AGENT with human-ratings, propose an evaluation protocol emphasizing generalization, and compare two strong baselines built on Bayesian inverse planning and a Theory of Mind neural network. Our results suggest that to pass the designed tests of core intuitive psychology at human levels, a model must acquire or have built-in representations of how agents plan, combining utility computations and core knowledge of objects and physics.
Submitted 25 July, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Data-Efficient Learning for Complex and Real-Time Physical Problem Solving using Augmented Simulation
Authors:
Kei Ota,
Devesh K. Jha,
Diego Romeres,
Jeroen van Baar,
Kevin A. Smith,
Takayuki Semitsu,
Tomoaki Oiki,
Alan Sullivan,
Daniel Nikovski,
Joshua B. Tenenbaum
Abstract:
Humans quickly solve tasks in novel systems with complex dynamics, without requiring much interaction. While deep reinforcement learning algorithms have achieved tremendous success in many complex tasks, these algorithms need a large number of samples to learn meaningful policies. In this paper, we present a task for navigating a marble to the center of a circular maze. While this system is very intuitive and easy for humans to solve, it can be very difficult and inefficient for standard reinforcement learning algorithms to learn meaningful policies. We present a model that learns to move a marble in the complex environment within minutes of interacting with the real system. Learning consists of initializing a physics engine with parameters estimated using data from the real system. The error in the physics engine is then corrected using Gaussian process regression, which is used to model the residual between real observations and physics engine simulations. The physics engine augmented with the residual model is then used to control the marble in the maze environment using a model-predictive feedback over a receding horizon. To the best of our knowledge, this is the first time that a hybrid model consisting of a full physics engine along with a statistical function approximator has been used to control a complex physical system in real-time using nonlinear model-predictive control (NMPC).
Submitted 15 February, 2021; v1 submitted 13 November, 2020;
originally announced November 2020.
-
Infrared nano-spectroscopy of ferroelastic domain walls in hybrid improper ferroelectric Ca$_3$Ti$_2$O$_7$
Authors:
K. A. Smith,
E. A. Nowadnick,
S. Fan,
O. Khatib,
S. J. Lim,
B. Gao,
N. C. Harms,
S. N. Neal,
J. K. Kirkland,
M. C. Martin,
C. J. Won,
M. B. Raschke,
S. -W. Cheong,
C. J. Fennie,
G. L. Carr,
H. A. Bechtel,
J. L. Musfeldt
Abstract:
Ferroic materials are well known to exhibit heterogeneity in the form of domain walls. Understanding the properties of these boundaries is crucial for controlling functionality with external stimuli and for realizing their potential for ultra-low power memory and logic devices as well as novel computing architectures. In this work, we employ synchrotron-based near-field infrared nano-spectroscopy to reveal the vibrational properties of ferroelastic (90$^\circ$ ferroelectric) domain walls in the hybrid improper ferroelectric Ca$_3$Ti$_2$O$_7$. By locally mapping the Ti-O stretching and Ti-O-Ti bending modes, we reveal how structural order parameters rotate across a wall. Thus, we link observed near-field amplitude changes to underlying structural modulations and test ferroelectric switching models against real space measurements of local structure. This initiative opens the door to broadband infrared nano-imaging of heterogeneity in ferroics.
Submitted 25 November, 2019;
originally announced November 2019.
-
Rapid trial-and-error learning with simulation supports flexible tool use and physical reasoning
Authors:
Kelsey R. Allen,
Kevin A. Smith,
Joshua B. Tenenbaum
Abstract:
Many animals, and an increasing number of artificial agents, display sophisticated capabilities to perceive and manipulate objects. But human beings remain distinctive in their capacity for flexible, creative tool use -- using objects in new ways to act on the world, achieve a goal, or solve a problem. To study this type of general physical problem solving, we introduce the Virtual Tools game. In this game, people solve a large range of challenging physical puzzles in just a handful of attempts. We propose that the flexibility of human physical problem solving rests on an ability to imagine the effects of hypothesized actions, while the efficiency of human search arises from rich action priors which are updated via observations of the world. We instantiate these components in the "Sample, Simulate, Update" (SSUP) model and show that it captures human performance across 30 levels of the Virtual Tools game. More broadly, this model provides a mechanism for explaining how people condense general physical knowledge into actionable, task-specific plans to achieve flexible and efficient physical problem-solving.
Submitted 29 June, 2020; v1 submitted 22 July, 2019;
originally announced July 2019.
-
SMORE: A Cold Data Object Store for SMR Drives (Extended Version)
Authors:
Peter Macko,
Xiongzi Ge,
John Haskins Jr.,
James Kelley,
David Slik,
Keith A. Smith,
Maxim G. Smith
Abstract:
Shingled magnetic recording (SMR) increases the capacity of magnetic hard drives, but it requires that each zone of a disk be written sequentially and erased in bulk. This makes SMR a good fit for workloads dominated by large data objects with limited churn. To explore this possibility, we have developed SMORE, an object storage system designed to reliably and efficiently store large, seldom-changing data objects on an array of host-managed or host-aware SMR disks.
SMORE uses a log-structured approach to accommodate the constraint that all writes to an SMR drive must be sequential within large shingled zones. It stripes data across zones on separate disks, using erasure coding to protect against drive failure. A separate garbage collection thread reclaims space by migrating live data out of the emptiest zones so that they can be trimmed and reused. An index stored on flash and backed up to the SMR drives maps object identifiers to on-disk locations. SMORE interleaves log records with object data within SMR zones to enable index recovery after a system crash (or failure of the flash device) without any additional logging mechanism.
SMORE achieves full disk bandwidth when ingesting data---with a variety of object sizes---and when reading large objects. Read performance declines for smaller object sizes where inter-object seek time dominates. With a worst-case pattern of random deletions, SMORE has a write amplification (not counting RAID parity) of less than 2.0 at 80% occupancy. By taking an index snapshot every two hours, SMORE recovers from crashes in less than a minute. More frequent snapshots allow faster recovery.
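As a rough illustration of the garbage collection idea described in this abstract (migrating live data out of the emptiest zone so that the zone can be reset and reused), here is a minimal Python sketch. It is not SMORE's implementation: the `Zone` class, the helper names, and the single spare-zone layout are assumptions made for illustration only, and striping, erasure coding, and the index are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical append-only zone: space is reclaimed only by a full reset."""
    capacity: int
    written: int = 0                           # sequential write pointer
    live: dict = field(default_factory=dict)   # object id -> size of live data

    def live_bytes(self) -> int:
        return sum(self.live.values())


def append(zone: Zone, obj_id: str, size: int) -> None:
    """Write an object at the zone's write pointer (sequential writes only)."""
    if zone.written + size > zone.capacity:
        raise ValueError("zone is full")
    zone.live[obj_id] = size
    zone.written += size


def delete(zone: Zone, obj_id: str) -> None:
    """Mark an object dead; its space stays occupied until the zone is reset."""
    zone.live.pop(obj_id)


def collect_emptiest(zones: list, spare: Zone):
    """Greedy cleaning: move live data out of the zone holding the least
    live data into the spare zone, then reset the victim for reuse."""
    candidates = [z for z in zones if z is not spare and z.written > 0]
    victim = min(candidates, key=lambda z: z.live_bytes())
    moved = 0
    for obj_id, size in list(victim.live.items()):
        append(spare, obj_id, size)
        moved += size
    victim.live.clear()
    victim.written = 0                         # zone reset: whole zone is free again
    return victim, moved
```

The bytes moved by `collect_emptiest` are the extra writes that cleaning costs, so picking the zone with the least live data keeps that overhead low in this simplified model.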
Submitted 26 May, 2017;
originally announced May 2017.
-
Looking Beyond Content: Skill development for engineers
Authors:
Edward F. Redish,
Karl A. Smith
Abstract:
Current concerns over reforming engineering education have focused attention on helping students develop skills and an adaptive expertise. Phenomenological guidelines for instruction along these lines can be understood as arising out of an emerging theory of thinking and learning built on results in the neural, cognitive, and behavioral sciences. We outline this framework and consider some of its implications for one example: developing a more detailed understanding of the specific skill of using mathematics in modeling physical situations. This approach provides theoretical underpinnings for some best-practice instructional methods designed to help students develop this skill and provides guidance for further research in the area.
△ Less
Submitted 20 February, 2008;
originally announced February 2008.