-
Autonomous loading of ore piles with Load-Haul-Dump machines using Deep Reinforcement Learning
Authors:
Rodrigo Salas,
Francisco Leiva,
Javier Ruiz-del-Solar
Abstract:
This work presents a deep reinforcement learning-based approach to train controllers for the autonomous loading of ore piles with a Load-Haul-Dump (LHD) machine. These controllers must perform a complete loading maneuver, filling the LHD's bucket with material while avoiding wheel drift, dumping material, or getting stuck in the pile. The training process is conducted entirely in simulation, using a simple environment that leverages the Fundamental Equation of Earth-Moving Mechanics to keep the computational cost low. Two types of policies are trained: one with a hybrid action space and another with a continuous action space. The RL-based policies are evaluated both in simulation and in the real world using a scaled LHD and a scaled muck pile, and their performance is compared to that of a heuristics-based controller and human teleoperation. Additional real-world experiments assess the robustness of the RL-based policies to measurement errors in the characterization of the piles. Overall, the RL-based controllers perform well in the real world, achieving fill factors between 71% and 94% and less wheel drift than the baselines during the loading maneuvers. A video showing the training environment and the learned behavior in simulation, as well as some of the real-world experiments, can be found at https://youtu.be/jOpA1rkwhDY.
Submitted 11 September, 2024;
originally announced September 2024.
-
Lessons Learned: The Evolution of an Undergraduate Robotics Course in Computer Science
Authors:
R. Pito Salas
Abstract:
Seven years ago (2016), we began integrating Robotics into our Computer Science curriculum. This paper explores the mission, the initial goals and objectives, the specific choices we made along the way and why we made them, and the outcomes. Of course, we were not the first to do so. Our contribution in this paper is to describe a seven-year experience in the hope that others going down this road will benefit, perhaps avoiding some missteps and dead-ends. We offer our answers to many questions that anyone bootstrapping a new robotics program may have to deal with. At the end of the paper, we discuss a set of lessons learned, including striking the right balance between depth and breadth in syllabus design and material organization, the significance of using physical robots and criteria for selecting a suitable robotics platform, insights into the scope and design of a robotics lab, the necessity of standardizing hardware and software configurations, along with implementation methods, and strategies for preparing students for the steep learning curve.
Submitted 27 April, 2024;
originally announced April 2024.
-
Tutel: Adaptive Mixture-of-Experts at Scale
Authors:
Changho Hwang,
Wei Cui,
Yifan Xiong,
Ziyue Yang,
Ze Liu,
Han Hu,
Zilong Wang,
Rafael Salas,
Jithin Jose,
Prabhat Ram,
Joe Chau,
Peng Cheng,
Fan Yang,
Mao Yang,
Yongqiang Xiong
Abstract:
Sparsely-gated mixture-of-experts (MoE) has been widely adopted to scale deep learning models to trillion-plus parameters with fixed computational cost. The algorithmic performance of MoE relies on its token routing mechanism, which forwards each input token to the right sub-models, or experts. While token routing dynamically determines the amount of expert workload at runtime, existing systems suffer from inefficient computation due to their static execution, namely static parallelism and pipelining, which do not adapt to the dynamic workload. We present Flex, a highly scalable stack design and implementation for MoE with dynamically adaptive parallelism and pipelining. Flex designs an identical layout for distributing MoE model parameters and input data, which can be leveraged by all possible parallelism or pipelining methods without any mathematical inequivalence or tensor migration overhead. This enables adaptive parallelism/pipelining optimization at zero cost during runtime. Based on this key design, Flex also implements various MoE acceleration techniques. Aggregating all techniques, Flex delivers a 4.96x and 5.75x speedup of a single MoE layer over 16 and 2,048 A100 GPUs, respectively, compared with the previous state-of-the-art. Our evaluation shows that Flex efficiently and effectively runs a real-world MoE-based model named SwinV2-MoE, built upon Swin Transformer V2, a state-of-the-art computer vision architecture. On efficiency, Flex accelerates SwinV2-MoE, achieving up to 1.55x and 2.11x speedup in training and inference over Fairseq, respectively. On effectiveness, the SwinV2-MoE model achieves higher accuracy than its dense counterpart in both pre-training and downstream computer vision tasks such as COCO object detection, indicating the readiness of Flex for end-to-end real-world model training and inference.
Submitted 5 June, 2023; v1 submitted 7 June, 2022;
originally announced June 2022.
-
Rotation invariant CNN using scattering transform for image classification
Authors:
Rosemberg Rodriguez Salas,
Eva Dokladalova,
Petr Dokládal
Abstract:
The accuracy of deep convolutional neural networks is heavily impacted by rotations of the input data. In this paper, we propose a convolutional predictor that is invariant to rotations in the input. This architecture is capable of predicting the angular orientation without angle-annotated data. Furthermore, the predictor continuously maps the random rotation of the input to a circular space of the prediction. For this purpose, we use the roto-translation properties of Scattering Transform Networks together with a series of 3D convolutions. We validate the results by training with upright and randomly rotated samples. This enables further applications of this work in fields such as the automatic re-orientation of randomly oriented datasets.
Submitted 21 May, 2021;
originally announced May 2021.
-
Teaching Continuity in Robotics Labs in the Age of Covid and Beyond
Authors:
R. Pito Salas
Abstract:
This paper argues that training future Roboticists and Robotics Engineers in Computer Science departments requires extensive direct work with real robots, and that this educational mission is negatively impacted when access to robotics learning laboratories is curtailed. This is exactly the problem that Robotics Labs encountered in early 2020, at the start of the Covid pandemic. The paper then describes a remote/virtual robotics teaching laboratory and examines in detail what that would mean, what the benefits would be, and how it may be used. Part of this vision was implemented at our institution during 2020 and has been in constant use since then. The specific architecture and implementation, as far as it has been built, is described. The exciting insight in the conclusion is that the work encouraged and triggered by a pandemic seems to have very positive longer-term benefits: increasing access to robotics education, greatly increasing the ability of any one institution to scale its robotics education, and potentially doing so while reducing costs.
Submitted 18 May, 2021;
originally announced May 2021.
-
Situated Multimodal Control of a Mobile Robot: Navigation through a Virtual Environment
Authors:
Katherine Krajovic,
Nikhil Krishnaswamy,
Nathaniel J. Dimick,
R. Pito Salas,
James Pustejovsky
Abstract:
We present a new interface for controlling a navigation robot in novel environments using coordinated gesture and language. We use a TurtleBot3 robot with a LIDAR and a camera, an embodied simulation of what the robot has encountered while exploring, and a cross-platform bridge facilitating generic communication. A human partner can deliver instructions to the robot using spoken English and gestures relative to the simulated environment, to guide the robot through navigation tasks.
Submitted 13 July, 2020;
originally announced July 2020.
-
Reusable Learning Objects: An Agile Approach
Authors:
R. Pito Salas
Abstract:
This paper discusses Reusable Learning Objects (RLOs) and to what extent they have lived up to their promise, particularly that of reusability. Reusable Learning Objects have been discussed in the literature for the last 20 years, and yet true large-scale sharing of learning and teaching materials remains relatively rare and challenging. This paper argues that part of the reason is that the granularity of the learning objects in use today is not conducive to true reuse. Whole PowerPoint slide decks and Word documents are kept in individual files and folders, which is not an ideal situation. As a result, educators, teachers, and course designers are constantly reinventing the wheel, or searching for where that one excellent assignment, explanation, or definition was last seen so it can be copied forward. This paper argues that to achieve effective reuse of Learning Objects, the following are required: smaller, more granular (micro) learning objects; means to combine them into larger presentation products; and modern revision and version control. The paper proposes applying approaches originating in the software engineering community, such as agile methodology, version control and management, markup languages, and agile publishing, which together form the Agile Approach of the title of the paper. With that foundation laid, the paper examines CourseGen, an open source software platform designed for creating, sharing, reusing and publishing reusable course content. CourseGen uses a modified markdown format augmented by CourseGen-specific directives, such as $link to and $include topic. The CourseGen compiler converts a collection of CourseGen files into a final format such as a web site or a PowerPoint. CourseGen was designed, used and refined over the last three years in several Computer Science courses at Brandeis University.
Submitted 9 July, 2020;
originally announced July 2020.
-
The impact of binaural white noise with oscillations of 100 to 750hz in the short-term visual working memory and the reactivity of alpha and beta cerebral waves
Authors:
Cesar Rommel Salas
Abstract:
According to some researchers, noise is typically conceived as a detrimental factor in cognitive performance, affecting perception, decision making, and motor function. However, recent studies associate white noise with concentration and calm. This research therefore seeks to establish the impact of binaural white noise on the performance of short-term visual and working memory, on alpha and beta brain activity, and on attention and meditation, using two auditory stimuli with frequency ranges of 100 to 450 Hz and 100 to 750 Hz. The study was conducted in the city of Montes Claros, Brazil, where seven participants (n = 7) with university studies were evaluated, with an average age of 36.71 and in two age groups: 21 to 30 (GP1) and 41 to 50 (GP2). In the experimental process, the short-term visual memory tests were performed using the CogniFit Cognitive Assessment Battery (CAB), and brain activity was recorded using a monopolar electroencephalogram and the eSense algorithms. From the results obtained and the statistical tests applied, we can infer that binaural white noise with oscillations of 100 to 750 Hz contributed to short-term visual working memory performance.
Submitted 5 May, 2018;
originally announced May 2018.
-
GANs for Biological Image Synthesis
Authors:
Anton Osokin,
Anatole Chessel,
Rafael E. Carazo Salas,
Federico Vaggi
Abstract:
In this paper, we propose a novel application of Generative Adversarial Networks (GAN) to the synthesis of cells imaged by fluorescence microscopy. Compared to natural images, cells tend to have a simpler and more geometric global structure that facilitates image generation. However, the correlation between the spatial pattern of different fluorescent proteins reflects important biological functions, and synthesized images have to capture these relationships to be relevant for biological applications. We adapt GANs to the task at hand and propose new models with causal dependencies between image channels that can generate multi-channel images, which would be impossible to obtain experimentally. We evaluate our approach using two independent techniques and compare it against sensible baselines. Finally, we demonstrate that by interpolating across the latent space we can mimic the known changes in protein localization that occur through time during the cell cycle, allowing us to predict temporal evolution from static images.
Submitted 12 September, 2017; v1 submitted 15 August, 2017;
originally announced August 2017.
-
Modelo de Aprendizaje Biocibernetico BLM
Authors:
Rommel Salas
Abstract:
Education in the digital period in which we live is facing challenges never seen before, preceded by phenomena that involve not only traditional social units but also new virtual communities. Innovating is difficult and challenging; however, we must think of new teaching methods that have an impact on the current generation of students, who arrive with new needs and expectations. The construction of knowledge from the subject and the virtual world that surrounds it establishes the basis for the development of a new model of teaching, where the classroom is the particular representation of a new physical-cybernetic ecosystem composed of the three large dimensions that are part of this new techno-social convergence (human - information - machine). This allows an interrelation between the student, information, machine, and the teacher, using Biocybernetic methods of influence, control, and replication by means of the massive impact vector (i), together with the development of new strategies assisted by cybernetics and the updating of academic content according to the new teaching environment. Hence the importance of this study, which leads us to the need for a new model for transforming academic instruction, one that is not based on a conglomerate of technological tools but establishes a new educational and transformative model based on "Collaborative Thinking" and the ubiquity of information, thereby establishing the relationship between the subject and the object of study and allowing us to establish the new Biocybernetic educational paradigm in the digital period.
Submitted 27 June, 2017;
originally announced June 2017.
-
Antropologia de la Informatica Social: Teoria de la Convergencia Tecno-Social
Authors:
Rommel Salas
Abstract:
The traditional humanism of the twentieth century, inspired by the culture of the book, systematically distanced itself from the new society of digital information. The Internet and information-processing tools revolutionized the world, and society during this period developed certain adaptive characteristics based on human-machine coexistence. This transformation is based on the impact of three technology segments: devices, applications, and the infrastructure of social communication, which are involved in various physical, behavioural, and cognitive changes of the human being and in the emergence of new models of influence and social control through the new ubiquitous communication. However, in this new process of coexistence, new models such as "collaborative thinking" and "InfoSharing" develop, managing social information under three ontological dimensions, Human (h) - Information (i) - Machine (m), which are the basis of a new physical-cyber ecosystem where new social units called "virtual communities" coexist and develop. This new communication infrastructure and the social management of information have exposed areas of vulnerability (the "social perspective of risk"), impacting all social units through the massive impact vector (i). The virtual environment "H + i + M" and its components, as well as the life cycle management of social information, allow us to understand the path of "Techno - Social" integration and make a new contribution to cybernetics within the convergence of technology with society and the new challenges of coexistence. The aim is a new holistic rather than pragmatic vision, as the human component (h) in the virtual environment is the precursor of the future and needs to be studied not as an application but as the hub of a new society.
Submitted 27 June, 2017;
originally announced June 2017.