-
Model Evaluation of a Transformable CubeSat for Nonholonomic Attitude Reorientation Using a Drop Tower
Authors:
Yuki Kubo,
Tsubasa Ando,
Hirona Kawahara,
Shu Miyata,
Naoya Uchiyama,
Kazutoshi Ito,
Yoshiki Sugawara
Abstract:
This paper presents a design for a drop tower test to evaluate a numerical model for a structurally reconfigurable spacecraft with actuatable joints, referred to as a transformable spacecraft. A mock-up robot for a 3U-sized transformable spacecraft is designed to fit in a limited time and space of the microgravity environment available in the drop tower. The robot performs agile reorientation, ref…
▽ More
This paper presents a design for a drop tower test to evaluate a numerical model for a structurally reconfigurable spacecraft with actuatable joints, referred to as a transformable spacecraft. A mock-up robot for a 3U-sized transformable spacecraft is designed to fit in a limited time and space of the microgravity environment available in the drop tower. The robot performs agile reorientation, referred to as nonholonomic attitude control, by actuating joints in a particular manner. To adapt to the very short duration of microgravity in the drop tower test, a successive joint actuation maneuver is optimized to maximize the amount of attitude reorientation within the time constraint. The robot records the angular velocity history of all four bodies, and the data is analyzed to evaluate the accuracy of the numerical model. We confirm that the constructed numerical model sufficiently replicates the robot's motion and show that the post-experiment model corrections further improve the accuracy of the numerical simulations. Finally, the difference between this drop tower test and the actual orbit demonstration is discussed to show the prospect.
△ Less
Submitted 23 January, 2025;
originally announced January 2025.
-
Proposal of protocols for speech materials acquisition and presentation assisted by tools based on structured test signals
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Mitsunori Mizumachi,
Kohei Yatabe
Abstract:
We propose protocols for acquiring speech materials, making them reusable for future investigations, and presenting them for subjective experiments. We also provide means to evaluate existing speech materials' compatibility with target applications. We built these protocols and tools based on structured test signals and analysis methods, including a new family of the Time-Stretched Pulse (TSP). Ov…
▽ More
We propose protocols for acquiring speech materials, making them reusable for future investigations, and presenting them for subjective experiments. We also provide means to evaluate existing speech materials' compatibility with target applications. We built these protocols and tools based on structured test signals and analysis methods, including a new family of the Time-Stretched Pulse (TSP). Over a billion times more powerful computational (including software development) resources than a half-century ago enabled these protocols and tools to be accessible to under-resourced environments.
△ Less
Submitted 30 September, 2024;
originally announced September 2024.
-
Interactive tools for making temporally variable, multiple-attributes, and multiple-instances morphing accessible: Flexible manipulation of divergent speech instances for explorational research and education
Authors:
Hideki Kawahara,
Masanori Morise
Abstract:
We generalized a voice morphing algorithm capable of handling temporally variable, multiple-attributes, and multiple instances. The generalized morphing provides a new strategy for investigating speech diversity. However, excessive complexity and the difficulty of preparation have prevented researchers and students from enjoying its benefits. To address this issue, we introduced a set of interacti…
▽ More
We generalized a voice morphing algorithm capable of handling temporally variable, multiple-attributes, and multiple instances. The generalized morphing provides a new strategy for investigating speech diversity. However, excessive complexity and the difficulty of preparation have prevented researchers and students from enjoying its benefits. To address this issue, we introduced a set of interactive tools to make preparation and tests less cumbersome. These tools are integrated into our previously reported interactive tools as extensions. The introduction of the extended tools in lessons in graduate education was successful. Finally, we outline further extensions to explore excessively complex morphing parameter settings.
△ Less
Submitted 20 April, 2024;
originally announced April 2024.
-
Simultaneous Measurement of Multiple Acoustic Attributes Using Structured Periodic Test Signals Including Music and Other Sound Materials
Authors:
Hideki Kawahara,
Kohei Yatabe,
Ken-Ichi Sakakibara,
Mitsunori Mizumachi,
Tatsuya Kitamura
Abstract:
We introduce a general framework for measuring acoustic properties such as liner time-invariant (LTI) response, signal-dependent time-invariant (SDTI) component, and random and time-varying (RTV) component simultaneously using structured periodic test signals. The framework also enables music pieces and other sound materials as test signals by "safeguarding" them by adding slight deterministic "no…
▽ More
We introduce a general framework for measuring acoustic properties such as liner time-invariant (LTI) response, signal-dependent time-invariant (SDTI) component, and random and time-varying (RTV) component simultaneously using structured periodic test signals. The framework also enables music pieces and other sound materials as test signals by "safeguarding" them by adding slight deterministic "noise." Measurement using swept-sin, MLS (Maxim Length Sequence), and their variants are special cases of the proposed framework. We implemented interactive and real-time measuring tools based on this framework and made them open-source. Furthermore, we applied this framework to assess pitch extractors objectively.
△ Less
Submitted 6 September, 2023;
originally announced September 2023.
-
Measuring pitch extractors' response to frequency-modulated multi-component signals
Authors:
Hideki Kawahara,
Kohei Yatabe,
Ken-Ichi Sakakibara,
Tatsuya Kitamura,
Hideki Banno,
Masanori Morise
Abstract:
This article focuses on the research tool for investigating the fundamental frequencies of voiced sounds. We introduce an objective and informative measurement method of pitch extractors' response to frequency-modulated tones. The method uses a new test signal for acoustic system analysis. The test signal enables simultaneous measurement of the extractors' responses. They are the modulation freque…
▽ More
This article focuses on the research tool for investigating the fundamental frequencies of voiced sounds. We introduce an objective and informative measurement method of pitch extractors' response to frequency-modulated tones. The method uses a new test signal for acoustic system analysis. The test signal enables simultaneous measurement of the extractors' responses. They are the modulation frequency response and the total distortion, including intermodulation distortions. We applied this method to various pitch extractors and placed them on several performance maps. We used the proposed method to fine-tune one of the extractors to make it the best fit tool for scientific research of voice fundamental frequencies.
△ Less
Submitted 2 April, 2022;
originally announced April 2022.
-
An objective test tool for pitch extractors' response attributes
Authors:
Hideki Kawahara,
Kohei Yatabe,
Ken-Ichi Sakakibara,
Tatsuya Kitamura,
Hideki Banno,
Masanori Morise
Abstract:
We propose an objective measurement method for pitch extractors' responses to frequency-modulated signals. It enables us to evaluate different pitch extractors with unified criteria. The method uses extended time-stretched pulses combined by binary orthogonal sequences. It provides simultaneous measurement results consisting of the linear and the non-linear time-invariant responses and random and…
▽ More
We propose an objective measurement method for pitch extractors' responses to frequency-modulated signals. It enables us to evaluate different pitch extractors with unified criteria. The method uses extended time-stretched pulses combined by binary orthogonal sequences. It provides simultaneous measurement results consisting of the linear and the non-linear time-invariant responses and random and time-varying responses. We tested representative pitch extractors using fundamental frequencies spanning 80~Hz to 800~Hz with 1/48 octave steps and produced more than 2000 modulation frequency response plots. We found that making scientific visualization by animating these plots enables us to understand different pitch extractors' behavior at once. Such efficient and effortless inspection is impossible by inspecting all individual plots. The proposed measurement method with visualization leads to further improvement of the performance of one of the extractors mentioned above. In other words, our procedure turns the specific pitch extractor into the best reliable measuring equipment that is crucial for scientific research. We open-sourced MATLAB codes of the proposed objective measurement method and visualization procedure.
△ Less
Submitted 24 June, 2022; v1 submitted 2 April, 2022;
originally announced April 2022.
-
Safeguarding test signals for acoustic measurement using arbitrary sounds
Authors:
Hideki Kawahara,
Kohei Yatabe
Abstract:
We propose a simple method to measure acoustic responses using any sounds by converting them suitable for measurement. This method enables us to use music pieces for measuring acoustic conditions. It is advantageous to measure such conditions without annoying test sounds to listeners. In addition, applying the underlying idea of simultaneous measurement of multiple paths provides practically valua…
▽ More
We propose a simple method to measure acoustic responses using any sounds by converting them suitable for measurement. This method enables us to use music pieces for measuring acoustic conditions. It is advantageous to measure such conditions without annoying test sounds to listeners. In addition, applying the underlying idea of simultaneous measurement of multiple paths provides practically valuable features. For example, it is possible to measure deviations (temporally stable, random, and time-varying) and the impulse response while reproducing slightly modified contents under target conditions. The key idea of the proposed method is to add relatively small deterministic signals that sound like noise to the original sounds. We call the converted sounds safeguarded test signals.
△ Less
Submitted 21 December, 2021;
originally announced December 2021.
-
Objective measurement of pitch extractors' responses to frequency modulated sounds and two reference pitch extraction methods for analyzing voice pitch responses to auditory stimulation
Authors:
Hideki Kawahara,
Kohei Yatabe,
Ken-Ichi Sakakibara,
Tatsuya Kitamura,
Hideki Banno,
Masanori Morise
Abstract:
We propose an objective measurement method for pitch extractors' responses to frequency-modulated signals. The method simultaneously measures the linear and the non-linear time-invariant responses and random and time-varying responses. It uses extended time-stretched pulses combined by binary orthogonal sequences. Our recent finding of involuntary voice pitch response to auditory stimulation while…
▽ More
We propose an objective measurement method for pitch extractors' responses to frequency-modulated signals. The method simultaneously measures the linear and the non-linear time-invariant responses and random and time-varying responses. It uses extended time-stretched pulses combined by binary orthogonal sequences. Our recent finding of involuntary voice pitch response to auditory stimulation while voicing motivated this proposal. The involuntary voice pitch response provides means to investigate voice chain subsystems individually and objectively. This response analysis requires reliable and precise pitch extraction. We found that existing pitch extractors failed to correctly analyze signals used for auditory stimulation by using the proposed method. Therefore, we propose two reference pitch extractors based on the instantaneous frequency analysis and multi-resolution power spectrum analysis. The proposed extractors correctly analyze the test signals. We open-sourced MATLAB codes to measure pitch extractors and codes for conducting the voice pitch response experiment on our GitHub repository.
△ Less
Submitted 27 June, 2022; v1 submitted 5 November, 2021;
originally announced November 2021.
-
Implementation of interactive tools for investigating fundamental frequency response of voiced sounds to auditory stimulation
Authors:
Hideki Kawahara,
Toshie Matsui Kohei,
Yatabe Ken-Ichi Sakakibara Minoru Tsuzaki Masanori Morise Toshio Irino
Abstract:
We introduced a measurement procedure for the involuntary response of voice fundamental-frequency to frequency modulated auditory stimulation. This involuntary response plays an essential role in voice fundamental frequency control while less investigated due to technical difficulties. This article introduces an interactive and real-time tool for investigating this response and supporting tools ad…
▽ More
We introduced a measurement procedure for the involuntary response of voice fundamental-frequency to frequency modulated auditory stimulation. This involuntary response plays an essential role in voice fundamental frequency control while less investigated due to technical difficulties. This article introduces an interactive and real-time tool for investigating this response and supporting tools adopting our new measurement method. The method enables simultaneous measurement of multiple system properties based on a novel set of extended time-stretched pulses combined with orthogonalization. We made MATLAB implementation of these tools available as an open-source repository. This article also provides the detailed measurement procedure using the interactive tool followed by offline measurement tools for conducting subjective experiments and statistical analyses. It also provides technical descriptions of constituent signal processing subsystems as appendices. This application serves as an example for adopting our method to biological system analysis.
△ Less
Submitted 23 September, 2021;
originally announced September 2021.
-
Mixture of orthogonal sequences made from extended time-stretched pulses enables measurement of involuntary voice fundamental frequency response to pitch perturbation
Authors:
Hideki Kawahara,
Toshie Matsui,
Kohei Yatabe,
Ken-Ichi Sakakibara,
Minoru Tsuzaki,
Masanori Morise,
Toshio Irino
Abstract:
Auditory feedback plays an essential role in the regulation of the fundamental frequency of voiced sounds. The fundamental frequency also responds to auditory stimulation other than the speaker's voice. We propose to use this response of the fundamental frequency of sustained vowels to frequency-modulated test signals for investigating involuntary control of voice pitch. This involuntary response…
▽ More
Auditory feedback plays an essential role in the regulation of the fundamental frequency of voiced sounds. The fundamental frequency also responds to auditory stimulation other than the speaker's voice. We propose to use this response of the fundamental frequency of sustained vowels to frequency-modulated test signals for investigating involuntary control of voice pitch. This involuntary response is difficult to identify and isolate by the conventional paradigm, which uses step-shaped pitch perturbation. We recently developed a versatile measurement method using a mixture of orthogonal sequences made from a set of extended time-stretched pulses (TSP). In this article, we extended our approach and designed a set of test signals using the mixture to modulate the fundamental frequency of artificial signals. For testing the response, the experimenter presents the modulated signal aurally while the subject is voicing sustained vowels. We developed a tool for conducting this test quickly and interactively. We make the tool available as an open-source and also provide executable GUI-based applications. Preliminary tests revealed that the proposed method consistently provides compensatory responses with about 100 ms latency, representing involuntary control. Finally, we discuss future applications of the proposed method for objective and non-invasive auditory response measurements.
△ Less
Submitted 3 April, 2021;
originally announced April 2021.
-
Cascaded all-pass filters with randomized center frequencies and phase polarity for acoustic and speech measurement and data augmentation
Authors:
Hideki Kawahara,
Kohei Yatabe
Abstract:
We introduce a new member of TSP (Time Stretched Pulse) for acoustic and speech measurement infrastructure, based on a simple all-pass filter and systematic randomization. This new infrastructure fundamentally upgrades our previous measurement procedure, which enables simultaneous measurement of multiple attributes, including non-linear ones without requiring extra filtering nor post-processing. O…
▽ More
We introduce a new member of TSP (Time Stretched Pulse) for acoustic and speech measurement infrastructure, based on a simple all-pass filter and systematic randomization. This new infrastructure fundamentally upgrades our previous measurement procedure, which enables simultaneous measurement of multiple attributes, including non-linear ones without requiring extra filtering nor post-processing. Our new proposal establishes a theoretically solid, flexible, and extensible foundation in acoustic measurement. Moreover, it is general enough to provide versatile research tools for other fields, such as biological signal analysis. We illustrate using acoustic measurements and data augmentation as representative examples among various prospective applications. We open-sourced MATLAB implementation. It consists of an interactive and real-time acoustic tool, MATLAB functions, and supporting materials.
△ Less
Submitted 12 February, 2021; v1 submitted 25 October, 2020;
originally announced October 2020.
-
Simultaneous measurement of time-invariant linear and nonlinear, and random and extra responses using frequency domain variant of velvet noise
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Mitsunori Mizumachi,
Masanori Morise,
Hideki Banno
Abstract:
We introduce a new acoustic measurement method that can measure the linear time-invariant response, the nonlinear time-invariant response, and random and time-varying responses simultaneously. The method uses a set of orthogonal sequences made from a set of unit FVNs (Frequency domain variant of Velvet Noise), a new member of the TSP (Time Stretched Pulse). FVN has a unique feature that other TSP…
▽ More
We introduce a new acoustic measurement method that can measure the linear time-invariant response, the nonlinear time-invariant response, and random and time-varying responses simultaneously. The method uses a set of orthogonal sequences made from a set of unit FVNs (Frequency domain variant of Velvet Noise), a new member of the TSP (Time Stretched Pulse). FVN has a unique feature that other TSP members do not. It is a high degree of design freedom that makes the proposed method possible without introducing extra equipment. We introduce two useful cases using two and four orthogonal sequences and illustrates their use using simulations and acoustic measurement examples. We developed an interactive and realtime acoustic analysis tool based on the proposed method. We made it available in an open-source repository. The proposed response analysis method is general and applies to other fields, such as auditory-feedback research and assessment of sound recording and coding.
△ Less
Submitted 9 August, 2020; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Frequency domain variant of Velvet noise and its application to acoustic measurements
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Mitsunori Mizumachi,
Hideki Banno,
Masanori Morise,
Toshio Irino
Abstract:
We propose a new family of test signals for acoustic measurements such as impulse response, nonlinearity, and the effects of background noise. The proposed family complements difficulties in existing families, the Swept-Sine (SS), pseudo-random noise such as the maximum length sequence (MLS). The proposed family uses the frequency domain variant of the Velvet noise (FVN) as its building block. An…
▽ More
We propose a new family of test signals for acoustic measurements such as impulse response, nonlinearity, and the effects of background noise. The proposed family complements difficulties in existing families, the Swept-Sine (SS), pseudo-random noise such as the maximum length sequence (MLS). The proposed family uses the frequency domain variant of the Velvet noise (FVN) as its building block. An FVN is an impulse response of an all-pass filter and yields the unit impulse when convolved with the time-reversed version of itself. In this respect, FVN is a member of the time-stretched pulse (TSP) in the broadest sense. The high degree of freedom in designing an FVN opens a vast range of applications in acoustic measurement. We introduce the following applications and their specific procedures, among other possibilities. They are as follows. a) Spectrum shaping adaptive to background noise. b) Simultaneous measurement of impulse responses of multiple acoustic paths. d) Simultaneous measurement of linear and nonlinear components of an acoustic path. e) Automatic procedure for time axis alignment of the source and the receiver when they are using independent clocks in acoustic impulse response measurement. We implemented a reference measurement tool equipped with all these procedures. The MATLAB source code and related materials are open-sourced and placed in a GitHub repository.
△ Less
Submitted 10 September, 2019;
originally announced September 2019.
-
Real-time and interactive tools for vocal training based on an analytic signal with a cosine series envelope
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Eri Haneishi,
Kaori Hagiwara
Abstract:
We introduce real-time and interactive tools for assisting vocal training. In this presentation, we demonstrate mainly a tool based on real-time visualizer of fundamental frequency candidates to provide information-rich feedback to learners. The visualizer uses an efficient algorithm using analytic signals for deriving phase-based attributes. We start using these tools in vocal training for assist…
▽ More
We introduce real-time and interactive tools for assisting vocal training. In this presentation, we demonstrate mainly a tool based on real-time visualizer of fundamental frequency candidates to provide information-rich feedback to learners. The visualizer uses an efficient algorithm using analytic signals for deriving phase-based attributes. We start using these tools in vocal training for assisting learners to acquire the awareness of appropriate vocalization. The first author made the MATLAB implementation of the tools open-source. The code and associated video materials are accessible in the first author's GitHub repository.
△ Less
Submitted 9 September, 2019;
originally announced September 2019.
-
Frequency domain variants of velvet noise and their application to speech processing and synthesis: with appendices
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Masanori Morise,
Hideki Banno,
Tomoki Toda,
Toshio Irino
Abstract:
We propose a new excitation source signal for VOCODERs and an all-pass impulse response for post-processing of synthetic sounds and pre-processing of natural sounds for data-augmentation. The proposed signals are variants of velvet noise, which is a sparse discrete signal consisting of a few non-zero (1 or -1) elements and sounds smoother than Gaussian white noise. One of the proposed variants, FV…
▽ More
We propose a new excitation source signal for VOCODERs and an all-pass impulse response for post-processing of synthetic sounds and pre-processing of natural sounds for data-augmentation. The proposed signals are variants of velvet noise, which is a sparse discrete signal consisting of a few non-zero (1 or -1) elements and sounds smoother than Gaussian white noise. One of the proposed variants, FVN (Frequency domain Velvet Noise) applies the procedure to generate a velvet noise on the cyclic frequency domain of DFT (Discrete Fourier Transform). Then, by smoothing the generated signal to design the phase of an all-pass filter followed by inverse Fourier transform yields the proposed FVN. Temporally variable frequency weighted mixing of FVN generated by frozen and shuffled random number provides a unified excitation signal which can span from random noise to a repetitive pulse train. The other variant, which is an all-pass impulse response, significantly reduces "buzzy" impression of VOCODER output by filtering. Finally, we will discuss applications of the proposed signal for watermarking and psychoacoustic research.
△ Less
Submitted 18 June, 2018;
originally announced June 2018.
-
A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Masanori Morise,
Hideki Banno,
Tomoki Toda
Abstract:
We introduce a simple and linear SNR (strictly speaking, periodic to random power ratio) estimator (0dB to 80dB without additional calibration/linearization) for providing reliable descriptions of aperiodicity in speech corpus. The main idea of this method is to estimate the background random noise level without directly extracting the background noise. The proposed method is applicable to a wide…
▽ More
We introduce a simple and linear SNR (strictly speaking, periodic to random power ratio) estimator (0dB to 80dB without additional calibration/linearization) for providing reliable descriptions of aperiodicity in speech corpus. The main idea of this method is to estimate the background random noise level without directly extracting the background noise. The proposed method is applicable to a wide variety of time windowing functions with very low sidelobe levels. The estimate combines the frequency derivative and the time-frequency derivative of the mapping from filter center frequency to the output instantaneous frequency. This procedure can replace the periodicity detection and aperiodicity estimation subsystems of recently introduced open source vocoder, YANG vocoder. Source code of MATLAB implementation of this method will also be open sourced.
△ Less
Submitted 9 June, 2017;
originally announced June 2017.
-
A new cosine series antialiasing function and its application to aliasing-free glottal source models for speech and singing synthesis
Authors:
Hideki Kawahara,
Ken-Ichi Sakakibara,
Hideki Banno,
Masanori Morise,
Tomoki Toda,
Toshio Irino
Abstract:
We formulated and implemented a procedure to generate aliasing-free excitation source signals. It uses a new antialiasing filter in the continuous time domain followed by an IIR digital filter for response equalization. We introduced a cosine-series-based general design procedure for the new antialiasing function. We applied this new procedure to implement the antialiased Fujisaki-Ljungqvist model…
▽ More
We formulated and implemented a procedure to generate aliasing-free excitation source signals. It uses a new antialiasing filter in the continuous time domain followed by an IIR digital filter for response equalization. We introduced a cosine-series-based general design procedure for the new antialiasing function. We applied this new procedure to implement the antialiased Fujisaki-Ljungqvist model. We also applied it to revise our previous implementation of the antialiased Fant-Liljencrants model. A combination of these signals and a lattice implementation of the time varying vocal tract model provides a reliable and flexible basis to test fo extractors and source aperiodicity analysis methods. MATLAB implementations of these antialiased excitation source models are available as part of our open source tools for speech science.
△ Less
Submitted 8 June, 2017; v1 submitted 22 February, 2017;
originally announced February 2017.
-
Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis
Authors:
Hideki Kawahara,
Yannis Agiomyrgiannakis,
Heiga Zen
Abstract:
This paper introduces a general and flexible framework for F0 and aperiodicity (additive non periodic component) analysis, specifically intended for high-quality speech synthesis and modification applications. The proposed framework consists of three subsystems: instantaneous frequency estimator and initial aperiodicity detector, F0 trajectory tracker, and F0 refinement and aperiodicity extractor.…
▽ More
This paper introduces a general and flexible framework for F0 and aperiodicity (additive non periodic component) analysis, specifically intended for high-quality speech synthesis and modification applications. The proposed framework consists of three subsystems: instantaneous frequency estimator and initial aperiodicity detector, F0 trajectory tracker, and F0 refinement and aperiodicity extractor. A preliminary implementation of the proposed framework substantially outperformed (by a factor of 10 in terms of RMS F0 estimation error) existing F0 extractors in tracking ability of temporally varying F0 trajectories. The front end aperiodicity detector consists of a complex-valued wavelet analysis filter with a highly selective temporal and spectral envelope. This front end aperiodicity detector uses a new measure that quantifies the deviation from periodicity. The measure is less sensitive to slow FM and AM and closely correlates with the signal to noise ratio.
△ Less
Submitted 22 July, 2016; v1 submitted 25 May, 2016;
originally announced May 2016.