The Future of Humanoid Robots – Research and Applications
Edited by Riadh Zaier

Published by InTech
Janeza Trdine 9, 51000 Rijeka, Croatia

Copyright © 2011 InTech

All chapters are Open Access distributed under the Creative Commons Attribution 3.0 license, which allows users to download, copy and build upon published articles even for commercial purposes, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications. After this work has been published by InTech, authors have the right to republish it, in whole or part, in any publication of which they are the author, and to make other personal use of the work. Any republication, referencing or personal use of the work must explicitly identify the original source.

As for readers, this license allows users to download, copy and build upon published chapters even for commercial purposes, as long as the author and publisher are properly credited, which ensures maximum dissemination and a wider impact of our publications.

Notice
Statements and opinions expressed in the chapters are those of the individual contributors and not necessarily those of the editors or publisher. No responsibility is accepted for the accuracy of information contained in the published chapters. The publisher assumes no responsibility for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained in the book.

Publishing Process Manager Vedran Greblo
Technical Editor Teodora Smiljanic
Cover Designer InTech Design Team

First published January 2012
Printed in Croatia

A free online edition of this book is available at
Additional hard copies can be obtained from

The Future of Humanoid Robots – Research and Applications, Edited by Riadh Zaier
p. cm.
ISBN 978-953-307-951-6
Contents

Preface

Part 1 Periodic Tasks and Locomotion Control
Chapter 1 Performing Periodic Tasks: On-Line Learning, Adaptation and Synchronization with External Signals
    Andrej Gams, Tadej Petrič, Aleš Ude and Leon Žlajpah
Chapter 2 Autonomous Motion Adaptation Against Structure Changes Without Model Identification
    Yuki Funabora, Yoshikazu Yano, Shinji Doki and Shigeru Okuma
Chapter 3 Design of Oscillatory Neural Network for Locomotion Control of Humanoid Robots
    Riadh Zaier

Part 2 Grasping and Multi-Fingered Robot Hand
Chapter 4 Grasp Planning for a Humanoid Hand
    Tokuo Tsuji, Kensuke Harada, Kenji Kaneko, Fumio Kanehiro and Kenichi Maruyama
Chapter 5 Design of 5 D.O.F Robot Hand with an Artificial Skin for an Android Robot
    Dongwoon Choi, Dong-Wook Lee, Woonghee Shon and Ho-Gil Lee
Chapter 6 Development of Multi-Fingered Universal Robot Hand with Torque Limiter Mechanism
    Wataru Fukui, Futoshi Kobayashi and Fumio Kojima

Part 3 Interactive Applications of Humanoid Robots
Chapter 7 Exoskeleton and Humanoid Robotic Technology in Construction and Built Environment
    T. Bock, T. Linner and W. Ikeda
Chapter 8 Affective Human-Humanoid Interaction Through Cognitive Architecture
    Ignazio Infantino
Chapter 9 Speech Communication with Humanoids: How People React and How We Can Build the System
    Yosuke Matsusaka
Chapter 10 Implementation of a Framework for Imitation Learning on a Humanoid Robot Using a Cognitive Architecture
    Huan Tan
Chapter 11 A Multi-Modal Panoramic Attentional Model for Robots and Applications
    Ravi Sarvadevabhatla and Victor Ng-Thow-Hing
Chapter 12 User, Gesture and Robot Behaviour Adaptation for Human-Robot Interaction
    Md. Hasanuzzaman and Haruki Ueno
Chapter 13 Learning Novel Objects for Domestic Service Robots
    Muhammad Attamimi, Tomoaki Nakamura, Takayuki Nagai, Komei Sugiura and Naoto Iwahashi

Part 4 Current and Future Challenges for Humanoid Robots
Chapter 14 Rob’s Robot: Current and Future Challenges for Humanoid Robots
    Boris Durán and Serge Thill
Preface

This book provides state of the art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning.

Without a dose of imagination it is hard to predict whether human-like robots will become viable real-world citizens or whether they will be confined to certain specific purposes. However, we can safely predict that humanoids will change the way we interact with machines and will have the ability to blend perfectly into an environment already designed for humans.

This book’s intended audience includes upper-level undergraduates and graduates studying robotics. It is designed to be accessible and practical, with an emphasis on useful information to those working in the fields of robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive research and development experience, and he has patents and publications in the area of humanoid robotics, and his experience is reflected in editing the content of the book.

Riadh Zaier
Department of Mechanical and Industrial Engineering, Sultan Qaboos University
Sultanate of Oman
Part 1 Periodic Tasks and Locomotion Control
Chapter 1
Performing Periodic Tasks: On-Line Learning, Adaptation and Synchronization with External Signals
Andrej Gams, Tadej Petrič, Aleš Ude and Leon Žlajpah
Jožef Stefan Institute, Ljubljana, Slovenia

1. Introduction

One of the central issues in robotics and animal motor control is the problem of trajectory generation and modulation. Since in many cases trajectories have to be modified on-line when goals are changed, obstacles are encountered, or when external perturbations occur, the notions of trajectory generation and trajectory modulation are tightly coupled.

This chapter addresses some of the issues related to trajectory generation and modulation, including the supervised learning of periodic trajectories, and with an emphasis on the learning of the frequency and achieving and maintaining synchronization to external signals. Other addressed issues include robust movement execution despite external perturbations, modulation of the trajectory to reuse it under modified conditions and adaptation of the learned trajectory based on measured force information. Different experimental scenarios on various robotic platforms are described.

For the learning of a periodic trajectory without specifying the period and without using traditional off-line signal processing methods, our approach suggests splitting the task into two sub-tasks: (1) frequency extraction, and (2) the supervised learning of the waveform. This is done using two ingredients: nonlinear oscillators, also combined with an adaptive Fourier waveform for the frequency adaptation, and nonparametric regression¹ techniques for shaping the attractor landscapes according to the demonstrated trajectories.
The systems are designed such that, after the trajectory has been learned, simple changes of parameters allow modulations in terms of, for instance, frequency, amplitude and oscillation offset, while keeping the general features of the original trajectory or maintaining synchronization with an external signal.

¹ The term “nonparametric” is to indicate that the data to be modeled stem from very large families of distributions which cannot be indexed by a finite dimensional parameter vector in a natural way. It does not mean that there are no parameters.

The system we propose in this paper is based on the motion imitation approach described in (Ijspeert et al., 2002; Schaal et al., 2007). That approach uses two dynamical systems like the system presented here, but with a simple nonlinear oscillator to generate the phase and the amplitude of the periodic movements. A major drawback of that approach is that it requires the frequency of the demonstration signal to be explicitly specified. This means that the frequency has to be either known or extracted from the recorded signal by signal processing methods, e.g. Fourier analysis. The main difference of our new approach is that we use an adaptive frequency oscillator (Buchli & Ijspeert, 2004; Righetti & Ijspeert, 2006), which has the process of frequency extraction and adaptation totally embedded into its dynamics. The frequency does not need to be known or extracted, nor do we need to perform any transformations (Righetti et al., 2006). This simplifies the process of teaching a new task/trajectory to the robot. Additionally, the system can work incrementally in on-line settings.

We use two different approaches. One uses several frequency oscillators to approximate the input signal, and thus demands a logical algorithm to extract the basic frequency of the input signal. The other uses only one oscillator and higher harmonics of the extracted frequency. It also includes an adaptive Fourier series.

Our approach is loosely inspired by dynamical systems observed in vertebrate central nervous systems, in particular central pattern generators (Ijspeert, 2008a). Additionally, our work fits in the view that biological movements are constructed out of the combination of “motor primitives” (Mataric, 1998; Schaal, 1999), and the system we develop could be used as blocks or motor primitives for generating more complex trajectories.

1.1 Overview of the research field

One of the most notable advantages of the proposed system is the ability to synchronize with an external signal, which can effectively be used in the control of rhythmic periodic tasks where the dynamic behavior and response of the actuated device are critical. Such robotic tasks include swinging of different pendulums (Furuta, 2003; Spong, 1995), playing with different toys, e.g. 
the yo-yo (Hashimoto & Noritsugu, 1996; Jin et al., 2009; Jin & Zacksenhouse, 2003; Žlajpah, 2006) or a gyroscopic device called the Powerball (Cafuta & Curk, 2008; Gams et al., 2007; Heyda, 2002; Petrič et al., 2010), juggling (Buehler et al., 1994; Ronsse et al., 2007; Schaal & Atkeson, 1993; Williamson, 1999) and locomotion (Ijspeert, 2008b; Ilg et al., 1999; Morimoto et al., 2008). Handshaking (Jindai & Watanabe, 2007; Kasuga & Hashimoto, 2005; Sato et al., 2007) and even handwriting (Gangadhar et al., 2007; Hollerbach, 1981) are also rhythmic tasks. Performing these tasks with robots requires appropriate trajectory generation and, foremost, precise frequency tuning by determining the basic frequency. We use the term "basic frequency" to denote the lowest frequency relevant for performing a given task.

Different approaches that adjust the rhythm and behavior of the robot in order to achieve synchronization have been proposed in the past. One example is a feedback loop that locks onto the phase of the incoming signal. Closed-loop model-based control (An et al., 1988), a very common way of controlling robotic systems, was applied for juggling (Buehler et al., 1994; Schaal & Atkeson, 1993), playing the yo-yo (Jin & Zacksenhouse, 2002; Žlajpah, 2006) and also for the control of quadruped (Fukuoka et al., 2003) and biped locomotion (Sentis et al., 2010; Spong & Bullo, 2005). Here the basic strategy is to plan a reference trajectory for the robot, which is based on the dynamic behavior of the actuated device. Standard methods for reference trajectory tracking often assume that a correct and exhaustive dynamic model of the object is available (Jin & Zacksenhouse, 2002), and their performance may degrade substantially if the accuracy of the model is poor.

An alternative approach to controlling rhythmic tasks is with the use of nonlinear oscillators. 
Oscillators and systems of coupled oscillators are known as powerful modeling tools (Pikovsky et al., 2002) and are widely used in physics and biology to model phenomena as diverse as neuronal signalling, circadian rhythms (Strogatz, 1986), inter-limb coordination (Haken et al., 1985), heart beating (Mirollo et al., 1990), etc. Their properties, which include robust limit cycle behavior, online frequency adaptation (Williamson, 1998) and self-sustained limit cycle generation in the absence of cyclic input (Bailey, 2004), to name just a few, make them suitable for controlling rhythmic tasks.

Different kinds of oscillators exist and have been used for control of robotic tasks. The van der Pol non-linear oscillator (van der Pol, 1934) has successfully been used for skill entrainment on a swinging robot (Veskos & Demiris, 2005) and for gait generation using coupled oscillator circuits, e.g. (Jalics et al., 1997; Liu et al., 2009; Tsuda et al., 2007). Gait generation has also been studied using the Rayleigh oscillator (Filho et al., 2005). Another extensively used oscillator is the Matsuoka neural oscillator (Matsuoka, 1985), which models two mutually inhibiting neurons. Publications by Williamson (Williamson, 1999; 1998) show the use of the Matsuoka oscillator for different rhythmic tasks, such as resonance tuning, crank turning and playing the slinky toy. Other robotic tasks using the Matsuoka oscillator include control of the giant swing problem (Matsuoka et al., 2005), dish spinning (Matsuoka & Ooshima, 2007) and gait generation in combination with central pattern generators (CPGs) and phase-locked loops (Inoue et al., 2004; Kimura et al., 1999; Kun & Miller, 1996).

On-line frequency adaptation, as one of the properties of non-linear oscillators (Williamson, 1998), is a viable alternative to signal processing methods, such as the fast Fourier transform (FFT), for determining the basic frequency of the task. On the other hand, when there is no input into the oscillator, it will oscillate at its own frequency (Bailey, 2004). Righetti et al. have introduced adaptive frequency oscillators (Righetti et al., 2006), which preserve the learned frequency even if the input signal has been cut. 
The authors modify non-linear oscillators or pseudo-oscillators with a learning rule, which allows the modified oscillators to learn the frequency of the input signal. The approach works for different oscillators, from a simple phase oscillator (Gams et al., 2009) to the Hopf oscillator, the Fitzhugh-Nagumo oscillator, etc. (Righetti et al., 2006). Combining several adaptive frequency oscillators in a feedback loop allows extraction of several frequency components (Buchli et al., 2008; Gams et al., 2009). Applications vary from bipedal walking (Righetti & Ijspeert, 2006) to frequency tuning of a hopping robot (Buchli et al., 2005). Such feedback structures can be used as a whole imitation system that both extracts the frequency and learns the waveform of the input signal.

Not many approaches exist that combine both frequency extraction and waveform learning in imitation systems (Gams et al., 2009; Ijspeert, 2008b). One of them is a two-layered imitation system, which extracts the frequency of the input signal in the first layer and learns its waveform in the second layer, and which is the basis for this chapter. Separating frequency extraction from waveform learning has advantages, since it makes it possible to modulate temporal and spatial features independently, e.g. phase modulation, amplitude modulation, etc. Additionally, a complex waveform can be anchored to the input signal. Compact waveform encoding, such as splines (Miyamoto et al., 1996; Thompson & Patel, 1987; Ude et al., 2000), dynamic movement primitives (DMP) (Schaal et al., 2007), or Gaussian mixture models (GMM) (Calinon et al., 2007), reduces the computational complexity of the process.

In the next sections we first give details on the two-layered movement imitation system and then describe its properties. Finally, we propose possible applications.

2. Two-layered movement imitation system

In this section we give details and properties of both sub-systems that make up the two-layered movement imitation system. We also give alternative possibilities for the canonical dynamical system.
Fig. 1. Proposed structure of the system. The two-layered system is composed of the Canonical Dynamical System as the first layer for the frequency adaptation, and the Output Dynamical System for the learning as the second layer. The input signal $y_{demo}(t)$ is an arbitrary $Q$-dimensional periodic signal. The Canonical Dynamical System outputs the fundamental frequency $\Omega$ and phase of the oscillator at that frequency, $\Phi$, for each of the $Q$ DOF, and the Output Dynamical System learns the waveform.

Figure 1 shows the structure of the proposed system for the learning of the frequency and the waveform of the input signal. The input into the system $y_{demo}(t)$ is an arbitrary periodic signal of one or more degrees of freedom (DOF). The task of frequency and waveform learning is split into two separate tasks, each performed by a separate dynamical system. The frequency adaptation is performed by the Canonical Dynamical System, which either consists of several adaptive frequency oscillators in a feedback structure, or a single oscillator with an adaptive Fourier series. Its purpose is to extract the basic frequency $\Omega$ of the input signal, and to provide the phase $\Phi$ of the signal at this frequency. These quantities are fed into the Output Dynamical System, whose goal is to adapt the shape of the limit cycle of the Canonical Dynamical System, and to learn the waveform of the input signal. The resulting output signal of the Output Dynamical System is not explicitly encoded but generated during the time evolution of the Canonical Dynamical System, by using a set of weights learned by Incremental Locally Weighted Regression (ILWR) (Schaal & Atkeson, 1998). Both frequency adaptation and waveform learning work in parallel, thus accelerating the process. 
The output of the combined system can be, for example, joint coordinates of the robot, position in task space, joint torques, etc., depending on what the input signal represents. In the next section we first explain the second layer of the system, the output dynamical system, which learns the waveform of the input periodic signal once the frequency is determined.

2.1 Output dynamical system

The output dynamical system is used to learn the waveform of the input signal. The explanation is for a 1 DOF signal. For multiple DOF, the algorithm works in parallel for all the degrees of freedom. The following dynamics specify the attractor landscape of a trajectory $y$ towards the anchor point $g$, with the Canonical Dynamical System providing the phase $\Phi$ to the function $\Psi_i$ of the control policy:
$$\dot{z} = \Omega\left(\alpha_z\left(\beta_z (g - y) - z\right) + \frac{\sum_{i=1}^{N} \Psi_i w_i r}{\sum_{i=1}^{N} \Psi_i}\right) \tag{1}$$

$$\dot{y} = \Omega z \tag{2}$$

$$\Psi_i = \exp\left(h\left(\cos(\Phi - c_i) - 1\right)\right) \tag{3}$$

Here $\Omega$ (chosen amongst the $\omega_i$) is the frequency given by the canonical dynamical system, Eq. (10). $\alpha_z$ and $\beta_z$ are positive constants, set to $\alpha_z = 8$ and $\beta_z = 2$ for all the results; the ratio 4:1 ensures critical damping, so that the system converges monotonically to the trajectory oscillating around $g$, an anchor point for the oscillatory trajectory. $N$ is the number of Gaussian-like periodic kernel functions $\Psi_i$, which are given by Eq. (3). $w_i$ is the learned weight parameter and $r$ is the amplitude control parameter, maintaining the amplitude of the demonstration signal with $r = 1$. The system given by Eq. (1) without the nonlinear term is a second-order linear system with a unique globally stable point attractor (Ijspeert et al., 2002). But, because of the periodic nonlinear term, this system produces stable periodic trajectories whose frequency is $\Omega$ and whose waveform is determined by the weight parameters $w_i$.

In Eq. (3), which determines the Gaussian-like kernel functions $\Psi_i$, $h$ determines their width, which is set to $h = 2.5N$ for all the results presented in the paper unless stated otherwise, and $c_i$ are equally spaced between $0$ and $2\pi$ in $N$ steps.

As the input into the learning algorithm we use triplets of position, velocity and acceleration $y_{demo}(t)$, $\dot{y}_{demo}(t)$, and $\ddot{y}_{demo}(t)$, with demo marking the input or demonstration trajectory we are trying to learn. With this, Eq. 
(1) can be rewritten as

$$\frac{1}{\Omega}\dot{z} - \alpha_z\left(\beta_z (g - y) - z\right) = \frac{\sum_{i=1}^{N} \Psi_i w_i r}{\sum_{i=1}^{N} \Psi_i} \tag{4}$$

and formulated as a supervised learning problem with, on the right hand side, a set of local models $w_i r$ that are weighted by the kernel functions $\Psi_i$, and on the left hand side the target function $f_{targ}$ given by

$$f_{targ} = \frac{1}{\Omega^2}\ddot{y}_{demo} - \alpha_z\left(\beta_z (g - y_{demo}) - \frac{1}{\Omega}\dot{y}_{demo}\right),$$

which is obtained by matching $y$ to $y_{demo}$, $z$ to $\dot{y}_{demo}/\Omega$, and $\dot{z}$ to $\ddot{y}_{demo}/\Omega$.

Locally weighted regression corresponds to finding, for each kernel function $\Psi_i$, the weight vector $w_i$ which minimizes the quadratic error criterion²

$$J_i = \sum_{t=1}^{P} \Psi_i(t)\left(f_{targ}(t) - w_i r(t)\right)^2 \tag{5}$$

where $t$ is an index corresponding to discrete time steps (of the integration). The regression can be performed as a batch regression, or alternatively, we can perform the minimization of the $J_i$ cost function incrementally, while the target data points $f_{targ}(t)$ arrive. As we want continuous learning of the demonstration signal, we use the latter.

² LWR is derived from a piecewise linear function approximation approach (Schaal & Atkeson, 1998), which decouples a nonlinear least-squares learning problem into several locally linear learning problems, each characterized by the local cost function $J_i$. These local problems can be solved with standard weighted least squares approaches.

Fig. 2. Left: The result of the Output Dynamical System with a constant frequency input and with continuous learning of the weights (panels: $y$, $\dot{y}$, $\ddot{y}$, $\Psi_i$, $r$ and $\mathrm{mod}(\Phi, 2\pi)$, all plotted against $t$ [s]). In all the plots the input signal is the dash-dot line while the learned signal is the solid line. In the middle-right plot we can see the evolution of the kernel functions. The kernel functions are a function of $\Phi$ and do not necessarily change uniformly (see also Fig. 7). In the bottom right plot the phase of the oscillator is shown. The amplitude is here $r = 1$, as shown bottom-left. Right: The error of learning, $|y_{demo} - y_{learned}|^2$, for $N = 10$, $N = 25$ and $N = 50$, decreases with the increase of the number of Gaussian-like kernel functions. The error, which is quite small, is mainly due to a very slight (one or two sample times) delay of the learned signal.

Incremental regression is done with the use of recursive least squares with a forgetting factor of $\lambda$, to determine the parameters (or weights) $w_i$. Given the target data $f_{targ}(t)$ and $r(t)$, $w_i$ is updated by

$$w_i(t+1) = w_i(t) + \Psi_i P_i(t+1)\, r(t)\, e_r(t) \tag{6}$$

$$P_i(t+1) = \frac{1}{\lambda}\left(P_i(t) - \frac{P_i(t)^2 r(t)^2}{\frac{\lambda}{\Psi_i} + P_i(t)\, r(t)^2}\right) \tag{7}$$

$$e_r(t) = f_{targ}(t) - w_i(t)\, r(t). \tag{8}$$

$P$, in general, is the inverse covariance matrix (Ljung & Söderström, 1986). The recursion is started with $w_i = 0$ and $P_i = 1$. Batch and incremental learning regressions provide identical weights $w_i$ for the same training sets when the forgetting factor $\lambda$ is set to one. Differences appear when the forgetting factor is less than one, in which case the incremental regression gives more weight to recent data (i.e. tends to forget older ones). The error of weight learning $e_r$ (Eq. (8)) is not “related” to $e$ when extracting frequency components (Eq. (11)). This allows for complete separation of frequency adaptation and waveform learning.
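The learning and reproduction steps of Eqs. (1)-(8) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`kernels`, `f_target`, `rls_step`, `output_step`, `demo`, `learn_and_reproduce`) are hypothetical, integration uses a plain Euler step with an assumed `dt = 0.001` s, the kernel centres are assumed to lie on $[0, 2\pi)$ without repeating the endpoint, and the phase is taken as $\Phi = \Omega t$ with a known $\Omega = 2\pi$ rad/s instead of coming from the Canonical Dynamical System. The covariance update is written as $P_i/(\lambda + \Psi_i P_i r^2)$, which is an algebraic rearrangement of Eq. (7) that avoids dividing by very small $\Psi_i$.

```python
import numpy as np


def kernels(phi, n_kernels=25, h=None):
    """Periodic Gaussian-like kernels of Eq. (3) evaluated at phase phi."""
    if h is None:
        h = 2.5 * n_kernels  # width h = 2.5 N, as stated in the text
    # Assumption: centres c_i on [0, 2*pi), endpoint not repeated.
    centres = np.linspace(0.0, 2.0 * np.pi, n_kernels, endpoint=False)
    return np.exp(h * (np.cos(phi - centres) - 1.0))


def f_target(y, yd, ydd, omega, g=0.0, alpha_z=8.0, beta_z=2.0):
    """Target function f_targ derived from Eq. (4)."""
    return ydd / omega**2 - alpha_z * (beta_z * (g - y) - yd / omega)


def rls_step(w, P, psi, f, r=1.0, lam=0.995):
    """One recursive least squares update of all weights, Eqs. (6)-(8)."""
    P_new = P / (lam + psi * P * r**2)   # rearranged form of Eq. (7)
    err = f - w * r                      # Eq. (8)
    return w + psi * P_new * r * err, P_new  # Eq. (6)


def output_step(y, z, phi, w, omega, g=0.0, r=1.0, dt=0.001,
                alpha_z=8.0, beta_z=2.0):
    """One Euler step of the output dynamical system, Eqs. (1)-(2)."""
    psi = kernels(phi, len(w))
    forcing = r * np.dot(psi, w) / np.sum(psi)
    z_dot = omega * (alpha_z * (beta_z * (g - y) - z) + forcing)
    y_dot = omega * z
    return y + dt * y_dot, z + dt * z_dot


def demo(t):
    """Illustrative demonstration signal with analytic derivatives."""
    a, b, c = 2 * np.pi * t, 4 * np.pi * t, 6 * np.pi * t
    y = np.sin(a) + np.cos(b) + 0.4 * np.sin(c)
    yd = 2 * np.pi * np.cos(a) - 4 * np.pi * np.sin(b) + 2.4 * np.pi * np.cos(c)
    ydd = (-4 * np.pi**2 * np.sin(a) - 16 * np.pi**2 * np.cos(b)
           - 14.4 * np.pi**2 * np.sin(c))
    return y, yd, ydd


def learn_and_reproduce(t_learn=20.0, t_run=2.0, dt=0.001, n_kernels=25):
    """Learn the weights on-line, then replay the waveform; returns RMS error."""
    omega = 2.0 * np.pi  # assumed known here
    w, P = np.zeros(n_kernels), np.ones(n_kernels)
    n_learn = int(t_learn / dt)
    for k in range(n_learn):
        y, yd, ydd = demo(k * dt)
        psi = kernels(omega * k * dt, n_kernels)
        w, P = rls_step(w, P, psi, f_target(y, yd, ydd, omega))
    # Start the replay on the demonstrated trajectory (z matched to yd / omega).
    y, yd, _ = demo(n_learn * dt)
    z = yd / omega
    sq_err, n_run = 0.0, int(t_run / dt)
    for k in range(n_run):
        t = (n_learn + k) * dt
        sq_err += (y - demo(t)[0]) ** 2
        y, z = output_step(y, z, omega * t, w, omega, dt=dt)
    return np.sqrt(sq_err / n_run)


if __name__ == "__main__":
    print("RMS reproduction error:", learn_and_reproduce())
```

Because each weight $w_i$ has its own scalar $P_i$, the update is a set of $N$ independent scalar recursions, so the cost per sample grows only linearly with the number of kernels.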
Figure 2 left shows the time evolution of the Output Dynamical System anchored to a Canonical Dynamical System with the frequency set at $\Omega = 2\pi$ rad/s, and the weight parameters $w_i$ adjusted to fit the trajectory $y_{demo}(t) = \sin(2\pi t) + \cos(4\pi t) + 0.4\sin(6\pi t)$. As we can see in the top-left plot, the input signal and the reconstructed signal match closely. The matching between the reconstructed signal and the input signal can be improved by increasing the number of Gaussian-like functions.

Parameters of the Output Dynamical System

When tuning the parameters of the Output Dynamical System, we have to determine the number of Gaussian-like kernel functions $N$, and especially the forgetting factor $\lambda$. The number $N$ of Gaussian-like kernel functions could be set automatically if we used the locally weighted learning (Schaal & Atkeson, 1998), but for simplicity it was here set by hand. Increasing the number increases the accuracy of the reconstructed signal, but at the same time also increases the computational cost. Note that LWR does not suffer from problems of overfitting when the number of kernel functions is increased.³ Figure 2 right shows the error of learning $e_r$ when using $N = 10$, $N = 25$, and $N = 50$ on a signal $y_{demo}(t) = 0.65\sin(2\pi t) + 1.5\cos(4\pi t) + 0.3\sin(6\pi t)$. Throughout the paper, unless specified otherwise, $N = 25$.

The forgetting factor $\lambda \in [0, 1]$ plays a key role in the behavior of the system. If it is set high, the system never forgets any input values and learns an average of the waveform over multiple periods. If it is set too low, it forgets all, basically training all the weights to the last value. We set it to $\lambda = 0.995$.

2.2 Canonical dynamical system

The task of the Canonical Dynamical System is two-fold. Firstly, it has to extract the fundamental frequency $\Omega$ of the input signal, and secondly, it has to exhibit stable limit cycle behavior in order to provide a phase signal $\Phi$ that is used to anchor the waveform of the output signal. Two approaches are possible, either with a pool of oscillators (PO), or with an adaptive Fourier series (AF).

2.2.1 Using a pool of oscillators

As the basis of our canonical dynamical system we use a set of phase oscillators, see e.g. (Buchli et al., 2006), to which we apply the adaptive frequency learning rule as introduced in (Buchli & Ijspeert, 2004) and (Righetti & Ijspeert, 2006), and combine it with a feedback structure (Righetti et al., 2006) shown in Figure 3. The basic idea of the structure is that each of the oscillators will adapt its frequency to one of the frequency components of the input signal, essentially “populating” the frequency spectrum. We use several oscillators, but are interested only in the fundamental or lowest non-zero frequency of the input signal, denoted by $\Omega$, and the phase of the oscillator at this frequency, denoted by $\Phi$. 
Therefore the feedback structure is followed by a small logical block, which chooses the correct, lowest non-zero, frequency. Determining $\Omega$ and $\Phi$ is important because with them we can formulate a supervised learning problem in the second stage, the Output Dynamical System, and learn the waveform of the full period of the input signal.

Fig. 3. Feedback structure of a network of adaptive frequency phase oscillators that form the Canonical Dynamical System. All oscillators receive the same input and have to be at different starting frequencies to converge to different final frequencies. Refer also to text and Eqs. (9-13).

³ This property is due to solving the bias-variance dilemma of function approximation locally with a closed form solution to leave-one-out cross-validation (Schaal & Atkeson, 1998).

The feedback structure of $M$ adaptive frequency phase oscillators is governed by the following equations:
$$\dot{\phi}_i = \omega_i - K e \sin(\phi_i) \tag{9}$$

$$\dot{\omega}_i = -K e \sin(\phi_i) \tag{10}$$

$$e = y_{demo} - \hat{y} \tag{11}$$

$$\hat{y} = \sum_{i=1}^{M} \alpha_i \cos(\phi_i) \tag{12}$$

$$\dot{\alpha}_i = \eta \cos(\phi_i)\, e \tag{13}$$

where $K$ is the coupling strength, $\phi_i$ is the phase of oscillator $i$, $e$ is the input into the oscillators, $y_{demo}$ is the input signal, $\hat{y}$ is the weighted sum of the oscillators’ outputs, $M$ is the number of oscillators, $\alpha_i$ is the amplitude associated to the $i$-th oscillator, and $\eta$ is a learning constant. In the experiments we use $K = 20$ and $\eta = 1$, unless specified otherwise.

Eq. (9) and (10) present the core of the Canonical Dynamical System: the adaptive frequency phase oscillator. Several ($M$) such oscillators are used in a feedback loop to extract separate frequency components. Eq. (11) and (12) specify the feedback loop, which needs also amplitude adaptation for each of the frequency components (Eq. (13)).

As we can see in Figure 3, each of the oscillators of the structure receives the same input signal, which is the difference between the signal to be learned and the signal already learned by the feedback loop, as in Eq. (11). Since a negative feedback loop is used, this difference approaches zero as the weighted sum of separate frequency components, Eq. (12), approaches the learned signal, and therefore the frequencies of the oscillators stabilize. Eq. (13) ensures amplitude adaptation and thus the stabilization of the learned frequency. Such a feedback structure performs a kind of dynamic Fourier analysis. It can learn several frequency components of the input signal (Righetti et al., 2006) and enables the frequency of a given oscillator to converge as $t \to \infty$, because once the frequency of a separate oscillator is set, it is subtracted from the demonstration signal $y_{demo}$, and disappears from $e$ (due to the negative feedback loop). Other oscillators can thus adapt to other remaining frequency components.
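Eqs. (9)-(13) translate almost line by line into code. The sketch below is an illustrative assumption rather than the implementation used in the chapter: the names `afo_step`, `lowest_nonzero` and `run_pool` are hypothetical, integration is a plain Euler step with an assumed `dt = 0.001` s, and the "small logical block" is approximated by picking the lowest frequency whose learned amplitude exceeds a threshold (the thresholds `freq_eps` and `amp_eps` are arbitrary choices, since the text does not specify the selection logic).

```python
import numpy as np


def afo_step(phi, omega, alpha, y_demo, dt=0.001, K=20.0, eta=1.0):
    """One Euler step of the pool of adaptive frequency phase oscillators."""
    y_hat = np.sum(alpha * np.cos(phi))        # Eq. (12): learned signal
    e = y_demo - y_hat                         # Eq. (11): feedback error
    phi_dot = omega - K * e * np.sin(phi)      # Eq. (9)
    omega_dot = -K * e * np.sin(phi)           # Eq. (10)
    alpha_dot = eta * np.cos(phi) * e          # Eq. (13)
    return (phi + dt * phi_dot,
            omega + dt * omega_dot,
            alpha + dt * alpha_dot,
            e)


def lowest_nonzero(omega, alpha, freq_eps=0.1, amp_eps=0.05):
    """Hypothetical logical block: index of the lowest non-zero frequency.

    Only oscillators with |omega| > freq_eps and |alpha| > amp_eps are
    considered; returns None if no oscillator qualifies.
    """
    omega = np.abs(np.asarray(omega, dtype=float))
    alpha = np.abs(np.asarray(alpha, dtype=float))
    candidates = np.flatnonzero((omega > freq_eps) & (alpha > amp_eps))
    if candidates.size == 0:
        return None
    return int(candidates[np.argmin(omega[candidates])])


def run_pool(signal, omega0, t_end=20.0, dt=0.001):
    """Feed signal(t) to a pool started at the frequencies in omega0."""
    omega = np.array(omega0, dtype=float)
    phi = np.zeros_like(omega)      # all phases start at zero
    alpha = np.zeros_like(omega)    # nothing learned yet
    for k in range(int(t_end / dt)):
        phi, omega, alpha, _ = afo_step(phi, omega, alpha, signal(k * dt), dt=dt)
    return phi, omega, alpha


if __name__ == "__main__":
    # Illustrative two-component input; starting frequencies must differ.
    def two_tone(t):
        return np.sin(2 * np.pi * t) + 0.5 * np.sin(4 * np.pi * t)

    phi, omega, alpha = run_pool(two_tone, omega0=[5.0, 10.0, 15.0])
    idx = lowest_nonzero(omega, alpha)
    print("frequencies [rad/s]:", omega)
    print("amplitudes:", alpha)
    print("selected oscillator:", idx)
```

How quickly the pool settles, and which oscillator locks onto which component, depends on $K$, $\eta$ and the starting frequencies, so the printed values should be inspected rather than assumed.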
The populating of the frequency spectrum is therefore done without any signal processing, as the whole process of frequency extraction and adaptation is totally embedded into the dynamics of the adaptive frequency oscillators.

Frequency adaptation results for a time-varying signal are illustrated in Figure 4, left. The top plot shows the input signal $y_{demo}$, the middle plot the extracted frequencies, and the bottom plot the error of frequency adaptation. The figure shows results for both approaches: using a pool of oscillators (PO), and using one oscillator with an adaptive Fourier series (AF), explained in the next section. The signal consists of three parts: a non-stationary signal (a chirp), followed by a step change in the frequency of the signal, and finally a stationary signal. We can see that the output frequency stabilizes very quickly at the (changing) target frequency. In general the speed of convergence depends on the coupling strength $K$ (Righetti et al., 2006). Besides the use for non-stationary signals, such as chirp signals, coping with changes in the frequency of the input signal proves especially useful when adapting to the frequency of hand-generated signals, which are never stationary. In this particular example, a single adaptive frequency oscillator in a feedback loop was enough, because the input signal was purely sinusoidal.

The number of adaptive frequency oscillators in a feedback loop is therefore a matter of design. There should be enough oscillators to avoid missing the fundamental frequency and to limit the variation of frequencies described below when the input signal has many


