
Robotics, Automation and Control


Published by In-Teh

In-Teh is the Croatian branch of I-Tech Education and Publishing KG, Vienna, Austria. Abstracting and non-profit use of the material is permitted with credit to the source. Statements and opinions expressed in the chapters are those of the individual contributors and not necessarily those of the editors or publisher. No responsibility is accepted for the accuracy of information contained in the published articles. The publisher assumes no responsibility or liability for any damage or injury to persons or property arising out of the use of any materials, instructions, methods or ideas contained inside. After this work has been published by In-Teh, authors have the right to republish it, in whole or part, in any publication of which they are an author or editor, and to make other personal use of the work. © 2008 In-Teh www.in-teh.org Additional copies can be obtained from: [emailprotected] First published October 2008 Printed in Croatia

A catalogue record for this book is available from the University Library Rijeka under no. 120101001 Robotics, Automation and Control, Edited by Pavla Pecherková, Miroslav Flídr and Jindřich Duník p. cm. ISBN 978-953-7619-18-3 1. Robotics, Automation and Control, Pavla Pecherková, Miroslav Flídr and Jindřich Duník

Preface

This book was conceived as a gathering place of new ideas from academia, industry, research and practice in the fields of robotics, automation and control. The aim of the book was to point out interactions among the various fields of interest in spite of the diversity and narrow specializations which prevail in current research. We believe that the resulting collection of papers fulfills the aim of the book. The book presents twenty-four chapters in total. The scope of the topics presented in the individual chapters ranges from classical control and estimation problems to the latest artificial intelligence techniques. Moreover, whenever possible and appropriate, the proposed solutions and theories are applied to real-world problems. The common denominator of all included chapters appears to be a synergy of various specializations. This synergy yields deeper understanding of the treated problems. Each new approach applied to a particular problem may enrich and inspire improvements of already established approaches to the problem. We would like to express our gratitude to the whole team who made this book possible. We hope that this book will provide new ideas and stimulation for your research.

October 2008

Editors

Pavla Pecherková Miroslav Flídr Jindřich Duník

Contents

Preface

1. Multi-Domain Modelling and Control in Mechatronics: the Case of Common Rail Injection Systems
   Paolo Lino and Bruno Maione

2. Time-Frequency Representation of Signals Using Kalman Filter
   Jindřich Liška and Eduard Janeček

3. Discrete-Event Dynamic Systems Modelling Distributed Multi-Agent Control of Intermodal Container Terminals
   Guido Maione

4. Inclusion of Expert Rules into Normalized Management Models for Description of MIB Structure
   Antonio Martin and Carlos Leon

5. Robust and Active Trajectory Tracking for an Autonomous Helicopter under Wind Gust
   Adnan Martini, François Léonard and Gabriel Abba

6. An Artificial Neural Network Based Learning Method for Mobile Robot Localization
   Matthew Conforth and Yan Meng

7. The Identification of Models of External Loads
   Yuri Menshikov

8. Environment Modelling with an Autonomous Mobile Robot for Cultural Heritage Preservation and Remote Access
   Grazia Cicirelli and Annalisa Milella

9. On-line Cutting Tool Condition Monitoring in Machining Processes using Artificial Intelligence
   Antonio J. Vallejo, Rubén Morales-Menéndez and J.R. Alique

10. Controlled Use of Subgoals in Reinforcement Learning
    Junichi Murata

11. Fault Detection Algorithm Based on Filters Bank Derived from Wavelet Packets
    Oussama Mustapha, Mohamad Khalil, Ghaleb Hoblos, Houcine Chafouk and Dimitri Lefebvre

12. Pareto Optimum Design of Robust Controllers for Systems with Parametric Uncertainties
    Amir Hajiloo, Nader Nariman-zadeh and Ali Moeini

13. Genetic Reinforcement Learning Algorithms for On-line Fuzzy Inference System Tuning “Application to Mobile Robotic”
    Abdelkrim Nemra and Hacene Rezine

14. Control of Redundant Submarine Robot Arms under Holonomic Constraints
    E. Olguín-Díaz, V. Parra-Vega and D. Navarro-Alarcón

15. Predictive Control with Local Visual Data
    Lluís Pacheco, Ningsu Luo and Xavier Cufí

16. New Trends in Evaluation of the Sensors Output
    Michal Pavlik, Jiri Haze and Radimir Vrba

17. Modelling and Simultaneous Estimation of State and Parameters of Traffic System
    Pavla Pecherková, Jindřich Duník and Miroslav Flídr

18. A Human Factors Approach to Supervisory Control Interface Improvement
    Pere Ponsa, Ramon Vilanova, Marta Díaz and Anton Gomà

19. An Approach to Tune PID Fuzzy Logic Controllers Based on Reinforcement Learning
    Hacene Rezine, Louali Rabah, Jèrome Faucher and Pascal Maussion

20. Autonomous Robot Navigation using Flatness-based Control and Multi-Sensor Fusion
    Gerasimos G. Rigatos

21. An Improved Real-Time Particle Filter for Robot Localization
    Dario Lodi Rizzini and Stefano Caselli

22. Dependability of Autonomous Mobile Systems
    Jan Rüdiger, Achim Wagner and Essam Badreddin

23. Model-free Subspace Based Dynamic Control of Mechanical Manipulators
    Muhammad Saad Saleem and Ibrahim A. Sultan

24. The Verification of Temporal KBS: SPARSE - A Case Study in Power Systems
    Jorge Santos, Zita Vale, Carlos Serôdio and Carlos Ramos

1. Multi-Domain Modelling and Control in Mechatronics: the Case of Common Rail Injection Systems

Paolo Lino and Bruno Maione

Dipartimento di Elettrotecnica ed Elettronica, Politecnico di Bari, Via Re David 200, 70125 Bari, Italy

1. Introduction

The optimal design of a mechatronic system calls for the proper dimensioning of the mechanical, electronic and embedded control subsystems (Dieterle, 2005; Isermann, 1996a; Isermann, 2008). According to the current approach, the design problem is decomposed into several sub-problems, which are tackled separately, thus leading to a sub-optimal solution. Usually, the mechanical part and the control system are considered independently of each other: the former is designed first, then the latter is synthesized for the already existing physical system. This approach does not exploit many potential advantages of an integrated design process, which are lost in the separate points of view of the different engineering domains. The physical properties and the dynamical behaviour of the parts in which energy conversion plays a central role are not determined by the choices of the control engineers and are therefore of little concern to them. Their primary interests, indeed, are signal processing and information management, computing power requirements, the choice of sensors and sensor locations, and so on. As a result, poorly designed mechanical parts may never lead to good performance, even in the presence of advanced controllers. On the other hand, a poor knowledge of how controllers can directly influence and compensate for defects or weaknesses in mechanical components does not help in achieving quality and good performance of the whole process. Significant improvements in overall system performance can be achieved by combining the physical system design and the control system development at an early stage (Isermann, 1996b; Stobart et al., 1999; Youcef-Toumi, 1996). Nevertheless, some obstacles have to be overcome, as this process requires knowledge of the interactions of the basic components and sub-systems under different operating conditions. To this end, a deep analysis considering the system as a whole and its transient behaviour seems necessary. In this framework, simulation represents an essential tool for designing and optimizing mechatronic systems. In fact, it can help in integrating the steps involved in the whole design process, providing tools to evaluate the effect of changes in the mechanical and control subsystems, even at early stages. Available or suitably built models may be exploited for the geometric optimization of components, the design and test of control systems, and the characterization of new systems.


Since models are application oriented, none of them has absolute validity. Models that differ in complexity and accuracy can be defined to take into account the main physical phenomena at various levels of detail (Bertram et al., 2003; Dellino et al., 2007b; Ollero et al., 2006). Mathematical modelling in a control framework requires a trade-off between accuracy in representing the dynamical behaviour of the most significant variables and the need to reduce the complexity of the controller structure and design process. Namely, if all engineering aspects are taken into account, the control design becomes unmanageable. On the other hand, the use of virtual prototyping techniques allows characterizing the system dynamics and evaluating and validating the effects of operating conditions and design parameters, which is appropriate for mechanical design (Ferretti et al., 2004); nevertheless, despite their good prediction capabilities, models obtained in this way cannot be directly used for designing a control law, as they are not in the form of mathematical equations. Conversely, from the control engineer's point of view, the use of detailed modelling tools allows the safe and reliable evaluation of the control systems. It is clear that an appropriate modelling and simulation approach cannot be fitted into the limitations of one formalism at a time, particularly in the early stages of the design process. Hence, a combination of different methodologies in a multi-formalism modelling approach, supported by an appropriate simulation environment, is necessary (van Amerongen, 2003; van Amerongen & Breedveld, 2003; Smith, 1999). The use of different domain-specific tools and software packages makes it possible to take advantage of the knowledge from different fields of expertise and of the power of each specific design environment. In this chapter, we consider the opportunity of integrating different models, at different levels of detail, and different design tools, to optimize the design of the mechanical and control systems as a whole. The effectiveness of the approach is illustrated by means of two practical case studies, involving both diesel and CNG injection systems for internal combustion engines, which represent a benchmark for the evaluation of the performance of the approach. As a virtual environment for design integration, we chose AMESim (Advanced Modelling Environment for Simulation): a simulation tool oriented to lumped parameter modelling of physical elements, interconnected by ports that highlight the energy exchanges between elements and between an element and its environment (IMAGINE S.A., 2007). AMESim, indeed, is capable of describing physical phenomena with great precision and detail and of accurately predicting the system dynamics. In a first step, we used this tool to obtain virtual prototypes of the injection systems, as similar as possible to the actual final hardware. Then, with reference to these prototypes, we also determined reduced order models in the form of transfer function and/or state space representations, more suitable for analytical (or empirical) tuning of the pressure controllers. Using virtual prototypes in these early design stages enabled the evaluation of the influence of the geometrical/physical alternatives on the reduced models used for the controller tuning. Then, based on these reduced models, the controller settings were designed and adjusted in accordance with the early stages of the mechanical design process.
Finally, the detailed physical/geometric models of the mechanical parts, created by the AMESim package, were exported and used as a module in a simulation program, which enabled the evaluation of the controllers' performance in the closed-loop system. In other words, the detailed simulation models surrogated for the real hardware. Experimental and simulation results proved the validity of the proposed approach.


2. Steps in the multi-domain design approach

An integrated design approach gives more degrees of freedom for the optimization of both the mechanical system and its control system than the classical approach. In particular, an improvement of the design process can be obtained by considering the following aspects: iteration of the design steps, use of different interacting domain-specific design tools, and application of optimization algorithms supported by appropriate models (Dellino et al., 2007a). The use of different domain-specific tools allows one to take advantage of the knowledge of engineers from different fields of expertise and of the power of each specific design environment. The interaction during the design process can be realized by using automatic optimization tools and a proper management of the communication between the different software environments, without the need for expert intervention. The experts' judgement comes into play, instead, during the performance analysis phase. The resulting integrated design process could consist of the following steps (Fig. 1):

Fig. 1. Integrated design approach for mechatronic systems development.

- Development of a virtual prototype of the considered system using a domain-specific tool (e.g. AMESim, Modelica, etc.) and analysis of the system performance.
- Realization, when needed, of a real prototype of the system. Alternatively, a virtual prototype of an existing process can be built; in that case the first two steps are swapped.
- Validation of the virtual prototype by comparing simulation results and real data. At the end of this step, the virtual prototype can be assumed to be a reliable model of the real system.
- Derivation of a simplified control-oriented analytical model of the real system (white box or black box models). Solving the equations of such analytical models is made easier by employing specific software packages devoted to the solution of differential equations (e.g. MATLAB/Simulink).
- Validation of the analytical model against the virtual prototype: this step can be considerably simplified by simulating different operating conditions.
- Design of control algorithms based on the analytical model parameters. Complex and versatile algorithms are available in computational tools like MATLAB/Simulink to design and simulate control systems. Nevertheless, the construction of accurate models in the same environment could be a complex and demanding process if a deep knowledge of the system under study is not achieved.
- Evaluation of the performance of the control laws on the virtual prototype. The use of the virtual prototype allows performing safer, less expensive, and more reliable tests than using the real system. In this chapter, the AMESim-Simulink interface allows integrating AMESim models within the Simulink environment, taking advantage of the peculiarities of both software packages.
- The final step consists in evaluating the control algorithm performance on the real system.

The described process could be suitably reiterated to optimize the system and the controller design by using automatic optimization tools. In the next sections, two case studies involving common rail injection systems for both CNG and diesel engines are considered to show the feasibility of the described design approach.

3. Integrated design of a compressed natural gas injection system

We consider a system composed of the following elements (Fig. 2): a fuel tank, storing high pressure gas, a mechanical pressure reducer, a solenoid valve and the fuel metering system, consisting of a common rail and four electro-injectors. Two different configurations were compared for implementation, with different arrangements of the solenoid valve affecting the system performance (i.e. cascade connection, Fig. 2(a), and parallel connection, Fig. 2(b), respectively). Detailed AMESim models were developed for each of them, providing critical information for the final choice. A few details illustrate the injection operation for both layouts.


Fig. 2. Block schemes of the common rail CNG injection systems; (a) cascade connection of solenoid valve; (b) parallel connection of solenoid valve. With reference to Fig. 2(a), the pressure reducer receives fuel from the tank at a pressure in the range between 200 and 20 bars and reduces it to a value of about 10 bar. Then the solenoid valve suitably regulates the gas flow towards the common rail to control pressure level and to damp oscillations due to injections. Finally, the electronically controlled injectors send the gas to the intake manifold for obtaining the proper fuel air mixture. The injection flow only depends on rail pressure and injection timings, which are precisely driven by the Electronic Control Unit (ECU). The variable inflow section of the pressure reducer is varied by the axial displacement of a spherical shutter coupled with a moving


piston. Piston and shutter dynamics are affected by the applied forces: the gas pressure in a main chamber acts on the lower piston surface, pushing it upwards, while the elastic force of a preloaded spring held in a control chamber pushes it down and causes the shutter to open. The spring preload value sets the desired equilibrium reducer pressure: if the pressure exceeds the reference value the shutter closes and the gas inflow reduces, preventing a further pressure rise; on the contrary, if the pressure decreases, the piston moves down and the shutter opens, letting more fuel enter and causing the pressure to go up in the reducer chamber (see Maione et al., 2004, for details). As for the second configuration (Fig. 2(b)), the fuel from the pressure reducer flows directly towards the rail, and the solenoid valve regulates the intake flow in a secondary circuit including the control chamber. The role of the force applied by the preloaded spring of the control chamber is now played by the pressure force in the secondary circuit, which can be controlled by suitably driving the solenoid valve. When the solenoid valve is energized, the fuel enters the control chamber, causing the pressure on the upper surface of the piston to build up. As a consequence, the piston is pushed down together with the shutter, letting more fuel enter the main chamber, where the pressure increases. On the contrary, when the solenoid valve is de-energized, the pressure on the upper side of the piston decreases, making the piston rise and the main chamber shutter close under the action of a preloaded spring (see Lino et al., 2008, for details). On the basis of a deep analysis performed on the AMESim virtual prototypes, the second configuration was chosen as the final solution, because it has advantages in terms of performance and efficiency. To sum up, it guarantees faster transients, as the fuel can reach the common rail at a higher pressure. Moreover, leakages involving the pressure reducer due to the clearance between cylinder and piston are reduced by the smaller pressure gradient between the lower and upper piston surfaces. Finally, allowing intermediate positions of the shutter in the pressure reducer permits a more accurate control of the intake flow from the tank and a remarkable reduction of the pressure oscillations due to control operations. A detailed description of the AMESim model of the system according to the final layout follows (Fig. 3a).

3.1 Virtual prototype of the compressed natural gas injection system

By assumption, the pressure distribution within the control chamber, the common rail and the injectors is uniform, and the elastic deformations of solid parts due to pressure changes are negligible. The pipes are considered as incompressible ducts with friction and a non-uniform pressure distribution. Temperature variations are taken into account, affecting the pressure dynamics in each subcomponent. Besides, only heat exchanges through pipes are considered, by properly computing a thermal exchange coefficient. The tank pressure plays the role of a maintenance input, and it is modelled by a constant pneumatic pressure source. To simplify the AMESim model construction, some supercomponents have been suitably created, collecting several elements within a single one.
The main components for modelling the pressure reducer are the Mass block with stiction and coulomb friction and end stops, which computes the piston and shutter dynamics through Newton's second law of motion, a Pneumatic ball poppet with conical seat, two Pneumatic piston components, and an Elastic contact modelling the contact between the piston and the shutter.
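As a rough illustration of the kind of force balance solved by such a Mass block, the following Python fragment (not part of the original AMESim/Matlab implementation) integrates a simplified piston equation of motion with the explicit Euler method; the mass, areas, pressures, spring data and end-stop limits are invented placeholders, not the parameters of the actual reducer.

```python
# Minimal sketch of a piston force balance (all numerical values are hypothetical).
m = 0.05                        # moving mass [kg]
k = 2.0e4                       # spring stiffness [N/m]
x0 = 1.0e-3                     # spring preload compression [m]
c = 15.0                        # viscous friction coefficient [N s/m]
A_low, A_up = 3.0e-4, 3.2e-4    # lower/upper piston surfaces [m^2]

def acceleration(x, v, p_main, p_ctrl):
    """Newton's second law for the piston: pressure, spring and viscous friction forces."""
    f_pressure = p_main * A_low - p_ctrl * A_up
    f_spring = -k * (x + x0)
    f_friction = -c * v
    return (f_pressure + f_spring + f_friction) / m

# Explicit Euler integration of the piston motion.
x, v, dt = 0.0, 0.0, 1.0e-5
for _ in range(10000):
    a = acceleration(x, v, p_main=10e5, p_ctrl=6e5)   # pressures in Pa
    v += a * dt
    x += v * dt
    x = max(0.0, min(x, 2.0e-3))                      # crude end stops
print(f"final displacement: {x * 1e3:.3f} mm")
```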


Fig. 3. (a) AMESim model of the CNG injection system; (b) solenoid with preloaded spring supercomponent.


The Pneumatic piston components compute the pressure forces acting upon the upper and lower piston surfaces. The viscous friction and leakage due to contact between piston and cylinder are taken into account through the Pneumatic leakage and viscous friction component, by specifying the length of contact, the piston diameter and the clearance. Finally, a Variable volume with pneumatic chamber is used to compute the pressure dynamics as a function of the temperature T and of the intake and outtake mass flows \dot{m}_{in}, \dot{m}_{out}, as well as of the volume changes due to the motion of mechanical parts, according to the following equation:

\frac{dp}{dt} = \frac{RT}{V} \left( \dot{m}_{in} - \dot{m}_{out} + \rho \frac{dV}{dt} \right)        (1)

where p is the fuel pressure, ρ the fuel density and V the taken up volume. The same component is used to model the common rail by neglecting the volume changes. Both pressure and viscous stresses contribute to drag forces acting on a body immersed in a moving fluid. In particular, the total drag acting on a body is the sum of two components: the pressure or form drag, due to pressure gradient, and the skin friction or viscous drag, i.e. Drag force = Form drag + Skin friction drag. By introducing a drag coefficient CD depending on Reynolds number, the drag can be expressed in terms of the relative speed v (Streeter et al., 1998):

Drag = \frac{C_D \, \rho \, A \, v^2}{2}        (2)
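To make eqs. (1) and (2) concrete, the short Python sketch below simply evaluates the pressure derivative and the drag force for one set of operating values; it is an illustration only, and the gas constant, temperature, flows and geometry are made-up numbers, not data of the system under study (the volume-change term follows the sign convention of eq. (1) as written).

```python
R = 287.0    # specific gas constant [J/(kg K)] (air, used here as test fluid)
T = 293.0    # gas temperature [K]

def dp_dt(V, m_in, m_out, rho, dV_dt):
    """Pressure derivative of a pneumatic volume, eq. (1)."""
    return R * T / V * (m_in - m_out + rho * dV_dt)

def drag(C_D, rho, A, v):
    """Quadratic drag of a body moving at relative speed v, eq. (2)."""
    return C_D * rho * A * v**2 / 2.0

# Example evaluation with illustrative operating values.
rho = 12.0   # gas density [kg/m^3]
print(dp_dt(V=2.0e-5, m_in=2.0e-3, m_out=1.5e-3, rho=rho, dV_dt=1.0e-6))
print(drag(C_D=0.8, rho=rho, A=7.0e-6, v=3.0))
```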

Moving shutters connecting two different control volumes are subject to both form drag and skin friction drag. The former is properly computed by AMESim algorithms for a variety of shutters, considering different poppet and seat shapes. The latter is computed as a linear function of the fluid speed through a factor of proportionality, which can be obtained by noting that for a spherical body the skin friction drag equals 2πDµv (Streeter et al., 1998), µ being the absolute viscosity and D the shutter diameter. The moving anchor in the solenoid valve experiences a viscous drag depending on the body shape. The skin friction drag can be computed using eq. (2) by considering the appropriate value of CD. Since, by hypothesis, the anchor moves within a fluid with uniform pressure distribution, the form drag is neglected. The continuity and momentum equations are used to compute pressures and flows through pipes so as to take into account wave propagation effects. In the case of long pipes with friction, a system of nonlinear partial differential equations is obtained, which is implemented in the Distributive wave equation submodel of the pneumatic pipe component from the pneumatic library. This is the case of the pipes connecting the pressure reducer and the common rail. The continuity and momentum equations can be expressed as follows (Streeter et al., 1998):

\frac{\partial \rho}{\partial t} + \rho \frac{\partial v}{\partial x} = 0        (3)

\frac{\partial v}{\partial t} + \frac{\alpha^2}{\rho} \frac{\partial \rho}{\partial x} + \frac{f}{2d} v |v| = 0        (4)

where α is the sound speed in the gas, d is the pipe internal diameter, f is the D'Arcy friction coefficient depending on the Reynolds number. AMESim numerically solves the above equations by discretization.
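The fragment below sketches one possible explicit discretization of eqs. (3)-(4) on a uniform grid, with Lax-Friedrichs averaging added for numerical stability; it only illustrates the idea of integrating the pipe equations and is not the scheme actually used inside AMESim. All pipe data, initial conditions and boundary treatment are arbitrary.

```python
import numpy as np

# Pipe and gas data (illustrative only).
L, N = 0.5, 51                  # pipe length [m], number of grid points
dx = L / (N - 1)
a = 340.0                       # sound speed [m/s]
d, f = 4.0e-3, 0.03             # internal diameter [m], D'Arcy friction factor

rho = np.full(N, 12.0)          # initial density [kg/m^3]
v = np.zeros(N)                 # initial velocity [m/s]
rho[0] = 13.0                   # higher density imposed at the inlet node

dt = 0.5 * dx / a               # CFL-limited time step
for _ in range(200):
    rho_n, v_n = rho.copy(), v.copy()
    # Interior nodes: central differences with Lax-Friedrichs averaging.
    rho[1:-1] = 0.5 * (rho_n[:-2] + rho_n[2:]) \
        - dt * rho_n[1:-1] * (v_n[2:] - v_n[:-2]) / (2 * dx)            # eq. (3)
    v[1:-1] = 0.5 * (v_n[:-2] + v_n[2:]) \
        - dt * (a**2 / rho_n[1:-1] * (rho_n[2:] - rho_n[:-2]) / (2 * dx)
                + f / (2 * d) * v_n[1:-1] * np.abs(v_n[1:-1]))          # eq. (4)
    rho[-1], v[-1] = rho[-2], v[-2]   # crude outlet boundary condition

print(v[N // 2])   # mid-pipe velocity after the short transient
```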


For short pipes, the Compressibility + friction submodel of the pneumatic pipe is used, which allows computing the flow according to the following equation:

q = \sqrt{\frac{2 \, d \, \Delta p}{L \, \rho \, f}}        (5)

where Δp is the pressure drop along the pipe of length L. The pipes connecting the common rail and the injectors are modelled in this way. Heat transfer exchanges are accounted for by the above mentioned AMESim components, provided that a heat transfer coefficient is properly specified. For a cylindrical pipe of length L consisting of a homogeneous material with constant thermal conductivity k and having an inner and outer convective fluid flow, the thermal flow Q is given by (Zucrow & Hoffman, 1976):

Q = \frac{2 \pi k L \Delta T}{\ln(r_o / r_i)}        (6)

where ΔT is the temperature difference between the internal and external surfaces, and ro and ri are the external and internal radii, respectively. With reference to the outside surface of the pipe, the heat-transfer coefficient Uo is:

U_o = \frac{k}{r_o \ln(r_o / r_i)}        (7)
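A direct, hedged transcription of eqs. (6)-(7) in Python follows; the conductivity and radii used in the example call are placeholders chosen only to show the order of the computation.

```python
import math

def thermal_flow(k, L, dT, r_o, r_i):
    """Conductive thermal flow through a cylindrical pipe wall, eq. (6)."""
    return 2.0 * math.pi * k * L * dT / math.log(r_o / r_i)

def heat_transfer_coeff(k, r_o, r_i):
    """Heat-transfer coefficient referred to the outside pipe surface, eq. (7)."""
    return k / (r_o * math.log(r_o / r_i))

# Example: 1 m long steel pipe with a 30 K temperature difference (illustrative values).
print(thermal_flow(k=45.0, L=1.0, dT=30.0, r_o=4.0e-3, r_i=3.0e-3))
print(heat_transfer_coeff(k=45.0, r_o=4.0e-3, r_i=3.0e-3))
```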

The AMESim model for the solenoid valve is composed of the following elements: a solenoid with preloaded spring, two moving masses with end stops subject to viscous friction and representing the magnet anchor and the shutter respectively, and a component representing the elastic contact between the anchor and the shutter. The intake section depends on the axial displacement of the shutter over the conical seat and is computed within the Pneumatic ball poppet with conical seat component, which also evaluates the drags acting on the shutter. The solenoid valve is driven by a peak-hold modulated voltage. The resulting current consists of a peak phase followed by a variable duration hold phase. The valve opening time is regulated by varying the ratio between the hold phase duration and signal period, namely the control signal duty cycle. This signal is reconstructed by using a Data from ASCII file signal source that drives a Pulse Width Modulation component. To compute the magnetic force applied to the anchor, a supercomponent Solenoid with preloaded spring in Fig. 3a modelling the magnetic circuit has been suitably built, as described in the following (Fig. 3b). The magnetic flux within the whole magnetic circuit is given by the Faraday law:

\dot{\varphi} = \frac{e_{ev} - R_{ev} \, i_{ev}}{n}        (8)

where φ is the magnetic flux, R_{ev} the resistance of the n-turn winding, e_{ev} the applied voltage and i_{ev} the circuit current. Flux leakage and eddy currents have been neglected. The magnetomotive force MMF able to produce the magnetic flux has to compensate the magnetic potential drop along the magnet and the air gap paths. Even though most of the circuit reluctance is due to the air gap, the nonlinear properties of the magnet, due to saturation and hysteresis, appreciably affect the system behaviour. The following equation holds:


MMF = MMF_s + MMF_a = H_s l_s + H_a l_a        (9)

where H is the magnetic field strength and l is the magnetic path length, within the magnet and the gap respectively. The air gap length depends on the actual position of the anchor. The magnetic induction within the magnet is a nonlinear function of H. It is assumed that the magnetic flux cross section is constant along the circuit, yielding:

B = \frac{\varphi}{A_m} = f(H_s) = \mu_0 H_a        (10)

where Am is the air gap cross section and µ0 is the magnetic permeability of air. The B-H curve is the hysteresis curve of the magnetic material. Combining the previous equations yields φ, B and H. The resulting magnetic force and circuit current are:

F_{ev} = \frac{A_m B^2}{\mu_0}        (11)

i_{ev} = \frac{MMF}{n}        (12)
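The sketch below chains eqs. (8)-(12) over a simple explicit time integration, using a smooth saturating B-H curve in place of the real hysteresis characteristic; every numerical value (turns, resistance, lengths, cross section, saturation level) is a made-up placeholder and the code is only meant to show how the relations fit together, not to reproduce the AMESim supercomponent.

```python
import math

n, R_coil = 120, 5.0          # turns, winding resistance [ohm]
A_m = 8.0e-5                  # flux cross section [m^2]
l_s, l_a = 0.06, 0.3e-3       # iron path length / air gap length [m]
mu0 = 4.0e-7 * math.pi
B_sat, mu_r = 1.6, 2000.0     # parameters of a simple saturating B-H curve

def H_iron(B):
    """Inverse B-H curve of the core: linear at low B, saturating smoothly."""
    return B / (mu0 * mu_r) * (1.0 + (abs(B) / B_sat) ** 6)

phi, i, B = 0.0, 0.0, 0.0
dt, e = 1.0e-6, 12.0          # time step [s], applied voltage [V]
for _ in range(5000):
    phi += dt * (e - R_coil * i) / n          # Faraday law, eq. (8)
    B = phi / A_m                             # eq. (10)
    mmf = H_iron(B) * l_s + (B / mu0) * l_a   # eq. (9), with H_a = B / mu0
    i = mmf / n                               # eq. (12)
force = A_m * B**2 / mu0                      # eq. (11)
print(f"current = {i:.2f} A, force = {force:.1f} N")
```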

The force computed by eq. (11) is applied to the mass component representing the anchor, so that the force balance can be properly handled by AMESim. The injectors are solenoid valves driven by the ECU depending on engine speed and load. The whole injection cycle takes place in a 720° interval with a 180° delay between each injection. A supercomponent including the same elements as for the solenoid valve has been built to model the electro-injectors. The command signal generation is delegated to the ECU component, which provides a square signal driving each injector and depending on the current engine speed, injector timings and pulse phase angle.

3.2 Controller design for a compressed natural gas injection system

In designing an effective control strategy for the injection pressure it is necessary to satisfy physical and technical constraints. In this framework, model predictive control (MPC) techniques are a valuable choice, as they have shown good robustness in the presence of large parametric variations and model uncertainties in industrial process applications. They predict the output from a process model and then apply a control action able to drive the system to a reference trajectory (Rossiter, 2003). A 2nd order state space analytical model of the plant (Lino et al., 2008) is used to derive a predictive control law for the injection pressure regulation. The model trades off between accuracy in representing the dynamical behaviour of the most significant variables and the need to reduce the computational effort and the complexity of the controller structure and development. The design steps are summarized in the following. Firstly, the model is linearized at different equilibrium points, depending on the working conditions set by the driver power request, speed and load. From the linearized models it is possible to derive a discrete transfer function representation by using a backward difference method. Finally, a discrete Generalised Predictive Control (GPC) law suitable for implementation in the ECU is derived from the discrete linear model equations. By considering the duty cycle of the signal driving the solenoid valve and the rail pressure as the input u and output y respectively, a family of ARX models can be obtained, according to the above-mentioned design steps (Lino et al., 2008):

(1 - a_1 z^{-1}) \, y(t) = (b_0 z^{-1} - b_1 z^{-2}) \, u(t)        (13)


where z-1 is the shift operator and a1, b0, b1 are constant parameters. The j-step optimal predictor of a system described by eq. (13) is (Rossiter, 2003):

\hat{y}(t+j|t) = G_j \, \Delta u(t+j-1) + F_j \, y(t)        (14)

where Gj and Fj are polynomials in q^{-1}, and Δ is the discrete derivative operator. Let r be the vector of the elements y(t+j), j = 1, ..., N, depending on known values at time t. Then eq. (14) can be expressed in the matrix form \hat{y} = G\tilde{u} + r, where \tilde{u} = [\Delta u(t), \ldots, \Delta u(t+N-1)]^T and G is a lower triangular N×N matrix (Rossiter, 2003). If the vector w is the sequence of future reference values, a cost function taking into account the future errors can be introduced:

J = E\left\{ (G\tilde{u} + r - w)^T (G\tilde{u} + r - w) + \lambda \tilde{u}^T \tilde{u} \right\}        (15)

where λ is a sequence of weights on future control actions. The minimization of J with respect to \tilde{u} gives the optimal control law for the prediction horizon N:

\tilde{u} = (G^T G + \lambda I)^{-1} G^T (w - r)        (16)

At each step, the first computed control action is applied and then the optimization process is repeated after updating all vectors. It can be shown (Lino et al., 2008) that the resulting control law for the case study becomes:

\Delta u(t) = k_1 w(t) + (k_2 + k_3 q^{-1}) \, y(t) + k_4 \Delta u(t-1)        (17)

where [k1, k2, k3, k4] depends on N.
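To illustrate how the receding-horizon computation behind eqs. (16)-(17) can be carried out, the Python sketch below implements an unconstrained GPC on an ARX model of the form (13), using the equivalent step-response formulation (step-response matrix G, free response obtained by holding the input constant) rather than the Diophantine-based predictor of eq. (14); the model coefficients, horizon and weight are arbitrary examples, not the values identified for the injection system.

```python
import numpy as np

# Illustrative ARX model (13): (1 - a1 z^-1) y(t) = (b0 z^-1 - b1 z^-2) u(t)
a1, b0, b1 = 0.9, 0.08, 0.03
N, lam = 5, 0.5                          # prediction horizon, control weighting

# Step-response coefficients g_1..g_N (response of y to a unit step in u).
g, y_step, u_old = np.zeros(N), 0.0, 0.0
for k in range(N):
    y_step = a1 * y_step + b0 * 1.0 - b1 * u_old
    u_old = 1.0
    g[k] = y_step

# Lower-triangular step-response matrix G and the GPC gain of eq. (16).
G = np.zeros((N, N))
for i in range(N):
    G[i, : i + 1] = g[i::-1]
K = np.linalg.solve(G.T @ G + lam * np.eye(N), G.T)

# Receding-horizon loop: only the first increment is applied, cf. eq. (17).
y, u_prev, w = 0.0, 0.0, 1.0             # output, last applied input, set-point
for t in range(60):
    # Free response over the horizon: future input held at u_prev (Δu = 0).
    r, yf = np.zeros(N), y
    for j in range(N):
        yf = a1 * yf + (b0 - b1) * u_prev
        r[j] = yf
    du = K @ (w * np.ones(N) - r)
    u = u_prev + du[0]
    y = a1 * y + b0 * u - b1 * u_prev    # nominal model used as the plant
    u_prev = u
print(round(y, 3))                        # settles close to the set-point w
```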

4. The common rail injection system of diesel engines

The main elements of the common rail diesel injection system in Fig. 4 are a low pressure circuit, including the fuel tank and a low pressure pump, a high pressure pump with a delivery valve, a common rail and the electro-injectors (Stumpp & Ricco, 1996). A few details illustrate the injection operation. The low pressure pump sends the fuel coming from the tank to the high pressure pump. Hence the pump pressure rises and, when it exceeds a given threshold, the delivery valve opens, allowing the fuel to reach the common rail, which supplies the electro-injectors. The common rail hosts an electro-hydraulic valve driven by the Electronic Control Unit (ECU), which drains the amount of fuel necessary to set the fuel pressure to a reference value. The valve driving signal is a square current with a variable duty cycle (i.e. the ratio between the lengths of the “on” and “off” phases), which in fact makes the valve partially open and regulates the rail pressure. The high pressure pump is of reciprocating type, with a radial piston driven by the eccentric profile of a camshaft. It is connected by a small orifice to the low pressure circuit and by a delivery valve with a conical seat to the high pressure circuit. When the piston of the pump is at the lower dead centre, the intake orifice is open and allows the fuel to fill the cylinder, while the downstream delivery valve is closed by the forces acting on it. Then, the closure of the intake orifice, due to the camshaft rotation, leads to the compression of the fuel inside the pump chamber.


Fig. 4. Block schemes of the common rail diesel injection systems. When the resultant of valve and pump pressures overcomes a threshold fixed by the spring preload and its stiffness, the shutter of the delivery valve opens and the fuel flows from the pump to the delivery valve and then to the common rail. As the flow sustained by the high pressure pump is discontinuous, a pressure drop occurs in the rail due to injections when no intake flow is sustained, while the pressure rises when the delivery valve is open and injectors closed. Thus, to reduce the rail pressure oscillations, the regulator acts only during a specific camshaft angular interval (activation window in the following), and its action is synchronized with the pump motion. The main elements of an electro-injector for diesel engines are a distributor and a control chamber. The control chamber is connected to the rail and to a low pressure volume, and both its inlet and outlet sections are regulated by an electro-hydraulic valve. The distributor includes the feeding pipes and a plunger pushed by a spring against the injection orifices. The plunger axial position depends on the balance of forces acting upon its surfaces, i.e. the control chamber pressure force, the spring force pulling it in closed position, and the distributor pressure force, in the opposite direction. During normal operations the valve electro-magnetic circuit is off and the control chamber is fed by the high pressure fuel coming from the common rail. When the electro-magnetic circuit is excited, the control chamber intake orifice closes while the outtake orifice opens, causing a pressure drop; after a short transient, the plunger reaches the top position disclosing the injection orifices, allowing the injection of the fuel in the cylinders. The Energizing Time (ET) depends on the fuel amount to be injected. When the electro-magnetic circuit is off the control chamber is filled in, so that the plunger is pulled back by the preloaded spring towards the closed position. In the system under study, the whole injection cycle takes place in a complete camshaft revolution and consists of two injections starting every 180 degrees of rotation. In the described system, the pressure regulation aims at supplying the engine precisely with the specific amount of fluid and the proper air/fuel mixture demanded by its speed and load.


4.1 Virtual prototype of the common rail diesel injection system

To build the AMESim model of the common rail diesel injection system, assumptions similar to those of the previous case are made concerning the pressure distributions within lumped volumes such as the common rail, the high pressure pump and the injectors control volumes. Differently from the previous case, temperature does not affect the pressure dynamics. Drag forces acting on moving shutters are computed as previously described. Further, the low pressure pump delivers fuel towards the high pressure pump at a constant pressure, so it is considered as an infinite volume source. On the other hand, because of the isobaric expansion during injection, the cylinders’ pressure is slightly variable within a range that can be determined experimentally. For this reason, the cylinders are considered as infinite volumes of constant, albeit uncertain, pressure. Since most of the relevant components have been previously described in depth, a brief discussion introduces those used to assemble the virtual prototype of the diesel injection system shown in Fig. 5a. The high pressure pump model is composed of two subsystems, the former representing the pump dynamics, the latter describing the delivery valve behaviour. In particular, the Cam and cam follower block is used to represent the cam profile and its rotary motion, which affects the piston axial displacement. The Spool with annular orifice models the orifice connecting the low pressure circuit to the high pressure pump; its section varies according to the piston displacement. The piston inertia is neglected in this model. The leakage due to the contact between piston and cylinder is taken into account through the Viscous frictions and leakages component, by specifying the length of contact, the piston diameter and the clearance. Finally, a Hydraulic volume with compressibility is used to compute the pressure dynamics as a function of the intake and outtake flows q_in and q_out, as well as of the volume changes dm/dt due to the motion of mechanical parts, according to the following equation (IMAGINE S.A., 2007):

\frac{dp}{dt} = \frac{K_f}{V} \left( q_{in} - q_{out} - \frac{dm}{dt} \right)        (18)

where p is the fuel pressure, V the instantaneous volume of liquid and Kf is the fuel bulk modulus of elasticity. The intake and outtake flows come from the energy conservation law. The same component is used to model the delivery valve internal volume, the control chamber and the distributor volumes inside the electro-injectors, and the common rail. The components included in the delivery valve model are a Mass block with stiction and coulomb friction and end stops, which computes the shutter dynamics, a Poppet with sharp edge seat, a Hydraulic volume with compressibility, and a Piston with spring representing the force applied by the preloaded spring on the delivery valve shutter. To model pipes within the diesel common rail injection system two situations are considered, i.e. short pipes and long pipes, both accounting for friction, fuel compressibility and expansion of pipes due to high pressures. Short pipes are modelled by the compressibility + friction hydraulic pipe sub-model, which uses an effective bulk modulus KB to take into account both the compressibility of the fluid and the expansion of the pipe wall with pressure; the effective bulk modulus depends on the wall thickness and Young's modulus of the wall material. The equation describing the pressure dynamics at the mid-point is:

\frac{\partial p}{\partial t} + \frac{K_B}{A} \frac{\partial q}{\partial x} = 0        (19)


Fig. 5. (a) AMESim model of the common rail diesel injection system; (b) injector supercomponent.

In eq. (19), A is the pressure-dependent cross-sectional area of the pipe. Pipe friction is computed using a friction factor based on the Reynolds number and the relative roughness (IMAGINE S.A., 2007). The resulting flow is calculated by means of Eq. (5). This model is used for the pipes connecting the common rail and the injectors. The long pipe connecting the delivery valve to the common rail is modelled by using the Simple wave equation hydraulic pipe, which is based on the continuity equation (19) and the momentum equation, giving for incompressible fluids:


\frac{\partial q}{\partial t} - \frac{A}{\rho} \frac{\partial p}{\partial x} + v \frac{\partial q}{\partial x} + \frac{f \, q |q|}{2 d A} = 0        (20)

where ρ is the fuel density, d the pipe internal diameter, v the mean flow speed and f the friction factor. The electro-hydraulic valve model includes: a Mass block with stiction and coulomb friction and end stops representing the shutter dynamics; a Piston with spring; the supercomponent Electro-magnetic circuit, which is obtained similarly to the Solenoid with preloaded spring supercomponent and converts the controller signal into a force applied to the shutter; a Spool with annular orifice modelling the shutter. Finally, a supercomponent has been used to model the electro-injectors (Fig. 5b), which is a slightly modified version of the block available within the AMESim library. In particular, it consists of two sub-models representing the control chamber and the distributor, respectively. The former sub-model is equal to the electro-hydraulic valve model. The latter includes a Mass block with stiction and coulomb friction and end stops and a Poppet with conical seat for the plunger, a Piston with spring, a Piston computing volume changes due to plunger motion, and a Viscous frictions and leakages component to take into account flows between the control chamber and the distributor.

4.2 Controller design of the diesel common rail

As previously stated, to develop both an appropriate control strategy and an effective controller tuning, a simplified model of the diesel common rail injection system is necessary. In fact, considering too many details and a high number of adjustable parameters makes the design of the control law quite difficult. Hence, to this aim, a lumped parameter nonlinear model is considered (Lino et al., 2007), which is suitable for control purposes and can be adapted to different injection systems with the same architecture. The model is validated by simulation, using the package AMESim. The model is expressed in state space form, where the pump pressure pp and the common rail pressure pr are the state variables, while the camshaft angular position θ and speed ωrpm, the regulator exciting signal u and the injectors driving signal ET are the inputs. Assuming, without loss of generality, that no reversal flows occur, the state space representation is (Lino et al., 2007):

\dot{p}_p = \eta_a(p_p, p_r, \omega_{rpm}, \theta)
\dot{p}_r = f_1(p_p, p_r) + f_2(p_r) \cdot u + \eta_b(p_r, ET)        (21)

where f1 and f2 are known functions of the pressures, which are accessible for measurement and control. The functions ηa and ηb also depend on parameters which are uncertain (i.e. camshaft angular position and cylinders pressure) and not available for control purposes. The aim of the control action u is to take pp and pr close to the constant set-points Pp and Pr. Hence, by defining ep = pp – Pp and er = pr – Pr, equations (21) become:

\dot{e}_p = \eta_a(e_p, e_r, \omega_{rpm}, \theta)
\dot{e}_r = f_1(e_p, e_r) + f_2(e_r) \cdot u + \eta_b(e_r, ET)        (22)

Given a system described by equations (22), it is possible to design a sliding mode control law that can effectively cope with system nonlinearities and uncertainties (Khalil, 2002). The aim of the sliding mode approach is to design a control law u able to take the system trajectory onto a sliding surface s = er – ψ(ep) = 0 and, as soon as the trajectory lies on this surface, u must also make ep = 0. In (22), er plays the role of the control input; therefore the control can be achieved by solving a stabilization problem for \dot{e}_p = \eta_a(e_p, e_r, \omega_{rpm}, \theta). A control law u is designed to bring s to 0 in finite time and to make s maintain this value for all future time. Since, by (22),

\dot{s} = f_1(e_p, e_r) - \frac{\partial \psi}{\partial e_p} \, \eta_a(e_p, e_r, \omega_{rpm}, \theta) + f_2(e_r) \cdot u + \eta_b(e_r, ET)        (23)

the control law u can be designed to cancel f1(ep, er) on the right-hand side of (23):

u = \frac{1}{f_2(e_r)} \left[ -f_1(e_p, e_r) + \varepsilon \right]        (24)

where ε must be chosen to compensate the other nonlinear terms in (23). If the absolute value of the remaining term in (23) is bounded by a positive function σ(ep, er) ≥ 0, it is possible to design ε to force s toward the sliding surface s = 0. More precisely, the sliding surface is attractive if ε is given by:

\varepsilon = -\beta(e_p, e_r) \, \mathrm{sgn}(s), \qquad \beta(e_p, e_r) \ge \sigma(e_p, e_r) + \beta_0        (25)

with β0 > 0 coping with uncertainties (Lino et al., 2007). This ensures that s\dot{s} \le 0, so that all the trajectories starting off the sliding surface reach it in finite time and those on the surface cannot leave it. The sliding surface is chosen as s = er + kep = 0, where k is an appropriate constant representing the sliding surface slope. In particular, ep → 0 only if k → ∞. With a finite k, ep, and consequently er, are finite and can only be made smaller than a certain value. However, to avoid saturation of the control valve, k cannot be chosen too high. Finally, the sliding surface is made attractive for the error trajectory by a proper choice of β(ep, er). To compensate the rail pressure drop Δpr caused by the injection occurring within the angular interval [180°, 360°] during regulator inactivity, a compensation term is introduced in the pressure reference which is derived from the injection flow equation. Thus, the new pressure reference becomes Pr - Δpr (Lino et al., 2007).
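A skeletal Python version of the control law (24)-(25) is given below; the functions f1 and f2 and the disturbance bound used for β are arbitrary placeholders standing in for the model terms of eq. (22), the surface slope k is an example value, and the saturation and activation-window logic of the real regulator are omitted.

```python
import math

def f1(e_p, e_r):
    """Placeholder for the known model term f1(e_p, e_r) of eq. (22)."""
    return -0.8 * e_r + 0.1 * e_p

def f2(e_r):
    """Placeholder for the (non-zero) input gain f2(e_r) of eq. (22)."""
    return 2.0 + 0.01 * e_r

def sliding_mode_control(e_p, e_r, k=0.5, sigma_bound=1.0, beta0=0.2):
    """Control law of eqs. (24)-(25) on the sliding surface s = e_r + k*e_p."""
    s = e_r + k * e_p
    beta = sigma_bound + beta0             # beta(e_p, e_r) >= sigma(e_p, e_r) + beta0
    eps = -beta * math.copysign(1.0, s)    # eps = -beta * sgn(s)
    return (-f1(e_p, e_r) + eps) / f2(e_r)

# Example call: 10 bar pump-pressure error and 30 bar rail-pressure error.
print(sliding_mode_control(e_p=10.0, e_r=30.0))
```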

5. Simulation and experimental results

5.1 The CNG injection system

To assess the effectiveness of the AMESim model in predicting the system behaviour, a comparison of simulation and experimental results has been performed. Since, for safety reasons, air is used as test fluid, the experimental setup includes a compressor, providing air at a constant input pressure and substituting the fuel tank. The injection system is equipped with four injectors sending the air to a discharging manifold. Moreover, a PC system with a National Instruments acquisition board is used to generate the engine speed and load signals, and a programmable MF3 development master box takes the role of the ECU driving the injectors and the control valve. Figure 6 refers to a typical transient operating condition, and compares experimental and simulation results. With a constant 40 bar input pressure, the system behaviour has been evaluated for a constant tj = 3 ms injector opening time interval, while varying the engine speed and the solenoid valve driving signal.


The engine speed is composed of ramp profiles (Fig. 6c), while the duty cycle changes abruptly within the interval [2%, 12%] (Fig. 6d). Figures 6a and 6b show that the resulting dynamics is in accordance with the expected behaviour. A maximum error of 10% confirms the model validity. After the validation process, the AMESim virtual prototype was used to evaluate the GPC controller performance in simulation by employing the AMESim-Simulink interface, which enabled us to export AMESim models within the Simulink environment. The interaction between the two environments operates in a Normal mode or a Co-simulation mode. As for the former, a compiled S-function containing the AMESim model is generated and included in the Simulink block scheme, and then integrated by the Simulink solver. As for the latter, which is the case considered in this chapter, AMESim and Simulink cooperate by integrating the relevant portions of the models.


Fig. 6. Simulation and experimental results when varying duty cycle and engine speed, with a constant tj = 3 ms; (a) control chamber pressure; (b) common rail pressure; (c) engine speed; (d) control signal duty cycle.

The GPC controller was tuned referring to models linearized at the starting equilibrium point, according to the design steps of Section 3.2. The test considered ramp variations of the engine speed and load, for the system controlled by a GPC with a N = 5 (0.5 s) prediction horizon. The input air pressure from the compressor was always 30 bar. The rail pressure reference was read from a static map depending on the working condition and had a sort of ramp profile as well.


The final design step consisted in the application of the GPC control law to the real system. In Fig. 7, the engine speed accelerates from 1100 rpm to 1800 rpm and then decelerates to 1100 rpm, within a 20 s time interval (Fig. 7b). The control action applied to the real system guarantees a good reference tracking, provided that its slope does not exceed a certain value (Fig. 7a, time intervals [0, 14] and [22, 40]). Starting from time 14 s, the request of a quick pressure reduction causes the control action to close the valve completely (Fig. 7c) by imposing a duty cycle equal to 0. Thanks to the injections, the rail pressure (Fig. 7a) decreases to the final 5 bar reference value, with a time constant depending on the system geometry; the maximum error amplitude cannot be reduced due to the actuation variable saturation. Fig. 7d shows the injectors' exciting time during the experiment. It is worth noting that simulation and experimental results are in good accordance, supporting the proposed approach.


Fig. 7. Model and real system responses for speed and load ramp variations and a 30 bar input pressure, when controlled by a GPC with N = 5; (a) common rail pressure; (b) engine speed; (c) duty cycle; (d) injectors exciting time interval.

5.2 The diesel injection system

The state space model used for designing the pressure controller has been implemented and simulated in the Matlab/Simulink® environment. To assess its capability of predicting the rail pressure dynamics for each injection cycle, simulation results have been compared both with experimental data obtained on a Common Rail injection system (Dinoi, 2002) and with those provided by the AMESim software for fluid dynamic simulation.


Modelling and simulation within this prototyping environment are used for verifying alternative designs and parameterizations. For the sake of brevity, experimental results are not shown in this chapter (see Lino et al., 2007 for details).


Fig. 8. Comparison of AMESim and Matlab simulated pump and rail pressures, by varying the solenoid valve driving signal and for different camshaft speeds: (a), (b), (c) pump pressure; (d), (e), (f) rail pressure.


Fig. 8 compares Matlab and AMESim simulations referred to two complete camshaft revolutions. This figure represents pump and rail pressures, for 800, 1300 and 1800 rpm camshaft speeds respectively, and different values of the electro-hydraulic valve duty-cycle. According to Figure 8, the pump pressure increases because of the piston motion, until the delivery valve opens. From this moment on, the pressure decreases because of the outflow towards the rail. Subsequently, the pressure increases again because of the camshaft profile. The rail pressure is constant during the angular interval in which the delivery valve is closed. The opening of the delivery valve causes a pressure increase, which is immediately compensated by the intervention of the regulation valve. We can conclude that the pressure dynamics are well modelled, both in amplitude and in timing. The difference in the steady values is due to the approximation introduced in the state space model for the electrohydraulic valve.


Fig. 9. AMESim and Matlab rail pressure dynamics with reference step variations and ramp camshaft speed variations; (a) step and ramp increments in absence of injections; (b) step and ramp increments in presence of injections; (c) step and ramp decrements in absence of injections; (d) step and ramp decrements in presence of injections.

To test the sliding mode controller tracking and disturbance rejection capabilities, we have extensively simulated different operating conditions by using the AMESim software. To check the effectiveness of the approach, AMESim simulations have been compared with Matlab simulations of the state space model. Significant results are discussed in the following.


First of all, a reference pressure step variation is considered, while varying the camshaft speed, without injections. In Fig. 9a, a 300 bar set-point variation occurs at time 0.2 s, starting from 200 bar up to 500 bar. The initial 1000 rpm camshaft speed increases to 1600 rpm following a ramp profile, within a 0.5 s time interval starting at time 0.2 s. The injectors are kept completely closed, so that the controller copes with the pressure disturbances due to the pump motion. Moreover, the control action, computed at the beginning of each injection cycle, is applied during the valve activation window. The rail pressure is properly taken close to the set-point without any overshoot, but it suffers from undesirable oscillations, as the control action is held constant during the whole camshaft revolution. Figure 9b considers analogous operating conditions, but in the presence of injections. The injectors’ driving signal acts to keep the angular opening interval constant, regardless of the camshaft speed. The sliding mode controller is still able to maintain the rail pressure close to the reference value, with a good rejection of the disturbance due to the injection flow. The high pressure oscillations, due to the injectors’ operation, cannot be removed, as the control valve acts only during the activation window. Finally, the rising transient is slower than in the previous case, as a fraction of the fuel delivered by the pump is sent to the cylinders. In both cases, the Matlab simulations are in good accordance with those performed within the AMESim environment, showing the feasibility of the derivation of the control law from the reduced order model. In Figures 9c and 9d, a pressure reference step variation occurs at time 0.4 s, while the speed decreases from 1600 rpm to 1000 rpm, within a 0.5 s time interval starting at time 0.2 s, following a ramp profile. In the first case (Fig. 9c), the injectors are kept completely closed. In the second case (Fig. 9d), the injectors opening time is proportional to the camshaft speed. It is possible to note that, within the time interval 0.2-0.4 s, the control action is not able to maintain the rail pressure at the set-point, as decreasing the pump speed reduces the fuel supply for each camshaft revolution. Even in these working conditions, the comparison of Matlab and AMESim results confirms the proposed approach.

7. Conclusion

In this chapter, we presented a procedure for integrating different models and tools for a reliable design, optimization and analysis of a mechatronic system as a whole, encompassing the real process and the control system. The effectiveness of the methodology has been illustrated by introducing two practical case studies involving the CNG injection system and the common rail diesel injection system for internal combustion engines. The design process included the analysis of different candidate configurations, carried out with the help of virtual prototypes developed in the AMESim environment, the design and performance evaluation of controllers designed on simpler models of the plants by employing the virtual prototypes, and the validation of the control laws on the real systems. Simulation and experimental results proved the validity of the approach.

8. References

van Amerongen, J. (2003). Mechatronic design. Mechatronics, Vol. 13, pp. 1045-1066, ISSN 0957-4158.
van Amerongen, J. & Breedveld, P. (2003). Modelling of physical systems for the design and control of mechatronic systems. Annual Reviews in Control, Vol. 27, pp. 87-117, ISSN 1367-5788.


Bertram, T., Bekes, F., Greul, R., Hanke, O., Ha, C., Hilgert, J., Hiller, M., Ottgen, O., Opgen-Rhein, P., Torlo, M. & Ward, D. (2003). Modelling and simulation for mechatronic design in automotive systems. Control Engineering Practice, Vol. 11, pp. 179-190, ISSN 0967-0661.
Dellino, G., Lino, P., Meloni, C. & Rizzo, A. (2007a). Enhanced evolutionary algorithms for multidisciplinary design optimization: a control engineering perspective, In: Hybrid Evolutionary Algorithms, Grosan, C., Abraham, A. & Ishibuchi, H. (Eds.), pp. 39-76, Springer Verlag, ISBN 978-3-540-73296-9, Berlin, Germany.
Dellino, G., Lino, P., Meloni, C. & Rizzo, A. (2007b). Kriging metamodel management in the design optimization of a CNG injection system. To appear in Mathematics and Computers in Simulation, ISSN 0378-4754.
Dieterle, W. (2005). Mechatronic systems: automotive applications and modern design methodologies. Annual Reviews in Control, Vol. 29, pp. 273-277, ISSN 1367-5788.
Dinoi, A. (2002). Control issues in common rail diesel injection systems (in Italian), Master Thesis, Bari, Italy.
Ferretti, G., Magnani, G. & Rocco, P. (2004). Virtual prototyping of mechatronic systems. Annual Reviews in Control, Vol. 28, No. 2, pp. 193-206, ISSN 1367-5788.
IMAGINE S.A. (2007). AMESim Reference Manual rev7, Roanne.
Isermann, R. (1996a). Modeling and design methodology for mechatronic systems. IEEE Transactions on Mechatronics, Vol. 1, No. 1, pp. 16-28, ISSN 1083-4435.
Isermann, R. (1996b). On the design and control of mechatronic systems - a survey. IEEE Transactions on Industrial Electronics, Vol. 43, No. 1, pp. 4-15, ISSN 0278-0046.
Isermann, R. (2008). Mechatronic systems - Innovative products with embedded control. Control Engineering Practice, Vol. 16, pp. 14-29, ISSN 0967-0661.
Khalil, H. K. (2002). Nonlinear Systems, Prentice Hall, ISBN 978-0130673893, Upper Saddle River.
Lino, P., Maione, B. & Rizzo, A. (2007). Nonlinear modelling and control of a common rail injection system for diesel engines. Applied Mathematical Modelling, Vol. 31, No. 9, pp. 1770-1784, ISSN 0307-904X.
Lino, P., Maione, B. & Amorese, C. (2008). Modeling and predictive control of a new injection system for compressed natural gas engines. Control Engineering Practice, Vol. 16, No. 10, pp. 1216-1230, ISSN 0967-0661.
Maione, B., Lino, P., DeMatthaeis, S., Amorese, C., Manodoro, D. & Ricco, R. (2004). Modeling and control of a compressed natural gas injection system. WSEAS Transactions on Systems, Vol. 3, No. 5, pp. 2164-2169, ISSN 1109-2777.
Ollero, A., Boverie, S., Goodall, R., Sasiadek, J., Erbe, H. & Zuehlke, D. (2006). Mechatronics, robotics and components for automation and control. Annual Reviews in Control, Vol. 30, pp. 41-54, ISSN 1367-5788.
Rossiter, J. A. (2003). Model-Based Predictive Control: a Practical Approach, CRC Press, ISBN 978-0849312915, New York.
Smith, M. H. (1999). Towards a more efficient approach to automotive embedded control system development, Proceedings of the IEEE CACSD Conference, pp. 219-224, ISBN 0-7803-5500-8, Hawaii, USA, August 1999.
Stobart, R., May, A., Challen, B. & Morel, T. (1999). New tools for engine control system development. Annual Reviews in Control, Vol. 23, pp. 109-116, ISSN 1367-5788.

22

Robotics, Automation and Control

Streeter, V., Wylie, K. & Bedford, E. (1998). Fluid Mechanics, McGraw-Hill, ISBN 9780070625372, New York. Stumpp, G. & Ricco, M. (1996). Common rail – An attractive fuel injection system for passenger car DI diesel engine. SAE Technical Paper 960870. Youcef-Toumi, K. (1996) Modeling, design, and control integration: a necessary step in mechatronics. IEEE Transactions on Mechatronics, Vol. 1, No. 1, pp. 29-38, ISSN 10834435. Zucrow, M. & Hoffman J. (1976). Gas Dynamics, John Wiley & Sons, ISBN 978-0471984405, New York.

2 Time-Frequency Representation of Signals Using Kalman Filter Jindřich Liška and Eduard Janeček

Department of Cybernetics, University of West Bohemia in Pilsen, Univerzitní 8, 306 14 Pilsen, Czech Republic

1. Introduction
Data analysis is a necessary part of practical applications as well as of pure research. The problem of data analysis is of great interest in engineering areas such as signal processing, speech recognition, vibration analysis, time series modelling, etc. Sensing of physical effects transformed to time series of voltage represents the measured reality for us. However, knowledge of the time characteristics of measured data is often insufficient for describing real signal properties. Therefore, frequency analysis, as an alternative signal description approach, has entered signal processing methods. Unfortunately, data from real applications are mostly non-stationary and in general represent non-linear processes. This fact has led to the introduction of joint time-frequency representations of measured signals.
Former time-frequency methods dealing with non-stationary data analysis, such as the short-time Fourier transform, repeatedly process blocks of data under the assumption that the frequency characteristics are time-invariant and/or the process is stationary within each data block. The resolution of such methods is then limited by the Heisenberg-Gabor uncertainty principle. This limitation stems mainly from the use of Fourier pairs and from the block processing.
In this chapter, the definition of instantaneous frequency introduces a new approach to acquiring a proper time-frequency representation of a signal. The limitations which have to be taken into account in order to achieve meaningful results are also described. The section dealing with the instantaneous frequency phenomenon is followed by a part describing the construction of the signal model. A signal is modelled as the output of a system that consists of a sum of auto-regressive (AR), linear time-invariant (LTI), second-order subsystems. Such a model corresponds, in fact, to a system of resonators connected in parallel. The reasons for this model selection are its simple matrix implementation and the preservation of a physical meaning of the decomposed signal components. The signal model is developed in state space and it is also shown how to select all system matrices adequately.
The estimation of particular signal components is then obtained through adaptive Kalman filtering based on the previously defined state-space model. The Kalman filter recursively estimates time-varying signal components in complex form. The complex form is required in order to compute the instantaneous frequency and amplitude of the signal components. Furthermore, the recursive form of the algorithm facilitates implementation in signal processing devices. Initial parameters of the Kalman filter are obtained from frequency spectrum characteristics and from an estimate of the signal spectral density computed by an adaptive algorithm that fits an auto-regressive prediction model to the signal.
To illustrate the performance of the proposed method, experimental results are presented comparing common time-frequency techniques with the Kalman filter method. Moreover, results from the analysis of example signals (chirp signal, step frequency change, etc.) are added. The contribution of this method mainly consists in the improvement of the time-frequency resolution, as discussed in the last section, which concludes the whole chapter.

2. Instantaneous frequency and the complex signal
In mechanics, the frequency of a vibration motion is defined as the number of oscillations per time period. In the course of one oscillation, the body deflects from the equilibrium, goes through the extremes and the oscillation ends again in the equilibrium position. The simple harmonic sine wave is often used to represent such a vibration motion. The harmonic motion actually represents the projection of a body moving with uniform velocity along a circle onto the circle diameter. However, in many applications the motion velocity, and therefore also the oscillation frequency, changes in time. Signals with these properties are often referred to as nonstationary and their important characteristic is the dependence of frequency on time, hence the notion of instantaneous frequency. In other words, instantaneous frequency is the time-varying parameter which describes the location of the signal's spectral peak in time. It can be interpreted as the frequency of a sine wave which locally fits the analyzed signal, but physically it has meaning only for monocomponent signals, where there is only one frequency or a narrow range of frequencies varying as a function of time (Boashash, 1992a). Let us assume the simple harmonic motion in the following form:

s_r(t) = a\cos(\omega t + \theta), \qquad \omega = 2\pi f    (1)

where a is the amplitude, ω is the angular frequency, θ is a phase constant and the argument of the cosine function is the instantaneous phase φ(t) (namely ωt + θ). When the frequency changes, the instantaneous phase φ(t) is the integral of the frequency over time and the signal should be rewritten as

s_r(t) = a\cos\!\left(\int_0^t \omega(\tau)\, d\tau + \theta\right)    (2)

Considering the monocomponent signal, the instantaneous frequency ω(t) is defined, with respect to (2), as the derivative of the phase φ(t)

\omega(t) = \frac{d\varphi(t)}{dt} = 2\pi f(t)    (3)

In most cases, there is no way to determine the instantaneous phase directly from the real signal. One of the tricks to obtain the unknown phase is the introduction of a complex signal z(t) which corresponds to the real signal. As described in (Hahn, 1996) or in (Huang, 1998), the Hilbert transform is an elegant solution for generating the complex signal from the real one. The Hilbert transform of a real signal s_r(t) of the continuous variable t is

s_i(t) = \frac{1}{\pi}\, P\!\int_{-\infty}^{\infty} \frac{s_r(\eta)}{\eta - t}\, d\eta ,    (4)
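In discrete time, the analytic signal of equations (4)-(5) is usually built by suppressing the negative-frequency half of the spectrum rather than by evaluating the principal-value integral directly. A minimal numpy sketch of this standard construction (not part of the original chapter) is:

```python
import numpy as np

def analytic_signal(x):
    """Discrete analytic signal z = s_r + j*s_i of a real sequence x,
    obtained by zeroing the negative-frequency half of its spectrum
    (the usual discrete equivalent of the Hilbert-transform pair (4)-(5))."""
    x = np.asarray(x, dtype=float)
    N = x.size
    X = np.fft.fft(x)
    h = np.zeros(N)
    h[0] = 1.0                      # keep the DC bin
    if N % 2 == 0:
        h[N // 2] = 1.0             # keep the Nyquist bin for even N
        h[1:N // 2] = 2.0           # double the positive frequencies
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.fft.ifft(X * h)       # complex signal z(k)
```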

where P indicates the Cauchy principal value of the integral. The complex signal z(t)

z(t) = s_r(t) + j\, s_i(t) = a(t)\, e^{j\varphi(t)} ,    (5)

whose imaginary part is the Hilbert transform s_i(t) of the real part s_r(t), is then called the analytical signal and its spectrum is composed only of the positive frequencies of the real signal s_r(t). From the complex signal, an instantaneous frequency and amplitude can be obtained for every value of t. Following (Hahn, 1996) the instantaneous amplitude is simply defined as

a(t) = \sqrt{s_r(t)^2 + s_i(t)^2}    (6)

and similarly the instantaneous phase as

\varphi(t) = \arctan\frac{s_i(t)}{s_r(t)} .    (7)

The instantaneous frequency is then equal to

\omega(t) = \frac{d}{dt}\!\left(\arctan\frac{s_i(t)}{s_r(t)}\right) = \frac{s_r(t)\,\dot{s}_i(t) - s_i(t)\,\dot{s}_r(t)}{s_r(t)^2 + s_i(t)^2} .    (8)
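For a sampled analytic signal, (6)-(8) reduce to elementary array operations; the phase derivative in (8) can be approximated by a first-order difference of the unwrapped phase. A hedged numpy sketch (the sampling rate fs is an assumed parameter):

```python
import numpy as np

def instantaneous_amplitude_frequency(z, fs):
    """Instantaneous amplitude and frequency of an analytic signal z,
    following Eqs. (6)-(8); np.diff approximates the phase derivative."""
    a = np.abs(z)                            # a(k), Eq. (6)
    phi = np.unwrap(np.angle(z))             # phi(k), Eq. (7)
    f = np.diff(phi) * fs / (2.0 * np.pi)    # f(k) in Hz, discrete form of Eq. (8)
    return a, f
```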

Even with the Hilbert transform, there is still considerable controversy in defining the instantaneous frequency, as mentioned also in (Boashash, 1992a). Applying the Hilbert transform directly to a multicomponent signal provides values of a(t) and ω(t) which are unusable for describing the signal. The idea of instantaneous frequency and amplitude does not make sense when a signal consists of multiple components at different frequencies. This led (Huang, 1998) to introduce the so-called Empirical Mode Decomposition method to decompose the signal into monocomponent functions (Intrinsic Mode Functions). In this work, another method for signal decomposition is introduced.

3. Complex signal component model
Let us consider the multicomponent real signal s_r(t)

s_r(t) = \sum_{n=1}^{N} s_r^{(n)}(t) + \rho(t)    (9)

which consists of noise ρ(t), representing any undesirable components, and of N single-component nonstationary signals described by envelopes a and frequencies ω,

s_r^{(n)}(t) = a_n \cos(\omega_n t) .    (10)

The first derivative of the signal component is

\dot{s}_r^{(n)}(t) = -a_n\, \omega_n \sin(\omega_n t)    (11)

and the second one is described as follows

\ddot{s}_r^{(n)}(t) = -a_n\, \omega_n^2 \cos(\omega_n t) = -\omega_n^2\, s_r^{(n)}(t) .    (12)

On the basis of equation (12), the state space model of the signal component s_r^{(n)}(t) can be derived as a second-order (n = 2) model of an auto-regressive (AR), linear time-invariant (LTI) system. Let us assume an AR state space model with state x(t)

\dot{x}(t) = A\, x(t)    (13)

and the system's model output y(t)

y(t) = C\, x(t)    (14)

where A is the state matrix and C is the output matrix. The state vector x(t) consists, in our case, of two internal states: the real part x_r(t) and the imaginary part x_i(t). With respect to the previously introduced notation of the signal components and considering the analytical signal, we can choose the state vector components as x_r^{(n)} = \cos(\omega_n t) and x_i^{(n)} = \sin(\omega_n t). This choice takes into account the Hilbert transform and the orthogonal character of both components. Hence, the corresponding state space model is represented by the following state equation

\begin{bmatrix} \dot{x}_r^{(n)}(t) \\ \dot{x}_i^{(n)}(t) \end{bmatrix} = \begin{bmatrix} 0 & -\omega_n \\ \omega_n & 0 \end{bmatrix} \begin{bmatrix} x_r^{(n)}(t) \\ x_i^{(n)}(t) \end{bmatrix}    (15)

and the model's output is

y_n(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_r^{(n)}(t) \\ x_i^{(n)}(t) \end{bmatrix} .    (16)

In this case the model output y_n(t) represents the signal component s_r^{(n)}(t). The state matrix A of the model is a 2D rotation matrix whose eigenvalues are purely imaginary. The trajectory in state space is a circle and the model in fact represents an undamped resonator (oscillator) with natural frequency ω_n. The solution of the state equation consists only of the homogeneous part (there is no model input) and is described by the following form

x^{(n)}(t) = e^{A(t - t_0)}\, x^{(n)}(t_0) .    (17)

Computing the state transition matrix e^{At} and using a discretization step Δt = h, the discrete state space representation is obtained as

\begin{bmatrix} x_r^{(n)}(k+1) \\ x_i^{(n)}(k+1) \end{bmatrix} = \begin{bmatrix} \cos(h\,\omega_n) & -\sin(h\,\omega_n) \\ \sin(h\,\omega_n) & \cos(h\,\omega_n) \end{bmatrix} \begin{bmatrix} x_r^{(n)}(k) \\ x_i^{(n)}(k) \end{bmatrix} + \Gamma(k)\, \xi(k)    (18)

and the output equation is

y_n(k) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} x_r^{(n)}(k) \\ x_i^{(n)}(k) \end{bmatrix} .    (19)

The generalized output equation for all signal components is then

y(k) = C\, x(k) + \Delta\, \eta(k) .    (20)
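As a concrete illustration of (18)-(19), the following numpy sketch (not from the original chapter; function and variable names are illustrative assumptions) builds the 2x2 rotation block of one resonator and simulates its noise-free output:

```python
import numpy as np

def resonator_block(omega_n, h):
    """2x2 rotation block of Eq. (18) for natural frequency omega_n [rad/s]
    and discretization step h [s]."""
    c, s = np.cos(h * omega_n), np.sin(h * omega_n)
    return np.array([[c, -s],
                     [s,  c]])

def simulate_component(omega_n, h, a0, n_steps):
    """Noise-free run of one component: the state x = [x_r, x_i] rotates on a
    circle of radius a0 and the output y = [1 0] x is its real part (Eq. (19))."""
    A = resonator_block(omega_n, h)
    x = np.array([a0, 0.0])
    y = np.empty(n_steps)
    for k in range(n_steps):
        y[k] = x[0]
        x = A @ x
    return y
```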

In the previous equations, ξ(k) is the state noise and η(k) is the output noise of the state model. Both noise vectors, ξ(k) and η(k), are zero-centred with identity covariance matrices. The specific features of the noises are characterized by the covariance matrix Γ and the value Δ. This derived resonator model, together with the Kalman filtering approach, forms an estimator of the analytical signal. The estimate of the first model state is the real part (cosine function) of the signal component and the estimate of the second state is the imaginary part (sine function).

3.1 Selection of matrices Γ and value Δ
The choice of the proper matrix Γ and value Δ is an important part of the definition of the signal component model. These two parameters decide what amount of energy of the original signal will be assigned to the actual signal component. We have already defined the description of the complex signal in equation (5), which represents the general output of the component model. Let us assume that the amplitude a of the nth complex signal component z_n(k) is not constant but changes from sample k to sample k+1 by a deviation ε(k):

z_n(k+1) = \left(a(k) + \varepsilon(k)\right) e^{j\varphi(k+1)} .    (21)

It means that the amplitude in sample k+1 is described in recursive form by the following equation

a(k+1) = a(k) + \varepsilon(k), \quad \text{where } \varepsilon(k) \sim N(0, \sigma_\varepsilon^2) .    (22)

The time-changing phasor with its corresponding amplitude a(k), phase ϕ(k) and deviation ε(k) is displayed in Fig. 1. The real trajectory of the complex signal mode is marked by the black curve, and the phasor trajectory in the case of constant amplitude at time k is marked by the grey dashed circle. The recursive form of the amplitude progression model is shown in the left picture of Fig. 1. Each phasor amplitude differs from the previous one (green dotted line) by the deviation ε (red line). The right picture of Fig. 1 shows in detail the progression of the phasor in samples k and k+1. The characteristics of the γ1 and γ2 parameters are derived in the following text. The state in sample k+1, in dependence on the state in sample k, is described, as defined above, in the following form

x(k+1) = A\, x(k) + \Gamma(k)\, \xi(k), \quad \text{where } \xi(k) \sim N(0,1) .    (23)

Equation (23) describes a model where the complex signal x(k) is first rotated by the matrix A and the sum with ε(k) produces the new state vector x(k+1). Let us assume equation (23) in a new form, where the vector x(k) is first summed with ε(k) and only then rotated by the matrix A:

x(k+1) = A \left( x(k) + \tilde{\Gamma}(k)\, \xi(k) \right) .    (24)

This form is used in order to reveal the relation between ε and \tilde{\Gamma}, and that is why the new vector \tilde{\Gamma} is introduced into the state equation.


Fig. 1. The time changing phasor with its corresponding amplitude a(k), phase ϕ(k) and deviation ε(k).

Equation (24) can also be rewritten in quadratic form (in the vector domain, the transposed vectors and matrices are used)

x^T(k+1)\, x(k+1) = \left( A\, x(k) + A\, \tilde{\Gamma}(k)\, \xi(k) \right)^T \left( A\, x(k) + A\, \tilde{\Gamma}(k)\, \xi(k) \right)    (25)

The converted form of the equation is

x^T(k+1)\, x(k+1) = x(k)^T A^T A\, x(k) + x(k)^T A^T A\, \tilde{\Gamma}(k)\, \xi(k) + \xi(k)\, \tilde{\Gamma}(k)^T A^T A\, x(k) + \tilde{\Gamma}(k)^T A^T A\, \tilde{\Gamma}(k)\, \xi(k)^2    (26)

The quadratic form equality is valid also for the recursive amplitude equation

a(k+1)^2 = \left( a(k) + \varepsilon(k) \right)^2 = a(k)^2 + 2\, a(k)\, \varepsilon(k) + \varepsilon(k)^2    (27)

Using (27) and the following knowledge of the state product

a(k)^2 = x^T(k)\, x(k)    (28)

the equation (26) can be rewritten into (30), where the product of the system matrices A^T A is substituted by the identity matrix I as follows

A^T A = I \;\Rightarrow\; x(k)^T A^T A\, x(k) = x(k)^T x(k); \quad \tilde{\Gamma}(k)^T A^T A\, \tilde{\Gamma}(k) = \tilde{\Gamma}(k)^T \tilde{\Gamma}(k)    (29)

The result is the following

\underbrace{x^T(k+1)\, x(k+1)}_{a(k+1)^2} = \underbrace{x(k)^T x(k)}_{a(k)^2} + x(k)^T \tilde{\Gamma}(k)\, \xi(k) + \xi(k)\, \tilde{\Gamma}(k)^T x(k) + \tilde{\Gamma}(k)^T \tilde{\Gamma}(k)\, \xi(k)^2    (30)

The main reason for this derivation is the need to obtain a relation between ε and \tilde{\Gamma}. Applying the mean value operator to (30) results in an equation where only the required variables are present:

E\!\left( a(k)^2 \right) + E\!\left( 2\, a(k)\, \varepsilon(k) \right) + E\!\left( \varepsilon(k)^2 \right) = E\!\left( a(k)^2 \right) + E\!\left( x(k)^T \tilde{\Gamma}(k)\, \xi(k) \right) + E\!\left( \xi(k)\, \tilde{\Gamma}(k)^T x(k) \right) + E\!\left( \tilde{\Gamma}(k)^T \tilde{\Gamma}(k)\, \xi(k)^2 \right)    (31)

Since ε(k) and ξ(k) are zero-mean, the mixed terms vanish, and the resulting equation which connects ε and \tilde{\Gamma} is

\sigma_\varepsilon^2 = \tilde{\Gamma}(k)^T\, \tilde{\Gamma}(k)    (32)

The \tilde{\Gamma} vector consists of two components (because of the second-order model) and the product of the \tilde{\Gamma} vectors is the sum of the squared vector components

\tilde{\Gamma}(k) = \begin{bmatrix} \gamma_1(k) \\ \gamma_2(k) \end{bmatrix} \;\Rightarrow\; \tilde{\Gamma}(k)^T\, \tilde{\Gamma}(k) = \gamma_1(k)^2 + \gamma_2(k)^2    (33)

Using (32) the first condition for the selection of \tilde{\Gamma} is obtained as

\sigma_\varepsilon^2 = \gamma_1(k)^2 + \gamma_2(k)^2 .    (34)

The second condition can be obtained from the presumption that the phasors of the state x(k) and of the deviation ε(k) lie along the same line. Starting from the estimates of the state components x_r(k) and x_i(k), the ratio between them should be the same as the ratio between γ_1(k) and γ_2(k) (see (35)). To be precise, there is a product of the vector \tilde{\Gamma} and the stochastic variable ξ in the ratio equation, but because the stochastic value is the same in sample k, the variable ξ cancels out from the ratio (therefore the γ_1 and γ_2 coordinates of ε in Fig. 1 are also shown without the variable ξ)

\frac{\gamma_2(k)\, \xi(k)}{\gamma_1(k)\, \xi(k)} = \frac{x_i(k)}{x_r(k)}    (35)

The converted form of the ratio equation

\gamma_2(k) = \frac{x_i(k)}{x_r(k)}\, \gamma_1(k)    (36)

is then substituted into (34) and results in the formula for the first parameter of the vector \tilde{\Gamma}(k)

\gamma_1(k) = \sqrt{\frac{\sigma_\varepsilon^2\, x_r(k)^2}{x_r(k)^2 + x_i(k)^2}}    (37)

and for the second one

\gamma_2(k) = \sqrt{\frac{\sigma_\varepsilon^2\, x_i(k)^2}{x_r(k)^2 + x_i(k)^2}} .    (38)

These equations, rewritten for the original model definition with the vector Γ(k), have the following form

\Gamma(k) = A\, \tilde{\Gamma}(k) = A \begin{bmatrix} x_r(k)\, \sqrt{\dfrac{\sigma_\varepsilon^2}{x_r(k)^2 + x_i(k)^2}} \\[2mm] x_i(k)\, \sqrt{\dfrac{\sigma_\varepsilon^2}{x_r(k)^2 + x_i(k)^2}} \end{bmatrix} .    (39)

This formula describes the characteristics of the amplitude deviation between samples k and k+1 in the state model (18), which, together with (19), is then used as the model for Kalman estimation. In the Kalman filter computation of Γ(k), an estimate μ(k) of the state x(k) is used (see (46)).
Selection of the value Δ in the output equation (20) is the next step in defining the state model. The output y(k) of the n-component model is formed only from the real parts of the components. This is achieved through the vector C and the product C·x(t). The value Δ makes it possible to include the measurement noise (Δ·η(k)) in the output equation. In fact, Δ is the variance of the additive error of the measuring chain. For example, in the case where the sensor accuracy class is defined in the statistical sense so that the absolute additive error is the 95% quantile within the limits of the product accuracy class × measuring range, the 95% quantile represents the range 0 ± 2σ, where σ is the standard deviation of the additive error. In this case the following form of the measurement error variance can be accepted

\Delta = \left( \frac{\text{accuracy class} \times \text{measuring range}}{2} \right)^2 .    (40)
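A small numpy sketch of the state-noise selection of Eq. (39) is given below (illustrative only; the amplitude-deviation standard deviation sigma_eps is assumed to be available, e.g. from the spectral analysis described in Section 4.2):

```python
import numpy as np

def gamma_vector(A, x_est, sigma_eps):
    """State-noise vector Gamma(k) of Eq. (39), built from the current state
    estimate x_est = [x_r, x_i] of one resonator and the amplitude-deviation
    standard deviation sigma_eps."""
    xr, xi = x_est
    norm = np.hypot(xr, xi)
    if norm == 0.0:                       # avoid division by zero at start-up
        return np.zeros(2)
    return A @ ((sigma_eps / norm) * np.array([xr, xi]))
```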

4. Use of Kalman filter for estimation of the signal components
In this work, an adaptive Kalman filter based approach is used for estimation of the analyzed signal. The signal is modelled as a sum of resonators (signal components) and it is required that the estimated components are complex functions, because this allows an efficient computation of the instantaneous frequency.

4.1 Discrete Kalman filter
A discrete-time Kalman filter realizes a statistical estimation of the internal states of a noisy linear system and it is able to reject uncorrelated measurement noise – a property shared by all Kalman filters. Let us assume a system with more components, as mentioned above. Then the state matrix consists of the following blocks:

A_n = \begin{bmatrix} \cos(h\,\omega_n) & \sin(h\,\omega_n) \\ -\sin(h\,\omega_n) & \cos(h\,\omega_n) \end{bmatrix} ,    (41)

and the state noise vector blocks are defined as in (39). Then, in the state-variable representation, the description of the whole system, which is characterized by the sum of resonators, is given by the following matrices: the state matrix is

A = \begin{bmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_n \end{bmatrix}_{2n \times 2n} ;    (42)

the output matrix is

C = \begin{bmatrix} 1 & 0 & 1 & 0 & \cdots & 1 & 0 \end{bmatrix}_{1 \times 2n} ;    (43)

and the state and measurement noise is characterized by the following parameters

\Gamma = \begin{bmatrix} \Gamma_1 \\ \Gamma_2 \\ \vdots \\ \Gamma_n \end{bmatrix}_{2n \times 1} ; \qquad \Delta_{\,1 \times 1} .    (44)
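The block-diagonal assembly of (41)-(43) is straightforward to code; the following sketch (illustrative, not from the chapter) builds A and C for a given list of model frequencies:

```python
import numpy as np

def build_system(omegas, h):
    """Assemble the block-diagonal state matrix A (Eq. (42)) and the output
    matrix C (Eq. (43)) for resonators with frequencies omegas [rad/s] and
    discretization step h [s]."""
    n = len(omegas)
    A = np.zeros((2 * n, 2 * n))
    for i, w in enumerate(omegas):
        c, s = np.cos(h * w), np.sin(h * w)
        A[2*i:2*i + 2, 2*i:2*i + 2] = [[c,  s],
                                       [-s, c]]   # block A_n of Eq. (41)
    C = np.tile([1.0, 0.0], n).reshape(1, 2 * n)  # picks the real part of each mode
    return A, C
```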

Commonly, the Kalman estimation includes two steps – a prediction and a correction phase. Let us assume that the state estimate μ(0) is known with an error variance P(0). The a priori value of the state at instant k+1 can be obtained as

\mu(k+1) = A\, \mu(k)    (45)

The measured value y(k) is then used to update the state at instant k. The additive correction of the a priori estimated state at k+1 is, according to (Vaseghi, 2000), proportional to the difference between the a priori output at instant k, defined as C·μ(k), and the measured y(k):

\mu(k+1) = A\, \mu(k) + K(k) \left( y(k) - C\, \mu(k) \right)    (46)

where K(k) is the Kalman gain which guarantees the minimal variance of the error x(k) − μ(k). Also, at each step the variance P(k+1) of the error of μ(k+1) is calculated (see (Vaseghi, 2000)):

P(k+1) = A\, P(k)\, A^T + \Gamma\, \Gamma^T - K(k) \left( C\, P(k)\, A^T + \Delta\, \Gamma^T \right)    (47)

This variance matrix is then used for the calculation of the Kalman gain in the next step of the recursive calculation (correction phase):

K(k) = \left( A\, P(k)\, C^T + \Gamma\, \Delta^T \right) \left( C\, P(k)\, C^T + \Delta\, \Delta^T \right)^{-1}    (48)
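The recursion (45)-(48) maps directly onto a few matrix operations. The sketch below is a hedged illustration of one filter step (the names and the treatment of Δ as a scalar noise gain are assumptions of this example, not a reproduction of the authors' implementation):

```python
import numpy as np

def kalman_step(mu, P, y, A, C, Gamma, Delta):
    """One recursion of Eqs. (45)-(48): gain, state update and covariance
    update. Gamma is the current 2n x 1 state-noise vector and Delta the
    scalar measurement-noise parameter."""
    Gamma = Gamma.reshape(-1, 1)
    C = C.reshape(1, -1)
    # Kalman gain, Eq. (48)
    S = C @ P @ C.T + Delta * Delta                   # innovation variance (1x1)
    K = (A @ P @ C.T + Gamma * Delta) @ np.linalg.inv(S)
    # state update, Eq. (46)
    mu_next = A @ mu + (K * (y - float(C @ mu))).ravel()
    # error-covariance update, Eq. (47)
    P_next = A @ P @ A.T + Gamma @ Gamma.T - K @ (C @ P @ A.T + Delta * Gamma.T)
    return mu_next, P_next
```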

4.2 Estimation of initial parameters
The initial parameters for the Kalman filter are obtained from the estimation of the signal spectrum. In principle, there are many ways to fix the parameters; let us present two of them. Generally, there is a need to define which modes, or more precisely which frequencies ω_n, should be estimated and what the variance of the amplitude on these frequencies is.
The first way to acquire the initial parameters is based on the short-time Fourier transform (STFT) algorithm. Half the length of the STFT window can be used as the order of the modelled system (the number of modelled resonators) and the corresponding frequencies are then the frequencies of the modelled signal components. The variance of the amplitude in the STFT frequency bands then serves as the estimate of the variance σ_ε^2 of ε for each resonator. This approach yields a Kalman filter method based on short-time Fourier frequency analysis.
The second alternative is based on the estimation of the spectral density by fitting an AR prediction model to the signal. The estimation algorithm used is known as Burg's method (Marple, 1987), which fits an AR linear prediction filter model of a specified order to the input signal by minimizing the arithmetic mean of the forward and backward prediction errors. The spectral density is then computed from the frequency response of the prediction filter. The AR filter parameters are constrained to satisfy the Levinson-Durbin recursion. The initial Kalman filter parameters (frequencies of the resonators) are then obtained as local maxima of the estimated spectral density which are greater than a predefined level. These values indicate significant frequencies in the spectral density and determine the order of the model.
Fig. 2 shows the schema of the whole method. The estimated parameters from the input signal form the system model which is then used in the estimation phase of the algorithm.

[Block diagram: analyzed signal → spectrum estimation → model frequencies → Kalman filter → z_1 … z_n estimates → IA, IF calculation (A_1–A_n, ω_1–ω_n) → TFR]

Fig. 2. Schema of the method for time-frequency decomposition of the signal using Kalman filter
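As a simple stand-in for the spectrum-based initialization (the chapter uses Burg's AR estimate; a plain periodogram is substituted here for brevity), candidate model frequencies can be picked as local maxima above a predefined level:

```python
import numpy as np

def initial_frequencies(x, fs, n_max=10, level=1.0):
    """Candidate resonator frequencies omega_n [rad/s] chosen as local maxima
    of a plain periodogram exceeding a predefined level (a simplified
    substitute for the Burg AR spectral estimate used in the chapter)."""
    X = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    peaks = [i for i in range(1, len(X) - 1)
             if X[i] > X[i - 1] and X[i] > X[i + 1] and X[i] > level]
    peaks = sorted(peaks, key=lambda i: X[i], reverse=True)[:n_max]
    return 2.0 * np.pi * freqs[sorted(peaks)]
```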


The Kalman filter estimates the states of the modelled system in the form of a complex signal. It means that the imaginary parts of the components are estimated simultaneously with the estimation of the real signal components. This procedure substitutes for the Hilbert transform in the generation of the analytical signal and is performed recursively. Each component in complex form then serves for the calculation of the instantaneous amplitude and instantaneous frequency (see (6) and (8)). The output of the method is the time-frequency representation of the analyzed signal. The representation consists of amplitude curves with changing frequency, in contrast to other time-frequency methods whose TF representation is computed in bands, which is why their resolution in time and frequency is limited.

5. Examples
In this section, some examples of the time-frequency analysis using the Kalman filter method are shown. The most important indicators in comparison with other analysis methods are the sharpness of the estimated frequency curves and the adaptability to changes in the component amplitudes. These attributes are decisive for the subsequent signal processing in the class of problems solved by the authors.
Let us take a signal with three harmonic components, a 1 kHz sampling rate and a total analyzed length of 1 second (N = 1000 points). The signal is formed by sine functions with oscillation frequencies f_1 = 10 Hz, f_2 = 30 Hz and f_3 = 50 Hz. The amplitude of all three components is set to 10; the second component is set to zero for the first 0.5 seconds. Output noise with mean m_η = 0 and variance σ_η^2 = 1 was added to the simulated signal. The initial parameters for the Kalman filter were obtained through Burg's AR linear prediction filter of order 10 and the level for local maxima in Burg's spectral density was set to max > 1. The initial conditions of the Kalman estimator were set up in the following way: μ(0) = [1 … 1]^T, P(0) = 10^6 · I, Δ = 1, where dim(μ) = 2n × 1 and dim(P) = 2n × 2n.
The time-frequency representation of the instantaneous frequency and amplitude is shown in Fig. 3 (left). The result shows that the adaptation of one of the model components has an influence also on the instantaneous frequency estimates of the other components, which is expressed in the oscillation between 0.5 and 0.6 second (the overall oscillation of the instantaneous frequency is the effect of the output noise). The adaptation rate is shown in Fig. 3 (right), where a comparison of the estimate convergence of the component at f_2 = 30 Hz using the Hilbert approach and the Kalman filter is performed. The Hilbert transform was applied artificially to the component itself, not to the analyzed signal (in order to compare the adaptation). The reason is the disadvantage of the Hilbert transform that it requires pre-processing of the signal through some signal decomposition method. The Kalman filter method introduced above uses, for the decomposition of the signal into its components, as already mentioned, the model of a sum of resonators and thus simultaneously decomposes the signal and estimates the time progression of its components.
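The synthetic test signal described above can be generated in a few lines (a sketch with an arbitrary random seed; not the authors' original script):

```python
import numpy as np

fs = 1000                                   # 1 kHz sampling rate, N = 1000 samples
t = np.arange(1000) / fs
rng = np.random.default_rng(0)

s = (10 * np.sin(2 * np.pi * 10 * t)
     + 10 * np.sin(2 * np.pi * 30 * t) * (t >= 0.5)   # second component zero for 0.5 s
     + 10 * np.sin(2 * np.pi * 50 * t)
     + rng.normal(0.0, 1.0, t.size))                  # unit-variance output noise
```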


Fig. 3. Simulation signal with three components (left). Complex signal (right) obtained through Hilbert transform (dotted line) and through Kalman estimation (solid line).

The algorithm based on Kalman estimation is also illustrated on another signal example (see Fig. 4). It consists of two signal components; the first one is a stationary harmonic signal with constant frequency. This signal part is summed in time with a concave parabolic chirp signal whose frequency changes from 0.45 to 0.1 of the normalized frequency. Both components exist in time between t = 100 and t = 900. The initial conditions of the Kalman estimator were set up as mentioned in the previous example and the initial frequencies of the model were obtained through Burg's AR linear prediction filter of order 25 (the number of estimated frequencies was n = 10 and thus the order of the estimated system was 20). The Kalman filter method is compared with other typical time-frequency methods and the results are shown in Fig. 4. For comparison, the short-time Fourier transform (STFT), the wavelet transform (WT, using the Morlet wavelet) and the smoothed pseudo Wigner-Ville distribution (SPWVD) were used. The output of the Kalman estimation in the time-frequency domain has a resolution comparable to the output of the SPWVD. The other methods have relatively wide frequency bands containing the signal energy.


Fig. 4. Estimation of instantaneous frequency and amplitude of harmonic and chirp signal components.

Similarly, the result of the methods comparison in Fig. 5 shows the Kalman filter method and the SPWVD as the more detailed time-frequency approaches. The test signal consists, in this case, of four harmonic components, and the accuracy of the methods in identifying the frequency and the time of each component's origin or end is tested. The signal begins again at time t = 100 and ends at t = 900. The frequency changes at t = 300 and t = 600, whereas there are two components present simultaneously between these times. The ability of the methods to distinguish between these two component frequencies is visible. The SPWVD and Kalman filter representations display these two simultaneous components separately. On the contrary, the STFT and WT representations contain time-frequency artefacts between the components, which is why identifying each component separately with these methods could be really difficult in such cases.


Fig. 5. Estimation of instantaneous frequency and amplitude of harmonic components of the signal.

Fig. 6. Estimation of instantaneous frequency and amplitude of loosened part impact signal: Kalman filter method (left) and STFT (right).


The last example is the transform of an acoustic signal from real equipment where a nonstationary event took place. This event is the measured result of a loosened part impact in the primary circuit of a nuclear power plant. The signal was measured with a 100 kHz sampling rate. For comparison, the time-frequency-amplitude responses of the STFT (right) and of the Kalman method (left) are compared in Fig. 6. The Kalman filter signal model was initialized with frequencies from the frequency analysis with a window of length 28 samples. The same window length with 75% overlapping was used for the analysis by means of the STFT (Fig. 6, right). An event (impact) which occurs at time 0.041 seconds is well visible in both spectrograms (see the frequency band 1 - 12 kHz). The advantage of the Kalman version of the spectrogram is its better resolution in time and frequency than in the STFT spectrogram (which is used by the authors for improvement of the impact location).

6. Conclusion and perspective
A new approach for time-frequency signal analysis and for the representation of instantaneous frequency and amplitude has been introduced in this chapter. The procedure is based on Kalman estimation and shares its advantages regarding the suppression of measurement noise. In this method the Kalman filter serves for the decomposition of the signal into modes with a well defined instantaneous frequency. The analyzed signal is modelled as a sum of second-order subsystems (resonators), whereby the Kalman filter decomposes the system by estimating these subsystems. Simultaneously with the signal decomposition, the time progression of the signal components in complex form is evaluated. This procedure utilizes the adaptive feature of the Kalman filter and is done recursively for each sample.
In cases where the short-time Fourier transform cannot offer sufficient resolution in the time-frequency domain, this method can be used to advantage despite its higher computational cost. The experimental results also show that the resolution of the introduced method is equal to or higher than that of other usual time-frequency techniques. In vibro-diagnostic methods, where time-frequency information is used for the location of non-stationary events, the sharpness of the introduced method is helpful for the improvement of the non-stationary event location.
The next tasks of the Kalman filter method development are the systematization of the results from different currently solved problems in loose part monitoring and finding new branches where the method could be effectively applied.

8. References
Boashash, B. (1992a). Estimating and Interpreting the Instantaneous Frequency of a Signal – Part 1: Fundamentals, Proceedings of the IEEE, Vol. 80, No. 4, pp. 520-538, April 1992, ISSN 0018-9219
Boashash, B. (1992b). Estimating and Interpreting the Instantaneous Frequency of a Signal – Part 2: Algorithms and Applications, Proceedings of the IEEE, Vol. 80, No. 4, pp. 540-568, April 1992, ISSN 0018-9219
Cohen, L. (1995). Time Frequency Analysis: Theory and Applications, Prentice Hall PTR, ISBN 978-0135945322, New Jersey
Fairman, F.W. (1998). Linear Control Theory: The State Space Approach, John Wiley & Sons, ISBN 0471974897, Toronto
Hahn, S.L. (1996). Hilbert Transforms in Signal Processing, Artech House Publishers, ISBN 978089006886, Boston
Huang, N.E., et al. (1998). The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis, Proceedings: Mathematical, Physical and Engineering Sciences, Vol. 454, No. 1971, pp. 903-995, March 1998, ISSN 1364-5021
Marple, S.L. (1987). Digital Spectral Analysis with Applications, Prentice Hall Signal Processing Series, Prentice Hall, Englewood Cliffs, ISBN 978-0132141499, New Jersey
Vaseghi, S.V. (2000). Advanced Digital Signal Processing and Noise Reduction, John Wiley & Sons Ltd, ISBN 0471626929, New York

3 Discrete-Event Dynamic Systems Modelling Distributed Multi-Agent Control of Intermodal Container Terminals Guido Maione

Technical University of Bari, Department of Environmental Engineering and Sustainable Development (DEESD), Viale del Turismo, 8 – I-74100 – Taranto, Italy

1. Introduction
Maritime intermodal container terminals (CTs) are complex hub systems in which multiple transport modes receive and distribute freight to various destinations. They can be considered as interchange places in a transport system network where seaways, railways, and motorways intersect. Freight is usually organized, transported, stacked, handled and delivered in standard units of a typical container, called a TEU (Twenty-foot Equivalent Unit), 20 feet long, 8 feet wide, and 8 feet high. TEUs easily fit ships, trains and trucks that are built and operated for this kind of cargo unit. Usually, ships travelling on long routes (e.g. from Taiwan to a Mediterranean Sea port) are called vessel or 'mother' ships, and ships offering service on short distances in a local area (e.g. from Turkey to Mediterranean ports in other countries) are called feeder ships. Vessel and feeder ships differ in size and container capacity.
A maritime CT is usually managed to offer three main services: a railway/road 'export cycle', when containers arrive at the terminal by trains/trucks and depart on vessel ships; a railway/road 'import cycle', when containers arrive on vessel ships and depart by trains/trucks; a 'transshipment cycle', when containers arrive on vessel (or feeder) ships and depart on feeder (or vessel) ships. These activities cause different concurrent and competing processes for the available resources. The aim is to achieve efficiency in the flows of TEUs and in information distribution, to guarantee fast operations and low costs.
Many problems have been investigated and many of them have connections and cannot be solved separately. The main problems in focus are: berth allocation of arriving ships; loading and unloading of ships (crane assignment, stowage planning); transfer of containers from ships to the storage area and vice versa (quayside operation); stacking operations (yardside operation); transfer to/from other transport modes (landside operation); workforce scheduling. In a word, managing a CT requires:
• planning, monitoring and controlling all internal procedures for handling TEUs;
• optimizing handling equipment, human operators, and information and communication technologies (control software and hardware, PDAs, wireless sensors and actuators, etc.); optimization must take into account connections and relations between humans and hardware resources.
Then, an intelligent and reactive control system is required to meet terminal specifications, independently of disturbances and parameter variations. As a matter of fact, robustness is important to guarantee quick reaction to different phenomena perturbing normal (or steady-state) operating conditions. Perturbations may come from: increase of ship traffic volumes or urgent demands for service; infrastructure development (reduction/expansion of berthing or stacking spaces, changes in yard organization, acquisition of new resources, etc.); changes in the routing of transport vehicles or traffic congestion inside the CT; faults and malfunctions or sudden lack of hardware resources.
The complexity of the issues involved justifies trends towards distributing the computing resources and decision controllers while aiming at robustness, and hence motivates heterarchical control by means of a Multi-Agent System (MAS). In a MAS, indeed, information, decision, and control are physically and logically distributed across agents.
This chapter reports some preliminary results and ideas on how to model a MAS controlling intermodal CTs, by using the DEVS-Scheme (Zeigler, 2000). The accurate model could be used to develop a simulation platform, which can be useful to test different control strategies to be applied in real cases. For example, the CT in Taranto (Italy) can be considered as a real system to be simulated in different operating scenarios.
The organization of this chapter is as follows. Section 2 discusses literature contributions to modelling, simulation and control of CTs. Motivations are explained for using the methodology of Discrete-Event Dynamic Systems to represent the MAS architecture for simulation and control of complex CTs. Section 3 describes the typical services and processes to be guaranteed in an intermodal container terminal. Then, the basic components of the proposed MAS are presented, and their roles and relations are generally specified. Section 4 specifies how agents are modelled as atomic discrete-event dynamic systems and focuses on the interactions between agents when containers are downloaded from ships to the terminal yard area. Section 5 gives some ideas and details about the plans for a simulation platform to test efficiency and robustness of the proposed MAS architecture. Section 6 overviews the benefits of the approach and highlights the open issues.

2. Literature overview: motivation of DEVS modelling for MAS simulation and control
Planning processes and scheduling resources in a maritime CT pose very complex modelling and control problems to the scientific community. In particular, flexible and powerful modelling and simulation tools are necessary to represent intermodal hub systems made up of many different infrastructures and services. Modelling, simulation and control of CTs is a relatively new field of research. Several literature contributions developed models and simulation tools, but no standard exists that can be applied to the different real terminals and scenarios. In the absence of standard tools, several research studies are based on discrete-event simulation techniques (Vis & de Koster, 2003), on mathematical models or on empirical studies (Crainic et al., 1993; Peterkofsky & Daganzo, 1990; Gambardella et al., 1998). Analytical models, based on different approaches, have been proposed as tools for the simulation of terminals, useful to define the optimal design and layout, organization, management policies and control. A thorough literature review on modelling approaches is given in (Steenken et al., 2004). As regards control issues, determining the best management and control policies is a challenging problem (Mastrolilli et al., 1998).
The literature highlights two main classes of approaches: microscopic and macroscopic modelling approaches (Cantarella et al., 2006). Microscopic modelling approaches are generally based on discrete-event system simulation that may include Petri Nets (Degano & Di Febbraro, 2001; Fischer & Kemper, 2000; Liu & Ioannou, 2002), object-oriented approaches (Bielli et al., 2006; Yun & Choi, 1999), and queuing networks theory approaches (Legato & Mazza, 2001). Even if it requires high computational effort, microscopic simulation allows the explicit modelling of each activity within the terminal as well as of the whole system, by considering the single containers as entities. It is then possible to estimate performance as a consequence of different system design and/or management scenarios. Macroscopic modelling is suitable for terminal system analysis. It assumes a continuous container flow along the whole sequence of activities. This representation is useful and appropriate for supporting strategic decisions, system design, terminal layout and handling equipment investments. The most frequent problems in this class are berth planning, marshalling strategies, space allocation and system layout, handling equipment capacity and technologies. A network-based approach is presented in (Kozan, 2000) for optimising the terminal efficiency by using a linear programming method. In (de Luca et al., 2005) a macroscopic model is based on a space-time domain.
In the context of intelligent control of transport systems, this research contribution proposes a rigorous approach based on the Discrete EVent System (DEVS) specification technique (Zeigler et al., 2000) to completely and unambiguously characterize a MAS for controlling an intermodal container terminal. Namely, no standard exists for modelling and simulating complex MAS controlling CTs, but a generic and unambiguous framework is needed to describe the discrete and asynchronous actions of agents, and to guarantee modular analysis and a feasible computational burden. These requirements are easily enforced by the DEVS formalism. Connecting agents, each modelled as an atomic DEVS, makes the whole MAS itself representable as a DEVS, as the formalism is closed under coupling. Moreover, DEVS theory provides a strong mathematical foundation and the models can be easily translated into a simulation environment.
The DEVS approach is then useful to develop a simulation platform to test the MAS efficiency in controlling the CT activities. In particular, both static and dynamically adapted decision strategies can be tested for the agents defining the MAS. Then, performance is measured in terms of commonly used indices (ship service time, throughput or lateness of containers, resource utilization, etc.) and other indices of the MAS efficiency (number of requests in negotiation, waiting time before decision, etc.). Performance can be evaluated both in steady-state operating conditions and in perturbed conditions, when disturbances or parameter variations occur.
Autonomous agents play as atomic DEVS dynamic systems. They exchange messages with one another to negotiate services in a common environment. Agents are local controllers, represented by software modules devoted to a hardware component or to a specific function (like mediation between different agents). They behave autonomously and concurrently, communicating with each other to negotiate tasks, exchanging messages and information to 'buy' or 'sell' services. Usually, there is no hierarchy between agents. In this context, cooperation may be explicitly designed or implicitly achieved through adaptation of the agents' decision mechanism, by using feedback of their effects.


The generally recognized benefits of a MAS architecture are reducing the programming complexity, guaranteeing scalability and re-configurability, and obtaining intelligent and reactive control software. The DEVS technique is fully compatible with the heterarchical design principles, it leads to MAS where all information and control functions are distributed across agents, and it is also suitable for a rigorous theoretical analysis of the structural properties of the MAS. Moreover, the DEVS formalism is an interesting alternative to other recently proposed tools for MAS specification, e.g. the Unified Modeling Language (Huhns & Stephens, 2001) and Petri Nets (Lin & Norrie, 2001). This formalism is suitable for developing useful models both for discrete-event simulation and for implementation of the software controllers for the considered system.
As in MAS for manufacturing control (Heragu et al., 2002; Shen & Norrie, 1999), agents may use decision algorithms emulating micro-economic environments. Each 'buyer' agent uses a fictitious currency to buy services from other 'seller' agents which, in their turn, use pricing strategies. Sellers and buyers have to reach an equilibrium between conflicting objectives, i.e. to maximize profit and to minimize costs, respectively. Recently developed analytical models of negotiation processes (Hsieh, 2004) underline the need for a systematic analysis and validation method for distributed networks of autonomous control entities. Other researches have focused on the experimental and detailed validation of MAS on distributed simulation platforms (Schattenberg & Uhrmacher, 2001; Logan & Theodoropoulos, 2001).
This work proposes a DEVS model of a MAS architecture for controlling an intermodal container terminal system. In particular, a detailed DEVS model is developed of the interactions occurring between the agents concurrently operating during the critical downloading process of containers from a ship. The author aims at giving a contribution towards defining a complete DEVS model to develop a detailed simulation platform for testing and comparing the proposed MAS with other centralized or distributed control architectures for intermodal container terminals. The simulation model could be used to design and test alternative system layouts and different control policies, to be used in standard or perturbed operating conditions.

3. Multi-agent system framework for intermodal container terminals
To define the autonomous agents operating and interacting in an intermodal container terminal environment, the main processes executed in the terminal area have to be examined in order to represent the most significant and critical aspects. These processes are associated with the offered services, synthetically described as follows.

3.1 Import, export and transshipment cycles in an intermodal container terminal
As recalled before, an intermodal terminal usually offers three different kinds of services interconnecting different transport modes:
1. an import cycle, when containers arrive on a vessel or feeder ship and depart by trains or trucks, corresponding to a transition from sea to railway or road modes;
2. an export cycle, when containers arrive by trains or trucks and depart on a vessel ship, for a transition from railway or road to sea mode;
3. a transshipment cycle, when containers arrive and depart by ship: cargo is moved from vessel to feeder ships to reach close destinations, or, vice versa, from feeder to vessel ships to reach far ports.


The possible flows of containers are synthetically described in Fig. 1, where TRS, IMP and EXP represent transhipment, import and export cycles, respectively, and RA and RO symbolize the railway and road transport modes. Note that full containers can be imported, exported or transshipped. Obviously, imported TEUs are only downloaded from ships, exported TEUs are only loaded on ships, while transshipped TEUs undergo both processes. Empty containers are downloaded from feeder ships or arrive on trains or trucks, and they are then loaded on vessel ships. So, these TEUs are transhipped or exported.


Fig. 1. Flows of containers in an intermodal container terminal

To execute these cycles, the companies managing container terminals provide activities like: loading/downloading containers on/from ships; delivering/picking containers to/from trucks or trains; stacking and keeping containers in dedicated areas, called blocks, in which the terminal yard is organized; transferring containers from ship/train/truck to yard blocks and back; inspecting containers for customs, safety and other requirements; consolidating, i.e. redistributing containers between blocks to allow fast retrieval.
Several dedicated or shared resources are used to execute the above processes: cranes (quay cranes, yard cranes, railway cranes, jolly mobile cranes); internal transport vehicles (trailers, automatically guided vehicles or AGVs); reach stackers for handling containers between trailers and trucks, and for dangerous or inspected containers; side loaders for handling empty containers; other special areas and infrastructures like quays and berths, lanes for internal transport, terminal gates, railway tracks; skilled human operators.
For example, the layout of the Taranto Container Terminal (TCT, see official website at http://www.tct-it.com/), located in the Taranto city harbour area in the south-east of Italy and currently managed by a private company, is organized as sketched in Fig. 2. A quay receives ships, and yard stacking blocks are used for keeping full or empty TEUs. A gate (G) lets trucks enter or exit the terminal, and a railway connection allows trains to enter or exit. Special blocks are for parking of trailers (PARK), the fuel station (FU), customs (EX), inspections, and other functions. A control centre (CC) is used to follow the processes. Stacking blocks are divided between blocks for full containers (from 11 to 46), blocks for empty containers (M's) and blocks for dangerous containers (DG).


High-level planning operators schedule and monitor all activities, while low-level quay and yard specialized operators execute the planned activities. Most activities in TCT are transshipment services.


Fig. 2. Taranto Container Terminal organization

A road import cycle may be divided into three steps. Firstly, quay cranes download containers from a berthed ship to trailers, which transfer the cargo into blocks. Here, yard cranes pick up the containers from the trailers and stack them in assigned positions. Secondly, containers stay in the blocks for a certain time, while waiting for their destination; they are eventually relocated by yard cranes to a more proper position, according to a consolidation procedure which may use trailers to move containers between blocks. Thirdly, after consolidation, containers are loaded from blocks onto trucks, which exit from the terminal gate. Similarly, in a railway import cycle, yard cranes pick up containers from blocks and load them on trailers moving from the yard to the railway connection, where special railway cranes pick up the containers to put them on departing trains.
In a road/railway export cycle, the sequence goes in the opposite direction: from the terminal gate or railway connection to blocks and then to vessel ships. If arriving on trucks, containers are transferred to yard blocks where they are picked up by yard cranes; then, they are stacked and consolidated; finally, they are moved by trailers to the quayside where quay cranes load them on ships. If containers arrive on trains, railway cranes load them on trailers for the transfer to yard blocks.
In a transshipment cycle, when a vessel ship arrives, containers are downloaded from the ship to trailers using quay cranes, transferred to blocks using trailers, and picked up and stacked by yard cranes. After a consolidation and/or a delay in their position, containers are picked up by trailers and transferred back to the quay area, where they are loaded on a feeder ship. The opposite occurs if a feeder arrives after a short trip and a vessel departs for a long one.
Hereinafter, the modelling focus will be on the first step of the import and transshipment cycles, when a downloading process from a ship is executed. Namely, this is considered a critical phase for the terminal efficiency. Similar models and observations can be obtained for the subsequent steps and also for the export cycles.

3.2 Classification of agents in the MAS
Now, a MAS is specified modelling the negotiations occurring in an intermodal container terminal which is mainly devoted to transshipment cycles. The main considered agents are:


• the Container Agent (CA): each CA is an autonomous entity in charge of controlling the flow of a single container unit or a group of containers;
• the Quay crane Agent (QA): an autonomous controller for a (set of) quay crane(s) with the same performance characteristics, or with the same physical possibility to reach and serve ship bays;
• the Trailer Agent (TA): an entity associated with a (set of) trailer(s) with the same performances, or with the same reachable portions of quay or yard spaces;
• the Yard crane Agent (YA): it manages the control of a (set of) yard crane(s) guaranteeing the same performances, or associated with the same yard blocks;
• the Railway crane Agent (RA): a software controller of a (set of) railway crane(s), used to receive containers from trailers and load them on trains, or to deliver containers from trains to trailers;
• the Truck Agent (KA): it follows the operations executed by a (set of) truck(s), entering or leaving the terminal by the gate.
If related to a set of containers, a CA represents units physically stowed in a ship bay when considering the downloading process, or units stacked in yard blocks when considering the loading process. In import processes, the CA should identify the most suitable quay crane to download containers from the ship, then the most suitable trailers to transport containers to their assigned yard blocks, and finally the most proper yard cranes to pick up and stack containers in their assigned block positions. All these choices result from negotiations between the CA and several QAs, TAs, and YAs. The CA also has responsibilities in export processes: it selects the yard cranes to pick up containers from blocks, the trailers to transport them to the quay area, and the quay cranes to load them into their assigned bay-row-tier location in the ship. In this case, the CA negotiates with YAs, TAs, and finally QAs. During consolidation, containers are sometimes moved around and relocated to new positions in yard blocks, such that subsequent export operations are optimized or made easier. These moves are the consequence of other negotiations of CAs with YAs and TAs.
The decisions taken by a CA are based upon real-time updated information received from the agents of the alternative available cranes and trailers. The global control of the activities in the terminal emerges from the behaviour of the concurrently operating agents. The dynamical interaction between agents has to be analysed to specify the desired global system behaviour. For instance, the preceding observations lead us to examine interactions between a CA and several QAs, TAs, and YAs for downloading, transferring, and stacking containers, or for picking, transferring, and loading containers. Interactions also exist between a CA and YAs, TAs, and RAs/KAs when railway/road transport modes are considered.
The interaction between agents is usually based on a negotiation mechanism, which is typically organized in the following steps.
Announcement: an agent starts a bid and requests availability from other agents for a service.
Offer: the agent requests data from the agents which declared availability. These data regard the offered service the queried agents can guarantee.
Reward: the agent selects the best offer among the collected replies from the queried agents, and sends a rewarding output message.
Confirmation: the agent waits for a confirmation message from the rewarded agent, after which it acquires the negotiated service.
If confirmation is delayed or does not arrive, then the agent selects another offer in the rank or starts the bid over again.
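A minimal sketch of how these four message types might be represented in a simulation platform is given below (illustrative Python, not part of the original chapter; all names are assumptions):

```python
from dataclasses import dataclass, field
from enum import Enum, auto

class MsgType(Enum):
    ANNOUNCEMENT = auto()   # availability request for a service
    OFFER = auto()          # estimated waiting and service times of a resource
    REWARD = auto()         # selection of the best offer
    CONFIRMATION = auto()   # acceptance by the rewarded agent

@dataclass
class Message:
    sender: str             # e.g. a container agent identifier such as "CA-17"
    receiver: str           # e.g. a quay crane agent identifier such as "QA-2"
    kind: MsgType
    payload: dict = field(default_factory=dict)   # offered times, bay/block data, ...
```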


In this work, the focus is on the interactions of a CA with QAs, TAs and YAs for downloading containers from ships to yard blocks during transhipment cycles. The reason is twofold: transhipment is the main and most complex service in intermodal container terminals (e.g., see the TCT case), and the downloading part has critical effects on terminal efficiency. The developments may be easily extended to other negotiations between agents for the loading part of the transshipment cycle, export and import cycles or other processes.
The agents' negotiations and decisions are only limited by:
• constraints of terminal spaces and resources (e.g., the limited number of quay cranes that may physically serve a fixed ship bay, considering that quay cranes are sequentially lined up and move along fixed tracks; the limited number of yard cranes serving blocks);
• fixed working schedules for downloading/loading processes (they often establish the number of containers moved for each ship bay, their exact location in the ship hold or cover, the sequence of handling moves, sometimes even the preferred handling quay crane).
Agents' decisions are fulfilled by human operators devoted to the associated resources (i.e. crane operators, trailer/truck drivers). The network of interacting agents may appear and behave as a unique distributed supervisor for the physical terminal system.

4. Discrete-event systems modelling of agents' dynamics
Now, each agent in the previously identified classes is described as an atomic DEVS (Zeigler, 2000). All agents interact by transmitting outputs and receiving inputs, which are all considered as event messages. Events are instantaneous, so timed activities are defined by a start-event and a stop-event. For each agent, internal events are triggered by internal mechanisms, external input events (i.e. inputs) are determined by exogenous entities, for example other agents, and external output events (i.e. outputs) are generated and directed to other entities. These external or internal events change the agent state. Namely, an agent stays in a state until either it receives an input X or the time scheduled by a time advance function ta elapses (this time specifies the time before the occurrence of an internal event I). In the first case, an external transition function δext determines the state following the occurrence of the received input; in the second case, an internal transition function δint gives the state following the occurrence of the internal event. An output function λ is used to generate the reactions of the agent, each output being indicated with a symbol Y. It is important to note that the DEVS formalism distinguishes between the total state, q, and the sequential state, s. The latter refers to the transition mechanism due to internal events. It is based on the current value of the so-called status, i.e. the condition in which an agent stays between two consecutive events during negotiations, and on other characteristic information inf peculiar to the considered agent:
s = (status, inf)

(1)

The total state is composed of s, the time e elapsed since the last transition, and some additional information such as the decision logic DL currently used by the agent to rank and choose the offers received from other agents during negotiation:
q = (s, e, DL)

(2)


For a CA, s may include information on: the current container position; the quay cranes available for negotiating downloading/loading operations; the trailers available for negotiating the transport from quay to yard or vice versa; the yard cranes available for negotiating pick-up and stacking operations (from trailer to a block position or backwards); and the time scheduled in the current state before the next internal event, if no external input event occurs. For QAs, YAs, TAs, RAs, and KAs, the sequential state may include the time expected before the next internal event and the queued requests coming from CAs for availability, for data about the offered service, for the confirmation of an assigned service, etc. To summarize, each agent can be represented as an atomic DEVS in the following way:
A = < X, Y, S, δint, δext, λ, ta >

(3)

where X is the set of input events, Y is the set of output events, S is the set of sequential states, δint: S → S is the internal transition function, δext: Q×X → S is the external transition function, Q = {q = (s, e, DL) | s∈S, 0 ≤ e ≤ ta(s)} is the set of total states, λ: S → Y is the output function, and ta: S → ℜ0+ is the time advance function, with ℜ0+ the set of non-negative real numbers. Here, the status-transitions triggered by events are examined, since the status is considered the main component of the sequential state. This assumption lets the dynamics of each agent be described by status-transitions. The same DEVS methodology can be applied to represent interactions between agents in the loading part of a transhipment cycle, when containers are moved from yard blocks to ships, or in import or export cycles.

4.1 Dynamics of interactions between agents in a downloading process
The MAS can be regarded as ‘container-driven’ because each container follows a sequence of handling operations executed by terminal resources. The container is the main entity flowing in the system, and flows of containers are controlled by agents. To download containers from a ship bay, transport them and stack them into a yard block, each CA follows three phases in negotiation. Firstly, it interacts with QAs to choose the quay crane for downloading. Secondly, with TAs to select the trailers for moving the containers from the quay to the yard area. Thirdly, with YAs to determine the yard cranes for stacking. Hence, it is assumed that the CA first communicates exclusively with QAs, then with TAs only, and finally with YAs. The negotiation is based on data about the offered services: a QA gives the estimated time to wait before the associated quay crane can start downloading containers, and the estimated time to execute the operation; a TA gives the estimated time to wait for a trailer, and the estimated time for the transport task; a YA gives the estimated time for the yard crane to be ready close to the block, and the estimated time for the stacking task. Note that, for the loading part of the transshipment cycle from yard to ship, the CA will sequentially interact with YAs, TAs, and QAs.

4.1.1 Interactions between a CA and QAs
For t < tC0, agent C stays quiescent (QUIESC). At tC0 an input XC0 starts the negotiation and C makes a transition to REQQAV: in [tC0, tC1] it sends the output messages YC01, YC02, …, YC0q at instants t01 > tC0, t02,…, t0q = tC1.


These messages request the availability to all the q alternative QAs of quay cranes that can serve the container. The sequence of requests cannot be interrupted by any internal or external occurrence. For the sake of simplicity, instead of modelling a sequence of q different and subsequent status-values, REQQAV represents the whole duration of the activity, and it is assumed that C makes the transition at tC1 (internal event IC1). In [tC1, tC2] agent C waits for answers (WAIQAV) from QAs. Namely, the request C transmits to each QA may queue up with similar ones sent by other CAs. The next transition occurs at tC2, when either C receives all the answers from the queried QAs (XC1), or a specified time-out of WAIQAV expires before C receives all the answers. If it receives no reply within the time-out (IC2), C returns to REQQAV and repeats the request procedure. If the time-out expires with some replies received (IC3), C considers only the received answers to proceed. The repeated lack of valid replies may occur because of system congestion, crane failures, communication faults, or other unpredictable circumstances. In all these cases permanent waits or deadlocks may occur. To avoid further congestion and improve system fault-tolerance, time-outs are used and C is allowed to repeat the cycle REQQAV-WAIQAV only a finite number of times, after which C is replaced by another software agent.
If all or some replies are received before the time-out expiration, C starts requesting service to the g ≤ q available QAs at tC2. In [tC2, tC3] C requests information from these QAs by sending them the outputs YC11, YC12, …, YC1g at instants t11 > tC2, t12,…, t1g = tC3. As the sequence of requests cannot be interrupted, the REQQSE status is used for the whole activity: at tC3 agent C is assumed to make the transition (IC4). Then, agent C spends [tC3, tC4] waiting for offers from the available QAs (WAIQOF), as the request C transmits to each QA may queue up with those sent by other CAs. The next transition occurs at tC4, when either C receives all the answers from the queried QAs (XC2) or a time-out of WAIQOF expires. If no reply is received within the time-out (IC5), C returns to REQQSE and repeats the procedure. If the time-out expires with some replies received (IC6), C considers only the received offers to select the crane. Again, to avoid congestion, C repeats the cycle REQQSE-WAIQOF a finite number of times, after which it is discharged.
Once the offers from the QAs have been received, C uses [tC4, tC5] to take a decision for selecting the quay crane (TAKQDE). At tC5 the decision algorithm ends (IC7), after selecting a QA and building a rank of all the offers from the QAs. Subsequently, C reserves the chosen crane by transmitting a booking message (YC2) to the corresponding QA. So C takes [tC5, tC6] to communicate the choice to the ‘winner’ QA (COMCHQ). At tC6 the communication ends (IC8). Now, the selected QA has to send a rejection, if there is a conflict with another CA, or a booking confirmation (XC5). Hence, C uses [tC6, tC7] to wait for a confirmation from the selected QA (WAIQCO). The confirmation is necessary because the availability of the cranes can be modified by actions of CAs other than C during the decision interval, so the selected crane may no longer be available. If C receives a rejection (XC3), or does not receive any reply within a time-out (IC9), it returns to COMCHQ and sends a new request of confirmation to the second QA in the decision rank.
If C has no other QAs left in the rank and a rejection (XC4) or the time-out (IC10) occurs, it returns to REQQAV and repeats the negotiation. Note that WAIQAV, WAIQOF and WAIQCO cannot lead to indefinite circular waits (deadlocks), thanks to the time-out mechanism.


At tC7, after receiving a confirmation (XC5) from the selected QA, C makes a transition to issue a downloading command (DWNLDG). It takes the interval [tC7, tC8] to issue the command YC3 for the quay crane to download the container.

4.1.2 Other interactions
When, at time tC8, the downloading command is complete (IC11), C starts the second negotiation-phase with TAs for a trailer to carry the container to its assigned block. This new negotiation follows the same procedure as in the interaction with QAs. Agent C requests and waits for availability, requests information and waits for offers about the trailer service, evaluates and ranks the offers to take a decision, communicates the choice to the best offering TA, and waits for a confirmation/rejection. This can be noted from Fig. 3, where the second part of the status-transition graph has the same repeated structure as the first part, and the status-values have the same meaning. Again, time-outs are used to limit waiting and repeated requests. This makes the model modular, which is very important for simulation and control purposes. When a confirmation is received, C makes a transition to issue a transport command (TRANSP), and the container is loaded on the vehicle associated with the selected TA. Then, C starts the third negotiation-phase with YAs for a yard crane to stack the container in an assigned position inside a specified block. Again, due to the modularity of the approach, the sequence of allowed status-values follows the same protocol and, finally, if a confirmation is received, C makes a transition to issue a stacking command (STCKNG). When the command is complete, C gets back to QUIESC.
From tC24 to the beginning of the next negotiation cycle (if any) for downloading, consolidating or loading another container, C stops making decisions, receiving and sending messages, and remains quiescent. The associated container is downloaded, transported and stacked in a block where it waits for the next destination (a new block or a ship bay): these processes do not involve agent activities. Only when they are over is C ready to start a new negotiation for the same container. If faults occur to the selected cranes or trailer, C remains in QUIESC and there is no need to restart negotiations with QAs, TAs, and YAs. Terminal operators manage the repair process and, when normal operating conditions are restored, the container can be handled by the selected resources.
Fig. 3 depicts the complex interaction dynamics previously described for the negotiation of a CA with QAs, TAs, and YAs. Circles represent the CA status-values, and the period spent in each status is indicated beside it. Arrows represent the transitions and are labelled with the triggering (internal or input) events (the event time is indicated below the event symbol). The outputs, directly associated with status-values, are encapsulated in the circles. The output function simply defines the outputs for the allowed status-values. The time advance function gives the residual time ta(s) in state s before the scheduled occurrence of the next internal event. For instance, at the time t* of entering a waiting status, the related time-out fixes the maximum time to wait Tw, so that ta(s) = t* + Tw - t, where t is the current time. To synthesize, one can use a ‘macro-status’ for each of the negotiation phases and depict the diagram of Fig. 4. Macro-status QA-neg represents the whole part of the diagram corresponding to the first negotiation-phase between the CA and QAs.
In the same way, TA-neg and YA-neg aggregate the parts of diagram in Fig. 3, which are related to the second and third negotiation-phases of the CA with TAs and YAs, respectively.
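Because the three negotiation-phases share the same protocol, a single generic routine can model any of them. The Python fragment below is a minimal sketch under stated assumptions: the status names in the comments mirror those of Fig. 3, while the helper methods (send_requests, collect_replies, rank, book, wait_confirmation), the time-out attributes and the bounded-retry counter are hypothetical; the point is only to show how the REQ*-WAI* and COMCH*-WAI*CO time-out loops keep the agent free of indefinite waits.

```python
def negotiation_phase(agent, candidates, max_retries=3):
    """Generic CA negotiation phase (QA-neg, TA-neg or YA-neg), illustrative only."""
    for _ in range(max_retries):                       # REQ?AV-WAI?AV loop, bounded by design
        agent.send_requests(candidates, "AVAILABILITY")
        available = agent.collect_replies(timeout=agent.t_av)
        if not available:
            continue                                   # IC2-like event: no reply, repeat the request

        for _ in range(max_retries):                   # REQ?SE-WAI?OF loop
            agent.send_requests(available, "SERVICE_DATA")
            offers = agent.collect_replies(timeout=agent.t_of)
            if not offers:
                continue                               # IC5-like event: repeat the service request

            ranked = agent.rank(offers)                # TAK?DE: decision over the received offers
            for offer in ranked:                       # COMCH?-WAI?CO loop over the rank
                agent.book(offer.sender)               # communicate the choice (YC2/YC6/YC10)
                if agent.wait_confirmation(offer.sender, timeout=agent.t_co):
                    return offer                       # confirmed: issue DWNLDG/TRANSP/STCKNG next
            break                                      # rank exhausted: restart from availability
    return None                                        # agent discharged / replaced after max_retries
```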


Fig. 3. Downloading in a transshipment cycle: negotiation of a CA with QAs, TAs, and YAs

Fig. 4. Synthesized diagram of negotiations of a CA with QAs, TAs, and YAs


4.1.3 Remarks about negotiation
As Fig. 3 shows, the CA may receive a confirmation from a QA, TA or YA after several successive loops COMCHQ-WAIQCO, COMCHT-WAITCO or COMCHY-WAIYCO. Time-outs can bring the CA back to REQQAV from WAIQAV if no availability signal comes from QAs, or from WAIQCO if all selected QAs rejected the selection. The CA can also go back to REQTAV from WAITAV if no availability signal comes from TAs, or from WAITCO after rejection from all selected TAs. Finally, the CA comes back to REQYAV from WAIYAV if no availability signal comes from YAs, or from WAIYCO after rejection from all selected YAs. Other time-outs rule the loops between WAIQOF and REQQSE, between WAITOF and REQTSE, and between WAIYOF and REQYSE.
One could merge the requests for availability and for information to get a more compact representation. However, this would be less effective in reducing CA waiting times and in preventing deadlocks. Namely, the detailed model separates and reduces the effects of delays and losses of messages due to communication faults or to the unavailability of cranes and trailers when particular conditions occur (faults, overloading conditions, etc.). Moreover, the number of status-loops and the consequent decision delays are reduced if the detailed model is adopted.

4.2 Specification of the DEVS model of a container agent
On the basis of the negotiation mechanism described in section 4.1 and depicted in Fig. 3, the components of the DEVS model of a CA are identified. Table 1 reports and explains all the admissible status-values. The sequential state s = (C-status, C-inf) of a container agent is defined as a function of the current status-value, with C-status belonging to the set of all admissible values:

C-status ∈ {QUIESC, REQQAV, WAIQAV, REQQSE, WAIQOF, TAKQDE, COMCHQ, WAIQCO, DWNLDG, REQTAV, WAITAV, REQTSE, WAITOF, TAKTDE, COMCHT, WAITCO, TRANSP, REQYAV, WAIYAV, REQYSE, WAIYOF, TAKYDE, COMCHY, WAIYCO, STCKNG}

(4)

and as a function of other information C-inf about the CA, which is defined as follows:
C-inf = (p, AQ, AT, AY, ta(s))

(5)

which, in turn, depends on: the current position p of the container (on the ship, picked by a quay crane, on a trailer, picked by a yard crane, in the yard block); the set AQ of alternative quay cranes available for the currently negotiated downloading operations, or the set AT of trailers available for the currently negotiated transport tasks, or the set AY of yard cranes available for the currently negotiated stacking tasks; and the time ta(s) scheduled in the current state s before the next internal event occurrence. A detailed description of the inputs, outputs and internal events is reported in Tables 2, 3 and 4, respectively, for the sake of clarity. It can be easily derived from the interactions between a CA and QAs, TAs, and YAs, as explained in section 4.1 and indicated in Fig. 3. Most inputs received and outputs sent by a CA obviously correspond to outputs coming from and inputs to other agents, respectively. These messages will therefore contribute to defining the DEVS atomic models of the associated agents.


More specifically, the agents interacting with a CA will change their status, and hence their sequential state, on the basis of the inputs received from the CA. Moreover, they will generate outputs by using their specific output functions to answer the CA during the different negotiation-phases.

First phase:
QUIESC: Agent quiescent (inactive)
REQQAV: Request availability to all alternative QAs
WAIQAV: Wait for availability from QAs
REQQSE: Request service to available QAs
WAIQOF: Wait for offers from available QAs
TAKQDE: Rank QAs and take decision for the best QA
COMCHQ: Communicate choice to selected QA
WAIQCO: Wait confirmation/rejection from selected QA
DWNLDG: Command selected QA to download container

Second phase:
REQTAV: Request availability to all alternative TAs
WAITAV: Wait for availability from TAs
REQTSE: Request service to available TAs
WAITOF: Wait for offers from available TAs
TAKTDE: Rank TAs and take decision for the best TA
COMCHT: Communicate choice to selected TA
WAITCO: Wait confirmation/rejection from selected TA
TRANSP: Command selected TA to transport container

Third phase:
REQYAV: Request availability to all alternative YAs
WAIYAV: Wait for availability from YAs
REQYSE: Request service to available YAs
WAIYOF: Wait for offers from available YAs
TAKYDE: Rank YAs and take decision for the best YA
COMCHY: Communicate choice to selected YA
WAIYCO: Wait confirmation/rejection from selected YA
STCKNG: Command selected YA to stack container

Table 1. Status-values for a Container Agent

Finally, note how the successive inputs, outputs and internal events of the CA in the three subsequent negotiation-phases repeat with the same role, which gives the DEVS model a structure that can be easily implemented in simulation. DEVS models can also be specified for the other agents by following the same methodology. In particular, the sequential state of each QA, TA or YA will include information about the queued requests coming from different CAs competing for the same controlled quay crane, trailer or yard crane.
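To make the specification concrete, the status set of equation (4) and the information vector C-inf of equation (5) could be encoded directly, following the grouping of Table 1. The short Python sketch below is only an illustration; the class and field names are hypothetical and simply mirror the symbols used in the text.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Optional

class CStatus(Enum):
    # First phase (negotiation with QAs)
    QUIESC = auto(); REQQAV = auto(); WAIQAV = auto(); REQQSE = auto(); WAIQOF = auto()
    TAKQDE = auto(); COMCHQ = auto(); WAIQCO = auto(); DWNLDG = auto()
    # Second phase (negotiation with TAs)
    REQTAV = auto(); WAITAV = auto(); REQTSE = auto(); WAITOF = auto()
    TAKTDE = auto(); COMCHT = auto(); WAITCO = auto(); TRANSP = auto()
    # Third phase (negotiation with YAs)
    REQYAV = auto(); WAIYAV = auto(); REQYSE = auto(); WAIYOF = auto()
    TAKYDE = auto(); COMCHY = auto(); WAIYCO = auto(); STCKNG = auto()

@dataclass
class CInf:
    """C-inf = (p, AQ, AT, AY, ta(s)) from equation (5)."""
    p: str                                   # current position of the container
    AQ: list = field(default_factory=list)   # quay cranes available for the negotiated downloading
    AT: list = field(default_factory=list)   # trailers available for the negotiated transport
    AY: list = field(default_factory=list)   # yard cranes available for the negotiated stacking
    ta: Optional[float] = None               # time scheduled before the next internal event

# Sequential state s = (C-status, C-inf) as in equations (4)-(5):
s = (CStatus.QUIESC, CInf(p="on ship"))
```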


First phase:
XC0 (tC0): Start of negotiation activity for a new operation
XC1 (tC2): Last of q replies for availability received from QAs
XC2 (tC4): Last of g replies for offer received from QAs
XC3 (tC7): Rejection & other QAs in the CA rank
XC4 (tC7): Rejection & no other QA in the CA rank
XC5 (tC7): Confirmation from selected QA

Second phase:
XC6 (tC10): Last of v replies for availability received from TAs
XC7 (tC12): Last of u replies for offer received from TAs
XC8 (tC15): Rejection & other TAs in the CA rank
XC9 (tC15): Rejection & no other TA in the CA rank
XC10 (tC15): Confirmation from selected TA

Third phase:
XC11 (tC18): Last of k replies for availability received from YAs
XC12 (tC20): Last of h replies for offer received from YAs
XC13 (tC23): Rejection & other YAs in the CA rank
XC14 (tC23): Rejection & no other YA in the CA rank
XC15 (tC23): Confirmation from selected YA

Table 2. Input events received by a Container Agent in negotiation with QAs, TAs, and YAs

First phase:
YC01, YC02, …, YC0q (t01>tC0, t02, ..., t0q=tC1): Requests of availability to q QAs
YC11, YC12, …, YC1g (t11>tC2, t12, ..., t1g=tC3): Requests of information to g available QAs
YC2 (tC6): Choice communication to the selected QA
YC3 (tC8): Command for downloading container

Second phase:
YC41, YC42, …, YC4v (t41>tC8, t42, ..., t4v=tC9): Requests of availability to v TAs
YC51, YC52, …, YC5u (t51>tC10, t52, ..., t5u=tC11): Requests of information to u available TAs
YC6 (tC14): Choice communication to the selected TA
YC7 (tC16): Command for transporting container

Third phase:
YC81, YC82, …, YC8k (t81>tC16, t82, ..., t8k=tC17): Requests of availability to k YAs
YC91, YC92, …, YC9h (t91>tC18, t92, ..., t9h=tC19): Requests of information to h available YAs
YC10 (tC22): Choice communication to the selected YA
YC11 (tC24): Command for stacking container

Table 3. Output events sent by a Container Agent in negotiation with QAs, TAs, and YAs

First phase:
IC1 (tC1): End of request for availability of QAs
IC2 (tC2): Time-out & no availability signal received from QAs
IC3 (tC2): Time-out & g availability signals received from QAs
IC4 (tC3): End of request for offered service from available QAs
IC5 (tC4): Time-out & no offer received from the g available QAs
IC6 (tC4): Time-out & oq ≤ g offers received from available QAs
IC7 (tC5): End of decision for choosing quay crane (agent)
IC8 (tC6): End of choice communication to the selected QA
IC9 (tC7): Time-out & no confirmation received from the selected QA & ranked offers are available from other QAs
IC10 (tC7): Time-out & no confirmation received from the selected QA & no ranked offers are available from other QAs

Transition to 2nd phase:
IC11 (tC8): End of downloading command

Second phase:
IC12 (tC9): End of request for availability of TAs
IC13 (tC10): Time-out & no availability signal received from TAs
IC14 (tC10): Time-out & u availability signals received from TAs
IC15 (tC11): End of request for offered service from available TAs
IC16 (tC12): Time-out & no offer received from the u available TAs
IC17 (tC12): Time-out & ot ≤ u offers received from available TAs
IC18 (tC13): End of decision for choosing trailer (agent)
IC19 (tC14): End of choice communication to the selected TA
IC20 (tC15): Time-out & no confirmation received from the selected TA & ranked offers are available from other TAs
IC21 (tC15): Time-out & no confirmation received from the selected TA & no ranked offers are available from other TAs

Transition to 3rd phase:
IC22 (tC16): End of transport command

Third phase:
IC23 (tC17): End of request for availability of YAs
IC24 (tC18): Time-out & no availability signal received from YAs
IC25 (tC18): Time-out & h availability signals received from YAs
IC26 (tC19): End of request for offered service from available YAs
IC27 (tC20): Time-out & no offer received from the h available YAs
IC28 (tC20): Time-out & oy ≤ h offers received from available YAs
IC29 (tC21): End of decision for choosing yard crane (agent)
IC30 (tC22): End of choice communication to the selected YA
IC31 (tC23): Time-out & no confirmation received from the selected YA & ranked offers are available from other YAs
IC32 (tC23): Time-out & no confirmation received from the selected YA & no ranked offers are available from other YAs

Transition to QUIESC:
IC33 (tC24): End of stacking command

Table 4. Internal events in a Container Agent in negotiation with QAs, TAs, and YAs


Namely, let Q, T and Y be three such agents. Each agent will then be characterized by: a queue c1 for the availability requests associated with one of the messages YC0i (1 ≤ i ≤ q) arriving at Q, YC4i (1 ≤ i ≤ v) arriving at T, or YC8i (1 ≤ i ≤ k) arriving at Y; a queue c2 for the information requests associated with one of the messages YC1i (1 ≤ i ≤ g) arriving at Q, YC5i (1 ≤ i ≤ u) arriving at T, or YC9i (1 ≤ i ≤ h) arriving at Y; and a queue c3 for the confirmation requests associated with the messages YC2 arriving at Q, YC6 arriving at T, or YC10 arriving at Y. Moreover, the state of Q, T, or Y will be defined by the current number of containers served by the associated quay crane, trailer or yard crane.
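Gathering the elements above, the CA (or any other agent) can be coded as an atomic DEVS following equation (3). The Python skeleton below is a minimal sketch, not the authors' implementation: the transition and output tables are left empty, the method names simply mirror the DEVS functions δint, δext, λ and ta, and filling the tables would follow Tables 1-4.

```python
import math

class AtomicDEVS:
    """Skeleton of an atomic DEVS A = <X, Y, S, delta_int, delta_ext, lambda, ta>."""

    def __init__(self, initial_status, inf):
        self.status = initial_status        # main component of the sequential state s
        self.inf = inf                      # e.g. C-inf = (p, AQ, AT, AY, ta(s)) for a CA
        self.elapsed = 0.0                  # time e since the last transition (part of q)
        self.int_table = {}                 # (status, internal event) -> next status
        self.ext_table = {}                 # (status, input event)    -> next status
        self.out_table = {}                 # status -> output event(s) Y (the lambda function)
        self.timeout = {}                   # waiting status -> maximum time to wait Tw

    def ta(self):
        """Time advance: residual time in the current status before the next internal event."""
        return self.timeout.get(self.status, math.inf) - self.elapsed

    def output(self):
        """Role of lambda: outputs associated with the current status (e.g. YC01...YC0q in REQQAV)."""
        return self.out_table.get(self.status, [])

    def delta_int(self, internal_event):
        """Internal transition, e.g. (WAIQAV, IC2) -> REQQAV on a time-out."""
        self.status = self.int_table[(self.status, internal_event)]
        self.elapsed = 0.0

    def delta_ext(self, e, input_event):
        """External transition, e.g. (WAIQCO, XC5) -> DWNLDG; e is the elapsed time in the state."""
        self.status = self.ext_table[(self.status, input_event)]
        self.elapsed = 0.0
```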

5. Ideas for simulation and evaluation of efficiency and robustness of the MAS control
The DEVS atomic models can be integrated to obtain a complete networked system, which can be used as a platform for simulating the MAS architecture for controlling an intermodal container terminal; the TCT in Taranto, for example, can be used as a test-bed. In this context, it is possible to simulate not only the dynamics of terminal activities, the flow of containers, and the utilization of terminal resources (cranes, trailers, human operators, etc.), but also the efficiency of the MAS and its agents (flow of event messages, status transitions, waiting loops, etc.). Two types of performance indices can then be defined. First, it is possible to measure the conventional indices: the total number of handled (imported, exported, transshipped) containers; the average throughput during downloading (from ship to yard) or loading (from yard to ship) processes; the average lateness of containers in the terminal; the utilization of resources; and the ship turn-around time, i.e. the average time required to serve a ship for downloading and loading containers. Second, it is possible to measure the behaviour of the MAS and the efficiency of the agents' decision policies by means of: the average number of requests for each negotiation; and the number of repeated negotiation loops of status-values before a final decision is taken by a CA, expressed in percentage terms with respect to the total number of operations executed by every CA. The lower this index, the better the agent's capability to obtain a service at the first request; the higher its value, the greater the lack of feasible replies due to congestion of the other agents or of the communication system.
More specifically, the terminal performance measures can be evaluated both in steady-state operating conditions and in perturbed conditions. Perturbations may arise from: hardware faults or malfunctions; abrupt increases/decreases of the containers to be handled due to changes in maritime traffic volumes; sudden increases/reductions of yard space; traffic congestion of trailers; or congestion, delays and message losses in the communications between agents. For example, the private company managing TCT usually plans and controls the activities to serve one ship per day, since ship arrivals are known and scheduled in advance with the ship agencies. However, the company itself has recently foreseen a traffic increase in the coming years, due to expected cargo movements from China and eastern countries to the Mediterranean Sea. It is therefore quite reasonable to expect working days in which more than one ship is berthed and served at the same time. In this case, at least two ships berthed at the quay would strongly perturb the required operations and the terminal efficiency. Fig. 5 shows a snapshot of a discrete-event simulation made in this condition (2 ships at the quay in the schematic view of the terminal), using a conventional centralized control architecture based on the current policies used in the terminal. The performance obtained was much lower than in standard conditions.
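The MAS-specific indices above are simple ratios over the simulated event trace. As a hypothetical illustration (the trace format below is an assumption, not taken from the chapter), the negotiation-loop index could be computed as follows.

```python
def negotiation_loop_index(event_trace):
    """Percentage of repeated negotiation loops over the total operations of all CAs.

    event_trace is assumed to be a list of (agent_id, event_name) pairs produced by
    the discrete-event simulation; "loop" marks a repeated REQ*-WAI* cycle and
    "operation" marks a completed downloading/transport/stacking command.
    """
    loops = sum(1 for _, name in event_trace if name == "loop")
    operations = sum(1 for _, name in event_trace if name == "operation")
    return 100.0 * loops / operations if operations else 0.0
```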


Fig. 5. Simulation snapshot of TCT in perturbed conditions – centralized control

It is therefore important to use distributed MAS control architectures and to measure the robustness of the agents' decision laws, to see how they dynamically react to disturbances and parameter variations, and eventually to adapt them. The adaptation aims to make the autonomous agents learn the most appropriate decision laws in all terminal conditions. In this sense, the decision logic DL of a CA could be partly constant, to encapsulate the most reliable strategies, and partly adaptable according to a learning algorithm. The constant part consists of a set of R heuristic decision rules, each related to a different evaluation parameter provided by QAs, TAs, and YAs. Some decision rules may be more effective in perturbed conditions (anomalies, faults, congestion), whereas a trade-off between different rules may be more appropriate in other cases. Therefore, the weights assigned to the rules represent the adjustable part of DL = {α1, α2, … , αR}, where each αj (j = 1, ..., R) specifies the factor weighting the role of the j-th heuristic in the global decision criterion. An evolutionary algorithm can then be used to adapt the factors (a sketch of such a weighted decision logic is given after the list below). In this way, in any operating condition, the worst performance of an agent should never be significantly lower than the performance of the worst decision rule.
To conclude, the discrete-event simulation platform also allows the comparison of alternative types of control architectures, which can be defined by:
• a static MAS in which CAs use static logics based on heuristic decision parameters (estimated delivery time of the requested task, distance of cranes or trailers);
• a dynamic MAS in which CAs take decisions by fuzzy weighted combinations of heuristic criteria, with the weights adapted by an evolutionary algorithm;
• other distributed control architectures.
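The following Python fragment is a minimal sketch of the weighted decision logic DL = {α1, …, αR} described above. The heuristic criteria used here (estimated waiting time, estimated execution time, distance) and the example weights are assumptions for illustration, not the chapter's actual rules, and the evolutionary adaptation is only indicated by a comment where it would plug in.

```python
def score_offer(offer, weights):
    """Weighted combination of heuristic criteria for one offer (lower is better)."""
    criteria = [
        offer["waiting_time"],     # estimated time before the crane/trailer is ready
        offer["execution_time"],   # estimated time to execute the requested task
        offer["distance"],         # distance of the crane/trailer from the container
    ]
    return sum(alpha * c for alpha, c in zip(weights, criteria))

def choose(offers, weights):
    """Decision step (TAK?DE): rank the offers with the current DL weights and pick the best."""
    return min(offers, key=lambda o: score_offer(o, weights))

# The weights alpha_1..alpha_R form the adjustable part of DL; an evolutionary
# algorithm would evaluate candidate weight vectors on simulated scenarios and
# keep the best performing ones, e.g.:
#   weights = evolve(population_of_weight_vectors, fitness=simulated_throughput)
weights = [0.5, 0.3, 0.2]
best = choose(
    [{"waiting_time": 4, "execution_time": 10, "distance": 2},
     {"waiting_time": 1, "execution_time": 12, "distance": 5}],
    weights,
)
```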

6. Conclusion
A MAS architecture was proposed for controlling operations in intermodal container terminal systems. The autonomous agents are represented as atomic DEVS components. The interactions between agents are modelled according to the DEVS formalism to represent negotiations for tasks when downloading containers from ships to yard stacking blocks.


The developed model can be easily extended to describe other processes, such as loading containers from the yard to ships, redistributing containers in the yard, and import or export cycles. An accurate DEVS model of the MAS can be used in a detailed simulation environment of a real system (the TCT in Taranto), which allows the measurement of standard terminal performance indices and of the efficiency of the MAS in real scenarios. Moreover, open issues include testing and comparing static MAS and dynamically adapted MAS, for example when evolutionary adaptation mechanisms are used, especially with reference to different operating scenarios and to possible perturbations of the steady-state operating conditions.

7. References
Bielli, M.; Boulmakoul, A. & Rida, M. (2006). Object oriented model for container terminal distributed simulation. European Journal of Operational Research, Vol. 175, No. 3, pp. 1731-1751, ISSN 0377-2217
Cantarella, G. E.; Cartenì, A. & de Luca, S. (2006). A comparison of macroscopic and microscopic approaches for simulating container terminal operations, Proceedings of the EWGT2006 International Joint Conferences, pp. 556-558, ISBN 88-901798-2-1, Bari, Italy, 27-29 September 2006, 01Media, Molfetta (Bari), Italy
Crainic, T. G.; Gendreau, M. & Dejax, P. (1993). Dynamic and stochastic models for the allocation of empty containers. Operations Research, Vol. 41, No. 1, pp. 102-126, ISSN 0030-364X
de Luca, S.; Cantarella, G. E. & Cartenì, A. (2005). A macroscopic model of a container terminal based on diachronic networks, Second Workshop on the Schedule-Based Approach in Dynamic Transit Modelling, Ischia, Naples, Italy, 29-30 May 2005
Degano, C. & Di Febbraro, A. (2001). Modelling Automated Material Handling in Intermodal Terminals, Proceedings 2001 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM'01), Vol. 2, pp. 1023-1028, ISBN 0-7803-67375, Como, Italy, 8-12 July 2001, IEEE, Piscataway, NJ, USA
Fischer, M. & Kemper, P. (2000). Modeling and Analysis of a Freight Terminal with Stochastic Petri Nets, Proceedings of the 9th IFAC Int. Symp. Control in Transportation Systems, Vol. 2, pp. 195-200, ISBN-13 978-0-08-043552-7, Braunschweig, Germany, 13-15 June 2000, Eds. Schnieder, E. & Becker, U., Elsevier-Pergamon, Oxford, UK
Gambardella, L. M.; Rizzoli, A. E. & Zaffalon, M. (1998). Simulation and planning of an intermodal container terminal. Simulation, Special Issue on Harbour and Maritime Simulation, Vol. 71, No. 2, pp. 107-116, ISSN 0037-5497
Heragu, S. S.; Graves, R. J.; Kim, B.-I. & Onge, A. St. (2002). Intelligent Agent Based Framework for Manufacturing Systems Control. IEEE Trans. Systems, Man, and Cybernetics - Part A: Systems and Humans, Vol. 32, No. 5, pp. 560-573, ISSN 1083-4427
Hsieh, F.-S. (2004). Model and control holonic manufacturing systems based on fusion of contract nets and Petri nets. Automatica, Vol. 40, No. 1, pp. 51-57, ISSN 0005-1098
Huhns, M. N. & Stephens, L. M. (2001). Automating supply chains. IEEE Internet Computing, Vol. 5, No. 4, pp. 90-93, ISSN 1089-7801
Kozan, E. (2000). Optimising container transfers at multimodal terminals. Mathematical and Computer Modelling, Vol. 31, No. 10-12, pp. 235-243, ISSN 0895-7177


Legato, P. & Mazza, R. M. (2001). Berth planning and resources optimisation at a container terminal via discrete event simulation. European Journal of Operational Research, Vol. 133, No. 3, pp. 537-547, ISSN 0377-2217
Lin, F. & Norrie, D. H. (2001). Schema-based conversation modeling for agent-oriented manufacturing systems. Computers in Industry, Vol. 46, No. 3, pp. 259-274, ISSN 0166-3615
Liu, C. I. & Ioannou, P. A. (2002). Petri Net Modeling and Analysis of Automated Container Terminal Using Automated Guided Vehicle Systems. Transportation Research Record, No. 1782, pp. 73-83, ISSN 0361-1981
Logan, B. & Theodoropoulos, G. (2001). The distributed simulation of multi-agent systems. Proceedings of the IEEE, Vol. 89, No. 2, pp. 174-185, ISSN 0018-9219
Mastrolilli, M.; Fornara, N.; Gambardella, L. M.; Rizzoli, A. E. & Zaffalon, M. (1998). Simulation for policy evaluation, planning and decision support in an intermodal container terminal, Proceedings Int. Workshop “Modeling and Simulation within a Maritime Environment”, pp. 33-38, ISBN 1-565-55132-X, Riga, Latvia, 6-8 September 1998, Eds. Merkuryev, Y., Bruzzone, A. & Novitsky, L., Society for Computer Simulation International, Ghent, Belgium
Peterkofsky, R. I. & Daganzo, C. F. (1990). A branch and bound solution method for the crane scheduling problem. Transportation Research Part B: Methodological, Vol. 24B, No. 3, pp. 159-172, ISSN 0191-2615
Schattenberg, B. & Uhrmacher, A. M. (2001). Planning Agents in James. Proceedings of the IEEE, Vol. 89, No. 2, pp. 158-173, ISSN 0018-9219
Shen, W. & Norrie, D. H. (1999). Agent-Based Systems for Intelligent Manufacturing: A State-of-the-Art Survey. Knowledge and Information Systems, An International Journal, Vol. 1, No. 2, pp. 129-156, ISSN 0219-1377
Steenken, D.; Voss, S. & Stahlbock, R. (2004). Container terminal operation and operations research - a classification and literature review. OR Spectrum, Vol. 26, No. 1, pp. 3-49, ISSN 0171-6468
Vis, I. F. A. & de Koster, R. (2003). Transshipment of containers at a container terminal: An overview. Europ. Jour. Operational Research, Vol. 147, No. 1, pp. 1-16, ISSN 0377-2217
Yun, W. Y. & Choi, Y. S. (1999). A simulation model for container-terminal operation analysis using an object-oriented approach. International Journal of Production Economics, Vol. 59, No. 1-3, pp. 221-230, ISSN 0925-5273
Zeigler, B. P.; Praehofer, H. & Kim, T. G. (2000). Theory of Modelling and Simulation, Academic Press, ISBN 0-12-778455-1, New York, 2nd edition

4
Inclusion of Expert Rules into Normalized Management Models for Description of MIB Structure
Antonio Martin and Carlos Leon

Dpto. Tecnología Electrónica, University of Sevilla, Spain

1. Introduction
With communication networks constantly growing in technology and size, network management faces additional requirements. To satisfy these needs of quality and performance, it is necessary to support network management with advanced software. Artificial Intelligence is incorporated into network management to support the tasks of control and administration. OSI and Internet are the dominant network management models, which have been used for the administration and control of most existing networks. Network management activities, such as fault, security and performance management, are performed through these network management models. These activities are becoming supplementary functionalities in network management, requiring direct expert system involvement. Traditional intelligent management technologies can play an important role in the problem-solving and reasoning techniques employed in network management. However, these systems are not flexible enough for today's evolving network needs, and it is necessary to develop new models which offer more possibilities. We propose a hybrid solution that employs both the managed model and AI reasoning techniques for the management of heterogeneous distributed information networks. We present a new concept called Integrated Management Expert System, which employs both the managed model and AI reasoning techniques for the intelligent management of heterogeneous networks. This new paradigm contains aspects which clearly differentiate it from former management techniques, which use expert systems and management platforms separately. We propose the normalization of the management knowledge necessary to administer the resources that exist in the networks, independently of the builder of the managed resources. The goal is to obtain a syntactically uniform definition of all the management knowledge supplied by the expert, independently of the maker. The novelty comes from the fact that the knowledge employed for network management (conditions and management operations to be performed on the different resources) is itself included and normalised in the definitions of the network elements and is treated as if it were a property. In this way, we achieve a syntactically uniform normalization of the intelligence applied to management. This technique integrates the Expert System within the Management Information Base. The advantage is that a large problem can be broken down into smaller and more manageable sub-problems/modules.


For this purpose, an extension of the OSI and SNMP management framework specification languages has been investigated. A new property named RULE has been added to the Management Information Base (MIB), which gathers important aspects of the facts and the knowledge base of the embedded expert system.

2. Related work
The work that we present has two parts (figure 1). In the first part we analyse the current management models, their evolution and the applications of expert systems in network management. It offers a general vision of the traditional management expert system, analysing its deficiencies and identifying the needs that push us toward new management paradigms.

Fig. 1. Synopsis of the research

The second step in the modelling sequence includes an introduction to the standard Structure of Management Information (SMI) and intelligent management, showing the advantages and problems of integrated intelligent management. Next we present the extension of the standard GDMO (ISO/ITU-T, 1993) to accommodate the intelligent management requirements. We describe how the templates are modified using the new extension, called GDMO+, and present the new template RULE, which defines the knowledge base of the management expert system. We study the characteristics of this new property RULE: template structure, behaviour, priority, inheritance of expert rules, etc. As the last step, in order to show the viability of our proposal, we perform a practical demonstration in which the information and the management knowledge are unified in a single specification. We present a tool based on the GDMO+ standard. It shows an expert system prototype of integration rules. This prototype currently provides service to a power utility.


3. Network management
Having presented the scope and goals of the work, this section presents the most important management models and gives a brief overview of relevant topics in network management. Several organizations have developed services, protocols and architectures for network management. At the moment there are two main management models for computer communication: Open Systems Interconnection (OSI) and Internet. These are standardized models for computer communication and the starting point for understanding network management. Such a model provides a common understanding between communicating processes in a heterogeneous environment, serves as the basis for a precise specification of network services and protocols, and constitutes a vendor-independent concept (ITU-T, 1992).
ISO was the first to start, as part of its Open Systems Interconnection (OSI) program. The term OSI systems management actually refers to a collection of standards for network management that include a management service and a protocol: the Common Management Information Protocol (CMIP), which provides the information exchange capability to support the Common Management Information Service (CMIS) and offers management services to management applications. ISO issued a set of standards and draft standards for network management. A subset of these standards provides the foundation for the Telecommunication Management Network (TMN) developed by the International Telecommunication Union (ITU). TMN is conceptually a separate network that interfaces a telecommunications network at several different points (ITU-T, 1996).
The Internet model was developed by the Internet Engineering Task Force (IETF). It is a structured and standardized approach to Internet management and uses the Simple Network Management Protocol (SNMP) (Black, 1995), figure 2.

Fig. 2. Management Models

3.1 Network management elements
These network management systems operate using a client/server architecture. Four fundamental concepts of these models are (ISO/ITU-T, 1998):
- Manager or Manager Role: in the network management model, a manager is a unit that provides information to users, issues requests to devices in a network, receives responses to the requests and receives notifications.
- Agent or Agent Role: a unit that is part of a device in the network and that monitors and maintains status information about that device. It can act on and respond to requests from a manager.
- Network Management Protocols: managers and agents require some form of communication to issue their requests and responses.


The Common Management Information Protocol (CMIP) is the protocol used in the ISO and TMN management models. SNMP is the protocol used to issue requests and receive responses in the Internet management model. The combination of Internet and OSI management requires the use of different network management protocols working at different levels of modelling complexity. CMIP requires the full OSI stack for its implementation, while the SNMP protocol operates with the lower layers of the stack. CMIP and SNMP have been designed to scale as the network grows, i.e. with the ability to perform the “manager” or “agent” role.
- Management Information Base (MIB): in this information base we add the knowledge management. In addition to being able to pass information back and forth, the manager and the agent need to agree on and understand what information the manager and agent receive in any exchange. This information varies for each type of agent. The collection of this information is referred to as the management information base. A manager normally contains management information describing each type of agent the manager is capable of managing. This information would typically include ISO and Internet MIB definitions for managed objects and agents (Morris, 2003).

3.2 Management information
Information modelling plays a large part in modern network management systems. It is a way to represent the capabilities of a managed object interfacing directly with the equipment's native commands. This abstraction has given rise to management information models. This is the information associated with a managed object that is operated on by the OSI and Internet management protocols to control and monitor that object. The description of management information has two aspects. First, a Structure of Management Information (SMI) defines the logical constitution of management information and how it is identified and described. Second, the MIB, which is specified using the SMI, defines the actual objects to be managed. The MIB can be seen as a kind of database: it is a conceptual repository of management information. The elements that make up a management information model are referred to as managed objects. The content of this database is not the set of managed objects themselves, but the information that is associated with the managed objects. These resources are viewed as ‘managed objects’. A managed object is the abstract view of a logical or physical system resource to be managed. Thus, a managed object is the abstraction of such a resource that represents its properties as seen by, and for the purpose of, management. These special elements provide the necessary operations for the administration, monitoring and control of the telecommunications network. A managed object is a management view of a resource that is subject to management, such as a layer entity, a connection or an item of physical communications equipment. Managed objects are used for management functions, performance monitoring and analysis, and the setup and teardown of virtual circuits and virtual paths. The managed object concept is refined in a number of additional standards, which are called the Structure of Management Information (SMI) standards. The managed objects are defined according to the SMI, which defines how network objects and their behaviour are to be specified, including the syntax and semantics (Clemm, 2006). In the Open Systems Interconnection (OSI) and TMN systems management, the information architecture is based on an object-oriented approach and on the agent/manager concepts, which are of paramount importance. In OSI, the SMI provides the Guidelines for the Definition of Managed Objects (GDMO) for the definition of the objects contained in the MIB.


The Internet management model does not use Object Oriented Programming as it is used by the OSI model. This is one of the reasons for the Internet model's simplicity. The definitions contain objects specified with ASN.1 macros. We are studying the way to integrate the expert knowledge in the Internet management model. The resource specifications can only be groups of scalar variables and cell tables; in spite of not being an Object Oriented Programming model, we can use the tables as classes where the attributes are the table columns and every row contains an instance of the class. As in OSI, every object has an associated OID identifier.

4. Overview of expert systems and telecommunications management
After this analysis of the management elements common to the OSI and Internet models, we present an overview of the state of the art in traditional expert management systems. ISO classifies the systems management activities into five functional areas: fault management, accounting management, configuration management, performance management and security management. The specific functions performed by OSI are also applicable in the Internet model. We can categorize the expert systems within these five groups. Table 1 gives a synopsis of some expert systems applied to network management, indicating for each one the application area and the technique used.
Table 1 cross-tabulates the management domains (fault, accounting, configuration, performance and security) against the techniques employed (expert rules, Bayesian networks, case-based reasoning (CBR) and blackboard architectures). In the fault domain it lists rule-based systems such as Max & Opti, ANSWER, ESS-ES and ECXpert, the Bayesian-network systems Trouble Locator and APRI, and the case-based system CRITTER; the remaining domains are covered by systems such as ACE, XCON, NMCS, EXSim, TASA, Expert Interface, NIDES, P-BEST, NetHELP, NIDX, NETTRA and Scout.
Table 1. Expert system applications developed for network management
We can observe that most expert systems are built for fault management and use models based on expert rules. These rules capture the experience accumulated by human experts. In recent years new techniques have been applied: Bayesian networks, blackboard architectures, systems based on case-based reasoning, which select the most similar case among all the cases existing in the knowledge base, etc. (Negnevitsky, 2002). In traditional intelligent management these expert systems can be treated as a black box which receives the events coming from the elements that compose the managed system. There must exist between them a certain level of compatibility and interoperability that ensures that the stream of data and information control flows in both directions. A unique interface must be developed, according to the possibilities and tools that the system possesses for external communication with these applications: sockets, CORBA, etc.


The traditional expert management approach presents drawbacks, which show the limitations of current management systems. Some disadvantages of traditional expert management are the following:
• There are disadvantages caused by the existence of two independent elements, the management platform and the expert system, with a permanent and continuous dialogue between them.
• Restrictions may appear when choosing a particular management platform.
• This presents incompatibility problems between the management platform and the expert system.
• Because of the need to use heterogeneous networks, it is difficult to normalize the knowledge for an adequate management of resources.
This disadvantage can be solved by using the integrated management that we propose. In integrated intelligent management, it is not necessary to interconnect the expert system and the management platform. Both components are defined using the same syntax, so that both are finally completely integrated into a single specification. In the next sections we present our research on the integration of knowledge management into the MIB of the OSI and Internet management models.

5. Knowledge integration in OSI and Internet models
This paper focuses on a framework and a language for formalizing knowledge management descriptions and combining them with existing managed object definitions. To find a solution, we must develop an interface that makes it possible to introduce expert management rules within the definition of the MIB objects. This solution will generally be more flexible than the one offered by traditional methods (formal language). The solution is the inclusion of formal knowledge descriptions in regular RULE templates. An essential part of the definition of an intelligent managed object is the relationship between its properties and the management knowledge of the resource. This relationship is not modelled in a general way. Unfortunately, the knowledge of managed objects is defined in another way, using a programming language. This results in resource specifications which often contain no information about the knowledge base of the expert system, increasing the possibility of different intelligent implementations not being interoperable. To achieve consistent, clear, concise, and unambiguous specifications, a formal methodology has to be used.
The OSI and Internet management models face several impediments to improved integrated intelligent management. Management models explain how individual management operations should be performed; however, the current management standards do not specify the sequence in which intelligent operations should be performed to solve specific management problems. OSI and Internet management is rather complicated. The model has introduced several new concepts, which are sometimes difficult to comprehend. This makes it difficult to build intelligent platforms that work with the model. The basic questions raised are:
1. Formulate, describe, distribute and assign the knowledge between the different intelligent agents defined.
2. Communicate the knowledge between different objects, and decide which communication language or protocol they must use.
3. Ensure that the objects act coherently when they make a decision or carry out an action.


4. Enable the objects to reason about the actions, plans and knowledge of other objects.
To make this possible, it is essential to explore the capabilities of the management information models. In particular, the OSI GDMO standard specifies the properties and characteristics that facilitate the inclusion of expert management rules as a part of the defined managed resources. In this way the intelligent network managers can interpret the rules supplied by the expert system and achieve an intelligent treatment of the management information. The integrated framework that we propose has numerous advantages, among them:
- Abstraction of the users from the management systems and good compatibility between different management platforms.
- The ability of different network management systems to communicate the knowledge management of the expert systems without overloading the management applications.
- The flexibility to define new managed objects that contain knowledge management without modifying all the systems that need to interact with them.
- The possibility of standardization of the new expert rules by an organism such as ISO, and a common set of network management knowledge based on standards.
- Reusability of the knowledge management. GDMO has an object-oriented syntax; its class inheritance mechanisms allow management expert rules to be reused.
- Ease of building new expert management systems.
- A general top-down view of the integrated multi-vendor network and a structured list of network management objects that contains a set of expert rules for the administration.
This structure also provides a richer level of abstraction, facilitating the coexistence of knowledge management, allowing different levels of modelling complexity, and organizing the knowledge management of the managed objects, figure 3.

Fig. 3. Integration of Knowledge management and Resources properties in one single specification.

6. Including formal knowledge management in OSI
To allow the deployment of equipment from different vendors, the OSI management framework defines the language GDMO (Guidelines for the Definition of Managed Objects). GDMO has been standardized by the ITU (International Telecommunication Union) in ITU-T X.722 and is now widely used to specify interfaces between different components of the TMN architecture. This section introduces a framework for the inclusion of formal knowledge management descriptions into GDMO specifications and focuses on the syntax and semantics of the language GDMO and the extension GDMO+. According to OSI's Management Information Model, the management view of a managed object is visible at the managed object boundary.


Managed objects can be viewed as mediators between the network management interface and the hardware in the network. A managed object is modelled by attributes, actions and notifications, figure 4.

Fig. 4. Managed Object Boundary They can usually represent a certain part of the internal state of an element. Actions invoke certain functions which a device can perform. Notifications are spontaneous messages emitted if certain events occur. Managed objects which are alike are grouped together to form Managed Object Classes. Classes can inherit their appearance from other classes and add new features (Goleniewski & Jarrett, 2006). The appearance of managed objects can be formally described using the language GDMO. This language defines a number of so-called templates, which are standard formats used in the definition of a particular aspect of a real device in the networks. A complete object definition is a combination of interrelated templates. The nine templates that conform the actual GDMO standard are next: Managed Object Class Template references all other templates either directly or indirectly to make up the managed object class definition: Package, Parameter, Attribute, Attribute Group, Behaviour, Action and Notification. Package Template, defines a combination of behaviour definitions, attributes, attribute groups, operations, actions, notifications and parameters for later inclusion in a managed object class template. A package can be referenced by more than one managed object class definition. Attribute Template defines individual attribute types used by managed object classes. A single attribute can be referenced by more than one managed object class definition. If desired these attribute types can be combined in a group by using the Attribute Group template. Action Template defines actions that can be performed by a managed object class. These actions can be executed by using the Common management information service (CMIS). In particular the M-ACTION service that requests an action to be performed on a managed object. An action can be referenced by more than one managed object class. The Action template defines the behaviour and syntax of an action. The syntax definitions specify the contents of the action information and action reply fields in CMIS action requests and responses. The Notification template defines the behaviour and syntax of a notification that can be emitted by a managed object class. A notification can be referenced by more than one managed object class. The syntax definitions specify the contents of the event information and event reply fields in CMIP, event report requests and responses.



• The Parameter template specifies and registers the parameter syntaxes and associated behaviour that may be associated with particular attributes, operations and notifications within Package, Attribute, Action and Notification templates. A parameter can be referenced by one or more of each of these templates. The type specified in a Parameter template is used to fill in the ANY DEFINED BY oid construct.
• The Attribute Group template defines one or more attributes that can be referenced as a group. A managed object class definition can include all attributes of a group by referencing the group, rather than referencing each attribute individually. More than one managed object class definition can reference an attribute group. Attribute groups make it easier to collectively perform operations on a large number of individual attributes.
• The Behaviour template describes the expected behaviour of another element of the standard: managed object classes, name bindings, parameters, attributes, actions and notifications. The behaviour may be defined by readable text, high-level languages, formal description techniques, references to standard constructs, or by any combination of the preceding definition methods.
• The Name Binding template specifies instantiation and legal parameters for managed objects. Containment, creation, and deletion constraints are initiated from this template.
The elements that currently form the GDMO standard make no reference to the knowledge base of an expert system. To address this limitation, it is necessary to make changes to the templates of the GDMO standard. We present an extension of the standard GDMO to accommodate the intelligent management requirements (figure 5).

Fig. 5. Extension of GDMO+
Before considering a concrete language for specifying a managed object's knowledge, a methodology for combining knowledge management with GDMO+ definitions is presented. To improve the quality of the descriptions and of the resulting implementations, a formal method for specifying knowledge is desirable. We describe how to achieve this goal using a new extension called GDMO+. An object-oriented logic programming language is presented, which can be used in conjunction with the framework to specify the management knowledge of a managed object. The methodology is independent of the language used and can be combined with other approaches for formalizing knowledge. It is based on the notion of events and supports the object-oriented features of GDMO (Hebrawi, 1995).


Management knowledge is introduced in GDMO+, which defines a number of new templates that contain certain aspects of the expert rules. This extension presents a new element, RULE, which defines the knowledge base of the management expert system. This template groups the knowledge base supplied by an expert in a specific management domain. It allows the management knowledge to be stored in the definition of the resources that form the system to be managed (figure 6).

Fig. 6. Relations between proposed standard templates
The standard we propose contains the new RULE template and its relations to the other templates. Two relationships are essential for the inclusion of knowledge in the definition of the network components: the Managed Object Class template and the Package template. In the standard we propose, both templates have the new property RULES. Let us study both relationships.

7. Template for management of object classes

This template is used to define the different kinds of objects that exist in the system.
<class-label> MANAGED OBJECT CLASS
[DERIVED FROM <class-label> [,<class-label>]*;]
[CHARACTERIZED BY <package-label> [,<package-label>]*;]
[CONDITIONAL PACKAGES <package-label> PRESENT IF condition
[,<package-label> PRESENT IF condition]*;]
REGISTERED AS object-identifier;
DERIVED FROM plays a very important role in determining the inheritance relations, which make it possible to reuse specific characteristics in other classes of managed objects. A further advantage is the reusability of the object classes and therefore of the expert rules defined in them. This template can also contain packages and conditional packages, through the clauses CHARACTERIZED BY and CONDITIONAL PACKAGES. The package template is used to define a package that contains a combination of many characteristics of a managed object class: behaviours, attributes, groups of attributes, operations, parameters, actions and notifications. The structure of the package template is as follows:


<package-label> PACKAGE
[BEHAVIOUR <behaviour-label> [,<behaviour-label>]*;]
[ATTRIBUTES <attribute-label> propertylist [,<parameter-label>]*
[,<attribute-label> propertylist [,<parameter-label>]*]*;]
[ATTRIBUTE GROUPS <group-label> [<attribute-label>]*
[,<group-label> [<attribute-label>]*]*;]
[ACTIONS <action-label> [<parameter-label>]*
[,<action-label> [<parameter-label>]*]*;]
[NOTIFICATIONS <notification-label> [<parameter-label>]*
[,<notification-label> [<parameter-label>]*]*;]
[RULES <rule-label> [,<rule-label>]*;]
REGISTERED AS object-identifier;
In addition to the properties indicated above, we suggest the incorporation of a new property called RULES, which contains all the specifications of the knowledge base for the expert system. The definition above shows the elements of a package template, in which it is possible to observe the new property RULES.

8. Expert rule template

The RULE template permits the normalised definition of the specifications of the expert rule to which it is related. Figure 7 shows a UML representation of the RULE template.

Fig. 7. RULE template in GDMO+ standard.
This template allows a particular managed object class to have properties that provide normalised knowledge of a management domain. A rule is an expression such as: "If the antecedent is true for facts in a list of facts, then the actions specified in the consequent can be carried out" (Brachman, 2004). Each template consists of a specification label, a template type, a list of keywords and a unique ASN.1 object identifier. The structure of the RULE template is shown here:
<rule-label> RULE
[PRIORITY <priority>;]
[BEHAVIOUR <behaviour-label> [,<behaviour-label>]*;]
[IF occurred-event-pattern [,occurred-event-pattern]*]
[THEN sentence [,sentence]*;]
REGISTERED AS object-identifier;
The first element in a template definition is the header. It consists of two parts: the name of the management expert rule and the keyword RULE, which indicates the type of


template. After the header, the following elements compose the normalised definition of an expert rule in the RULE template:
BEHAVIOUR: This construct is used to extend the semantics of previously defined templates. It describes the behaviour of the rule. This element is common to the other templates of the GDMO standard.
PRIORITY: This represents the priority of the rule, that is, the order in which competing rules will be executed.
IF: We can add a logical condition that will be applied to the occurred events or their parameters. It contains all the events that must be true to activate a rule. Those events must be defined in the Notification template. The occurrence of these events is necessary for the activation of the rule and the execution of its associated actions. To establish the firing conditions, the following structure is used:
[IF <pattern> [, <pattern>]* [,condition]*;]
THEN: This gives details of the operations performed when the rule is executed. Those operations must be previously defined in the Action template. These are actions and diagnoses that the management platform carries out as an answer to the network events that occurred. Its formal structure is:
[THEN <sentence> [,<sentence>]*;]
REGISTERED AS object-identifier: This clause identifies the location of the expert rule in the ISO Registration Tree. The identifier is compulsory.
To finish this section, the following example shows an integrated expert rule; it defines the managed object of a port resource that belongs to a switch.
port MANAGED OBJECT CLASS
DERIVED FROM "element_switch":top;
CHARACTERIZED BY portPackage PACKAGE;
REGISTERED AS {nm-MobjectClass 1};
portPackage PACKAGE
ATTRIBUTES portNum GET, portStatus GET, ...
OPERATIONS diagnose, disconnect, connect, ...
NOTIFICATIONS portFailure, portInitializated, ...
RULES transmissionError, powerError, ...;
REGISTERED AS {nm-package 1};
transmissionError RULE
PRIORITY 4;
BEHAVIOUR transmissionErrorBehaviour;
IF (?date ?time1 ?station portFailure ?portNum ALARM)
(?date ?time2 ?station portFailure ?portNum ALARM)
& : (= (portstatus(? ?time1 ?time2)) OFF)
THEN ("Severity:" PRIORITY),
("Diagnostic: damage in the transmission module of port", ?portNum),
("Recommendation: revise the port"),
("Action: put the port off/on");
REGISTERED AS {nm-rule 2};
In the previous example the managed object class is port, which defines the properties corresponding to the port of a switch. This class includes the compulsory portPackage


which contains all the specifications corresponding to the device. The RULES clause indicates which expert rules have been associated with the defined class. The rules themselves are defined using the RULE template. These rules detect anomalies or operating defects produced in the port and suggest the measures necessary to solve the problem. The first rule, transmissionError, is in charge of detecting failures in the data transmission module and gives recommendations on how to solve this failure.
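To illustrate how such a GDMO+ RULE could be consumed by a rule engine, the following Python sketch expresses the transmissionError rule as an ordinary condition/action pair over alarm events. It is only an illustrative mapping, not part of the GDMO+ syntax; the event fields (type, port, time), the status log and the check_port_status helper are assumptions made for the example.

    from datetime import datetime, timedelta

    def check_port_status(port, t1, t2, status_log):
        # Hypothetical helper: recorded status of `port` between two alarm times.
        return status_log.get(port, "ON")

    def transmission_error_condition(events, status_log):
        # Fire when two portFailure alarms are raised for the same port
        # and the port status between them is OFF, as in the GDMO+ rule.
        alarms = [e for e in events if e["type"] == "portFailure"]
        for a in alarms:
            for b in alarms:
                if a is not b and a["port"] == b["port"]:
                    if check_port_status(a["port"], a["time"], b["time"], status_log) == "OFF":
                        return a["port"]
        return None

    def transmission_error_action(port):
        print("Severity: 4")
        print("Diagnostic: damage in the transmission module of port", port)
        print("Recommendation: revise the port")
        print("Action: put the port off/on")

    # Example usage with two simulated alarms on port 7.
    events = [
        {"type": "portFailure", "port": 7, "time": datetime.now()},
        {"type": "portFailure", "port": 7, "time": datetime.now() + timedelta(seconds=30)},
    ]
    port = transmission_error_condition(events, {7: "OFF"})
    if port is not None:
        transmission_error_action(port)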

9. New expert management rules identification

The integration of the management knowledge into the proposed GDMO+ implies that the expert rules are defined together with the specifications of the managed objects. Like the other properties of the GDMO+ standard that are included in the definition of a managed object class (classes, attributes, actions, etc.), the expert management rules are subject to the same registration conventions and register structures. CCITT Rec. X.208 | ISO/IEC 8824 provides a structure for the object identifiers and the properties that compose them, including the new property called RULES. To make this possible, the systems management functions, function(2), and the management information model, smi(3), must be extended so that these recommendations provide definition functions and mechanisms able to accommodate the new property RULES. For this purpose we propose new versions of these recommendations. Figure 8 shows this extension of the sections function(2) and smi(3); the new element RULES(11) gathers all the elements necessary to define the associated knowledge.

Fig. 8. Amplification of the Model of Information and Management Functions
In this way, the definition of a new rule "New_Rule" would be included at the place indicated by CCITT Recommendation X.720 | ISO 10165-1 [ISO92B]. In this case, adding a new management rule to the proposed GDMO+ standard would be registered at the following level: {joint-iso-ccitt ms(9) smi(9) part4(4) rule(11) New_Rule(45)}


10. Integrated environment for Internet network management

This section discusses and analyzes the intelligent management approach within the framework standardized by the Internet Engineering Task Force. Internet management can be compared to the OSI management model. In fact, Internet management uses many of the concepts that existed in OSI at the time SNMP started. As a result, the remarks made above are to some extent also applicable to Internet management. As opposed to OSI management, however, Internet management uses only a small set of management functions for the exchange of management information. Note that objects in the Internet model and those of OSI are different. Internet objects are similar to attributes of an OSI managed object, and an Internet object group can best be described as analogous to an OSI managed object class. In the Internet, an object is more like a variable found in programming languages; it has a syntax and semantics. Each object can have one or more object instances, each of which, in turn, has one or more values. An interesting difference between Internet and OSI is that the Internet management model is simpler than the OSI model. The principal characteristics of its architecture are as follows:
• The cost of adding network management to existing systems is minimal.
• All systems connected to the network should be manageable with SNMP. It should be noted that the SNMP protocol only defines how management information should be exchanged; it does not define which management information exists. Such information is defined by the various MIB standards.
• It should be relatively easy to extend the management capabilities of existing systems, by extending the existing MIBs or adding a new MIB.
Due to these aspects, it is questionable whether OSI management will reach the dominant market position that was originally anticipated. Simple measures that solve all the above-mentioned problems are difficult to find.

10.1 Internet extension to integrate the knowledge management
For two systems to communicate, each must understand the data sent from one to the other. This can be achieved by using a language that has the same syntax and semantics. In the application layer, we use an abstract syntax, which states only how data are arranged and what meaning they have. One of the possible abstract syntaxes is Abstract Syntax Notation One (ASN.1). Between the application layer and the presentation layer, a local set of rules can be used to transform data; however, the syntax of the data transferred between presentation entities must be understood by each end. This is known as the transfer syntax. Abstract syntax and transfer syntax are negotiated at the beginning, during association time. One transfer syntax is the Basic Encoding Rules (BER). BER states how data must be transferred to the other presentation entity. The local syntax can be purely dependent on the local protocols used, in this case SNMP. Figure 9 illustrates the concept of abstract syntax and transfer syntax. ITU-T Recommendations X.208 and X.680 describe standardized ways and steps to define data types and data values (Douglas & Kevin, 2005). The textual MIB representation is called a module and consists of a plain text file. This file uses a subset of the ASN.1 language and is based on the SMI specifications. MIB-II is the most important and probably best known MIB. It contains all the variables to control the major Internet protocols: IP, ICMP, UDP, TCP, EGP and SNMP.
The structure of this MIB is simple: all management variables that belong to the same protocol are grouped together. Within a protocol group there is hardly any additional structure that helps in understanding the various variables within that group.
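As a small illustration of the abstract/transfer syntax distinction discussed above, the sketch below encodes an INTEGER and an OCTET STRING as BER tag-length-value triples. It is a minimal sketch covering only short-form lengths and small non-negative integers, not a complete BER encoder.

    def ber_tlv(tag, value_bytes):
        # Short-form length only (value shorter than 128 bytes).
        assert len(value_bytes) < 128
        return bytes([tag, len(value_bytes)]) + value_bytes

    def ber_integer(n):
        # Minimal encoding of a small non-negative INTEGER (tag 0x02).
        assert 0 <= n < 128
        return ber_tlv(0x02, bytes([n]))

    def ber_octet_string(s):
        # OCTET STRING (tag 0x04).
        return ber_tlv(0x04, s.encode("ascii"))

    # Abstract values: a port number and a diagnostic text;
    # transfer syntax: the concatenated BER octets.
    print(ber_integer(7).hex())               # '020107'
    print(ber_octet_string("portFailure").hex())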


Fig. 9. Abstract Syntax Notation and Basic Encoding Rules
Soon after the definition of MIB-II, other MIBs followed; some of these standardized MIBs cover FDDI, ATM, X.25, X.500 Directory Monitoring, etc. Next to the standardized MIBs there are also a large number of enterprise-specific MIBs. The objects to be managed in the Internet must follow a certain set of rules, defined in the Structure of Management Information (SMI), so that an object defined by group X is compatible with the definition of the same object by group Y. We propose to add a new type named RULES and to incorporate an extension of MIB-II named MIB-II+. We broadly classify the ASN.1 built-in types as follows: simple types, structured types, tagged types and subtypes. We introduce in ASN.1 a new concept, called Expert Rule, to define the new RULES group of MIB-II+. This group is introduced as a textual convention in the MIB-II+ document. The RULES group contains all the aspects related to expert management (figure 10).

Fig. 10. MIB-II+ Objects group
To define an expert rule we use module definitions. Module definitions are primarily used for grouping ASN.1 definitions. They also help in reusing type definitions characterized elsewhere by making use of the IMPORT and EXPORT mechanisms. Modules are analogous to functions in the C language or subroutines in PASCAL. Module definitions appear in the definitions of managed object classes in standards and other documents. The macro used for MIB definition in SNMP was defined in RFC 1155 (Structure of Management Information) and later extended in RFC 1212 (Concise MIB Definitions). The RFC 1155 version is used to define objects in MIB-I. The RFC 1212 version includes more information and is used to define objects in MIB-II, through the OBJECT-TYPE macro. These definitions can be enhanced by including formal descriptions. In this case, the formal parts of the specification and the knowledge must be distinguishable. An easy solution is separation by keywords. The next section shows an example of the definition of an integrated expert rule in the Internet model:


transmissionError OBJECT-TYPE
SYNTAX SEQUENCE {
conditionRule COMPONENTS OF transmissionErrorCondition,
actionRule COMPONENTS OF transmissionErrorAction }
ACCESS read-write
STATUS mandatory
DESCRIPTION "Rule devoted to the detection of errors in the data transmission module"
::= { tcp 13 }
transmissionErrorCondition OBJECT-TYPE
SYNTAX Condition
ACCESS read-write
STATUS mandatory
DESCRIPTION "Information about a particular condition"
INDEX { date, time, local, signal, remote, alarm, operator }
::= { tcpConnTable 1 }
TcpConnEntry ::= SEQUENCE {
date INTEGER,
time TIMETICKS,
local OCTETSTRING,
signal OCTETSTRING,
remote OCTETSTRING,
alarm OCTETSTRING,
operator OCTETSTRING }
transmissionErrorAction OBJECT-TYPE
SYNTAX OCTETSTRING
ACCESS read-only
STATUS mandatory
DESCRIPTION "Action executed when the expert rule fires"
::= { tcp 13 }
This example defines a portion of the Management Information Base (MIB) for use with network management knowledge in TCP/IP-based internets. In particular, it defines objects for managing remote network monitoring devices. This proposal extends that specification by documenting the knowledge management in SMIv2 format. These groups are defined to provide a means of assigning managed objects, and to provide a method for implementers of management agents to know which objects can be administered.

11. Case study and practical experiments

At present, few reports of experience with combined OSI and expert system architectures are available, although the first OSI-based platforms are already on the market. We provide a rule-based expert system applied to fault diagnosis in the telecommunication system of a power utility (Maggiora et al., 2000). The communications system employed to implement the integrated intelligent management prototype belongs to SEVILLANA-ENDESA, a major Spanish power utility. This telecommunications network is made of several


pieces of equipment and systems, which differ from each other in terms of age, technology, network domain, etc. Each kind of equipment communicates with a particular supervisory system, which is responsible for forwarding the equipment's operation and maintenance information to one or more operations centres. The supervisory systems are also responsible for sending commands from the operations staff to the supervised equipment. A supervisory system is composed of two kinds of equipment: remote terminal units (UTR) and central units (UC). The UTRs' function is to gather, encode and transmit information from the managed equipment and to distribute command signals to that equipment (Boyer, 1999), figure 11.

Fig. 11. Network framework
Management and control of that network is based on an expert system called NOMOS+, developed by the Electronic Technology Department of the University of Seville. This tool covers transceivers and multiplex equipment. The expert system integrates the knowledge base and the resource definitions into a single specification. The knowledge base of this system is integrated in the specifications of the resources using our GDMO+ proposal. A workstation has been employed to program the expert system. The resulting expert system has about 200 rules. NOMOS+ is implemented in Brightware's ART*Enterprise, an expert system shell. ART*Enterprise is a set of programming paradigms and tools that are focused on the development of efficient, flexible, and commercially deployable knowledge-based systems. Expert system shells simplify developer interactions by eliminating the developer's concern with operating system requirements. Their use can therefore reduce the design and implementation time of a program considerably.

11.3 Expert system architecture
The integrated expert system we propose is composed of three major components: a knowledge base, an inference engine and a user interface (figure 12). The knowledge base is the core of the system; it is a collection of facts and if-then production rules that represent stored knowledge about the problem domain. The inference engine is the processing unit that solves a given problem by making logical inferences on the facts and rules stored in the knowledge base. In our tool we used ART*Enterprise. The user interface controls the inference engine and manages system input and output. The user interface of our tool contains a preprocessor for parsing GDMO+ specification files and a set of input and output handling routines for managing the system. The user interface components also allow administrators to inspect the definitions of
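A minimal sketch of the knowledge base / inference engine pair described above is given below, assuming rules are held as (priority, condition, action) triples in the spirit of the GDMO+ RULE template; the facts and rules shown are invented for the illustration and do not reproduce the NOMOS+ knowledge base.

    # Minimal forward-chaining inference engine over if-then production rules.
    class Rule:
        def __init__(self, name, priority, condition, action):
            self.name = name
            self.priority = priority      # lower value = executed first
            self.condition = condition    # callable(facts) -> bool
            self.action = action          # callable(facts) -> set of new facts

    def run_engine(rules, facts):
        # Fire rules by priority until no rule adds new facts.
        fired = set()
        changed = True
        while changed:
            changed = False
            for rule in sorted(rules, key=lambda r: r.priority):
                if rule.name not in fired and rule.condition(facts):
                    facts |= rule.action(facts)
                    fired.add(rule.name)
                    changed = True
        return facts

    # Example knowledge base (invented facts and rules).
    rules = [
        Rule("transmissionError", 4,
             lambda f: "portFailure(7)" in f and "portStatus(7,OFF)" in f,
             lambda f: {"diagnosis(port 7 transmission module damaged)"}),
        Rule("notifyOperator", 10,
             lambda f: any(x.startswith("diagnosis") for x in f),
             lambda f: {"action(revise port 7)"}),
    ]
    facts = {"portFailure(7)", "portStatus(7,OFF)"}
    print(run_engine(rules, facts))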


management object classes interactively (Giarratano & Riley, 2005). When new knowledge is uncovered, it needs to be incorporated into the system to keep it updated. The user interface allows administrators to modify or include new expert management rules in the managed object definitions. The System Management window allows a system administrator to perform the following functions:
• Add new managed object classes and change the definitions of existing managed object classes.
• Configure the NOMOS+ workstation for the appropriate functions and actions.
• Set up alarm actions to be executed automatically when specific alarms occur.
• Examine and modify the GDMO+ specifications and log files, configuration files, and other text files using an online visual editor.

Fig. 12. Elements of the prototype NOMOS+
Since the window is used to modify system files, only one administrator can use it at a time. Figure 13a shows an example of a web-based management interface where the system object classes of a network are defined in NOMOS+. There is an object class called muxtiplexCTR190. If the "Modify" button is pressed, the object class definition dialog box of Figure 13b is shown. That dialog box allows different characteristics of an object class to be described, such as attributes, actions and expert rules.

Fig. 13a System Object Classes Dialog Box

Fig. 13b System Object Classes Update


12. Conclusion

In this paper we showed possibilities for applying and integrating artificial intelligence techniques in network management and supervision, using the ISO and Internet network management models. In fact, we believe that these kinds of applications underline the power of CMIS as a simple yet powerful knowledge modelling language, offering possibilities that simpler protocols such as SNMP and CMIP do not offer. We have seen that current management systems are not able to solve the questions raised in the initial parts of this work. Until now the managed objects did not contain the knowledge coming from the knowledge bases of expert systems. The managed objects are not able to use the management knowledge, which collects the management operations and control of a management domain. The point is to solve this problem in order to undertake an intelligent integrated management. We offer an original contribution to include expert rules in the specifications of the network features: a new model named Integrated Intelligent Management, an extension of the standards through GDMO+ and MIB-II+. This paper presented a language for formalizing the knowledge base of expert systems in the OSI and Internet telecommunications management network frameworks. A number of questions which arise when designing such a language have been discussed, and a general framework for the inclusion of formal knowledge management in MIB specifications has been presented. The proposed model was used to formally specify the expert knowledge. An expert system has been implemented and used to map the specification to the language used by the simulation environment. This demonstrates that an expert system is capable of specifying the knowledge of a reasonably sized information model. A large amount of the knowledge could be described in a surprisingly short and easy-to-understand manner. The specification of the NOMOS+ information model showed that a large part of the knowledge management was specified in a rather imperative fashion. Our research has demonstrated a useful and interesting modular approach in the development of a knowledge-based integrated expert system, which can be quite powerful in tackling the huge and enormously wide subject of diagnosis of common problems in network management. It is suggested that future work should aim at further development of this prototype system by adding more modules based on the framework provided by NOMOS+, so that more in-depth knowledge and specialized subjects may be captured; in particular the following are of great interest:
• Development of a design module, possibly a large system, for identifying specific areas such as accounting management, configuration management, performance management and security management.
• Enhancement work in combining and integrating the various modules will be required as the number of modules increases with the growth of the knowledge base.
• Use of external programs and a graphics interface to enhance the functions of the system will be desirable. A graphics interface has not been adopted in NOMOS+ but is an option that can be added in future enhancements to the system.

13. Acknowledgment

The work described in this paper has been supported by the Spanish Ministry of Education and Science (MEC: Ministerio de Educación y Ciencia) through project reference number DPI2006-15467-C02-02.


14. References

Boyer, Stuart A. (1999). Supervisory Control and Data Acquisition. Research Triangle Park, NC: Instrument Society of America.
Black, U.D. (1995). Network Management Standards. McGraw-Hill.
Brachman, Ronald J. & Levesque, Hector J. (2004). Knowledge Representation and Reasoning. San Francisco, CA: Elsevier/Morgan Kaufmann.
Clemm, Alexander. (2006). Network Management Fundamentals. Cisco Press.
Douglas, Mauro & Kevin, Schmidt. (2005). Essential SNMP, 2nd Edition. O'Reilly.
Giarratano, Joseph C. & Riley, Gary D. (2005). Expert Systems: Principles and Programming. Brooks/Cole Publishing Co.
Goleniewski, L. & Jarrett, K. W. (2006). Telecommunications Essentials, Second Edition: The Complete Global Source. Addison Wesley Professional.
Hebrawi, B. (1995). GDMO, Object Modelling and Definition for Network Management. Technology Appraisals.
ISO/IEC DIS 10165-4 / ITU-T. (1993). Recommendation X.722, Information Technology - Part 4: Guidelines for the Definition of Managed Objects (GDMO). International Organization for Standardization and International Electrotechnical Committee.
ISO/IEC and ITU-T. (1998). Information Processing Systems - Open Systems Interconnection - Systems Management Overview. Standard 10040-2, Recommendation X.701.
ITU-T. (1992). Recommendation X.700, Management Framework for Open Systems Interconnection (OSI). CCITT Applications.
ITU-T. (1996). Rec. M.3010, Principles for a Telecommunications Management Network (TMN). Study Group IV.
Maggiora, Paul L. Della, Elliott, Christopher E., Pavone, Robert L., Phelps, Kent J. & Thompson, James M. (2000). Performance and Fault Management. Cisco Press.
Morris, Stephen B. (2003). Network Management, MIBs and MPLS: Principles, Design and Implementation. Addison Wesley.
Negnevitsky, Michael. (2002). Artificial Intelligence: A Guide to Intelligent Systems. New York: Addison Wesley.

5. Robust and Active Trajectory Tracking for an Autonomous Helicopter under Wind Gust

Adnan Martini, François Léonard and Gabriel Abba

Industrial Engineering and Mechanical Production Lab, Ecole Nationale d'Ingénieurs de Metz, France

1. Introduction

High levels of agility, maneuverability and the capability of operating in degraded visual environments and adverse weather conditions are the new trends of helicopter design nowadays. The helicopter flight control system should make these performance requirements achievable by improving tracking performance and disturbance rejection capability. Robustness is one of the critical issues which must be considered in the control system design for such high-performance autonomous helicopters, since any mathematical helicopter model, especially those covering a large flight envelope, will unavoidably have uncertainty due to the empirical representation of aerodynamic forces and moments. The purpose of this chapter is to present the stabilization (tracking) with motion planning of a reduced-order helicopter model having 3DOF (Degrees Of Freedom) (see Fig.1). The latter represents a scale model helicopter mounted on an experimental platform. It deals with the problem of reconstructing the disturbance acting on the autonomous helicopter; the disturbance consists of vertical wind gusts. The objective is to compensate these disturbances and to improve the performance of the control. Consequently, a simple nonlinear 3DOF model of a helicopter with unknown disturbances is used. Three approaches of robust control are then compared via simulations: a robust nonlinear feedback control, an active disturbance rejection control based on a nonlinear extended state observer, and a backstepping control. The design of controls for autonomous flying systems has now become a very challenging area of research, as shown by a large literature (Beji & Abichou, 2005) (Frazzoli et al., 2000) (Koo & Sastry, 1998). Many previous works focus on (linear and nonlinear, robust, ...) control, including a particular attention to the analysis of stability (Mahony & Hamel, 2004), but very few works have addressed the influence of wind gusts acting on the flying system, whereas it is a crucial problem for outdoor applications, especially in an urban environment: as a matter of fact, if the autonomous flying system (especially when this system is relatively light) crosses a crossroads, it can be disturbed by wind gusts and leave its trajectory, which could be critical in a highly dense urban context. In (Martini et al., 2005) and (Martini et al., 2007a), three controllers (nonlinear, H∞ and robust nonlinear feedback) are designed for a nonlinear reduced-order model of a 3DOF helicopter. In (Pflimlin et al., 2004), a control strategy stabilizes the position of the flying vehicle in a wind gust environment, in spite of unknown aerodynamic efforts; it is based on a robust backstepping approach and estimation of the unknown aerodynamic efforts.


Fig.1. Helicopter-platform with wind gust
In recent papers, feedback linearization techniques have been applied to helicopter models. The main difficulty in the application of such an approach is the fact that, for any meaningful selection of outputs, the helicopter dynamics are non-minimum phase, and hence are not directly input-output linearizable. However, it is possible to find good approximations to the helicopter dynamics (Koo & Sastry, 1998) such that the approximate system is input-output linearizable, and bounded tracking can be achieved. Nonlinear control designs previously attempted include neural network based controllers (McLean & Matsuda, 1998), fuzzy control (Sanders et al., 1998), backstepping designs (Mahony & Hamel, 2004), and adaptive control (Dzul et al., 2004). These methods either assume feedback linearizability, which in turn restricts the motion to be around hover, or do not include parametric uncertainties or realistic aerodynamics. Specific issues such as unknown trim conditions that degrade the performance of the helicopter have not been addressed. While adaptive control schemes have been proposed in the aircraft and spacecraft control context, there is a lack of similar work on helicopter control. The non-minimum phase nature of the helicopter dynamics adds to the challenge of finding a stable adaptive controller. (Wei, 2001) addressed the control of nonlinear systems with unknown disturbances, using a disturbance observer based control (DOBC). In (Ifassiouen et al., 2007) a robust sliding mode control structure is designed using the exact feedback linearization procedure of the dynamics of a small-size autonomous helicopter in hover. This chapter is organized as follows. In section 2, a 3DOF Lagrangian model of the disturbed helicopter mounted on an experimental platform is presented. This model can be seen as made of two subsystems (translation and rotation). In section 3, two approaches of robust control design for the reduced-order model are proposed. The application of three approaches of robust control to our disturbed helicopter is analyzed in section 4. The study of model stability is carried out in section 5 and section 6 is devoted to simulation results. Finally, some conclusions are presented in section 7.

2. Model of the disturbed helicopter

Helicopters operate in an environment where task performance can easily be affected by atmospheric turbulence. This chapter discusses the airborne flight test of the VARIO Benzin


Trainer helicopter in turbulent conditions to determine disturbance rejection criteria and to develop a low-speed turbulence model for an autonomous helicopter simulation. A simple approach to modelling the aircraft response to turbulence is described by using an identified model of the VARIO Benzin Trainer to extract representative control inputs that replicate the aircraft response to disturbances. This parametric turbulence model is designed to be scaled for varying levels of turbulence and utilized in ground or in-flight simulation. Hereafter the nonlinear model of the disturbed helicopter (Martini et al., 2005), built starting from a non-disturbed model (Vilchis, 2001), is presented. The Vario helicopter is mounted on an experimental platform and submitted to a vertical wind gust (see Fig.1). It can be noted that the helicopter is in an Out of Ground Effect (OGE) condition. The effects of the compressed air in take-off and landing are then neglected. The Lagrange equation, which describes the system of the helicopter-platform with the disturbance, is given by:

M(q) q̈ + C(q, q̇) q̇ + G(q) = Q(q, q̇, u, vraf)   (1)

where u = [u1 u2]T is the vector of control inputs and q = [z ψ γ]T is the vector of generalized coordinates. The first control u1 is the collective pitch angle (swashplate displacement) of the main rotor. The second control input u2 is the collective pitch angle (swashplate displacement) of the tail rotor. The induced gust velocity is noted vraf. The helicopter altitude is noted z, ψ is the yaw angle and γ is the main rotor azimuth angle. M ∈ R3×3 is the inertia matrix, C ∈ R3×3 is the Coriolis and centrifugal forces matrix, G ∈ R3 represents the vector of conservative forces, and Q(q, q̇, u, vraf) = [fz τz τγ]T is the vector of generalized forces. The variables fz, τz and τγ represent, respectively, the total vertical force, the yaw torque and the main rotor torque in the presence of the wind gust. Finally, the representation of the reduced system of the helicopter, which is subjected to a wind gust, can be expressed as (Martini et al., 2005):

(2)

where ci (i = 0, ..., 17) are numerical aerodynamical constants of the model given in Table 1 (Vilchis, 2001). For example, c0 represents the helicopter weight and c15 = 2ka1sb1s, where a1s and b1s are the longitudinal and lateral flapping angles of the main rotor blades and k is the blade stiffness of the main rotor. Table 2 shows the variations of the main rotor thrust and of the main rotor drag torque (variations of the helicopter parameters) acting on the helicopter due to the presence of the wind gust. These variations are calculated from a nominal position defined as the equilibrium of the helicopter when vraf = 0: γ̇ = −124.63 rad/s, u1 = −4.588 × 10−5, u2 = 5 × 10−7, TMo = −77.3 N and CMo = 4.6 N·m.


Table 1. 3DOF model parameters

Table 2. Variation of forces and torques for different wind gusts
Three robust nonlinear controls adapted to wind gust rejection are now introduced in sections 4.1, 4.2 and 4.3, devoted to the control design of the disturbed helicopter.

3. Control design
3.1 Robust feedback control
Fig.2 shows the configuration of this control (Spong & Vidyasagar, 1989), based on the inverse dynamics of the following mechanical system:

M(q) q̈ + h(q, q̇) = u   (3)

Since the inertia matrix M is invertible, the control u is chosen as follows:

u = M(q) v + h(q, q̇)   (4)

The term v represents a new input to the system. Then the combined system (3)-(4) reduces to:

q̈ = v   (5)

Equation (5) is known as the double integrator system. The nonlinear control law (4) is called the inverse dynamics control and achieves a rather remarkable result, namely that the new system (5) is linear and decoupled. In practice, only nominal values of the model are available and the control is computed as:

u = M̂(q) v + ĥ(q, q̇)   (6)

where M̂ and ĥ represent nominal values of M and h respectively. The uncertainty, or modelling error, is represented by ΔM = M̂ − M and Δh = ĥ − h. With system equation (3) and nonlinear law (6), the system becomes:

q̈ = M(q)⁻¹ (M̂(q) v + ĥ(q, q̇) − h(q, q̇))   (7)


Fig.2. Architecture of robust feedback control

Thus q̈ can be expressed as:

q̈ = v + η(q, q̇, v)   (8)

where η = E v + M⁻¹ Δh and E = M⁻¹ M̂ − I. Defining the state x = [qᵀ q̇ᵀ]ᵀ, in state space the system (8) becomes:

ẋ = A x + B (v + η)   (9)

where A = [0 I; 0 0] and B = [0; I]. Using the error vectors e1 = q − qd and e2 = q̇ − q̇d, with e = [e1ᵀ e2ᵀ]ᵀ, leads to:

ė = A e + B (v − q̈d + η)   (10)

Therefore the problem of tracking the desired trajectory qd(t) becomes one of stabilizing the (time-varying, nonlinear) system (10). The control design to follow is based on the premise that, although the uncertainty η is unknown, it may be possible to estimate "worst case" bounds on η and on its effects on the tracking performance of the system. In order to estimate a worst-case bound on the function η, the following assumptions can be used (Spong & Vidyasagar, 1989):
• Assumption 1: the desired acceleration is bounded, sup t ‖q̈d(t)‖ < ∞.
• Assumption 2: ‖E‖ = ‖M⁻¹M̂ − I‖ ≤ α for some α < 1 and for all q ∈ Rⁿ.
• Assumption 3: ‖Δh‖ ≤ ψ(e, t) for a known function ψ, bounded in t.
Assumption 2 is the most restrictive and shows how accurately the inertia of the system must be estimated in order to use this approach. It turns out, however, that there is always a simple choice of M̂ satisfying Assumption 2. Since the inertia matrix M(q) is uniformly positive definite for all q, there exist positive constants Mmin and Mmax such that:

Mmin I ≤ M(q) ≤ Mmax I   (11)

If we therefore choose M̂ = c I, where c is a constant computed from the bounds in (11), it can be shown that ‖E‖ ≤ α < 1, so that Assumption 2 is satisfied. Finally, the following algorithm may now be used to generate a stabilizing control v:
Step 1: Since the matrix A in (9) is unstable, we first set:


v = q̈d − K e + Δv   (12)

where K = [K1 K2], with K1 = diag{ω1², ..., ωn²} and K2 = diag{2ζ1ω1, ..., 2ζnωn}. The desired trajectory qd(t) and the additional term Δv will be used to attenuate the effects of the uncertainty and of the disturbance. Then we have:

ė = Ā e + B (Δv + η)   (13)

where Ā = A − BK is Hurwitz.
Step 2: Given the system (13), suppose we can find a continuous function ρ(e, t), bounded in t, satisfying the inequality:

‖η‖ ≤ ρ(e, t)   (14)

The function ρ can be defined implicitly as follows. Using Assumptions 1-3 and (14), we obtain the estimate (15). This definition of ρ makes sense since 0 < α < 1, and we may solve for ρ as in (16). Note that whatever Δv is chosen must satisfy (14).
Step 3: Since Ā is Hurwitz, choose an n × n symmetric, positive definite matrix Q and let P be the unique positive definite symmetric solution of the Lyapunov equation:

Āᵀ P + P Ā = −Q   (17)

Step 4: Choose the outer-loop control Δv according to:

Δv = −ρ(e, t) sign(Bᵀ P e)   (18)

which satisfies (14). Such a control enables us to remove the principal influence of the wind gust.

3.2 Active disturbance rejection control
The primary reason to use a closed-loop control is that it can treat the variations and uncertainties of the model dynamics and the external unknown forces which influence the behavior of the model. In this work, a generic design methodology is proposed to treat the combination of these two quantities, denoted as the disturbance. A second-order system described by the following equation is considered (Gao et al., 2001) (Hou et al., 2001):

ÿ = f(y, ẏ, p) + b u   (19)


where f(·) represents the dynamics of the model and the disturbance, p is the unknown disturbance input, u is the control input, and y is the measured output. It is assumed that the value of the parameter b is given. Here f(·) is a nonlinear function. An alternative method is presented by (Han, 1999) as follows. The system in (19) is first augmented:

ẋ1 = x2,  ẋ2 = x3 + b u,  ẋ3 = ḟ,  y = x1   (20)

where x3 = f(y, ẏ, p) is treated as an additional state. Here f and ḟ are unknown. By considering f(y, ẏ, p) as a state, it can be estimated with a state estimator. Han (1999) proposed a nonlinear observer for (20), given by (21)-(22), where the observer error is e = y − x̂1 and the observer gain functions are defined in (23). The observer then reduces to the following set of state equations, called the extended state observer (ESO):

(24)

Fig.3. ADRC structure


The active disturbance rejection control (ADRC) is then defined as a method of control in which the value of f(y, ẏ, p) is estimated in real time and compensated by the control signal u. Since x̂3 tracks f, it is used to actively cancel f by applying u = (u0 − x̂3)/b. This expression reduces the system to a double integrator with unity gain, which can be controlled with a PD controller of the form u0 = kp(r − x̂1) − kd x̂2, where r is the reference input. The observer gains Li and the controller gains kp and kd can be calculated by pole placement. The configuration of ADRC is presented in fig.3.
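As an illustration of this scheme, the following sketch simulates a linear ESO and PD controller on a toy second-order plant with an unknown disturbance term. It uses a linear observer with a triple pole at −ω0 rather than Han's nonlinear gains, and all numerical values (ω0, kp, kd, the plant and the disturbance) are choices made for the example, not the chapter's helicopter model.

    import numpy as np

    # Plant: y'' = f(y, y', t) + b*u, with f unknown to the controller.
    b = 1.0
    def f_unknown(y, yd, t):
        return -0.5 * yd - 2.0 * y + 0.8 * np.sin(0.5 * t)   # example dynamics + disturbance

    # Linear ESO with a triple pole at -w0: gains from (s + w0)^3.
    w0 = 10.0
    L1, L2, L3 = 3 * w0, 3 * w0**2, w0**3
    # PD gains from a double pole at -wc for the resulting double integrator.
    wc = 2.0
    kp, kd = wc**2, 2 * wc

    dt, T = 1e-3, 10.0
    y, yd = 0.0, 0.0             # plant states
    z1, z2, z3 = 0.0, 0.0, 0.0   # observer states (estimates of y, y', f)
    r = 1.0                      # constant reference

    for k in range(int(T / dt)):
        t = k * dt
        # ADRC law: cancel the estimated total disturbance z3.
        u0 = kp * (r - z1) - kd * z2
        u = (u0 - z3) / b
        # Extended state observer update (Euler integration).
        e = y - z1
        z1 += dt * (z2 + L1 * e)
        z2 += dt * (z3 + b * u + L2 * e)
        z3 += dt * (L3 * e)
        # Plant update (Euler integration).
        ydd = f_unknown(y, yd, t) + b * u
        y += dt * yd
        yd += dt * ydd

    print("final tracking error:", r - y, " estimated f:", z3)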

4. Control of disturbed helicopter
4.1 Robust feedback control
4.1.1 Control of altitude z
We apply this robust method to control the altitude dynamics z of our helicopter. Let us recall the equation which describes the altitude under the effect of a wind gust: (25)

(26) The value of |vraf| = 0.68 m/s corresponds to an average wind gust. In that case, we have the following bounds: 5 × 10−5 ≤ M1 ≤ 22.2 × 10−5 and −2.2 × 10−3 ≤ h1 ≤ 1.2 × 10−3. Note: we add an integrator to the control law to reduce the static error of the system and to attenuate the effects of the wind gust, which is located at low frequency (ωraf ≤ 7 rad/s). We then obtain (Martini et al., 2007b): (27) and the value of Δv becomes: Δv1 = −ρ1(e, t) sign(287e1 + 220e2 + 62e3). Moreover ρ1 = 1.7 v1 + 184.
4.1.2 Control of yaw angle ψ
The control law for the yaw angle is: (28) We have:


(29)

Using (26), we find the following values: −2.7 × 10−4 ≤ M2 ≤ −6.1 × 10−5 and −1.3 × 10−3 ≤ h2 ≤ 0.16. We also add an integrator to the control law of the yaw angle (Martini et al., 2007b): (30) where we obtain ρ2 = 1.7 v2 + 1614.6, and the value of Δv becomes: Δv2 = −ρ2(e, t) sign(217e1 + 87e2 + 4e3). On the other hand, the variations of the inertia matrices M1(q) and M2(q) from their equilibrium value (corresponding to γ̇ = −124.63 rad/s) are shown in Table 3. It appears, in this table, that when γ̇ varies from −99.5 to −209.4 rad/s an important variation of the coefficients of the matrices M1(q) and M2(q), of about 65%, is obtained.

Table 3. Variations of the inertia matrices M1 and M2
4.2 Active disturbance rejection control
Two approaches are proposed here (Martini et al., 2007a). The first uses a feedback and supposes the knowledge of a precise model of the helicopter. For the second approach, only two parameters of the helicopter are necessary, the remainder of the model being regarded as a disturbance, together with the wind gust.
• Approach 1 (ADRC): Firstly, the nonlinear terms of the non-disturbed model (vraf = 0) are compensated by introducing two new controls v1 and v2 such that: (31) Since vraf ≠ 0, a nonlinear system of equations is obtained: (32)
• Approach 2 (ADRCM): By introducing the two new controls ú1 and ú2 such that:


a different nonlinear system of equations is obtained:

(33)

The systems (32) and (33) can be written in the following form: (34) with b = 1 and u = v1 or v2 for approach 1, where:

(35)

and b = 1 and u = ú1 or ú2 for approach 2 (ADRCM), where:

(36)

Concerning the first approach, an observer is built: • for altitude z:

(37) where ez = z − ẑ1 is the observer error and gi(ei, αi, δi) is defined as an exponential function with modified gain:

(38)

with 0 < αi < 1 and 0 < δi ≤ 1. A PID controller is used instead of a PD in order to attenuate the effects of the disturbance: (39) The control signal v1 takes into account the terms which depend on the observer. The fourth term, which also comes from the observer, is added to eliminate the effect of the disturbance in this system.
• for the yaw angle ψ:


(40) where eψ is the observer error, and gi(eψ, αi, δi) is defined as an exponential function with modified gain:

(41) and (42), where zd and ψd are the desired trajectories. The PID parameters are designed to obtain two dominant poles in closed loop, one pair for the z dynamics and one pair for the ψ dynamics. Approach 2 uses the same observer with the same gains; simply, (−x̂3) and (−x̂6) now compensate, respectively, the terms fz and fψ defined in (36).

4.3 Backstepping control
To control the altitude dynamics z and the yaw angle ψ, the steps are as follows:
1. Compensation of the nonlinear terms of the non-disturbed model (vraf = 0) by introducing two new controls Vz and Vψ such that: (43) With these two new controls, the following system of equations is obtained: (44) (45)
2.

Stabilization by backstepping control: we start by controlling the altitude z, then the yaw angle ψ.

4.3.1 Control of altitude z
We already saw that z̈ = Vz + d1(γ̇, vraf). The controller generated by backstepping is generally a PD (proportional-derivative) controller. Such a PD controller is not able to cancel external disturbances with non-zero average unless they act at the output of an integrating process. In order to attenuate the errors due to static disturbances, a solution consists in equipping the obtained regulators with an integral action (Benaskeur et al., 2000). The main idea is to


introduce, in a virtual way, an integrator in the transfer function of the process and to carry out the development of the control law in the conventional way using the backstepping method. The state equations of the z dynamics, augmented by an integrator, are given by:

(46) The introduction of an integrator into the process only increases the order of the process state. Hereafter the backstepping control is developed.
Step 1: Firstly, we ask the output to track a desired trajectory x1d. One introduces the trajectory error ξ1 = x1d − x1 and its derivative (47), which are both associated with the following Lyapunov candidate function: (48) The derivative of the Lyapunov function is then evaluated. The state x2 is used as an intermediate control in order to guarantee the stability of (47); for that we define a virtual control and write its derivative.
Step 2: A new error appears: (49) In order to attenuate this error, the preceding candidate function (48) is augmented by another term, which deals with the new error introduced previously: (50) The state x3 can be used as an intermediate control in (49). This state is chosen in such a way that the expression between brackets is made equal to −a2ξ2. The virtual control and its derivative are obtained accordingly.
Step 3: Another error term is introduced: (51) and the Lyapunov function (50) is augmented once again, to take the following form: (52)


its derivative:

(53)

The control Vz should be selected in order to make the expression between the preceding brackets equal to −a3ξ3 for d1 = 0:

(54) With the relation (47), we obtain the control law. These values, replaced in the control law, give for d1 = 0:

(55)

If we replace (54) in (53), we finally obtain: (56)
Step 4: It is here that the design of the control law by the backstepping method stops. The integrator, which was introduced into the process, is transferred to the control law, which gives the final control law: (57)
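Since the intermediate expressions above are compressed, a generic two-step integrator-backstepping calculation may help fix ideas. The LaTeX sketch below is written for the plain double integrator $\dot{x}_1 = x_2$, $\dot{x}_2 = u$ with generic gains $a_1, a_2$; it illustrates the structure of the first two steps only and is not the authors' exact control law (57).

    \xi_1 = x_{1d} - x_1, \qquad V_1 = \tfrac{1}{2}\xi_1^2, \qquad
    \alpha_1 = \dot{x}_{1d} + a_1 \xi_1, \qquad \xi_2 = x_2 - \alpha_1
    \;\Rightarrow\; \dot{\xi}_1 = -a_1\xi_1 - \xi_2,

    V_2 = V_1 + \tfrac{1}{2}\xi_2^2, \qquad
    u = \dot{\alpha}_1 + \xi_1 - a_2\xi_2
    \;\Rightarrow\; \dot{V}_2 = -a_1\xi_1^2 - a_2\xi_2^2 \le 0 .

The virtual control α1 stabilizes the first error, the true control u stabilizes the second, and the cross term −ξ1ξ2 is cancelled by the +ξ1 term in u, exactly the pattern followed (with an extra integrator state) in steps 1-4 above.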

4.3.2 Control of yaw angle ψ
The calculation of the yaw angle control is also based on backstepping control (Zhao & Kanellakopoulos, 1998), dealing with the problem of attenuating the disturbance which acts on the lateral dynamics. The representation of the yaw state dynamics together with the angular velocity of the main rotor is:

(58) The backstepping design then proceeds as follows:


Step 1: We start with the error variable ξ4 = x4 − x4d, whose derivative can be expressed in terms of x5; here x5 is viewed as the virtual control, which introduces the following error variable: (59) where α4 is the first stabilizing function to be determined. Then we can represent ξ̇4 as: (60) In order to design α4, we choose a partial Lyapunov function V4 and evaluate its time derivative along the solutions of (60); the choice of α4 then follows.

Step 2: According to the computation of Step 1, driving ξ5 to zero will ensure that V̇4 is negative definite in ξ4. We need to modify the Lyapunov function to include the error variable ξ5: (61) We rewrite ξ̇5 as:

(62)

In this equation, γ̇ is viewed as the virtual control. This is a departure from the usual backstepping design, which only employs state variables as virtual controls. In this case, however, this simple modification is not only dictated by the structure of the system, but also yields significant improvements in the closed-loop system response. A new error variable is introduced, and α5 is yet to be computed. Then (62) becomes:

(63)

From (63), the choice of α5 provides: (64)

Step 3: Similarly to the previous steps, we design the stabilizing function w2 in this step. To achieve that, we first define the error variable ξ6 and its time derivative:

(65)


Therefore, along the solutions of ξ4, ξ5 and ξ6, we can express the time derivative of the partial Lyapunov function V6 as:

(66)

In the above expression (66), our choice of ẇ2 is: (67) Then one replaces (67) in (65), and the derivative of V6 becomes:

(68)

The integral of (67) provides w2, and Vψ = w2 + γ̇. In this way, the yaw angle control is calculated.

5. Stability analysis of the ADRC control

In this section, the stability of the perturbed helicopter controlled with the observer-based control law (ADRC) is considered. To simplify this study, the demonstration is done with one input and one output, as in (Hauser et al., 1992), and the result is applicable to the other outputs. Let us first define the altitude error using equations (32), (37) and the control (39); we can write:

(69)

where A is a stable matrix determined by pole placement, and η represents the zero dynamics of our system, η = γ̇ − γ̇eq, where γ̇eq = −124.63 rad/s is the equilibrium of the main rotor angular speed:


is the observer error. Hereafter, we consider the case of a linear observer, so that:

(70)

which can be written in a compact form in which Â is a stable matrix determined by pole placement.
Theorem: Suppose that:
• the zero dynamics of the system η̇ = β(z, η, vraf) (represented here by the γ̇ dynamics) are locally exponentially stable, and
• the amplitude of vraf is sufficiently small and the function f(z, η, vraf) is bounded and small enough (i.e. l̂u < 1/5, see equation (72) for the definition of the bound l̂u).
Then, for desired trajectories with sufficiently small values and derivatives (zd, żd, z̈d), the states of the system (32) and of the observer (37) will be bounded.
Proof: Since the zero dynamics of the model are assumed to be exponentially stable, a converse Lyapunov theorem implies the existence of a Lyapunov function V1(η) for the system:

η̇ = β(0, η, 0), satisfying standard bounds for some positive constants k1, k2, k3 and k4. We first show that e, ê and η are bounded. To this end, consider as a Lyapunov function for the error system ((69) and (70)): (71)

where P, P̂ > 0 are chosen so that AᵀP + PA = −I and ÂᵀP̂ + P̂Â = −I (possible since A and Â are Hurwitz), and μ and ε are positive constants to be determined later. Note that, by assumption, zd and its first derivatives are bounded. The functions β(z, η, vraf) and f(z, η, vraf) are locally Lipschitz (since f is bounded) with f(0, 0, 0) = 0, so we have:

(72)

with lq and l̂u two positive reals. Using these bounds and the properties of V1(·), we have:

(73)

Taking the derivative of V (., ., .) along the trajectory, we find:


Define:

Then, for all μ ≤ μ0 and for ε2 ≤ ε ≤ ε1, we have V̇ < 0 whenever ‖e‖, ‖ê‖ or ‖η‖ is large, which implies that ê, e and η, and hence z, x̂ and η, are bounded. The above analysis is valid in a neighborhood of the origin. By choosing bd and vraf sufficiently small and with appropriate initial conditions, we can guarantee that the state will remain in a small neighborhood, which implies that the effect of the disturbance on the closed loop can be attenuated. Moreover, if vraf → 0 then l̂u → 0, ε1 → ∞ and ε2 → 1 + 4(‖B‖ ‖P‖)², so that the constraint l̂u < 1/5 is naturally satisfied for small vraf.

6. Results in simulation

Robust nonlinear feedback control (RNFC), active disturbance rejection control based on a nonlinear extended state observer (ADRC) and backstepping control (BACK) are now compared via simulations.
1. RNFC: The numerical values for the RNFC are the following:
• For state variable z: {K1 = 84, K2 = 24, K3 = 80} for ω1 = 2 rad/s, which is the bandwidth of the closed loop in z (the numerical values are calculated by pole placement).
• For state variable ψ: {K4 = 525, K5 = 60, K6 = 1250} for ω2 = 5 rad/s, which is

the bandwidth of the closed loop in ψ.
2. ADRC: The numerical values for the ADRC are the following:
a. For state variable z: k1 = 24, k2 = 84 and k3 = 80 (the numerical values are calculated by pole placement). Choosing a triple observer pole located at ω0z such that ω0z = (3 ∼ 5) ωc1, one can take ω0z = 10 rad/s, α1 = 0.5, δ1 = 0.1, and using the pole placement method the gains of the observer for the case |e| ≤ δ (i.e. the linear observer) can be evaluated:

(74)


which leads to: Li = {9.5, 94.87, 316.23}, i ∈ [1, 2, 3].
b. For state variable ψ: k4 = 60, k5 = 525, k6 = 1250, ω0ψ = 25 rad/s, α2 = 0.5 and δ2 = 0.025. By the same method as in (74) one can find the observer gains: Li = {11.86, 296.46, 2.47 × 10³}, i ∈ [4, 5, 6].
3. BACK: The regulation parameters (a1, a2, a3, a4, a5, a6) for the BACK controller were calculated to obtain two dominant poles in closed loop such that ω1 = 2 rad/s, which defines the bandwidth of the closed loop in z, and ω2 = 5 rad/s for ψ.
a.

The closed-loop dynamics of the z dynamics with d1(γ̇, vraf) = 0 are given by (Benaskeur et al., 2000):

(75) Eigenvalues of A0 can be calculated by solving: (76) If one specifies as the desired dynamics one dominant pole at −κ and the two other poles at −10κ, one must solve: (77) which leads to:

For κ = ω1 = 2 rad/s, resolving the above equations, we find 4 positive solutions for every parameter (see Table 1). The solution a1 = 21, a2 = 19, a3 = 1.95 has been used for simulation.
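For reference, the desired characteristic polynomial corresponding to one dominant pole at −κ and a double pole at −10κ expands as follows; this is only the generic pole-placement target, and how it maps onto the parameters a1, a2, a3 depends on the closed-loop matrix A0, which is not reproduced here:

    (s+\kappa)(s+10\kappa)^2 = s^3 + 21\kappa\, s^2 + 120\kappa^2 s + 100\kappa^3,
    \qquad \kappa = 2 \;\Rightarrow\; s^3 + 42 s^2 + 480 s + 800 .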

Table 1. Regulation parameters of z and ψ-dynamics b. The closed-loop dynamics of the ψ-dynamics with d3(Vz, γ , vvraf ) = 0 is given by:


(78) Eigenvalues of B0 can be calculated by solving: (79) By using the same development as for the z dynamics, one can write:

For κ = ω2 = 5 rad/s, resolving the above equations, we again find 4 positive solutions for every parameter (see Table 1). As justified in the annex, the solution a4 = 4.97, a5 = 49, a6 = 51 has been used for simulation. The induced gust velocity acting on the main rotor is chosen as (G.D. Padfield, 1996): (80) where td1 = t − 70 and td2 = t − 220, and the value of 0.042 represents

where V in m/s is the

height rise speed of the helicopter and vgm = 0.68m/s is the gust density. This density corresponds to an average wind gust, and Lu = 1.5m is its length (see Fig.5). The take-off time at t = toff = 50 s is imposed and the following desired trajectory is used (Vilchis et al., 2003):

Fig. 4. Trajectories in z and ψ

Fig. 5. Induced gust velocity vraf

(81)


where ta = 130s and tb = 20π + 130s,

(82)

and tc = 120 s and td = 180 s. The following initial conditions are applied: z(0) = −0.2 m, ż(0) = 0, ψ(0) = 0, ψ̇(0) = 0 and γ̇(0) = −99.5 rad/s. A band-limited white noise of variance 3 mm for z and 1° for ψ has been added to the measurements of z and ψ, respectively, for the three controls. The compensation of this noise is done using a second-order Butterworth low-pass filter, whose crossover frequency is ωcz = 12 rad/s for z and ωcψ = 20 rad/s for ψ. Fig. 4 shows the desired trajectories in z and ψ. One can observe that γ̇ → −124.6 rad/s remains bounded away from zero during the flight. For the chosen trajectories and gains, γ̇ converges rapidly to a constant value (see Fig. 7). This is an interesting point to note, since it shows that the dynamics and feedback control yield flight conditions close to those of real helicopters, which fly with a constant γ̇ thanks to a local regulation feedback of the main rotor speed (which does not exist on the VARIO scale model helicopter). One can also notice that the main rotor angular speed is similar for the three controls, as illustrated in Fig. 7. The difference between the three controls appears in Fig. 6, where the tracking errors in z are smaller with the (BACK) and (ADRC) controls than with the (RNFC) control. For ψ the situation is different. This is explained by the use of a PID controller for (RNFC) and (ADRC) but a PD controller for the (BACK) control of ψ (Fig. 6). Here, the (ADRC) and (BACK) controls show a robust behavior in the presence of noise.

Fig. 6. Tracking error in z and in ψ.

Fig. 7. Variations of the main rotor thrust TM and the main rotor angular speed γ̇.


One can see in Fig. 7 that the main rotor thrust converges to values that compensate the helicopter weight, the drag force and the effect of the disturbance on the helicopter. The (RNFC) control keeps the main rotor thrust TM closer to its equilibrium value than the other controls, and the RNFC control is less sensitive to noise. Fig. 9 shows the effectiveness of the observer: x̂3 and fz(y, ẏ, w) are very close, as are x̂6 and fψ(y, ẏ, w). Observer errors are presented in Fig. 8.

Fig. 8. Observer error in z and in ψ

Fig. 9. Estimation of fz and of fψ

If one keeps the same tuning parameters for the three controls and applies a larger wind gust (vraf = 3 m/s), the (BACK) control gives better results than the (ADRC) and (RNFC) controls (see Fig. 10).

Fig. 10. Large disturbance vraf = 3 m/s

Fig. 11 shows the tracking error in z and ψ for two different ADRC controls. These errors are quite similar for approach 1 (ADRC) and approach 2 (ADRCM). Nevertheless, ADRCM induces a larger error at take-off, which can be explained by the fact that the control depends directly on the angular velocity of the main rotor: the latter needs some time to reach its equilibrium value, as seen in Fig. 6. The same argument can be invoked to explain the saturation of the ADRCM controls u1 and u2, as illustrated in Fig. 12.


Fig. 11. Tracking error in z and ψ for both approaches 1 and 2 of ADRC control.

Fig. 12. Inputs u1 and u2 for both approaches 1 and 2 of ADRC control.

7. Conclusion

In this chapter, a robust nonlinear feedback control (RNFC), an active disturbance rejection control based on a nonlinear extended state observer (ADRC) and a backstepping control (BACK) have been applied to the control of a drone helicopter disturbed by a wind gust. The robust nonlinear feedback control technique uses the second method of Lyapunov, and an additional feedback provides an extra term Δv to overcome the effects of the uncertainty and disturbances. The basis of ADRC is the extended state observer. The state estimation and the compensation of the changes of helicopter parameters and disturbance variations are implemented by the ESO and NESO. By using the ESO, a complete decoupling of the helicopter is obtained. The major advantage of the proposed method is that the closed-loop characteristics of the helicopter system do not depend on the exact mathematical model of the system. The backstepping technique should not be viewed as a rigid design procedure, but rather as a design philosophy which can be bent and twisted to accommodate the specific needs of the system at hand. In the particular example of an autonomous helicopter, we were able to exploit the flexibility of backstepping with respect to the selection of virtual controls, initial stabilizing functions and Lyapunov functions. Detailed comparisons were made between the three control methods. It is concluded that the three proposed control algorithms produce satisfactory dynamic performance. Even for a large disturbance, the proposed backstepping (BACK) and (ADRC) control systems are robust against modeling uncertainties and external disturbances in various operating conditions. It is also shown that (BACK) and (ADRC) achieve better tracking and stabilization with the prescribed performance requirements. For practical reasons, the second ADRC approach is the best one because it only requires the knowledge of a few aerodynamic parameters of the helicopter (dimensions of the blades of the main and tail rotors and the helicopter weight), whereas the other approaches (first ADRC


approach, RNFC and BACK) depend on all the aerodynamic parameters which generate the forces and torques acting on the helicopter. For the first ADRC control, a stability analysis has been carried out in which the boundedness of the states of the helicopter and of the observer is proved in spite of the presence of the wind gust. As illustrated in Tables 2 and 3, the wind gust induces large variations of the helicopter parameters, and the controls considered in this work can efficiently handle these parameter deviations. As a perspective, this work is being extended to a model of a 7-DOF VARIO helicopter, on which ADRC and linearizing control will be tested in simulation. The first results using ADRC control on this 7-DOF helicopter have been obtained recently (see (Martini et al., 2008)). Moreover, our control methodologies will also be implemented on a new platform to be built using a Tiny CP3 helicopter.

8. References Beji, L. and A. Abichou (2005). Trajectory generation and tracking of a minirotorcraft. Proceedings of the 2005 IEEE International Conference on Robotics and Automation, Spain, 2618–2623. Benaskeur, A., L. Paquin, and A. Desbiens (2000). Toward industrial control applications of the backstepping. Process Control and Instrumentation, 62–67. Dzul, A., R. Lozano, and P. Castillo (2004). Adaptive control for a radio-controlled helicopter in a vertical flying stand. International journal of adaptive control and signal processing 18, 473–485. Frazzoli, E., M. Dahleh, and E. Feron (2000). Trajectory tracking control design for autonomous helicopters using a backstepping algorithm. Proceedings of the American Control Conference Chicago, Illinois, 4102–4107. Gao, Z., S. Hu, and F. Jiang (2001). A novel motion control design approach based on active disturbance rejection. pp. 4877–4882. Orlando, Florida USA: Proceedings of the 40th IEEE Conference on Decision and Control. G.D.Padfield (1996). Helicopter Flight Dynamics: The Theory and Application of Flying Qualities and Simulation Modeling. Blackwell Science LTD. Han, J. (1999). Nonlinear design methods for control systems. Beijing, China: The Proc of the 14th IFAC World Congress. Hauser, J., S. Sastry, and G. Meyer (1992). Nonlinear control design for slightly nonminimum phase systems: Applications to v/stol aircraft. Automatica 28 (4), 665–679. Hou, Y., F. J. Z. Gao, and B. Boulter (2001). Active disturbance rejection control for web tension regulation. Proceedings of the 40th IEEE Conference on Decision and Control, Orlando, Florida USA, 4974–4979. Ifassiouen, H., M. Guisser, and H. Medromi (2007). Robust nonlinear control of a miniature autonomous helicopter using sliding mode control structure. International Journal Of Applied Mathematics and Computer Sciences 4 (1), 31–36. Koo, T. and S. Sastry (1998). Output tracking control design of a helicopter model based on approximate linearization. The 37th Conference on Decision and Control (Florida, USA) 4, 3636–3640. Mahony, R. and T. Hamel (2004). Robust trajectory tracking for a scale model autonomous helicopter. Int. J. Robust Nonlinear Control 14, 1035–1059.


Martini, A., F. Léonard, and G. Abba (2005). Suivi de trajectoire d’un hélicoptère drone sous rafale de vent[in french]. CFM 17ème Congrès Français de Mécanique. Troyes, France, CD ROM.N0.467. Martini, A., F. Léonard, and G. Abba (2007a). Robust and active trajectory tracking for an autonomous helicopter under wind gust. ICINCO International Conference on Informatics in Control, Automation and Robotics, Angers, France 2, 333–340. Martini, A., F. Léonard, and G. Abba (2007b). Suivi robuste de trajectoires d’un hélicoptère drone sous rafale de vent. Revue SEE e-STA 4, 50–55. Martini, A., F. Léonard, and G. Abba (2008, 22 -26 Septembre). Robust nonlinear control and stability analysis of a 7dof model-scale helicopter under wind gust. In IEEE/RSJ, IROS, International Conference of Intelligent Robots and Systems, to appear. NICE, France. McLean, D. and H. Matsuda (1998). Helicopter station-keeping: comparing LQR, fuzzy-logic and neural-net controllers. Engineering Application of Artificial Intelligence 11, 411– 418. Pflimlin, J., P. Soures, and T. Hamel (2004). Hovering flight stabilization in wind gusts for ducted fan uav. Proc. 43 rd IEEE Conference on Decision and Control CDC, Atlantis, Paradise Island, The Bahamas 4, 3491– 3496. Sanders, C., P. DeBitetto, E. Feron, H. Vuong, and N. Leveson (1998). Hierarchical control of small autonomous helicopters. 37th IEEE Conference on Decision and Control 4, 3629 – 3634. Spong, M. and M. Vidyasagar (1989). Robot Dynamics and Control. John Willey and Sons. Vilchis, A. (2001). Modélisation et Commande d’Hélicoptère. Ph. D. thesis, Institut National Polytechnique de Grenoble. Vilchis, A., B. Brogliato, L. Dzul, and R. Lozano (2003). Nonlinear modeling and control of helicopters. Automatica 39, 1583 –1596. Wei, W. (2001). Approximate output regulation of a class of nonlinear systems. Journal of Process Control 11, 69–80. Zhao, J. and I. Kanellakopoulos (1998). Flexible backstepping design for tracking and disturbance attenuation. International journal of robust and nonlinear control 8, 331– 348.

6 An Artificial Neural Network Based Learning Method for Mobile Robot Localization Matthew Conforth and Yan Meng

Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030, USA

1. Introduction

One of the most used artificial neural network (ANN) models is the well-known Multi-Layer Perceptron (MLP) [Haykin, 1998]. The training process of MLPs for pattern classification problems consists of two tasks: the first one is the selection of an appropriate architecture for the problem, and the second is the adjustment of the connection weights of the network. Extensive research work has been conducted to attack this issue. Global search techniques, with the ability to broaden the search space in the attempt to avoid local minima, have been used for connection weight adjustment or architecture optimization of MLPs, such as evolutionary algorithms (EA) [Eiben & Smith, 2003], simulated annealing (SA) [Kirkpatrick et al., 1983], tabu search (TS) [Glover, 1986], ant colony optimization (ACO) [Dorigo et al., 1996] and particle swarm optimization (PSO) [Kennedy & Eberhart, 1995]. The NeuroEvolution of Augmenting Topologies (NEAT) method [Stanley & Miikkulainen, 2002] tunes the neural network topology and connection weights simultaneously using an evolutionary computation method. It evolves efficient ANN solutions quickly by complexifying and optimizing simultaneously, and it achieves performance that is superior to comparable fixed-topology methods. In [Patan & Parisini, 2002] the stochastic methods Adaptive Random Search (ARS) and Simultaneous Perturbation Stochastic Approximation (SPSA) outperformed extended dynamic backpropagation at training a dynamic neural network to control a sugar factory actuator. Recently, artificial neural network based methods have been applied to robotic systems. In [Racz & Dubrawski, 1994], an ANN was trained to estimate a robot's position relative to a particular local object. Robot localization was achieved by using entropy nets to implement a regression tree as an ANN in [Sethi & Yu, 1990]. An ANN was trained in [Choi & Oh, 2007] to correct the pose estimates from odometry using ultrasonic sensors. In this paper, we propose an artificial neural network learning method for mobile robot localization, which combines two popular swarm-inspired methods from the computational intelligence area, Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO), to train the ANN models. ACO was inspired by the behavior of ants and has many successful applications in discrete optimization problems. The particle swarm concept originated as a simulation of a simplified social system. It was found that the particle swarm model could be used as an optimizer. These algorithms have already been applied to solving


problems of clustering, data mining, dynamic task allocation, and optimization [Lhotska et al., 2006]. The basic idea of the proposed SWarm Intelligence-based Reinforcement Learning (SWIRL) method is that ACO is used to optimize the topology structure of the ANN models, while PSO is used to adjust the connection weights of the ANN models based on the optimized topology structure. This is designed to split the problem such that ACO and PSO can both operate in the environment they are most suited for. ACO is ideally applied to finding paths through graphs. One can treat the ANN's neurons as vertices and its connections as directed edges, thereby transforming the topology design into a graph problem. PSO is best used to find the global maximum or minimum in a real-valued search space. Considering each connection plus one associated fitness score as orthogonal dimensions in a hyperspace, each possible weight configuration is merely a point in that hyperspace. Finding the optimal weights is thus reduced to finding the global maximum of the fitness function in that hyperspace.

The chapter is organized as follows. Section 2 introduces two swarm intelligence based methods: ant colony optimization and particle swarm optimization. The proposed SWIRL method is described in Section 3. Simulation results and discussions using the SWIRL method for mobile robot localization are presented in Section 4. Conclusions are given in Section 5.

2. Swarm intelligence

2.1 Ant colony optimization
Ant Colony Optimization (ACO) was proposed by Dorigo et al. [Dorigo et al., 1996]. ACO is essentially a system that simulates the natural behavior of ants, including mechanisms of cooperation and adaptation. The involved agents are steered toward local and global optimization through a mechanism of feedback of simulated pheromones and pheromone intensity processing. It is based on the following ideas. First, each path followed by an ant is associated with a candidate solution for a given problem. Second, when an ant follows a path, the amount of pheromone deposited on that path is proportional to the quality of the corresponding candidate solution for the target problem. Third, when an ant has to choose between two or more paths, the path(s) with a larger amount of pheromone are more attractive to the ant. After some iterations, the ants will eventually converge to a short path, which is expected to be the optimum or a near-optimum solution for the target problem.

2.2 Particle swarm optimization
The PSO algorithm is a population-based optimization method, where a set of potential solutions evolves to approach a convenient solution (or set of solutions) for a problem. The social metaphor that led to this algorithm can be summarized as follows: the individuals that are part of a society hold an opinion that is part of a "belief space" (the search space) shared by every possible individual. Individuals may modify this "opinion state" based on three factors: (1) the knowledge of the environment (explorative factor); (2) the individual's previous history of states (cognitive factor); (3) the previous history of states of the individual's neighborhood (social factor). Therefore, the basic idea is to propel the swarm towards a probabilistic median, where the explorative factor, the cognitive factor (local robot respective views) and the social factor (global swarm-wide views) are considered simultaneously, trying to merge these three factors into consistent behaviors for each robot. The exploration factor can be easily emulated by random movement.


3. The SWIRL approach

In the SWIRL approach, the ACO algorithm is utilized to select the topology of the neural network, while the PSO algorithm is utilized to optimize the weights of the neural network. The SWIRL approach is modeled on a school, with the ACO, PSO, and neural networks taking on the roles of administrator, teacher, and student respectively. Students learn, teachers train students, and administrators allocate resources to teachers. In the same fashion, the ACO algorithm allocates training iterations to the PSO algorithms. The PSO algorithms then run for their allotted iterations to train their neural networks. The global best score for all the neural networks trained by a particular PSO instance is then used by the ACO algorithm to reallocate the training iterations.

3.1 ACO-based topology optimization
ACO is used to allocate training iterations among a set of candidate network topologies. The desirability in ACO is defined as:

d(i) = 1 / (h + 1)                                                        (1)

where h is the number of hidden nodes in neural network i. τ (pheromone concentration) is initialized to 0.1, so that the ants' initial actions are based primarily on desirability. τ is then updated according to:

τ(i, t + 1) = ρ·τ(i, t) + na(i)·g(i)/g_sum                                (2)

where ρ is the rate of evaporation, na(i) is the number of ants returning from neural network i, g(i) is the global best for i, and g_sum is the sum of all the current global bests. Each ant represents one training iteration for the PSO teacher. During each major iteration (i.e. ACO step), the ants go out into the topology space. The probability that ant k goes to neural network i is given by:

p(i) = [τ(i, t)]^α · [d(i)]^β / Σ_j ( [τ(j, t)]^α · [d(j)]^β )            (3)

where α and β are constant factors that control the relative influence of pheromones and desirability, respectively.

3.2 PSO-based weight adjustment
The ANNs are then trained via the PSO algorithm for a number of iterations determined by the number of ants at that node. Each PSO teacher starts with a group of networks whose connection weights are randomly initialized. The student ANNs are tested on the chosen problem and each receives a score. The PSO teacher keeps track of the global best configuration and each student's individual best configuration. A configuration for an ANN with n connections can be considered as a point in an (n+1)-dimensional space, where the extra dimension is for the reinforcement score. After every round of testing, the teacher updates the connection weights of the student ANNs according to the following equations:


vt+1 = cinr r1 • vt + ccgn r2 • (xpb − xt) + cscl r3 • (xgb − xt)          (4)

xt+1 = xt + vt+1                                                          (5)

where position and velocity vectors are denoted by x and v, respectively. The big dot symbol is for Hadamard (element-wise) multiplication. The ri represent vectors where each element is a new sample from the unit-interval uniform random variable. Personal best, xpb, is the point in the solution space where that particular student received its highest score so far. Global best, xgb, is the point with the highest score achieved by any student of this PSO teacher. The three constants, cinr, ccgn, and cscl, allow the adjustment of the relative weighting for the inertial, cognitive, and social components of the velocity, respectively. After exhausting its allotted training iterations, the PSO teacher reports the global best to the ACO administrator. If/when it is allocated additional training iterations, the PSO teacher resumes training from exactly where it left off.

3.3 The SWIRL method summary
Pseudo-code for the SWIRL algorithm follows:

procedure SWIRL
  initialize ACO_Administrator
  for(candidate topology i=1…N)
    create PSO_Teacher(i)
    for(ANN_Student j=1…M)
      initialize ANN_Student
    end for
  end for
  while(solution not found)
    compute ant movement CDF
    ants allocate training iterations
    for(PSO_Teacher i=1…N)
      while(iterations < allocation)
        for(ANN_Student j=1…M)
          test ANN_Student(j)
        end for
        for(ANN_Student j=1…M)
          update weights ANN_Student(j)
        end for
      end while
      return global best
    end for
    update pheromone concentrations
  end while
end
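To make the allocation and update steps concrete, the following is a minimal, purely illustrative Python sketch of one SWIRL iteration written from equations (1)-(5) above. The function names (aco_allocate, update_pheromone, pso_step) and the use of NumPy are our own choices, not the authors' Java/C++ implementation; the constants echo the settings reported in Section 4 (α = 2, β = 1, ρ = 0.5, inertial/cognitive/social factors 0.8, 2, 2, velocity cap 5).

import numpy as np

# Illustrative constants taken from the settings reported in Section 4.
ALPHA, BETA, RHO = 2.0, 1.0, 0.5          # pheromone influence, desirability influence, evaporation
C_INR, C_CGN, C_SCL = 0.8, 2.0, 2.0       # inertial, cognitive, social factors

def aco_allocate(tau, hidden_nodes, n_ants, rng):
    """Allocate training iterations (ants) to candidate topologies, Eqs. (1) and (3)."""
    d = 1.0 / (np.asarray(hidden_nodes) + 1.0)            # Eq. (1): desirability
    w = (tau ** ALPHA) * (d ** BETA)
    p = w / w.sum()                                        # Eq. (3): assignment probabilities
    return rng.multinomial(n_ants, p)                      # number of ants sent to each topology

def update_pheromone(tau, n_ants_per_topo, global_bests):
    """Pheromone update, Eq. (2)."""
    g = np.asarray(global_bests, dtype=float)
    return RHO * tau + n_ants_per_topo * g / g.sum()

def pso_step(x, v, x_pb, x_gb, rng, v_max=5.0):
    """One PSO velocity/position update for a student network, Eqs. (4) and (5)."""
    r1, r2, r3 = (rng.random(x.shape) for _ in range(3))   # element-wise random factors
    v = C_INR * r1 * v + C_CGN * r2 * (x_pb - x) + C_SCL * r3 * (x_gb - x)
    v = np.clip(v, -v_max, v_max)                          # velocity cap used in Section 4
    return x + v, v                                        # Eq. (5)

In such a sketch, each PSO teacher would call pso_step once per allotted ant, score the resulting weight vectors with the task-specific fitness function, refresh x_pb/x_gb, and report its global best back through update_pheromone before the next allocation round.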


4. Simulation

The SWIRL system is implemented in Java for simulation testing. There is a 5:1 ratio of ants to candidate network topologies. The candidates are 1 through 5 hidden nodes. The pheromone influence factor, desirability influence factor, and rate of pheromone evaporation are set to 2, 1, and 0.5 respectively. The initial pheromone level is 0.1 for all topologies. Each PSO teacher has 100 students. The PSO particle velocity is capped at 5. The velocity factors are 0.8 for the inertial constant, 2 for the cognitive constant, and 2 for the social constant. The neural networks are fully connected, with initial connection weights uniformly random in the range (−5, 5). Hyperbolic tangent is used as the transfer function.

The SWIRL system was retrofitted into an existing Markov localization simulator, which required reimplementing SWIRL in C++. Markov localization systems are typically divided into an odometry component and a sensory component, which alternate updating the belief values for each location. The SWIRL system was challenged with generating an ANN that could replace the sensory component in the Markov localization system.

Fig. 1 shows a series of snapshots of a robot localization simulation using the SWIRL system. The red pentagon is the robot. The pale blue wedges represent the robot's sonar sensors and the green rays are the robot's laser rangefinders. The large green square is the goal to which the robot must navigate. The large orange square represents the location where the robot most strongly believes itself to be. The map is divided into a grid of squares, which are each divided into a black and red triangle. The black triangle and red triangle indicate the belief of the Markov algorithm and current best ANN, respectively, that the robot may be located in that particular square. Note that the red triangles do not show up underneath the pale blue wedges; this is merely a color layering issue in the simulator's screen painter. The map is 1086 by 443 pixels in size. The Markov grid squares are 7 pixels on a side, and rotations are multiples of 10°. Note that this quantization applies only to the pose estimation in the localization system. The robot is simulated with a full range of motion using floating point x, y, and θ values. The robot has 4 sonar sensors and 19 laser rangefinders. The sonar sensor range is 75 pixels in the map, whereas the range of the laser rangefinders is 100 pixels.

Fig. 1(a) shows the localization simulator at the beginning of a simulation run. Fig. 1(b) shows an early state of the simulation. The blue pentagons are stamped periodically on the map to indicate the path taken by the robot. At this point, the robot has narrowed down its location to a few regions on the map. In Fig. 1(c), near the end of the simulation, the robot is certain of its general location. Finally, Fig. 1(d) shows the robot when it has successfully reached the goal.

Fig. 1(a). The initialization state of a simulation.


Fig. 1(b). An early state of a simulation.

Fig. 1(c). A late state of a simulation.

Fig. 1(d). The end of a simulation. Fig. 1. Start of a simulation. Black = Markov value; Red = ANN value. The robot, shown as a red pentagon, must navigate from its initial position to the goal, the green square. The triangles indicate the likelihood that the robot is in that location according to the beliefs of the localization systems. The green lines show the robot’s laser rangefinders and the pale blue wedges show its sonar sensors. The orange square is the robot’s most likely position according to whichever localization system is currently in use for navigation. Blue pentagons are periodically stamped on the map to show the path taken by the robot. The sensory component of the Markov localization system computes the likelihood the robot is at a particular location by comparing the current sensor readings to the predicted sensor readings for that location which are generated from the map. Sensor noise is the main source of difficulty. The following pseudo-code describes this function:


function probability(location x)
  prob_match=1.0
  for(i=1…Number_Sensors)
    j=normalize(reading(i)-predict(x,i))
    prob_match=prob(j)*prob_match
  end for
  return prob_match
end

These match probabilities are then used to update the robot's location belief matrix. In place of this, the ANN generated by SWIRL should take a vector of j's as the input, and give prob_match as the output. The obvious source of concern in this scenario is that ANNs inherently deal in weighted sums, whereas a joint probability calls for a product. This is solved by taking advantage of the fact that j > 0 and that e^(ln(x)+ln(y)) = xy for x, y > 0. Thus, a vector of ln(j)'s is used as the input, and e^(output) is used as prob_match (a small illustrative sketch of this transformation is given below).

In the simulation, a robot must navigate from an unknown starting position to a goal position that is specified on the map. Sensor noise is the main source of difficulty. The sensor noise has 3 components: bias, skew, and incidental. Each sensor has its own bias and skew values that are randomly initialized at the beginning of the simulation, but remain fixed thereafter. The incidental noise is a new Gaussian value generated each time the sensor is read. The SWIRL system is first trained over one or more training runs. The best ANN produced is then used to replace the sensory component of the Markov localization system for the testing run. For the purpose of training, another Markov localizer with access to noise-free sensors is used to produce the "solution set" for the fitness function. This enables the ANN (ideally) to weigh the sensors according to their relative accuracy.

The simulation results are shown in Fig. 2, Fig. 3, and Fig. 4. The results are produced for 1 and 3 training runs. Unfortunately, the wavefront navigation system used by the Markov localization simulator introduces substantial variability into the robot's performance, as demonstrated by the large standard deviations. Consequently, it would require an exorbitantly large number of tests to establish with a high degree of certainty whether the average SWIRL solution is slightly better or slightly worse than the Markov method. In any case, improving Markov localization is not the goal here. It is clear from the results that SWIRL can indeed produce ANNs that function comparably to the sensory component of the Markov localization system. SWIRL demonstrates its ability to generate effective solutions in the face of real-world complications such as noise and poor calibration. After only a single test run, the SWIRL solution is already performing comparably to the Markov method. The reason more fine-grained progress is not presented is that the nature of Markov localization is such that early mistakes are compounded each cycle. No matter how good the SWIRL solution gets, its mistakes from earlier in the run will prevent it from making accurate predictions. To make the graphical display useful, every so often during training (only during training) SWIRL's old belief matrix is overwritten with "correct" values so that one can see an estimate of how accurate the best ANN is at that point. However, this obviously gives a rough estimate only, and is not appropriate for a numerical comparison with the Markov predictions.
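The log-product trick described above can be illustrated with a short, hypothetical snippet (not the authors' C++ simulator code); ann_forward stands in for whatever trained SWIRL network is in use, and the normalization of the sensor deviations is an assumption of this sketch.

import numpy as np

def sensor_match_probability(readings, predictions, ann_forward, eps=1e-9):
    """Feed log-deviations to the ANN and exponentiate its output, so the
    network's weighted sums can emulate a product of per-sensor match
    probabilities (e^(ln x + ln y) = x*y for x, y > 0)."""
    j = np.abs(np.asarray(readings) - np.asarray(predictions))  # normalized deviations, assumed > 0
    inputs = np.log(j + eps)          # vector of ln(j)'s used as the ANN input
    output = ann_forward(inputs)      # scalar output of the trained network
    return float(np.exp(output))      # prob_match used to update the belief matrix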

Fig. 2. Simulation cycles versus training runs (ANN cycles vs. Markov cycles).

Fig. 3. Distance traveled versus training runs (ANN distance vs. Markov distance).

Fig. 4. Degrees turned versus training runs (ANN degrees vs. Markov degrees).


5. Conclusion

In this paper, the SWIRL algorithm is proposed to generate ANN solutions to tasks/problems amenable to reinforcement learning. Basically, the ACO algorithm is applied to select the neural network topology, while the PSO algorithm is utilized to adjust the connection weights of the selected topology. Robot localization using the SWIRL algorithm has been efficiently conducted in a noisy, imprecise environment with a large number of input signals, as might be encountered in the real world. As a result of its generality, the SWIRL method is scalable, robust, and can be applied to almost any real-world task. This implementation is merely a simple proof-of-concept where the ACO algorithm chooses the number of hidden nodes. By limiting the scope to fully-connected feedforward neural networks, this single number fully defines the topology (since the input and output nodes are determined by the test problem). These are limits of this test implementation only, not limits of the SWIRL method.

6. References

Haykin, S. (1998). Neural Networks: A Comprehensive Foundation. 2nd Edition, Prentice Hall. Eiben, E. & Smith, J.E. (2003). Introduction to Evolutionary Computing. Natural Computing Series. Springer. Berlin. Kirkpatrick, S.; Gelatt Jr., C.D. & Vecchi, M. P. (1983). Optimization by simulated annealing, Science, 220: 671-680. Glover, F. (1986). Future paths for integer programming and links to artificial intelligence, Computers and Operations Research, Vol. 13, pp. 533-549. Dorigo, M.; Maniezzo, V. & Colorni, A. (1996). Ant System: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man and Cybernetics - Part B, vol. 26, no. 1, pp. 29-41. Kennedy, J. & Eberhart, R. (1995). Particle Swarm Optimization, in: Proc. IEEE Intl. Conf. on Neural Networks (Perth, Australia), IEEE Service Center, Piscataway, NJ, IV:1942-1948. Stanley, K. O. & Miikkulainen, R. (2002). Evolving Neural Networks through Augmenting Topologies, Evolutionary Computation, 10(2): 99-127. Patan, K. & Parisini, T. (2002). Stochastic learning methods for dynamic neural networks: simulated and real-data comparisons, Proceedings of American Control Conference. Racz, J. & Dubrawski, A. (1994). Mobile Robot Localization with an Artificial Neural Network, International Workshop on Intelligent Robotic Systems IRS '94, Grenoble, France. Sethi, I. K. & Yu, G. (1990). A Neural Network Approach to Robot Localization Using Ultrasonic Sensors, Proceedings of 5th IEEE International Symposium on Intelligent Control, 1990, pp. 513-517 vol. 1, 5-7. Choi, W. S. & Oh, S. Y. (2007). Range Sensor-based Robot Localization Using Neural Network, International Conference on Control, Automation and Systems, pp. 230-234, 17-20.


Lhotská, L.; Macaš, M., & Burša, M. (2006). PSO and ACO in Optimization Problems, E. Corchado et al. (Eds.): IDEAL 2006, LNCS 4224, pp. 1390 – 1398.

7 The Identification of Models of External Loads Yuri Menshikov

Dnepropetrovsk University, Ukraine

1. Introduction

One of the important problems of mathematical modelling of dynamic systems is the coincidence of modelling results with experimental measurements. Such a coincidence is attained by constructing a "correct" mathematical model (MM) of the dynamical system and choosing a "good" model of the external load (MEL). By a "correct" model we understand an MM of the object whose motion, under the action of the MEL (or external impact), coincides with experimental measurements with acceptable accuracy. Thus the degree of "correctness" of the MM depends directly on the chosen model of EL and on the required accuracy of the coincidence with experiment. For models with concentrated parameters this can formally be written as the inequality

F(Ap z, uδ) ≤ ε,

(1)

where Ap is an operator of a certain structure which connects the EL (z) and the response of the MM (u, Ap z = u) and which depends on the vector of parameters p; ε = const > 0 is the required accuracy of the coincidence of experiment with the results of mathematical modelling; z is the function of the model of EL, z ∈ Z; u, uδ are the vector functions of the response of the investigated object to the external load, u ∈ U, uδ ∈ U. One possible variant of inequality (1) is the following inequality:

||Ap z − uδ||U ≤ ε,

(2)

where || · ||U is a norm in the functional space U. A characteristic feature of problems of the considered type is that the operator Ap is a compact operator (Tikhonov & Arsenin, 1979). The value ε is set a priori and characterizes the desired quality of mathematical modelling. The vector function uδ is obtained from experiment with the known error δ0:

||uT − uδ||U ≤ δ0,

(3)

where uT is the exact response of the object to the real EL. It is obvious that if inequality (2) holds, the operator Ap and the function z are connected. It is easy to show that, for a fixed operator Ap in (2), there exists an infinite set of mutually different functions z which satisfy inequality (2) (Tikhonov & Arsenin, 1979). Conversely, for a fixed function z there are infinitely many different


operators for which inequality (2) is valid. Thus, there is no possibility of choosing a good model of the system (of the process) separately from the choice of a correct model of the external load. As a rule, inequality (2) is not checked in the practice of mathematical modelling, but its fulfilment is implied. The error of the measuring equipment δ0 is contained in the value ε as an obligatory component, and therefore the inequality δ0 ≤ ε always holds. This is because the accuracy of experimental measurements is, as a rule, higher than the required accuracy of modelling. Frequently only a qualitative coincidence of the results of mathematical modelling with experiment is satisfactory.

In the investigation of real dynamic systems the structure of the mathematical description is, as a rule, fixed. For example, in the study of the dynamics of rolling mills (Menshikov, 1976, 1985) and in the solution of the problem of unbalance diagnostics (Menshikov, 2004) it is possible to use models with concentrated parameters. Proceeding from the design features of real systems or devices, it is possible to determine the parameters of the mathematical description (parameters of the operators) precisely enough. However, these parameters are believed to be given approximately. The error in the definition of the parameters depends on the way the dynamic system is reduced to a simpler one, on various conditions and assumptions, and on which factors are taken into account (Menshikov, 1994). This error can be estimated from above and, as a rule, does not exceed 10 %.

Two approaches exist to the problem of constructing the pair MM and model of EL:
1. The MM is given a priori with inexact parameters, and then the model of EL is determined for which inequality (2) is valid;
2. Some model of EL is given a priori, and then the MM is chosen for which inequality (2) is satisfied.
For example, for operators which are given with an error it is necessary to construct models of the external load whose use makes the results of mathematical modelling coincide, with a certain accuracy, with the results of experiment. Such algorithms for constructing the pair (mathematical description + model of external load) are not unique.

2. Statement of the synthesis of external loads by the identification method

Let us now consider the possibilities of the first approach using the example of a dynamic system ∑ with concentrated parameters, whose motion is described by ordinary differential equations of n-th order. It is assumed that the records of all external loads f2(t), f3(t), ..., fm(t) (except only one, f1(t)) and of one state variable, for example x1(t), are obtained experimentally during the motion of the system on some interval of time t ∈ [0, T]. It is necessary to find the model z(t) of the external load f1(t) under the action of which the mathematical model of the system ∑ (MM∑) moves in such a way that the state variable x1(t) coincides with the experimental record x̃1(t) of x1(t). The rest of the external loads coincide with the external loads f2(t), f3(t), ..., fm(t) known from experiment. Problems of this type were named problems of external load identification (Gelfandbein & Kolosov, 1972), (Ikeda et al., 1976). The model z(t) obtained by such a method depends on the chosen MM∑ and on the goals of its future use in mathematical modelling.


If the initial dynamic system does not satisfy the conditions specified above, then it can be reduced to a system ∑ with the help of additional measurements (Menshikov, 2004). Let us assume that the MM∑ is linear and that the connection between the unknown function z(t) and the functions f2(t), f3(t), ..., fm(t), x1(t) has the form:

Ap z = Bp x,                                                              (4)

where Ap is a linear integral operator (Ap : Z → U) which depends continuously on the vector of parameters p of the mathematical model of the system (MM∑), p = (p1, p2, ..., pN)^T, (·)^T is the sign of transposition, p ∈ R^N, R^N is the Euclidean vector space with norm ||p||² = (p, p); Bp is a linear bounded operator (Bp : X → U) which depends continuously on the vector of parameters p; x = (x1(t), f2(t), ..., fm(t))^T; z ∈ Z, x ∈ X; Z, X, U are Hilbert spaces. The functions x1(t), f2(t), ..., fm(t) are given with known inaccuracy x̃ = (x̃1(t), f̃2(t), ..., f̃m(t))^T, as these functions have been obtained from experimental measurements:

||x(t) − x̃(t)||X ≤ δ,                                                     (5)

where x(t) is the exact vector function of the initial data and δ is a given value.

Besides, it is supposed that the vector of parameters p is given inexactly, so the vector p can take values in some closed domain D: p ∈ D ⊂ R^N. Two operators Ap, Bp correspond to each vector from D. The set of possible operators Ap is denoted as the class of operators KA, and the set of possible operators Bp as the class of operators KB; thus Ap ∈ KA, Bp ∈ KB. The maximal deviations of the operators Ap within the class KA and of the operators Bp within the class KB satisfy:

||Apα − Apβ||Z→U ≤ h,    ||Bpη − Bpγ||X→U ≤ d.

Denote by Qδ,p the set of possible solutions of equation (4) taking into account the inaccuracy of the experimental measurements only:

Qδ,p = {z : z ∈ Z, Ap z ∈ Uδ,p, p ∈ D}, where Uδ,p = {u = Bp x : u ∈ U, x ∈ Xδ, p ∈ D}, Xδ = {x : x ∈ X, ||x − x̃||X ≤ δ}.

Any function z from the set Qδ,p simulates the motion of the dynamic system MM∑ to within the inaccuracy of the experimental measurements only. The operator Ap in equation (4) is a completely continuous operator in the overwhelming majority of cases, and so the set Qδ,p is, as a rule, an unbounded set in the space Z (an ill-posed problem) (Tikhonov et al., 1990).


The regularization method for equation (4) was used to obtain stable solutions of the problems stated above (Tikhonov & Arsenin, 1979). Let us consider the stabilizing functional Ω[z] defined on a set Z1, where Z1 is everywhere dense in Z (Tikhonov & Arsenin, 1979). Consider now the extreme problem I:

Ω[zp] = inf {Ω[z] : z ∈ Qδ,p ∩ Z1},  p ∈ D.                               (6)

It was shown that under certain conditions the solution of extreme problem I exists, is unique and is stable to small changes of the initial data x̃1(t), δ (Tikhonov & Arsenin, 1979). The function zp is called the stable model of EL taking into account the inaccuracy of the experimental measurements only. Such a model can be used for modelling the motion of the initial system with the operators Ap, Bp only.
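As a concrete (and purely illustrative) reading of extreme problem I, the following Python sketch discretizes Ap z = uδ on a time grid and replaces the constrained problem (6) by the familiar Tikhonov trade-off min_z ||A z − uδ||² + λ Ω[z], with Ω[z] approximated by a finite-difference version of ∫(z² + ż²) dt (the functional used later in (18)). The grid, the choice of λ and the function names are our assumptions, not the author's code.

import numpy as np

def tikhonov_model_of_load(A, u_delta, dt, lam):
    """Minimize ||A z - u_delta||^2 + lam * dt * (||z||^2 + ||dz/dt||^2).

    A        : (m, n) discretized integral operator A_p
    u_delta  : (m,) measured response
    dt       : time step of the discretization grid
    lam      : regularization parameter (e.g. chosen so that the residual
               norm matches the measurement error delta -- discrepancy principle)
    """
    n = A.shape[1]
    D = (np.eye(n, k=1)[:-1] - np.eye(n)[:-1]) / dt      # forward-difference matrix for dz/dt
    # Stabilizer Omega[z] ~ dt * (z^T z + (D z)^T (D z))
    lhs = A.T @ A + lam * dt * (np.eye(n) + D.T @ D)
    rhs = A.T @ u_delta
    return np.linalg.solve(lhs, rhs)                      # stable model z_p of the external load

In practice λ would be tuned so that ||A z − uδ|| ≈ δ, which is the role played by the constraint set Qδ,p in (6).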

3. Synthesis of external loads for a class of mathematical descriptions

According to the first approach it is necessary to take into account the inaccuracy of the operators Ap, Bp. Let us now consider the problem of EL identification in this case. The set of possible solutions of equation (4), Qδ,p, has to be expanded to the set Qδ,h,d if we additionally take into account the inaccuracy of the operators Ap, Bp:

Qδ,h,d = {z : z ∈ Z, Ap ∈ KA, Bp ∈ KB, ||Ap z − Bp xδ||U ≤ δ b0 + d ||xδ||X + h ||z||Z},

where b0 = sup_{p∈D} ||Bp||.

Any function z from the set Qδ,h,d simulates the motion of the initial system to within the inaccuracy of the experimental measurements and the inaccuracy of the operators Ap, Bp. The set Qδ,h,d is unbounded for any δ > 0, h > 0, d > 0, p ∈ D ⊂ R^N (Tikhonov & Arsenin, 1979). The regularization method for equations with inexactly given operators was used to obtain stable solutions of the problems stated above (Tikhonov et al., 1990). Consider now the extreme problem II:

Ω[z̃] = inf {Ω[z] : z ∈ Qδ,h,d ∩ Z1}.                                      (7)

It was shown that under certain conditions the solution of extreme problem II exists, is unique and is stable to small changes of the initial data x̃1(t), δ, d, h, Ap, Bp (Tikhonov et al., 1990). The problem of finding z̃ ∈ Qδ,h,d was named the problem of synthesis of external load for a class of

models (Menshikov, 2002, 2004).

Let us consider the union of the sets of possible solutions Qδ,p with fixed operators Ap, Bp:

Qδ* = ∪_{p∈D} Qδ,p  (∪ is the sign of union).                             (8)

In some cases, as the solution zmin of the problem of synthesis of external load for a class of models, we shall accept the stable element of the set Qδ* instead of the set Qδ,h,d (extreme problem III):

Ω[zmin] = inf {Ω[z] : z ∈ Qδ* ∩ Z1}.                                       (9)

This problem can be reduced to the simpler extreme problem:

Ω[zmin] = inf_{p∈D} inf_{z∈Qδ,p} Ω[z].                                     (10)

The model of EL zmin will give results of mathematical modelling which coincide with the given function Bp x̃ with inaccuracy δ b0. The statement of the following problem of model of EL construction with the help of identification is also possible (extreme problem IV):

Ω[zmax] = sup_{p∈D} inf_{z∈Qδ,p} Ω[z].                                     (11)

The model of EL zmax will give results of mathematical modelling with inaccuracy δ b0. The function zmax gives the evaluation from above of all possible solutions of the identification problem for all operators Ap, Bp from the classes KA, KB. Then the stable model zbel, which gives the evaluation from below of the selected response Bp x̃ of the dynamic system for all possible operators Ap, Bp, can be defined as the result of the solution of the following extreme problem V:

solution of the following extreme problem V: Abbel zbel

2 U

=

inf

inf Ab z p

Ab∈K A , Bb∈K B z p

2 U

, b, p ∈ D ,

(12)

where z p is the solution of extreme problem (6) on set Qδ , p .

The stable model z ab which gives the evaluation from above of the selected response B p ~ x of dynamic system for all possible operators Ap , B p can been defined as result of the solution of the following extreme problem VI:

Abab z ab

2 U

=

sup

sup Ab z p

Ab∈K A , Bb∈K B z p

2 U

, b, p ∈ D .

(13)

In some cases it is necessary to synthesize model of external load by a method of identification which gives the best results of mathematical modeling for all possible mathematical descriptions of dynamic system motion. Actually such a problem is the solution of a problem of a choice of the second component (model of external load) for adequate mathematical modeling within of the first approach (Menshikov, 2008). Such kind of identification problems can find applications in different practical areas where the methods of mathematical modeling are used (Menshikov, 2004).


The stable model zun of the external load which gives, with guarantee, the best result for the motion of the dynamic system is defined as the solution of the following extreme problem VII:

||Apun zun − x̃||²U = inf_{p∈D} sup_{c∈D} ||Ac zp − Bc x̃||²U,  pun ∈ D,          (14)

where zp is the solution of extreme problem (6) on the set Qδ,p (Menshikov & Nakonechny, 2005). The function zun ∈ Qδ* exists and is stable to small changes of the initial data (the function x̃) if the functional Ω[z] is a stabilizing functional and the function zun is defined uniquely from (14). The solution of extreme problem VII was named the unitary mathematical model of the external load.

If the classes KA, KB consist of a limited number of operators, KA = {A1, A2, …, AN} = {Ai}, KB = {B1, B2, …, BN} = {Bi}, i = 1, …, N, then the algorithm for finding the best unitary model of the external load zun has the form

inf_{z∈QD,δ} sup_{p∈D} ||Ap z − Bp x̃||U = ||Apun zun − Bpun x̃||U = min_j max_i ||Ai zj − Bi xδ||U,          (15)

where QD,δ = {zj : ||Ai zj − Bi x̃||U = δ; j, i = 1, 2, ..., N}.

The formulations of the external load identification problem offered above cannot all be classified under the single name of external load identification. Additional explanations and additional assumptions are required in the solution of each particular problem. Most likely, this set of problems can be united only by the principle of model synthesis, i.e. the principle of using experimental data. In other words, it is possible to regard all these problems as problems in which the principle of identification (comparison of the results of calculations with experimental data) is used. A similar situation is also present in such a rather conservative area as parameter identification (Menshikov, 2006).
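For the finite-class case of (15), selecting the unitary model reduces to a small min-max search. The sketch below is a hypothetical illustration only (the operator matrices Ai, Bi and the candidate loads zj are assumed to be already discretized), not the author's implementation.

import numpy as np

def unitary_model(A_ops, B_ops, z_candidates, x_tilde):
    """Pick z_un per Eq. (15): minimize over candidates z_j the worst-case
    residual max_i || A_i z_j - B_i x_tilde || over the class of models."""
    worst = []
    for z in z_candidates:
        residuals = [np.linalg.norm(A @ z - B @ x_tilde) for A, B in zip(A_ops, B_ops)]
        worst.append(max(residuals))          # sup over the class of mathematical descriptions
    j_best = int(np.argmin(worst))            # inf over the candidate load models
    return z_candidates[j_best], worst[j_best]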

4. Practical problems of identification of external load models

4.1 Synthesis of the moment of technological resistance on a rolling mill
One of the important characteristics of the rolling process is the moment of technological resistance (MTR) arising as the result of the plastic deformation of the metal in the deformation zone. The size and the character of change of this moment define the loads on the main mechanical line of the rolling mill. However, the complexity of the processes in the deformation zone does not allow an authentic mathematical model of the MTR to be constructed by usual methods. In most cases, in the study of the dynamics of the main mechanical lines of rolling mills, the MTR model is created on the basis of hypotheses and is imitated as a piecewise smooth linear function of time or of the angle of rotation of the working barrels (Menshikov, 1985, 1994). The results of mathematical modelling of the dynamics of the main mechanical lines of rolling mills with such MTR models differ among themselves (Menshikov, 1976).


In work the problem of construction of models of technological resistance on the rolling mill is considered on the basis of experimental measurements of the responses of the main mechanical system of the rolling mill under real EI (Menshikov, 1985, 1994). Such an approach allows to carry out in a future mathematical modeling of dynamics of the main mechanical lines of rolling mills with a high degree of reliability and on this basis to develop optimum technological modes. The kinematics scheme of the main mechanical line of rolling mill was presented on Fig.1. (a). 2

1

3

5

4

а

МUrol

ϑ3 С23 Мeng

C12 МLrol С24

ϑ2

ϑ1

ϑ4

b

Fig. 1. Kinematics scheme of the main mechanical line of the rolling mill.

The four-mass model with weightless elastic connections is chosen as the MM of the dynamic system of the main mechanical line of the rolling mill (Menshikov, 1985, 1994):

M̈12 + ω12² M12 − (c12/ϑ2) M23 − (c12/ϑ2) M24 = (c12/ϑ1) Meng(t);
M̈23 + ω23² M23 − (c23/ϑ2) M12 + (c23/ϑ2) M24 = (c23/ϑ3) MUrol(t);          (16)
M̈24 + ω24² M24 − (c24/ϑ2) M12 + (c24/ϑ2) M23 = (c24/ϑ4) MLrol(t);

where ωik² = cik(ϑi + ϑk)/(ϑi ϑk), ϑk are the moments of inertia of the concentrated masses, cik are the rigidities of the appropriate elastic connections, MUrol, MLrol are the moments of


technological resistance applied to the upper and lower working barrels, respectively, and Meng(t) is the moment of the engine.

The problem of synthesis of the model of EL can be formulated as follows: it is necessary to define such models of the technological resistance on the part of the metal which would cause, in the elastic connections of the model, oscillations identical to the experimental ones (at the points of measurement), taking into account the measurement error, for the chosen MM of the main mechanical line of the rolling mill. The information on the real motion of the main mechanical line of the rolling mill was obtained experimentally (Menshikov, 1976, 1976a). Such information is understood as the availability of the functions M12(t), M23(t), M24(t). The most typical case of rolling on smooth working barrels was chosen for processing, when no breakdown of the oscillations is observed and when skidding is absent (Menshikov, 1976, 1976a). The records of the functions M12(t), M23(t), M24(t) during the rolling process are shown in Fig. 2. Let us consider the problem of constructing models of EL applied to the upper working barrel; for the lower working barrel all calculations are carried out similarly. From system (16) the equation for the required model MUrol can be obtained:

∫0^t sin ω23(t − τ) MUrol(τ) dτ = uδ(t),  or  Ap z = uδ,                    (17)

where z = MUrol(τ) and Ap is a linear integral operator. The maximal deviation of the operators Ap ∈ KA from one another is defined by the error of the parameters of the mathematical model of the rolling mill. The error in the definition of the values of the discrete masses is accepted as 8 %, the error of the stiffness values as 5 %, and the error of the values of the damping factors as 30 %. The size of the maximal deviation of the operators Ap ∈ KA was determined by numerical methods and is h = 0.121; the size of the maximal deviation of the operators Bp ∈ KB was determined by numerical methods and is d = 0.11. The error of the initial data for the case Z = U = C[0, T] is δ = 0.0665 MN·m.
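Discretizing the Volterra integral in (17) gives a lower-triangular operator matrix that could be fed to a regularization routine such as the one sketched after problem (6). The snippet below only illustrates that discretization; the grid, the ω23 value and the function names are placeholders, not the chapter's data.

import numpy as np

def build_convolution_operator(omega_23, t_grid):
    """Discretize Eq. (17): (A z)(t_i) = sum_{j<=i} sin(omega_23*(t_i - t_j)) * z(t_j) * dt."""
    dt = t_grid[1] - t_grid[0]
    T_i, T_j = np.meshgrid(t_grid, t_grid, indexing="ij")
    A = np.sin(omega_23 * (T_i - T_j)) * dt
    return np.tril(A)                       # causal (Volterra) kernel: only tau <= t contributes

# Hypothetical usage with the earlier regularization sketch:
# t = np.linspace(0.0, 2.0, 400)
# A = build_convolution_operator(omega_23=55.0, t_grid=t)   # omega_23 value is a placeholder
# z_model = tikhonov_model_of_load(A, u_delta, dt=t[1]-t[0], lam=1e-3)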

Fig. 2. The records of functions M12(t), M23(t), M24(t).

Fig. 3. The diagrams of the extreme problem I solutions MUrol(τ), MLrol(τ).


We shall choose the functional

Ω[z] = ∫0^T (z² + ż²) dt,                                                  (18)

as the stabilizing functional (Tikhonov & Arsenin, 1979). First, the problem of identification of the stable models of the external load applied to the working barrels was solved taking into account the inaccuracy of the experimental data only. The results of the calculation are presented in Fig. 3.

4.2 Unitary model of external load for the rolling mill
It is evident that the results of identification will change for other parameters of the mathematical description. Therefore, the problem of external load identification was solved as extreme problem VII (the unitary model of external load). In Fig. 4 the diagram of the function zun for a typical case of rolling, for the upper working barrel, is presented.

Fig. 4. The diagrams of change of the models of the moment of technological resistance on the upper working barrel of the rolling mill.

The solution of extreme problem III is also presented in Fig. 4, for the same initial data and the same inaccuracy of the operators Ap, Bp. The results of the calculations show that the evaluation from above of the accuracy of mathematical modelling with the model zun, for all Ap ∈ KA and all Bp ∈ KB, does not exceed 11 % in the uniform metric, with an error of the MM parameters of the main mechanical line of the rolling mill of 10 % on average and errors of the experimental measurements of 7 % in the uniform metric. For comparison, the calculation of the model of EL for the classes KA, KB on the set of possible solutions Qδ,h,d was carried out. The function which is the solution of the synthesis problem in this case has a maximal deviation from zero equal to 0.01 MN·m. Such a model is of no interest for the purposes of mathematical modelling, as it practically coincides with the trivial model. In (Menshikov, 1994), a comparative analysis of mathematical modelling with various known models of the external load was carried out. The model of external load zun corresponds to the experimental observations to the greatest degree.

5. Conclusion

In this paper, some problems of the construction of external load models for dynamic systems have been considered.


Various formulations of such a problem are offered: the stable model for obtaining the best results of mathematical modelling with guarantee, the stable model for obtaining an evaluation of the response from above, the stable model for obtaining an evaluation of the response from below, the stable model for mathematical modelling of the selected motion with a fixed model of the dynamic system, and the stable model for mathematical modelling of the selected motion of the system for a whole class of mathematical descriptions of the system. The offered approach to the synthesis of models of external loads acting on a dynamical system can find application in cases when the information about the external impacts is absent or scarce, and also for checking the hypotheses on the basis of which the known models of external loads were constructed.

6. References

Gelfandbein, Ju. & Kolosov, L. (1972). Retrospective identification of perturbations and interferences, Science, Moscow. Ikeda, S.; Migamoto, S. & Sawaragi, Y. (1976). Regularization method for identification of distributed systems. Proc. of IV Symposium IFAC: Identification and evaluation of parameters of systems, pp. 153-162, Tbilisi, USSR, v.3, 1976, Preprint, Moscow. Menshikov, Yu. (1976). Identification of moment of technological resistance on rolling. J. of Differential Equations and their Applications in Physics, Dnepropetrovsk University, Dnepropetrovsk, Ukraine, n.1, 1976, pp. 22-28. Menshikov, Yu. (1976a). About influence of external loading to dynamics of main line of rolling mill. J. of Differential Equations and their Applications in Physics, Dnepropetrovsk University, Dnepropetrovsk, Ukraine, n.1, 1976, pp. 29-33. Menshikov, Yu. (1985). The synthesis of external impact for the class of models of mechanical objects. J. of Differential Equations and their Applications in Physics, Dnepropetrovsk University, Dnepropetrovsk, Ukraine, pp. 86-91. Menshikov, Yu. (1994). The Models of External Actions for Mathematical Simulation. System Analysis and Mathematical Simulation (SAMS), New York, v.14, n.2, 1994, pp. 139-147. Menshikov, Yu. (2002). Identification of external impacts models. Bulletin of Kherson State Techn. Univ., Kherson, Ukraine, 2(15), 2002, pp. 326-329. Menshikov, Yu. (2004). Identification of external impacts under minimum of a priori information: statement, classification and interpretation. Bulletin of KNU, Mathematics, Kiev, Ukraine, n. 2, pp. 310-315. Menshikov, Yu. & Polyakov, N. (2004). The new statement of problem of unbalance identification, Proceedings of ICTAM, 4 p., August 2004, Warsaw, Poland. Menshikov, Yu. & Nakonechny, A. (2005). Constraction of the Model of an External Action on Controlled Objects. Journal of Automation and Information Sciences, v.37, is. 7, 2005, pp. 20-29. Menshikov, Yu. (2006). Parameters Identification in Minimax Statement. Journal of Automation and Information Sciences, v.38, is. 11, 2006, pp. 14-21. Menshikov, Yu. (2008). About Adequate of Results of Mathematical Modeling, Proceedings of Conf. Simulation-2008, v.1, Kiev, May 2008, Kiev, Ukraine, pp. 119-124. Tikhonov, A. & Arsenin, V. (1979). Methods of solution of incorrectly posed problems, Science, Moscow. Tikhonov, A.; Goncharsky, A.; Stepanov, A. & Yagola, A. (1990). Numerical methods for the solution of ill-posed problems, Science, Moscow.

8 Environment Modelling with an Autonomous Mobile Robot for Cultural Heritage Preservation and Remote Access

Grazia Cicirelli and Annalisa Milella

Institute of Intelligent Systems for Automation (ISSIA), National Research Council (CNR), Italy

1. Introduction

As awareness of cultural heritage has risen, much effort was devoted, in the last decade, to improving the accessibility and preservation of cultural assets. At present, several methods are available that generally make use of laser scanners and cameras to construct 3D photorealistic models of different-sized items, ranging from small objects, like statues, up to large buildings and archaeological sites. These methods provide effective technological solutions for cultural heritage preservation, while guaranteeing, at the same time, their accessibility to as many people as possible. Nevertheless, the generation of models may turn into a time-consuming procedure if data acquisition and processing are done by hand, since, in order for models to be sufficiently accurate, extremely painstaking work is required.

Contributions to automated model building techniques and remote access systems for cultural heritage applications may derive from experience in the mobile robotics field. Recently, several research projects have attempted to develop mobile robotic agents in museums (Burgard et al., 1999; Thrun et al., 2000; Trahanias et al., 2005), with different tasks, such as supplying remote access to distant users, accommodating and guiding people in the museum, and surveying those areas where access is not permitted. Equipped with sensors, like cameras and laser rangefinders, mobile platforms provide a variety of viewpoints and may supply the user with dedicated tours of the exhibition and personalized tele-presence, which result in greater interaction capabilities than fixed or even remotely controllable cameras (Trahanias et al., 2005). Generally, in order for a mobile robot to perform its tasks, knowledge of a map of the environment is needed. Hence, a number of methods for efficient environment modeling, based on information from on-board robot sensors, have also been developed. Yet, relatively little work has been done to extend these techniques to cultural heritage applications, such as the modeling of historical and archaeological sites.

In this chapter, we describe our research concerning the development of methods for environment exploration and modelling by a multisensor mobile platform, in the context of cultural heritage access and preservation. Our goal is to have a system able to navigate in the environment and acquire sensorial data, in order to either construct global or local models of the site, or send information to a remote console.


Fig. 1. The robotic mobile platform in the configuration used in the polygonal environment.

Usually, when exploring an unknown environment for the first time, a mobile robot is remotely controlled by a joystick or other tele-operation devices. Recently, a novel scenario has been receiving considerable attention, which relates to the possibility of teaching the robot its environment by human interaction. This concept, introduced in (Topp & Christensen, 2005), is known as Human Augmented Mapping (HAM). Pursuing this trend, we propose a novel laser-based leg detection and tracking algorithm (Milella et al., 2007) that enables a mobile platform to follow a human user in a tour of the environment, in order to explore the surroundings, acquire sensorial data for map building, and learn particular regions or locations specified by the user.

We present two case studies. The first one deals with the problem of modelling a polygonal environment, such as a museum. The second case study is concerned with the use of a mobile robot for exploration and mapping of a pre-historical underground cave in Southern Italy, named “Grotta dei Cervi”, rich in ancient wall paintings of historical and artistic relevance.

The mobile robot employed to carry out this research is shown in Fig. 1. It consists of a Pioneer-P3AT by ActivMedia Robotics, equipped with a SICK LMS 200 laser range finder, sixteen forward and rear sonar sensors, encoders, a gyroscope, and a monocular pan-tilt-zoom camera. The laser range finder is mounted at a height of about 30 cm from the floor and is able to sense objects at a distance of up to 80 m with a resolution of 0.5°. Sonar sensors can detect obstacles up to 7 m away. The robot's four tractor wheels can scale a 45° gradient and sills of 9 cm. The robot case contains four motors, a local processor, and the batteries. For experimentation in the cave, a 1 m high aluminium support was added to the platform to carry the illumination system and the camera (see Fig. 2). Such a configuration allowed us to acquire the wall paintings from an appropriate perspective, since they are mainly located approximately between 1 m and 2 m above the ground. The ARIA C++ libraries by ActivMedia Robotics were used for communication between the sensors and the robot controller.


Fig. 2. The mobile robotic platform in the configuration used in the cave.

The results of the experimental sessions carried out in both case studies show that the proposed methods are feasible and accurate. They allow us to produce detailed 2D and 3D representations that can be usefully employed to support the study of relevant historical treasures, while guaranteeing, at the same time, their safety.

The remainder of this chapter is organized as follows. After a review of the literature related to this work in Section 2, the people-following method for Human Augmented Mapping (HAM) is described in Section 3. Section 4 illustrates our approach to polygonal environment modelling. Section 5 describes the study conducted in the pre-historical cave. Finally, Section 6 draws some conclusions.

2. Related works

Several works in the literature have shown mobile robots to be useful in environment modelling tasks since, equipped with sensors such as cameras and laser rangefinders, they can acquire and process sensorial data while navigating in the environment, with minimum human intervention. In (Biber et al., 2004), a 3D modeling method using data from a laser range finder and an omnidirectional camera mounted on a mobile robot is proposed. The method consists of manual, semi-automatic, and automatic parts. Data collection and sensor calibration are carried out manually by teleoperating the robot; wall extraction is done semi-automatically with a user interface; the rest of the processing is fully automatic. In (Nevado et al., 2004), 3D models of the inside of buildings are obtained by using points measured by a laser scanner on board a mobile robot. The most likely orientation of the surface normal is first calculated at every point, considering also the neighbouring regions to avoid measurement noise. Similar planes are merged and each point is projected onto the plane it belongs to. Finally, the contours of the points in the corresponding planes are obtained by a triangulation procedure, producing a simple representation of the environment. In this case, the lack of video information limits the final 3D model to a topological representation of the environment without texture. In (Leiva et al., 2001), a 3D model of the environment is


built by combining sonar and video information. First, sonar sensors and odometers are used to estimate the distance of the robot from the objects in the environment. Odometric errors are corrected when image data are suitable. Then, a 2D probabilistic map of the environment is divided into segments which become planes in 3D. Finally, the texture from snapshots is assigned to each plane. A rough 3D representation of the environment is obtained, as the aim of the authors is to construct a basic 3D model by an easy and fast method using cheap sensors. Simultaneous Localization and Mapping (SLAM) approaches for concurrent robot localization and modelling of unknown environments can also be found in (Gutmann & Konolige, 1999; Se et al., 2002; Grisetti & Iocchi, 2004; Thrun et al., 2004; Ip & Rad, 2004; Stachniss et al., 2005; Wolf & Sukhatme, 2005).

Despite this high number of model building methods using mobile robots, only a few authors have explicitly suggested the use of mobile robotics techniques for applications in the domain of cultural heritage and carried out field tests. Examples can be found in (Allen et al., 2001; Allen et al., 2003; Hirzinger et al., 2005). Specifically, in (Hirzinger et al., 2005), methods for 3D modelling in robotic environments are applied to the digitization of cultural heritage, from small- to large-scale objects. In (Allen et al., 2001; Allen et al., 2003), instead, the use of a mobile robot is proposed to build models of urban environments and historic sites.

In our work, we employ a multisensor mobile platform for data acquisition and processing in the context of cultural heritage, and present the results of two practical implementations of the approach, one in a typical indoor polygonal environment and the other in a pre-historical cave. In both cases, the constructed environment model is employed not only for in-site navigation of the robot, but also for remote access to cultural assets. For the polygonal environment, our work is related to (Biber et al., 2004); however, we employ a monocular camera and processing is completely automatic. Furthermore, we suggest the use of a Human Augmented Mapping (HAM) (Topp & Christensen, 2005) approach to perform laser and video data acquisition, in place of the usual robot teleoperation. First, an accurate 2D map of the environment is generated, based on a SLAM algorithm using laser data (Gutmann & Schlegel, 1996). Starting from this map, the wireframe model of the environment is constructed. Then, images are processed to extract the texture to be added to the wireframe in order to obtain the complete 3D model. For exploration and mapping of the cave, instead, we present an integration of two different algorithms. Specifically, the robot constructs the 2D map of the environment using laser data, and then builds the 3D model of some zones of particular interest using computer vision techniques. The combination of these two approaches makes it possible to obtain a complete knowledge of the environment in an automatic way. The proposed solution allows access to the cave without damaging it, thus providing an effective system to monitor and preserve its relevant treasures.

3. People-following for Human Augmented Mapping (HAM)

There are several works in the literature that concentrate on either people-tracking or people-following for interaction. They rely on the use of laser range finders, vision, or both. (Kleinehagenbrock et al., 2002) integrate vision and laser range data to track a human user, based on a multi-modal anchoring technique. The legs of the user are extracted from the laser range data, while the face is extracted from camera images. In (Fod et al., 2002), a laser-based method for real-time tracking of multiple objects is presented, as a step towards the wider objective of identifying people and their activities. Range measurements are grouped into entities for an abstract representation of objects. A Kalman Filter is then associated with each object to


address occlusion and sensor noise problems. (Feyrer & Zell, 2000) present a system that detects and pursues people by using both vision and laser data. Color information is used to extract faces in images, whereas convex intervals are extracted in laser scans to detect human legs. In (Pineau et al., 2003), efficient particle filter techniques are used to detect and track people.

In our work, only laser data are employed for detecting people, based on typical human leg shape and motion characteristics. For safety reasons, laser range sensors have to be attached near the bottom of the mobile robot; hence, laser information is only available in a horizontal plane at leg height. In this case, legs therefore constitute the only part of the human body that can be used for laser-based people-tracking. The objective of the proposed approach is to develop a Human Augmented Mapping (HAM) system, that is, to implement a human-robot interaction approach to mapping. The people detection and following method consists of two main modules:
• the Leg Detection and Tracking (LDT) module, which allows the robot to detect and track people using range data, based on typical shape and motion characteristics of human legs;
• the People-Following (PF) module, which enables the mobile platform to navigate safely in a real indoor environment while following a human user. During the tour, the robot can acquire data for environment mapping tasks.
Details about both modules are provided in the remainder of this section, along with the results of some tests performed in a real context.


Fig. 3. (a) The robot detecting a leg-shaped object; (b) geometrical representation of a scan interval; (c) geometrical representation of two subsequent scan points.

3.1 Leg-detection and tracking

The Leg Detection and Tracking (LDT) method makes it possible to detect and track legs, based on typical human leg shape and motion characteristics. The algorithm starts by acquiring a raw


scan covering a 180° field of view. Laser data are analyzed to look for scan intervals with significant differences in depth at their edges (Feyrer & Zell, 2000). Specifically, let us denote by

S = [s1, s2, ..., sk, ..., sn]    (1)

a raw scan, sk being, for k = 1, 2, ..., n, the laser readings ordered according to the rotation sense of the laser beam (counterclockwise in Fig. 3(a)). A scan interval, delimited by the scan points Pj and Pm (see Fig. 3(b)), i.e.

Sjm = [sj, sj+1, ..., sm]    (2)

with Sjm ⊂ S, is selected if it satisfies the conditions

|sj−1 − sj| > τj    (3)

|si − si+1| ≤ τi,  for j < i < m    (4)

|sm+1 − sm| > τm    (5)

The thresholds τi, for j ≤ i ≤ m, are dynamically computed at each step based on the following considerations. Let us denote by Pi and Pi+1 two subsequent scan point readings (see Fig. 3(c)). Since the rotational resolution ϑ of the laser beam is small (approximately 0.5°), it yields

hi ≅ si·ϑ    (6)

and

ϑ ≅ sin ϑ    (7)

so that we have

hi ≅ si·sin ϑ    (8)

If Pi and Pi+1 belong to a continuous region, then only a small difference between the corresponding readings will be observed; conversely, a difference

|si − si+1| > hi    (10)

indicates that the two points do not lie on the same surface. The value of hi can therefore be usefully employed to define a threshold τi, establishing whether Pi and Pi+1 lie on a continuous surface or belong to different objects. Specifically, the threshold τi can be expressed as

τi = β·hi    (11)

where β > 1 is an empirically determined coefficient that takes into account laser noise effects.
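To make the interval-selection rule concrete, the fragment below is a minimal Python sketch, written for this text, of a simplified scan segmentation based on the adaptive threshold of Eq. (11); the β value and the splitting strategy (cutting the scan at every depth discontinuity rather than testing conditions (3)-(5) explicitly) are illustrative assumptions, not the authors' implementation.

import math

def segment_scan(scan, angular_res_deg=0.5, beta=1.5):
    """Split a raw laser scan into candidate intervals whose edges show depth
    discontinuities larger than the adaptive threshold tau_i = beta * h_i,
    with h_i ~= s_i * sin(theta) (Eqs. 6-11). beta > 1 absorbs sensor noise."""
    theta = math.radians(angular_res_deg)

    def tau(i):
        # adaptive threshold for the i-th reading (Eq. 11)
        return beta * scan[i] * math.sin(theta)

    intervals, start = [], 0
    for i in range(len(scan) - 1):
        # a jump larger than the local threshold closes the current interval
        if abs(scan[i] - scan[i + 1]) > tau(i):
            intervals.append((start, i))
            start = i + 1
    intervals.append((start, len(scan) - 1))
    return intervals

# example: a wall at ~3 m with a leg-like object at ~1 m in front of it
example = [3.0] * 40 + [1.0] * 6 + [3.0] * 40
print(segment_scan(example))   # -> [(0, 39), (40, 45), (46, 85)]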


Once a set of scan intervals has been selected, a criterion to differentiate between human legs and other similar objects, such as legs of chairs and tables and protruding door frames, must be defined. To achieve this aim, first, the width of each pattern is calculated as the Euclidean distance between its end-points and is compared with the typical diameter of a human leg (0.1 m to 0.25 m). Then, a Region of Interest (ROI) is fixed in the vicinity of each candidate pattern. A leg-shaped region detected within each ROI at the next scan reading is classified as a human leg if the displacement of the pattern relative to its previous position has occurred with a velocity compatible with a typical human leg velocity (0.2 m/s to 1 m/s). Note that if the robot is moving, and thus so is the scanner, the effect of ego-motion must first be accounted for. This can be done employing the information provided by the on-board odometers or by the laser scanner.

3.2 People-following

The people-following algorithm consists of the following steps:
1. detect human legs using the LDT module;
2. choose the closest moving person within a certain distance and angular position relative to the robot;
3. keep track of and follow the target person until he/she stops or disappears from the scene.
A control loop is employed, which sets the speed and turn rate of the robot based on its distance from the person and from other objects present in the environment (see the sketch below). An obstacle avoidance routine that uses sonar information is also implemented. As will be shown below, the People-Following (PF) module can be effectively employed in the context of Human Augmented Mapping (HAM).

3.3 Some tests

In order to test both the performance of the people detection and tracking algorithm and the effectiveness of the people-following method, some tests were performed in our institute. Specifically, three different test scenarios were analysed: 1) robot still, one person present; 2) robot still, two people present; 3) robot following one person for Human Augmented Mapping (HAM). Each experimental setup is discussed in the rest of this subsection.
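Before turning to the test scenarios, the fragment below sketches the kind of people-following control loop mentioned in Section 3.2. The gains, speed limits and safety distance are hypothetical values chosen for illustration, and the functions supplying the target position and sending velocity commands are assumed to exist elsewhere.

# hypothetical tuning values, not taken from the chapter
SAFE_DIST = 0.6      # m, minimum distance kept from the user
MAX_SPEED = 0.5      # m/s
MAX_TURN  = 45.0     # deg/s
K_DIST, K_ANG = 0.8, 1.5

def follow_step(target_range, target_bearing_deg, obstacle_close):
    """One iteration of a simple proportional people-following controller.
    target_range/bearing come from the LDT module; returns (v, w) commands."""
    if obstacle_close:
        return 0.0, 0.0                       # defer to the sonar-based avoidance routine
    # slow down as the robot approaches the safety distance
    v = K_DIST * (target_range - SAFE_DIST)
    v = max(0.0, min(MAX_SPEED, v))
    # turn towards the person
    w = K_ANG * target_bearing_deg
    w = max(-MAX_TURN, min(MAX_TURN, w))
    return v, w

print(follow_step(1.5, 10.0, False))          # -> (0.5, 15.0)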

Fig. 4. The trajectory of a person moving in the area surveyed by the robot.


Fig. 5. The trajectories of two people crossing the area surveyed by the robot.

1. Robot still, one person - In this case, the robot is not moving. Only one person is present and crosses the field of interest at varying speed. Fig. 4 shows the trajectory of the target within the inspected area (i.e. the semi-circumference marked by the dashed line) obtained by the tracking system during one experiment. Black dots represent laser scan readings. The target was classified at all times as a moving person.
2. Robot still, two people - Two people cross the scene surveyed by the robot, which is not moving. Assuming that the motion direction of each person does not vary significantly, the system is able to keep track of the two trajectories separately, as shown in Fig. 5.

Fig. 6. Two image sequences taken during the pursuit of a person.

3. Robot following one person for Human Augmented Mapping (HAM) - In this case, the robot follows the user in a tour of the environment, maintaining a safe distance from him. The user is identified as the closest person within a range of 1 m. Fig. 6 shows two short image sequences taken during the experiments. While following the user, the


robot is able to acquire laser data that make it possible to concurrently reconstruct its trajectory and build a 2D map of the environment. The simultaneous localization and mapping process is shown in Fig. 7 for two different tests. In this figure, the continuous lines indicate the robot path, while small arcs represent the user's legs and black points the reconstructed map.

Fig. 7. Two people-following experiments for Human Augmented Mapping (HAM).

In the next section, details about the 2D mapping approach will be provided. It will also be shown how, assuming the environment to be polygonal and adding visual information, the 2D map can be used to recover a full 3D model of the environment.

4. Modelling a polygonal environment

In this section, we illustrate our approach to building the 3D model of a polygonal environment, such as a museum, based on laser and video data acquired by a mobile platform. Experiments were carried out in a corridor of our institute. This does not lose generality, since an office environment is very similar to a museum one. First, data acquisition was performed by taking the robot on a tour of the environment, and a 2D map was constructed using a laser-based Simultaneous Localization and Mapping (SLAM) algorithm. Then, a 3D wireframe model was generated by adding vertical planes to the contour defined by the planar map. Finally, the 3D model was completed by applying the texture recovered from the acquired images to the wireframe model. Note that the camera was mounted on board the robot above the laser range finder, and its optical axis was oriented 90° to the right with respect to the forward direction of the robot in order to acquire images of the lateral walls.

4.1 2D Map of a polygonal environment

The construction of a map of the environment is a basic step both for robot navigation and for accurate knowledge of the environment. The robot can use the map to localize itself as well as to recognize places already explored. This turns the problem of building a map into a problem of on-line Simultaneous Localization and Mapping (SLAM). In this work, the Combined Scan Matcher (CSM) method proposed in (Gutmann & Schlegel, 1996) is applied to construct a laser-based 2D map and simultaneously estimate the robot trajectory. This approach integrates the IDC algorithm by Lu and Milios (Lu & Milios, 1997) with the method proposed by Cox (Cox, 1991). Fig. 8 shows the point map, with the robot trajectory overlaid, obtained by applying the CSM algorithm followed by a statistical filtering for


noise removal. This map is composed of 112 frames scanned by the laser and covers an area of approximately 20×2 m². The next step is to fit the points with line segments. The result of the line fitting procedure is a map to which a gap-removing algorithm has been applied. The final map is shown in Fig. 9. Each pair of consecutive line segments is intersected to obtain the corners in the map. These points will be useful for building the 3D model of the environment.
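The corner-extraction step, intersecting consecutive fitted segments, can be illustrated by the sketch below; the least-squares line representation and the toy wall points are assumptions made purely for illustration and are not part of the original map-building software.

import numpy as np

def fit_line(points):
    """Total-least-squares line fit a*x + b*y = c with (a, b) a unit normal."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # the normal is the right singular vector of the smallest singular value
    _, _, vt = np.linalg.svd(pts - centroid)
    a, b = vt[-1]
    c = a * centroid[0] + b * centroid[1]
    return a, b, c

def intersect(l1, l2):
    """Corner point where two (non-parallel) fitted lines meet."""
    A = np.array([[l1[0], l1[1]], [l2[0], l2[1]]])
    rhs = np.array([l1[2], l2[2]])
    return np.linalg.solve(A, rhs)

# two noiseless walls meeting at (2, 0): y = 0 and x = 2
wall1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
wall2 = [(2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
print(intersect(fit_line(wall1), fit_line(wall2)))   # ~ [2. 0.]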

Fig. 8. Planar point map. The path of the vehicle is also drawn.

Fig. 9. Final planar map characterized by line segments.

4.2 Building the 3D Model of a polygonal environment

Once the planar map of the polygonal environment has been obtained, a 3D model can be constructed. The first step is to build the wireframe of the 3D model. The wireframe is generated by adding one vertical plane to each line segment in the 2D map. This operation is possible since the explored environment is polygonal. Each segment in the 2D map refers to the contour of walls, cupboards, doors, and other objects with a polygonal shape. Fig. 10 shows the resulting wireframe.
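A minimal sketch of the wireframe-generation step is given below: each 2D wall segment is extruded into a vertical rectangle. The constant wall height of 2.7 m is an arbitrary placeholder, since the chapter does not state the height actually used.

def segment_to_quad(p1, p2, height=2.7):
    """Extrude a 2D map segment (x, y) -> (x, y) into a vertical 3D quad.
    Returns four (x, y, z) vertices: bottom edge first, then top edge."""
    (x1, y1), (x2, y2) = p1, p2
    return [(x1, y1, 0.0), (x2, y2, 0.0),
            (x2, y2, height), (x1, y1, height)]

# build the wireframe as one quad per segment of the planar map
planar_map = [((0.0, 0.0), (5.0, 0.0)), ((5.0, 0.0), (5.0, 3.0))]
wireframe = [segment_to_quad(a, b) for a, b in planar_map]
print(len(wireframe), "quads,", wireframe[0])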

Fig. 10. Wireframe model.

The next step to obtain the complete 3D model is the addition of texture to improve the visual appearance of the model. The texture is obtained by using the images acquired by the camera on board the vehicle during the acquisition phase. Note that images were taken at a


fixed horizontal translation distance from each other and that the positions of the vehicle from which the images were acquired are known. The process for adding the texture can be divided into the following fundamental steps: 1) generation of the entire image covering the whole contour of the environment, i.e. image registration or mosaicking; 2) correct positioning of the registered image onto the wireframe.

1. Image registration. The fundamental step in image mosaicking is to find correspondences between different views and estimate the homographies between the reference image and all other images. Each pair of consecutive images is processed as follows: feature extraction; feature matching; selection of correct matches; final image generation. Specifically, the Harris corner detector is first applied to extract corners. Fig. 11 shows the extracted corners on two adjacent images. A simple cross-correlation algorithm is applied to find matches. False matches are then removed using RANSAC (Hartley & Zisserman, 2003). Fig. 12 shows the correct correspondences after applying the RANSAC algorithm.

Fig. 11. Corner points extracted by the Harris operator on two adjacent images.

Fig. 12. Correct correspondences of the extracted corners after the application of the RANSAC algorithm.

Subsequently, homographies are estimated by solving overdetermined systems (number of points > 4) using the linear least-squares method. By using the homographies, it is possible to estimate the horizontal translation between two consecutive images, and then the images can be aligned correctly. The resulting image, of 38981×480 pixels, is obtained by aligning 123 images of 640×480 pixels. Fig. 13 shows a portion of the registered image. The described procedure works properly when images present sufficient texture. In our experiment, regions with pictures on the walls have enough texture. Problems arise, instead, in homogeneous areas corresponding to walls, doors, etc. In these cases, feature


extraction is difficult and, therefore, homographies cannot be estimated automatically. To solve this problem, the horizontal translations estimated between textured images are used. They are correlated to the known translation distances performed by the robot during the acquisition phase and estimated by using the odometers. The median of the translations among textured images, estimated by using RANSAC and homographies, is then used to align those images without texture.
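Assuming the OpenCV library is available, the sketch below reproduces only the robust part of this step: given corner matches between two adjacent images (synthetic here), the homography is estimated with RANSAC and the horizontal shift used for the mosaic is read off. The original chain (Harris corners, cross-correlation matching, linear least squares) is not reproduced, and the point coordinates are made up for illustration.

import numpy as np
import cv2

def estimate_shift(pts_ref, pts_next):
    """Estimate the homography between two adjacent views from matched corners
    (RANSAC discards false matches) and return the horizontal translation used
    to align the images in the mosaic."""
    src = np.float32(pts_next).reshape(-1, 1, 2)
    dst = np.float32(pts_ref).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H, H[0, 2]          # H[0, 2] ~ horizontal shift for a mostly translational motion

# toy example: the second view is the first one shifted 40 px to the left
ref = [(100, 50), (200, 80), (150, 200), (300, 120), (250, 220)]
nxt = [(x - 40, y) for (x, y) in ref]
H, dx = estimate_shift(ref, nxt)
print(round(dx, 1))            # ~ 40.0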

Fig. 13. Portion of the mosaic image.

2. Adding texture to the wireframe. The final step in building the 3D model is to add the texture, defined by the mosaic image, to the wireframe model. To achieve this aim, the idea is to correctly match the corners of the wireframe model to those in the mosaic image. The corners in the wireframe model are easily detected since they are known on the 2D map. The detection of the corners in the mosaic image needs some additional elaboration. For the sake of simplicity, processing is done on the single images forming the mosaic. First, the Sobel operator is applied to extract edges. Then, the vertical lines in the images are determined by using the Hough Transform (Hough, 1962). Vertical lines can correspond to door edges, picture edges, cupboard edges and so on. Only the ones relative to the corners of the environment are useful for the correct mapping of the texture onto the wireframe. The knowledge of the robot positions, corrected by using the scan-matching algorithm, and the knowledge of the field of view of the camera allow us to find which corner falls in which image. In this way, it is possible to connect each corner to each image acquired during the acquisition phase. The result of the described procedure is the 3D model complete with texture shown in Fig. 14. For a more detailed illustration, Fig. 15 shows some portions of the same model from different points of view. The Virtual Reality Modelling Language (VRML) was used to obtain the model.
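The vertical-line detection can be sketched as follows, again assuming OpenCV. Canny is used here instead of the plain Sobel operator of the text simply to obtain a binary edge map, so this is an approximation of the described procedure rather than the procedure itself; the thresholds are illustrative.

import numpy as np
import cv2

def vertical_lines(gray, angle_tol_deg=2.0):
    """Detect near-vertical lines (candidate environment corners) in a
    greyscale image using edge extraction followed by the Hough transform."""
    edges = cv2.Canny(gray, 50, 150)                 # binary edge map
    lines = cv2.HoughLines(edges, 1, np.pi / 180.0, 200)
    verticals = []
    if lines is not None:
        for rho, theta in lines[:, 0]:
            # theta is the angle of the line normal: ~0 (or ~pi) means a vertical line
            if theta < np.radians(angle_tol_deg) or theta > np.pi - np.radians(angle_tol_deg):
                verticals.append((float(rho), float(theta)))
    return verticals

# synthetic test image: black background with one white vertical stripe
img = np.zeros((480, 640), dtype=np.uint8)
img[:, 300:303] = 255
print(vertical_lines(img))     # a couple of near-vertical lines (theta ~ 0)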

Fig. 14. Textured 3D model.


Fig. 15. Portions of the 3D model.

5. Reconstruction of a pre-historical cave

In the south of Italy, along the Adriatic coast, a cave named “Grotta dei Cervi” holds a pre-historical treasure remarkable for its complexity and its artistic and historical relevance. The cave has on its walls a huge collection of paintings of hunting scenes, stags, men, and small animal groups, realized with red ochre and bat guano and dated to the Middle Neolithic period (see Fig. 16). Access to the cave is restricted to a few authorized people. Care must be taken to guarantee their safety and to prevent polluting elements from being introduced into this particular and valuable environment. The application of a technological solution seems to be the best way to allow remote access to the archaeological site, thus satisfying the need for cave preservation and safety.

Fig. 16. Some paintings present on the walls of the cave.


In this section, we describe the study conducted in the cave, using a mobile robot able to navigate throughout the cave and acquire useful data by means of its on-board sensors. This solution reduces the risk of damaging the cave, as it does not require the installation of invasive infrastructures. The only hand-drawn map of the cave available up to now is illustrated in Fig. 17. The cave is formed by a series of narrow and twisting corridors. The area inspected by the robot is the corridor highlighted in the figure. First, a planar map of the corridor was constructed; then, the 3D model of particularly interesting areas, rich in paintings, was generated. Details for each phase are provided in the rest of this section.

Fig. 17. The hand-drawn map of the cave “Grotta dei Cervi”. The highlighted area represents the corridor explored by the mobile robot.

5.1 2D map of the cave

Generally, the high accuracy of laser data makes it possible to build accurate planar maps, especially when the robot moves in a plane. The cave environment, instead, presents a rough terrain characterized by depressions and bumps (see Fig. 2). In this case, the laser data, which supply planar information, must be integrated with data from an inclinometer in order to obtain accurate information about the scanned environment. Fig. 18(a) shows the robot path and laser data during the scanning procedure. This map is composed of 189 frames scanned by the laser and covers an area of approximately 15×40 m². The CSM algorithm was applied to reconstruct both the robot trajectory and the map of the corridor. Fig. 18(b) shows the point map obtained after the application of the CSM algorithm and noise removal. Four zones are also indicated; they are of particular interest for the presence of pre-historical paintings. In order to detect these zones, four artificial, non-invasive landmarks, distinguishable on the map by the laser, have been placed near those areas. Using these landmarks, the robot can plan the path to these regions, enabling the acquisition of images of the pre-historical paintings from the on-board camera and the subsequent construction of a 3D model of the observed area. The robot positions estimated on the map after the CSM application were compared to those provided by the odometers. The estimated errors on the x, y, and θ components of the robot pose are plotted in Fig. 19(a), (b), and (c), respectively.
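As a rough illustration of how the inclinometer data could be combined with the planar laser readings, the sketch below projects each reading onto the horizontal plane using only the measured pitch; the one-axis model and all names are assumptions made for this example, not the actual correction applied by the authors.

import math

def project_scan(ranges, angular_res_deg, pitch_deg):
    """Project the planar laser readings onto the horizontal plane, compensating
    the pitch measured by the on-board inclinometer (roll is ignored here)."""
    pitch = math.radians(pitch_deg)
    pts = []
    for k, r in enumerate(ranges):
        a = math.radians(k * angular_res_deg)          # beam angle within the scan plane
        x, y = r * math.cos(a), r * math.sin(a)        # point in the (tilted) scan plane
        pts.append((x * math.cos(pitch), y))           # flatten the forward component
    return pts

# a 3 m reading along the scan's x axis, taken while the robot climbs a 10 deg bump
print(project_scan([3.0], 0.5, 10.0)[0])               # ~ (2.954, 0.0)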


Fig. 18 (a) Laser readings and odometer data after the scanning procedure. The odometer data are represented by little arrows. (b) 2D map of the explored corridor of the cave after the application of the scan-matching algorithm. Four zones rich in interesting paintings are also highlighted.

Fig. 19. Estimated errors on (a) x, (b) y, and (c) θ components of the robot pose.


The errors on the x and y coordinates are expressed in millimetres, whereas the error on the robot orientation θ is expressed in degrees. As can be noticed, the errors are considerable because the wheels slip on the rough terrain, causing high inaccuracy in the odometer position estimates. Nevertheless, the CSM algorithm is able to correct them, producing an accurate map. The map obtained using the laser scanner supplies new and useful, although still approximate, information about the structure of the cave: such information was not available before our visit. It is important to note that the structure of the cave, supplied by the map, is very important for the knowledge and the study of the archaeological site, as it describes the morphology of the whole environment, placing each painting inside its context and facilitating a better understanding of its role and meaning. Furthermore, the planar map is useful for robot navigation inside the cave.

5.2 Building the 3D model of particular areas

Reconstructing 3D models using computer vision techniques generally requires extracting features (points, lines, target objects) and matching them (Aggarwal & Vemuri, 1986; Biber et al., 2004; Gramegna et al., 2005). Moreover, it is important to determine the correspondence between features in different images, since the accuracy of the resulting model depends directly on the accuracy of the feature correspondence. The method described here uses, as the only geometrical constraint, the correspondence between corners in different images. A complex 3D scene is reconstructed using a set of three images acquired from three different viewpoints of the same scene. The only requirement is that the images must be acquired by the same camera with a fixed focal length. After image acquisition, feature points that correspond to high-curvature points (corners) are extracted in each image using the Harris corner detector (Sequeira et al., 1999). The maximum number of corners to be extracted in each image is fixed a priori. A matching procedure is then applied to each pair of images. A classical correlation technique is first used to establish the matching candidates between two images by determining a correlation score for each pair of points. If the correlation score is higher than a given threshold, the related pair of points is considered as a candidate match. In order to verify the candidate matches, a parameter counts the number of similar candidate matches found in the neighbourhood of each candidate matched point. The sum of these parameters for all candidate matches defines an energy function. The minimization of the energy function through a relaxation technique solves the ambiguity problem (Zhang et al., 1994). After the determination of the corner correspondence for each pair of images, the set of correct matches for all three images is determined.
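The correlation-based candidate matching can be sketched as follows; the window size, the threshold and the synthetic images are illustrative choices, and the relaxation step that resolves ambiguities is not reproduced.

import numpy as np

def ncc(w1, w2):
    """Normalized cross-correlation score between two equally sized patches."""
    a = w1 - w1.mean()
    b = w2 - w2.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def candidate_matches(img1, img2, corners1, corners2, half=7, thr=0.8):
    """Return (i, j, score) for corner pairs whose correlation score exceeds thr."""
    def window(img, c):
        x, y = c
        return img[y - half:y + half + 1, x - half:x + half + 1].astype(float)

    matches = []
    for i, c1 in enumerate(corners1):
        for j, c2 in enumerate(corners2):
            s = ncc(window(img1, c1), window(img2, c2))
            if s > thr:
                matches.append((i, j, s))
    return matches

# toy example: the same random texture shifted by 5 pixels horizontally
rng = np.random.default_rng(0)
img1 = rng.random((100, 100))
img2 = np.roll(img1, 5, axis=1)
print(candidate_matches(img1, img2, [(40, 50)], [(45, 50), (70, 20)]))
# keeps only the correct pair, with a score ~1.0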

Fig. 20. Three images of an area of the wall rich in paintings. White crosses represent the correct point matches.


Fig. 20 shows three images with the matched points. Knowing the corresponding corners, it is possible to determine the Fundamental Matrix and the intrinsic parameters of the camera (Hartley, 1997). At this point, all the data necessary to reconstruct the 3D scene are known. The 3D model is reconstructed through the application of the polygonal mesh technique. The 3D model of the scene was made by using VRML. Fig. 21 shows the 3D model of the scene acquired in the cave.

Fig. 21. 3D model of an area of the cave.
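To give a flavour of the underlying geometry, and assuming OpenCV is available, the sketch below estimates the Fundamental Matrix from matched points of two synthetic views; the chapter's own pipeline (relaxation matching, Kruppa's equations for the intrinsic parameters, polygonal meshing and VRML export) is not reproduced here, and the camera parameters are invented for the example.

import numpy as np
import cv2

# synthetic two-view setup: random 3D points seen by two calibrated cameras
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (40, 3)) + np.array([0.0, 0.0, 5.0])   # points in front of the cameras
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])

def project(X, R, t):
    x = (K @ (R @ X.T + t.reshape(3, 1))).T
    return x[:, :2] / x[:, 2:]

R2, _ = cv2.Rodrigues(np.array([0.0, 0.1, 0.0]))                  # second view rotated and shifted
pts1 = project(X, np.eye(3), np.zeros(3)).astype(np.float32)
pts2 = project(X, R2, np.array([0.3, 0.0, 0.0])).astype(np.float32)

# estimate the Fundamental Matrix from the matched corners, rejecting outliers
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 1.0, 0.99)
x1 = np.append(pts1[0], 1.0)
x2 = np.append(pts2[0], 1.0)
print(int(mask.sum()), "inliers; epipolar residual:", float(x2 @ F @ x1))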

6. Conclusions

In order to promote the preservation of cultural heritage while guaranteeing accessibility to as many people as possible, novel technological solutions need to be researched. Methods from the mobile robotics field supply effective contributions to the development of environment modelling techniques that can potentially be used either to support the study of historically and artistically relevant assets or to provide remote access to museums and archaeological sites, thus satisfying the need for cultural heritage conservation and accessibility.

In this chapter, we presented the results of our research in the field of cultural heritage, concerning the use of a multisensor mobile platform for data acquisition and processing. First, we described a laser-based people-following approach that enables a mobile robot to keep track of and pursue a human user in a tour of the environment. During the tour, the robot can acquire sensorial data to be used for environment modelling. This generates what is usually referred to as Human Augmented Mapping (HAM). Then, we presented the results of two case studies. The first one was related to the problem of constructing a model of a polygonal environment, such as a museum. Data acquisition was performed using a mobile robot equipped with a 2D laser rangefinder and a CCD camera. Specifically, laser information was employed to simultaneously reconstruct the robot trajectory and build a planar map of the environment. From this map, a wireframe model was recovered. Finally, images were used to generate the texture to be added to the wireframe. Experimental results obtained for tests performed in our institute demonstrated the effectiveness of the proposed methods. The second case study presented in the chapter focused on the application of a technological solution for remote access and mapping of a pre-historical cave in Southern Italy, named “Grotta dei Cervi”. A multisensor mobile robot platform was used to explore the cave and send useful information to a remote console. Based on sensor data, the two-dimensional


map of the site was reconstructed, along with the 3D model of zones of particular interest. Despite the structural complexity of the site, the proposed technological solution proved to be effective for making the archaeological cave accessible without damaging it. It was shown that the use of such a solution expands the knowledge of this kind of site and improves the capability of monitoring and preserving its relevant archaeological treasures.

7. References Aggarwal, J. K. & Vemuri, B. C. (1986), 3-D model construction from multiple views and intensity data, Proceedings of IEEE Conf. Computer Vision and Pattern Recognition (CVPR), pp. 435-437, Miami Beach, June 1986, IEEE Computer Society, Washington Allen, P., Stamos, I., Gueorguiev, A., Gold, E. & Blaer, P. (2001), AVENUE: Automated Site Modeling in Urban Environments, Proceedings of IEEE International Conference on 3D Digital Imaging and Modeling, pp. 357–364, ISBN 0-7695-0984-3, Québec City, Canada, 28 May-1 June 2001, IEEE Computer Society, Washington Allen, P. K., Stamos, I., Troccoli, A., Smith, B., Leordeanu, M. & Hsu, Y. C. (2003), 3D Modeling of Historic Sites using Range and Image Data, Proceedings of IEEE International Conference on Robotics and Automation, pp. 145–150, ISBN 0-7803-7736-2, Taipei, Taiwan, 14-19 September 2003, IEEE Computer Society, Washington Biber, P.; Andreasson, H., Duckett, T. & Schilling, A. (2004), 3D Modeling of indoor environments by a mobile robot with a laser scanner and panoramic camera, Proceedings of IEEE/RSJ Int. Conference on Intelligent Robots and Systems, pp. 34303435, ISBN 0-7803-8464-4, Sendai, Japan, September 2004, IEEE Computer Society, Los Alamitos Burgard, W.; Cremers, A. B., Fox, D., Hahnel, D., Lakemeyer, G., Schulz, D., Steiner, W. & Thrun, S. (1999), Experiences with an interactive museum tour-guide robot, Artificial Intelligence, Vol. 114, No. 1-2, (October 1999), (3-55), ISSN 0004-3702 Cox, I.J. (1991), Blanche-an experiment in guidance and navigation of an autonomous robot vehicle, IEEE Transactions on Robotics and Automation, Vol.7, No. 2, (April 1991), (193-204), ISSN 1042-296 Feyrer, S. & Zell, A. (2000), Robust Real-Time Pursuit of Persons with a Mobile Robot using Multisensor Fusion, Proceedings of Int. Conference on Intelligent Autonomous Systems (IAS6), pp. 710-715, ISBN 1-58603-078-7, Venice, Italy, July 2000, IOS Press, Amsterdam Fod A.; Howard, A. & Mataric, M. (2002), Laser-Based People Tracking, Proceedings of IEEE Int. Conference on Robotics and Automation, pp. 3024-3029, ISBN 0-7803-7273-5, Washington, DC, USA, May 2002, IEEE Computer Society, Los Alamitos Gramegna, T., Attolico, G., Distante, A. (2005), Different approaches to improve the construction of 3D models using computer vision techniques, Proceedings of AI*IA Workshop for Cultural Heritage, ISBN 88-900910-0-2, Milan, Italy Grisetti, G. & Iocchi, L. (2004), Map building in planar and non planar environments, Proceedings of The 2nd International Workshop on Synthetic Simulation and Robotics to Mitigate Earthquake Disaster, Lisbon, Portugal, June 2004 Gutmann, J. S. & Schlegel, C. (1996), AMOS: Comparison of scan matching approaches for self-localization in indoor environments, Proceedings of First Euromicro Workshop on


Advanced Mobile Robots, pp. 61–67, Kaiserlautern, Germany, ISBN 0-8186-7695-7, October 1996, IEEE Computer Society, Los Alamitos Gutmann, J. S. & Konolige, K. (1999), Incremental mapping of large cyclic environments, Proceedings of IEEE International Symposium on Computational Intelligence in Robotics and Automation, pp. 318-325 , Monterey, CA, ISBN 0-7803-5806-6, November 1999, IEEE Computer Society, Los Alamitos Hartley, R., (1997), Kruppa’s equations derived from the fundamental matrix, IEEE Transactions on pattern analysis and machine intelligence, Vol. 19, No. 2, (February 1997), (133-135), ISSN 0162-8828 Hartley, R. & Zisserman, A. (2003). Multiple View Geometry in Computer Vision, 2nd edition, Cambridge University Press, ISBN 0521540518 Hirzinger, G., Bodenmüller, T., Hirschmüller, H., Liu, R., Sepp, W., Suppa, M., Abmayr, T. & Strackenbrock, B. (2005), Photo-realistic 3D modelling - From robotics perception towards cultural heritage. Proceedings of International Workshop on Recording, Modeling and Visualization of Cultural Heritage, Ascona, Switzerland, May 2005 Hough, P. V. C, (1962), Method and means for recognizing complex patterns, U.S. Patent 3069654 Ip Y. L. & Rad A. B. (2004), Incorporation of feature tracking into Simultaneous Localization and Mapping building via sonar data, Journal of Intelligent and Robotic Systems, Vol. 39, No. 2, (February 2004), (149-172), ISSN 0921-0296 Kleinehagenbrock, M.; Lang S., Fritsch J., Lomker F., Fink G. & Sagerer G. (2002), Person tracking with a mobile robot based on multi-modal anchoring, Proceedings of IEEE Int. Workshop on Robot and Human Interactive Communication, pp. 423-429, ISBN 07803-7545-9, Berlin, Germany, September 2002, IEEE Computer Society, Los Alamitos Leiva, J. M.; Martinez, P., Perez, E. J., Urdiales, C. & Sandoval, F. (2001), 3D Reconstruction of static indoor environment by fusion of sonar and video data, Proceedings of Int. Symposium on Intelligent Robotics Systems, pp.179-188, ISBN 2-907801-01-5, Toulouse, France, July 2001, LAAS-CNRS, Toulouse Lu, F. & Milios, E. (1997), Robot pose estimation in unknown environments by matching 2D range scans, Journal of Intelligent and Robotic Systems, Vol. 18, No. 3, (March 1997), (249–275), ISSN 0921-0296 Milella, A., Dimiccoli, C., Cicirelli, G. & Distante, A. (2007), A., Laser-based people-following for human-augmented mapping of indoor environments, Proceedings of the 25th IASTED International Multi-Conference: artificial intelligence and applications, pp. 151155, ISBN 978-0-88986-631-7, Innsbruck, Austria, February 12-14, 2007, ACTA Press Anaheim, CA, USA Nevado, M. M; Garcia-Bermejo, J. G., Casanova, E. Z. (2004), Obtaining 3D models of indoor environments with a mobile robot by estimating local surface directions, Robotics and Autonomous Systems, Vol. 48, No. 2-3, (September 2004), (131–143), ISSN 09218890 Pineau J.; Montemerlo, M., Pollack, M., Roy, N. & Thrun, S. (2003), Towards robotic assistants in nursing homes: challenges and results, Robotics and Autonomous Systems, Vol. 42, No. 3-4, (March 2003), (271-281), ISSN 0921-8890


Se, S., Lowe, D. G. & Little, J. (2002), Mobile robot localization and mapping with uncertainty using scale-invariant visual landmarks, International Journal of Robotics Research, Vol. 21, No. 8, (August 2002), (735-758), ISSN 0278-3649 Sequeira, V., Ng, K., Wolfart, E., Gonçalves, J. G. M., Hogg, D. C. (1999), Automated Reconstruction of 3D Models from Real Environments, ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 54, No. 1, (February 1999), (1-22), ISSN 0924-2716 Stachniss, C; Hanhel, D., Burgard W. & Grisetti, G. (2005), On actively closing loops in gridbased fast-slam, Advanced Robotics, Vol. 19, No. 10, (1059-1079), ISSN 0169-1864 Thrun, S.; Beetz, M. Bennewitz, M., Burgard, W., Cremers, A. B., Dellaert, F., Fox, D., Hahnel, D., Rosenberg, C., Roy, N., Schulte, J. & Schulz, D. (2000), Probabilistic algorithms and the interactive museum tour-guide robot minerva, The International Journal of Robotics Research, Vol. 19, No. 11, (November 2000), (972-999), ISSN 02783649 Thrun, S; Liu, Y., Koller D., Ng, A. Y., Ghahramani, Z., Durrant-Whyte, H., (2004), Simultaneous Mapping and Localization with Sparse Extended Information Filters: Theory and Initial Results, International Journal of Robotics Research, Vol. 23, No. 7-8 (July-August 2004), (693-716), ISSN 0278-3649 Topp, E. A. & Christensen, H. I. (2005). Tracking for Following and Passing Persons, Proceedings of IEEE/RSJ Int. Conference on Intelligent Robots and Systems (IROS), pp. 2321-2327, ISBN 0-7803-8913-1, Edmonton, Alberta, Canada, August 2005, IEEE Computer Society, Los Alamitos Trahanias, P.; Burgard, W., Argyros A., Hahnel, D., Baltzakis, H., Pfaff, P. & Stachniss, C. (2005), TOURBOT and WebFAIR: Web-operated mobile robots for telepresence in populated exhibitions, IEEE Robotics & Automation Magazine, Vol. 12, No. 2, (June 2005), (77-89), ISSN 1070-98932 Wolf, D.F. & Sukhatme, G. S. (2005), Mobile robot simultaneous localization and mapping in dynamic environment”, Autonomous Robots, Vol. 19, No. 1, (July 2005), (53-65), ISSN 0929-5593 Zhang, Z., Deriche, R., Faugeras, O., Luong, Q. (1994), A robust technique for matching two uncalibrated images trough the recovery of the unknown epipolar geometry, Technical report N° 2273, Institut national de recherche en informatique et en automatique.

9 On-line Cutting Tool Condition Monitoring in Machining Processes using Artificial Intelligence

Antonio J. Vallejo¹, Rubén Morales-Menéndez² and J.R. Alique³

¹Visiting scholar at the Instituto de Automática Industrial, Madrid, Spain
²Tecnológico de Monterrey, Monterrey NL, México
³Instituto de Automática Industrial, Madrid, Spain

1. Introduction

High Speed Machining (HSM) has become one of the leading methods for improving machining productivity. The term HSM covers high spindle speeds and high feed rates, as well as high acceleration and deceleration rates. Furthermore, HSM does not only imply working at high speeds but also with high levels of precision and accuracy. In addition to HSM, many companies producing machine tools are interested in new technologies which provide intelligent features. Several research works (Koren et al., 1999; Erol et al., 2000; Liang et al., 2004) predict that future manufacturing systems will have intelligent functions to enhance their own processes, and the ability to perform effective, reliable, and superior manufacturing procedures. In the areas of process monitoring and control, these new systems will also have a higher process technology level.

In any typical metal-cutting process, the key indexes which define product quality are dimensional accuracy and surface roughness; both are directly influenced by the cutting tool condition. One of the main goals in a Computer Numerically Controlled (CNC) machining centre is to find an appropriate trade-off among cutting tool condition, surface quality, and productivity. A cutting tool condition monitoring system which optimizes the operating cost while maintaining the same product quality would be widely appreciated (Saglam & Unuvar, 2003; Haber & Alique, 2003). For example, in (Tönshoff et al., 1988) it has been demonstrated that the effective machining time of a CNC milling centre could be increased from 10 to 65% with a monitoring and control system. Also, (Sick, 2002) mentions that any manufacturing process can be significantly optimized using a reliable and flexible tool monitoring system. The system must perform the following tasks:
• Collision detection as fast as possible.
• Tool fracture identification.
• Estimation or classification of tool wear caused by abrasion or other influences.
While collision and tool fracture are sudden and mostly unexpected events that require reactions in real time, the development of wear is a slow process. This section focuses on


the estimation of wear. Tool wear monitoring is important because it allows worn tools to be exchanged in time, and tool costs can be reduced with a precise exploitation of the tool's lifetime. However, cutting tool monitoring is not an easy task, for several reasons. First, machining processes are non-linear, time-variant systems, which makes them difficult to model. Secondly, the signals acquired from the sensors depend on many other factors, such as machining conditions, cutting tool geometry, and workpiece material, among others. There is no direct method for measuring cutting tool wear, so indirect measurements are needed for its estimation. Besides, the signals coming from the machine tool sensors are disturbed by many other causes, such as cutting tool breakage, chatter, tool geometry variances, workpiece material properties, digitizer noise, and sensor nonlinearity, among others. There is no straightforward solution.

Symbol | Description
A | State transition probability distribution
AC | Accelerometer
AE | Acoustic Emission
ae | Radial depth of cut (mm)
aij | Elements of the transition matrix
ANN | Artificial Neural Networks
ap | Axial depth of cut (mm)
BN | Bayesian Networks
B | Obs. symbol probability distribution
CNC | Computer Numerically Controlled
Curv | Machining geometry curvature (mm⁻¹)
DY | Dynamometer
DOE | Design Of Experiments
Dtool | Diameter of the cutting tool (mm)
FFT | Fast Fourier Transform
FAR | False Alarm Rate
FFR | False Fault Rate
fHZ | Sampling frequency (Hz)
fMel | Scale Mel frequency
fz | Feed per tooth (mm/rev/tooth)
Fx | Cutting force in x-axis (N)
Fy | Cutting force in y-axis (N)
Fz | Cutting force in z-axis (N)
HB | Brinell Hardness Number of the workpiece (BHN)
HMM | Hidden Markov Models
HSM | High Speed Machining
LVQ | Learning Vector Quantization
L | Machining length (mm)
M | Log bandpass filter output amplitude
MFCC | Mel Frequency Cepstrum Coeff.
MR | Multiple Regression
M | Number of distinct obs. symbols
N | Spindle speed (rpm)
Ns | Number of states in the model
Nf | Number of bandpass filters
np | Number of passes over workpiece
O | Observation sequence of model
qt | State at time t
S | State sequence in the model
SOFM | Self-Organizing Feature Maps
SP | Spindle Power
T | Length of observation sequence
Tc | Tool life (min)
Tmach | Machining time (min)
Tr | Training dataset
Ts | Testing dataset
V | Set of individual symbols
VB | Flank wear (mm or μm)
VB1 | Uniform flank wear (mm or μm)
VB2 | Non-uniform wear (mm or μm)
VB3 | Localized flank wear (mm or μm)
Vol | Volume of removed metal (mm³)
x | Sample
z | Number of teeth of cutting tool
λ | HMM model specification
π | Initial state distribution for HMM
μ | Mean value
σ | Standard deviation

Table 1. Nomenclature.

This work proposes new ideas for cutting tool condition monitoring and diagnosis with intelligent features (i.e. pattern recognition, learning, knowledge acquisition, and inference from incomplete information). Two techniques will be applied: Artificial Neural Networks and Hidden Markov Models.


The proposal is implemented for the peripheral milling process in HSM. Table 1 presents all the symbols and variables used in this chapter.

2. State of the art

The cutting tool wear condition is an important factor in all metal cutting processes. However, direct monitoring systems are not easily implemented because they require ingenious measuring methods. For this reason, indirect measurements are required for the estimation of cutting tool wear. Signals from different machine tool sensors are used for monitoring and diagnosing the cutting tool wear condition. There are important contributions to cutting tool monitoring systems based on Artificial Neural Networks (ANN), Bayesian Networks (BN), Multiple Regression (MR) approaches, and stochastic methods.

In (Owsley et al., 1997), the authors presented an approach for monitoring the cutting tool condition. Feature extraction from vibrations during drilling is performed by Self-Organizing Feature Maps (SOFM). The signal processing implies a spectral feature extraction to obtain the time-frequency representation. These features are the inputs of an HMM classifier. The authors demonstrated that SOFM are an appropriate algorithm for vibration signal feature extraction. A methodology based on the frequency domain is presented by (Chen & Chen, 1999) for on-line detection of cutting tool failure. At low frequencies, the frequency domain presents two important peaks, which are compared to compute a ratio that could be an indicator for monitoring tool breakage. In (Atlas et al., 2000), the authors used HMM for the evaluation of tool wear in milling processes. The features extracted from the vibration signals were the root mean square, the energy, and its derivative. Two cutting tool conditions were defined: worn and not worn. The reported success was around 93%.

In (Sick, 2002a), a new hybrid technique for cutting tool wear monitoring, which fuses a physical process model with an ANN model, is proposed for turning. The physical model describes the influence of the cutting conditions on the measured force signals and is used to normalize them. The ANN model establishes a relationship between the normalized force signals and the wear state of the cutting tool. The performance of the best model was 99.4% for the learning step and 70.0% for the testing step. In (Haber & Alique, 2003), an intelligent supervisory system for cutting tool wear prediction using a model-based approach is developed. The dynamic behavior of the cutting force is associated with the cutting tool and process conditions. First, an ANN model is trained considering the cutting force, the feed rate, and the radial depth of cut. Secondly, the residual error between the measured and predicted force is compared with an adaptive threshold in order to estimate the cutting tool condition. This condition is classified as new, half-worn, or worn cutting tool. In (Saglam & Unuvar, 2003), the authors worked with multilayered ANN for the monitoring and diagnosis of the cutting tool condition and surface roughness. The obtained success rates were 77% for tool wear and 80% for surface roughness.

In (Dey & Stori, 2004), a monitoring and diagnosis approach based on a BN is presented. This approach integrates multiple process metrics from sensor sources in sequential machining operations to identify the causes of process variations. It provides a probabilistic


confidence level of the diagnosis. The BN was trained with a set of 16 experiments, and the performance was evaluated with 18 new experiments. The BN diagnosed the correct state with a 60% confidence level in 16 of 18 cases. In (Haber et al., 2004), an investigation of cutting tool wear monitoring in an HSM process based on the analysis of different signal signatures in the time and frequency domains is introduced. The authors used sensorial information from dynamometers, accelerometers, and acoustic emission sensors to obtain the deviation of representative variables. The tests were designed for different cutting speeds and feed rates to determine the effects of a new and a worn cutting tool. Data were transformed from the time to the frequency domain using the Fast Fourier Transform (FFT) algorithm. They concluded that the second harmonics of the tooth path excitation frequency in the vibration signal are the best indicator for cutting tool wear monitoring.

A proposal to exploit speech recognition frameworks in monitoring systems of the cutting tool wear condition is presented in (Vallejo et al., 2005). Also, (Vallejo et al., 2006) presented a new approach for on-line monitoring of the cutting tool wear condition in face milling. The proposal is based on a continuous HMM classifier, and the feature vectors were computed from the vibration signals between the cutting tool and the workpiece. The feature vectors consisted of the Mel Frequency Cepstrum Coefficients (MFCC). The success rate in recognizing the cutting tool condition was 99.86% and 84.55% for the training and testing datasets, respectively. Also, in (Vallejo et al., 2007), an indirect monitoring approach based on vibration measurements during the face milling process is proposed. The authors compared the performance of three different algorithms: HMM, ANN, and Learning Vector Quantization (LVQ). The HMM was the best algorithm with 84.24% accuracy, followed by the LVQ algorithm with 60.31% accuracy. Table 2 summarizes all the works discussed in this section.
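As an aside on the MFCC-based features mentioned above, the fragment below shows one way such coefficients could be computed from a vibration signal, assuming the librosa package is available; it is only meant to illustrate the type of feature vector, not the signal-processing chain actually used in the cited works, and the synthetic signal and sampling rate are made up.

import numpy as np
import librosa

def mfcc_features(vibration, fs, n_mfcc=13):
    """Compute MFCC feature vectors from an accelerometer signal.
    Returns one n_mfcc-dimensional vector per analysis frame."""
    feats = librosa.feature.mfcc(y=vibration.astype(np.float32), sr=fs, n_mfcc=n_mfcc)
    return feats.T                           # shape: (frames, n_mfcc)

# synthetic vibration: a tooth-passing harmonic plus noise, sampled at 8 kHz
fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
signal = np.sin(2 * np.pi * 400 * t) + 0.1 * np.random.randn(t.size)
print(mfcc_features(signal, fs).shape)       # e.g. (frames, 13)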

3. Experimental set-up

This research work was focused on covering a domain in the mold and die industry with different aluminium alloys. In this industry, the peripheral milling process is of great importance; its geometry can be defined as a simple straight line or as a more complex path including concave and convex curvatures. The experiments took place in a Kondia HS-1000 HSM centre, with a 25 kW drive motor, three axes, a maximum spindle speed of 24,000 rpm, and an open Siemens Sinumerik 840D controller, as shown in Figure 1. During the experiments, several HSS end mill cutting tools (25° helix angle, 2 flutes) from Sandvik Coromant were selected for the end milling process, and different workpiece materials (aluminium with hardness from 70 to 157 HBN) were used. These materials were selected because they have important applications in the aeronautic and mold manufacturing industries. Also, several cutting tool diameters (from 8 to 20 mm) were employed.

3.1 Design of experiments

Currently, most research experiments are related to surface roughness and flank wear (VB), and they only consider a specific combination of cutting tool and workpiece material. Therefore, several authors have pointed out the importance of building databases with information on different materials and cutting tools that allow


computing models by considering a complete domain of the machining process. The DOE was defined to consider the most important factors affecting the surface roughness during the peripheral end milling process, see (Vallejo et al., 2007a). Therefore, its results are relevant to compute a surface roughness model as well as a model to predict the cutting tool condition.

Process | Monitoring states | Sensor signals | Recognition method | References
Drilling | Tool wear | AC | HMM | (Owsley et al., 1997)
End Milling | Tool breakage (Normal, Broken) | AC | FFT | (Chen & Chen, 1999)
End Milling | Tool wear (Worn, no-worn) | AC | HMM | (Atlas et al., 2000)
Turning | Tool wear (Wear value) | Process parameters | ANN | (Sick, 2002)
Turning | Tool wear (New, half-worn, worn) | Process parameters | ANN | (Haber & Alique, 2003)
Face Milling | Tool wear (Flank wear) | DY | ANN | (Saglam & Unuvar, 2003)
Face Milling | Tool wear (Low, high) | AE, SP | BN | (Dey & Stori, 2004)
Milling | Tool wear (New, worn) | AE, DY, AC | FFT | (Haber et al., 2004)
Face Milling | Tool wear (New, half-new, half-worn, worn) | AC | HMM | (Vallejo et al., 2006)
Face Milling | Tool wear (New, half-new, half-worn, worn) | AC | HMM, ANN, LVQ | (Vallejo et al., 2007)

Table 2. Comparison of different research efforts for monitoring the cutting tool condition. The recognition method is defined by considering the machining process, the sensor signals, and the classification method.

The factors and levels were defined via the application of a screening factorial design over the most important factors affecting the surface roughness. These factors and levels were the following: feed per tooth (fz), cutting tool diameter (Dtool), radial depth of cut (ae), hardness of the workpiece material (HB), and the machining geometry curvature (Curv). Table 3 shows the factors and levels defined for the experiments. Table 4 presents the selected aluminium alloys with the different cutting tools used in the experiments. The dimensions of the workpiece were 100x170x25 mm, and they were designed to allow the machining of four replicates. The designed geometries are depicted in Figure 2a, and the cutting tools are shown in Figure 2b. The machining domain in HSM was characterized by using different aluminium alloys, cutting tools and several geometries (concave, convex and straight paths) in the peripheral milling process, and the DOE considered the following steps:
1. Run a set of experiments with the cutting tool in sharp condition. During the experimentation the process variables were recorded.
2. Wear the cutting tool with the harder aluminium alloys until reaching a specific flank wear in agreement with ISO 8688 (Tool life testing in milling).
3. Run another set of experiments with a different cutting tool wear condition.
4. Repeat steps 2 and 3 until the cutting tool reaches the tool-life criterion.


Fig. 1. Experimental set-up. CNC machining centre HS-1000 Kondia (right side), and the workpiece fixed to the table after the machining process (left side).

Fig. 2. a) Aluminium workpieces and geometries. b) Cutting tools for the experimentation.

Levels | fz (mm/rev/tooth) | Dtool (mm) | ae (mm) | HB (BHN) | Curv (mm^-1)
-2 | 0.025 | 8 | 1 | 71 | -0.05
-1 | 0.05 | 10 | 2 | 93 | -0.025
0 | 0.075 | 12 | 3 | 110 | 0
1 | 0.1 | 16 | 4 | 136 | 0.025
2 | 0.13 | 20 | 5 | 157 | 0.05

Table 3. Factors and levels defined for the experimentation.

Workpiece material (hardness HB) | Cutting tool (diameter)
5083-H111 (71 HB) | R216.32-08025-AP12AH10F (8 mm)
6082-T6 (93 HB) | R216.32-10025-AP14AH10F (10 mm)
2024-T3 (110 HB) | R216.32-12025-AP16AH10F (12 mm)
7022-T6 (136 HB) | R216.32-16025-AP20AH10F (16 mm)
7075-T6 (157 HB) | R216.32-20025-AP20AH10F (20 mm)

Table 4. Aluminium alloys and specifications of the cutting tools used in the experimentation.


3.2 Tool life evaluation

In a practical workshop environment, the time at which a tool ceases to produce workpieces of the desired size or surface quality usually determines the end of the useful tool life. It is essential to define tool life as the total cutting time needed to reach a specified value of a tool-life criterion. Here, it is necessary to identify and classify the cutting tool deterioration phenomena, and where they occur on the cutting edges. The choice of the numerical values of tool deterioration used to determine tool life affects the quantity of testing material required and the cost of testing. The following concepts are given to explain the deterioration phenomena of the cutting tool:
• Tool wear. Change in shape of the cutting part of a tool from its original shape, resulting from progressive loss of tool material during cutting.
• Brittle fracture (chipping). Occurrence of cracks in the cutting part of a tool followed by the loss of small fragments of tool material.
• Tool deterioration measure. Quantity used to express the magnitude of a certain aspect of tool deterioration by a numerical value.
• Tool-life criterion. Predetermined value of a specified tool deterioration measure indicating the occurrence of a specified phenomenon.
• Tool life (Tc). Total cutting time of the cutting part required to reach a specified tool-life criterion.
In Figure 3, terms related to the tool deterioration phenomena on end milling cutters are shown. These terms include:
• Flank wear (VB): Loss of tool material from the tool flanks, resulting in the progressive development of the flank wear land.
• Uniform flank wear (VB1): Wear land which is normally of constant width and extends over the tool flanks of the active cutting edge.
• Non-uniform flank wear (VB2): Wear land which has an irregular width and whose value varies at each position of measurement along the original flank.
• Localized flank wear (VB3): Exaggerated and localized form of flank wear which develops at a specific part of the flank.
The tool-life criterion can be a predetermined numerical value for any type of tool deterioration that can be measured. If there are different forms of deterioration, they should be recorded, so that when any of the deterioration limits has been attained we can say that the end of the tool life has been reached. Predetermined numerical values of specific types of tool wear are recommended:
• For the width of the flank wear land (VB) the following tool-life end points are recommended:
1. Uniform wear: 0.3 mm averaged over all teeth.
2. Localized wear: 0.5 mm maximum on any individual tooth.
• When chipping occurs, it is to be treated as localized wear, using a VB3 value equal to 0.5 mm as the tool-life end point.
Finally, flank wear measurement is carried out parallel to the surface of the wear land and in a direction perpendicular to the original cutting edge. Although the flank wear land may be of uniform size over a significant portion of the flank, there will be variations in its value at other portions of the flank, depending on the tool profile and edge chipping. Values of flank wear measurements are related to the area or position along the cutting edges at which the measurement is made.


Fig. 3. Different terms of flank wear depicted for an end milling cutter (taken from ISO 8688-2, 1989).

Therefore, it was necessary to define a methodology to wear the cutting tool and to use the total tool life during the experimentation. The assessment of the flank wear was taken as the tool-life criterion. The applied methodology considers the following steps:
1. The new cutting tools are specified and the DOE with the four replicates is carried out.
2. The flank wear is assessed and registered at the end of the experimentation.
3. The cutting tools are worn by using several workpiece materials, and during the process the flank wear is observed until a specific flank wear is reached.
4. The DOE is repeated with the new cutting tool conditions.
5. Steps 2, 3 and 4 are repeated (two more times), and the flank wear is measured and registered at the end of the process.
Figure 4 shows the evolution of the tool wear during the experimentation until the maximum tool-life criterion is reached. The experiments were interrupted at regular intervals for measurement of the flank wear (VB). The flank wear pattern along the cutting edge is shown as uniform wear over the surface (see Figure 5). In all cases, the tool wear data correspond to localized wear. Milling is an interrupted operation, where the cutting tool edge enters and exits the workpiece several times. The machining time of the tool in minutes was computed by Equation (1):

T_{mach} = \frac{L \times n_p}{f_z \times z \times N}    (1)

The volume of removed material was computed by Equation (2):

Vol = a_e \, a_p \, n_p \, L    (2)
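As a hedged illustration of Equations (1) and (2), the Python sketch below computes the machining time and removed volume for one hypothetical case. The symbol interpretations (L as machined length, n_p as number of passes, z as number of flutes, N as spindle speed, a_p as axial depth of cut) follow standard milling notation and, like the numerical values, are assumptions for the example rather than data from the experiments.

```python
# Sketch of Equations (1)-(2); all numerical values are illustrative assumptions.
def machining_time_min(L_mm, n_p, f_z, z, N_rpm):
    """T_mach = (L * n_p) / (f_z * z * N); L in mm, f_z in mm/rev/tooth, N in rpm."""
    return (L_mm * n_p) / (f_z * z * N_rpm)

def removed_volume_mm3(a_e, a_p, n_p, L_mm):
    """Vol = a_e * a_p * n_p * L (radial depth x axial depth x passes x length)."""
    return a_e * a_p * n_p * L_mm

# Example: 2-flute tool, f_z = 0.075 mm/rev/tooth, assumed spindle speed and depths.
print(machining_time_min(L_mm=170.0, n_p=4, f_z=0.075, z=2, N_rpm=15000))
print(removed_volume_mm3(a_e=3.0, a_p=5.0, n_p=4, L_mm=170.0))
```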


Fig. 4. Evolution of flank wear versus the volume of removed metal. The figure shows the behavior of the five cutting tools.

Fig. 5. Evolution of flank wear on the cutting edge. The images were taken through a stereoscopic microscope. The cutting tool diameter is 12 mm.

The VB was selected as the criterion to evaluate the tool's life, and its measurement was carried out according to ISO 8688-2, 1989. These two variables, Vol and VB, define the evolution of the cutting tool wear. The range of the flank wear was selected so that four cutting tool conditions were defined. They are shown in Table 5.

Cutting tool wear condition | Flank wear (mm)
New | 0 ≤ VB < 0.08
Half-new | 0.08 ≤ VB < 0.1
Half-worn | 0.1 ≤ VB < 0.3
Worn | 0.3 ≤ VB < 0.5

Table 5. Cutting tool wear conditions and the flank wear observed during the experimentation.
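The mapping of Table 5 can be expressed as a small helper function. This is only an illustrative sketch of the thresholds listed above, not code used by the authors.

```python
def wear_condition(vb_mm: float) -> str:
    """Classify flank wear VB (mm) into the four conditions of Table 5."""
    if vb_mm < 0.08:
        return "New"
    elif vb_mm < 0.1:
        return "Half-new"
    elif vb_mm < 0.3:
        return "Half-worn"
    elif vb_mm < 0.5:
        return "Worn"
    return "Beyond tool-life criterion"  # VB >= 0.5 mm exceeds the ranges of Table 5

print(wear_condition(0.12))  # -> "Half-worn"
```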


3.3 Data acquisition system

The data acquisition system consists of several sensors that were installed in the CNC machine (see Figure 6). For measuring the vibration, two PCB Piezotronics accelerometers, model 353B04, were fixed in the x and y-axis directions on the workpiece. These instruments have a sensitivity of 10 mV/g in a frequency range from 0.35 to 20,000 Hz, and a measurement range of ±500 g. Two other piezoelectric accelerometers, Bruel and Kjaer models 4370 and 4371, with a charge sensitivity of 98±2% pC/g, were installed on a ring fixed to the spindle. These sensors allow the recording of vibration in the x, y, and z-axes during the cutting process.

Fig. 6. Experimental Set-up. CNC machining centre and data acquisition system (sensors, amplifiers, boards and LabView interface). The vibration signals of the spindle and workpiece, and forces during machining process were acquired with the NI-6152 board. The acoustic emission signals were acquired with 1602 CompuScope board. The dynamic cutting force components (Fx, Fy, Fz) were sensed with a 3 component force dynamometer, on which the workpiece was mounted. All the signals were acquired with a high speed multifunction DAQ NI-6152 card, which ensures 16-bit accuracy at a sampling rate of 1.25 MS/s. The system was configured to obtain the signals with a sampling rate of 40,000 samples/s. The acoustic emissions were recorded with 2 Kistler Piezotron AE sensors model 8152B1, with frequency range from 50 to 400 KHz, and sensitivity of 700 V/(m/s). One was installed on a ring fixed to the spindle, and another was installed on the table of the machining centre. The AE signals were acquired with a CompuScope 1602 card for PCI bus, with 16 bit resolution. It provides a dual-channel simultaneous sampling rate of 2.5 MS/s. This board was configured to obtain signals with a sampling rate of 1,000,000 samples/s. The


The acquisition system was controlled with a LabView program. This program was used to control the start and end of the recorded signals and to store the information in specific files.

4. Processing of the process variables

Signals from the sensors must be processed to obtain the relevant features which identify the cutting tool condition. Basically, the raw signals undergo three steps of signal processing:
1. Signal segmentation. During the machining process only one specific segment of the signal was selected and processed. This signal segment was divided into 20 small frames, which correspond to approximately 0.15 seconds of machining time.
2. Feature extraction. The feature vectors were computed for all the frames of each signal.
3. Average value. An average value was computed over all frames.

4.1 Feature extraction

The signals acquired during the machining process contain abundant information about the tool status, such as the fundamental frequencies related to the spindle speed and number of inserts, wide-band frequency content, the amplitude of the vibration signal, the sensitivity to detect the tool condition, chatter, and so forth. The different signals are pre-processed by calculating their MFCC representation (Deller et al., 1993). This common transformation has been shown to be more robust and reliable than other techniques (Davis & Mermelstein, 1980). There is a mapping between the real frequency scale (fHz) and the perceived frequency scale (fMel).

The Mel scale is defined by the following equation:

f_{Mel} = 2595 \times \log_{10}\left(1 + \frac{f_{Hz}}{700}\right)    (3)

The process to calculate the MFCC is shown in Figure 7. In this process, we must define the number of filters (Nf), the sampling frequency (fHz), the filter amplitudes, and the configuration of the filter bank (triangular or rectangular shape). At the end, the MFCC are computed using the Inverse Discrete Cosine Transform:

MFCC_i = \frac{2}{N_f} \sum_{j=1}^{N_f} m_j \cos\left(\frac{\pi i}{N_f}(j - 0.5)\right)    (4)

The result is a seven-dimensional vector, where each dimension corresponds to one parameter. The MFCC were computed by using the VOICEBOX Speech Processing Toolbox for MatLab, written by (Brookes, 2006). The routines taken from the Speech Recognition module were: (a) the routine melcepst, which implements a mel-cepstrum front end for a recognizer; and (b) the routine melbankm, which generates the associated bandpass filter matrix.

4.2 MFCC for vibrations and force signals

Specifically for the vibration and force signals, the MFCC were computed by considering the following parameters: number of filters 20, sampling rate 40,000 Hz, and a bandpass filter with a triangular shape. The feature vector was of 7 dimensions (1 energy coefficient and 6 MFCC coefficients).
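The authors used the VOICEBOX melcepst/melbankm routines in MatLab; the Python sketch below is only a rough equivalent of the pipeline of Figure 7 (framing, DFT, triangular mel filter bank on the scale of Equation (3), log, DCT) under the stated settings (20 filters, 40 kHz sampling, 7-dimensional vectors). The function and variable names are my own, and details such as windowing, filter normalization, and whether the energy term is the frame log-energy are assumptions.

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f_hz):
    # Equation (3): perceived Mel frequency
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

def mel_to_hz(f_mel):
    return 700.0 * (10.0 ** (f_mel / 2595.0) - 1.0)

def mfcc_frame(frame, fs=40000, n_filters=20, n_coeffs=6):
    """Return [log-energy, 6 MFCC] for one signal frame (7-dimensional vector)."""
    spectrum = np.abs(np.fft.rfft(frame)) ** 2                 # power spectrum
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    # Triangular filter bank, equally spaced on the Mel scale
    hz_edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    log_fbank = np.empty(n_filters)
    for j in range(n_filters):
        lo, ctr, hi = hz_edges[j], hz_edges[j + 1], hz_edges[j + 2]
        up = (freqs - lo) / (ctr - lo)
        down = (hi - freqs) / (hi - ctr)
        tri = np.maximum(np.minimum(up, down), 0.0)            # triangular weights
        log_fbank[j] = np.log(np.dot(tri, spectrum) + 1e-12)
    # Equation (4): DCT of the log filter-bank outputs, keep 6 coefficients
    coeffs = dct(log_fbank, type=2, norm='ortho')[1:n_coeffs + 1]
    log_energy = np.log(np.sum(frame ** 2) + 1e-12)
    return np.hstack([log_energy, coeffs])

# Example on a synthetic 0.15 s vibration-like frame
t = np.arange(int(0.15 * 40000)) / 40000.0
print(mfcc_frame(np.sin(2 * np.pi * 800 * t)).shape)           # (7,)
```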


Fig. 7. Feature extraction process. The process variables (signals) are segmented and divided into short frames. A Discrete Fourier Transform and a mapping between the real frequency and the Mel frequency are computed. Then, a bandpass filter bank is applied for smoothing the scaled spectrum. Finally, the MFCC are computed using the discrete cosine transform.

4.3 MFCC for acoustic emission signals

The MFCC were computed by considering the following parameters: number of filters 20, sampling rate 1,000,000 Hz, and a triangular-shape bandpass filter. The feature vector was of 7 dimensions (1 energy coefficient and 6 MFCC coefficients).

5. Monitoring and diagnosing the cutting tool wear condition with HMM

Real-world processes generally produce observable outputs which can be characterized as signals. The signals can be discrete in nature (e.g., characters from a finite alphabet, quantized vectors from a codebook, etc.), or continuous in nature (e.g., speech samples, temperature measurements, vibration signals, music, etc.). They can be stationary or nonstationary, pure or corrupted by other signal sources. A problem of fundamental interest is characterizing such real-world signals in terms of signal models. There are many reasons to consider this issue. First, a signal model can provide the basis for the theoretical description of a signal processing system that can be used to process the signal so as to provide a desired output. A second reason why signal models are important is that they are potentially capable of letting us learn a great deal about the signal source. But the most important reason why signal models are significant is that they often work


extremely well in practice, and enable us to realize important practical systems (e.g. prediction systems, recognition systems, identification systems, among others). Signal models can be divided into deterministic and statistical models. Deterministic models generally exploit some known specific properties of the signal, and we only need to determine the values of the signal model parameters (e.g., amplitude, frequency, phase, etc.). On the other hand, statistical models use the statistical properties of the signal. Examples of such statistical models include Gaussian, Poisson, Markov, and Hidden Markov processes. In this section, we are going to describe one type of stochastic signal model, namely the HMM. A complete description of the HMM can be found in (Rabiner, 1989; Mohamed & Gader, 2000).

5.1 Discrete Markov processes

Consider a system which may be described at any time as being in one of a set of Ns distinct states, S1, S2, S3, ..., SN, as depicted in Figure 8 (where Ns = 3). At regularly spaced discrete times, the system undergoes a change of state (possibly back to the same state) according to a set of probabilities associated with the state. The time instants associated with the state changes are denoted t = 1, 2, ..., and the actual state at time t as qt. A full probabilistic description of the above system would, in general, require specification of the current state (at time t), as well as all the predecessor states. For the special case of a discrete, first order Markov chain, this probabilistic description is reduced to just the current and the predecessor state, as shown in the following equation,

P[q_t = S_j \mid q_{t-1} = S_i, q_{t-2} = S_k, \ldots] = P[q_t = S_j \mid q_{t-1} = S_i]    (5)

Furthermore, we only consider those processes in which the right-hand side of (5) is independent of time, thereby leading to the set of state transition probabilities a_{ij} of the form

a_{ij} = P[q_t = S_j \mid q_{t-1} = S_i], \quad 1 \le i, j \le N    (6)

with the state transition coefficients having the properties

a_{ij} \ge 0, \qquad \sum_{j=1}^{N} a_{ij} = 1    (7)

Fig. 8. Representation of a HMM with three states and the probabilities of the transition matrix (a_{ij}).

since they obey standard stochastic constraints. The above stochastic process could be called an observable Markov model, since the output of the process is the set of states at each instant of time, where each state corresponds to a physical event.

5.2 Extension to Hidden Markov Processes

In this part we extend the concept of Markov models to include the case where the observation is a probabilistic function of the state. The resulting model (which is called a HMM) is a doubly embedded stochastic process with an underlying stochastic process that is not observable, but can only be observed through another set of stochastic processes that produce the sequence of observations. To explain this concept, the following example is presented.

Coin toss models. Assume that somebody is in a room behind a wall and cannot see what is happening on the other side. On the other side of the wall another person is performing a coin tossing experiment. This person will not tell you anything about what he is exactly doing; he will only tell you the result of each coin flip. After a sequence of hidden coin tossing experiments is performed, the observation sequence, consisting of a series of heads and tails, would be

O = O_1 O_2 O_3 \ldots O_T = H\,H\,J\,J\,J\,H\,J\,J\,H \ldots H    (8)

where H stands for heads and J stands for tails. Given the above scenario, the problem of interest is how to build an HMM to explain the observed sequence of heads and tails. The first problem faced is deciding what the states in the model correspond to. Then we should decide how many states should be in the model. One possible choice would be to assume that only a single biased coin was being tossed. In this case we could model the situation with a two-state model where each state corresponds to a side of the coin (i.e., heads or tails). This model is depicted in Figure 9a. A second form of HMM for explaining the observed sequence of coin toss outcomes is given in Figure 9b. In this case there are 2 states in the model and each state corresponds to a different, biased coin being tossed. Each state is defined by a probability distribution of heads and tails, and transitions between states are characterized by a state transition matrix. The physical mechanism which accounts for how a state transition is selected could itself be a set of independent coin tosses, or some other probabilistic event. A third form of HMM for explaining the observed sequence of coin toss outcomes is defined very similarly to the HMM in Figure 8. This model corresponds to using 3 biased coins, and choosing among them according to some probabilistic event. Given the opportunity to choose among the three models in Figures 8 and 9 for the explanation of the observed sequence of heads and tails, a natural question would be which model best matches the actual observations. It should be clear that the simple 1-coin model of Figure 9a has only 1 unknown parameter, the model of Figure 9b has four unknown parameters, and the model of Figure 8 has nine


unknown parameters. Thus, with the greater degrees of freedom, the larger HMMs would seem inherently more capable of modeling a series of coin tossing experiments than would equivalent smaller models.

Fig. 9. (a) HMM with one coin and two states. (b) HMM with two coins and each state with two observations.

An HMM is characterized by the following:
• The number of states in the model, Ns. Generally the states are interconnected in such a way that any state can be reached from any other state. We denote the individual states as S = S1, S2, ..., SN, and the state at time t as qt.
• The number of distinct observation symbols per state, M. The individual symbols are denoted as V = v1, v2, ..., vM (i.e., the symbols in the last example were H (heads) and J (tails)).
• The state transition probability distribution A = {a_{ij}}, where

a_{ij} = P[q_t = S_j \mid q_{t-1} = S_i], \quad 1 \le i, j \le N    (9)

• The observation symbol probability distribution in state j, B = {b_j(k)}, where

b_j(k) = P[v_k \text{ at } t \mid q_t = S_j], \quad 1 \le j \le N, \ 1 \le k \le M    (10)

• The initial state distribution \pi = \{\pi_i\}, where

\pi_i = P[q_1 = S_i], \quad 1 \le i \le N    (11)

Given appropriate values for Ns, M, A, B, and π, the HMM can be used as a generator of an observation sequence O = O_1 O_2 \ldots O_T. It can be seen from the above discussion that a complete specification of an HMM requires specification of two model parameters (Ns and M), the observation symbols, and three probability measures A, B, and π. For convenience, the compact notation

\lambda = (A, B, \pi)    (12)

is used to indicate the complete parameter set of the model.
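As a minimal sketch of how the parameter set λ = (A, B, π) of a discrete HMM can be represented and used as a generator of observation sequences, in the spirit of the coin-toss example, the following code is offered; the two-state, two-symbol numbers are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# lambda = (A, B, pi) for a 2-state, 2-symbol (H/J) model like that of Fig. 9b
A  = np.array([[0.7, 0.3],    # state transition probabilities a_ij
               [0.4, 0.6]])
B  = np.array([[0.9, 0.1],    # b_j(k): P(symbol k | state j), symbols = (H, J)
               [0.2, 0.8]])
pi = np.array([0.6, 0.4])     # initial state distribution

def generate(T):
    """Use the HMM as a generator of an observation sequence O = O_1 ... O_T."""
    symbols = ["H", "J"]
    q = rng.choice(2, p=pi)                         # q_1 drawn from pi
    obs = []
    for _ in range(T):
        obs.append(symbols[rng.choice(2, p=B[q])])  # emit a symbol per b_q(k)
        q = rng.choice(2, p=A[q])                   # move to the next state per a_qj
    return "".join(obs)

print(generate(10))   # e.g. "HHJHHJJHHH"
```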


5.3 Baum-Welch algorithm to train the model

The Baum-Welch algorithm (Rabiner, 1989) is used to adjust the model parameters to maximize the probability of the observation sequence given the model. The observation sequence used to compute the model parameters is called a training sequence. The training problem is crucial in applications of HMMs, because it allows us to optimally adapt the model parameters to observed training data. The Baum-Welch algorithm is an iterative process that uses the forward and backward probabilities to solve the problem. The goal is to obtain a new model \bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi}) that maximizes the function

Q(\lambda, \bar{\lambda}) = \sum_{Q} \frac{P(O, Q \mid \lambda)}{P(O \mid \lambda)} \log P(O, Q \mid \bar{\lambda})    (13)

First, a current model is defined as \lambda = (A, B, \pi) and used to estimate a new model \bar{\lambda} = (\bar{A}, \bar{B}, \bar{\pi}). The new model must present a better likelihood than the first model to reproduce the observation sequence. Based on this procedure, if we iteratively use \bar{\lambda} in place of \lambda and repeat the calculation, then we can improve the probability of O being observed from the model until some limiting point is reached. The result of the re-estimation procedure is called a maximum likelihood estimate of the HMM. At the end, a new set of parameters (means, variances, and transitions) is obtained for each HMM.

5.4 Viterbi algorithm

In pattern recognition applications, it is useful to associate an optimal sequence of states to a sequence of observations, given the parameters of the model. In pattern recognition, the feature vector representing the observations is known, but the sequence of states that defines the model is unknown. A "reasonable" optimality criterion consists of choosing the state sequence (or path) that brings a maximum likelihood with respect to a given model (i.e., best "explains" the observation). This sequence can be determined recursively via the Viterbi algorithm. This algorithm identifies the single best state sequence Q = {q_1 q_2 \ldots q_T} for the given observation sequence O = {O_1 O_2 \ldots O_T}, and makes use of two variables:
• The highest likelihood δ_t(i) along a single path among all the paths ending in state i at time t:

\delta_t(i) = \max_{q_1, q_2, \ldots, q_{t-1}} P[q_1 q_2 \ldots q_t = i, O_1 O_2 \ldots O_t \mid \lambda]    (14)

• A variable ψ_t(i) which allows us to keep track of the "best path" ending in state i at time t.
Using these two variables, the algorithm involves the following steps:
1. Initialization:

\delta_1(i) = \pi_i b_i(O_1), \quad 1 \le i \le N_s, \qquad \psi_1(i) = 0    (15)

2. Recursion:

\delta_t(j) = \max_{1 \le i \le N_s} [\delta_{t-1}(i) a_{ij}] \, b_j(O_t), \quad 2 \le t \le T, \ 1 \le j \le N_s
\psi_t(j) = \arg\max_{1 \le i \le N_s} [\delta_{t-1}(i) a_{ij}], \quad 2 \le t \le T, \ 1 \le j \le N_s    (16)

3. Termination:

P^* = \max_{1 \le i \le N_s} [\delta_T(i)], \qquad q_T^* = \arg\max_{1 \le i \le N_s} [\delta_T(i)]    (17)

4. Path (state sequence) backtracking:

q_t^* = \psi_{t+1}(q_{t+1}^*), \quad t = T-1, T-2, \ldots, 1    (18)

The Viterbi algorithm delivers the best state path, which corresponds to the observation sequence. The algorithm also computes the likelihood along the best path. The HMM models were computed by using the Hidden Markov Model Toolbox for MatLab, whose routines were written by (Murphy, 2005).
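Steps (15)-(18) translate directly into code. The sketch below is a log-domain variant of the Viterbi recursion for a discrete HMM; the chapter's models use Gaussian mixtures and Murphy's MatLab toolbox, so this is only an illustration of the algorithm, not the authors' implementation.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Best state path and its log-likelihood for a discrete HMM lambda = (A, B, pi).

    pi: (N,) initial distribution, A: (N, N) transitions, B: (N, M) emissions,
    obs: sequence of symbol indices O_1 ... O_T.
    """
    N, T = len(pi), len(obs)
    log_delta = np.log(pi) + np.log(B[:, obs[0]])            # initialization (15)
    psi = np.zeros((T, N), dtype=int)
    for t in range(1, T):                                     # recursion (16)
        scores = log_delta[:, None] + np.log(A)               # delta_{t-1}(i) * a_ij
        psi[t] = np.argmax(scores, axis=0)
        log_delta = scores[psi[t], np.arange(N)] + np.log(B[:, obs[t]])
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(log_delta))                      # termination (17)
    best_log_p = float(np.max(log_delta))
    for t in range(T - 2, -1, -1):                            # backtracking (18)
        path[t] = psi[t + 1][path[t + 1]]
    return path, best_log_p

# Tiny example with the 2-state coin model sketched earlier (H = 0, J = 1)
pi = np.array([0.6, 0.4]); A = np.array([[0.7, 0.3], [0.4, 0.6]])
B  = np.array([[0.9, 0.1], [0.2, 0.8]])
print(viterbi(pi, A, B, [0, 0, 1, 1, 0]))
```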

6. Results

This section presents the results that were obtained by applying two different artificial intelligence techniques for monitoring and diagnosing the cutting tool condition during the peripheral end milling process in HSM: (1) Artificial Neural Networks, and (2) Hidden Markov Models. In agreement with the experiments, a database was built with 441 experiments: 110 experiments used a new cutting tool, 112 a half-new cutting tool, 110 a half-worn cutting tool, and 109 a worn cutting tool. A Monte Carlo simulation of the training/testing steps was implemented due to the stochasticity of the approach. The results correspond to the average of 10 runs, where each time a different training data set (Tr) and testing data set (Ts) was generated (Figure 10).
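A hedged sketch of the averaging procedure just described: ten random splits of the 441-experiment database and the mean score over runs. The `train_and_score` argument is a placeholder for whichever classifier (ANN or HMM) is being evaluated; its name and signature are assumptions.

```python
import numpy as np

def monte_carlo_performance(X, y, train_and_score, runs=10, train_frac=0.7, seed=0):
    """Average test performance over `runs` random training/testing splits."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(runs):
        idx = rng.permutation(len(X))
        n_train = int(train_frac * len(X))
        tr, ts = idx[:n_train], idx[n_train:]
        scores.append(train_and_score(X[tr], y[tr], X[ts], y[ts]))
    return float(np.mean(scores))
```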

Fig. 10. Procedure for computing the approach performance. A random simulation for splitting the experimental dataset into training/testing sets was implemented due to the stochastic nature of the approaches.

6.1 Artificial neural network

To compare our results with classical approaches, the cutting tool wear condition was modeled with an ANN model. The application of ANN to on-line process monitoring systems has attracted great interest due to their learning capabilities, noise suppression, and parallel computability. A complete compilation of research works on on-line and indirect tool wear monitoring with ANN is presented in (Sick, 2002). An ANN is often defined as a computing


system made up of a number of simple elements called neurons, which processes information by its dynamic state response to external inputs. The neurons are arranged in a series of layers. Multi-layer feed-forward networks are the most common architecture. Furthermore, there are several learning algorithms for training neural networks. Backpropagation has proven to be successful in many industrial applications and is easily implemented. The proposed architecture comprises 12 input neurons, one hidden layer with 12 neurons, and 1 output neuron. Figure 11 shows the ANN model, where the input neurons represent the following information: feed per tooth, tool diameter, radial depth of cut, workpiece material hardness, curvature, and the MFCC vector (7 dimensions).

Fig. 11. ANN model implemented for on-line monitoring and diagnosis of the cutting tool condition.

We used a feedforward ANN model with a "tanh" activation function. The training algorithm was classical backpropagation. For computing, the input data (fz, Dtool, ae, HB, Curv, and the MFCC vector) were normalized and the output data were mapped to [-1, 1]. The whole experimental dataset was normalized to avoid numerical instability. First, the dataset was normalized by considering the mean value (μ) and standard deviation (σ) with the following equation,

f(x) = \frac{x - \mu}{\sigma} = \bar{x}    (19)

A second normalization method was then applied: the bipolar sigmoid. This method was used because the minimum and maximum values are unknown in real time. The non-linear transformation prevents most values from being compressed into essentially the same values, and it also compresses the large outlier values. The bipolar sigmoid was applied with the following equation:

f(\bar{x}) = \frac{1 - e^{-\bar{x}}}{1 + e^{-\bar{x}}}    (20)
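Equations (19) and (20) expressed as code; a small sketch of the two-stage normalization (z-score followed by the bipolar sigmoid), with the caveat that in the on-line setting μ and σ would have to come from the training data.

```python
import numpy as np

def zscore(x, mu, sigma):
    # Equation (19): x_bar = (x - mu) / sigma, with mu, sigma from the training set
    return (x - mu) / sigma

def bipolar_sigmoid(x_bar):
    # Equation (20): squashes z-scored values into (-1, 1), compressing outliers
    return (1.0 - np.exp(-x_bar)) / (1.0 + np.exp(-x_bar))

x = np.array([0.025, 0.075, 0.13])          # e.g. feed-per-tooth values
print(bipolar_sigmoid(zscore(x, x.mean(), x.std())))
```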

With respect to the output neuron, the cutting tool condition, the normalized tool-wear values were mapped to the tool-wear condition (see Table 6). Finally, the dataset was randomly divided into two sets, training (70%) and testing (30%), in order to measure the generalization capacity of the models.
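A minimal sketch of the 12-12-1 feedforward network described above, using scikit-learn's MLPRegressor as a stand-in for the authors' MatLab implementation; the hyperparameters, the placeholder arrays and the output decoding against the ranges of Table 6 (below) are assumptions for illustration only.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def decode_condition(y_hat):
    """Map the network output in [-1, 1] back to a wear class (see Table 6)."""
    if y_hat > 0.66:
        return "New"
    if y_hat > 0.0:
        return "Half-new"
    if y_hat > -0.66:
        return "Half-worn"
    return "Worn"

# X: 441 x 12 matrix (fz, Dtool, ae, HB, Curv + 7 MFCC values), already normalized;
# y: normalized tool condition in [-1, 1]. Placeholder random data shown here.
rng = np.random.default_rng(1)
X, y = rng.uniform(-1, 1, size=(441, 12)), rng.uniform(-1, 1, size=441)

ann = MLPRegressor(hidden_layer_sizes=(12,), activation="tanh",
                   solver="sgd", learning_rate_init=0.01, max_iter=2000)
n_train = int(0.7 * len(X))                     # 70 % training / 30 % testing split
ann.fit(X[:n_train], y[:n_train])
print([decode_condition(v) for v in ann.predict(X[n_train:])][:5])
```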


Normalized tool condition | Cutting tool condition
From +0.66 to +1.00 | New
From 0.0 to +0.66 | Half-new
From -0.66 to 0.0 | Half-worn
From -1.00 to -0.66 | Worn

Table 6. The normalized tool-wear output of the ANN model is mapped to the cutting tool condition.

The performance of the ANN model was computed for ten different sets of data, which were selected at random. The training and testing processes were programmed using MatLab software. The obtained results correspond to 8 different ANN models, all of them with the same architecture but a different MFCC vector. The MFCC were computed for each of the process signals (accelerometers, forces, and acoustic emission). Table 7 shows the results computed with the different process signals. The obtained performance corresponds to an average value over the ten data sets.

Data sets | Workpiece Acc-X | Workpiece Acc-Y | Spindle Acc-X | Spindle Acc-Y | X Force | Y Force | AE Spindle | AE Workp.
Training | 90.2% | 97.8% | 94.2% | 97.6% | 94.5% | 98.7% | 99.9% | 99.2%
Testing | 31.3% | 40.4% | 48.5% | 48.0% | 33.8% | 47.2% | 89.9% | 69.7%

Table 7. Performance for the training and testing data sets of the ANN model. The first two columns give the success rate of the accelerometers on the workpiece, the next two that of the accelerometers installed on the spindle, and the last two columns correspond to the acoustic emission sensors.

Table 7 shows that the ANN model with the acoustic emission signal (AE-Spindle) represents the best model for the testing dataset, with a performance of 89.9% and a Mean Squared Error (MSE) of 0.10075. Figure 12 plots the results obtained by the diagnosis system when the ANN model was tested for the prediction of the cutting tool condition.

Fig. 12. Diagnosis of the cutting tool condition with the ANN(12,12,1) model. The MFCC were computed for the acoustic emission signal (AE-Spindle).


Fig. 13. Flow diagram for monitoring and diagnosing the cutting tool wear condition with continuous HMM. The features from the signals are separated into two branches. The training branch leads to the HMMs, and the diagnosis branch uses the new observations and the HMMs to recognize the cutting tool condition.

6.2 Hidden Markov Model

Figure 13 shows the flow diagram implemented for monitoring and diagnosing the cutting tool wear condition on-line with the HMM model. First, the signals are processed and split into two branches: training and testing. Second, the training branch produces the HMM parameters by using the Baum-Welch algorithm. In this case, four models were computed to represent the four cutting tool conditions. Third, the testing branch uses the preprocessed signals and the HMMs to compute P(O|λ) using the Viterbi algorithm for each model. The model with the highest probability is selected as the result. The HMM framework was evaluated for different numbers of states and Gaussians in order to find the optimum performance. Three different configurations were defined with seven MFCC:
1. HMM with 3 states and 2 Gaussians
2. HMM with 4 states and 2 Gaussians
3. HMM with 4 states and 4 Gaussians
Figure 14 shows how the performance increases by increasing the number of states and Gaussians in the HMM approach. Based on this result, the selected configuration was 4 states and 2 Gaussians, where the average performance was 77.51% for the testing dataset. The process signal was the AE installed on the table.
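A sketch of the four-model scheme of Figure 13 using the hmmlearn package as a substitute for the MatLab HMM toolbox: one GMM-HMM (4 states, 2 Gaussians per state) is trained per wear condition on its MFCC sequences, and a new observation sequence is assigned to the model with the highest log-likelihood. Array shapes, names and the placeholder data are illustrative assumptions.

```python
import numpy as np
from hmmlearn.hmm import GMMHMM

CONDITIONS = ["New", "Half-new", "Half-worn", "Worn"]

def train_models(sequences_per_condition, n_states=4, n_mix=2):
    """sequences_per_condition: dict condition -> list of (T_i, 7) MFCC arrays."""
    models = {}
    for cond, seqs in sequences_per_condition.items():
        X = np.vstack(seqs)
        lengths = [len(s) for s in seqs]
        m = GMMHMM(n_components=n_states, n_mix=n_mix,
                   covariance_type="diag", n_iter=50, random_state=0)
        m.fit(X, lengths)                 # Baum-Welch re-estimation
        models[cond] = m
    return models

def diagnose(models, observation_seq):
    """Pick the condition whose model gives the highest log-likelihood."""
    return max(models, key=lambda c: models[c].score(observation_seq))

# Placeholder data: 5 random 20-frame, 7-dimensional sequences per condition
rng = np.random.default_rng(2)
data = {c: [rng.normal(i, 1.0, size=(20, 7)) for _ in range(5)]
        for i, c in enumerate(CONDITIONS)}
models = train_models(data)
print(diagnose(models, rng.normal(3, 1.0, size=(20, 7))))   # likely "Worn"
```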



Fig. 14. Performance of the HMM with different configurations. The HMMs were computed with different numbers of states (3, 4) and Gaussians (2, 4).

Figure 15 shows the performance of the HMMs with the different signals. The acoustic emission signals present the best performance. For the AE-Spindle signal, the average performance was 99.4% for the training dataset and 95.1% for the testing dataset.


Fig. 15. Performance of the HMM for the different process signals. The results correspond to the obtained success for the testing dataset. A classical test in a diagnosis system is to identify two alarms due to a false classification of cutting tool condition. These alarms are: False Alarm Rate (FAR), and False Fault Rate (FFR).


A FAR condition represents a diagnosis of a damaged tool when the tool is actually in good condition. A FFR condition corresponds to a diagnosis of a good tool state when the tool is actually damaged. The FAR condition is not a problem for the diagnosis, but it reduces productivity. However, the FFR condition might represent an expensive problem when its rate is high, because the tool can break before it is replaced. Figure 16 shows the misclassification percentage due to the FFR condition. The classifier with the lowest percentage of FFR was the HMM using the acoustic emission sensor. Once again, the AE-Spindle does not produce any FFR condition.


Fig. 16. Misclassification percentage in FFR alarms, for the HMM with different process signals.

7. Conclusions

This chapter presented new ideas for monitoring and diagnosis of the cutting tool condition with two different algorithms for pattern recognition: HMM and ANN. The monitoring and diagnosis system was implemented for the peripheral milling process in HSM, where several aluminium alloys and cutting tools were used. The flank wear (VB) was selected as the criterion to evaluate the tool's life, and four cutting tool conditions were defined to be recognized: new, half-new, half-worn, and worn. Several sensors were used to record important process variables: accelerometers, a dynamometer, and acoustic emission sensors. Feature vectors, based on the Mel Frequency Cepstrum Coefficients, were computed to characterize the process signals during the machining processes. First, with the cutting parameters and MFCC, the cutting tool condition was modeled with an ANN model. A feedforward ANN model and the backpropagation algorithm were used to define the ANN model. The proposed architecture comprises 12 input neurons and one output neuron (cutting condition). The best results were obtained by using the signals from the acoustic emission sensor installed on the machine spindle. The success rate for the ANN model was 89.9% for the testing dataset.


Second, the HMM approach was configured with four states and two Gaussians, and the HMM models were computed with each one of the process signals. The best result was obtained with the signals coming from the AE-Spindle sensor. The performance was 95.08% for the testing dataset, and 0.0% in the FFR condition. It is very important to mention that the HMM approach only uses one sensor to classify the cutting tool condition, while the ANN approach uses sensor fusion of five cutting parameters and one process variable to get the reported performance.

8. References

Atlas, L., Ostendorf, M., and Bernard, G. D. (2000). Hidden Markov Models for Machining Tool-Wear. IEEE, pp. 3887-3890.
Brookes, M. (2006). VOICEBOX: Speech Processing Toolbox for MatLab (http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html), Exhibition Road, London SW7 2BT, UK.
Chen, J.C., and Chen, W. (1999). A Tool Breakage Detection System using an Accelerometer Sensor. J. of Intelligent Manufacturing, 10(2), pp. 187-197.
Davis, B.S., and Mermelstein, P. (1980). Comparison of Parametric Representation for Monosyllabic Word Recognition in Continuously Spoken Sentences. IEEE Transactions on Acoustic, Speech, and Signal Processing, 4(28), pp. 357-366.
Deller, J.R., Hansen, J.H., and Proakis, J.G. (1993). Discrete-Time Processing of Speech Signals. IEEE Press, NJ 08855-1331.
Dey, S., and Stori, J.A. (2004). A Bayesian Network Approach to Root Cause Diagnosis of Process Variations. International Journal of Machine Tools & Manufacture, (45), pp. 75-91.
Erol, N.A., Altintas, Y., and Ito, M.R. (2000). Open System Architecture Modular Tool Kit for Motion and Machining Process Control. IEEE/ASME Transactions on Mechatronics, 5(3), pp. 281-291.
Haber, R.E. and Alique, A. (2003). Intelligent Process Supervision for Predicting Tool Wear in Machining Processes. Mechatronics, (13), pp. 825-849.
Haber, R.E., Jiménez, J.E., Peres, C.R., and Alique, J.R. (2004). An Investigation of Tool-Wear Monitoring in a High-Speed Machining Process. Sensors and Actuators A, (116), pp. 539-545.
ISO 8688-2 (1989). Tool Life Testing in Milling – Part 2: End Milling. International Standard, first edition.
Koren, Y., Heisel, U., Jovane, F., Moriwaki, T., Pritschow, G., Ulsoy, G. and Van Brussel, H. (1999). Reconfigurable Manufacturing Systems. Annals of the CIRP, 48(2), pp. 527-540.
Liang, S.Y., Hecker, R.L., and Landers, R.G. (2004). Machining Process Monitoring and Control: the State of the Art. Manufacturing Science and Engineering, 126, pp. 297-310.
Mohamed, M.A., and Gader, P. (2000). Generalized Hidden Markov Models - Part I: Theoretical Frameworks. IEEE Transactions on Fuzzy Systems, 8(1), pp. 67-81.
Murphy, K. (2005). Hidden Markov Model Toolbox for MatLab (http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html).
Owsley, L.M., Atlas, L.E., and Bernard, G.D. (1997). Self-Organizing Feature Maps and Hidden Markov Models for Machine-Tool Monitoring. IEEE Transactions on Signals Processing, 45(11), pp. 2787-2798.


Rabiner, L.R. (1989). A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 77(2), pp. 257-286.
Saglam, H., and Unuvar, A. (2003). Tool Condition Monitoring in Milling based on Cutting Forces by a Neural Network. International Journal of Production Research, 41(7), pp. 1519-1532.
Sick, B. (2002). On-Line and Indirect Tool Wear Monitoring in Turning with Artificial Neural Networks: A review of more than a decade of research. Mechanical Systems and Signal Processing, 16(4), pp. 487-546.
Sick, B. (2002a). Fusion of Hard and Soft Computing Techniques in Indirect, Online Tool Wear Monitoring. IEEE Transactions on Systems, Man, and Cybernetics, 32(2), pp. 80-91.
Tönshoff, H.K., Wulfsberg, J.P., Kals, H.J., König, W., and Van Luttervelt, C.A. (1988). Developments and Trends in Monitoring and Control of Machining Processes. Annals of the CIRP, 37(2), pp. 611-622.
Vallejo, A.J., Nolazco-Flores, J.A., Morales-Menéndez, R., Sucar, L.E., and Rodríguez, C.A. (2005). Tool-wear Monitoring based on Continuous Hidden Markov Models. LNCS 3773, Springer-Verlag, X CIARP, pp. 880-890.
Vallejo, A.J., Morales-Menéndez, R., Rodríguez, C.A., and Sucar, L.E. (2006). Diagnosis of a Cutting Tool in a Machining Center. IEEE International Joint Conference on Neural Networks, pp. 7097-7101.
Vallejo, A.J., Morales-Menéndez, R., Garza-Castañon, L.E., and Alique, J.R. (2007). Pattern Recognition Approaches for Diagnosis of Cutting Tool Wear Condition. Transactions of the North American Manufacturing Research Institution of SME, 35, pp. 81-88.
Vallejo, A.J., Morales-Menéndez, R., and Alique, J.R. (2007a). Designing a Cost-effective Supervisory Control System for Machining Processes. IFAC Cost-effective Automation in Networked Product Development and Manufacturing, IFAC-PapersOnLine.
Vallejo, A.J., Morales-Menéndez, R., and Alique, J.R. (2008). Intelligent Monitoring and Decision Control System for Peripheral Milling Process. To appear in IEEE International Conference on Systems, Man, and Cybernetics, October, 2008.

10

Controlled Use of Subgoals in Reinforcement Learning

Junichi Murata

Kyushu University
Japan

1. Introduction

Reinforcement learning (Kaelbling et al., 1996; Sutton & Barto, 1998) is a machine learning technique that automatically acquires a good action policy, i.e. a mapping from the current state to a good action to take, through trial and error. A learning agent observes the state of its environment, chooses an action based on its current policy and executes the action. Responding to the action, the environment transitions to a new state, and a reward is given to the agent when applicable. The reward indicates how good or how bad the new state is, and the agent uses it to improve its policy so that it can obtain more rewards. Since reinforcement learning (abbreviated as RL hereafter) requires no other information, e.g. a model of the environment, than the perceived states and rewards, it can be applied to a class of problems where the environment is complex or uncertain. The applications of RL include control of multi-legged robots (Kimura et al., 2001; Zennir et al., 2003), navigation of mobile robots (Millan, 1995), elevator hall call assignment (Crites & Barto, 1998; Kamal et al., 2005; Kamal and Murata, 2008) and board games (Tesauro, 1994). However, the learning agent must perform a large number of action trials in order to collect sufficient information about the unknown environment and organize it into a good policy. This takes a long time and therefore puts a limit on practical application of RL. A number of techniques have been proposed to make RL faster. Most of them utilize a priori information on the target problem to compensate for the shortage of information and thus save the agent from a large number of trials. Smart and Kaelbling proposed using records of successful state-action pairs achieved by a human operator or a controller (Smart & Kaelbling, 2000). In the user-guided reinforcement learning of robots by Wang et al., a user teaches the robot a good action in real time (Wang et al., 2003). Driessens and Džeroski used a policy given by the human designer (Driessens & Džeroski, 2004). In all of the above research, the a priori information is in the form of good actions. Good states, however, are easier to find than good actions; you only need to find what the environment should be like but not how the agent can achieve that. The technique described in this chapter focuses on subgoals as a form of good-state-related a priori information. A subgoal is a state or a subset of states that must be visited on the way from the initial state to the final goal state. With subgoals, the original problem can be divided into a set of small subproblems and this makes the learning faster. Use of subgoals in RL has been proposed by researchers, and it is closely related to (hierarchical) task partition and abstraction of actions. Singh used known subgoals to accelerate RL (Singh,


1992). Wiering and Schmidhuber proposed HQ-Learning, which learns subgoals to convert a POMDP (partially observable Markov decision process) problem into a sequence of MDP problems (Wiering & Schmidhuber, 1996). HASSLE by Bakker and Schmidhuber learns subgoals for efficient RL (Bakker & Schmidhuber, 2004). A method was proposed by McGovern and Barto for discovering subgoals to create 'options' or temporally-extended actions (McGovern & Barto, 2001), and an improved method by Kretchmar et al. (Kretchmar et al., 2003). Subgoal information is either given by human designers/operators or acquired by the agent itself through learning. The idea of automatic acquisition of subgoals by learning (Wiering & Schmidhuber, 1996; McGovern & Barto, 2001; Kretchmar et al., 2003; Bakker & Schmidhuber, 2004) is very interesting. This learning itself, however, requires a considerable amount of time, and therefore is not very attractive from the overall learning time point of view. On the other hand, it is not unrealistic that human designers/operators give the agents their a priori knowledge, since there is usually some sort of a priori information available about the target problems. This kind of a priori information is usually ready before starting learning and does not require additional learning time. But, instead, there is a possibility that the humans' a priori knowledge is not perfect; it may contain errors and/or ambiguity. In this chapter a technique is proposed that utilizes human-supplied subgoal information to accelerate RL. The most prominent feature of the proposed technique is that it has a mechanism to control the use of subgoals; it keeps watching the effect of each subgoal on the action selection, and when it detects redundancy or harmfulness of a subgoal, it gradually stops using that subgoal. The technique is an extension of the one proposed by the author previously (Murata et al., 2007). Unlike the previous one, the redundancy and the harmfulness of subgoals are treated separately and thus finer control of the use of subgoals is achieved. Moreover, after a sufficient number of learning iterations, we can tell which subgoal is useful and which is harmful by looking at the values of the parameters that control the use of subgoals. These features are illustrated by simulation results using grid worlds with doors as example environments, where the learning agent tries to find a path from the start cell to the goal cell. It is shown that use of exact subgoal information makes RL ten times faster in these rather simple problems. More acceleration can be expected in more complex problems. When a wrong subgoal is given and used without proper control, the agent can hardly find the optimal policy. On the other hand, with the proposed control mechanism enabled, the wrong subgoal is properly controlled, and its existence does not obstruct acquiring optimal actions. In this case, the learning is delayed, but the delay is not significant. It is also confirmed that the values of the parameters introduced to control the use of subgoals successfully display the harmfulness/usefulness of subgoals. A wrong subgoal is not necessarily harmful; in some problem settings, it helps the agent find the optimal policy, and this is also properly reflected in the derived parameter values. The remaining part of the chapter is organized as follows. In the next section, more detailed descriptions of subgoals in RL will be given, and a technique will be proposed to use them with proper control.
In Section 3, several examples will be shown that illustrate the validity of the proposed method, which is followed by conclusions in Section 4.

2. Controlled use of subgoals in reinforcement learning

2.1 Subgoals in reinforcement learning problems

A subgoal is a state or a subset of states that must be visited on the way from the initial state to the final goal state. Since we assume subgoal information is provided by humans, it


is more adequate to define a subgoal here as 'a state or a subset of states that the human designer/operator thinks must be visited on the way from the initial state to the final goal state', which implies that the subgoals can be erroneous. A subgoal divides the original problem into two subproblems with itself as the boundary: one subproblem where the agent is to find a policy that leads the agent/environment from the initial state to the subgoal state, and the other subproblem from the subgoal to the final goal state. Each subproblem is smaller than the original problem, and therefore it is easier for the agent to reach the goal of the subproblem by random actions, which results in a smaller number of trials to find a good policy. In this way, with subgoals, the original problem can be divided into a set of small subproblems, and this makes the learning faster. Imagine, for example, that you have an assistant robot in your office which learns its action policy by RL. Suppose that you order the robot to deliver a document to your colleague in an office upstairs. The robot tries to find a path from your office to the colleague's office by random wandering. But it will take a very long time to find the office upstairs by chance. Alternatively, you can give an additional instruction 'take an elevator', which means a subgoal 'being in an elevator' is given. With this instruction, the robot will first try to find an elevator and then try to find a path from the elevator to your colleague's office. This will be much easier to accomplish. The instruction 'take an elevator' can be valid in other situations like fetching a document from your colleague's office. In this sense, subgoals can be general information that is useful in different but similar problems. However, your robot may go downstairs even if it successfully arrives at the subgoal 'being in an elevator'. Or your building may have two or more elevators. Or it may even happen that you forget that your colleague has recently moved to an office on your floor. Subgoals can be good guiding information, but they may contain ambiguity and errors when given by humans.

I1

I2

F I4

Fig. 1. Subgoals and their structure

The structure of subgoals can be represented by a directed graph as shown in Fig. 1, where B denotes the initial state, F the (final) goal state, and Ii, i = 1, …, 4 stand for the subgoals. This figure indicates that, to reach the final goal F, the subgoal I1 must be visited first, next the subgoal I2, and then either I3 or I4 must be visited. Subgoals I3 and I4 are in parallel, implying that either subgoal must be visited, and I1 and I2 are in series, indicating that these subgoals must be visited in this order. In the following, the terms upstream subgoals and downstream subgoals will be used. Upstream subgoals of a subgoal Ii are those subgoals closer to the initial state B than the subgoal Ii in the directed graph, and downstream subgoals of subgoal Ii are those closer to the final goal state F. In Fig. 1, subgoal I1 is an upstream subgoal of subgoal I2, and I3 and I4 are downstream subgoals of I2. Subgoals are presumably good states and the learning agent must be aware of this. Here, the goodness of a subgoal is represented and conveyed to the agent as a reward. This reward is not the real reward given by the environment but a virtual reward given by the agent itself. The received virtual rewards must be used to guide the action selection through value


functions. We have to be careful because the subgoals may contain errors and ambiguity and also because an already achieved subgoal becomes no longer useful. A wrong or useless subgoal can mislead the agent in the wrong direction or does not play any significant role in policy updates. If a subgoal turns out to be harmful, we have to stop using it in learning. If a subgoal is found to be redundant, it is better to cease to use it. Therefore, we need to treat the real reward and the virtual reward associated with each subgoal separately so that we can control the use of each subgoal independently. Besides, by dealing with harmfulness and redundancy separately, we can detect harmful subgoals that have been mistakenly included by the human designer/operator.

2.2 Controlled use of subgoals

Let us assume that we have n subgoals I1, …, In. The agent gives itself a virtual reward r_{i,t} with respect to subgoal Ii at time t, which takes a positive value when the state at time t is in subgoal Ii and is equal to zero otherwise. The real reward given by the environment at time t is denoted by r_t. To treat these real and virtual rewards separately, let us define a distinct action-value function based on each of them as follows:

Q(s_t, a_t) = \sum_{k=0}^{\infty} \gamma^k r_{t+1+k}    (1)

Q_i(s_t, a_t) = \sum_{k=0}^{\infty} \gamma^k r_{i,t+1+k}, \quad i = 1, \ldots, n    (2)

where s_t and a_t are the state and action at time t, respectively, and \gamma \in (0, 1) is the discount factor. Before learning, these action-value functions are initialized to zero, and then updated by standard Q-learning:

Q(s_t, a_t) := (1 - \alpha) Q(s_t, a_t) + \alpha \left[ r_{t+1} + \gamma \max_a Q(s_{t+1}, a) \right]    (3)

Q_i(s_t, a_t) := (1 - \alpha) Q_i(s_t, a_t) + \alpha \left[ r_{i,t+1} + \gamma \max_a Q_i(s_{t+1}, a) \right], \quad i = 1, \ldots, n    (4)

where the operator ':=' means substitution and \alpha > 0 is the learning rate. Usually, when a subgoal is achieved, the agent can concentrate on seeking the next subgoal and then the next. For example, in Fig. 1, when subgoal I1 is achieved, the agent only needs to try to find subgoal I2, and when this is also achieved, subgoal I3 or I4 is the next target. In this way, the target subgoal is switched from one to another. In other words, the original problem is hierarchically divided into distinct subproblems. However, this is relevant only when the given subgoal information is correct. In the situations of our interest, the subgoal information is not perfect, and therefore the next subgoal can be a wrong subgoal. If I2 is a wrong subgoal, it will be better, once I1 is achieved, to seek I3 or I4 than I2. In some cases, the order of the subgoals may be wrong. In Fig. 1, for example, subgoal I2 may actually be the first subgoal to be achieved, before I1. Therefore, we cannot employ the hierarchical problem partition. We must consider not only the next subgoal but all the subgoals in learning. In accordance with the above discussion, the actions are selected based on the following


composite action-value function that consists of the action-value function calculated from the real rewards and all the action-value functions computed from the virtual rewards:

Q^A(s, a) = Q(s, a) + \sum_{i=1}^{n} c_i d_i Q_i(s, a)    (5)

where the suffix t is dropped for simplicity. Positive coefficients c_i and d_i, i = 1, …, n, are introduced to control the use of subgoals. Coefficient c_i is used specifically to control the use of subgoal I_i when it is redundant, while d_i is for regulating the subgoal when it is harmful. They are initialized to 1.0, i.e. at the beginning, all the virtual rewards are considered as strongly as the real reward in action selection. The actual action is derived by applying an appropriate exploratory variation such as ε-greedy or softmax to the action that maximizes Q^A(s, a) for the current state s. Therefore, learning of Q(s, a) by equation (3) is an off-policy learning, and its convergence is assured just like in ordinary Q-learning on the condition that all state-action pairs are visited infinitely often. However, our interest is not in the convergence of Q(s, a) for all state-action pairs but in avoiding visiting unnecessary state-action pairs by appropriately controlled use of subgoals. When a subgoal I_i is found to be either redundant or harmful, its corresponding coefficient c_i or d_i is decreased to reduce its contribution to the action selection. A subgoal I_i is redundant in state s when the optimal action in state s towards this subgoal I_i is identical to the optimal action towards the final goal F or towards another subgoal I_j, j \in R_i, where R_i is the set of suffixes of the subgoals that are reachable from subgoal I_i in the directed graph. In other words, subgoal I_i is redundant if, without the help of the subgoal, the agent can find the optimal action that leads to the final goal or to a downstream subgoal of subgoal I_i, which is closer to the final goal and thus more important. Let us define \tilde{Q}_i(s, a) as the sum of Q(s, a) and those Q_j(s, a) associated with the downstream subgoals of subgoal I_i,

\tilde{Q}_i(s, a) = Q(s, a) + \sum_{j \in R_i} c_j d_j Q_j(s, a)    (6)

Then the optimal action in state s towards the downstream subgoals and the final goal is given by

ã_i*(s) = arg max_a Q̃_i(s, a),    (7)

and the optimal action towards subgoal I_i in state s by

a_i*(s) = arg max_a Q_i(s, a).    (8)

The relationship between subgoals and action-value functions is illustrated in Fig. 2. If Q̃_i(s, a) or Q_i(s, a) is zero or negative for every a, it means that sufficient positive real rewards or sufficient virtual rewards associated with I_j, j ∈ R_i, have not been received yet, and that the optimal actions given by equations (7) and (8) are meaningless. So, we need the following preconditions in order to judge redundancy or harmfulness of a subgoal in state s:

∃a, Q̃_i(s, a) > 0  and  ∃a, Q_i(s, a) > 0.    (9)


Now, we can say that subgoal I_i is redundant in state s when the following holds:

a_i*(s) = ã_i*(s).    (10)

When subgoal I_i is found to be redundant in state s, its associated coefficient c_i is reduced by a factor β ∈ (0, 1):

c_i := β c_i.    (11)

Coefficient c_i is not set to zero at once because we have only found that subgoal I_i is redundant in this particular state s; it may still be useful in other states. Note that the other coefficient d_i is kept unchanged in this case. Although the composite action-value function Q^A(s, a) used for action selection includes the terms related to the upstream subgoals of subgoal I_i, we do not consider them in reducing c_i. The upstream subgoals are less important than subgoal I_i. Preconditions (9) mean that subgoal I_i has already been achieved in past trials. Then, if subgoal I_i and any of the less important subgoals play the same role in action selection, i.e. either of them is redundant, it is the coefficient associated with the less important upstream subgoal that must be decreased. Therefore the redundancy of subgoal I_i is checked only against its downstream subgoals.
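To make equations (5)–(11) concrete, the following minimal Python sketch computes the composite action-value, tests the redundancy condition and decays c_i. It is only an illustration under simple assumptions (tabular action-values stored as NumPy arrays indexed by state, and a hypothetical dict R of downstream-subgoal suffixes); none of the names come from the chapter.

    import numpy as np

    def composite_Q(s, Q, Q_sub, c, d):
        # Equation (5): Q^A(s,a) = Q(s,a) + sum_i c_i d_i Q_i(s,a)
        return Q[s] + sum(c[i] * d[i] * Q_sub[i][s] for i in range(len(Q_sub)))

    def Q_tilde(s, i, Q, Q_sub, c, d, R):
        # Equation (6): Q(s,a) plus the weighted Q_j of the downstream subgoals j in R_i
        return Q[s] + sum(c[j] * d[j] * Q_sub[j][s] for j in R[i])

    def check_redundancy(s, i, Q, Q_sub, c, d, R, beta=0.99):
        qt, qi = Q_tilde(s, i, Q, Q_sub, c, d, R), Q_sub[i][s]
        if qt.max() > 0 and qi.max() > 0:                    # preconditions (9)
            if int(np.argmax(qi)) == int(np.argmax(qt)):     # redundancy condition (10)
                c[i] *= beta                                 # equation (11)

Here Q[s] and Q_sub[i][s] are assumed to return NumPy arrays of action values for state s, so np.argmax picks the greedy action.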


Fig. 2. Relationship between subgoals and action-value functions.

A subgoal I_i is harmful in state s if the optimal action towards this subgoal is different from the optimal action towards the final goal or towards another subgoal I_j, j ∈ R_i, i.e. the action towards subgoal I_i contradicts the action towards the final goal or a downstream subgoal. This situation arises when the subgoal is wrong, or when the agent attempts to go back to the subgoal seeking more of the virtual reward given there although it has already passed the subgoal. Using a_i*(s) and ã_i*(s) defined above, we can say that a subgoal I_i is harmful in state s if

a_i*(s) ≠ ã_i*(s),    (12)

and the preconditions (9) are satisfied. When a subgoal is judged to be harmful in state s, its associated coefficient d_i is reduced so that the subgoal does less harm in action selection. In this case coefficient c_i remains unchanged. Let us derive a value of d_i that does not cause the conflict (12). Such a value of d_i, denoted by d_i^o, must be such that the action selected by maximizing c_i d_i^o Q_i(s, a) + Q̃_i(s, a) does not differ from the action selected by Q̃_i(s, a) only. So, the following must hold for state s:

arg max_a [ c_i d_i^o Q_i(s, a) + Q̃_i(s, a) ] = arg max_a Q̃_i(s, a).    (13)


Considering equation (7), the above equation (13) holds when

c_i d_i^o Q_i(s, a) + Q̃_i(s, a) ≤ c_i d_i^o Q_i(s, ã_i*(s)) + Q̃_i(s, ã_i*(s))    (14)

is satisfied for all a. Then, by straightforward calculation, the value of d_i^o that assures the above inequality (14) is derived as

d_i^o = min_{a ∈ A_i(s)} (1/c_i) [ Q̃_i(s, ã_i*(s)) − Q̃_i(s, a) ] / [ Q_i(s, a) − Q_i(s, ã_i*(s)) ],
A_i(s) = { a | Q_i(s, a) > Q_i(s, ã_i*(s)) }.    (15)

In equation (15) we restrict actions to those belonging to the set A_i(s). This is because, for actions which satisfy the inequality Q_i(s, a) ≤ Q_i(s, ã_i*(s)), inequality (14) naturally holds for any d_i, since c_i d_i > 0 and Q̃_i(s, a) ≤ Q̃_i(s, ã_i*(s)) from the definition of ã_i*(s) in equation (7). Now d_i is slightly reduced so that it approaches d_i^o by a fraction δ_i:

d_i := (1 − δ_i) d_i + δ_i d_i^o,    (16)

where δ_i is a small positive constant. There is a possibility that the original value of d_i is already smaller than d_i^o; in that case, d_i is not updated. Coefficient d_i is not reduced to d_i^o at once. We have observed a conflict between the subgoal I_i and a downstream subgoal (or the final goal itself), and it seems that we need to reduce the coefficient d_i for subgoal I_i to solve the conflict. The observed conflict is genuine on the condition that the action-value functions Q_i, Q_j, j ∈ R_i, and Q used to detect the conflict are sufficiently correct (in other words, well updated). Therefore, in the early stage of learning, the observed conflict can be non-authentic. Even if the conflict is genuine, there is a situation where d_i should not be reduced. Usually a downstream subgoal of subgoal I_i is more important than I_i, and therefore the conflict must be resolved by changing the coefficient associated with subgoal I_i. However, when the downstream subgoals are wrong, reducing the coefficient associated with subgoal I_i is irrelevant. These possibilities of non-genuine conflict and/or wrong downstream subgoals demand a cautious reduction of d_i as in equation (16). Moreover, to suppress possible misguidance by wrong downstream subgoals, the parameter δ_i is set smaller for upstream subgoals, because a subgoal located closer to the initial state has a larger number of downstream subgoals and therefore is likely to suffer more from the undesirable effects caused by wrong subgoals. Because the update of d_i depends on the downstream coefficients c_j and d_j, j ∈ R_i, contained in Q̃_i, the update is done starting with the last subgoal, namely the subgoal closest to the final goal, and ending with the first subgoal, the one closest to the initial state. The overall procedure is described in Fig. 3. Action-values Q and Q_i are updated for s_t and a_t, and then it is checked whether these updates have made subgoal I_i redundant or harmful. Here the action-values for other state-action pairs remain unchanged, and thus it suffices that the preconditions (9) are checked for s_t and a_t only. Each coefficient c_i, i = 1, …, n, represents the non-redundancy of its associated subgoal, while d_i reflects the harmlessness of the subgoal. All coefficients c_i eventually tend to zero as the learning progresses, since the agent does not need to rely on any subgoal once it has found an optimal policy that leads the agent and environment to the final goal. On the other hand, the value of d_i depends on the property of its associated subgoal: d_i remains large if its corresponding subgoal is not harmful, while d_i associated with a harmful subgoal decreases to zero. Therefore, by inspecting the value of each d_i when the learning is complete, we can find which subgoal is harmful and which is not.
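The harmfulness test and the cautious reduction of d_i in equations (12)–(16) can be sketched in the same style; Q_tilde is the helper from the previous sketch and delta is a per-subgoal dict of δ_i values. Again, this is a simplified illustration, not the authors' code.

    import numpy as np

    def check_harmfulness(s, i, Q, Q_sub, c, d, R, delta):
        qt, qi = Q_tilde(s, i, Q, Q_sub, c, d, R), Q_sub[i][s]
        if not (qt.max() > 0 and qi.max() > 0):        # preconditions (9)
            return
        a_t = int(np.argmax(qt))                       # optimal action of equation (7)
        if int(np.argmax(qi)) == a_t:                  # no conflict, see condition (12)
            return
        # Equation (15): only actions with Q_i(s,a) > Q_i(s, a_t) constrain d_i^o
        A = [a for a in range(len(qi)) if qi[a] > qi[a_t]]
        if not A:
            return
        d_o = min((qt[a_t] - qt[a]) / (c[i] * (qi[a] - qi[a_t])) for a in A)
        if d[i] > d_o:                                 # do not update if d_i is already smaller
            d[i] = (1 - delta[i]) * d[i] + delta[i] * d_o    # equation (16)

In a full learning loop these checks would be run for s_t and a_t after each update of Q and Q_i, starting from the subgoal closest to the final goal, as described above.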

Fig. 3. Learning procedure

3. Examples
The proposed technique is tested on several example problems where an agent finds a path from the start cell to the goal cell in grid worlds. The grid worlds have several doors, each of which requires a fitting key for the agent to go through it, as shown in Fig. 4. The agent must pick up a key to reach the goal. Therefore having a key, or more precisely having just picked up a key, is a subgoal. The state consists of the agent's position (x-y coordinates) and which key the agent has. The agent can move to an adjacent cell in one of four directions (north, south, east and west) at each time step. When the agent arrives at a cell where a key exists, it picks up the key. Key 1 opens door 1, and key 2 is the key to door 2. The agent receives a reward of 1.0 at the goal cell F and also a virtual reward of 1.0 at the subgoals. When it selects a move into a wall or the boundary, a negative reward of −1.0 is given and the agent stays where it was. An episode ends when the agent reaches the goal cell or 200 time steps have passed.
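For concreteness, the kind of grid world just described can be sketched as below. The grid size, key and door positions and the class name are hypothetical placeholders rather than the layouts of Figs. 4, 6 and 8; only the rules (four moves, −1.0 for hitting a wall or the boundary, +1.0 at the goal, 200-step episode limit, keys opening their matching doors) follow the text, and treating a locked door like a wall is an assumption of the sketch.

    class KeyGridWorld:
        """Toy grid world: the agent must pick up keys to pass doors and reach the goal."""
        MOVES = {'N': (0, -1), 'S': (0, 1), 'E': (1, 0), 'W': (-1, 0)}

        def __init__(self, width=8, height=8, keys=None, doors=None, goal=(7, 7)):
            self.w, self.h = width, height
            self.keys = keys or {(2, 2): 1, (5, 1): 2}     # position -> key id (hypothetical)
            self.doors = doors or {(4, 4): 1, (6, 6): 2}   # position -> required key id
            self.goal = goal
            self.reset()

        def reset(self):
            self.pos, self.held, self.t = (0, 0), set(), 0
            return (self.pos, frozenset(self.held))        # state = position + keys held

        def step(self, action):
            self.t += 1
            dx, dy = self.MOVES[action]
            nx, ny = self.pos[0] + dx, self.pos[1] + dy
            blocked = not (0 <= nx < self.w and 0 <= ny < self.h)
            if not blocked and (nx, ny) in self.doors:
                blocked = self.doors[(nx, ny)] not in self.held   # locked door acts as a wall
            if blocked:
                return (self.pos, frozenset(self.held)), -1.0, self.t >= 200
            self.pos = (nx, ny)
            if self.pos in self.keys:
                self.held.add(self.keys[self.pos])         # picking up a key is a subgoal
            reward = 1.0 if self.pos == self.goal else 0.0
            done = self.pos == self.goal or self.t >= 200
            return (self.pos, frozenset(self.held)), reward, done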


Fig. 4. Grid world 1

Fig. 5. Subgoal structure of grid world 1

3.1 Effect of use of correct subgoals
The subgoals in the example depicted in Fig. 4 can be represented by the directed graph shown in Fig. 5. In RL, the first arrival at the goal state must be accomplished by random actions because the agent has no useful policy yet. Since the agent has to collect two keys to go through the two doors in this example, it takes a large number of episodes to arrive at the final goal by random actions only. Here we examine how much acceleration of RL is obtained by introducing correct subgoals. Q-learning is performed with and without taking the subgoals into consideration. The parameters used are as follows: discount factor γ = 0.9, learning rate α = 0.05, β in equation (11) is 0.99, and the decreasing rates δ_i of the coefficients d_i are 0.005 for subgoal I1 and 0.01 for I2. Softmax action selection is used with the 'temperature' parameter set to 0.1. The numbers of episodes required for the agent to reach the goal for the first time by the greedy action based on the learnt Q^A (i.e. the action that maximizes Q^A) and the numbers of episodes necessary to find an optimal (shortest) path to the goal are listed in Table 1. These are averages over five runs with different pseudo-random number sequences. The table indicates that consideration of the correct subgoals makes the learning more than ten times faster in this small environment, which verifies the validity of introducing correct subgoals to accelerate RL. Even more acceleration can be expected for larger or more complex environments.

                        Number of episodes
                        First arrival at the goal    Finding an optimal path
Without subgoals        11861.0                      13295.8
With subgoals           1063.2                       1068.0

Table 1. Numbers of episodes required before achieving the goal (grid world 1)


3.2 Effect of controlled use of subgoals
Now let us turn our attention to how the control of the use of subgoals by the coefficients c_i and d_i works. Here we consider another grid world, shown in Fig. 6, where key 1 is the only correct key to the door and key 2 does not open the door. We apply the proposed method to this problem considering each of the subgoal structures shown in Fig. 7. In this figure, subgoal structure (a) is the exact one, subgoal structure (b) has a wrong subgoal only, subgoal structures (c) and (d) have correct and wrong subgoals in series, and subgoal structure (e) has correct and wrong subgoals in parallel. The same values as in the previous subsection are used for the parameters other than δ_i. For the single subgoal in (a) and (b), δ1 is set to 0.01; for the series subgoals in (c) and (d), δ1 = 0.005 and δ2 = 0.01 are used; and for the parallel subgoals in (e), 0.01 is used for both δ1 and δ2.

Fig. 6. Grid world 2 with a correct key and a wrong key

Fig. 7. Possible subgoal structures for grid world 2


The numbers of episodes before the first arrival at the goal and before finding an optimal path are shown in Table 2, together with the values of the coefficients d_i after learning and, where available, the ratio of d_i for the correct subgoal (d_correct) to d_i for the wrong subgoal (d_wrong). All of these are averages over five runs with different pseudo-random number sequences.

Subgoals used in learning      Number of episodes               Coefficients d_i after learning
                               First arrival   Optimal path     d_correct     d_wrong       d_correct/d_wrong
None                           99.4            103.2            –             –             –
Correct                        76.2            79.0             2.61×10^−1    –             –
Wrong                          139.0           206.8            –             9.79×10^−5    –
Correct & wrong in series      116.6           180.0            3.48×10^−1    3.52×10^−5    1.30×10^7
Wrong & correct in series      87.4            97.8             4.15×10^−2    7.06×10^−3    1.37×10^5
Correct & wrong in parallel    116.8           163.4            9.85×10^−2    2.21×10^−4    1.03×10^8

Table 2. Numbers of episodes required before achieving the goal (grid world 2)

With the exact subgoal information given, the agent can reach the goal and find the optimal path faster than in the case without considering any subgoal. When a wrong subgoal is provided in place of, or in addition to, the correct subgoal, the learning is delayed. However, the agent can still find the optimal path, which means that introducing a wrong subgoal does not cause critical damage and that the proposed subgoal control by the coefficients c_i and d_i works well. Finding the optimal path naturally takes more episodes than finding any path to the goal. The difference between them is large in the cases where wrong subgoal information is provided, because the coefficient associated with the wrong subgoal does not decay fast enough in those cases. The preconditions (9) for reducing the coefficient demand that the subgoal in question, as well as at least one of its downstream subgoals, has already been visited. Naturally, the subgoals closer to the initial state in the state space (not in the subgoal structure graph) are more likely to be visited by random actions than those far from the initial state. In this grid world, the correct key 1 is located closer to the start cell than the wrong key 2, and therefore the coefficient of the correct subgoal decays faster while the wrong subgoal survives longer, which causes more delay in the learning. Coefficients d_i are used to reduce the effect of harmful subgoals. Therefore, by looking at their values in Table 2, we can find which subgoal has been judged to be harmful and which has not. Each of the coefficients d_i for the correct subgoals takes a value around 0.1, while each of those for the wrong subgoals is around 10^−4. Each ratio in the table is larger than 10^5. Thus the coefficients d_i reliably reflect whether their associated subgoals are harmful or not. In Table 2, the coefficient for the wrong subgoal in the case of 'wrong and correct subgoals in series' is 7.06×10^−3 and is not very small compared with the value of 4.15×10^−2 for the correct subgoal. This was caused by a single large coefficient value that appeared in one of the five runs. Even in this run, the learning is successfully accomplished just like in the other runs. If we exclude this single value from the average calculation, the average coefficient value for this subgoal is around 10^−6.


To confirm the effect of subgoal control, learning is performed with the coefficient control disabled, i.e. both c_i and d_i are fixed to 1.0 throughout the learning. In the case that the correct subgoal is given, the result is the same as that obtained with the coefficient control. However, in the other four cases, where a wrong subgoal is given, the optimal path was not found within 200000 episodes except in just one of the five runs. Therefore, simply giving virtual rewards at subgoals does not work well when some wrong subgoals are included. When either c_i or d_i is fixed to 1.0 and the other is updated in the course of learning, results similar to those obtained by updating both coefficients are found, but the learning is delayed when wrong subgoal information is provided. In the composite action-value function Q^A used in action selection, each action-value function Q_i associated with subgoal I_i is multiplied by the product of c_i and d_i. The product decreases as the learning proceeds, but its decrease is slow when either c_i or d_i is fixed. A large product of c_i and d_i makes the 'attractive force' of its corresponding subgoal strong, and the agent cannot perform a bold exploration to go beyond the subgoal and find a better policy. Then the harmfulness of a subgoal cannot be detected, since the agent believes that visiting that subgoal is part of the optimal path and has no alternative path to compare with in order to detect a conflict. Therefore, coefficient c_i must be reduced when its associated subgoal is judged to be redundant, to help the agent explore the environment and find a better policy. The above results and observations verify that proper control of the use of subgoals is essential.

3.3 Effect of subgoals on problems with different properties
In the results shown in Table 2, the learning is not accelerated much even if the exact subgoal structure is given, and the results with a wrong subgoal are not too bad. Those results of course depend on the problems to be solved. Table 3 shows the results for a problem where the positions of key 1 and key 2 are exchanged in grid world 2. The results for grid world 3, depicted in Fig. 8, are listed in Table 4; here the correct and the wrong keys are located in opposite directions from the start cell. The same parameter values as in the original grid world 2 are used in both examples. The values in the tables are again averages over five runs with different pseudo-random number sequences.

Subgoals used in learning      Number of episodes               Coefficients d_i after learning
                               First arrival   Optimal path     d_correct     d_wrong       d_correct/d_wrong
None                           323.8           343.6            –             –             –
Correct                        117.4           121.2            3.82×10^−3    –             –
Wrong                          196.8           198.4            –             2.23×10^−2    –
Correct & wrong in series      188.2           189.6            6.84×10^−3    9.42×10^−3    2.53×10^1
Wrong & correct in series      117.2           126.8            4.37×10^−5    3.80×10^−1    9.47×10^−4
Correct & wrong in parallel    100.0           100.6            2.08×10^−2    9.57×10^−3    5.14

Table 3. Numbers of episodes required before achieving the goal (grid world 2 with keys exchanged)


Fig. 8. Grid world 3 with two keys in opposite directions

By exchanging the two keys in grid world 2, the problem becomes more difficult than the original because the correct key is now far from the start cell. So, without subgoals, the learning takes more episodes, and the introduction of subgoals is more significant than before, as shown in Table 3. The wrong key is located on the way from the start cell to the correct key, and although picking up the wrong key itself has no useful meaning, the wrong subgoal guides the agent in the right direction towards the correct subgoal (the correct key). Therefore the wrong subgoal information in this grid world is wrong but not harmful; it is even helpful in accelerating the learning, as shown in Table 3. Also, since it is not harmful, the coefficients d_i corresponding to the wrong subgoals remain large after the learning.

Subgoals used in learning      Number of episodes               Coefficients d_i after learning
                               First arrival   Optimal path     d_correct     d_wrong       d_correct/d_wrong
None                           153.8           155.8            –             –             –
Correct                        95.8            99.6             4.80×10^−2    –             –
Wrong                          150.4           309.8            –             1.08×10^−4    –
Correct & wrong in series      170.2           346.8            1.84×10^−3    9.83×10^−5    7.32×10^1
Wrong & correct in series      107.2           109.0            2.01×10^−3    1.22×10^−4    4.98×10^1
Correct & wrong in parallel    106.4           226.0            6.04×10^−3    4.75×10^−4    1.05×10^6

Table 4. Numbers of episodes required before achieving the goal (grid world 3)

In contrast, the wrong key in grid world 3 lies in the opposite direction from the correct key, so this wrong subgoal has a worse effect on the learning speed, as shown in Table 4. Here the coefficients d_i for the wrong subgoals are smaller than those for the correct subgoals. For grid worlds 2 and 3, the actual subgoal structure is the one shown in Fig. 7(a). To investigate the performance of the proposed method on problems with parallel subgoals, key 2 in grid world 2 is changed to a key 1. The environment now has two correct keys, and the actual subgoal structure is just like Fig. 7(e), but both keys are correct. Five different subgoal structures are considered here: 'near subgoal', 'far subgoal', 'near and far subgoals in series', 'far and near subgoals in series' and 'near and far subgoals in parallel', where 'near subgoal' denotes the subgoal state 'picking up the key near the start cell' and 'far subgoal' refers to the subgoal 'picking up the key far from the start cell'. Note that there is no wrong subgoal in this grid world. The results shown in Table 5 are similar to those already derived. Introduction of subgoal(s) makes the goal achievement faster, but in some subgoal settings finding the optimal path is slow. The subgoal structure 'near and far subgoals in parallel' is the exact one, but it gives the worst performance in finding the optimal path in the table. In this problem, both keys correspond to correct subgoals, but one (near the start cell) is preferable to the other, and the less-preferable subgoal survives longer in this setting, as described in Section 3.2. This delays the learning.

Subgoals used in learning   Number of episodes               Coefficients d_i after learning
                            First arrival   Optimal path     For near subgoal   For far subgoal
None                        106.0           109.6            –                  –
Near                        76.2            79.0             2.62×10^−1         –
Far                         136.2           203.8            –                  2.96×10^−3
Near & far in series        126.6           205.2            4.06×10^−1         1.15×10^−5
Far & near in series        84.6            95.0             2.78×10^−2         7.06×10^−3
Near & far in parallel      116.4           169.6            7.95×10^−2         2.21×10^−4

Table 5. Numbers of episodes required before achieving the goal (grid world 2 with two correct keys)

Introduction of subgoals usually makes goal achievement (not necessarily by an optimal path) faster. But a wrong or less-preferable subgoal sometimes makes finding the optimal path slower than the case without any subgoals, especially when it occupies a position far from the initial state. However, the wrong subgoals do not cause critically harmful effects, such as impractically long delays or an inability to find the goal at all, thanks to the proposed mechanism of subgoal control. We can also find the harmful subgoals by inspecting the coefficient values used for subgoal control. This verifies the validity of the proposed controlled use of subgoals in reinforcement learning.

4. Conclusions
In order to make reinforcement learning faster, the use of subgoals is proposed, with appropriate control of each subgoal independently, since errors and ambiguity are inevitable in subgoal information provided by humans. The method is applied to grid world examples and the results show that the use of subgoals is very effective in accelerating RL and that, thanks to the proposed control mechanism, errors and ambiguity in the subgoal information do not cause critical damage to the learning performance. It has also been verified that the proposed subgoal control technique can detect harmful subgoals. In reinforcement learning, it is very important to balance exploitation, i.e. making good use of the information acquired by learning so far in action selection, with exploration, namely trying different actions in search of better actions or policies than those already derived by learning. In other words, a balance is important between what is already learnt and what is yet to be learnt. In this chapter, we have introduced subgoals as a form of a priori information. Now we must find a compromise among learnt information, information yet to be learnt and a priori information. This is accomplished, in the proposed technique, by choosing proper values for β and δ_i, which control the use of a priori information through the coefficients c_i and d_i, as well as by an appropriate choice of the exploration parameter, such as the 'temperature' used in softmax, that regulates exploration versus exploitation. A good choice of parameters may need further investigation. However, this will be done using additional a priori information such as the confidence of the human designer/operator in his/her subgoal information. A possible extension of the method is to combine it with a subgoal learning technique.

5. Acknowledgements
The author would like to acknowledge the support for part of the research by the Japan Society for the Promotion of Science, Grant-in-Aid for Scientific Research (C), 16560354, 2004-2006.


11

Fault Detection Algorithm Based on Filters Bank Derived from Wavelet Packets

Oussama Mustapha1, Mohamad Khalil2,3, Ghaleb Hoblos4, Houcine Chafouk4 and Dimitri Lefebvre1

1 University Le Havre, GREAH, Le Havre, France
2 Lebanese University, Faculty of Engineering, Lebanon
3 Islamic University of Lebanon, Biomedical Department, Khaldé, Lebanon
4 ESIGELEC, IRSEEM, Saint Etienne de Rouvray, France

1. Introduction
Fault detection and isolation (FDI) is of particular importance in industry. In fact, early fault detection in industrial systems can reduce personal damage and economic losses. Basically, model-based and data-based methods can be distinguished. Model-based techniques require a sufficiently accurate mathematical model of the process and compare the measured data with the estimations provided by the model in order to detect and isolate the faults that disturb the process. The parity space approach, observer design and parameter estimation are well-known examples of model-based methods (Patton et al., 2000), (Zwingelstein, 1995), (Blanke et al., 2003), (Maquin & Ragot, 2000). In contrast, data-based methods require a lot of measurements and can be divided into signal processing methods and artificial intelligence approaches. Many researchers have performed fault detection by using vibration analysis for mechanical systems, or current and voltage signature analysis for electromechanical systems (Awadallah & Morcos, 2003), (Benbouzid et al., 1999). Other researchers use artificial intelligence (AI) tools for fault diagnosis (Awadallah & Morcos, 2003) and frequency methods for fault detection and isolation (Benbouzid et al., 1999). This study continues our research in the frequency domain concerning fault detection by means of a filters bank (Mustapha et al.-a, 2007), (Mustapha et al.-b, 2007). The aim of this article is to propose a method for the on-line detection of changes, applied after a modeling of the original signal. This modeling is based on a filters bank decomposition that is needed to explore the frequency and energy components of the signal. The filter coefficients are derived from the wavelet packets, so the wavelet packet characteristics are approximately conserved; this allows both filtering and reconstruction of the signal. This work is a continuation of our previous works on deriving a filters bank from wavelet packets, because the wavelet packets offer more flexibility for signal analysis and offer a lot of bases to represent the signal. The main contributions are to derive the filters and to evaluate the error between the filters bank and the wavelet packet response curves. A filters bank is preferred in comparison with wavelet packets because it can be directly implemented in hardware as a real-time method. Then, the Dynamic Cumulative Sum (DCS) detection method (Khalil & Duchêne, 1999) is applied to the filtered signals (sub-signals) in order to detect any change in the signal (figure 1). The association of the filters bank decomposition and the DCS detection algorithm will be shown to be of great importance when the change is in the frequency domain.

Fig. 1. Two-stage algorithm for change detection: the signal is decomposed by an L-channel filters bank derived from wavelet packets into components y^(1)(t), …, y^(L)(t), which feed the DCS event detection stage (L is the number of filters used).

This paper is organized as follows. First we explain the problem and present the utility of the decomposition before the detection. Then, in section 3, the wavelet transform and the wavelet packets are presented, the derivation of a filters bank from wavelet packets and the problem of curve fitting are introduced and, in the same section, the best tree selection based on the entropy of the signal and the filters bank channels corresponding to the suitable scale levels are discussed. In section 4, the Cumulative Sum (CUSUM) and Dynamic Cumulative Sum (DCS) algorithms and the fusion technique are detailed. Finally, the method is applied to the diagnosis of the Tennessee Eastman Challenge Process.

2. Filters bank for decomposition and detection
The simultaneous detection and isolation of events in a noisy non-stationary signal is a major problem in signal processing. When the signal characteristics are known before and after the change, an optimal detector can be designed according to the likelihood ratio test (Basseville & Nikiforov, 1993). But when the signal to be detected is unknown, the Generalized Likelihood Ratio Test (GLRT), which consists of using the maximum likelihood estimate of the unknown signal, is used. In general, the segmentation depends on the parameters that change with time. These parameters, to be estimated, depend on the choice of the signal modeling. Most authors make use of application-dependent representations, based on AR modeling or on the wavelet transform, in order to detect or characterize events or to achieve edge detection in signals (Mallat, 2000). When the change is energetic, many methods exist for detection purposes. But when the change is in the frequency content, a special modeling, using a filters bank, is required before the application of the detection methods. After this modeling, the detection algorithm (DCS) is applied to the decomposed signals instead of the original signal (see figure 1). The motivation is that the filters bank modeling can filter the signals and transform a frequency change into an energy change. We then choose only the sub-signals which present energy changes after decomposition. Furthermore, the detectability of the DCS is improved if the changes are in energy. The sub-signals can also be used to classify the detected events; this is done after extracting the necessary parameters from the isolated events and finally aims at diagnosis.


This work originates from the analysis and characterization of random signals. In our case, the recorded signals can be described by a random process x(t) as:

x(t) = x1(t)  before the point of change t_r,
x(t) = x2(t)  after the point of change t_r,

where t_r is the exact time of change. x1(t) and x2(t) can be considered as random processes whose statistical features are unknown but assumed to be identical within each segment 1 or 2. Therefore we assume that the signals x1(t) and x2(t) have Gaussian distributions. We also suppose that the appearance times of the changes are unpredictable. In our case, we suppose that the frequency distribution is an important factor for discriminating between successive events. In this way, the filters bank decomposition will be useful for classification purposes. For each component m, and at any discrete time t, the sample y^(m)(t) is computed on-line in terms of the original signal x(t), using the parameters a^(m)(i) and b^(m)(i) of the corresponding filter, according to the following difference equation:

y^(m)(t) = Σ_{i=0}^{p} b^(m)(i) x(t − i) − Σ_{i=1}^{q} a^(m)(i) y^(m)(t − i)    (1)
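Equation (1) is the standard IIR difference equation, so one channel of the bank can be applied to a sampled signal with an off-the-shelf filtering routine; the Butterworth design below is only a placeholder channel, not one of the filters derived later in the chapter.

    import numpy as np
    from scipy.signal import butter, lfilter

    # Placeholder channel: 6th-order low-pass with normalized cut-off 0.25 (Nyquist = 1.0)
    b, a = butter(6, 0.25)

    x = np.random.randn(2000)     # example input signal x(t)
    # lfilter realizes equation (1): y(t) = sum_i b(i) x(t-i) - sum_i a(i) y(t-i), with a(0) = 1
    y = lfilter(b, a, x)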

After decomposition of x(t) into y^(m)(t), m = 1, 2, …, L, the problem of detection can be transformed into a hypothesis test:

H0: y^(m)(t), t = {1, …, t_r}, has a probability density function f0;
H1: y^(m)(t), t = {t_r + 1, …, n}, has a probability density function f1.

Figure 2 shows an original signal presenting a change in frequency content at 1000 time units (TU). We can see that the decomposition enhances the energy changes, so it is more effective to apply the detection to the sub-signals than to the original signal.

Fig. 2. a) Original simulated signal presenting a frequency change at t_r = 1000 TU. b, c, d) Decomposition of the signal into three sub-signals using a 3-channel filters bank.


3. Filters bank derived from wavelet packets

3.1 Wavelet Transform
Fourier analysis is the most well-known mathematical tool for transforming a signal from the time domain to the frequency domain. However, it has an important drawback: the loss of time information when transforming the signal to the frequency domain. To preserve the temporal aspect of the signals when transforming them to the frequency domain, one solution is to use the Wavelet Transform (Flandrin, 1993), which analyzes non-stationary signals by mapping them into time-scale and time-frequency representations. The Wavelet Transform is similar to the Short Time Fourier Transform but provides, in addition, a multi-resolution analysis with dilated and shifted windows. The multi-resolution analysis consists of decomposing the signal x(t) using the wavelet ψ(t) and its scale function φ(t) (Papalambros & Wilde, 2000).

ψ_{a,b}(t) = (1/√a) ψ((t − b)/a)    (2)

T_x^ψ(a, b) = ∫_{−∞}^{+∞} x(t) ψ_{a,b}(t) dt    (3)

where a and b are respectively the dilation and translation parameters. The filter associated with the scale function φ(t) is a low-pass filter and the filter associated with the wavelet ψ(t) is a band-pass filter. The relation (3) can be seen as the cross-correlation function of the signal x(t) and (1/√a) ψ(t/a). This relation can also be written as follows:

T(τ, a) = x(τ) ∗ (1/√a) ψ*(−τ/a)    (4)

But knowing that ψ*(−t) = ψ(t), then:

T(τ, a) = x(τ) ∗ (1/√a) ψ(τ/a)    (5)

The relation (5) indicates that the wavelet transform can be obtained by filtering the signal x(t) with a series of band-pass filters of impulse responses (1/√a) ψ(t/a) and of central frequencies centered at f0/a. The bandwidth of the band-pass filters bank decreases when a increases. In order to decompose a signal into components of equal decreasing frequency intervals, we have to use a discrete time-frequency domain and the dyadic wavelet transform: for a = 2^j and b = k 2^j (j and k integers), the wavelets (1/√a) ψ((t − b)/a) become:


ψ_{j,k}(t) = 2^(−j/2) ψ(2^(−j) t − k)    (6)


Fig. 3. Response curves of the wavelet filters.

If the signal x(t) is sampled, the discrete wavelet transform will be:

T_{j,k} = Σ_{n=−∞}^{+∞} x_n ψ*_{j,k}(n)    (7)

Note that all the wavelet frequencies will be between 0 and f_s (the sampling frequency of the signal). The use of the discrete time-frequency domain and the dyadic wavelet transform leads to a decomposition of the signal into components of equal decreasing frequency intervals.

3.1.1 Scaling function and wavelet bases
The projection of a function f(t) ∈ L2(R) on the orthonormal basis {φ(t − k)} is a correlation between the original function f(t) and the scaling function φ(t) sampled at integer intervals:

approx_j(t) = (proj_{V_j} x)(t)    (8)

The approximations resulting from the projection of f(t) on the scaling function basis form subspaces V_j ⊂ L2(R). The projection of a function on the orthonormal scaling function basis results in a loss of some information about the function:

detail_j(t) = approx_{j−1}(t) − approx_j(t)    (9)

To obtain the detail information of the function, we use the projections on the orthonormal wavelet bases. With Discrete Wavelet Transform (DWT), the multi-resolution analysis uses a scaling function and a wavelet to perform successive decompositions of the signal into approximations and details (Chendeb, 2002).


Fig. 4. Multiresolution Analysis: successive decompositions into approximations and details.


3.1.2 Orthogonal and bi-orthogonal analysis with the Mallat algorithm
In practice, Mallat's algorithm is used for orthogonal and bi-orthogonal analysis. This algorithm consists of low-pass filtering the signal with a filter having impulse response h[n] and then down-sampling the result to obtain the lower resolution approximation, and high-pass filtering the signal with a filter having impulse response g[n] and then down-sampling the result to obtain the lower resolution detail information. The numerical sequence h[n] is considered as the impulse response of a digital filter. In filters bank theory, the wavelet can be modeled by a discrete quadrature mirror filters bank g[n] and h[n] such that (Flandrin, 1993):

g[n] = (−1)^n h[1 − n]    (10)

h[n] is considered as the impulse response of a discrete low-pass filter whose coefficients represent the scale function φ(t), and g[n] as the impulse response of a discrete high-pass filter whose coefficients represent the wavelet ψ(t). Figure 5 represents Mallat's multiresolution recursive analysis.

Fig. 5. Analysis algorithm of Mallat.

The discrete wavelet transform consists of a discrete filtering operation followed by a down-sampling process. Figure 6 shows the response curves of the quadrature mirror filters used for a three-level decomposition.

Fig. 6. Response curves of the filters used in Mallat's algorithm (f_ci are the cut-off frequencies of the filters used).
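A single analysis step of this scheme can be sketched as follows; the Haar low-pass filter is only a placeholder wavelet, and the index handling in the QMF relation and the border treatment are simplifications of mine for finite-length filters.

    import numpy as np

    def qmf_highpass(h):
        # Equation (10): g[n] = (-1)^n h[1-n]; indices taken modulo len(h) for a finite filter
        n = np.arange(len(h))
        return ((-1.0) ** n) * h[(1 - n) % len(h)]

    def dwt_level(x, h):
        # One Mallat analysis step: filter with h (low-pass) and g (high-pass), keep every 2nd sample
        g = qmf_highpass(h)
        approx = np.convolve(x, h)[::2]    # lower-resolution approximation
        detail = np.convolve(x, g)[::2]    # lower-resolution detail
        return approx, detail

    # Example with the Haar low-pass filter (placeholder choice of wavelet)
    h_haar = np.array([1.0, 1.0]) / np.sqrt(2.0)
    a1, d1 = dwt_level(np.random.randn(512), h_haar)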


3.1.3 Base and wavelet selection
The choice of the wavelet is a critical problem. To extract a specific signal event, the choice of the wavelet becomes important: the wavelet must be adapted to the changes to be detected. On the other hand, the choice of the scale levels is also important. After decomposing a signal, the most suitable scale levels are chosen according to their frequency bands, and the detection must be applied to the most suitable scale levels. Most likely, the choice of the filters bank, after the derivation, depends on the original signal and its frequency band. The number of filters depends on the details that have to be extracted from the signal and on the events that must be distinguished. The result of detection depends on the number of filters used and on the cut-off frequency of each channel. In practice, L-channel derived filters are used. These filters are chosen so that the cut-off frequencies lie between zero Hertz and half the sampling frequency (f_s/2). We will see later an automatic procedure, based on the entropy of the signal, to select the best filters bank cut-off frequencies.

3.2 The wavelet packets
The wavelet packet decomposition is a generalization of the wavelet decomposition that offers more flexibility for signal analysis. In wavelet analysis, the approximation of the signal is decomposed into a lower-level approximation and detail. So the dyadic decomposition tree of the wavelet transform represents the signals by means of a fixed basis, and for an n-level decomposition tree we have n + 1 possible ways to decompose the signal. In wavelet packet analysis, the details as well as the approximations can be decomposed into lower-level approximations and details, and this yields more than 2^(2^(n−1)) possible ways to decompose the signal. So the wavelet packets offer a lot of bases to represent the signal, from which a best decomposition tree can be selected, according to a given criterion, to meet the design objectives.

Fig. 7. Wavelet packet: decomposition of both the details and the approximations.

3.2.1 Wavelet packets decomposition
The wavelet packets are defined by Coifman, Meyer and Wickerhauser (Mallat, 1999) as a generalized relation between the multiresolution approximations and the wavelet. The approximation multiresolution space V_j is decomposed into a lower resolution approximation V_{j+1} and details W_{j+1}. This is done by decomposing the orthogonal basis {φ_j(t − 2^j n)}_{n∈Z} of V_j into two new orthogonal bases {φ_{j+1}(t − 2^{j+1} n)}_{n∈Z} of V_{j+1} and {ψ_{j+1}(t − 2^{j+1} n)}_{n∈Z} of W_{j+1}.

The difference between the wavelet and the wavelet packet decompositions is shown by the binary trees in figure 8. For the sake of simplicity, in the wavelet packet decomposition the vector space V is replaced by W.


Fig. 8. Wavelet and wavelet packet decompositions: a) wavelet decomposition; b) wavelet packet decomposition.

The wavelet packet nodes can be indexed by the scale parameter j, the node index n at scale level j and the time parameter k. For different values of j and n, the wavelet packet decomposition can be organized as a binary tree, as shown in figure 9.


Fig. 9. Binary tree obtained after wavelet packets decomposition.


Figure 9 shows clearly that the wavelet packet decomposition offers a lot of bases to represent the signal: we have more than 2^(2^(n−1)), instead of n + 1, possible ways to decompose the signal, from which a best decomposition tree can be selected for signal analysis.

3.2.2 Wavelet packets filters bank
Each node of the binary tree is indexed by (j, n), where n is the number of the node at level j. For each node there is an associated space W_{j,n} which admits an orthogonal basis {ψ_j^n(t − 2^j k)}_{k∈Z}. The relations between a node and its children nodes can be written as follows:

ψ_{j+1,2n}(t) = Σ_{k=−∞}^{+∞} h[k] ψ_{j,n}(t − 2^j k)    (11)

and

ψ_{j+1,2n+1}(t) = Σ_{k=−∞}^{+∞} g[k] ψ_{j,n}(t − 2^j k)    (12)

{

}

C j ,n = C j ,n ( k ) as the projection of the signal in the sub-space W j ,n where ψ j,n is an k∈Z orthogonal basis. For each node (j,n) of the tree, the coefficients of the wavelet packets of a function f (t ) can be calculated as follows: C j ,n (k ) = f (t ),ψ j ,n (t − 2 j k )

(13)

The wavelet packets decomposition is obtained by using pyramidal recursive algorithm as follows: ^

C j +1, 2 n ( k ) = ∑ h[2k − l ]C j ,n l

^

C j +1, 2 n+1 (k ) = ∑ g[ 2k − l ]C j ,n l

(14)

(15)

The packet C j +1, 2 n (respectively C j +1, 2 n +1 ) is obtained by low-pass (respectively high-pass) filtration of C j ,n by using the filter g (resp. h), followed by down sampling by factor of 2 that projects the signal to sub-space W j +1,2 n (resp. W j +1, 2 n+1 ).

192

Robotics, Automation and Control

3.2.3 Best tree selection The wavelet packet decomposition generates redundant representations. Each decomposition level contains all the signal information. Some combinations of packets contain non-redundant representation of the signal, from which a best tree can be selected. The selection must be based on a criterion according to the problem to be treated. In our case and for detection purposes, the Shannon entropy-based algorithm is used, where the criterion is defined to allow an optimal separation of the frequency components of the signal (in case of frequency change). The best tree selection algorithm offers optimal signal representation (Coifman & Wicherhauser, 1992). For a series of discrete elements X = {x1 , x 2 ,..., x N } , the Shannon entropy is defined by: N

M (x ) = − ∑ (xi )2 log( xi ) 2

(16)

i =1

where (xi )2 log( xi ) 2 = 0 if xi = 0 . M(X), which is the expectation of information quantity, reflects the concentration of energy in the discrete series X. Based on the entropy criterion, the best tree selection is done as follows: a. For a given node Aj,n, the energy is compared to threshold energy (SE) then this node is labeled by 0 if its energy is lower than SE and by 1 in other cases.

⎧⎪1 if energy (C j ,n ) > SE A j ,n = ⎨ ⎪⎩0 else

(17)

b. Indicate the number of components present in each packet. c. Select the single-component packets. In (Hitti. & Eric, 1999), Hitti proposed a method to determine the threshold energy by estimating the sum of energies of packets in each level j ( EnsSP j ) and then the threshold energy can be calculated as follows:

Sj =

EnsSPj

(18)

2j

Figure (10) shows the different steps to select the best tree. 1

1

1

1

1

1

1

1

1

1

a) Tree obtained after thresholding the packets energies (the node is labeled by 0 if its energy is lower than SE and by 1 in other cases).

193

Fault Detection Algorithm Based on Filters Bank Derived from Wavelet Packets

2

2

2

2

1

0 0 1 1 b) Number of present components per packet. 0

1


c) The best tree is determined by the single-component packets (the nodes labeled 1).

Fig. 10. Steps of best tree selection according to the Hitti algorithm.

3.3 Filters bank parameters estimation (Butterworth filters)
In order to explore the frequency and energy components of the original signal, an important pre-processing step is required before detection, feature extraction and classification. At a discrete time t, the signal is first decomposed by using an L-channel filters bank whose parameters are derived from wavelet packets. Each component m ∈ {1, …, L} is the result of filtering the original signal x(t) by the derived filter for a given cut-off frequency. The filter coefficients are derived from the wavelet packets in order to use them as wavelet packets, so the wavelet packet characteristics are approximately conserved and can be used for both filtering and reconstruction of the signal. Moreover, the wavelet packet derived filters bank can be hardware implemented and used for on-line detection. For a given wavelet, we can use its frequency-domain transfer function h_wav to derive (extract) the transfer function of a low-pass Butterworth filter h_filt. After the extraction, the frequency responses of the wavelet packet based filters bank are determined as follows:
1. Selection of the wavelet.
2. Extraction of the transfer function of the wavelet, h_wav.
3. Calculation of the cut-off frequency (f_c) of the wavelet filter (at −3 dB attenuation).
4. Estimation of the order of the filter that minimizes the error between the two transfer functions h_wav and h_filt.
The estimation is done by using the least squares method to calculate the optimal filter order N that minimizes the error between the two transfer functions h_wav and h_filt (figure 11).


Fig. 11. Curve representing the variation of the error in terms of the order of the filter.

The derived filters bank is then obtained as follows:
1. Decompose the signal by using the wavelet packets.
2. Select the best tree by using an entropy-based method.
3. Determine the suitable cut-off frequencies of the filters bank according to the selected best tree.
4. Design the corresponding Butterworth filters according to the general Butterworth transfer function:

|H(jf)| = 1 / √(1 + (f / f_c)^(2N))    (19)

where H(jf) is the transfer function, f is the frequency, f_c is the cut-off frequency and N is the order of the filter. Practically, the coefficients a(i) and b(i) needed to realize the filter of equation (19) are determined from the Butterworth tables. Note that m represents the decomposition level. To obtain quadrature mirror filters, we can extract a high-pass filter by using the following relation:

h_filt_H(k) = (−1)^k h_filt_L(2N + 1 − k),  for k = 1, 2, …, 2N    (20)

So we can design a filters bank that behaves like a wavelet packet decomposition. This filters bank can be used, instead of wavelet packets, to decompose a signal into frequency components in order to explore the signal. The main problem is to solve a nonlinear curve-fitting (data-fitting) problem in the least squares sense: given the input data h_wav and the desired output data h_filt, finding the coefficient N that best fits the two curves is a well-known problem in numerical analysis. This problem is named the curve-fitting problem (Papalambros & Wilde, 2000), (Rustagi, 1994). It consists in finding the parameters which permit the best fitting of the analytic model to the real data. If the model is linear, the problem is solved by using linear regression methods (Saporta, 1990), and if the model is nonlinear, it is solved by using nonlinear regression methods (Rardin, 1998).


The optimization consists in expressing the relation between the two variables f and h_wav through a function h_filt that has a nonlinear dependence on the parameter N:

h_filt = F(f, N)    (21)

In order to solve this problem by the least squares method, we must minimize with respect to N the quadratic error:

min_N (1/2) ‖h_filt(N, f) − h_wav‖² = (1/2) Σ_i ( h_filt(N, f_i) − h_wav,i )²    (22)

So the nonlinear curve-fitting problem can be solved by the classical Gauss–Newton algorithm (Papalambros & Wilde, 2000): knowing the desired output, we can find the optimal coefficient N that best fits the two curves. Because F is a nonlinear function of the parameter N, the optimization is done by an iterative method: we start with a given value of N, and then we continue the iterations until the parameter N no longer changes. Figure 12 shows the response curve of the wavelet 'db5' and the response curve of the filter derived from it. The order of the derived filter is 6, and the error, calculated by using the Euclidean distance

e = Σ (h_wav − h_filt)²,    (23)

is e = 0.78869.

Fig. 12. Response curve of a Butterworth low-pass filter derived from the wavelet 'db5' (error e = 0.78869).
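The order estimation described above can be approximated by an exhaustive search over integer orders, standing in for the Gauss-Newton iteration of the text; the wavelet response h_wav, the cut-off frequency and the frequency grid below are synthetic placeholders, so the resulting order has no relation to the db5 example quoted above.

    import numpy as np

    def butterworth_mag(f, fc, N):
        # Equation (19): magnitude response of an N-th order low-pass Butterworth filter
        return 1.0 / np.sqrt(1.0 + (f / fc) ** (2 * N))

    def fit_order(f, h_wav, fc, max_order=20):
        # Minimize the quadratic error of equations (22)-(23) over integer orders N
        errors = {N: float(np.sum((butterworth_mag(f, fc, N) - h_wav) ** 2))
                  for N in range(1, max_order + 1)}
        best = min(errors, key=errors.get)
        return best, errors[best]

    f = np.linspace(0.0, 500.0, 501)                           # synthetic frequency grid
    h_wav = butterworth_mag(f, 250.0, 6) + 0.01 * np.random.randn(f.size)   # placeholder "wavelet" response
    N, err = fit_order(f, h_wav, fc=250.0)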

4. Real time detection

4.1 Sequential algorithms
The CUSUM algorithm is based on a recursive calculation of the logarithm of the likelihood ratios. This method can be considered as a sequence of repeated tests around the point of change t_r (Nikiforov, 1986), (Basseville & Nikiforov, 1993). Let x1, x2, x3, …, x_t be a sequence of observations. Let us assume that the distribution of the process X depends on a parameter θ0 until time t_r and on a parameter θ1 after time t_r. At each time t we compute the sum of the logarithms of the likelihood ratios as follows:

S_1^(t,m) = Σ_{i=1}^{t} s_i^(m) = Σ_{i=1}^{t} ln [ f_{θ1}(x_i | x_{i−1}, …, x_1) / f_{θ0}(x_i | x_{i−1}, …, x_1) ]    (24)

where f_θ is the probability density function. The importance of this sum comes from the fact that its sign changes after the point of change. The CUSUM method is usually applied when we have a priori knowledge about the segments before and after the changes (energy change or probability density distribution before and after the change). The Dynamic Cumulative Sum (DCS) technique, based on the local dynamic cumulative sum, is preferred when the parameters of the signal are unknown (Khalil & Duchêne, 1999). The DCS is a repeated test around the point of change t_r. It is based on the local cumulative sum of the likelihood ratios between two local segments estimated at the current time t. These two dynamic segments, S_a^(t) (after t) and S_b^(t) (before t), are estimated by using two windows of width W (figure 13) before and after the instant t as follows:

S_b^(t): x_i, i = {t − W, …, t − 1}, follows a probability density function f_{θb}(x_i);
S_a^(t): x_i, i = {t + 1, …, t + W}, follows a probability density function f_{θa}(x_i).

The parameters θ̂_b^(t) of the segment S_b^(t) are estimated using the W points before the instant t, and the parameters θ̂_a^(t) of the segment S_a^(t) are estimated using the W points after the instant t. At a time t, and for each level m, the DCS is defined as the sum of the logarithms of the likelihood ratios from the beginning of the signal up to time t:

DCS^(m)(S_a^(m), S_b^(m)) = Σ_{i=1}^{t} ln [ f_{θ̂_a^(i)}(x_i) / f_{θ̂_b^(i)}(x_i) ] = Σ_{i=1}^{t} S̃_i    (25)

g ( m) t = max[ DCS ( m) ( S a(t ) , S b(t ) )] − DCS ( m) ( S a(t ) , S b(t ) ) 1≤i ≤t

(26)

The instant at which the procedure is stopped is ts = min {t : g(m)t ≥ h}, where h is the detection threshold. The point of change is estimated as follows:

tc = max{t > 1: g ( m ) t = 0}

(27)

The application of the DCS method improves the detection when applied after filtration, specially when the signal presents no abrupt change, and the direct application of the DCS algorithm leads us to faulty results and sometimes difficult to interpret for accurate fault

Fault Detection Algorithm Based on Filters Bank Derived from Wavelet Packets

197

detection (figure 14). The use of the filters bank improves considerably the detection ability of the DCS method. And the sub-signal frequency components, constituting the frequencies in defined ranges will be transformed into energy changes.

Fig. 13. Application of the DCS on a signal of abrupt change. a) Original signal. b) DCS function. c) Detection function g(t).

Fig. 14. Application of the DCS on a signal with frequency change. a) Original signal. b) DCS function. c) Detection function g(t).

Fig. 15. Results after the decomposition. At left: decomposition of the signal into 3 components. At right: the detection functions of each component. a) Original signal presenting frequency change at 1000 T.U. b) DCS applied directly on the original signal. c) d) e) Decomposition using filters bank derived from the 'Haar' wavelet packet. f) g) h) Detection functions applied on the filtered signals (c, d, e).


Figure (15) clearly shows the improvement offered by the derived filters bank preprocessing before the application of the DCS algorithm. The original signal is a simulated signal obtained by concatenating two segments S1(t), t∈[0, 1000 s] and S2(t), t∈[1001, 2000 s], i.e. the real point of change is tr = 1000. These segments are generated by filtering white Gaussian noise with two band-pass filters of central frequencies f1 and f2. The point of change of each component is calculated by the detection algorithm.
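A test signal of this kind can be simulated along the lines sketched below; the filter order and the two centre frequencies are placeholders, since the chapter does not give the numerical values of f1 and f2.

```python
import numpy as np
from scipy.signal import butter, lfilter

def bandpass_noise(n, f_lo, f_hi, fs=1.0, seed=0):
    """White Gaussian noise filtered by a band-pass Butterworth filter."""
    rng = np.random.default_rng(seed)
    b, a = butter(4, [f_lo, f_hi], btype="band", fs=fs)
    return lfilter(b, a, rng.normal(size=n))

# Two concatenated segments with different central frequencies (illustrative
# values); the true point of change is t_r = 1000, as in the chapter's example.
s1 = bandpass_noise(1000, 0.05, 0.10, seed=1)   # central frequency f1 (assumed)
s2 = bandpass_noise(1000, 0.20, 0.25, seed=2)   # central frequency f2 (assumed)
signal = np.concatenate([s1, s2])
```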

4.2 Fusion of change points
Because the detection algorithm is applied individually to each frequency component, it is important to apply a fusion technique to the resulting times of change in order to get a single value for a given fault in the system. The fusion technique is achieved as follows: each point of change at a given level is considered as an interval [tc − a, tc + a], where a is an arbitrary number of points taken before and after the point of change. The time intervals with common areas are considered to correspond to the same fault. The resulting point of change tf is calculated as the center of gravity (or mean) of the superimposed intervals. This procedure is shown in figure 16. The signal is a simulated one with two segments (each segment has 1000 points) of different frequency. The point of change of each component is calculated by the detection algorithm, and then the fusion technique is applied to get a single point of change tf. The real point of change is tr = 1000 and, after fusion, the estimated point of change is tf = 1003.
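A minimal sketch of the interval-merging step, assuming a fixed half-width a and simple averaging of the grouped change points (the function name and the value of a are illustrative):

```python
import numpy as np

def fuse_change_points(change_points, a=20):
    """Each detected time t_c becomes the interval [t_c - a, t_c + a];
    overlapping intervals are assumed to belong to the same fault and are
    replaced by the mean (centre of gravity) of their centres."""
    pts = sorted(change_points)
    fused, group = [], [pts[0]]
    for tc in pts[1:]:
        if tc - a <= group[-1] + a:       # intervals share a common area
            group.append(tc)
        else:
            fused.append(float(np.mean(group)))
            group = [tc]
    fused.append(float(np.mean(group)))
    return fused

# e.g. change points detected on three frequency components
print(fuse_change_points([998, 1003, 1008]))   # -> [1003.0]
```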

Fig. 16. Fusion of the change points. a) Original signal presenting a frequency change b) DCS applied directly on the original signal c) d) e) Decomposition using derived filters f) g) h) Detection functions applied on the filtered signals.



Fig. 17. Obtained best tree (nodes (0,0), (1,0), (1,1), (2,0), (2,1), (3,2), (3,3)).
Figure 17 shows the best tree obtained after decomposing the signal by using the filters bank derived from the 'Haar' wavelet packet.
4.3 Performance evaluation
To study the performance of the detection method, the probability of detection and the probability of false alarm are calculated, together with the detection delay, which is the difference between the stopping time and the real time of change. In order to evaluate the performance of our algorithm, we compare the DCS with filtration and the DCS without filtration in the case of a change in variance. The comparison is done by using the Receiver Operating Characteristic (ROC) curve, which represents the probability of detection as a function of the probability of false alarm. The two methods are tested using 20 randomly generated signals. The variances of the signals are 1 and 1.3. The comparison between the curves in figure 18 shows that, for the same probability of false alarm, the probability of detection in the case of DCS with filtration is higher than that of the DCS without filtration. However, let us focus on the following remarks:
1. The use of a filters bank to decompose the signal before the application of the DCS method introduces a detection delay, because the current output sample is calculated in terms of current and past input samples. In the case of on-line detection, it is necessary to estimate a model on line, at the same time as the input data are received, in order to make the decision on-line and to investigate possible time variation of the signal parameters during the collection of data. Our future work includes the study of the detection delay in order to deal with real-time applications.
2. The use of the filters bank is not suitable in the case of varying-mean signals, because the band-pass filters used to decompose the signal will attenuate any mean variation.
In this last part, the influence of the window size is also investigated. The comparison between the curves in figure 19 shows that, for the same probability of false alarm, the probability of detection increases if the width W of the window increases. These curves are plotted for W = 25, W = 50 and W = 100 points. It is clear that if the window becomes large, the false alarm probability decreases; but in this case small events will not be detected and the detection delay will be increased. A trade-off must be found between the detection delay and the length of the window.
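One possible way to compute such ROC points is sketched below; the toy variance-ratio detector and the thresholds are placeholders used only to make the example self-contained, not the chapter's detector.

```python
import numpy as np

def roc_points(detector, signals_with_change, signals_without_change, thresholds):
    """For each threshold, estimate the probability of detection on signals that
    contain a change and the probability of false alarm on signals that do not."""
    curve = []
    for h in thresholds:
        pd = np.mean([detector(s, h) for s in signals_with_change])
        pfa = np.mean([detector(s, h) for s in signals_without_change])
        curve.append((pfa, pd))
    return curve

# usage with a toy variance-ratio detector (illustrative only)
rng = np.random.default_rng(0)
with_change = [np.concatenate([rng.normal(0, 1, 500), rng.normal(0, 1.3, 500)])
               for _ in range(20)]
no_change = [rng.normal(0, 1, 1000) for _ in range(20)]
toy = lambda s, h: float(s[500:].var() / s[:500].var() > h)
print(roc_points(toy, with_change, no_change, thresholds=[1.0, 1.1, 1.2]))
```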


Fig. 18. Comparison between the DCS and DCS with decomposition.

Fig. 19. Comparison between the DCS with filtration for three different window lengths.

5. Applications to the Tennessee Eastman Challenge Process
In this section, the method based on wavelet packet filters bank decomposition and the DCS algorithm is applied to detect disturbances on the Tennessee Eastman Challenge Process (TECP, Figure 20). The TECP (Downs & Vogel, 1993) is a multivariable, non-linear, high-dimensionality, open-loop unstable chemical process that simulates a real chemical plant provided by the Eastman Company. The process produces the final products G and H from four reactants A, C, D and E. The plant has 7 operating modes, 41 measured variables and 12 manipulated variables. There are also 20 disturbances, IDV1 through IDV20, that can be simulated (Downs & Vogel, 1993), (Singhal, 2000). The sampling period for measurements is 60 seconds. The TECP offers numerous opportunities for control and fault detection and isolation studies. In this work, we use a robust adaptive multivariable (4 inputs and 4 outputs) RTRL neural network controller (Leclercq et al., 2005), (Zerkaoui et al., 2007) to regulate the temperature (Y1) and pressure (Y2) in the reactor, and the levels in the separator (Y3) and stripper (Y4). For this purpose, the controller drives the purge valve (U1), the stripper input valve (U2), the condenser CW valve (U3) and the reactor CW valve (U4). The controller is presented in figure 20 (full lines represent measurements and dashed lines represent actuator updates). This controller compensates all perturbations IDV1 to IDV20 except IDV1, IDV6 and IDV7. In particular, the controller is robust to perturbation IDV16, which is used in the following.

Fig. 20. Tennessee Eastman Challenge Process and robust adaptive neural networks controller (Leclercq et al., 2005), (Zerkaoui et al, 2007).


The figure 21 illustrates the advantage of our method to detect changes for real world FDI applications. Measurements of the stripper level (figure 21 a) are decomposed into 3 components by using filters bank derived from the 'Haar' wavelet packet. From time tr = 600 hours, the perturbation IDV16, that corresponds to a random variation of the A, B, C composition, modifies the dynamical behavior of the system. The detection functions applied on the 3 components (figure 21 f, g, h) can be compared with the detection function applied directly on measurement of pressure (figure 21 b). After fusion, the point of change is calculated to be tf = 659. Detection results are considerably improved by using the derived filters bank as a preprocessor.

Fig. 21. Analysis of the stripper level measurements (%) for TECP with robust adaptive control and for IDV 16 perturbation from t = 600. At left: decomposition of the signal into 3 components. At right: the detection functions of each component. a) Original signal b) DCS applied directly on the original signal. c) d) e) Decomposition using filters bank derived from the 'Haar' wavelet packet. f) g) h) Detection functions applied on the filtered signals (c, d, e).

6. Conclusions and perspectives The aim of our work is to detect the point of change of statistical parameters in signals collected from complex industrial systems. This method uses a filters bank derived from a wavelet packet and combined with DCS to characterize and classify the parameters of a signal in order to detect any variation of the statistical parameters due to any change in frequency and energy. The main contribution of this paper is to derive the parameters of a filters bank that behaves as a wavelet packet. The proposed algorithm provides also good results for the detection of frequency changes in the signal. The application to the Tennessee Eastman Challenge Process illustrates the interest of the approach for on–line detection and real world applications.


In the future, our algorithm will be tested with more data issued from several systems in order to improve and validate it and to compare it with other methods. We will consider mechanical and electrical machines (Awadallah & Morcos, 2003; Benbouzid et al., 1999), and as a consequence our intention is to develop FDI methods for wind turbines and renewable multi-source energy systems (Guérin et al., 2005).

7. References
Awadallah, M. & Morcos, M.M. (2003). Application of AI tools in fault diagnosis of electrical machines and drives – a review, IEEE Trans. Energy Conversion, Vol. 18, No. 2, pp. 245-251, June 2003.
Basseville, M. & Nikiforov, I. (1993). Detection of Abrupt Changes: Theory and Application, Prentice-Hall, Englewood Cliffs, NJ.
Benbouzid, M.; Vieira, M. & Theys, C. (1999). Induction motor's faults detection and localization using stator current advanced signal processing techniques, IEEE Transactions on Power Electronics, Vol. 14, No. 1, pp. 14-22, January 1999.
Blanke, M.; Kinnaert, M.; Lunze, J. & Staroswiecki, M. (2003). Diagnosis and Fault Tolerant Control, Springer Verlag, New York.
Coifman, R.R. & Wickerhauser, M.V. (1992). Entropy based algorithms for best basis selection, IEEE Trans. Inform. Theory, Vol. 38, pp. 713-718.
Downs, J.J. & Vogel, E.F. (1993). A plant-wide industrial control problem, Computers and Chemical Engineering, Vol. 17, pp. 245-255.
Flandrin, P. (1993). Temps-fréquence, Éditions Hermès, Paris.
Guérin, F.; Druaux, F. & Lefebvre, D. (2005). Reliability analysis and FDI methods for wind turbines: a state of the art and some perspectives, 3rd French-German Scientific Conference "Renewable and Alternative Energies", December 2005, Le Havre and Fécamp, France.
Hitti, E. (1999). Sélection d'un banc optimal de filtres à partir d'une décomposition en paquets d'ondelettes. Application à la détection de sauts de fréquences dans des signaux multicomposantes, Thèse de doctorat, Sciences de l'Ingénieur, Spécialité Automatique et Informatique Appliquée, École Centrale de Nantes, 9 November 1999.
Khalil, M. (1999). Une approche pour la détection fondée sur une somme cumulée dynamique associée à une décomposition multiéchelle. Application à l'EMG utérin, Dix-septième Colloque GRETSI sur le traitement du signal et des images, Vannes, France.
Khalil, M. & Duchêne, J. (1999). Dynamic Cumulative Sum approach for change detection, EDICS No. SP 3.7.
Leclercq, E.; Druaux, F.; Lefebvre, D. & Zerkaoui, S. (2005). Autonomous learning algorithm for fully connected recurrent networks, Neurocomputing, Vol. 63, pp. 25-44.
Mallat, S. (1999). A Wavelet Tour of Signal Processing, Academic Press, San Diego, CA.
Mallat, S. (2000). Une exploration des signaux en ondelettes, Les Éditions de l'École Polytechnique, Paris, July 2000. http://www.cmap.polytechnique.fr/~mallat/Wavetour_fig/
Chendeb, M. (2002). Détection et caractérisation dans les signaux médicaux de longue durée par la théorie des ondelettes. Application en ergonomie, stage du DEA Modélisation et Simulation Informatique (AUF), October 2002.


Maquin, D. & Ragot, J. (2000). Diagnostic des systèmes linéaires, Hermès, Paris.
Mustapha, O.; Khalil, M.; Hoblos, G.; Chafouk, H.; Ziadeh, H. & Lefebvre, D. (2007). About the detectability of the DCS algorithm combined with a filters bank, Qualita 2007, Tangier, Morocco, April 2007.
Mustapha, O.; Khalil, M.; Hoblos, G.; Chafouk, H. & Lefebvre, D. (2007). Fault detection algorithm using DCS method combined with filters bank derived from the wavelet transform, IEEE-IFAC ICINCO 2007, 9-11 May, Angers, France.
Nikiforov, I. (1986). Sequential detection of changes in stochastic systems, Lecture Notes in Control and Information Sciences, NY, USA, pp. 216-228.
Papalambros, P.Y. & Wilde, J.D. (2000). Principles of Optimal Design: Modeling and Computation, Cambridge University Press, USA.
Patton, R.J.; Frank, P.M. & Clark, R. (2000). Issues of Fault Diagnosis for Dynamic Systems, Springer Verlag.
Rardin, L.R. (1998). Optimization in Operations Research, Prentice-Hall, NJ, USA.
Rustagi, S.J. (1994). Optimization Techniques in Statistics, Academic Press, USA.
Saporta, G. (1990). Probabilités, analyse des données et statistiques, Éditions Technip.
Singhal, A. (2000). Tennessee Eastman Plant Simulation with Base Control System of McAvoy and Ye, Research report, Department of Chemical Engineering, University of California, Santa Barbara, USA.
Zerkaoui, S.; Druaux, F.; Leclercq, E. & Lefebvre, D. (2007). Multivariable adaptive control for non-linear systems: application to the Tennessee Eastman Challenge Process, ECC 2007, Kos, Greece, July 2-5.
Zwingelstein, G. (1995). Diagnostic des défaillances, Hermès, Paris.

12
Pareto Optimum Design of Robust Controllers for Systems with Parametric Uncertainties
Amir Hajiloo1, Nader Nariman-zadeh1,2 and Ali Moeini3
1Dept. of Mechanical Engineering, Faculty of Engineering, University of Guilan
2Intelligent-based Experimental Mechanics Center of Excellence, School of Mechanical Engineering, Faculty of Engineering, University of Tehran
3Dept. of Algorithms & Computations, Faculty of Engineering, University of Tehran
Iran

1. Introduction
The development of high-performance controllers for various complex problems has been a major research activity among control engineering practitioners in recent years. In this context, the synthesis of control policies has been regarded as an optimization problem over certain performance measures of the controlled systems. A very effective means of solving such optimum controller design problems is genetic algorithms (GAs) and other evolutionary algorithms (EAs) (Porter & Jones, 1992; Goldberg, 1989). The robustness and global characteristics of such evolutionary methods have been the main reasons for their extensive application in off-line optimum control system design. Such applications involve the design procedure for obtaining controller parameters and/or controller structures. In addition, the combination of EAs or GAs with fuzzy or neural controllers has been reported in the literature, which, in turn, constitutes an intelligent control scheme (Porter et al., 1994; Porter & Nariman-zadeh, 1995; Porter & Nariman-zadeh, 1997). In addition to the many applications of EAs in the design of controllers for certain systems, there have also been many research efforts in the robust design of controllers for uncertain systems, in which structured or unstructured uncertainties may exist (Wolovich, 1994). Most robust design methods, such as μ-analysis, H2 or H∞ design, are based on different norm-bounded uncertainty descriptions (Crespo, 2003). As each norm has its particular features addressing different types of performance objectives, it may not be possible to achieve all the robustness issues and loop performance goals simultaneously. In fact, mixed-norm control methodologies such as H2/H∞ have been proposed to alleviate some of the issues of meeting different robustness objectives (Baeyens & Khargonekar, 1994). However, these are based on a worst-case scenario, considering the most pessimistic value of the performance for a particular member of the set of uncertain models (Savkin et al., 2000). Consequently, the performance characteristics of such norm-bounded robust designs often degrade for the most likely cases of uncertain models, as the likelihood of the


worst-case design is unknown in practice (Smith et al., 2005). Recently, there have been many efforts to design robust control methods which reduce this conservatism, or which account more for the most likely plants, by propagating probabilistic uncertainty, used as a weighting factor, through the uncertain plant parameters. In fact, probabilistic uncertainty specifies a set of plants as the actual dynamic system, to each of which a probability density function (PDF) is assigned (Crespo & Kenny, 2005). Such additional information regarding the likelihood of each plant allows a reliability-based design in which probability is incorporated in the robust design. In this setting, robustness and performance are stochastic variables (Stengel & Ryan, 1989), and the stochastic behaviour of the system can be simulated by Monte Carlo simulation (Ray & Stengel, 1993). Robustness and performance can then be considered as objective functions with respect to the controller parameters in an optimization problem. GAs have also been deployed recently in an augmented scalar single-objective optimization to minimize the probabilities of unsatisfactory stability and performance estimated by Monte Carlo simulation (Wang & Stengel, 2001), (Wang & Stengel, 2002). Since conflicts exist between robustness and performance metrics, choosing appropriate weighting factors in a cost function consisting of a weighted quadratic sum of these non-commensurable objectives is inherently difficult and can be regarded as a subjective design choice. Moreover, the trade-offs that exist between some objectives cannot be derived in this way, and it is therefore impossible to choose an optimum design reflecting the designer's compromise concerning the absolute values of the objective functions. The problem can instead be formulated as a multi-objective optimization problem (MOP), so that the trade-offs between objectives are derived as a consequence. In this chapter, a new simple algorithm in conjunction with the original Pareto ranking of non-dominated optimal solutions is first presented for MOPs in control systems design. In this Multi-objective Uniform-diversity Genetic Algorithm (MUGA), an ε-elimination diversity approach is used such that all the clones and/or ε-similar individuals, based on the normalized Euclidean norm of two vectors, are recognized and simply eliminated from the current population. Such a multi-objective Pareto genetic algorithm is then used in conjunction with Monte Carlo simulation to obtain Pareto frontiers of various non-commensurable objective functions in the design of robust controllers for uncertain systems subject to probabilistic variations of the model parameters. The methodology presented in this chapter allows the use of different non-commensurable objective functions both in the frequency and time domains. The obtained results demonstrate that a compromise can be readily reached using graphical representations of the achieved trade-offs among the conflicting objectives.

2. Stochastic robust analysis
In real control engineering practice, there exists a variety of typical sources of uncertainty which have to be compensated through a robust control design approach. These uncertainties include plant parameter variations due to environmental conditions, incomplete knowledge of the parameters, ageing, un-modelled high-frequency dynamics, etc. Two categorical types of uncertainty, namely structured uncertainty and unstructured uncertainty, are generally used in classification. Structured uncertainty concerns model uncertainty due to unknown values of parameters in a known structure. In conventional optimum control system design, uncertainties are not addressed and the optimization process is accomplished deterministically. In fact, it has been shown that optimization


without considering uncertainty generally leads to non-optimal and potentially high risk solution (Lim et al., 2005). Therefore, it is very desirable to find robust design whose performance variation in the presence of uncertainties is not high. Generally, there exist two approaches addressing the stochastic robustness issue, namely, robust design optimization (RDO) and reliability-based design optimization (RBDO) (Papadrakakis et al., 2004). Both approaches represent non deterministic optimization formulations in which the probabilistic uncertainty is incorporated into the stochastic optimal design process. Therefore, the propagation of a priori knowledge regarding the uncertain parameters through the system provides some probabilistic metrics such as random variables (e.g., settling time, maximum overshoot, closed loop poles, …), and random processes (e.g., step response, Bode or Nyquist diagram, …) in a control system design (Smith et al., 2005). In RDO approach, the stochastic performance is required to be less sensitive to the random variation induced by uncertain parameters so that the performance degradation from ideal deterministic behaviour is minimized. In RBDO approach, some evaluated reliability metrics subjected to probabilistic constraints are satisfied so that the violation of design requirements is minimized. In this case, limit state functions are required to define the failure of the control system. Figure (1) depicts the concept of these two design approaches where f is to be minimized. Regardless the choice of any of these two approaches, random variables and random processes should be evaluated reflecting the effect of probabilistic nature of uncertain parameters in the performance of the control system.

Fig. 1. Concepts of RDO and RBDO optimization With the aid of ever increasing computational power, there have been a great amount of research activities in the field of robust analysis and design devoted to the use of Monte Carlo simulation (Crespo, 2003; Crespo & Kenny, 2005; Stengel, 1986; Stengel & Ryan, 1993; Papadrakakis et al., 2004; Kang, 2005). In fact, Monte Carlo simulation (MCS) has also been used to verify the results of other methods in RDO or RBDO problems when sufficient number of sampling is adopted (Wang & Stengel, 2001). Monte Carlo simulation (MCS) is a direct and simple numerical method but can be computationally expensive. In this method, random samples are generated assuming pre-defined probabilistic distributions for


uncertain parameters. The system is then simulated with each of these randomly generated samples, and the percentage of cases falling in the failure region defined by a limit state function approximately reflects the probability of failure. Let X be a random variable; then the prevailing model for stochastic uncertainty is the probability density function (PDF), $f_X(x)$, or equivalently the cumulative distribution function (CDF), $F_X(x)$, where the subscript X refers to the random variable. This can be given by

$$F_X(x) = \Pr(X \le x) = \int_{-\infty}^{x} f_X(u)\, du \qquad (1)$$

where Pr(.) is the probability that an event (X≤x) will occur. Some statistical moments such as the first and the second moment, generally known as mean value (also referred to as expected value) denoted by E(X) and variance denoted by σ 2 (X ) , respectively, are the most important ones. They can also be computed by

$$E(X) = \int_{-\infty}^{\infty} x\, dF_X(x) = \int_{-\infty}^{\infty} x\, f_X(x)\, dx \qquad (2)$$

and

$$\sigma^2(X) = \int_{-\infty}^{\infty} \big(x - E(X)\big)^2 f_X(x)\, dx \qquad (3)$$

In the case of discrete sampling, these equations can be readily represented as

$$E(X) \cong \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (4)$$

and

$$\sigma^2(X) \cong \frac{1}{N-1} \sum_{i=1}^{N} \big(x_i - E(X)\big)^2 \qquad (5)$$

where xi is the ith sample and N is the total number of samples. In the reliability-based design, it is required to define reliability-based metrics via some inequality constraints (in time or frequency domain). Therefore, in the presence of uncertain parameters of plant (p) whose PDF or CDF can be given by fp(p) or Fp(p), respectively, the reliability requirements can be given as

$$P_{f_i} = \Pr\big(g_i(p) \le 0\big) \le \varepsilon, \qquad i = 1, 2, \dots, k \qquad (6)$$

In equation (6), $P_{f_i}$ denotes the probability of failure (i.e., $g_i(p) \le 0$) of the $i$th reliability measure, $k$ is the number of inequality constraints (i.e., limit state functions) and $\varepsilon$ is the highest admissible probability of failure. It is clear that the desirable value of each $P_{f_i}$ is zero. Therefore, taking into consideration the stochastic distribution of


uncertain parameters $p$ as $f_p(p)$, equation (6) can now be evaluated for each probability function as

$$P_{f_i} = \Pr\big(g_i(p) \le 0\big) = \int_{g_i(p) \le 0} f_p(p)\, dp \qquad (7)$$

This integral is, in fact, very complicated particularly for systems with complex g(p) (Wang & Stengel, 2002) and Monte Carlo simulation is alternatively used to approximate equation (7). In this case, a binary indicator function Ig(p) is defined such that it has the value of 1 in the case of failure (g(p)≤0) and the value of zero otherwise,

$$I_g(p) = \begin{cases} 0 & g(p) > 0 \\ 1 & g(p) \le 0 \end{cases} \qquad (8)$$

Consequently, for each limit state function $g(p)$, the integral of equation (7) can be rewritten as

$$P_f(p) = \int_{-\infty}^{\infty} I_{g(p)}\big(G(p), C(k)\big)\, f_p(p)\, dp \qquad (9)$$

where G(p) is the uncertain plant model and C(k) is the controller to be designed in the case of control system design problems. Based on Monte Carlo simulation (Ray & Stengel, 1993; Wang & Stengel, 2001; Wang & Stengel, 2002; Kalos, 1986), the probability using sampling technique can be estimated using

$$P_f(p) \cong \frac{1}{N} \sum_{i=1}^{N} I_{g(p)}\big(G_i(p), C(k)\big) \qquad (10)$$

where Gi is the ith plant that is simulated by Monte Carlo Simulation. In other words, the probability of failure is equal to the number of samples in the failure region divided by the total number of samples. Evidently, such estimation of Pf approaches to the actual value in the limit as N → ∞ (Wang & Stengel, 2002). However, there have been many research activities on sampling techniques to reduce the number of samples keeping a high level of accuracy. Alternatively, the quasi-MCS has now been increasingly accepted as a better sampling technique which is also known as Hammersley Sequence Sampling (HSS) (Smith et al., 2005; Crespo & Kenny, 2005). In this paper, HSS has been used to generate samples for probability estimation of failures. In a RBDO problem, the probability of representing the reliability-based metrics given by equation (10) is minimized using an optimization method. In a multi-objective optimization of a RBDO problem presented in this paper, however, there are different conflicting reliability-based metrics that should be minimized simultaneously. In the multi-objective RBDO of control system problems, such reliability-based metrics (objective functions) can be selected as closed-loop system stability, step response in time domain or Bode magnitude in frequency domain, etc. In the probabilistic approach, it is, therefore, desired to minimize both the probability of instability and probability of failure to a desired time or frequency response, respectively, subjected to assumed probability


distribution of uncertain parameters. In the RDO approach that is used in this work, the lower bound of the degree of stability, that is, the distance from the critical point −1 to the nearest point on the open-loop Nyquist diagram, is maximized. The goal of this approach is to maximize the mean of the random variable (degree of stability) and to minimize its variance. This is in accordance with the fact that in robust design the mean should be maximized and its variability minimized simultaneously (Kang, 2005). Figure (2) depicts the concept of this RDO approach, where $f_X(x)$ is the PDF of a random variable X. It is clear from figure (2) that if the lower bound of X is maximized, a robust optimum design can be obtained. Recently, a weighted-sum multi-objective approach has been applied to aggregate these objectives into a scalar single-objective optimization problem (Wang & Stengel, 2002; Kang, 2005).
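A minimal sketch of the Monte Carlo estimate of Eq. (10) is given below; plain pseudo-random sampling is used for brevity, whereas the chapter relies on Hammersley Sequence Sampling, and the limit state function and the parameter distribution are purely illustrative.

```python
import numpy as np

def probability_of_failure(limit_state, sample_uncertain_params,
                           n_samples=10000, seed=0):
    """Fraction of sampled plants for which the limit state g(p) <= 0,
    i.e. the Monte Carlo approximation of Eq. (10)."""
    rng = np.random.default_rng(seed)
    failures = 0
    for _ in range(n_samples):
        p = sample_uncertain_params(rng)
        if limit_state(p) <= 0:          # indicator I_g(p) from Eq. (8)
            failures += 1
    return failures / n_samples

# illustrative example: a scalar "degree of stability" requirement g(p) = 1 - p
pf = probability_of_failure(lambda p: 1.0 - p,
                            lambda rng: rng.normal(0.8, 0.15))
print("estimated probability of failure:", pf)
```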

Fig. 2. Concept of the RDO approach
However, the trade-offs among the objectives are not revealed unless a Pareto approach to the multi-objective optimization is applied. In the next section, a multi-objective Pareto genetic algorithm with a new diversity preserving mechanism recently reported by some of the authors (Nariman-Zadeh et al., 2005; Atashkari et al., 2005) is briefly discussed for a combined robust and reliability-based design optimization of a control system.

3. Multi-objective Pareto optimization
Multi-objective optimization, which is also called multi-criteria optimization or vector optimization, has been defined as finding a vector of decision variables satisfying constraints and giving optimal values to all objective functions (Atashkari et al., 2005; Coello Coello & Christiansen, 2000; Coello Coello et al., 2002; Pareto, 1896). In general, it can be

mathematically defined as follows: find the vector $X^* = [x_1^*, x_2^*, \dots, x_n^*]^T$ to optimize

$$F(X) = [f_1(X), f_2(X), \dots, f_k(X)]^T \qquad (11)$$


subject to $m$ inequality constraints

$$g_i(X) \le 0, \qquad i = 1, 2, \dots, m \qquad (12)$$

and $p$ equality constraints

$$h_j(X) = 0, \qquad j = 1, 2, \dots, p \qquad (13)$$

where $X \in \mathbb{R}^n$ is the vector of decision (design) variables and $F(X) \in \mathbb{R}^k$ is the vector of objective functions. Without loss of generality, it is assumed that all objective functions are to be minimized. Such multi-objective minimization based on the Pareto approach can be conducted using the following definitions.
Pareto dominance: A vector $U = [u_1, u_2, \dots, u_k] \in \mathbb{R}^k$ dominates a vector $V = [v_1, v_2, \dots, v_k] \in \mathbb{R}^k$ (denoted by $U \prec V$) if and only if $\forall i \in \{1,2,\dots,k\},\ u_i \le v_i \ \wedge\ \exists j \in \{1,2,\dots,k\}:\ u_j < v_j$. In other words, at least one $u_j$ is strictly smaller than $v_j$, while the remaining components of $U$ are smaller than or equal to the corresponding components of $V$.
Pareto optimality: A point $X^* \in \Omega$ ($\Omega$ is the feasible region in $\mathbb{R}^n$) is said to be Pareto optimal (minimal) with respect to all $X \in \Omega$ if and only if $F(X^*) \prec F(X)$. Equivalently, $\forall i \in \{1,2,\dots,k\},\ \forall X \in \Omega - \{X^*\},\ f_i(X^*) \le f_i(X) \ \wedge\ \exists j \in \{1,2,\dots,k\}:\ f_j(X^*) < f_j(X)$. That is, $X^*$ is Pareto optimal (minimal) if no other solution can be found that dominates it in the sense of Pareto dominance.
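A small sketch of the dominance test defined above, together with a brute-force extraction of the non-dominated set (for minimization); the function names are illustrative.

```python
import numpy as np

def dominates(u, v):
    """Pareto dominance for minimization: u dominates v iff u_i <= v_i for all i
    and u_j < v_j for at least one j."""
    u, v = np.asarray(u), np.asarray(v)
    return bool(np.all(u <= v) and np.any(u < v))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

objs = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
print(pareto_front(objs))   # (3.0, 4.0) is dominated by (2.0, 3.0)
```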

Pareto set: For a given MOP, the Pareto set $P^*$ is the set in the decision variable space consisting of all the Pareto optimal vectors, $P^* = \{X \in \Omega \mid \nexists\, X' \in \Omega : F(X') \prec F(X)\}$. In other words, there is no other $X'$ in $\Omega$ that dominates any $X \in P^*$.
Pareto front: For a given MOP, the Pareto front $PF^*$ is the set of vectors of objective functions obtained from the vectors of decision variables in the Pareto set $P^*$, that is, $PF^* = \{F(X) = (f_1(X), f_2(X), \dots, f_k(X)) : X \in P^*\}$. Therefore, the Pareto front $PF^*$ is the image of the Pareto set $P^*$ in the objective space.
Evolutionary algorithms have been widely used for multi-objective optimization because of their natural suitability for these types of problems, mostly owing to their parallel, population-based search. Therefore, most difficulties and deficiencies of the classical methods in solving multi-objective optimization problems are eliminated; for example, there is no need for several runs to find the Pareto front or for quantifying the importance of each objective with numerical weights. It is very important in evolutionary algorithms that the genetic diversity within the population be preserved sufficiently (Osyezka, 1985). This main issue in MOPs has been addressed by


much related research work (Nariman-zadeh et al., 2005; Atashkari et al., 2005; Coello Coello & Christiansen, 2000; Coello Coello et al., 2002; Pareto, 1896; Osyezka, 1985; Toffolo & Benini, 2002; Deb et al., 2002; Coello Coello & Becerra, 2003; Nariman-zadeh et al., 2005). Consequently, the premature convergence of MOEAs is prevented and the solutions are directed and distributed along the true Pareto front if such genetic diversity is well provided. The Pareto-based approach of NSGA-II (Osyezka, 1985) has been recently used in a wide range of engineering MOPs because of its simple yet efficient non-dominance ranking procedure in yielding different levels of Pareto frontiers. However, the crowding approach in such a state-of-the-art MOEA (Coello Coello & Becerra, 2003) works efficiently for two-objective optimization problems as a diversity-preserving operator which is not the case for problems with more than two objective functions. The reason is that the sorting procedure of individuals based on each objective in this algorithm will cause different enclosing hyper-boxes. It must be noted that, in a two-objective Pareto optimization, if the solutions of a Pareto front are sorted in a decreasing order of importance to one objective, these solutions are then automatically ordered in an increasing order of importance to the second objective. Thus, the hyper-boxes surrounding an individual solution remain unchanged in the objective-wise sorting procedure of the crowding distance of NSGA-II in the two-objective Pareto optimization problem. However, in multi-objective Pareto optimization problem with more than two objectives, such sorting procedure of individuals based on each objective in this algorithm will cause different enclosing hyper boxes. Thus, the overall crowding distance of an individual computed in this way may not exactly reflect the true measure of diversity or crowding property for the multi-objective Pareto optimization problems with more than two objectives. In our work, a new method is presented to modify NSGA-II so that it can be safely used for any number of objective functions (particularly for more than two objectives) in MOPs. Such a modified MOEA is then used for multi-objective robust desing of linear controllers for systems with parametric uncertainties.

4. Multi-objective Uniform-diversity Genetic Algorithm (MUGA)
The multi-objective uniform-diversity genetic algorithm (MUGA) uses a non-dominated sorting mechanism together with an ε-elimination diversity preserving algorithm to obtain Pareto optimal solutions of MOPs more precisely and uniformly (Jamali et al., 2008).
4.1 The non-dominated sorting method
The basic idea of sorting non-dominated solutions, originally proposed by Goldberg (Goldberg, 1989) and used in different evolutionary multi-objective optimization algorithms such as NSGA-II by Deb (Deb et al., 2002), has been adopted here. The algorithm simply compares each individual in the population with the others to determine its non-dominancy. Once the first front has been found, all its non-dominated individuals are removed from the main population and the procedure is repeated for the subsequent fronts until the entire population is sorted and divided into different non-dominated fronts. A sorting procedure to constitute a front can be accomplished simply by comparing all the individuals of the population and including the non-dominated individuals in the front. Such a procedure can be represented by the following steps:


1. Get the population (pop).
2. Include the first individual {ind(1)} in the front P* as P*(1); let P*_size = 1.
3. Compare the other individuals {ind(j), j = 2, ..., Pop_size} of the pop with {P*(K), K = 1, ..., P*_size} of P*; if ind(j)
0 and kdi > 0 for i = 1, 2. Knowing the control inputs u1, u2 for the linearized system


one can calculate the control inputs v and ω applied to the robotic vehicle, using Eq. (24). The above result is valid provided that the dynamic feedback compensator does not meet the singularity v = ξ = 0. The following theorem assures the avoidance of singularities in the proposed control law [Oriolo, G. et al., (2002)]:
Theorem: Let λ11, λ12 and λ21, λ22 be, respectively, the eigenvalues of the two equations of the error dynamics given in Eq. (34). Assume that for i = 1, 2 it holds that λi1 < λi2 < 0 (negative real eigenvalues), and that λi2 is sufficiently small. If

$$\min_{t \ge 0} \left\| \begin{bmatrix} \dot{x}_d(t) \\ \dot{y}_d(t) \end{bmatrix} \right\| > \left\| \begin{bmatrix} \dot{\varepsilon}_x^{\,0} \\ \dot{\varepsilon}_y^{\,0} \end{bmatrix} \right\| \qquad (35)$$

with $\dot{\varepsilon}_x^{\,0} = \dot{\varepsilon}_x(0) \ne 0$ and $\dot{\varepsilon}_y^{\,0} = \dot{\varepsilon}_y(0) \ne 0$, then the singularity $\xi = 0$ is never met.
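A minimal sketch of the dynamic feedback linearization law discussed here (a PD outer loop on the flat outputs plus the compensator state ξ, cf. Eq. (54) later in the chapter). The discrete Euler update of ξ and the function signature are assumptions of this sketch, not the chapter's implementation; as stated by the theorem above, the law is only defined while ξ ≠ 0.

```python
import numpy as np

def flatness_based_steering(state, xi, ref, gains, dt):
    """state = (x, y, theta); ref = (xd, yd, dxd, dyd, ddxd, ddyd);
    gains = (kp, kd). Returns the unicycle inputs (v, omega) and the
    updated compensator state xi."""
    x, y, theta = state
    xd, yd, dxd, dyd, ddxd, ddyd = ref
    kp, kd = gains
    # current flat-output velocities produced by the compensator
    dx, dy = xi * np.cos(theta), xi * np.sin(theta)
    # outer PD loop on the flat outputs (x, y)
    u1 = ddxd + kp * (xd - x) + kd * (dxd - dx)
    u2 = ddyd + kp * (yd - y) + kd * (dyd - dy)
    # dynamic compensator: xi_dot = u1*cos(theta) + u2*sin(theta)
    xi = xi + dt * (u1 * np.cos(theta) + u2 * np.sin(theta))
    v = xi                                                   # linear velocity
    omega = (u2 * np.cos(theta) - u1 * np.sin(theta)) / xi   # requires xi != 0
    return v, omega, xi
```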

3. Fusion of distributed measurements using the Extended Kalman Filter The control law for the unicycle robot described in subsection 2.2. was based on the assumption that the vehicle's state vector [ x ( k ), y ( k ), θ ( k )] was measurable at every time instant. Here, the case in which the vehicle's state vector is reconstructed through the fusion of measurements received from distributed sensors (such as odometer or sonar sensors) will be examined. The first approach to be analyzed is that of fusion of measurements coming from distributed sensors, with the use of nonlinear filtering methods such as Extended Kalman Filtering (EKF). The fused data are used to reconstruct the state vector of a mobile robot, and the estimated state vector is in turn used in a control-loop. Extended Kalman Filtering for the nonlinear state-measurement model is revised. The following nonlinear time-invariant state model is now considered [Rigatos, G.G. & Tzafestas, S.G. (2007)]:

x(k + 1) = φ ( x(k )) + L(k )u (k ) + w(k ) z (k ) = γ ( x(k )) + v(k ) where

(36)

w(k ) and v(k ) are uncorrelated, Gaussian zero-mean noise processes with

covariance matrices Q ( k ) and R( k ) respectively. The operators φ (x) and γ (x ) are given by,

φ ( x) = [φ1 ( x), φ 2 ( x),..., φ m ( x), ]T , and γ ( x) = [γ 1 ( x), γ 2 ( x),..., γ p ( x)]T , respectively. It is assumed that φ and γ are sufficiently smooth in x so that each one has a valid series Taylor expansion. Following a linearization procedure, φ is expanded into Taylor series about xˆ : ^

^

^

φ ( x(k )) = φ ( x(k )) + J φ ( x(k ))[ x(k ) − x(k )] + ...

(37)



where J φ (x) is the Jacobian of φ calculated at xˆ( k ) : ⎛ ∂φ1 ⎜ ⎜ ∂x1 ⎜ ∂φ 2 ∂φ J φ ( x) = | ^ = ⎜ ∂x ∂x x = x ( k ) ⎜ 1 ⎜ ... ⎜ ∂φ m ⎜ ∂x ⎝ 1

∂φ1 ∂x 2 ∂φ 2 ∂x 2 ... ∂φ m ∂x 2

∂φ1 ∂x N ∂φ 2 ... ∂x N ... ... ∂φ m ... ∂x N ...

⎞ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎟ ⎠

(38)

^−

Likewise, γ is expanded about x (k ) ^

^

^

γ ( x(k )) = γ ( x(k )) + J λ ( x(k ))[ x(k ) − x(k )] + ...

(39)

^−

where x (k ) is the prior to time instant k estimation of the state vector x( k ) , and xˆ( k ) is the estimation of x(k ) at time instant k . The Jacobian J γ ( x) is

∂γ | J γ ( x) = ϑ x x= x

^−

⎛ ∂γ 1 ⎜ ∂x ⎜ 1 ⎜ ∂γ 2 ⎜ = ⎜ ∂x1 (k ) ⎜ ... ⎜ ⎜ ∂γ p ⎜ ∂x ⎝ 1

∂γ 1 ∂x2 ∂γ 2 ∂x2 ... ∂γ p ∂x2

∂γ 1 ⎞ ∂xN ⎟⎟ ∂γ 2 ⎟ ... ⎟ ∂xN ⎟ ... ... ⎟ ⎟ ∂γ p ⎟ ... ∂xN ⎟⎠ ...

(40)

The resulting expressions create first order approximations of φ and γ . Thus the linearized version of the plant is obtained: ^

^

^

x(k + 1) = ϕ ( x(k )) + J ϕ ( x(k ))[ x(k ) − x(k )] + w(k ) ^−

^−

^_

z (k ) = γ ( x (k )) + J γ ( x (k ))[ x(k ) − x (k )] + v(k ) Now, the EKF recursion is as follows: First the time update is considered: by xˆ( k )

the

^−

estimation of the state vector at instant k is denoted. Given initial conditions x (0) and P-(0)the recursion proceeds as: • Measurement update. Acquire z (k ) and compute: ^_

^_

^_

K (k ) = P − (k ) J γT ( x (k ))[ J γ ( x (k )) P − (k ) J γT ( x (k )) + R (k )]−1 ^

^_

^_

x (k ) = x (k ) + K (k )[ z (k ) − γ ( x (k ))] ^_

P (k ) = P − (k ) − K (k ) J γ ( x (k )) P − (k )

(41)



Time update. Compute: ^

^

P − (k + 1) = J φ ( x(k )) P (k ) J φT ( x(k )) + Q(k ) ^−

^

x (k + 1) = φ ( x (k )) + L(k )u (k ) The schematic diagram of the EKF loop is given in Fig. 1(a).

Fig. 1. (a) Schematic diagram of the Extended Kalman Filter Loop

Fig. 1. (b) Schematic diagram of the Particle Filter loop

(42)

406

Robotics, Automation and Control

4. Particle Filtering for the nonlinear state-measurement model 4.1 Particle Filtering with sequential importance resampling Next, the problem of fusing measurements coming from distributed sensors will be solved using the Particle Filtering method. The fused data are used again to estimate the state vector of a mobile robot and the reconstructed state vector will be used in closed-loop control. In the general case the equations of the optimal filter used for the calculation of the statevector of a nonlinear dynamical system do not have an explicit solution. This happens for instance when the process noise and the noise of the output measurement do not follow a Gaussian distribution. In that case, approximation through Monte-Carlo methods can used. As in the case of the Kalman Filter or the Extended Kalman Filter the particles filter consists of the measurement update (correction stage) and the time update (prediction stage) [Thrun, S. et al., (2005)]. a. The prediction stage: − − The prediction stage calculates p ( x( k ) | Z ) where Z = {z (1),.., z ( n − 1)} , using:

N

p( x(k − 1) | Z − ) = ∑ wki −1δ ξ i ( x(k − 1)) i =1

k −1

(43)

− − while from Bayes formula it holds p ( x( k − 1) | Z ) = ∫ p ( x( k ) | x( k − 1)) p ( x( k − 1) | Z )dx .

This finally gives N

p( x(k ) | Z − ) = ∑ wki −1δ ξ i ( x(k )) i =1

k−

(44)

with ξ k − ~ p ( x(k ) | x(k − 1)) = ξ i

i k −1

The meaning of Eq. (44) is as follows: the state equation of the nonlinear system of Eq. (36) is executed N times, starting from the N previous values of the state vectors x(k − 1) = ξ ki −1 with the use of Eq. (36). This means that the value of the state vector which is calculated in the prediction stage is the result of the weighted averaging of the state vectors which were calculated after running the state equation, starting from the N previous values of the state i vectors ξ k −1 .

b. The correction stage The a-posteriori probability density was performed using Eq. (44). Now a new position measurement z(k) is obtained and the objective is to calculate the corrected probability Z = {z (1), z (2),.., z (k )}. From Bayes law it holds that density p ( x(k ) | Z ) , where

p( x(k ) | Z ) =

p( Z | x(k )) p( x(k )) , which finally results into p(Z ) N

p( x(k ) | Z − ) = ∑ wki −1δ ξ ( x(k )) i =1

i k−

Autonomous Robot Navigation using Flatness-based Control and Multi-Sensor Fusion

i where wk

=

w i − p ( z (k ) | x (k ) = ξ i − ) k

k

j ∑ wk − j =1

j =ξ − k

N

p( z (k ) | x(k )

407

(45)

)

Eq. (45) denotes the corrected value for the state vector. The recursion of the Particle Filter proceeds in a way similar to the update of the Kalman Filter or the Extended Kalman Filter, i.e.: Measurement update: Acquire z (k ) and compute the new value of the state vector N

p ( x (k ) | Z ) = ∑ wki δ ki − ( x(k )) i =1

with corrected weights wki =

wki − p( z (k ) | x(k ) = ξ ki − ) N

∑w j =1

i

k−

and ξ ki = ξ ki −

(46)

p ( z (k ) | x(k ) = ξ ki − )i

Resample for substitution of the degenerated particles. Time update: compute state vector x( k + 1) according to N

p ( x(k + 1) | Z ) = ∑ wki δ ξ ( x(k )) i =1

i k

(47)

where ξ ~ p( x(k + 1) | x(k ) = ξ ) i k

i k

The stages of state vector estimation with the use of the particle filtering algorithm are depicted in Fig. 1(b). 4.2 Resampling issues in particle filtering a. Degeneration of particles The algorithm of particle filtering which is described through Eq. (44) and Eq. (45) has a significant drawback: after a certain number of iterations k , almost all the weights

wki become 0 . In the ideal case all the weights should converge to the value 1 / N , i.e. the particles should have the same significance. The criterion used to define a sufficient number N

of particles is N keff = 1 / ∑ wki 2 ∈ [1, N ] . When N keff is close to value N then all particles i =1

have almost the same significance. However using the algorithm of Eq. (44) and Eq. (45) results in N keff → 1 , which means that the particles are degenerated, i.e. they lose their effectiveness. Therefore, it is necessary to modify the algorithm so as to assure that degeneration of the particles will not take place [Crisan, D. & Doucet, A. (2002)]. eff When N k is small then most of the particles have weights close to 0 and consequently they

have a negligible contribution to the estimation of the state vector. The concept proposed to overcome this drawback of the algorithm is to weaken this particle in favor of particles that have a non-negligible contribution. Therefore, the particles of low weight factors are



removed and their place is occupied by duplicates of the particles with high weight factors. The total number of particles remains unchanged (equal to N ) and therefore this procedure can be viewed as a "resampling" or "redistribution" of the particles set. The particles resampling mentioned above maybe slow if not appropriately tuned. There are improved versions of it which substitute the particles of low importance with those of higher importance [Arulampalam, S. et al., (2002)], [Kitagawa, (1996)]. A first choice would 1 N be to perform a multinomial resampling. N particles are chosen between {ξ k ,..., ξ k } and

the corresponding weights are {wk1 ,..., wkN } . The number of times each particle is selected is given by [ j1 ,.., jN ] . Thus a set of N particles is again created, the elements of which are chosen after sampling with the discrete distribution

N

∑ w δξ i =1

i k

i k

( x) . The particles

{ξ k1 ,..., ξ kN } are chosen according to the probabilities {wk1 ,.., wkN } . The selected particles are assigned with equal weights 1/N. b. Other approaches to the implementation of resampling Although sorting of the particles' weights is not necessary for the convergence of the particle i i filter algorithm, there are variants of the resampling procedure of (ξ k , wk , i = 1,.., N ) which

are based on previous sorting in decreasing order of the particles' weights. It is noted that efficient sorting approaches make the complexity of the particle filtering to be O(Nlog(N)), while the avoidance of resampling could result in a faster algorithm of complexity O( N ) . s [1] s [2] s[ N ] Sorting of particles' weights gives w > w > ... > w . A random numbers generator is

evoked and the resulting numbers u i:N ~ U [0,1] fall in the partitions of the interval [0,1] . i The width of these partitions is w and thus a redistribution of the particles is generated. For

instance, in a wide partition of width w j will be assigned more particles than to a narrow m

partition of width w . Two other methods that have been proposed for the implementation of resampling in Particle Filtering are explained in the sequel. These are Kitagawa's approach and the residuals resampling approach [Kitagawa, G. (1996)]. In Kitagawa's resampling the speed of the resampling procedure is increased by using less the random numbers generator. The s[ j ] so as to cover the region that corresponds weights are sorted again in decreasing order w to the interval [0,1]. Then the random numbers generator is used to produce the variable ui according to according to u1 ~ U [0,1 / N ] , and u = u + i

1

i , i = 2,.., N . The rest of the N

i variables u are produced in a deterministic way [Campillo, F. (2006)]. In the residuals resampling approach, the redistribution of the residuals is performed as follows: at a first stage particle ξ i is chosen in a deterministic way [ wi / N ] times (with ~ i

i i rounding). The residual weights are w = w − N [ w / N ] and are normalized. Thus, a probability distribution is generated. The rest of the particles are selected according to ~

multinomial resampling . The method can be applied if the number N which remains at the N

second stage is small, i.e. when N eff = 1 / ∑ wi2 is small. i =1
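A brief sketch of the effective sample size criterion and of the multinomial resampling step described above; the resampling threshold N/2 and the toy weight vector are illustrative choices.

```python
import numpy as np

def effective_sample_size(weights):
    """N_eff = 1 / sum(w_i^2); values close to N indicate well-spread weights."""
    w = np.asarray(weights)
    return 1.0 / np.sum(w ** 2)

def multinomial_resample(particles, weights, rng):
    """Draw N indices with probabilities w_i and return the selected particles
    with equal weights 1/N (multinomial resampling)."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx], np.full(n, 1.0 / n)

rng = np.random.default_rng(0)
particles = rng.normal(size=100)
weights = rng.random(100)
weights /= weights.sum()
if effective_sample_size(weights) < 50:          # illustrative threshold N/2
    particles, weights = multinomial_resample(particles, weights, rng)
```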



5. Simulation tests
5.1 Flatness-based control using a state vector estimated by EKF
The application of the EKF to the fusion of data that come from different sensors is examined first [Rigatos, G.G. & Tzafestas, S.G. (2007)]. A unicycle robot is considered. Its continuous-time kinematic equation is:

x(t ) = v(t ) cos(θ (t )), y (t ) = v(t ) sin(θ (t )), θ (t) = ω (t)

(48)

Encoders are placed on the driving wheels and provide a measure of the incremental angles over a sampling period T . These odometric sensors are used to obtain an estimation of the displacement and the angular velocity of the vehicle v (t ) and ω (t ) , respectively. These encoders introduce incremental errors, which result in an erroneous estimation of the orientation θ . To improve the accuracy of the vehicle's localization, measurements from sonars can be used. The distance measure of sonar i from a neighboring surface Pj is thus taken into account (see Fig. 2(a) and Fig. 2(b)). Sonar measurements may be affected by white Gaussian noise and also by crosstalk interferences and multiples echoes. P2


X

Fig. 2. (a). Mobile robot with odometric and

O

xi

X

Fig. 2. (b). Orientation of the sonar i

sonar sensors The inertial coordinates system OXY is defined. Furthermore the coordinates system O ' X 'Y ' is considered (Fig. 2(a)). O ' X 'Y ' results from OXY if it is rotated by an angle θ . The coordinates of the center of the wheels axis with respect to OXY are ( x, y ) , while the coordinates of the sonar i that is mounted on the vehicle, with respect to O ' X 'Y ' are xi' , yi' . The orientation of the sonar with respect to O ' X 'Y ' is θ i' . Thus the coordinates of the sonar with respect to OXY are ( xi , yi ) and its orientation is θ i , and are given by:

xi (k ) = x(k ) + xi' sin(θ (k )) + yi' sin(θ (k )) xi (k ) = x(k ) + xi' sin(θ (k )) + yi' sin(θ (k )) θi (k ) = θ (k ) + θi

(49)



Each plane P j in the robot's environment can be represented by Pr j and

Pn j (Fig. 2(b)),

where (i) Pr j is the normal distance of the plane from the origin O, (ii) Pn j is the angle between the normal line to the plane and the x-direction. The sonar i is at position ( xi (k ), yi ( k )) with respect to the inertial coordinates system

OXY and its orientation is θ i (k ) . Using the above notation, the distance of the sonar i , from the plane P j is represented by Pr j , Pn j (see Fig. 2(b)):

dik (k ) = Pr j − xi (k ) cos( Pn j ) − yi (k ) sin( Pn j )

(50)

Pn j ∈ [θi (n) − δ / 2, θ i (n) + δ / 2] , and δ is the width of the sonar beam. Assuming a constant sampling period Δtk = T the measurement equation is z (k + 1) = γ ( x( k )) + v( k ) , where z (k ) is the vector containing sonar and odometer measures and v(k ) is a white noise sequence ~ N (0, R (kT )) . The dimension p k of z (k ) depends on the number of sonar sensors. The measure vector z (k ) can be decomposed in two sub-vectors where

z1 (k + 1) = [ x(k ) + v1 (k ), y (k ) + v 2 (k ), θ (k ) + v3 (k )]

(51)

z 2 (k + 1) = [d1j (k ) + v 4 (k ),..., d nj (k ) + v 3+ n2 (k )] s

with i = 1, 2,..., ns , where ns is the number of sonars, d i j (k ) is the distance measure with respect to the plane P j provided by the i -th sonar and j = 1,.., n p where n p is the number of surfaces. By definition of the measurement vector one has that the output function

γ ( x(k )) is given by

γ ( x(k )) = [ x(k ), y (k ),θ (k ), d11 (k ), d 22 (k ),..., d n ]T . The robot state is np s

[ x(k ), y (k ), θ (k )]T and the control input is denoted by U (k ) = [u (k ), ω (k )]T . In the simulation tests, the number of sonar is taken to be ns = 1 , and the number of planes n p = 1 , thus the measurement vector becomes s. To obtain the Extended Kalman Filter ^

^−

(EKF), the kinematic model of the vehicle is linearized about the estimates x(k ) and x (k ) , the control input U (k − 1) is applied. The measurement update of the EKF is ^_

^_

^_

K (k ) = P − (k ) J γT ( x ( k ))[ J γ ( x (k )) P − (k ) J γT ( x (k )) + R(k )]−1 ^_

^

^_

x(k ) = x (k ) + K ( k )[ z (k ) − γ ( x (k ))] ^_

P (k ) = P − ( k ) − K (k ) J γ ( x ( k )) P − (k ) The time update of the EKF is ^

^

P − (k + 1) = J φ ( x(k )) P(k ) J φT ( x(k )) + Q(k ) ^−

^

x (k + 1) = φ ( x(k )) + L(k )u (k )



⎛ T cos(θ (k )) 0 ⎞ ⎛ 1 0 −v(k ) sin(θ )T ⎞ ^ ⎜ ⎟ ⎜ ⎟ where L(n) = ⎜ T sin(θ (k )) 0 ⎟ and J ϕ ( x (k )) = ⎜ 0 1 −v( k ) cos(θ )T ⎟ ⎜ ⎜0 0 ⎟ 0 1 T ⎟⎠ ⎝ ⎝ ⎠ 2 2 2 2 −3 while Q (k ) = diag[σ ( k ), σ ( k ), σ ( k )] , with σ (k ) chosen to be 10 , and ^

^

^

^

^

^

^

^

φ ( x(k )) = [ x(k ), y (k ), θ (k )]T , γ ( x(k )) = [ x(k ), y (k ), θ (k ), d (k )] T , i.e. ^ ⎛ ⎞ x(k ) ⎜ ⎟ ^ ⎜ ⎟ ^ y (k ) ⎟ γ ( x(k )) = ⎜ ^ ⎜ ⎟ θ (k ) ⎜ ⎟ ⎜ P j − x (k ) cos( P j ) − y (k ) sin( P j ) ⎟ i n i n ⎝ r ⎠

(52)

1 Assuming one sonar ns = 1 , and one plane P , n p = 1 in the mobile robot's neighborhood

one gets

1 0 0 ⎛ ⎞ ⎜ ⎟ 0 1 0 ⎟ J γT ( x (k )) = ⎜ ⎜ ⎟ 0 0 1 ⎜ ⎟ ' ' j j j j ⎝ − cos( Pn ) − sin( Pn ) {xi cos(θ − Pn ) − yi sin(θ − Pn )} ⎠ ^−

(53)

The vehicle is steered by a dynamic feedback linearization control algorithm which is based on flatness-based control [Oriolo, G. et al., (2002)]: ••

u1 = x d + K p ( xd − x) + K d ( x d − x) 1

1

••

u2 = y d + K p ( yd − y ) + K d ( y d − y ) 2

2

(54)

ξ = u1 cos(θ ) + u2 sin(θ ) u cos(θ ) − u1 sin(θ ) v = ξ, ω = 2 ξ

The following initialization is assumed (see Fig. 3(a)): (i) vehicle’s initial position in OXY :

x(0) = 0 m , y (0) = 0 m , θ (0) = 45.0o , (ii) position of the sonar in O ' X 'Y ' : x1' = 0.5 m , y1' = 0.5 m , θ1' = 0o , (iii) position of the plane P1 : Pr1 = 15.5 m, Pn1 = 45 o , (iv) state noise w(k ) = 0 ,

K (k ) ∈ R

^

P (0) = diag[0.1, 0.1, 0.1] and

R = diag[10−3 ,10−3 ,10−3 ] ,

(v)

Kalman

Gain

3×4

The use of EKF for fusing the data that come from odometric and sonar sensors provides an estimation of the state vector [ x(t ), y (t ), θ (t )] and enables the successful application of



nonlinear steering control of Eq. (54). For the case of motion along a straight line on the 2Dplane, the obtained results are depicted in Fig. 3(a). Moreover, results on the tracking of a circular reference path are given in Fig. 4(a), while the case of tracking of an eight-shaped reference path is depicted in Fig. 5(a). Tracking experiments for EKF-based state estimation were completed in the case of a curved path as the one shown in Fig. 6(a). 5.2 Flatness-based control using a state vector estimated by Particle Filtering The particle filter can also provide solution to the sensor fusion problem. The mobile robot model described in Eq. (48), and the control law given in Eq. (54) are used again. The number of particles was set to N = 1000 . The

measurement

update

of

the

PF

is

N

p( x(k ) | Z ) = ∑ wki δ ξ ( x(k ))

wki =

wki p( z (k ) | x(k ) = ξ ki ) −

N

∑w j =1

j k

p( z (k ) | x(k ) = ξ kj )

where

the

measurement

with

i k−

i =1

equation

is

given

by

^

z (k ) = z (k ) + v(k ) with z (k ) = [ x(k ), y (k ), θ (k ), d (k )]T , and v(k ) = measurement noise. The

time

update

of

the

PF

is

N

p( x(k + 1) | Z ) = ∑ wki δ ξ ( x(k )) i =1

i k

where

^−

ξ ki ~ p ( x(k + 1) | x(k ) = ξ ki ) and the state equation is x = ϕ ( x(k )) + L(k )U (k ) , where φ ( x(k )) , L(k ) and U (k ) are defined in subsection 5.1. At each run of the time update of ^−

the PF, the state vector estimation x (k + 1) is calculated N times, starting each time from a different value of the state vector ξ ki . The measurement noise distribution was assumed to be Gaussian. As the number of particles increases, the performance of the particle filterbased tracking algorithm also improves, but this happens at higher demand for computational resources. Control of the diversity of the particles through the tuning of the resampling procedure may also affect the performance of the algorithm. The obtained results are given in Fig. 3(b) for the case of motion along a straight line on the 2D plane. Additionally, results on the tracking of a circular reference path are given in Fig. 4(b), while the case of tracking of an eight-shaped reference path is depicted in Fig. 5(b). Tracking experiments for PF-based state estimation were completed in the case of a curved path as the one shown in Fig. 6(b). From the depicted simulation experiments it can be deduced that the particle filter for a sufficiently large number of particles can have good performance, in the problem of estimation of the state vector of the mobile robot, without being subject to the constraint of Gaussian distribution for the obtained measurements. The number of particles influences the performance of the particle filter algorithm. The accuracy of the estimation succeeded by the PF algorithm improves as the number of particles increases. The initialization of the particles, (state vector estimates) may also affect the convergence of the PF towards the real value of the state vector of the monitored system. It should be also noted that the calculation time is a critical parameter for the suitability of the PF algorithm for real-time applications.



When it is necessary to use more particles, improved hardware and the parallel processing now available to embedded systems enable the PF to be implemented in real-time systems [Yang, N. et al., (2005)].


Fig. 3. (a) Desirable trajectory (continuous line) and trajectory using EKF fusion based on odometric and sonar measurements, when tracking a straight line.

Fig. 3. (b) Desirable trajectory (continuous line) and trajectory using PF fusion based on odometric and sonar measurements, when tracking a straight line.

Fig. 4. (a) The trajectory of the mobile robot (dashed line) tracks the reference circular path (continuous line) when the robot’s state vector is estimated with the use of Extended Kalman Filtering.

Fig. 4. (b) The trajectory of the mobile robot (dashed line) tracks the reference circular path (continuous line) when the robot’s state vector is estimated with the use of particle filtering.


Fig. 5. (a) The trajectory of the mobile robot (dashed line) tracks the reference eight-shaped path (continuous line) when the robot’s state vector is estimated with the use of Extended Kalman Filtering.

Fig. 5. (b) The trajectory of the mobile robot (dashed line) tracks the reference eight-shaped path (continuous line) when the robot’s state vector is estimated with the use of Particle Filtering.

Fig. 6. (a) The trajectory of the mobile robot (dashed line) tracks the reference curve-shaped path (continuous line) when the robot’s state vector is estimated with the use of Extended Kalman Filtering.

Fig. 6. (b) The trajectory of the mobile robot (dashed line) tracks the reference curve-shaped path (continuous line) when the robot’s state vector is estimated with the use of Particle Filtering.

6. Conclusions

The paper has studied flatness-based control and sensor fusion for motion control of autonomous mobile robots. Flatness-based control stems from the concept of differential flatness, i.e. the ability to express the system parameters (such as the elements of the state vector) and the control input as functions of a flat output y and of its higher order derivatives. Flatness-based control affects the dynamics of the system in a way similar to control through feedback linearization. This means that writing the system variables and the control input as functions of the flat output enables transformation of the system dynamics into a linear ODE and subsequently permits trajectory tracking using linear control methods. For linear systems differential flatness coincides with the property of controllability.


Flatness-based control is applicable to finite-dimensional systems (linear or nonlinear) as well as to infinite-dimensional systems, such as the ones usually described by PDEs. The problem of motion control of mobile robots becomes more complicated when the robot's state vector is not directly measurable but has to be reconstructed with the use of measurements coming from distributed sensors. Consequently, the control input generated by the flatness-based control algorithm has to use the estimated state vector of the robotic vehicle instead of the real one. Extended Kalman and particle filtering have been tested in the problem of estimation of the state vector of a mobile robot through the fusion of position measurements coming from odometric and sonar sensors. The paper has summarized the basics of the Extended Kalman Filter, which is the most popular approach to implement sensor fusion in nonlinear systems. The EKF is a linearization technique, based on a first-order Taylor expansion of the nonlinear state functions and the nonlinear measurement functions of the state model. In the EKF, the state distribution is approximated by a Gaussian random variable. Although the EKF is a fast algorithm, the underlying series approximations can lead to poor representations of the nonlinear functions and the associated probability distributions. As a result, the EKF can sometimes diverge. To overcome these weaknesses of the EKF, as well as the constraint of a Gaussian state distribution, particle filtering has been introduced. Whereas the EKF makes a Gaussian assumption to simplify the optimal recursive state estimation, the particle filter makes no assumptions on the forms of the state vector and measurement probability densities. In the particle filter, a set of weighted particles (state vector estimates evolving in parallel) is used to approximate the posterior distribution of the state vector. An iteration of the particle filter includes a particle update and a weights update. To ensure the convergence of the algorithm, at each iteration resampling takes place, through which particles with low weights are substituted by particles with high weights. Simulation tests have been carried out to evaluate the performance of flatness-based control for the autonomous mobile robot when using the EKF and the particle filter for the localization of the robotic vehicle (through the fusion of measurements coming from distributed sensors). It has been shown that, compared to the EKF, the PF algorithm results in better estimates of the mobile robot's state vector as the number of particles increases, but at the expense of higher computational effort. Consequently, the flatness-based controller which used the robot's state vector coming from the particle filter had better tracking performance than the flatness-based controller which used the robot's state vector estimated by the EKF. It has also been observed that the accuracy in the localization of the mobile robot achieved by the particle filter algorithm depends on the number of particles and their initialization.

7. References

Arulampalam, S.; Maskell, S.R.; Gordon, N.J. & Clapp, T. (2002). A tutorial on particle filters for on-line nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, Vol. 50, pp. 174-188.
Campillo, F. (2006). Particulaire & Modèles de Markov Cachés, Master Course Notes, Filtrage et traitement des données, Université de Sud-Toulon Var, France.
Caron, F.; Davy, M.; Duflos, E. & Vanheeghe, P. (2007). Particle Filtering for Multi-Sensor Data Fusion with Switching Observation Models: Applications to Land Vehicle Positioning. IEEE Transactions on Signal Processing, Vol. 55, No. 6, pp. 2703-2719.
Crisan, D. & Doucet, A. (2002). A Survey of Convergence Results on Particle Filtering Methods for Practitioners. IEEE Transactions on Signal Processing, Vol. 50, No. 3, pp. 736-746.


Fliess, M. & Mounier, H. (1999). Tracking control and π-freeness of infinite dimensional linear systems, In: Dynamical Systems, Control, Coding and Computer Vision, G. Picci and D.S. Gilliam (Eds.), Vol. 258, pp. 41-68, Birkhäuser.
Jetto, L.; Longhi, S. & Venturini, G. (1999). Development and Experimental Validation of an Adaptive Extended Kalman Filter for the Localization of Mobile Robots. IEEE Transactions on Robotics and Automation, Vol. 15, No. 2.
Kitagawa, G. (1996). Monte-Carlo filter and smoother for non-Gaussian non-linear state-space models. J. Computat. Graph. Statist., Vol. 5, No. 1, pp. 1-25.
Laroche, B.; Martin, P. & Petit, N. (2007). Commande par platitude: Equations différentielles ordinaires et aux dérivées partielles, Ecole Nationale Supérieure des Techniques Avancées, Paris.
Lévine, J. & Nguyen, D.V. (2003). Flat output characterization for linear systems using polynomial matrices. Systems & Control Letters, Elsevier, Vol. 48, pp. 69-75.
Martin, P. & Rouchon, P. (1999). Systèmes plats: planification et suivi des trajectoires, Journées X-UPS, Ecole des Mines de Paris, Centre Automatique et Systèmes, Mai 1999.
Meurer, T. & Zeitz, M. (2004). A modal approach to flatness-based control of flexible structures. PAMM Proceedings on Applied Mathematics and Mechanics, Vol. 4, pp. 133-134.
Mounier, H. & Rudolph, J. (2001). Trajectory tracking for π-flat nonlinear delay systems with a motor example, In: Nonlinear Control in the Year 2000 (A. Isidori, F. Lamnabhi-Lagarrigue and W. Respondek, Eds.), Lecture Notes in Control and Inform. Sci., Vol. 258, pp. 339-352, Springer.
Oriolo, G.; De Luca, A. & Vendittelli, M. (2002). WMR Control Via Dynamic Feedback Linearization: Design, Implementation and Experimental Validation. IEEE Transactions on Control Systems Technology, Vol. 10, No. 6, pp. 835-852.
Rigatos, G.G. (2003). Fuzzy Stochastic Automata for Intelligent Vehicle Control. IEEE Transactions on Industrial Electronics, Vol. 50, No. 1, pp. 76-79.
Rigatos, G.G.; Tzafestas, S.G. & Evangelidis, G.A. (2001). Reactive parking control of a nonholonomic vehicle via a Fuzzy Learning Automaton. IEE Proceedings: Control Theory and Applications, Vol. 148, pp. 169-179.
Rigatos, G.G. (2008). Coordinated motion of autonomous vehicles with the use of a distributed gradient algorithm. Journal of Applied Mathematics and Computation, Elsevier, Vol. 199, No. 2, pp. 494-503.
Rigatos, G.G. (2007a). Particle Filtering for state estimation in nonlinear industrial systems, 2nd IC-EpsMsO 2007, 2nd International Conference on Experiments/Process/System Modelling/Simulation & Optimization, Athens, Greece, July 2007.
Rigatos, G.G. & Tzafestas, S.G. (2007). Extended Kalman Filtering for Fuzzy Modeling and Multi-Sensor Fusion. Mathematical and Computer Modeling of Dynamical Systems, Vol. 13, No. 3, Taylor and Francis.
Rigatos, G.G. (2007b). Extended Kalman and particle filtering for sensor fusion in mobile robot localization, PhysCon 2007, International Conference on Physics and Control, Potsdam, Germany, Sep. 2007.
Rouchon, P. (2005). Flatness-based control of oscillators. ZAMM Zeitschrift für Angewandte Mathematik und Mechanik, Vol. 85, No. 6, pp. 411-421.
Rudolph, J. (2003). Flatness Based Control of Distributed Parameter Systems, Steuerungs- und Regelungstechnik, Shaker Verlag, Aachen.
Yang, N.; Tian, W.F.; Jin, Z.H. & Zhang, C.B. (2005). Particle Filter for sensor fusion in a land vehicle navigation system. Measurement Science and Technology, Institute of Physics Publishing, Vol. 16, pp. 677-681.
Thrun, S.; Burgard, W. & Fox, D. (2005). Probabilistic Robotics, MIT Press.

21
An Improved Real-Time Particle Filter for Robot Localization
Dario Lodi Rizzini and Stefano Caselli

Dipartimento di Ingegneria dell’Informazione, Università degli Studi di Parma, Parma, Italy

1. Introduction

Robot localization is the problem of estimating robot coordinates with respect to an external reference frame. In the common formulation of the localization problem, the robot is given a map of its environment, and to localize itself relative to this map it needs to consult its sensor data. The effectiveness of a solution to the localization problem in an unstructured environment strongly depends on how it copes with the uncertainty affecting robot perception. The probabilistic robotics paradigm provides statistical techniques for representing information and making decisions, along with a unifying mathematical framework for probabilistic algorithms based on Bayes rule (Thrun et al., 2005). For this reason, Bayesian filtering has become the prevailing approach in recent works on localization (Elinas & Little, 2005; Sridharan et al., 2005; Hester & Stone, 2008). Bayesian filtering is a general probabilistic paradigm to arrange motion and sensor data in order to achieve a solution in the form of a distribution of the state random variables. Bayesian filters differ in the representation of the probability density function (PDF) of the state. For example, the resulting estimation of Gaussian filters (Kalman Filter, Extended Kalman Filter) (Leonard & Durrant-Whyte, 1991; Arras et al., 2002) is expressed in the form of a continuous parametric function, while the state posterior is decomposed in discrete elements for nonparametric filters. The main nonparametric algorithm is called Particle Filter (PF) (Fox et al., 1999) and relies on importance sampling (Doucet et al., 2001). With importance sampling, the probability density of the robot pose is approximated by a set of samples drawn from a proposal distribution, and an importance weight measures the distance of each sample from the correct estimation. The nonparametric approach has the advantage of providing a better approximation of the posterior when a parametric model does not exist or changes during iteration, e.g. in initialization or when environment symmetries determine a multi-modal PDF. Even if techniques like Multi-Hypothesis Tracking (Arras et al., 2002) attempt to manage multi-modal distributions, particle filters are more efficient and can represent all kinds of PDFs, including uniform distributions. Moreover, particle filters limit errors due to the linearization of model equations that can lead to poor performance and divergence of the


filter for highly nonlinear problems. Unfortunately, particle filters suffer from computational complexity due to the large number of discrete samples of the posterior: for each sample a pose update, a correction and a resampling step are performed. Since localization can be performed slowly with respect to the usual movement and tasks of the robot, it would be conceivable to perform localization over a large time interval. Therefore, there have been attempts to adapt the number of samples (Fox, 2003). However, during an excessive time interval uncertainty increases and many useful observations are dropped; a proper interval to complete a particle filter iteration should approximately match the period of the incoming data. A trade-off must therefore be reached between the time constraints imposed by the need of collecting sensor data incoming at a given rate and the number of samples determining the accuracy of the representation of the localization hypotheses. Performance depends both on the number of integrated observations and on the number of samples. The Real-Time Particle Filter (RTPF) (Kwok et al., 2004) is a variant of the standard particle filter addressing this problem. Samples are partitioned into subsets among the observations of an estimation window. The size of each partitioned subset is chosen so that a particle filter iteration can be performed before a new observation is acquired; a rough sizing example is sketched below. The difference from a standard PF with a smaller sample set lies in the representation of the posterior as a mixture of samples: at the end of an estimation window the distribution consists of the samples from each subset of the window. Mixture weights determine how each partition set contributes to the posterior and are computed in order to minimize the approximation error of the mixture distribution. While RTPF represents a remarkable step toward a viable particle filter-based localizer, there are a few issues to be addressed in developing an effective implementation. RTPF convergence is prone to a bias problem and to some numerical instability in the computation of the mixture weights, arising from the need to perform a numerical gradient descent. Furthermore, even adopting RTPF as the basic architecture, the design of a flexible and customizable particle filter remains a challenging task. For example, the life cycle of samples extends beyond a single iteration and covers an estimation window in which the mixture posterior computation is completed. This extended life cycle of samples impacts software design. Moreover, RTPF implementations must address observation management and the constraints derived from it. A good implementation should be adaptable to a variety of sensors. In this chapter, we describe the application of RTPF to robot localization and provide three additional contributions: a formal analysis of the evolution of the mixture posterior in RTPF, a novel solution for the computation of mixture weights yielding improved stability and convergence, and a discussion of the design issues arising in developing a RTPF-based robot localization system. The algorithm described in (Kwok et al., 2004) computes mixture weights by minimizing the Kullback-Leibler (KL) distance between the mixture distribution and the theoretically correct one. Unfortunately, this criterion tends to promote partition sets of the estimation window that provide a poor representation of the distribution of the robot state. In particular, we show that the KL criterion favours sets with a low effective sample size (Liu, 1996) and leads to a bias in the estimation.
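As a rough, purely illustrative calculation of the sizing constraint mentioned above (the figures in the comment are invented and not taken from the chapter), the admissible partition-set size follows directly from the observation period and the per-sample update cost:

#include <cstddef>

// Invented example figures: observations every 0.1 s and roughly 20 us of
// update time per sample would allow partition sets of at most ~5000 samples,
// so that one RTPF iteration finishes before the next observation arrives.
std::size_t maxPartitionSize(double obsPeriodSec, double perSampleCostSec) {
  return static_cast<std::size_t>(obsPeriodSec / perSampleCostSec);
}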
As an alternative way to compute mixture weights, we define a weights matrix, whose elements are related to the effective sample size. The mixture weight vector is then computed as an eigenvector of this matrix. This solution is more robust and less prone to numerical instability. Finally, we propose the design of a library that takes care


of the efficient life cycle of samples and control data, which is different between RTPF and a standard particle filter, and supports multiple motion and sensor models. This flexibility is achieved by applying generic programming techniques and a policy pattern. Moreover, differing from other particle filter implementations (e.g., CARMEN (Montemerlo et al., 2003)), the library is independent from specific control frameworks and toolkits. The remainder of the chapter is organized as follows. Section 2 contains an overview of RTPF with the original algorithm to compute mixture weights. Section 3 provides a formal description of the bias problem and a novel approach to the computation of mixture weights based on the effective number of samples. This approach simplifies RTPF and tries to avoid spurious numeric convergence of gradient descent methods. Section 4 illustrates design issues connected to RTPF and describes a localization library implementing a highly configurable particle filter localizer. Section 5 presents simulation and experimental results, which are reported and compared with the original RTPF performance. Finally, section 6 gives concluding remarks.

2. Real-time particle filters

In particle filters, updating the particles used to represent the probability density function (potentially a large number) usually requires a time which is a multiple of the cycle of sensor information arrival. Naive yet often adopted approaches include discarding observations arriving during the update of the sample set, aggregating multiple observations into a single one, and halting the generation of new samples upon a new observation arrival (Kwok et al., 2004). These approaches can affect filter convergence, as either they lose valuable sensor information, or they result in inefficient choices of the algorithm parameters.

Fig. 1 RTPF operation: samples are distributed in sets, associated with the observations. The distribution is a mixture of the sample sets based on weights α_i (shown as a_i in the figure).

An advanced approach dealing with such situations is the Real-Time Particle Filter (RTPF) (Kwok et al., 2003; Kwok et al., 2004), which is briefly described in the following. Consider k observations. The key idea of the Real-Time Particle Filter is to distribute the samples in sets, each one associated with one of the k observations. The distribution representing the system state within an estimation window is defined as a mixture of the k sample sets, as shown in Fig. 1. At the end of each estimation window, the weights of the mixture belief are determined by RTPF based on the associated observations, in order to minimize the approximation error relative to the optimal filter process. The optimal belief could be obtained, with enough computational resources, by computing the whole set of samples for each observation. Formally:


$$Bel_{opt}(x_{t_k}) \propto \int \prod_{i=1}^{k} p(z_{t_i} | x_{t_i})\, p(x_{t_i} | x_{t_{i-1}}, u_{t_i})\, Bel(x_{t_0})\, dx_{t_0} \cdots dx_{t_{k-1}} \qquad (1)$$

where $Bel(x_{t_0})$ is the belief generated in the previous estimation window, and $z_{t_i}$, $u_{t_i}$, $x_{t_i}$ are, respectively, the observation, the control information, and the state for the $i$-th interval. Within the RTPF framework, the belief for the $i$-th set can be expressed, similarly, as:

$$Bel_i(x_{t_k}) \propto \int p(z_{t_i} | x_{t_i}) \prod_{j=1}^{k} p(x_{t_j} | x_{t_{j-1}}, u_{t_j})\, Bel(x_{t_0})\, dx_{t_0} \cdots dx_{t_{k-1}} \qquad (2)$$

containing only observation-free trajectories, since the only feedback is based on the observation $z_{t_i}$, i.e. the sensor data available at time $t_i$. The weighted sum of the $k$ beliefs belonging to an estimation window results in an approximation of the optimal belief:

$$Bel_{mix}(x_{t_k} | \alpha) \propto \sum_{i=1}^{k} \alpha_i\, Bel_i(x_{t_k}) \qquad (3)$$

An open problem is how to define the optimal mixture weights minimizing the difference between $Bel_{opt}(x_{t_k})$ and $Bel_{mix}(x_{t_k} | \alpha)$. In (Kwok et al., 2004), the authors propose to minimize their Kullback-Leibler distance (KLD). This measure of the difference between probability distributions is largely used in information theory (Cover & Thomas, 1991) and can be expressed as:

$$J(\alpha) \propto \int Bel_{mix}(x_{t_k} | \alpha) \log \frac{Bel_{mix}(x_{t_k} | \alpha)}{Bel_{opt}(x_{t_k})}\, dx_{t_k} \qquad (4)$$

To optimize the weights of the mixture approximation, a gradient descent method is proposed in (Kwok et al., 2004). Since gradient computation is not possible without knowing the optimal belief, which requires the integration of all observations, the gradient is obtained by Monte Carlo approximation: the beliefs $Bel_i$ share the same trajectories over the estimation window, so we can use the weights to evaluate both $Bel_i$ (each weight corresponds to an observation) and $Bel_{opt}$ (the weight of a trajectory is the product of the weights associated to this trajectory in each partition). Hence, the gradient is given by the following formula:

$$\frac{\partial J}{\partial \alpha_i} \cong 1 + \sum_{s=1}^{N_p} w_{t_i}(x_{t_i}^{(s)}) \log \frac{\sum_{j=1}^{k} \alpha_j\, w_{t_j}(x_{t_j}^{(s)})}{\prod_{j=1}^{k} w_{t_j}(x_{t_j}^{(s)})} \qquad (5)$$

where $Bel_i$ is substituted by the sum of the weights of the $i$-th partition set and $Bel_{opt}$ by the sum of the trajectory weights. Unfortunately, (5) suffers from a bias problem, which (Kwok et al., 2004) solve by clustering samples and computing separately the contribution of each cluster to the gradient (5). In the next section, an alternative solution is proposed.
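For concreteness, the following fragment sketches how the Monte Carlo gradient can be evaluated from the per-partition trajectory weights, following the form reconstructed in (5); the data layout and the function name are assumptions made for this illustration, not the authors' implementation.

#include <cstddef>
#include <vector>
#include <cmath>

// Sketch of the Monte Carlo gradient of the KL objective (4)-(5).
// w[s][j] is the importance weight of trajectory s in partition set j,
// alpha[j] the current mixture weight; both are assumed precomputed.
std::vector<double> klGradient(const std::vector<std::vector<double>>& w,
                               const std::vector<double>& alpha) {
  const std::size_t Np = w.size();        // number of trajectories
  const std::size_t k  = alpha.size();    // number of partition sets
  std::vector<double> grad(k, 1.0);       // the constant "1 +" term of (5)
  for (std::size_t s = 0; s < Np; ++s) {
    double mix = 0.0, opt = 1.0;
    for (std::size_t j = 0; j < k; ++j) {
      mix += alpha[j] * w[s][j];          // Bel_mix evaluated on trajectory s
      opt *= w[s][j];                     // Bel_opt: product of weights along s
    }
    const double logRatio = std::log(mix / opt);
    for (std::size_t i = 0; i < k; ++i)
      grad[i] += w[s][i] * logRatio;      // contribution of trajectory s to dJ/dalpha_i
  }
  return grad;
}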


3. An enhanced RTPF

In this section we provide a formal investigation of the origin of the bias in the RTPF estimation of (Kwok et al., 2004) and we propose a new solution for the computation of the mixture weights.

3.1 A bias in RTPF

In RTPF, samples belonging to different partition sets are drawn from the same proposal, but their importance weights depend on different observation likelihood functions $p(z_{t_i} | x_{t_i})$, which are computed at different time instants $t_i$. Hence, the first source of disparity among partition sets is the degree of proposal dispersion during the correction step. A suitable measure of proposal dispersion at iteration $t_i$ is provided by the radius of the ball set $B(\eta_{x_{t_i}}, r) \subset \mathbb{R}^d$, which is centered on the expected value $\eta_{x_{t_i}}$ and includes a consistent portion of the distribution of $x_{t_i}$. The probability that a sample falls in $B(\eta_{x_{t_i}}, r)$ can be bounded by $r$ and the trace of the covariance matrix $\Sigma_{x_{t_i}}$, since the following Chebychev-like inequality holds:

$$P(x_{t_i} \in B(\eta_{x_{t_i}}, r)) > 1 - \frac{tr(\Sigma_{x_{t_i}})}{r^2} \qquad (6)$$

In the following, the probability of the event given by $B(\eta_{x_{t_i}}, r)$ will refer to the proposal density function stopped at $t_i$:

$$\pi(x_{t_i}) = \int_{\mathbb{R}^{d \times i}} \prod_{j=1}^{i} p(x_{t_j} | x_{t_{j-1}}, u_{t_j})\, dx_{t_0} \cdots dx_{t_{i-1}} \qquad (7)$$

Then, given 0 < ε < 1 , a sample falls in a ball with at least probability ε when its radius is larger than the dispersion radius:

$$r_{t_i, \varepsilon} = \sqrt{\frac{tr(\Sigma_{x_{t_i}})}{1 - \varepsilon}} \qquad (8)$$

The parameter $r_{t_i, \varepsilon}$ provides a rough estimation of the dispersion, because only for a unimodal PDF does the ball $B(\eta_{x_{t_i}}, r_{t_i, \varepsilon})$ (briefly $B$ hereafter) delimit a region around a local maximum. Furthermore, it is often the case that $x_{t_i}$ is a vector of heterogeneous random variables (e.g. Cartesian coordinates and angular values), whose variances are mixed in the trace, with the result that bound (8) largely overestimates the region. However, the dispersion radius is a synthetic value and can be adapted to multimodal distributions after decomposition into a sum of unimodal hypotheses. Empirically, this decomposition is achieved by clustering the samples. As control commands are applied and the robot position is updated, the dispersion radius increases together with the trace of the covariance matrix. If $G_{t_i}$ is the Jacobian of the motion model computed at $(\eta_{x_{t_i}}, u_{t_i})$, with $G_{t_i} G_{t_i}^T \geq I$ (hypotheses verified by a standard model like (Thrun et al., 2005)), and $\Sigma_{w_{t_i}}$ is the covariance matrix of the additive noise, then


$$tr(\Sigma_{x_{t_{i+1}}}) \approx tr(G_{t_i} \Sigma_{x_{t_i}} G_{t_i}^T) + tr(\Sigma_{w_{t_i}}) \qquad (9)$$

Thus, we conclude that $tr(\Sigma_{x_{t_i}}) \leq tr(\Sigma_{x_{t_{i+1}}})$ and that the dispersion radius increases over the estimation window. A more accurate estimation of how it increases could be obtained with further hypotheses on the motion model, e.g. Lipschitz continuity. Since the proposal is more and more spread out over the estimation window and the correction is performed at different times for each partition, we want to investigate how the dispersion affects the importance weights. The observation likelihood $w_{t_i}(x) = p(z_{t_i} | x)$ is usually more concentrated than the proposal, sometimes peaked, as shown in (Grisetti et al., 2007). We assume that, given a proper $\delta > 0$, the region

$$L = \{ x \in B \;|\; w_{t_i}(x) > \delta \} \qquad (10)$$

covers a consistent portion of $w_{t_i}(x)$. Thus, the observation likelihood is bounded in $L$ by $M = \sup_{x \in L} w_{t_i}(x) < \infty$ (envelope condition) and in $B \setminus L$ by $\delta$. Hence, $w_{t_i}(x) < \lambda(x)$ with

$$\lambda(x) = \begin{cases} M & x \in L \\ \delta & \text{otherwise} \end{cases} \qquad (11)$$

The bounding function $\lambda(x)$ and the set $L$ are defined on the ball $B$, and in the following we will restrict the sampling domain to $B$ using $\pi(x_{t_i} | x_{t_i} \in B)$ as the proposal. This assumption allows us to consider the dispersion radius in the following discussion. Moreover, this approximation is not so rough when $\varepsilon$ is close to 1. The effective sample size (Liu, 1996) is a measure of the efficiency of a set of samples in the representation of a target posterior:

$$n_{eff}^{t_i} = \frac{1}{\sum_{s=1}^{N} \tilde{w}_{t_i}(x_{t_i}^{(s)})^2} \qquad (12)$$

$$n_{eff}^{t_i} = \frac{\left( \sum_{s=1}^{N} w_{t_i}(x_{t_i}^{(s)}) \right)^2}{\sum_{s=1}^{N} w_{t_i}(x_{t_i}^{(s)})^2} \qquad (13)$$

The above expression is obtained by substituting the normalized weights $\tilde{w}_{t_i}(x)$ with their expression. Maximizing the effective sample size is equivalent to minimizing the variance of the weights: it is easy to show with the Jensen inequality that $n_{eff}$ is bounded by the number of samples $N$, which is obtained when all the weights are equal and the variance is small. The bounds on the observation likelihood allow an approximation of the expected values of the weight and of the squared weight:

$$E_\pi[w_{t_i}(x) \,|\, x_{t_i} \in B] \leq M \cdot H_L + \delta \cdot H_{B \setminus L} \qquad (14)$$


$$E_\pi[w_{t_i}^2(x) \,|\, x_{t_i} \in B] \leq M^2 \cdot H_L + \delta^2 \cdot H_{B \setminus L} \qquad (15)$$

where $H_L = E_\pi[I_L(x)]$ and $H_{B \setminus L} = E_\pi[I_{B \setminus L}(x)]$ are the visit histograms of the bins $L$ and $B \setminus L$ respectively; in our notation $I_D(x)$ is the indicator variable with value 1 when $x$ falls in $D$, zero otherwise. Equations (14) and (15) can be used to approximate the numerator and the denominator of (13):

$$n_{eff}^{t_i} \cong N\, \frac{(M \cdot H_L + \delta \cdot H_{B \setminus L})^2}{M^2 \cdot H_L + \delta^2 \cdot H_{B \setminus L}} \cong N \left( H_{B \setminus L} + 2\frac{M}{\delta} H_L + \frac{M^2 H_L^2}{\delta^2 H_{B \setminus L}} \right) \qquad (16)$$

The approximation given by (16) follows from the assumption that $H_L / H_{B \setminus L}$ is small.

$V$ is a symmetric and positive semi-definite (SPSD) matrix. Moreover, each element $j$ on the main diagonal is the inverse of the effective sample size of set $j$. The effective sample size is a measure of the efficiency of importance sampling on each of the partition sets. Therefore, the off-diagonal elements of $V$ correspond to a sort of importance covariance between two partition sets. Thus we will refer to this matrix as the weights matrix. Hence, a criterion to compute the mixture weights consists of choosing the vector that is left unchanged by the map (19) except for scale. Since (19) depends on the squares of the sample weights, the resulting mixture weights reflect the importance of each partition set according to the effective sample size. The vector is thus obtained by searching for an eigenvector of the matrix $V$. To achieve better stability we choose the eigenvector corresponding to the largest eigenvalue. The eigenvector can be computed using the power method or the inverse power method. This criterion can be interpreted as an effort to balance the effective number of samples while keeping the proportion among the different partition sets. Fig. 2 illustrates the differences in the mixture weights computed according to the original algorithm (RTPF-Grad) and the proposed variant (RTPF-Eig) with an example. When RTPF-


Eig is used to compute the mixture weights, the weights of the last partition sets in the estimation window decrease with the effective sample size of the sets, while they increase with RTPF-Grad. Thus, the proposed criterion takes into account the effectiveness of the representation provided by the partition sets.
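The chapter does not give the code for RTPF-Eig; the following is a minimal sketch of the idea, assuming the weights matrix V has already been assembled from the sample weights as described in the text. Plain power iteration extracts the dominant eigenvector, which is then normalized to sum to one so that it can serve as the mixture weight vector α.

#include <cstddef>
#include <vector>
#include <cmath>

// Sketch of the RTPF-Eig mixture-weight computation: the mixture weights are
// taken as the dominant eigenvector of the (symmetric, positive semi-definite)
// weights matrix V, extracted here by power iteration. V is assumed given.
std::vector<double> mixtureWeights(const std::vector<std::vector<double>>& V,
                                   int iterations = 100) {
  const std::size_t k = V.size();
  std::vector<double> alpha(k, 1.0 / k);          // initial guess
  for (int it = 0; it < iterations; ++it) {
    std::vector<double> next(k, 0.0);
    for (std::size_t i = 0; i < k; ++i)
      for (std::size_t j = 0; j < k; ++j)
        next[i] += V[i][j] * alpha[j];            // next = V * alpha
    double norm = 0.0;
    for (double v : next) norm += v * v;
    norm = std::sqrt(norm);
    for (double& v : next) v /= norm;             // keep the iterate bounded
    alpha.swap(next);
  }
  double sum = 0.0;
  for (double v : alpha) sum += v;
  for (double& v : alpha) v /= sum;               // mixture weights sum to one
  return alpha;
}

The inverse power method mentioned in the text can be substituted for the loop above without changing how the resulting vector is used.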


Fig. 2 Effective sample size (top) and mixture weights computed according to the original algorithm and to the proposed variant (bottom) in an estimation window of 15 partitions.

4. Complexity of RTPF implementation

As pointed out in the previous section, updating the whole set of samples can be quite demanding from a computational perspective. Together with advanced algorithms, able to


maximize the use of sensor data, the developer should pay great attention to the implementation issues. Inefficiencies due to poor analysis of object creation/destruction cycles and of the abstraction levels supporting polymorphic behaviour can introduce a drain of computing time preventing the successful application of the conceived algorithm. This section describes a library designed to efficiently support the implementation of particle filter localization algorithms. The library aims at providing an efficient yet open infrastructure allowing the users to exploit the provided genericity to integrate their own algorithms. The library has been studied to be easily included in the majority of control systems for autonomous mobile robots. In a functional layer, or controller, with the basic computational threads for robot action and perception, the localization task can be simply configured as a computationally demanding, low priority thread.

4.1 Design of the library

Based on the functional analysis of the localization problem, four main components have been identified: the localizer, the dynamic model of the system, the sensor data model, and the maps. The interaction among these components enables the implementation of the prediction, matching, and resampling phases of particle filter algorithms. Three classes storing basic data information are required: system state, control command, and sensor data. The main component is the localizer, implemented by the Localizer class, managing the localization cycle through the coordination of the classes composing the library. Listing 1 shows a simplified interface of the Localizer class, including just two update() methods. In the prediction phase, update() uses the control command executed by the robot (Control class) to perform the computation on the SystemModel class, representing the dynamic model of the system. In the correction phase, the second update() method uses the perception of the environment (SensorData class) to update the density function through the evaluation of the weight. In the following, we describe only the implementation strategies to face the main sources of computational overhead arising in the implementation of abstraction layers and object creation/destruction cycles. The main goal of the library is to provide an open infrastructure aimed at integrating developer choices without hard-wiring them inside the code. Often this goal is achieved with the strategy pattern (Gamma et al., 1995). With this pattern the algorithms are implemented separately as subclasses of an abstract strategy class. While this implementation avoids the hard-wiring of user choices inside the library code, it causes inefficiency due to the abstraction levels introduced to support the polymorphic behaviour. This remark, together with the observation that the choices are immutable at runtime, suggested the use of static polymorphism in the implementation of the current version of the library, as illustrated by the sketch below. Static polymorphism is a more effective flavor of polymorphism based on the use of templates. Templates were originally conceived to support generic programming, as they are functions or classes that are written for one or more types not yet specified. Each template parameter models one degree of variability of the problem domain. This parameter must be fixed at compile time, allowing the compiler to generate the proper code. This static polymorphism guarantees type checking and improves code optimization.
Exploitation of templates to perform code generation is also known as generic programming (Alexandrescu, 2001).
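As a toy illustration of this idea (the class and method names here are invented for the example and are not part of the library's interface), a motion-model policy can be supplied as a template parameter, so that its predict() call is resolved, and possibly inlined, at compile time instead of going through virtual dispatch:

// Toy example of static polymorphism via a policy parameter; names are
// illustrative only and do not belong to the localization library.
struct OdometryMotionModel {
  template <typename State, typename Control>
  State predict(const State& x, const Control& u) const {
    // apply the odometry/kinematic update here (omitted in this sketch)
    return x;
  }
};

template <typename State, typename Control, typename MotionModel>
class PredictionStep {
  MotionModel model_;
public:
  // The call to predict() is resolved at compile time: no virtual dispatch,
  // and the compiler may inline the motion model into the sampling loop.
  State operator()(const State& x, const Control& u) const {
    return model_.predict(x, u);
  }
};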


template< State,
          SensorData,
          Control,
          SampleCounter,
          SampleManager,
          Fusion >
class Localizer : public SampleManager<State> {
  Fusion<SensorData> fusion_;

public:
  template< ... >                    // StateConverter
  Localizer(Pdf &pdf, StateConverter &converter, tods::TimeInterval &period);
  ~Localizer();

  template< ... >                    // SystemModel parameters
  void update(Control &u, SystemModel &sys);

  template< ... >                    // SensorModel parameters
  void update(SensorData &z, SensorModel &sen);
};

Listing 1 The Localizer class.

To reduce the second source of computational overhead, we focused our attention on the creation/destruction of the objects within the localization cycle, in order to reduce their dynamic allocation. Fig. 3 presents the life cycle of the objects created and destroyed during a single localization cycle of the RTPF algorithm. During each step, new samples and new objects for the representation of odometric and sensor data are created only to be immediately destroyed. Note that the samples and controls are created in different iterations of the localizer in the same estimation window and survive after the end of this window: that is a rather different way of handling objects from a standard particle filter implementation. Thus a management


policy and proper data structures for storage are required in a flexible implementation; a simple double-buffering sketch is given below. Observations, on the other hand, even if they are created and used within a single iteration and their handling is quite straightforward, come with a variety of sensor models given by the range of available sensors. To support such a variety, the library provides generic sensor data.
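The following fragment is one possible way to avoid per-cycle allocation of samples; it is an illustrative sketch only and not the memory-management policy actually implemented in the library.

#include <cstddef>
#include <vector>
#include <utility>

// Illustrative double-buffer sample pool: the two buffers are allocated once
// and swapped at each resampling step, so no allocation happens inside the
// localization cycle. Not the library's actual management policy.
template <typename Sample>
class SamplePool {
  std::vector<Sample> current_, scratch_;
public:
  explicit SamplePool(std::size_t n) : current_(n), scratch_(n) {}
  std::vector<Sample>& current() { return current_; }
  std::vector<Sample>& scratch() { return scratch_; }
  void commit() { std::swap(current_, scratch_); }   // scratch becomes current
};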

Fig. 3 Life cycle for particle, measurement, and control objects within a single step in a real-time particle filter.

5. Results

In this section the performance of RTPF is evaluated both in a simulated environment and using experimental data collected by the robot sensors. An important purpose of this section is the comparison of the two RTPF versions differing in the method for computing the mixture weights.

5.1 Simulation

Several tests were performed in the environments shown in Fig. 4 and Fig. 5. They correspond to the main ground floor hallway in the Computer Engineering Department of the University of Parma (Fig. 4) and to the hallway of the Department of Computer Science and Engineering of the University of Washington (Fig. 5, map adapted from (Kwok et al., 2004)). These environments allow verification of RTPF correctness while coping with several symmetric features, which may cause ambiguities in the choice of the correct localization hypotheses. The environment of Fig. 5 had been exploited in (Kwok et al., 2004) to verify RTPF correctness and has therefore been considered as a reference. In simulation, the map is stored as a grid with a given resolution (0.20 m) and is used both to create simulated observations and to compute importance weights in the correction steps. The data provided to the localizer consist of a sequence of laser scans and measurements:


the scanned ranges are obtained by ray tracing a beam on the discretized map; a simplified sketch of this procedure is shown below. The measurement model is also based on ray tracing, according to standard beam models for laser scanners (Thrun et al., 2005). In our tests we have used only three laser beams, measuring the distances to the left, right and frontal obstacles; such poor sensor data stress the role of the algorithm rather than the quality of the sensor data. Gaussian additive noise was added to both the range beams and the robot movements representing the environment inputs and the robot state in simulation. Thus the simulation tests are performed in an environment known in detail and are best suited for comparing performance between algorithms. The task of the robot is to achieve localization while moving in the environments of Fig. 4 and Fig. 5 along assigned trajectories. The simulated trajectories, labeled as Path 1 and Path 2 in Fig. 4 and Fig. 5, correspond to lengths of approximately 5 to 8 m.
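The grid interface, the occupancy convention and the half-cell step size in this fragment are assumptions made for the illustration; it is not the simulator used for the experiments.

#include <vector>
#include <cmath>

// Simplified ray casting on an occupancy grid (resolution in metres per cell).
struct Grid {
  std::vector<int> cells;            // 1 = occupied, 0 = free
  int width, height;
  double resolution;                 // e.g. 0.20 m per cell
  bool occupied(double x, double y) const {
    int cx = static_cast<int>(x / resolution);
    int cy = static_cast<int>(y / resolution);
    if (cx < 0 || cy < 0 || cx >= width || cy >= height) return true;  // border as wall
    return cells[cy * width + cx] != 0;
  }
};

double castRay(const Grid& map, double x, double y, double angle, double maxRange) {
  const double step = map.resolution / 2.0;          // half-cell stepping
  for (double r = 0.0; r < maxRange; r += step)
    if (map.occupied(x + r * std::cos(angle), y + r * std::sin(angle)))
      return r;                                      // distance to first occupied cell
  return maxRange;
}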

Fig. 4 Map 1 – Hallway and simulated paths in the Computer Engineering Department, University of Parma.

Fig. 5 Map 2 – Hallway and simulated paths in the Department of Computer Science and Engineering, University of Washington.

The localization algorithms investigated are the two RTPF versions: the original steepest-descent-based one (RTPF-Grad) and the proposed one based on the effective number of samples (RTPF-Eig). During these tests the partition set size was 1000 samples. A summary of the simulation results is reported in Fig. 6 and Fig. 7, where the curves show the localization error of the two algorithms at each iteration, considering convergence to the maximal hypothesis. For both curves, each value is obtained by averaging the distances of the estimated pose from the real pose over 10 trials where localization eventually converged to the correct hypothesis within the maximum number of iterations (set to 40). For both


algorithms there were also a few instances where localization did not converge to the correct hypothesis within the length of the path, although the correct hypothesis was the second best. These unsuccessful experiments were approximately 10% of all simulated localization trials. We did not verify whether the robot would eventually recover its correct pose in the environment with further navigation. On the average, the two versions of the RTPF-based localizer converge to a few hypotheses after three iterations, and the resulting sample distribution is multi-modal. Hence, cluster search leads to a few hypotheses with different weights. In our tests a hypothesis close to the correct robot pose always exists, and when this hypothesis prevails there is a sudden change in the localization error, as shown in Fig. 6 and Fig. 7. Convergence is helped by recognizable features, e.g. the shape of the scans, but when the environment is symmetric it can be difficult to reach, especially with limited or noisy sensing. Of course, the mean error trend in Fig. 6 and Fig. 7 does not correspond to any of the simulated trials; rather, it is the result of averaging trials with quick convergence and trials where the correct hypothesis could only be recovered after several more iterations.


Fig. 6 Performance of the two RTPF versions in the simulated environment of Map 1. The x-axis represents the iterations of the algorithm. The y-axis shows the average error distance of the estimated pose from the actual robot pose.

Fig. 8 provides an alternative view of the same data, as its curves show the percentage of simulation trials converging to the correct hypothesis (i.e. with localization error less than 1.5 m) at each iteration. For both environments, convergence is reached within only a few


iterations in some simulation runs. In other simulations, the correct robot pose is recovered only after about 20 or 30 iterations, i.e. after sensing map features that increase the weight of the correct samples. Empirically, for the examined environments RTPF-Eig seems to exhibit a slightly faster convergence, on the average, to the correct localization hypothesis, even though its average error at the last recorded iteration appears somewhat larger.


Fig. 7 Performance of the two RTPF versions in the simulated environment of Map 2. The x-axis represents the iterations of the algorithm. The y-axis shows the average error distance of the estimated pose from the actual robot pose.

5.2 Experiments

Real experiments took place in the environment of Fig. 4, collecting data with a Nomad 200 mobile robot equipped with a Sick LMS 200 laser scanner. The robot moved along Path 1 for about 5 m, from the left end of the hallway, in steps of about 15−20 cm, reading three laser beams from each scan in the same way as in the simulation tests. In the real environment localization was always successful, i.e. it always converged to the hypothesis closer to the actual pose in less than 10 iterations (remarkably faster than in simulation). The localization error after convergence was measured below 50 cm, comparable or better than in simulation. To assess the consistency of the localizer’s output on a larger set of experiments, we compared the robot pose computed by the localizer (using the RTPF-Eig algorithm) with the one provided by an independent localization methodology. To this purpose, some visual



landmarks were placed in the environment and on the mobile robot, and a vision system based on the ARToolKit framework (Kato & Billinghurst, 1999) was used to triangulate the robot position from these landmarks. The vision system provided an independent, coarse estimate of the robot pose at any step, and hence made it possible to establish the convergence of the RTPF-based localizer. The two localization estimates were computed concurrently at each location and stored by the robot. Fig. 9 shows the results of 10 tests of RTPF-Eig over about 20 iterations. These results confirm that RTPF-Eig achieves localization to the correct hypothesis very fast in most experiments. After convergence, the maximum distance between the RTPF-based and the vision-based estimates is about 70 cm, due to the compound error of the two systems.

Fig. 8 Percentage of simulation trials converged to the correct hypothesis, i.e. with localization error less than 1.5 m, during iterations for Map 1 (a) and Map 2 (b).

6. Conclusion

In this chapter, we have described an enhanced Real-Time Particle Filter for mobile robot localization incorporating several formal and practical improvements. We have presented a formal discussion of the computation of mixture weights in RTPFs, along with a new approach overcoming potential problems associated with the existing technique. The method proposed in this chapter computes the mixture weights as the eigenvector of a matrix and thus avoids gradient descent, possibly prone to numerical instability. The method provides a balance of the effective sample size of the partition sets on an estimation window. We have also



described a library efficiently supporting the implementation of particle filter algorithms and independent from any specific robot control architecture. The library takes advantage of generic programming and of a carefully designed object life-cycle model to minimize overhead while providing flexibility. The proposed approach has been implemented in a RTPF for localization with a Nomad 200 mobile robot equipped with a laser range scanner, and evaluated in both simulation tests and real experiments. In two simulation environments, the new approach has achieved a localization performance similar to the original KLD-based algorithm, while avoiding the potential problems associated with gradient search methods. In real experiments with the mobile robot, the modified RTPF-based localization system has proven very effective, yielding correct localization within a small number of filter iterations. Of course, further experimental work is required to assess the relative merit of the improved RTPF over the original approach. Nonetheless, the research described in this chapter shows how a thorough theoretical understanding of the problem and of the algorithmic solution should be combined with a careful software implementation to attain the potential of probabilistic localization methods.

Fig. 9 Performance of RTPF-Eig using real data collected in the hallway of Map 1: discrepancy (in metres) between the localizer estimate and the ARToolKit-based estimate over the filter iterations.

7. Acknowledgement

This research has been partially supported by Laboratory LARER of Regione Emilia-Romagna, Italy.


8. References

Alexandrescu, A. (2001). Modern C++ Design: Generic Programming and Design Patterns Applied. Addison-Wesley.
Arras, K. O., Castellanos, H. F., and Siegwart, R. (2002). Feature-based multi-hypothesis localization and tracking for mobile robots using geometric constraints. IEEE Int. Conf. on Robotics and Automation, 2:1371–1377.
Cover, T. M. & Thomas, J. A. (1991). Elements of Information Theory. Wiley.
Doucet, A., de Freitas, J., & Gordon, N. (2001). Sequential Monte Carlo Methods in Practice. Springer.
Elinas, P. and Little, J. (2005). sMCL: Monte-Carlo localization for mobile robots with stereo vision. Proc. of Robotics: Science and Systems.
Fox, D. (2003). Adapting the sample size in particle filters through KLD-sampling. Int. J. of Robotics Research, 22(12):985–1003.
Fox, D., Burgard, W., & Thrun, S. (1999). Monte Carlo Localization: Efficient position estimation for mobile robots. Proc. of the National Conference on Artificial Intelligence.
Gamma, E., Helm, R., Johnson, R., and Vlissides, J. (1995). Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley.
Grisetti, G., Stachniss, C., & Burgard, W. (2007). Improved techniques for grid mapping with Rao-Blackwellized particle filters. IEEE Trans. on Robotics, 23(1):34–46, February 2007.
Hester, T. & Stone, P. (2008). Negative Information and Line Observations for Monte Carlo Localization. IEEE Int. Conf. on Robotics and Automation, pages 2764–2769.
Kato, H. and Billinghurst, M. (1999). Marker tracking and HMD calibration for a video-based augmented reality conferencing system. Proc. of the Int. Workshop on Augmented Reality.
Kwok, C., Fox, D., & Meilǎ, M. (2003). Adaptive real-time particle filters for robot localization. IEEE Int. Conf. on Robotics and Automation, 2:2836–2841.
Kwok, C., Fox, D., & Meilǎ, M. (2004). Real-time particle filters. Proc. of the IEEE, 92(3):469–484.
Leonard, J. J. & Durrant-Whyte, H. F. (1991). Mobile Robot Localization by Tracking Geometric Beacons. IEEE Int. Conf. on Robotics and Automation.
Liu, J. (1996). Metropolized independent sampling with comparisons to rejection sampling and importance sampling. Statistics and Computing, 6(2):113–119.
Montemerlo, M., Roy, N., & Thrun, S. (2003). Perspectives on standardization in mobile robot programming: The Carnegie Mellon navigation (CARMEN) toolkit. IEEE/RSJ Int. Conf. on Intelligent Robots and Systems.
Se, S., Lowe, D., & Little, J. (2002). Mobile robot localization and mapping with uncertainty using scale invariant visual landmarks. Int. J. of Robotics Research, 21(8):735–758.
Sridharan, M., Kuhlmann, G., & Stone, P. (2005). Practical vision-based Monte Carlo localization on a legged robot. IEEE Int. Conf. on Robotics and Automation, pages 3366–3371.
Thrun, S., Burgard, W., & Fox, D. (2005). Probabilistic Robotics. MIT Press, Cambridge, MA.

22
Dependability of Autonomous Mobile Systems
Jan Rüdiger, Achim Wagner and Essam Badreddin

Automation Laboratory, Dept. of Mathematics & Computer Science, University of Heidelberg, Mannheim, Germany

1. Introduction

Computer systems pervade more and more of our everyday life. They are found in workstations, in mobile devices and in nearly every device - from consumer products such as coffee machines to safety-critical systems such as cars, industrial production lines etc. Due to increasing complexity, the controllability of such systems is a serious problem, while their impact and consequences on our daily life are continuously increasing. Therefore, non-functional system properties like dependability have become a crucial factor within the computerized product design process. Although common informal ideas about dependability exist, a formal definition of dependability is still missing. One reason may be the historical growth of the definition of the term dependability, to which a number of attributes have been added incrementally. In the 1940s, when the first computers based on vacuum tube technology were constructed with huge failure probabilities (in the ENIAC computer, tubes failed every 7 minutes), reliability became an issue. Due to increased interaction with these systems, later on in the 1960s availability became even more important. Related to the requirements of controlling safety-critical plants like nuclear power stations or spacecraft, the safety property of computer systems came into focus during the 1970s. Internet connectivity, databases and mobile services were the reason why security, integrity and maintainability have been added to the dependability concept. The state-of-the-art assessment of dependability is based on a binary fault model, which describes components on an operable – not operable basis, and a logical error propagation using fault trees, event trees or binary block diagrams (Vesely et al., 1981). Modern approaches like Markov-chain models (Flammini, 2006) or Stochastic Petri Nets (Filippini & Bondavalli, 2004) capture the time-dependent probability of a combination of error states within a system. However, they are not able to detect the origin of an error resulting from the system dynamics. For instance, a light bulb mostly fails during the switch-on phase and not during stationary operation. This error scenario cannot be reflected by pure probabilistic modelling. A further disadvantage of pure probabilistic models is that they are more or less decoupled from the original behaviour of the system. Thus, finding a valid fault model which starts from the functional model of the system is up to the design engineer. However, even a fault probability equal to zero does not guarantee that systems operate according to what users expect, because the requirements on the dynamic system behaviour are not modelled. Dependability is more than a collection of attributes related to a probabilistic – but static – error description.


This is particularly important when dealing with autonomous or semi-autonomous systems. With an increasing degree of autonomy and increasing safety requirements, the requirements for dependability increase. Hence, being able to measure and compare the dependability of a system becomes more and more vital. Since autonomous mobile systems are often described by their behaviour, it is straightforward to also define the dependability of such systems in a behavioural context. To show the link to conventional approaches and to allow the use of existing tools, the proposed dependability model is supplemented by the majority of the dependability attributes described above. The proposed approach of a behaviour-based definition of dependability described in this chapter is focused on autonomous mobile systems. Nevertheless, the ideas behind it are more general.

2. Basics of dependable systems

According to (Candea, 2003), the general notion of what is usually understood by dependability can be summarized as follows: turn your homework in on time, if you say you’ll do something then do it, don’t endanger others, etc. Computer-controlled systems, in our case autonomous mobile systems, do, however, not understand these vague concepts. When it comes to machines we need a more precise understanding and definition of dependability. If system dependability must be expressed and measured in numbers, a formal definition is even more important. This definition must be mostly system-independent in order to have the opportunity to compare the dependability of different systems. Finally, this definition must agree with the general notion of dependability. This chapter gives a broad overview of what is usually understood under the term dependability and discusses the sometimes different definitions of dependability used throughout the literature. Based on the aforementioned, and in combination with a behavioural system description, a dependability definition for autonomous mobile systems is proposed. Non-functional properties reflect the overall quality of a system. Besides performance, robustness, usability etc., dependability is becoming a more important non-functional system requirement. The general, qualitative definitions of dependability used so far in the literature are presented and discussed in the following. The most frequently cited definitions of dependability are the ones introduced by Carter and Laprie, which are presented here together with others in chronological order.

Carter (Carter, 1982): A system is dependable if it is trustworthy enough that reliance can be placed on the service it delivers.

Laprie (Laprie, 1992): Dependability is that property of a computing system which allows reliance to be justifiably placed on the service it delivers.

Badreddin (Badreddin, 1999): Dependability in general is the capability of a system to successfully and safely fulfill its mission.

Dubrova (Dubrova, 2006): Dependability is the ability of a system to deliver its intended level of service to its users.

All four definitions have in common that they define dependability in terms of the service a system delivers and the reliance that can be placed on that service. The service a system delivers is the behaviour perceived by the user, which can also be called the mission of the system.


Only a few further definitions following the idea of the definitions presented above exist. Among them are the dependability definition of the International Electrotechnical Commission (IEC) (“IEC 60050, IEV 191-02-03: Dependability and quality of service - Part 1: Common terms”, see IEC, 1990)¹, the definition from the IFIP 10.4 Working Group on Dependable Computing and Fault Tolerance (see International Federation for Information Processing) and the definition from the Department of Defence (see Department of Defence, 1970). These classical definitions of dependability do, however, exhibit two major drawbacks:
1. They do not define a directly applicable and repeatable way to compute the dependability of a system.²
2. The dynamics of the system are not directly taken into account when investigating the dependability of the system. Using the information gained from the mathematical system model, for example, seems obvious when investigating the dependability of that system. Neglecting the dynamics of a system means ignoring all information coming from that model. Furthermore, the dynamics of a system are crucial for fault scenarios, not only in the case of autonomous mobile systems.
In this chapter a theoretical framework is proposed that not only describes a dependability definition and dependability means in relation to the dynamics of a system, but also presents a repeatable and system-independent way of how to measure dependability. Dependability is mainly understood as an integrated concept that additionally consists of different attributes (see also Figure 1). According to (Avizienis et al., 2004b; Avizienis et al., 2004a; Randell, 2000) dependability consists of the following attributes:
• Availability – readiness for correct service,
• Reliability – continuity of correct service,
• Safety – absence of catastrophic consequences for both user(s) and environment,
• Confidentiality – absence of unauthorized disclosure of information,
• Integrity – absence of improper system state alteration and
• Maintainability – ability to undergo modifications and repairs.
For further information as to the impact of these attributes on the dependability of the complete system, please refer to (Avizienis et al., 2004a) and (Dubrova, 2006). Further definitions with slightly different attributes exist in the literature (see e.g. Candea, 2003; Dewsbury et al., 2003). The main idea, i.e. that dependability consists of different attributes, is still of value and will be part of the definition proposed below. The authors would like to stress, however, that more than a static aggregation of attributes is needed for the description of the dependability of an autonomous mobile system. These attributes are always related to a static model, even if the error propagation is dynamic. The proposal for dependability measurement also includes the attribute approach as a special case, in order to be compatible with the qualitative dependability definitions. Unfortunately, a few of the attributes have different, not necessarily similar, definitions. Still missing is a formal and comprehensive definition for a few attributes. As for autonomous mobile systems, not all attributes are of equal importance. For a discussion of the set of attributes important for autonomous mobile systems the reader is referred to (Rüdiger et al., 2007b).

¹ This definition of dependability is often referred to as ISO 1992 or IEC 50(191).
² According to the IEC norm the definition is “used only for general descriptions in non-quantitative terms”.


Figure 1. The Dependability Tree

3. Framework for a theory of dynamical systems

In order to develop a formal definition for the dependability of autonomous mobile systems, the following terms mentioned in the non-formal definitions first need a mathematical description:
• service,
• system,
• and reliance.
Therefore, a formal mathematical theory that is capable of describing these terms is needed. There are different techniques to describe a system mathematically, i.e. to model a system. Among them are the state-space approach, where a system is modelled by a set of input, output and state variables related by first-order equations (time domain), and, for example, the frequency-domain approach. Another approach is behavioural modelling (see Willems, 1991), where a system is modelled only by describing its behaviour. Since the service a system delivers is the behaviour as it is perceived by the user, the latter modelling technique is used here and is briefly introduced in the following. Additionally, the behaviour-based approach is quite common in the field of autonomous mobile robots; this goes back to the 1980s, when Rodney Brooks introduced his subsumption architecture (see Brooks, 1986). Finally, the behavioural modelling approach offers the opportunity to model the complete system including both environment and user.

Willems (Willems, 1991) defines a system in a universum U. The elements of U are called outcomes of the system. A mathematical model of a system from a behavioural or black-box point of view claims that certain outcomes are possible, while others are not. The model thus defines a specific subset B ⊂ U. This subset is called the behaviour of the system. Subsequently, a (deterministic) mathematical model of a system is defined as:

Definition 3.1 A mathematical model is a pair (U, B) with the universum U - its elements are called outcomes - and B the behaviour.

Given the above definition of a mathematical model, a dynamical system is a set of trajectories describing the system behaviour during the time instants of interest in its signal space W.

In contrast to a state-space representation, such as ẋ = f(x), a dynamical system is defined as:

Definition 3.2 A dynamical system Σ is a triple Σ = (T, W, B) with T ⊆ R the time axis, W the signal space, and B ⊆ W^T the behaviour.

In the above definition a dynamical system is described by a period of time T, the signal space W and a set of time trajectories B. The behaviour of a dynamical system is described by a subset B of all possible time trajectories T → W. For a trajectory (an event) w : T → W the following applies:
• w ∈ B: the model allows the trajectory w,
• w ∉ B: the model forbids the trajectory w.
The reader is referred to (Willems, 1991) for examples of this modelling technique.
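As a minimal illustration of this behavioural view (not taken from the chapter), a dynamical system can be represented in code as a sampled time axis together with a membership test "w ∈ B" over trajectories; all names below are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# A trajectory w : T -> W, here with W = R for simplicity.
Trajectory = Callable[[float], float]

@dataclass
class DynamicalSystem:
    time_axis: Sequence[float]                  # sampled subset of the time axis T
    behaviour: Callable[[Trajectory], bool]     # membership test "w in B"

    def allows(self, w: Trajectory) -> bool:
        """True if the model allows the trajectory w (w in B)."""
        return self.behaviour(w)

# Example behaviour: all trajectories bounded by 1 in magnitude on the sampled times.
T = [0.1 * k for k in range(100)]
bounded = DynamicalSystem(T, lambda w: all(abs(w(t)) <= 1.0 for t in T))

print(bounded.allows(lambda t: 0.5))        # allowed trajectory
print(bounded.allows(lambda t: 2.0 * t))    # forbidden for larger t
```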

4. Behaviour-based dependability of autonomous mobile systems

As stated earlier, the service a system delivers is the behaviour as it is perceived by the user. A framework for a mathematical system description according to its behaviour was introduced in the last section. So far the system and its service, from the dependability definition, can be described mathematically. What remains is a fundamental definition of the service a system should deliver and an additional description of the reliance, in terms of the service offered, in the given framework. Finally, the attributes of dependability as discussed in section 2 need to be defined in the framework (see also Rüdiger et al., 2007a).

4.1 Behaviour and mission of autonomous mobile systems

As mentioned before, the behaviour of a system is the set of all possible time trajectories of its external states (see section 3). Due to limitations - deliberate ones, ones caused by a fault in the system, or changes in the environment - this set B could be slightly changed or reduced to a set of behaviours actually available to the system, which will typically be only a subset of the behaviours of the original system. This set, which may vary over time, is defined as follows:

Definition 4.1 Let Σ = (T, W, B) be a dynamical system. Then the set of available behaviours of the system at time t is the subset of W^T consisting of the trajectories wi(t) : T → W, i = 1...n, available to the system at that time.

Again, the set of available behaviours is a set of time trajectories within the original signal space W. It is not necessarily a subset of B, since it can also contain trajectories resulting from a fault in the system or a change in the environment that was not previously modelled. It may also vary over time. Additionally, it need not cover all trajectories from B since, for implementation reasons, not all possible trajectories must be available to the system.

Autonomous mobile systems are usually built as general-purpose machines that handle different tasks. In the following, these tasks will also be called missions. For estimating the dependability of an autonomous mobile system it is, however, not important that all kinds of missions are handled dependably, but only the mission in terms of the service the system should deliver, i.e. what the system is intended to do. Such missions are defined as:

Definition 4.2 (Mission) Let Σ = (T, W, B) be a time-invariant dynamical system. We say the mission wm of this system is the map wm : T → W with wm ∈ B.

Note that even though an autonomous mobile system is usually built for handling different tasks, the actual mission the system has to accomplish is defined as only one special behaviour from the set B. Thus, the mission defined here is just a special trajectory or, more precisely, a special behaviour in B. Weak controllability (see Hermann, 1977) is assumed, since it does not make sense to assign a mission to a system which by definition is not able to accomplish it. Whether the system is capable of fulfilling its mission wm is defined as follows:

Definition 4.3 A mission wm ∈ B for a given dynamical system Σ = (T, W, B) with the behaviours B is said to be accomplishable by this system if for all w1 ∈ B there exist a t ∈ T, t ≥ 0, a behaviour w ∈ B, w : T ∩ [0, t] → W, and a behaviour w2 ∈ B such that w' ∈ B, where w' : T → W is defined as the concatenation of w1 (for times before 0), w (on [0, t]) and w2 (following the mission trajectory wm thereafter).

Based on the definition of controllability according to (Willems, 1991), a mission wm is accomplishable if the system can be steered from any past trajectory (light green trajectory in Fig. 2) to the beginning of the mission trajectory wm (black trajectory in Fig. 2) and can then be steered along the mission trajectory (red and blue trajectories in Fig. 2).

Figure 2. A mission (black line) is accomplished by steering the system to the mission trajectory by behaviour w1 and subsequently by steering along the mission trajectory with behaviours w2 and w3

4.2 Behaviour-based attributes of dependability

Before continuing with the definition of dependability for autonomous mobile systems, the basic attributes of dependability must be defined in a behavioural context. Only the most important attributes for the dependability of autonomous mobile systems are introduced here. Please refer to (Rüdiger et al., 2007a) for an extended description and to (Rüdiger et al., 2007b) for the subset of attributes needed for measuring the dependability of autonomous mobile systems.

4.2.1 Reliability

A common (see e.g. Dubrova, 2006) informal definition of reliability is: Reliability R|t is the probability that the system will operate correctly in a specified operating environment in the interval [0, t], given that it worked at time 0.

An autonomous system is thus said to be reliable if the system state does not leave the set of admissible trajectories B. The reliability of a system can be defined as:

Definition 4.4 Let Σ = (T, W, B), T = Z or R, be a time-invariant dynamical system. The system is said to be reliable in the period [0, t] if for all 0 ≤ t1 ≤ t the system state satisfies w(t1) ∈ B. Correspondingly, the reliability of the system is the probability that the system is reliable.

4.2.2 Availability

Availability is typically important for real-time systems where a short interruption can be tolerated if the deadline is not missed. Availability A|t is the probability that a system is operational at the instant of time t. In contrast to reliability, availability is defined at a time instant t while reliability is defined over a time interval.

Definition 4.5 Let Σ = (T, W, B), T = Z or R, be a time-invariant dynamical system. The system is said to be available at time t if w(t) ∈ B. Correspondingly, the availability of the system is the probability that the system is available.

4.2.3 Safety

From the reliability point of view, all failures are equal. In the case of safety, failures are further divided into fail-safe and fail-unsafe ones. Safety is reliability with respect to failures that may cause catastrophic consequences. Therefore safety is informally defined as (see e.g. Dubrova, 2006): Safety S(t) of a system is the probability that the system will either perform its function correctly or will discontinue its operation in a fail-safe manner.

For the formal definition of safety an area S is introduced, as in (Badreddin & Abdel-Geliel, 2004), which leads to catastrophic consequences when left. In the latter case it is, however, assumed that this Dynamic Safety Margin is fully contained in the stability region, while here S is defined to be around B. This margin is, like B, highly system specific, but can be set equal to B in the case of restrictive systems.

Figure 3. Safety: The system trajectory w leaves the set of admissible trajectories B but is still considered to be safe since it remains inside S

Definition 4.6 Let Σ = (T, W, B), T = Z or R, be a time-invariant dynamical system with a safe area S ⊇ B. The system is said to be safe if for all t ∈ T the system state satisfies w(t) ∈ S.

This definition is consistent with the idea that a safe system is either operable, or not operable but in a safe state.

4.3 Behaviour-based dependability

Having defined the behaviour of a system and the mission, which corresponds to the service the system should deliver, the dependability of the system can be defined as:

Definition 4.7 A time-invariant dynamical system Σ = (T, W, B) with behaviours B and a mission wm ∈ B is said to be (gradually) dependable in a period T' ⊆ T if, for all t ∈ T', the mission wm can be (gradually) accomplished.
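As a minimal sketch (not from the chapter) of how the behavioural attribute checks of Definitions 4.4-4.6 could be evaluated on a sampled trajectory, the membership tests `in_B` and `in_S` below are assumed placeholders for the admissible set B and the safe area S.

```python
from typing import Callable, Sequence, Tuple

State = float  # one-dimensional signal space W for simplicity

def reliable(traj: Sequence[Tuple[float, State]],
             in_B: Callable[[State], bool], t: float) -> bool:
    """Definition 4.4: w(t1) in B for all sampled 0 <= t1 <= t."""
    return all(in_B(w) for (t1, w) in traj if 0.0 <= t1 <= t)

def available(traj: Sequence[Tuple[float, State]],
              in_B: Callable[[State], bool], t: float) -> bool:
    """Definition 4.5: w(t) in B at the single instant t (nearest sample)."""
    _, w = min(traj, key=lambda sample: abs(sample[0] - t))
    return in_B(w)

def safe(traj: Sequence[Tuple[float, State]],
         in_S: Callable[[State], bool]) -> bool:
    """Definition 4.6: w(t) in S for all sampled t."""
    return all(in_S(w) for (_, w) in traj)
```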

5. Behaviour-based dependability measure

The basic idea behind the dependability measure proposed in the last section is to define dependability based on the behaviour of the system. For this purpose, a desired behaviour, called the mission wm(t), was defined for the system, and the dependability measure was proposed to depend on the total deviation between the actual system behaviour w(t) and the desired behaviour wm(t). In order to actually measure dependability, this definition must, however, be made more precise.

5.1 Requirements for a dependability measure

Before proposing a function for measuring dependability, the characteristics this dependability function should possess are introduced. In the following, the dependability function will be called D.
• D(t) should be a continuous time-dependent function,
• D(t) should be positive and strictly monotonically decreasing,
• D(t) should be normalized between 0 and 1, where 1 means dependable and 0 means not dependable,
• D(t) should be a dimensionless quantity.
The dependability must be measured during and after the mission, hence the dependability measure D(t) must be a time-dependent function. The normalization and the non-dimensionalization are obvious requirements for achieving a system- and unit-independent measure. The limitation to the domain between 0 and 1 was chosen so that the dependability measure is comparable between different systems and application domains. D(t) should be strictly monotonically decreasing since a system is less dependable, i.e. undependability is more likely to occur, the longer a system runs.

5.2 Definition of the dependability measure

The system trajectory w(t) is the evolution of the system state. The distance between this trajectory and the mission wm(t), together with the distance to the safety area S, is the main idea of the measure for dependability. After the system Σ has completed its mission, the overall mission deviation Dm of the system and its mission wm is proposed as the sum of all deviations ε²(w(t), wm(t)). In the following,

the functional 2(w(t),wm(t)) will be abbreviated as 2(t). Thus, the overall mission deviation can be defined as: (1) Where 2(t) is an appropriate measure of the deviation between mission trajectory wm and system trajectory w and consequently a combination of different distance measurements, including the distance to the safety area S . The term max ( ( )2) represents the maximum deviation during this particular mission. Those distance measurements will be discussed in detail in the following. More important than knowing the system dependability after completion of the mission is knowing the dependability during the mission. At time, t the time dependent overall mission deviation D(t) can be measured by means of (2) Note that the integration limits for the second integral changed from (1) to (2). In order to calculate D (t) during the mission an estimation for max (2()) must be used. This value depends on the distance function 2(t) used and will be discussed together with the calculation of 2(t) in the following. Furthermore,

in (1) and (2) assures that the function for the time dependent overall deviation D is a positive function. The problem with this function for D(t), is that, besides that it is unnormalized, D(t) is equal to zero if there is no deviation between the desired trajectory wm(t) and the actual system trajectory w(t). Hence, in this case, the dependability derived from this function would be zero. 5.3 Non-dimensionalization and normalization Nondimensionalization is a technique for partial or full removal of units from a mathematical equation by a suitable substitution of variables. Normalization bounds the domain of a mathematical function to a given range of values. Function v with its codomain [omin..omax] can be normalized to a function v’ with its codomain [nmin..nmax] by the following formula: (3) For the time dependent overall mission deviation (2) the value for omin is: omin = 0

(4)
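A minimal sketch of the normalization step of (3), with illustrative variable names that are not from the chapter:

```python
# Illustrative sketch of the normalization in (3): map a value v with known
# codomain [o_min, o_max] onto [n_min, n_max].
def normalize(v: float, o_min: float, o_max: float,
              n_min: float = 0.0, n_max: float = 1.0) -> float:
    if o_max == o_min:
        raise ValueError("degenerate codomain: o_max must differ from o_min")
    return n_min + (v - o_min) * (n_max - n_min) / (o_max - o_min)

# With o_min = 0, n_min = 0 and n_max = 1 this reduces to v / o_max,
# which is the simplification used below in (7).
print(normalize(2.5, 0.0, 10.0))   # 0.25
```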

The dependability function, as stated in the introduction to this chapter, should have a codomain of [0..1]; consequently the values for nmin and nmax should be

nmin = 0    (5)

and

nmax = 1    (6)

With these values the normalization function reduces to

v'(t) = v(t) / omax    (7)

The value omax for the unnormalized dependability D can be set to

(8)

If at least one ε²(t) > 0 for t ∈ [0..tm], the normalized dependability D(t) can be computed from (2) with (7) and (8) as:

(9)

Nevertheless, the problems with this function are:
1. It only exists if at least one ε²(t) > 0 for t ∈ [0..tm]. In other words, it only exists if at least a small deviation between the desired behaviour wm and the actual behaviour w occurred.
2. It is subject to the calculation of ε²(t). Thereby max(ε²(τ)) cannot be estimated in advance and the dependability cannot be computed during the mission.
To finally overcome both problems, a system-independent way of computing ε²(t), which is additionally normalized between [0...1], is proposed. Having this, max(ε²(τ)) can be estimated as 1 and (10) can be estimated to

This finally leads to the desired system-independent, normalized dependability function D(t). D can now be computed from (9) as:

(11)
If a system-independent way to compute ε²(t) within [0...1] exists, this dependability function possesses all the required properties stated at the beginning of this chapter.

5.4 Computing ε²(t)

For computing the elements of 2(t) it is not only important to address the distance between the system state and the mission trajectory but also to address the different dimensions of dependability such as reliability, availability, etc. For a behavioural definition of these attributes please refer to (Rüdiger et al., 2007a). Furthermore, the distance of the system state to the safe area S also needs to be taken into account. Thus, 2(t) usually consists of different elements reflecting the different attributes of dependability for this special system. From (2) and (9) it follows that if 2(t) is a combination 2 2 of different measures ε 1 (t ) . . . ε n (t ) , D (t) is calculated

(12)

(13)

Setting again max(ε²i(t)) = 1, for i = 1...n, this can be reduced to:

(14)

As stated in the previous section, ε²i(t) must be normalized and lie between [0...1]. The corresponding function εi(t) must be chosen in such a way that 0 means dependable, i.e. the system state w(t) follows exactly the mission trajectory wm(t), and 1 means not dependable. In order to compute the different ε²i(t) a special distance measure is proposed, derived from the Euclidean distance measure between two points x = (x1 ... xn) and y = (y1 ... yn),

d(x, y) = √((x1 − y1)² + ... + (xn − yn)²)    (15)

This measure is, however, not normalized and not necessarily between 0...1. In order to achieve these two remaining points as well, the following distance measure, derived from (15), is proposed:

(16)

In (16) wm(t) is the desired (mission) behaviour and w(t) the actual behaviour of the system. The parameter wdev describes how severely a deviation from the mission trajectory influences the system's dependability. It must be chosen greater than zero and have the same dimension as w(t). The lower wdev is chosen, the more strongly a deviation from the desired behaviour is weighted (see Fig. 4). The proposed distance measure is therefore dimensionless and normalized to [0, 1].

Figure 4. Example of the distance function used to compute the different εi(t), with wm = 2 (dotted green line) and wdev = 1 (blue), wdev = 0.8 (green), and wdev = 0.4 (light green)

Like the Euclidean distance measure, the proposed distance measure ε²(t) defines a metric over the space W since it satisfies all conditions for a metric, which are:
1. ε²(x, x) = 0, identical points have a distance of zero,
2. ε²(x, y) = 0 if and only if x = y, identity of indiscernibles,
3. ε²(x, y) = ε²(y, x), symmetry,
4. ε²(x, y) ≤ ε²(x, z) + ε²(z, y), triangle inequality.
With the aid of this distance measure, the different attributes of dependability can be defined. For ε²i(t) the corresponding Euclidean distance measure di(t) is used as a basis.
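Since the exact form of (16) is not reproduced above, the sketch below shows only one candidate normalized deviation function with the stated properties (zero at zero deviation, bounded by 1, rising more steeply for smaller wdev); it is an assumption for illustration, not the chapter's formula.

```python
# Illustrative only: a candidate normalized deviation function with the
# properties described around (16). The chapter's own formula may differ.
def deviation_sq(w: float, wm: float, wdev: float) -> float:
    if wdev <= 0.0:
        raise ValueError("wdev must be greater than zero")
    d = abs(w - wm)                        # Euclidean distance in one dimension
    return d * d / (d * d + wdev * wdev)   # dimensionless, in [0, 1)

# Smaller wdev penalizes the same deviation more severely:
for wdev in (1.0, 0.8, 0.4):
    print(wdev, round(deviation_sq(2.5, 2.0, wdev), 3))
```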

5.5 Mission deviation ε²m(t)

The mission deviation describes the normalized difference between the mission trajectory and the system state at time t. For this purpose the distance measure discussed above is used directly with the Euclidean distance dm between the mission trajectory and the system state. When evaluating the dependability, ε²m(t) is used in most of the dependability measures. The mission deviation ε²m(t) is defined as

(17)

Again, wm(t) is the desired mission trajectory and w(t) is the actual behaviour of the system, as described in (16). See Fig. 5 for examples of dm(t).

Figure 5. Mission trajectory wm(t) (blue) and system trajectory w(t) (red) with examples of dm(t) at different timesteps.

5.6 Safety ε²s(t)

Besides the mission deviation ε²m(t), the safety term ε²s(t) is one of the most important elements of ε²(t). As proposed in Section 4.2.3, a safety area S is introduced which, when left, leads to catastrophic consequences. The minimum Euclidean distance between a system trajectory w(t) and the border of the safety area S at time t is taken as the basis for the measure ε²s(t). This distance is called dS(w(t)) and is abbreviated as follows: dS(t) for the minimum distance between the actual system state w(t) and the border of the safety area, and dSm(t) for the minimum distance between the mission trajectory wm(t) and the border of the safety area at time t.

Obviously, ε²s(t) should be 1 when dS(t) = 0, i.e. when the distance between the system state and the border of the safety area is zero. To adequately cover cases where the mission trajectory wm(t) itself is close to the border of the safety area S, not the absolute distance dS(t) between the actual system trajectory and the border of the safety area is taken, but the relative distance between dS(t) and the minimum distance dSm(t) of the mission trajectory wm(t) to the border of the safety area. Consequently, ε²s(t) is proposed as:

(18)

Both dS(t) and dSm(t) are greater than or equal to 0. The equation for ε²s(t) is only defined for dSm(t) ≠ 0. See Fig. 6 for examples of dS(t).

Figure 6. Mission trajectory wm(t) (blue) and system trajectory w(t) (red) with examples of dSm, the distance between the mission trajectory wm(t) and the border of the safety area S (red lines).

5.7 Timely mission accomplishment ε²T(t)

For a number of systems it is not only important that the system adequately follows the mission trajectory, but also that it follows the mission trajectory at a given time. A good example of such a system is a heart-lung machine, where it is not sufficient that the system gives the right pulses; they must also be delivered at the given timesteps. Another important example, especially in the field of controlling autonomous mobile real-time systems, is the class of periodic behaviours, e.g. velocity control or collision avoidance. In the latter example, the exact timing of a given behaviour is more important than the exact execution of the behaviour itself.

The calculation of ε²T(t) is of course only possible if wm(t) is uniquely invertible. For periodic functions, often used on autonomous mobile systems, the unique invertibility requirement can be relaxed to a piecewise unique invertibility requirement. Let w'm(w) : W → T be the inverse function of wm(t); then ε²T(t) is proposed as:

(19)

As in (16) and (17), the parameter tdev describes how severely a deviation from the mission trajectory influences the dependability of the system. See Fig. 7 for an example of ε²T(t).

5.8 Reliability ε²R(t)

As stated in section 2, reliability R|t describes the probability that the system will operate correctly in a specified operating environment in the interval [0, t]. For ε²R(t) this means that 1 − R|t describes the probability that the system will fail in the interval [0...t]. Setting t = tm, the latter probability can be used directly and thus ε²R(t) is proposed as:

ε²R(t) = 1 − R|tm    (20)

Figure 7. Mission trajectory wm(t) (blue) and system trajectory w(t) (red) with examples of dT(t)

5.9 Availability ε²A(t)

In contrast to reliability, availability is defined at a time instant t while reliability is defined over a time interval. The availability A|t describes the probability that a system is operational at the instant of time t. As for reliability, this means for ε²A(t) that 1 − A|t describes the probability that the system is not operable at time instant t. This probability can be used directly when computing ε²A(t). Thus ε²A(t) is proposed as:

ε²A(t) = 1 − A|t    (21)

This definition satisfies two statements about availability mentioned in section 2:
1. If a system cannot be repaired, its availability equals its reliability.
2. The integral of ε²A(t) over the mission time in the dependability function equals the average availability, also called interval or mission availability, as introduced in section 2.

5.10 Additional ε²X(t)

Depending on the system and its mission, additional measures for ε²(t) might be needed to take into account further special requirements with respect to dependability. As stated earlier, it is important that these ε²X(t) are dimensionless and normalized between 0 and 1, where 0 means dependable and 1 means not dependable.

6. Examples for measuring the dependability

To present the adaptability of the dependability definition proposed above, the following two examples serve as a demonstration.

6.1 Example 1: autonomous transport system

To clarify the behaviour-based dependability measurement, an autonomous mobile system with only one position degree of freedom is used. The system is an autonomous

transportation system built to autonomously reach different positions which could be, for example, stopping points on a track. For the dependability measurement only the position on the track is considered in the first example. The velocity and acceleration of the autonomous transportation system are initially disregarded in this example.

6.1.1 Behaviour-based system description

For the dependability measurement proposed in the last section, the system is modelled as described in Section 3. Since the system has only one position degree of freedom and can only move forward and backward on the track, the signal space of the system is W = R. The time of interest for this system is T = R+. For the description of the behaviour B, a train model is needed. A simple train model with rolling friction derived from Newton's law is used for that purpose. According to Newton's law, the sum of forces acting on an object is equal to the mass of that object multiplied by its acceleration. The mass of the train is assumed to be M. The forces acting on the train are, on the one hand, the driving force Fa and, on the other hand, the friction force Fr = μFn (μ represents the coefficient of rolling friction, Fn the force parallel to the plane's normal). It is assumed that the train only moves in a plane, thus there is no inclination, etc. Consequently, the force parallel to the normal of the plane, Fn, can be set equal to the force of gravity, Fn = Fg = Mg, with g being the acceleration due to gravity. A diagram of the system with the forces used in this model is shown in Fig. 8. The system can thus be described by the following equations:

M ẍ = Fa − Fr    (22)
Fr = μ Fn    (23)
Fn = Fg = M g    (24)

Figure 8. Example of an autonomous transportation system with the forces used to model the system: Fa driving force, Fr friction force and Fg gravitational force.

According to the behaviour-based approach set forth in section 3, the autonomous mobile transportation system can be described as follows:
• Universe: W = R
• Time: T = R+
• Behaviour: the set of trajectories satisfying equations (22)-(24).
The corresponding Matlab Simulink model is shown in Fig. 9. The position and the velocity of the system are controlled by simple PI controllers (see Figs. 10 and 11).

Figure 9. Matlab Simulink model of an autonomous transportation system. M is the mass of the system, μ the friction coefficient and g the acceleration due to gravity

Figure 10. Velocity loop of an autonomous transportation system. The system velocity is controlled by a simple PI controller.

Figure 11. Position loop of an autonomous transportation system. The position of the system is controlled by a simple PI controller.

Of all possible system behaviours in the set B, only a subset is available, according to the mass and the maximum possible driving force of the system. In this example it is further assumed that the system is able to completely follow the given velocities and accelerations.

6.1.2 Behaviour-based dependability measurement

The mission of the autonomous transportation system modelled above is to consecutively reach different positions on the track. The mission time in this example is set to 2400 time units. The system should thus accomplish a desired behaviour wm(t) with its given set of available behaviours. The set of desired behaviours for this example is generated with a Matlab Simulink model. For this purpose, the signal builder block is used (see Fig. 12) to define different desired positions on the track. The reference signal is fed to the real train system to simulate the actual behaviour (Model in Fig. 8) and also to the reference train system (Reference Model in Fig. 8) to generate the desired behaviour. With the aid of the generated behaviour

in the reference model, this is taken as the desired behaviour wm(t), or mission, of the autonomous transportation system and used for the computation of the system's dependability. This model shows an example of the different opportunities to measure the dependability of such systems. At first, it is assumed that the position of the autonomous transportation system can be measured adequately; consequently, it is assumed that the measurement of the position itself does not produce additional errors. Up till now only system-internal errors or deviations were considered as deviations between the reference model and the real system. It is also possible that changes in the model or the environment, as implicitly considered in this case, may occur. Unexpected wear of the wheels, resulting e.g. in a smaller wheel radius, can produce errors and as such lead to a deviation from the desired behaviour if the position of the train is only measured on the basis of the wheel rotations.
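The chapter uses a Simulink model for this simulation; the sketch below is only a rough Python equivalent of the train dynamics (22)-(24) with cascaded PI position and velocity loops. All numerical parameter values are assumptions and are not taken from the chapter.

```python
import numpy as np

# Illustrative 1-DOF transport system: M*x'' = Fa - mu*M*g*sign(x'),
# with cascaded PI loops, integrated with explicit Euler.
M, mu, g, dt = 500.0, 0.01, 9.81, 0.01          # assumed parameters
Kp_pos, Ki_pos = 0.8, 0.05                       # assumed position-loop gains
Kp_vel, Ki_vel = 2000.0, 50.0                    # assumed velocity-loop gains

def simulate(x_ref, t_end):
    x = v = i_pos = i_vel = 0.0
    xs = []
    for t in np.arange(0.0, t_end, dt):
        e_pos = x_ref(t) - x                     # outer loop: position error
        i_pos += e_pos * dt
        v_ref = Kp_pos * e_pos + Ki_pos * i_pos  # velocity set-point
        e_vel = v_ref - v                        # inner loop: velocity error
        i_vel += e_vel * dt
        Fa = Kp_vel * e_vel + Ki_vel * i_vel     # driving force
        Fr = mu * M * g * np.sign(v)             # rolling friction, (22)-(24)
        v += (Fa - Fr) / M * dt
        x += v * dt
        xs.append(x)
    return np.array(xs)

positions = simulate(lambda t: 10.0 if t > 1.0 else 0.0, t_end=30.0)
```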

Figure 12. A Matlab signal builder block is used together with a reference and the real system in order to generate the actual and desired behaviour of the system.

When generating the desired behaviour in this example it is assumed that the system is functioning properly. Thus, the reference model reflects the system adequately. Noise in the sensors, for example, is not explicitly modelled; of course, this could also have been introduced into the model for a better computation of the desired behaviour. In the first example, two different simulations are carried out:
1. To simulate an additive error, a constant value is added to the position measurement. This error could be due to faulty initialization, slippage etc., but could also be caused by an error in the model of such an autonomous transportation system.
2. To demonstrate to what extent noise in sensors or measurement uncertainty affects the dependability of a system, noise is added to the measurement of the position.
The results of the two simulations are shown in Fig. 13. The dotted red line in each case represents the desired behaviour, i.e. the mission trajectory wm. The actual system behaviour is shown as a blue line. The measured dependability for this example is shown as a dashed green line.

6.2 Example 2: Small train

Since the autonomous transportation system is built for the transport of people and as such represents a safety-critical system, system safety is also considered in the second example. In the second example, besides the position of the system, the velocity is considered when calculating dependability. In addition to the two simulations mentioned above, two other scenarios were added for the computation of dependability.

Figure 13. Simulation results for Example 1: (a) absolute value added to the position; (b) noise added to the position.
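As a brief illustration of the two fault-injection scenarios of Example 1, the snippet below adds a constant offset and measurement noise to a position signal; the reference trajectory, bias and noise levels are assumptions, not values from the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 2400.0, 1.0)                  # mission time: 2400 time units
true_position = np.interp(t, [0, 600, 1200, 1800, 2400], [0, 50, 50, 120, 120])

measured_additive = true_position + 5.0          # scenario 1: constant additive error
measured_noisy = true_position + rng.normal(0.0, 2.0, size=t.shape)  # scenario 2: noise

# Both measured signals would then be compared against the reference behaviour
# wm(t) via the deviation function to obtain the dependability D(t).
```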

Figure 14. Simulation results for Example 2 with position and speed used for the dependability calculation

1. In order to enhance the dependability calculation, a desired and an actual behaviour of the velocity were added. For the simulation of parameter errors, which are multiplicative, the velocity of the real system is multiplied by a constant value.
2. A safety area, as proposed, was added for the velocity. Consequently, the relative distance ε²s(t) is also used when computing the system's dependability.

For each of these two scenarios, both simulations already used in the first example were performed again. The results of the four individual simulations are shown in Figs. 14 and 15. As in the previous figure, the dotted red lines represent the desired behaviour for either the velocity or the position. The actual system behaviour in terms of velocity and position is shown as a blue line. The measured dependability for the examples is shown as a dashed green line.

Figure 15. Simulation results for Example 2 with position and speed used for the dependability calculation. Additionally, a safety area for the velocity is added.

7. Conclusion

There exist numerous non-formal definitions of dependability (see Carter, 1982; Laprie, 1992; Badreddin, 1999; Dubrova, 2006; Avizienis et al., 2004a, just to name a few). When applying those non-formal definitions to a specific system, the resulting dependability measure is usually only valid for this specific system and only in rare cases transferable to a family of similar systems. Small changes in the system or environment usually render those measurements useless when it comes to measuring or even comparing the dependability of different systems.

Autonomous mobile robots are often described by their behaviour. This aspect was utilized in this chapter for the definition of dependability in a behavioural context in order to obtain an easy-to-apply and computable formula for the dependability of systems. Since this

formula for dependability is based solely on the behaviour and the mission of a system, it can easily be compared across systems having different missions. The definition of dependability proposed in this chapter is straightforward, easily applicable and well suited for the dependability comparison of different systems.

8. References

Avizienis, A., Laprie, J.-C., and Randell, B. (2004a). Dependability and its threats: A taxonomy.
Avizienis, A., Laprie, J.-C., Randell, B., and Landwehr, C. (2004b). Basic concepts and taxonomy of dependable and secure computing. IEEE Trans. on Dependable and Secure Computing, 1(1):11–33.
Badreddin, E. (1999). Safety and dependability of mechatronics systems. In Lecture Notes. ETH Zürich.
Badreddin, E. and Abdel-Geliel, M. (2004). Dynamic safety margin principle and application in control of safety critical systems. In Proceedings of the 2004 IEEE International Conference on Control Applications, volume 1, pages 689–694.
Brooks, R. A. (1986). A robust layered control system for a mobile robot. IEEE Journal of Robotics and Automation, 2(1):14–23.
Candea, G. (2003). The basics of dependability.
Carter, W. (1982). A time for reflection. In Proc. 12th Int. Symp. on Fault Tolerant Computing (FTCS-12). IEEE Computer Society Press, Santa Monica.
Department of Defence, U. S. o. A. (1970). Military standard - definitions of terms for reliability and maintainability. Technical Report MIL-STD-721C.
Dewsbury, G., Sommerville, I., Clarke, K., and Rouncefield, M. (2003). A dependability model for domestic systems. In SAFECOMP, pages 103–115.
Dubrova, E. (2006). Fault tolerant design: An introduction. Draft.
Filippini, R. and Bondavalli, A. (2004). Modeling and analysis of a scheduled maintenance system: a DSPN approach.
Flammini, F. (2006). Model-Based Dependability Evaluation of Complex Critical Control Systems. PhD thesis, Università degli Studi di Napoli - Federico II.
Hermann, R. and Krener, A. (1977). Nonlinear controllability and observability. IEEE Transactions on Automatic Control, 22(5):728–740.
IEC (1990). International electrotechnical vocabulary. Chapter 191: Dependability and quality of service.
International Federation for Information Processing. WG 10.4 on dependable computing and fault tolerance. http://www.dependability.org/wg10.4/.
Laprie, J. C. (1992). Dependability: Basic Concepts and Terminology. Springer Verlag.
Randell, B. (2000). Turing Memorial Lecture: Facing up to faults. 43(2):95–106.
Rüdiger, J., Wagner, A., and Badreddin, E. (2007a). Behavior based definition of dependability for autonomous mobile systems. European Control Conference 2007, Kos, Greece.
Rüdiger, J., Wagner, A., and Badreddin, E. (2007b). Behavior based description of dependability - defining a minimum set of attributes for a behavioral description of dependability. In Zaytoon, J., Ferrier, J.-L., Andrade-Cetto, J., and Filipe, J., editors, ICINCO-RA (2), pages 341–346. INSTICC Press.

Vesely, W. E., Goldberg, F. F., Roberts, N. H., and Haasl, D. F. (1981). Fault Tree Handbook. U. S. Nuclear Regulatory Commission, NUREG-0492, Washington DC.
Willems, J. (1991). Paradigms and puzzles in the theory of dynamical systems. IEEE Transactions on Automatic Control, 36(3):259–294.

23
Model-free Subspace Based Dynamic Control of Mechanical Manipulators

Muhammad Saad Saleem and Ibrahim A. Sultan
University of Ballarat, Australia

1. Introduction

Realtime identification and dynamic control of mechanical manipulators is important in robotics, especially in the presence of varying loading conditions and exogenous disturbances, as these can affect the dynamic model of the system. Model free control promises to handle such problems and provides a solution in an elegant framework. Model free control has been an active area of research for controlling plants which are difficult to model and time varying in nature. The proposed framework takes the objective in operational space; the benefit of specifying the objective in operational space along with direct adaptive control is self-evident. In this framework, a subspace algorithm is used for model identification and H∞ is used for robust control of the manipulator dynamics. Because of the seamless integration of the identification and control modules, explicit values of the dynamic parameters are not calculated. The model free control system is capable of explicitly incorporating uncertainties using μ-synthesis. Uncertainty models can be calculated from experimental data using model unfalsification. The proposed control system employs a black box approach for the dynamics of mechanical systems. The chapter also presents results from a simulation of a planar robot using MATLAB® and Simulink® from MathWorks Inc.

1.1 Notations

The rigid body model adopted in this chapter is given by

M(q) q̈ + C(q, q̇) q̇ + G(q) + ξ(q, q̇) = u    (1)

where M(q) is the inertia tensor matrix, C(q, q̇) represents the Coriolis and centripetal forces, G(q) is gravity, and ξ(q, q̇) denotes unmodelled non-linearities. Joint variables, their velocities, and accelerations are denoted by q, q̇, q̈ ∈ R^n. In the case of a revolute joint, q is the angle, while in a prismatic joint q represents the distance. The torques generated by the actuators are represented by u ∈ R^n. It is assumed that the mechanical manipulator is fully actuated and non-redundant and that the Jacobian is known. The position of the end-effector is given by the forward kinematics equation, i.e. x = fkinematics(q). Differentiating it with respect to time, with JA(q) = ∂fkinematics/∂q, gives

ẋ = JA(q) q̇    (2)
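As a short illustration of (2), the sketch below evaluates the forward kinematics and analytical Jacobian of a 2-link planar arm; the link lengths are assumed values and are not taken from the chapter.

```python
import numpy as np

l1, l2 = 1.0, 0.8   # assumed link lengths

def fkin(q):
    q1, q2 = q
    return np.array([l1 * np.cos(q1) + l2 * np.cos(q1 + q2),
                     l1 * np.sin(q1) + l2 * np.sin(q1 + q2)])

def jacobian(q):
    q1, q2 = q
    return np.array([[-l1 * np.sin(q1) - l2 * np.sin(q1 + q2), -l2 * np.sin(q1 + q2)],
                     [ l1 * np.cos(q1) + l2 * np.cos(q1 + q2),  l2 * np.cos(q1 + q2)]])

q = np.array([0.3, 0.6])
q_dot = np.array([0.1, -0.2])
x_dot = jacobian(q) @ q_dot          # operational-space velocity, equation (2)
print(fkin(q), x_dot)
```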

1.2 Problem statement

The problem under discussion can be stated as the amalgamation of inverse dynamics, in which M(q), C(q, q̇), and G(q) are known, i.e. u = f(q, q̇, q̈); dynamic parameter identification, in which M(q), C(q, q̇), and G(q) are calculated; and robust control, to cater for unmodelled non-linearities and disturbances in the system.

2. Control of articulated manipulators

Control of articulated manipulators can be divided into two main categories:
• joint space control,
• operational space control.
Joint space control consists of two subproblems. First, the manipulator inverse kinematics is performed, and then a joint space control scheme is devised to allow the end effector to follow a reference input. The main computational burden in this scheme is incurred by the inverse kinematics procedure, which is normally performed by using different optimization techniques, particularly in redundant systems where there can be infinitely many solutions for a given task (Kim et al. (2003)). Many implementations of joint space control can be found in the literature (Laib (2000); Kelly (1997); Arimoto (1995); Kelly (1993); Wen et al. (1992); Tomei (1991); Takegaki & Arimoto (1981); Zhang et al. (2000)).

In many applications, the desired path of the end effector is specified in the operational space (e.g. a Cartesian frame). Operational space control, on the other hand, has also been used for constrained manipulator motions (Sapio & Khatib (2005)); these constraints can be due to gravity or kinematically imposed. It can be seen in Figure 2 that inverse kinematics is embedded in the closed-loop control law and not explicitly performed as in Figure 1 (Sciavicco & Siciliano (2000)). Operational space control and task space control allude to the same concept (Xie (2003)).

Figure 1. Joint space control

The proposed architecture controls the manipulator in joint space. The reason behind this is the very fact that inverse kinematics is highly non-linear in nature. It is assumed that the analytical

Jacobian JA is available and that the manipulator is fully actuated and non-redundant. Numerical solutions to inverse kinematics are more complex and manipulator specific (Khalil & Dombre (2004)). Numerically based inverse kinematics is out of the scope of this chapter.

Figure 2. Operational space control

3. Model free control

There are four ways to use experimental data for control design, as shown in Table 1 (Woodley (2001)). The choice mainly depends on the application. For realtime systems which are fairly easy to model, indirect control is a better choice: the system adapts itself and updates its model parameters according to the conditions gathered from the measured data. An online model-based design is normally referred to as indirect control. If a system is hard to model from first principles (e.g. Newton's laws of motion), or there are time-varying nonlinearities, then direct adaptive control suits the application. Examples of plants which are difficult to model are arc furnaces (Wilson (1997)) and helicopter rotors (Lohar (2000)). Biped robots, on the other hand, can be modelled but exhibit time-varying nonlinearities (Wolkotte (2003); Kim et al. (2004); Caballero et al. (2004)).

Table 1. Four different techniques of control design from experimental data

3.1 General predictive control

Model free control comes under the category of "general predictive control" (GPC). Model free implementations range from fuzzy and neural control (Boyd & Little (2000); Cheng (2004)) to crisp control techniques (Favoreel et al. (1999a); Woodley et al. (2001a)). However, crisp control is regarded as reliable and explicitly defines the performance objective when compared with fuzzy control techniques (Athans (1999)). In direct adaptive control techniques, an explicit model formation is not needed; this is why it is referred to as model free control. Plant input and output values are observed in realtime, and a controller is designed for the estimated plant model. Model free is actually a misnomer, as the data from the plant's input and output also represent some kind of plant information. In a model free control implementation, the system identification and the controller synthesis techniques are seamlessly integrated to reduce the computational burden, which makes it more suitable for realtime applications. Predictive control has been applied with H2-optimal predictors and cost functions (Grimble (1998)), H∞ predictors and control costs (Zhao & Bentsman (1999)), and mixed minimax H2/H∞ predictors (Tse et al. (1993)). Subspace predictors have also been used for direct control with

quadratic (Favoreel et al. (1998, 1999b,a)) and H∞ (Woodley et al. (2001b); Woodley (2001); Woodley et al. (2001a)) cost functions. Other implementations include adaptive inverse control (Widrow & Walach (1994)), LMS (least mean square, Widrow & Stearns (1985)), FxLMS (filtered-x LMS) and its alternatives (Sayyar-Rodsari et al. (1998)), identification and control based on loop shaping (Date & Lanzon (2004)), and Lyapunov-based frameworks (Haddad et al. (2003); Hayakawa et al. (2004)). H∞ control is popular amongst control engineers because of its ability to control MIMO (multiple-input multiple-output) systems and because it is built on strong mathematical foundations.

3.2 System identification

System identification is used to build dynamical models from measured input-output data of a plant. There are many system identification techniques. The list starts with classical prediction error (PE) methods and their variants: auto-regression with exogenous input (ARX), output error (OE), auto-regression moving average with exogenous input (ARMAX), and Box-Jenkins (BJ) (Norton (1986); Ljung (1999)). Subspace identification methods (SIM) have many advantages over classical system identification techniques (Overschee & Moor (1996)). Notable ones are:
• From the plant's input and output data, a predictor is found. This bears similarity to the Kalman filter states, and transforms the analysis into a simple least squares problem. As such, the whole architecture can be streamlined in a user-friendly fashion.
• When implemented in direct adaptive control, the plant model does not require simplification. Simplification or model reduction can omit useful information. Instead, in subspace identification methods all the plant information is stored in the compact form of a subspace predictor.
• The output of subspace identification methods can be in state space form, which makes it easy to implement on a computer, but its architecture has been exploited in different model free implementations as well (Woodley et al. (2001b); Favoreel et al. (1999a)).
Wernholt used SIM to solve the system identification problem for an ABB IRB 6600 robot (Wernholt (2004)). Hsu et al. used N4SID in style translation for human motion (Hsu et al. (2005)). These are some of the examples that show how SIMs are being used.

3.2.1 Reported problems in subspace identification methods

There are a few problems with subspace identification methods. Many of them have been discussed in recent literature and partial remedies have been suggested (Chou & Verhaegen (1997); Lin et al. (2004); Wang & Qin (2004); Chiuso & Picci (2005)). Some of these problems are:
• biased estimates for closed-loop data,
• the assumption of noise-free input.
The first problem can be solved by filtering the predicted data through a frequency-weighted matrix. The second one is solved by using a robust control methodology, which caters for disturbances and noise in the system.

3.3 Model unfalsification

For truly robust model free control, the system should be able to calculate an uncertainty model from the input and output data of the plant. This can be done through model unfalsification. First an uncertainty model is unfalsified against the plant input and output data, and then the uncertainty model (Δ) is incorporated in the controller design. Model unfalsification has not gained much appeal in practice because of its high computational burden (Woodley (2001)). Many implementations are available for model unfalsification (Kosut & Anderson (1997); Agnoloni & Mosca (2003); Tsao et al. (2003); Woodley et al. (1999); Safonov (2003); Tsao & Safonov (2001); Woodley et al. (1998); Cabral & Safonov (2004)). Wang et al. suggested a direct adaptive controller based on model unfalsification with the assumption that there would be a controller in the given set that satisfies the control requirements for a particular plant (Wang et al. (2004)). The identification of uncertainty models using model unfalsification is out of the scope of this chapter.

Figure 3. Model free subspace based control. The plant transfer function P is unknown and the controller transfer function K is configured in realtime.

4. Model-free subspace based dynamic control of mechanical manipulators

The desired trajectory of the end-effector is given by [x, ẋ, ẍ]^T. If the initial position of the end-effector, xo, and the initial joint variables, qo, are known, then using equation (2), q, q̇ and q̈ can be written as

(3)

(4)

(5)

In case J is not a square matrix, the pseudo-inverse of J, i.e. J†, is used. Once the reference trajectory is in the joint space, the model free control system, which has been inspired by the work of Woodley et al., can be applied (Woodley et al. (2001b)).
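Equations (3)-(5) are not reproduced above; the sketch below only illustrates the standard differential-kinematics route one would expect here (q̇ = J†ẋ, q̈ = J†(ẍ − J̇q̇), with q obtained by integration from qo), which is an assumption and may differ from the chapter's exact formulation.

```python
import numpy as np

# Illustrative mapping of an operational-space reference to joint space with the
# pseudo-inverse of the analytical Jacobian. J and its crude numerical
# derivative are placeholders.
def to_joint_space(x_dot, x_ddot, q, q_dot, J, J_prev, dt):
    J_pinv = np.linalg.pinv(J)               # J^dagger for non-square J
    J_dot = (J - J_prev) / dt                # numerical derivative of J
    qd = J_pinv @ x_dot                      # joint velocities
    qdd = J_pinv @ (x_ddot - J_dot @ q_dot)  # joint accelerations
    q_next = q + qd * dt                     # integrate from the initial q_o
    return q_next, qd, qdd
```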

⎤⎦ , control effort u, and q for the framework minimizes the error in joint space ⎡⎣ q,q,q maximum value of the input qr. Here q = q − qr and qr is the reference trajectory in joint space. The reason behind minimizing q is that the system becomes unstable near T

singularities, and it becomes important that near singularities the system does not try to achieve extremely high velocities, which could make the system unstable. The cost function to be minimized can now be written as

(6)

where γ is the performance objective, and zw1, zw2, zw3 are the weighted feedbacks of [q̃, q̃̇, q̃̈]^T, u, and q̇, respectively. For simplicity, let us suppose y1 = [q̃, q̃̇, q̃̈]^T, y2 = q̇, e = q̃, and r = qr.

The weights are applied in the frequency domain. The time-domain equivalent of these weighted feedback signals can be written as

(7)

where H1 and H2 are lower triangular Toeplitz matrices developed from the impulse responses (Markov parameters) of the discrete weighting filters W1 and W2. These weights are normally assigned by the designer.

(8)

(9)

then Γ1 and Γ2 are the extended observability matrices formed from the impulse responses of the weighting filters W1 and W2.

(10)

(11)

For simplicity, assume that y = [q̃, q̃̇, q̃̈]^T and y2 = y{3}, i.e. every third element of the array y.

For system identification, suppose a plant’s input and output values at discrete times are given, respectively, by

where ui ∈ R^m and yi ∈ R^l, with m and l being the numbers of plant input and output signals, respectively. The Hankel matrices for the past and future inputs are written as

Similarly, the Hankel matrices for the past and future outputs can be written as Yp ∈ R^(il×j) and Yf ∈ R^(il×j), respectively. The Hankel matrix of past outputs and inputs, Wp, can be defined as follows:

The linear least squares predictor of Yf, given Wp and Uf, can be written as a Frobenius norm minimization as follows:

where the subspace orthogonal projections, Lw and Lu, are calculated as (12)


where † denotes the pseudo-inverse. This solution assumes that the problem is overconstrained, i.e. there are more independent equations than unknowns. If the problem is underconstrained, the pseudo-inverse cannot be computed. Future outputs can now be predicted from the past inputs, past outputs, and future inputs.

(13)
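A rough numerical sketch of this predictor construction (block Hankel matrices, the least-squares projections Lw and Lu of (12), and the prediction of (13)) might look as follows; the horizon, the toy data and the variable names are illustrative assumptions only.

```python
import numpy as np

def block_hankel(signal, i, j):
    """signal: array of shape (N, dim); returns an (i*dim, j) block Hankel matrix."""
    dim = signal.shape[1]
    H = np.zeros((i * dim, j))
    for r in range(i):
        H[r * dim:(r + 1) * dim, :] = signal[r:r + j, :].T
    return H

def subspace_predictor(u, y, i):
    N = u.shape[0]
    j = N - 2 * i + 1
    Up, Uf = block_hankel(u[:N - i], i, j), block_hankel(u[i:], i, j)
    Yp, Yf = block_hankel(y[:N - i], i, j), block_hankel(y[i:], i, j)
    Wp = np.vstack([Up, Yp])                      # past inputs and outputs
    L = Yf @ np.linalg.pinv(np.vstack([Wp, Uf]))  # [Lw Lu], equation (12)
    Lw, Lu = L[:, :Wp.shape[0]], L[:, Wp.shape[0]:]
    Yf_hat = Lw @ Wp + Lu @ Uf                    # prediction, equation (13)
    return Lw, Lu, Yf_hat

u = np.random.randn(200, 1)
y = np.convolve(u[:, 0], [0.5, 0.3, 0.1])[:200].reshape(-1, 1)  # toy plant data
Lw, Lu, Yf_hat = subspace_predictor(u, y, i=10)
```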

In order to calculate Lw and Lu, matrix decomposition methods are used. Using the QR method, if

then

(14)

The pseudo-inverse is normally calculated through singular value decomposition (SVD), but Woodley et al. presented another method which employs the Cholesky factorization instead of SVD (Woodley et al. (2001b)); this is computationally faster and requires less memory. Using the strictly causal estimates of y1 and y2 from equation (13), we get

(15)

(16)

Here ŷ1 is the estimated value of the end effector position in the Cartesian coordinates and ŷ2 is the estimated value of the joint angular velocities. From equations (7) to (16), we get

(17) where

(18)

Substituting equation (7) into (6) and (17) produces the objective

(19)

where W is given as follows:

Differentiating (19) with respect to [r u]^T and equating the result to zero produces the following:

(20)

The linear system in (20) can be rearranged as follows:

(21)

Differentiating once again with respect to [r u]^T suggests the following:

(22)

Schur decomposition offers the following definition:

(23)

Since Q1 = Q1^T and Q1 is positive definite, it can be concluded that A3 > 0, which satisfies the saddle condition (Woodley (2001)). As A3 ∈ R^(im×im) and r ∈ R^(im), the condition for the worst-case input reference signal can be stipulated by the following inequality:

(24)

Matching the definitions in (23) and (24) to the mathematical aspects of the model-free control introduced above, the following can be stated:

(25)

which can be written as:

(26)

To calculate the optimum controller outputs, multiply (21) by [0 I]:

(27)

4.0.1 Uncertainties

For a robust system, it is important that uncertainties are accounted for. Most uncertainties in a plant are hard to model. Figures 4 and 5 show the general layout of plant models with uncertainties in multiplicative and additive configurations, respectively. Woodley calculated γmin for different configurations of uncertainties in model-free control designs (Woodley (2001)). The real challenge, however, is to find the uncertainty block Δ through techniques like model unfalsification; a truly robust system calculates Δ in realtime.

Figure 4. Plant with multiplicative uncertainties for robust H∞ control design

5. Simulation

For a complete identification, 13n parameters have to be identified (Khalil and Dombre (2004)). In the model free framework, these parameters are not available explicitly. The prediction horizon should be two to three times the expected order of the system. From this rough estimate, a prediction horizon of 30 is selected for a two-joint planar manipulator. On one of the coordinates a square signal is given, and on the other one a sinusoidal signal. The response of the end effector along with the performance objective γ is given in Figure 6. One of the benefits of using subspace identification is the property of the Hankel matrix that allows data from a previous session to be concatenated.
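A short sketch of this simulation set-up, with assumed amplitudes, periods and sampling time (only the horizon of 30 and the square/sinusoid choice come from the text):

```python
import numpy as np

i = 30                                     # prediction horizon
dt = 0.01
t = np.arange(0.0, 20.0, dt)
x_ref = np.vstack([
    0.5 * np.sign(np.sin(2 * np.pi * t / 5.0)),   # square signal on one coordinate
    0.3 * np.sin(2 * np.pi * t / 3.0),            # sinusoid on the other coordinate
])
```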

Figure 5. Plant with additive uncertainties for robust H∞ control design

6. Conclusion

The proposed framework provides a solution to inverse dynamics, parameter identification and robust control of mechanical manipulators in an elegant way. The fastest way to calculate the pseudo-inverse is through Cholesky/SVD factorization (Golub & Loan (1996)); its complexity is O(ij + i³), where i is the prediction horizon and j is the number of prediction problems in the Hankel matrix. The computational burden becomes significant when a robot has a large number of links and six degrees of freedom, i.e. i = 13ln, where l is an arbitrary natural number from 1 to 5 acting as a safety factor in prediction and n is the number of joints of the manipulator. The number of calculations required to build a predictor for a six-joint robot with six degrees of freedom and a safety margin of 2 is i³ = (13 · 2 · 6)³ = 156³ = 3 796 416. This number is not so big for modern computers, but increasing the number of joints will rapidly increase the complexity.

Figure 6. Response of a planar robot with two rotary joints: reference signals vs. plant outputs, and γ as a function of time. In this particular experiment, Wp is initialized with a null matrix. The performance is given by γ, which converges to a constant value when the input is consistent.

7. References

Agnoloni, T. and Mosca, E. (2003). Controller falsification based on multiple models. International Journal of Adaptive Control and Signal Processing, 17:163–177.
Arimoto, S. (1995). Fundamental problems of robot control: Part I, innovations in the realm of robot servo-loops. Robotica, 13:19–27.
Athans, M. (1999). Crisp control is always better than fuzzy feedback control. EUFIT '99 debate with Prof. L.A. Zadeh, Aachen, Germany.
Boyd, J. E. and Little, J. J. (2000). Phase in model-free perception of gait. In HUMO '00: Proceedings of the Workshop on Human Motion (HUMO'00), page 3, Washington, DC, USA. IEEE Computer Society.
Caballero, R., Armada, M. A., and Akinfiev, T. (2004). Robust cascade controller for nonlinearly actuated biped robots: experimental evaluation. International Journal of Robotics Research, 23(10/11):1075–1095.
Cabral, F. B. and Safonov, M. G. (2004). Unfalsified model reference adaptive control using the ellipsoid algorithm. International Journal of Adaptive Control and Signal Processing, 18(8):683–696.
Cheng, G. (2004). Model-free adaptive (MFA) control. Computing & Control Engineering Journal, 15(3):28–33.
Chiuso, A. and Picci, G. (2005). Consistency analysis of some closed-loop subspace identification methods. Automatica, 41(3):377–391.
Chou, C. T. and Verhaegen, M. (1997). Subspace algorithms for the identification of multivariable dynamic errors-in-variables models. Automatica, 33:1857–1869.
Date, P. and Lanzon, A. (2004). A combined iterative scheme for identification and control redesigns. International Journal of Adaptive Control and Signal Processing, 18(8):629–644.
Favoreel, W., Moor, B. D., Gevers, M., and Overschee, P. V. (1998). Model-free subspace-based LQG-design. Technical report, Katholieke Universiteit Leuven.
Favoreel, W., Moor, B. D., Gevers, M., and Overschee, P. V. (1999a). Closed loop model-free subspace-based LQG-design. In Proceedings of the IEEE Mediterranean Conference on Control and Automation, Haifa, Israel.
Favoreel, W., Moor, B. D., and Overschee, P. V. (1999b). Model-free subspace-based LQG-design. In Proceedings of the American Control Conference, pages 3372–3376.
Golub, G. H. and Loan, C. F. V. (1996). Matrix Computations. The Johns Hopkins University Press.
Grimble, M. J. (1998). Multi-step generalized predictive control. Dynamics and Control, 8(4):303–339.
Haddad, W. M., Hayakawa, T., and Bailey, J. M. (2003). Adaptive control for nonnegative and compartmental dynamical systems with applications to general anesthesia. International Journal of Adaptive Control and Signal Processing, 17(3):209–235.
Hayakawa, T., Haddad, W. M., and Leonessa, A. (2004). A Lyapunov-based adaptive control framework for discrete-time non-linear systems with exogenous disturbances. International Journal of Control, 77(3):250–263.
Hsu, E., Pulli, K., and Popović, J. (2005). Style translation for human motion. ACM Transactions on Graphics (TOG), 24(3):1082–1089.


Kelly, R. (1993). Comments on "Adaptive PD controller for robot manipulators". IEEE Trans. Robot. Automat., 9:117–119.
Kelly, R. (1997). PD control with desired gravity compensation of robotic manipulators: A review. Int. J. Robot. Res., 16(5):660–672.
Khalil, W. and Dombre, E. (2004). Modeling, Identification and Control of Robots. Kogan Page Science.
Kim, D., Kim, N.-H., Seo, S.-J., and Park, G.-T. (2004). Fuzzy Modeling of Zero Moment Point Trajectory for a Biped Walking Robot. Lecture Notes in Computer Science, vol. 3214. Springer-Verlag GmbH.
Kim, J. O., Lee, B. R., Chung, C. H., Hwang, J., and Lee, W. (2003). The Inductive Inverse Kinematics Algorithm to Manipulate the Posture of an Articulated Body. Lecture Notes in Computer Science, vol. 2657. Springer-Verlag GmbH.
Kosut, R. L. and Anderson, B. D. O. (1997). Uncertainty model unfalsification. In Proceedings of the 36th IEEE Conference on Decision and Control, volume 1, pages 163–168, San Diego, CA.
Laib, A. (2000). Adaptive output regulation of robot manipulators under actuator constraints. IEEE Trans. Robot. Automat., 16:29–35.
Lin, W., Qin, S. J., and Ljung, L. (2004). A framework for closed-loop subspace identification with innovation estimation. Technical Report 2004-07, Department of Chemical Engineering, The University of Texas at Austin, Austin, TX, USA and Linköping University, Linköping, Sweden.
Ljung, L. (1999). System Identification: Theory for the User. Prentice-Hall, Upper Saddle River, NJ, USA.
Lohar, F. A. (2000). H∞ and μ-synthesis for full control of helicopter in hover. In 38th Aerospace Sciences Meeting and Exhibit, Reno, NV. American Institute of Aeronautics and Astronautics.
Norton, J. P. (1986). Introduction to Identification. Academic Press.
Overschee, P. V. and Moor, B. D. (1996). Subspace Identification for Linear Systems. Kluwer Academic Publishers.
Safonov, M. G. (2003). Recent advances in robust control theory. In AIAA Guidance, Navigation and Control Conference and Exhibit, volume 4, Austin, TX.
Sapio, V. D. and Khatib, O. (2005). Operational space control of multibody systems with explicit holonomic constraints. In Proceedings of the 2005 IEEE International Conference on Robotics and Automation.
Sayyar-Rodsari, B., How, J. P., Hassibi, B., and Carrier, A. (1998). An optimal alternative to the FxLMS algorithm. In Proceedings of the American Control Conference, pages 1116–1121.
Sciavicco, L. and Siciliano, B. (2000). Modelling and Control of Robot Manipulators. Springer, 2nd edition.
Takegaki, M. and Arimoto, S. (1981). A new feedback method for dynamic control of manipulators. ASME J. Dyn. Syst., Meas., Control, 102:119–125.
Tomei, P. (1991). Adaptive PD controller for robot manipulators. IEEE Trans. Robot. Automat., 7:565–570.


Tsao, T.-C., Brozenec, T., and Safonov, M. G. (2003). Unfalsified adaptive spacecraft attitude control. In AIAA Guidance, Navigation, and Control Conference and Exhibit, Austin, TX.
Tsao, T.-C. and Safonov, M. G. (2001). Unfalsified direct adaptive control of a two-link robot arm. International Journal of Adaptive Control and Signal Processing, 15(3):319–334.
Tse, J., Bentsman, J., and Miller, N. (1993). Properties of the self-tuning minimax predictive control (MPC). In Proceedings of the 1993 American Control Conference, pages 1721–1725.
Wang, J. and Qin, S. J. (2004). A new deterministic-stochastic subspace identification method using principal component analysis. Technical report, Department of Chemical Engineering, The University of Texas at Austin.
Wang, R., Stefanovic, M., and Safonov, M. G. (2004). Unfalsified direct adaptive control using multiple controllers. In AIAA Guidance, Navigation, and Control Conference and Exhibit, pages 1–16, RI, USA.
Wen, J., Kreutz-Delgado, K., and Bayard, D. (1992). Lyapunov function-based control laws for revolute robot arms. IEEE Trans. Automat. Contr., 37:231–237.
Wernholt, E. (2004). On Multivariable and Nonlinear Identification of Industrial Robots. PhD thesis, Department of Electrical Engineering, Linköping University, Linköping, Sweden.
Widrow, B. and Stearns, S. D. (1985). Adaptive Signal Processing. Prentice-Hall.
Widrow, B. and Walach, E. (1994). Adaptive Inverse Control. Prentice-Hall.
Wilson, E. (1997). Adaptive profile optimization for the electric arc furnace. In Steel Technology International, pages 140–145.
Wolkotte, P. T. (2003). Modelling human locomotion. Technical report, Institute of Electronic Systems, Aalborg University.
Woodley, B., Kosut, R., and How, J. (1998). Uncertainty model unfalsification with simulation. In Proceedings of the 1998 American Control Conference, volume 5, pages 2754–2755.
Woodley, B., How, J., and Kosut, R. (1999). Direct unfalsified controller design - solution via convex optimization. In Proceedings of the 1999 American Control Conference, volume 5, pages 3302–3306.
Woodley, B., How, J., and Kosut, R. (2001a). Model free subspace based H∞ control. In Proceedings of the 2001 American Control Conference, volume 4, pages 2712–2717.
Woodley, B. R. (2001). Model free subspace based H∞ control. PhD thesis, Department of Electrical Engineering, Stanford University.
Woodley, B. R., How, J. P., and Kosut, R. L. (2001b). Subspace based direct adaptive H∞ control. International Journal of Adaptive Control and Signal Processing, 15(5):535–561.
Xie, M. (2003). Fundamentals of Robotics, volume 54 of Machine Perception and Artificial Intelligence. World Scientific.
Zhang, Y., Tian, H., Wang, Q., and Qiang, W. (2000). Servo control in joint space of biped robot using nonlinear H∞ strategy. In Jiang, D. and Wang, A., editors, Proceedings of SPIE, International Conference on Sensors and Control Techniques (ICSC 2000), volume 4077, pages 386–391.


Zhao, H. and Bentsman, J. (1999). Multivariable H∞ predictive control based on minimax predictor. In Proceedings of the IEEE Conference on Decision and Control, volume 4, pages 3699–3705.

24
The Verification of Temporal KBS: SPARSE - A Case Study in Power Systems

Jorge Santos1, Zita Vale2, Carlos Serôdio3 and Carlos Ramos1

1Departamento de Engenharia Informática, Instituto Superior de Engenharia do Porto,
2Departamento de Engenharia Electrotécnica, Instituto Superior de Engenharia do Porto,
3Departamento de Engenharias, Universidade de Trás-os-Montes e Alto Douro,
Portugal

1. Introduction
Although humans present a natural ability to deal with knowledge about time and events, the codification and use of such knowledge in information systems still pose many problems. Hence, the development of applications strongly based on temporal reasoning remains a hard and complex task. Furthermore, despite the significant developments in the temporal reasoning and representation (TRR) area, there is still a considerable gap regarding its successful use in practical applications. In this chapter we present VERITAS, a verification tool that supports knowledge maintenance, one of the most important processes during the development of knowledge based systems (KBS). The verification and validation (V&V) process is part of a wider process denominated knowledge maintenance (Menzies 1998), in which an enterprise systematically gathers, organizes, shares, and analyzes knowledge to accomplish its goals and mission. The V&V process states whether the software requirements specifications have been correctly and completely fulfilled. The methodologies proposed in software engineering have proved to be inadequate for Knowledge Based Systems (KBS) validation and verification, since KBS present some particular characteristics. VERITAS is an automatic tool developed for KBS verification which is able to detect a large number of knowledge anomalies. It addresses many relevant aspects encountered in real applications, like the usage of rule triggering selection mechanisms and temporal reasoning.
The rest of the chapter is structured as follows. Section 2 provides a short overview of the state of the art of V&V and its most important concepts and techniques. After that, section 3 describes SPARSE, a KBS used to assist the operators of the Portuguese Transmission Control Centres in incident analysis and power restoration. Special attention is given to SPARSE's particular characteristics, introducing the problem of verifying real-world applications. Section 4 presents VERITAS; special emphasis is given to the tool architecture and to the method used in anomaly detection. Finally, in section 5 the achieved results are discussed and in section 6 we present some conclusions and ideas for future work.


2. Verification and validation of KBS
Many authors have argued that the correct and efficient performance of any piece of software must be guaranteed through the verification and validation (V&V) process, and it becomes obvious that Knowledge Based Systems (KBS) should undergo the same evaluation process. Besides, it is known that knowledge maintenance is an essential issue for the success of a KBS, since it assures the consistency of the knowledge base (KB) after each modification, in order to avoid the assertion of knowledge inconsistencies. Unfortunately, the methodologies proposed in software engineering have proved to be inadequate for knowledge based systems validation and verification, since KBS present some particular characteristics (Gonzalez & Dankel 1993), namely: the need for KBS to deal with uncertainty and incompleteness; the modelled domains normally are not supported by underlying physical models; it is not rare for KBS to have the ability to learn and improve the KB, allowing a dynamic behaviour; and, in most domains of expertise, there is no concept of right results but only of acceptable ones.
Besides the facets of software certification and maintenance previously referred to, the systematic use of formal V&V techniques is also key for making end-users more confident about KBS, especially when critical applications are considered. In the Control Centres domain the V&V process intends to assure the reliability of the installed applications, even under incident conditions.
The problem of verification and validation appears when there is a need to assure that some model (solution) correctly addresses the problem, through adequate techniques and methodology, in order to provide the desired results. In the scope of this work, validation and verification are referred to as two complementary processes, both fundamental for KBS end-user acceptance. Although there is no general agreement on the V&V terminology (Hoppe & Meseguer 1991), the following definitions will be used in the scope of this chapter.
• Validation - Validation means building the right system (Boehm 1984). The purpose of validation is to assure that a KBS will provide solutions with a confidence level similar to (or higher than, if possible) the one provided by domain experts. Validation is then based on tests, desirably in the real environment and under real circumstances. During these tests, the KBS is considered as a black box, meaning that only the input and the output are really considered important;
• Verification - Verification means building the system right (Boehm 1984). The purpose of verification is to assure that a KBS has been correctly designed and implemented and does not contain technical errors. During the verification process the interior of the KBS is examined in order to find any possible errors; this approach is also called crystal box;
• Verification & Validation - The verification and validation process allows determining whether the requirements have been correctly and completely fulfilled in order to assure the system's reliability, safety, quality and efficiency. More synthetically, it can be said that the V&V process is to build the right system right (Preece 1998).
In the last decades, several techniques were proposed for the validation and verification of Knowledge Based Systems, like inspection, formal proof, cross-reference verification or empirical tests (Preece 1998). The efficiency of these techniques strongly depends on the existence of test cases or on the degree of formalization used in the specifications. One of the most widely used techniques is static verification, which consists of sets of logical tests executed in order to detect possible knowledge anomalies.


Anomaly – An anomaly is a symptom of one (or multiple) possible error(s). Notice that an anomaly does not necessarily denote an error (Preece & Shinghal 1994).
Rule bases are drawn up as a result of a knowledge analysis/elicitation process, including, for example, interviews with experts, the study of documents such as codes of practice and legal texts, or the analysis of typical sample cases. The rule base should reflect the nature of this process, meaning that if documentary sources are used, the rule base should reflect those knowledge sources. Consequently, some anomalies are desirable and intentionally inserted in the KB. For instance, redundancy in the documentary sources will lead to a redundant KB.
Rule based systems are still the most often used representation in the development of KBS. The scientific community has studied these systems deeply, and at the moment there is an assortment of V&V techniques that allow the detection of many anomalies in systems that use this kind of representation. Some well known V&V tools used different techniques to detect anomalies. The KB-Reducer (Ginsberg 1987) system represents rules in logical form and then computes for each hypothesis the corresponding labels, detecting the anomalies during the labelling process: each literal in the rule LHS (Left Hand Side) is replaced by the set of conditions that allows it to be inferred, and this process finishes when all formulas become grounded. COVER (Preece, Bell & Suen 1992) works in a similar fashion, using the ATMS (Assumption-based Truth Maintenance System) approach (Kleer 1986) and graph theory, allowing the detection of a large number of anomalies. COVADIS (Rousset 1988) successfully explored the relation between input and output sets. The ESC (Cragun & Steudel 1987), RCP (Suwa, Scott & Shortliffe 1982) and Check (Nguyen et al. 1987) systems, and more recently PROLOGA (Vanthienen, Mues & Wets 1997), used decision table methods for verification purposes. This approach proved to be quite interesting, especially when the systems to be verified also use decision tables as representation support. The major advantage of these systems is that they enable tracing the reasoning path quite clearly, while the major problem is the lack of solutions for verifying long inference chains. Some authors studied the applicability of Petri nets (Pipard 1989; Nazareth 1993) to represent the rule base and to detect knowledge inconsistencies; more recently, coloured Petri nets were used (Wu & Lee 1997). Although specific knowledge representations provide higher efficiency when used to perform some verification tests, arguably all of them could be successfully converted into production rules.

3. The case study: SPARSE
Control Centres (CC) are very important in the operation of electrical networks, since they receive real-time information about the network status. CC operators should take, usually in a short time, the most appropriate actions in order to reach the maximum network performance. Under incident conditions, a huge volume of information may arrive at these centres, and its correct and efficient interpretation by a human operator becomes almost impossible. In order to solve this problem, some years ago electrical utilities began to install intelligent applications in their control centres (Amelink, Forte & Guberman 1986; Kirschen & Wollenberg 1992). These applications are usually KBS and are mainly intended to provide operators with assistance, especially under critical situations.
3.1 Architecture and functioning
SPARSE (Vale et al. 1997) is a KBS used in the Portuguese Transmission Network (REN) for incident analysis and power restoration. In the beginning it started out as an expert system


(ES) developed for the Portuguese Transmission Network (REN) Control Centres. The main goals of this ES were to assist Control Centre operators in incident analysis, allowing a faster power restoration. Later, the system evolved to a more complex architecture (Vale et al. 2002), which is normally referred to as a Knowledge Based System (see Fig. 1).

Fig. 1 - SPARSE Architecture

SPARSE includes many modules, namely for learning and automatic data acquisition (Duarte et al. 2001), adaptive tutoring (Faria et al. 2002) and automatic explanations (Malheiro et al. 1999). As it happens in the majority of KBSs, one of the most important SPARSE components is the knowledge base (KB) (see formula (1)):

KB = RB ∪ FB ∪ MRB   (1)

where:
• RB stands for rule base;
• FB stands for facts base;
• MRB stands for meta-rules base.
The rule base is a set of clauses with the following structure:
RULE ID: 'Description':
  [[C1 AND C2 AND C3] OR [C4 AND C5]]
  ==> [A1,A2,A3].

The rule's Left Hand Side (LHS) is a set of conditions (C1 to C5 in this example) of the following types:
• A fact, representing domain events or status messages. Typically these facts are time-tagged;


• A temporal condition over facts;
• Previously asserted conclusions.
The rule's Right Hand Side (RHS) is a set of actions/conclusions to be taken (A1 to A3 in this example) and may be of one of the following types:
• Assertion of facts representing conclusions to be inserted in the knowledge base. A conclusion can be final (e.g., a diagnosis) or intermediate (e.g., a status fact that will later be used in the LHS of another rule);
• Retraction of facts (conclusions to be deleted from the knowledge base);
• Interaction with the user interface.
Let's consider the rule d3 as an example of a SPARSE rule:
rule d3 : 'Monophasic Triggering':
  [[ msg(Dt1,Tm1,[In1,Pn1,[In2,NL]],'>>>TRIGGERING','01') at T1 and
     breaker(_,_,In1,Pn1,_,_,closed) and
     msg(Dt2,Tm2,[In1,Pn1,[In2,NL, _ ]],'BREAKER','00') at T2 and
     condition(abs_diff_less_or_equal(T2,T1,30)) ]
   or
   [ msg(Dt1,Tm1,[In1,Pn1,[In2,NL]],'>>>TRIGGERING','01') at T1 and
     msg(Dt2,Tm2,[In1,Pn1,[In2,NL]],'BREAKER','00') at T2 and
     condition(abs_diff_less_or_equal(T2,T1,30)) ]]
  ==>
  [ assert(triggering(Dt1,Tm1,In1,Pn1,In2,NL,monophasic,not_identified,T2),T1),
    retract(breaker(_,_,In1,Pn1,_,_,closed),_,T2),
    assert(breaker(Dt2,Tm2,In1,Pn1,_,triggering,mov),T2),
    retract(msg(Dt1,Tm1,[In1,Pn1,[In2,NL]],'>>>TRIGGERING','01'),T1,T1),
    retract(msg(Dt2,Tm2,[In1,Pn1,[In2,NL | _ ]],'BREAKER','00'),T2,T2),
    retract(breaker_opened(In1,Pn1),_,T1),
    assert(breaker_opened(In1,Pn1),T1) ].
The meta-rule base is a set of triggers, used by the rule selection mechanism, with the following structure:

trigger( Fact, [( R1, TB1, TE1 ), … , ( Rn, TBn, TEn )])   (2)

standing for:
• Fact - the arriving fact (external alarm or a previously inferred conclusion);
• (Ri, TBi, TEi) - the temporal window where the rule Ri can be triggered. TBi is the delay time before rule triggering, used to wait for the remaining facts needed to define an event, and TEi is the maximum time for trying to trigger the rule Ri.


The inference process relies on the cycle depicted in Fig. 2. In the first step, SPARSE collects a message (represented as a fact in the SPARSE scope) from SCADA1, then the respective trigger is selected and some rules are scheduled. The scheduler selects the next rule to be tested (the inference engine tries to prove its veracity). Notice that, when a rule succeeds, the conclusions (on the RHS) will be asserted and later processed in the same way as the SCADA messages.

Fig. 2 - SPARSE main algorithm (flowchart: select fact from the Fact Base; select meta-rule from the Meta-rules Base; schedule rules from the Rules Base; test rule; assert intermediate conclusions back into the cycle; produce a report when a final conclusion is reached; iterate until finished)

Let's consider the following meta-rule that allows scheduling the rule d3:
trigger(msg(_,_,[Inst1,Painel1,_],'>>>TRIGGERING','01'),
        [(d1,30,50),(d2,31,51),(d3,52,52)] ).

1 Supervisory Control And Data Acquisition: this system collects messages from the mechanical/electrical devices installed in the network.
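As a rough illustration of how such a trigger entry drives scheduling, the following Prolog sketch schedules every rule listed in a trigger/2 entry for its temporal window relative to the arrival instant of the triggering fact (schedule_rules/2 and scheduled/4 are hypothetical helpers written for this chapter, not SPARSE code):

:- dynamic scheduled/4.

% When Fact arrives at instant T, each rule Ri of the matching trigger/2 entry
% becomes testable in the window [T+TBi, T+TEi].
schedule_rules(Fact, T) :-
    trigger(Fact, Windows),
    forall(member((Rule, TB, TE), Windows),
           ( Begin is T + TB,
             End is T + TE,
             assertz(scheduled(Rule, Fact, Begin, End)) )).

With the trigger shown above and a '>>>TRIGGERING' message arriving at instant 100, d1 would become testable in [130,150], d2 in [131,151] and d3 only at instant 152.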


The use of the rule selection mechanism allows configuring a heuristic approach with the following characteristics:
• Problem space reduction – it assures that only the set of rules related to the arriving fact will be used and tested;
• From specific to general rules – the temporal windows allow defining an explicit order, so the system usually proceeds from the most specific to the most general rules.
In what concerns SPARSE, there were clearly two main reasons to start its verification. First, the SPARSE development team carried out a set of tests based on previously collected real cases and some simulated ones. Despite the importance of these tests for the final product acceptance, the major criticism that could be pointed at this technique is that it only assures the correct performance of SPARSE under the tested scenarios. Moreover, the tests performed during the validation phase, namely the field tests, were very expensive, since they required the assignment of substantial technical personnel and physical resources (e.g., transmission lines and coordination staff). Obviously, it would be unacceptable to perform those tests after each KB update. Under these circumstances, an automatic verification tool could offer an easy and inexpensive way of assuring knowledge quality maintenance, namely the consistency and completeness of the represented knowledge.
3.2 The verification problem
The verification problem based on anomaly detection usually relies on the calculation of all possible inference chains that could be entailed during the reasoning process. Later, some logical tests are performed in order to detect whether any constraint violation takes place. SPARSE presents some features that make the verification work harder. These features demand the use of more complex techniques during anomaly detection and introduce significant changes in the number and type of anomalies to detect. The following ones are the most important:
• Rule triggering selection mechanism - In SPARSE, this mechanism was implemented using both meta-rules and the inference engine. As for the verification work, this mechanism not only avoids some run-time errors (for instance circular chains) but also introduces another complexity axis to the verification, since it constrains the existence of inference chains and also the order in which they would be generated. For instance, during system execution, the inference engine can assure that shortcuts (specialist rules) are preferred over generic rules;
• Temporal reasoning - This issue received large attention from the scientific community in the last two decades (surveys covering this issue can be found in (Gerevini 1997; Fisher, Gabbay & Vila 2005)). Although time is ubiquitous in society, and despite the natural ability that human beings show in dealing with it, its widespread representation and usage in the artificial intelligence domain remains scarce due to many philosophical and technical obstacles. SPARSE is an alarm processing application and its major challenge is to reason about events. Therefore, it is necessary to deal with time intervals (e.g., temporal windows of validity), points (e.g., instantaneous event occurrences), alarm order, duration and the presence and/or absence of data (e.g., messages lost in the collection and/or transmission system);

• Variables evaluation - In order to obtain comprehensive and correct results during the verification process, the evaluation of the variables present in the rules is crucial, especially in what concerns temporal variables, i.e., the ones that represent temporal concepts. Notice that during anomaly detection (this type of verification is also called static verification) it is not possible to predict the exact value that a variable will have;
• Knowledge versus procedure - Languages like Prolog provide powerful features for knowledge representation (in the declarative way) but they are also suited to describing procedures, so knowledge engineers sometimes encode rules using procedural predicates. For instance, the Prolog goal min(X,Y,Min) calls a procedure that compares X and Y and instantiates Min with the smaller value. Thus, it is not a (pure) knowledge item; in terms of verification it has to be evaluated in order to obtain the value of Min (see the sketch below). This means that the verification method needs to consider not only the programming language syntax but also the meaning (semantics) in order to evaluate such calls. This step is particularly important for any variables that are updated during the inference process.
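For illustration, min/3 as used above could be defined in Prolog as follows (a minimal sketch written for this chapter, not the actual SPARSE/VERITAS code):

% Instantiates Min with the smaller of X and Y.
min(X, Y, X) :- X =< Y.
min(X, Y, Y) :- X > Y.

A static verifier that merely matched min(X,Y,Min) syntactically would leave Min unconstrained; evaluating the predicate's semantics is what allows the verifier to propagate the value of Min along the inference chain.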

4. The verification tool: VERITAS
VERITAS is an automatic tool developed for KBS verification. This tool performs a structural analysis of the KB, allowing the detection of knowledge anomalies. Originally, VERITAS used a non-temporal KB verification approach. Although it proved to be very efficient in the verification of other KBS (an expert system for cardiac diseases diagnosis (Rocha 1990) and another expert system for otology diseases diagnosis and therapy (Sampaio 1996)), in the SPARSE case some important limitations were detected.
4.1 Main process
The VERITAS main process, depicted in Fig. 3, relies on a set of modules that provide the following competences: converting the original knowledge base; creating the internal knowledge base; computing the rule expansions and detecting anomalies; producing readable results. The user can interact with all stages of the verification process.
The conversion module translates the original knowledge base files into a set of new ones containing the same data but represented in an independent format recognized by VERITAS. For this module to function, a set of conversion rules is also needed, specifying the translation procedure. This module assures the independence of VERITAS from the format and syntax used in the specification of the KB to be verified. The conversion operations largely depend on the original format, although the most common ones are:
• If the LHS is a disjunctive form, a distinct rule is created for each conjunction contained in the LHS (see the sketch after the listing below). The rule RHS remains the same;
• Original symbols, like logical operators, are replaced by others according to the internal notation;
• The variables are replaced by internal symbols and a table is created in order to store these symbols.
During the conversion step, the rule d3 (previously presented) would be transformed into two rules, d3-L1 and d3-L2, since its LHS contains two conjunctions. The tuple cvr/3 stores the intermediate rule d3-L1 presented after Fig. 3.


Fig. 3 - VERITAS main algorithm

cvr( d3-L1, Monophasic Triggering,
  [ msg(#Dt1,#Tm1,[#In1,#Pn1,[#In2,#NL]],>>>TRIGGERING,01) at #T1,
    breaker(#_,#_,#In1,#Pn1,#_,#_, closed),
    msg(#Dt2,#Tm2,[#In1,#Pn1,[#In2,#NL,#_]],BREAKER,00) at #T2,
    abs_diff_less_or_equal(#T2,#T1,30) ],
  [ assert(triggering(#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,monophasic,not_identified,#T2),#T1),
    retract(breaker(#_,#_,#In1,#Pn1,#_,#_,fechado),#_,#T2),
    assert(breaker(#Dt2,#Tm2,#In1,#Pn1,#_,triggering,mov),#T2),
    retract(msg(#Dt1,#Tm1,[#In1,#Pn1,[#In2,#NL]],>>>TRIGGERING,01),#T1,#T1),
    retract(msg(#Dt2,#Tm2,[#In1,#Pn1,[#In2,#NL|#_]],BREAKER,00),#T2,#T2),
    retract(breaker_opened(#In1,#Pn1),#_,#T1),
    assert(breaker_opened(#In1,#Pn1),#T1) ]).
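The LHS-splitting step mentioned in the first conversion bullet can be pictured with the following Prolog sketch (split_rule/1 and converted/4 are hypothetical names introduced here for illustration only; the real converter also renames variables and operators):

:- dynamic converted/4.

% For a rule whose LHS is a list of conjunctions (a disjunctive form), assert one
% converted rule per conjunction, suffixing the identifier with -L1, -L2, ...
split_rule(rule(Id, Descr, Disjuncts, RHS)) :-
    forall(nth1(N, Disjuncts, Conjunction),
           ( atomic_list_concat([Id, '-L', N], NewId),
             assertz(converted(NewId, Descr, Conjunction, RHS)) )).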


After the conversion step, the internal knowledge base is created (see section 4.2). This KB stores information about the following issues:
• The original rules and meta-rules, represented according to a structure that allows speeding up the expansion calculation;
• A set of restrictions over the knowledge domain modelled in the KB; this set can be created manually or semi-automatically;
• A characterization of the data items extracted from the original KB.
In the following step, anomaly detection (see section 4.3), the module responsible for this task examines the KB in order to calculate every possible inference that could be entailed during KBS functioning, and then detects whether any constraint (logical or semantic) is violated. Finally, the detected anomalies are reported in a way suitable for human analysis.
4.2 Knowledge base
The knowledge base schema was designed with the need to speed up the expansion calculation in mind, since this is one of the most time consuming steps in the verification process. Fig. 4 depicts the concepts and relations contained in the schema.

Fig. 4 - Knowledge Base Schema

This schema makes it possible to store the rules and meta-rules in an efficient way and to classify the elements that compose them. The tuples item/1, type/2 and literal/3 classify the elements (named literals in the VERITAS scope) that compose both


rule LHSs and RHSs. The tuples literalArgs/3 and matchLiteral/3 represent, respectively, the arguments of each literal and the pairs of literals that can be matched during expansion calculation. The tuple rule/3 relates the old and new rule representations, while lhs/7 and rhs/8 store the literals that compose each rule LHS and RHS, respectively. Finally, metaRule/7 stores the information related to the original meta-rules.

item( F / A )   (3)

The tuple item/1 stores the functor and arity of each literal appearing in the rule LHSs and RHSs contained in the RB. For the rule d3-L1, the following items would be asserted:
item(abs_diff_less_or_equal/3).
item(breaker_opened/2).
item(triggering/9).
item(breaker/7).
item(msg/5).

type( F / A , Type )   (4)

The tuple type/2 allows the characterization of the items previously extracted and stored in item/1. The classification can be done manually or semi-automatically. The field Type can exhibit one of the following values:
• interface – an operation for user interface;
• status – state of a knowledge domain element (e.g., electrical devices);
• process – used to define eventualities with duration (non-instantaneous);
• event – used to define instantaneous eventualities;
• time – temporal operator used to reason about time;
• operation – used to define "procedural" operations like comparisons.
The following tuples type/2 would be created for the considered rule d3-L1:
type(abs_diff_less_or_equal/3,time).
type(breaker_opened/2,status).
type(triggering/9,status).
type(breaker/7,status).
type(msg/5,event).

literal( Literal , Type , F / A )   (5)

The tuple literal/3 stores the synthesis of type/2 and item/1. Besides, it allows labelling the literals with a key (Literal) by which they will be referred to during the verification process. According to the example, the following instances of literal/3 would be created:
literal(tr1,time,abs_diff_less_or_equal/3).
literal(st5,status,breaker_opened/2).
literal(st9,status,triggering/9).
literal(st11,status,breaker/7).
literal(ev1,event,msg/5).

literalArgs( Literal , Index , Args )   (6)


During the expansion calculation, VERITAS replaces each literal X contained in a rule LHS by the literals that compose the LHS of a rule that allows inferring a literal Y, whenever Y matches X. This process effectively simulates the inference engine using backward chaining. During KBS usage the process of matching literals is straightforward, since the variable values are known; this is not the case in the verification process, so the number of expansions (possible inference chains) that the system needs to calculate grows exponentially. Additionally, during the expansion calculation each pair of literals would need to be checked repeatedly; in order to avoid this, VERITAS computes a priori the pairs of literals that can be matched. Hence, the tuple literalArgs/3 stores the distinct occurrences of similar literals. Two literals are similar if they exhibit the same functor and arity but their respective lists of arguments are not similar. Two lists of arguments are similar if they have the same size and for each element in the corresponding position one of the following situations happens:
• The argument is a free variable, meaning it can be matched with everything;
• The argument is a terminal (not a free variable), and in this case both arguments need to exhibit the same value.
Considering the literal st9, the following instances of literalArgs/3 would be created (a sketch of the similarity test follows the listing):
literalArgs(st9,1,[#Dt,#Tm,#In1,#Pn1,#In2,#NL,#Type,not_identified,#T2]).
literalArgs(st9,2,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,monophasic,not_identified,#Tabr]).
literalArgs(st9,3,[#Dt2,#Tm2,#In1,#Pn1,#In2,#NL,triphasic,dtd,#Tabr2]).
literalArgs(st9,4,[#Dt2,#Tm2,#In1,#Pn1,#In2,#NL,monophasic,dmd,#Tabr2]).
literalArgs(st9,5,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,triphasic,rel_rap_trif,#Tabr1]).
literalArgs(st9,6,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,triphasic,dtr,#Tabr1]).
literalArgs(st9,7,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,monophasic,rel_rap_mono,#Tabr1]).
literalArgs(st9,8,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,monophasic,dmr,#Tabr1]).
literalArgs(st9,9,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,triphasic,ds,#Tabr]).
literalArgs(st9,10,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,#_,#TriggeringType,#Tabr]).
literalArgs(st9,11,[#Dt2,#Tm2,#In1,#Pn1,#In2,#NL,triphasic,not_identified,#Tabr]).
literalArgs(st9,12,[#Dt2,#Tm2,#In1,#Pn1,#In2,#NL,triphasic,close_defect,#Tabr]).
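A minimal Prolog sketch of the argument-list similarity test just described (free_var/1 and args_match/2 are helpers written for this illustration, assuming the '#'-prefixed internal variable notation shown in the listings; nested argument lists are ignored for brevity):

% An internal symbol starting with '#' stands for a (free) variable.
free_var(Arg) :- atom(Arg), sub_atom(Arg, 0, 1, _, '#').

% Two argument lists are similar if they have the same length and, position by
% position, at least one side is a variable or both terminals are equal.
args_match([], []).
args_match([A|As], [B|Bs]) :-
    ( free_var(A) ; free_var(B) ; A == B ),
    args_match(As, Bs).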

matchLiteral( Literal , Index , Indexes )   (7)
The tuple matchLiteral/3 stores the possible matches between a literal, defined by the pair Literal/Index, and a list of indexes. Notice that the process used to determine the possible matches between literals is similar to determining a graph transitive closure, where each pair Literal/Index is a graph node and the list Indexes is the set of adjacent edges. In the considered example the following instances of matchLiteral/3 would be created:
matchLiteral(st9,1,[1,2,10,11]).
matchLiteral(st9,2,[1,2,10]).
matchLiteral(st9,3,[3,10]).
matchLiteral(st9,4,[4,10]).
matchLiteral(st9,5,[5,10]).
matchLiteral(st9,6,[6,10]).
matchLiteral(st9,7,[7,10]).
matchLiteral(st9,8,[8,10]).
matchLiteral(st9,9,[9,10]).
matchLiteral(st9,10,[1,2,3,4,5,6,7,8,9,10,11,12]).
matchLiteral(st9,11,[1,10,11]).
matchLiteral(st9,12,[10,12]).
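Building on the previous sketch, the precomputation of such entries could be pictured as follows (match_indexes/3 is a hypothetical helper, not the VERITAS implementation):

% Collect, for a given occurrence Literal/Index, all occurrence indexes of the
% same literal whose argument lists pass the args_match/2 test.
match_indexes(Literal, Index, Indexes) :-
    literalArgs(Literal, Index, Args),
    findall(I,
            ( literalArgs(Literal, I, OtherArgs),
              args_match(Args, OtherArgs) ),
            Indexes).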


rule( Rule , Description , OriginalRule )   (8)

The tuple rule/3 relates the new rules used by VERITAS to the original ones. This tuple is quite useful in the user interaction, since the user is more acquainted with the original rules. For the rule d3 the following instances of rule/3 would be created:
rule(d3-L1,Monophasic Triggering,d3).
rule(d3-L2,Monophasic Triggering,d3).

lhs( Rule , Literal , Index , Position , TemporalArg , LogicalValue , Args )   (9)

The tuple lhs/7 stores the set of conditions forming the rule LHS, i.e., the occurrence of a literal (Literal/Index) in a rule, together with its position, temporal label (TemporalArg), logical value (since a condition can be the negation of the referred literal) and arguments. Considering the rule d3-L1, the following instances of lhs/7 would be asserted:
lhs(d3-L1,ev1,8,1,#T1,true,[#Dt1,#Tm1,[#In1,#Pn1,[#In2,#NL]],>>>TRIGGERING,01]).
lhs(d3-L1,st11,8,2,none,true,[#_,#_,#In1,#Pn1,#_,#_,fechado]).
lhs(d3-L1,tr1,1,4,none,true,[#T2,#T1,30]).
lhs(d3-L1,ev1,7,3,#T2,true,[#Dt2,#Tm2,[#In1,#Pn1,[#In2,#NL,#_]],BREAKER,00]).

rhs( Rule , Literal , Index , Position , TemporalArg , LogicalValue , Args , Type )   (10)

The tuple rhs/8 stores the set of conclusions forming the rule RHS. This tuple has about the same structure as lhs/7 but adds the field Type, which can exhibit the following values:
• cf – assertion of a conclusion;
• rf – retraction of a fact;
• it – user interface operation.
Considering the rule d3-L1, the following instances of rhs/8 would be asserted:
rhs(d3-L1,st9,1,1,#T1,true,cf,[#Dt1,#Tm1,#In1,#Pn1,#In2,#NL,monophasic,not_identified,#T2]).
rhs(d3-L1,st11,8,2,#T2,true,rf,[#_,#_,#In1,#Pn1,#_,#_,closed]).
rhs(d3-L1,st11,4,3,#T2,true,cf,[#Dt2,#Tm2,#In1,#Pn1,#_,triggering,mov]).
rhs(d3-L1,ev1,8,4,#T1,true,rf,[#Dt1,#Tm1,[#In1,#Pn1,[#In2,#NL]],>>>TRIGGERING,01]).
rhs(d3-L1,ev1,9,5,#T2,true,rf,[#Dt2,#Tm2,[#In1,#Pn1,[#In2,#NL|#_]],BREAKER,00]).
rhs(d3-L1,st5,1,6,#T1,true,rf,[#In1,#Pn1]).
rhs(d3-L1,st5,1,7,#T1,true,cf,[#In1,#Pn1]).

metaRule( Rule , Literal , Index , Order , StartInstant , FinishInstant , Args )   (11)

The tuple metaRule/7 stores the information about the meta-rules used in the rule triggering selection mechanism. Hence, a rule is scheduled for the interval defined by StartInstant and FinishInstant if the literal defined by the pair Literal/Index is asserted in the KB. Concerning the tuple trigger/2 used by SPARSE (presented in section 3.1) and the rule d3, the following instances of metaRule/7 would be asserted:
metaRule(d3-L1,ev1,8,3,52,52,[#_,#_,[#Inst1,#Panel1,#_],>>>TRIGGERING,01]).
metaRule(d3-L2,ev1,8,4,52,52,[#_,#_,[#Inst1,#Panel1,#_],>>>TRIGGERING,01]).

link( RuleL , Literal , IndexL , PositionL , RuleR , IndexR , PositionR )   (12)


The tuple link/7 instantiates the tuple matchLiteral/3 for a specific knowledge base. While matchLiteral/3 states which pairs of literals can be matched, the tuple link/7 stores which pairs of literals present in the KB can actually be matched during the expansion calculation. Concerning the literal st9 and the rule d3-L1, the following instances of link/7 would be created:
link(d3-L1,st9,2,1,d22,10,1).
link(d3-L1,st9,2,1,d21,10,2).
link(d3-L1,st9,2,1,d21,10,1).
link(d3-L1,st9,2,1,d5,2,1).
link(d3-L1,st9,2,1,i1,1,1).

cst( Constraint , Literal1 , Index1 , Literal2 , Index2 )   (13)

The tuple cst/5 stores the impermissible sets for a knowledge base. Each occurrence of this tuple states that a pair of literals is mutually incompatible.
4.3 Anomaly detection
The algorithm used for anomaly detection, depicted in Fig. 5, works in the following way.

Fig. 5 - Anomaly Detection Algorithm


In the first step, the internal knowledge base is created (see section 4.2). After that, the expansions are calculated (see section 4.3.1); during this step the circular expansions are detected and labelled. In the next step, using both expansions and meta-rules, the temporal consistency of the expansions is evaluated (see section 4.3.2). Finally, specific algorithms are applied in order to detect the following anomalies: circularity (see section 4.3.3), ambivalence (see section 4.3.4) and redundancy (see section 4.3.5).
4.3.1 Expansions calculation
The expansion calculation process intends to thoroughly determine the inference chains that can possibly be drawn during KBS functioning. Calculating an expansion consists of a breadth-first search over a hypergraph, in which each hypernode is a set of literals and each transition represents a rule. The procedure calcExpansion works in the following way:


First, some variables are initialized, namely: listE (the list of literals to be expanded); rules (the list of rules used in an expansion); expansion (the list of the expanded literals). Then, for each literal contained in the KB, a rule that allows its inference is selected and the literals contained in its LHS are stored in listE. The algorithm finishes when listE becomes empty, meaning that all literals were expanded. The elements contained in listE are iteratively popped, and they can be of one of the following types:
• Not inferable – these literals are ground facts or basic operations;
• Inferable – these literals can be inferred during KBS functioning. If a literal is part of a circular inference chain, the algorithm labels it and the expansion for this literal finishes; otherwise the algorithm calculates the set of literals needed to infer it and pushes this set onto listE.
Finally, the algorithm produces a list of the following kinds of tuples:
• f(literal) – represents a literal that is not inferable;
• e(literals,ruleList) – represents a list of inferable literals and the rules used to infer them;
• c(literals,ruleList) – represents a list of inferable literals that configure a circular chain.
Let's consider the following set of rules:
rule(r1,[st1,ev1],[st2,st3])
rule(r2,[st3,ev3],[st6,ev5])
rule(r3,[ev3],[ev4])
rule(r4,[ev1,ev2],[ev4,st4])
rule(r5,[ev5,st5,ev4],[st7,st8])

After applying the described algorithm to this set of rules, the following two expansions would be obtained:
f(ev3) f(st5) f(ev3) f(ev1) f(st1) e([ev4],[r3]) e([st2,st3],[r1]) e([st6,ev5],[r1,r2]) e([st7,st8],[r1,r2,r3,r5])
f(ev2) f(ev1) f(st5) f(ev3) f(ev1) f(st1) e([ev4,st4],[r4]) e([st2,st3],[r1]) e([st6,ev5],[r1,r2]) e([st7,st8],[r1,r2,r4,r5])
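The backward expansion itself can be sketched in a few lines of Prolog over this toy rule base (expand/3, expand_all/3 and inferable/1 are illustrative helpers, not the calcExpansion implementation; circularity labelling, meta-rules and variable handling are omitted):

% A literal is inferable if some rule has it in its RHS.
inferable(L) :- rule(_, _, RHS), member(L, RHS).

% expand(+Literal, -Facts, -Rules): one possible expansion of Literal, collecting
% the non-inferable literals (Facts) and the rules used (Rules); alternative
% expansions are obtained on backtracking.
expand(L, [L], []) :- \+ inferable(L).
expand(L, Facts, [R|Rs]) :-
    rule(R, LHS, RHS), member(L, RHS),
    expand_all(LHS, Facts, Rs).

expand_all([], [], []).
expand_all([L|Ls], Facts, Rules) :-
    expand(L, F1, R1),
    expand_all(Ls, F2, R2),
    append(F1, F2, Facts),
    append(R1, R2, Rules).

For the query expand(st8, Facts, Rules), backtracking yields the rule sets [r5,r2,r1,r3] and [r5,r2,r1,r4], corresponding (up to ordering) to the two expansions of st8 listed above.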

The dependencies between rules captured in the expansions can be graphically represented by hypergraphs, as depicted in Fig. 6. This technique allows rule representation in a manner that clearly identifies complex dependencies across compound clauses in the rule base, and there is a unique directed hypergraph representation for each set of rules (Ramaswamy & Sarkar 1997).


Fig. 6 - Hypergraphs for the literal st8

For the sake of brevity, some simplifications were considered in the described algorithm, namely:
• Local versus global – as previously referred, in order to speed up the expansion calculation, the knowledge base includes the tuple link/7 that stores the pairs of literals able to match together. Notice that link/7 assures only local consistency (between the pair of literals), not global consistency (along the entire expansion). Let's consider the following occurrences of the tuple literalArgs/3:
literalArgs(st11,5,[#Dt2,#Tm2,#In1,#Pn1,#_,relRapTrif,closed]).
literalArgs(st11,6,[#Dt2,#Tm2,#In1,#Pn1,#_,relRapMono,closed]).
literalArgs(st11,8,[#_,#_,#In1,#Pn1,#_,#Type,closed]).

In this example, local consistency means that the pairs (st11/5, st11/8) and (st11/6, st11/8) can be matched. Global consistency means that the chains (st11/5, st11/8, st11/5) and (st11/6, st11/8, st11/6) can be inferred. Furthermore, the chain (st11/5, st11/8, st11/6) is not possible, since after the variable Type becomes instantiated with relRapTrif it cannot be re-instantiated with the value relRapMono. In order to assure global consistency, the expansion calculation algorithm implements a table of symbols where the variables are stored and updated along the expansion calculation;
• Multiple conclusions per rule – when a literal is expanded and a particular rule infers multiple conclusions, all of them need to be adequately stored. For instance, the rule r5 allows the inference of the literals st7 and st8, as depicted in Fig. 6; this situation is reflected in the tuples e/2 and c/2 contained in the expansion list;

• Data representation – in the described algorithm the rules are represented using a tuple rule(r,lhs,rhs), although in the algorithm implementation the tuples previously described are used (e.g., rule/3, lhs/7, rhs/8 or literal/3).

4.3.2 Temporal analysis
Concerning temporal analysis, the following issues were considered in the VERITAS implementation:
• Distinct treatment for temporal variables – this kind of variable is used for labelling literals; hence, during the internal knowledge base creation these variables are stored in lhs/7 and rhs/8 in the field TemporalArg. Later, during the expansion calculation, they are indexed in a specific table of symbols with the following structure:
(Var, BeginInst, EndInst)
where each variable (Var) has a temporal interval of validity defined by its starting (BeginInst) and ending (EndInst) instants;
• Capture and evaluation of temporal operations – the literals representing temporal operations contained in an expansion are captured in order to build a net representing their dependencies. This net is later evaluated, aiming to assure temporal consistency over the entire expansion. Therefore, the following items are collected:
  • Literals for variable evaluation, like t = t1 + 1 and t = max(t1, t2);
  • Temporal relational operators, like t1 ≺ t2 and t ≥ 30;
  • Temporal operators used specifically in SPARSE, like abs_diff_less_or_equal(t1,t2,t3), meaning |t2 − t1| ≤ t3 (a Prolog reading of this operator is sketched after this list).
Later, the collected items are evaluated in order to:
  • Detect inconsistencies between related items, like t1 ≺ t2 and t1 ≥ t2;
  • Assert and/or update the table of temporal symbols; for instance, t1 ≺ t2 and t ≥ max(t1, t2) allow updating the variable t with the value of t2;
• Parametric temporal validity analysis – the meta-rules store the temporal window of validity for a set of rules. The maximum validity for the literals contained in an expansion is defined by the combination of the temporal validity intervals inherited from all rules used in the referred expansion. Additionally, this parametric evaluation is enriched with the temporal operations described in the previous items. As a result, each literal usually has symbolic starting and ending instants defined by knowledge base assert and retract operations.
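For reference, the SPARSE-style operator mentioned above admits a one-clause Prolog reading (a sketch based only on the |t2 − t1| ≤ t3 interpretation given in the text, not the actual SPARSE definition):

% Succeeds when the absolute difference between two instants is within D.
abs_diff_less_or_equal(T1, T2, D) :-
    Diff is abs(T2 - T1),
    Diff =< D.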

4.3.3 Circularity detection
A knowledge base contains circularity if, and only if, it contains a set of rules that allows an infinite loop during rule triggering. In order to accomplish modelling requirements, the knowledge engineer sometimes needs to specify rules that allow the definition of a circular inference chain. The algorithm used to compute expansions detects every circular chain existing in a rule base; however, in order to reduce the number of false anomalies, a heuristic was considered for circularity detection. This heuristic has two mandatory conditions:


• At least one literal of the type event needs to be present in the rule LHS. This implies the occurrence of an action requiring some rule triggering. Besides, the referred literal needs to be defined as the triggering fact in the metaRule/7 tuple;
• The literals responsible for the circularity (equivalent literals) need to exhibit distinct temporal labels. More precisely, the literal that appears in the rule RHS needs to occur after the equivalent literal contained in the rule LHS.

4.3.4 Ambivalence detection
A knowledge base is ambivalent if, and only if, for a permissible set of conditions, it is possible to infer an impermissible set of hypotheses. Concerning the detection of ambivalence, VERITAS is capable of detecting two types: within a single expansion and across multiple expansions. The detection of ambivalence in a single expansion is performed using an algorithm that works in the following way: for each pair literal1/index1 representing a conclusion (i.e., a literal which appears solely in rule RHSs) contained in the KB, the following conditions are evaluated:
• The existence of an expansion that allows inferring the pair literal1/index1;
• The existence of a restriction relating the referred pair;
• The other literal contained in the restriction, literal2/index2, is contained in the expansion considered in the first point.
If all of these conditions are true, it means that two contradictory literals are contained in the same inference chain. In the last step, the validity intervals defined for both literals, represented by literal1/index1 and literal2/index2, respectively, are evaluated, and if they intercept each other (see note 2 below) then an anomaly is reported.
The detection of ambivalence in multiple expansions is performed using an algorithm that works in the following way: for each constraint contained in the KB, the algorithm checks whether there are expansions supporting both literals contained in the restriction, represented by literal1/index1 and literal2/index2, respectively; finally, it evaluates whether the set of literals supporting literal1/index1 contains, or is contained by, the set of literals that supports literal2/index2, and if so, an anomaly is reported. Notice that the notion of contains, or contained by, inherited from set theory is refined with the condition of interception between corresponding literals, as depicted in Fig. 7. The set P (formed by the elements Pi, Pj, Pk and Pl) contains the set Q, since each element of Q exists simultaneously both in P and Q. The set R represents the temporal interception between P and Q.

Fig. 7 - Set contains or contained by with temporal characterization

2 Two temporal intervals intercept each other if they share at least an instant.
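In Prolog, the interception test from note 2 reduces to two comparisons on closed intervals (intercepts/2 is a hypothetical helper written for this illustration, with intervals given as Begin-End pairs):

% Two closed intervals share at least one instant iff each starts no later than
% the other ends.
intercepts(B1-E1, B2-E2) :-
    B1 =< E2,
    B2 =< E1.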


4.3.5 Redundancy detection
A knowledge base is redundant if, and only if, the set of final hypotheses is the same in the presence or absence of a given rule or literal. Concerning the detection of redundancy, two distinct types were addressed: unusable rules and redundancy in groups of rules. The detection of unusable rules is performed using a rather simple algorithm that works in the following way: for each rule R contained in the RB, the following conditions need to be verified, otherwise an anomaly is reported (a sketch of this check follows at the end of this subsection):
• At least one meta-rule refers to R; otherwise, the rule would never be called;
• The LHS of R does not contain a pair of literals defined in any constraint, since an impermissible set of conditions should not be provided as input.
The detection of redundancy in groups of rules is performed using an algorithm that works in the following way: after all expansions are calculated, all related expansions for each rule are checked; if all the conclusions can be inferred by other expansions then the considered rule is redundant and, consequently, an anomaly is reported.
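The unusable-rule check can be pictured with the following Prolog sketch over the tuples of section 4.2 (unusable_rule/1 is a hypothetical helper, not the VERITAS code; the second clause flags rules whose LHS carries an impermissible pair):

% A rule is reported if no meta-rule ever schedules it ...
unusable_rule(Rule) :-
    rule(Rule, _, _),
    \+ metaRule(Rule, _, _, _, _, _, _).
% ... or if its LHS contains a pair of literals declared incompatible by cst/5.
unusable_rule(Rule) :-
    lhs(Rule, L1, I1, _, _, _, _),
    lhs(Rule, L2, I2, _, _, _, _),
    ( cst(_, L1, I1, L2, I2) ; cst(_, L2, I2, L1, I1) ).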

5. Results
In the development of VERITAS some well-known techniques were assembled with a set of new ones specifically designed for VERITAS. The set of techniques referred to in the literature includes: the calculation of the rule base expansions (Ginsberg 1987); the detection of sets of anomalies (Rousset 1988; Preece & Shinghal 1994); the use of graph theory for anomaly detection along inference chains (Preece, Bell & Suen 1992); the use of logical and semantic constraints (Preece & Shinghal 1994; Zlatareva 1991); and directed hypergraphs for representing rule dependencies (Ramaswamy & Sarkar 1997). Therefore, VERITAS has the following characteristics:
• Independence of the original rule grammar and syntax – VERITAS includes a module that allows the conversion between the original and the verification representation formats;
• Optimized rule base expansion calculation – in order to speed up the expansion calculation two different techniques were considered: the information needed during the verification process was stored using a normalized data schema in which the most important data items were indexed; and the matching pairs of literals were computed a priori and stored in the knowledge base, ensuring local consistency;
• Variables and procedural instructions correctly addressed – in order to process variables in an adequate way, during knowledge base creation the variables contained in the rule and meta-rule sets are extracted and later stored in the table of symbols. During the expansion calculation, the variables are evaluated and their values are updated in the table of symbols. The use of this mechanism allows assuring global consistency throughout an expansion calculation and, consequently, reducing the number of computed expansions. The procedural instructions were considered during expansion calculation, and whenever they imply variable evaluation the table of symbols is updated accordingly;
• Temporal aspects – in the development of VERITAS a set of techniques and algorithms was considered in order to address temporal reasoning and knowledge representation issues, namely: the definition of a temporally characterized anomaly classification, as well as the temporal characterization of logical and semantic restrictions; variables related


with time representation received a distinct treatment during expansion calculation; and the capture and evaluation of operations relating time, in order to evaluate the consistency of the net formed by these operations;
• VERITAS testing – the verification method supported by VERITAS was tested with SPARSE, an expert system for incident analysis and power restoration in power transmission networks. Additionally, a previous version of VERITAS had been tested with two expert systems: one for cardiac diseases diagnosis (Rocha 1990) and one for otology diseases diagnosis and therapy (Sampaio 1996).

6. Conclusions and future work
This chapter focussed on some aspects of the practical use of KBS in Control Centres, namely knowledge maintenance and its relation to the verification process. SPARSE, a KBS used in the Portuguese Transmission Network (REN) for incident analysis and power restoration, was used as the case study. Some of the characteristics that most constrained the development and use of a verification tool were discussed, like the use of the rule triggering selection mechanism, temporal reasoning and variables evaluation, and the adopted solutions were described.
VERITAS is a verification tool that performs logical tests in order to detect knowledge anomalies, as described. The results obtained show that the use of verification tools increases the confidence of the end users and eases the process of maintaining a knowledge base. It also reduces the testing costs and the time needed to implement those tests.

7. References
Amelink, H., Forte, A. & Guberman, R., 1986. Dispatcher Alarm and Message Processing. IEEE Transactions on Power Systems, 1(3), 188-194.
Boehm, B.W., 1984. Verifying and validating software requirements and design specifications. IEEE Software, 1(1), 75-88.
Cragun, B. & Steudel, H., 1987. A decision table based processor for checking completeness and consistency in rule based expert systems. International Journal of Man Machine Studies (UK), 26(5), 633-648.
Duarte, J. et al., 2001. TEMPUS: A Machine Learning Tool. In International NAISO Congress on Information Science Innovations (ISI'2001). Dubai/U.A.E., pp. 834-840.
Faria, L. et al., 2002. Curriculum Planning to Control Center Operators Training. In International Conference on Fuzzy Systems and Soft Calculational Intelligence in Management and Industrial Engineering. Istanbul/Turkey, pp. 347-352.
Fisher, M., Gabbay, D. & Vila, L. eds., 2005. Handbook of Temporal Reasoning in Artificial Intelligence, Elsevier Science & Technology Books.
Gerevini, A., 1997. Reasoning about Time and Actions in Artificial Intelligence: Major Issues. In O. Stock, ed. Spatial and Temporal Reasoning. Kluwer Academic Publishers, pp. 43-70.
Ginsberg, A., 1987. A new approach to checking knowledge bases for inconsistency and redundancy. In Proceedings of the 3rd Annual Expert Systems in Government Conference, 10-111.
Gonzalez, A. & Dankel, D., 1993. The Engineering of Knowledge Based Systems - Theory and Practice, Prentice Hall International Editions.


Hoppe, T. & Meseguer, P., 1991. On the terminology of VVT. Proceedings of the European Workshop on the Verification and Validation of Knowledge Based Systems, 3-13.
Kirschen, D. & Wollenberg, B., 1992. Intelligent Alarm Processing in Power Systems. Proceedings of the IEEE, 80(5), 663-672.
Kleer, J., 1986. An assumption-based TMS. Artificial Intelligence (Holland), 2(28), 127-162.
Malheiro, N. et al., 1999. An Explanation Mechanism for a Real Time Expert System: A Client-Server Approach. In International Conference on Intelligent Systems Application to Power Systems (ISAP'99). Rio de Janeiro/Brasil, pp. 32-36.
Menzies, T., 1998. Knowledge Maintenance: The State of the Art. The Knowledge Engineering Review.
Nazareth, D., 1993. Investigating the applicability of Petri Nets for Rule Based Systems Verification. IEEE Transactions on Knowledge and Data Engineering, 4(3), 402-415.
Nguyen, T. et al., 1987. Knowledge Based Verification. AI Magazine, 2(8), 69-75.
Pipard, E., 1989. Detecting Inconsistencies and Incompleteness in Rule Bases: the INDE System. In Proceedings of the 8th International Workshop on Expert Systems and their Applications, Avignon, France. Nanterre, France: EC2, pp. 15-33.
Preece, A., 1998. Building the Right System Right. In Proc. KAW'98 Eleventh Workshop on Knowledge Acquisition, Modeling and Management.
Preece, A., Bell, R. & Suen, C., 1992. Verifying knowledge-based systems using the COVER tool. Proceedings of the 12th IFIP Congress, 231-237.
Preece, A. & Shinghal, R., 1994. Foundation and Application of Knowledge Base Verification. International Journal of Intelligent Systems, 9(8), 683-702.
Ramaswamy, M. & Sarkar, S., 1997. Global Verification of Knowledge Based Systems via Local Verification of Partitions. In Proceedings of the European Symposium on the Verification and Validation of Knowledge Based Systems. Leuven, Belgium, pp. 145-154.
Rocha, J., 1990. Concepção e Implementação de um Sistema Pericial no Domínio da Cardiologia. Master Thesis, FEUP, Porto, Portugal.
Rousset, M., 1988. On the consistency of knowledge bases: the COVADIS system. In Proceedings of the European Conference on Artificial Intelligence (ECAI'88). Munchen, pp. 79-84.
Sampaio, I., 1996. Sistemas Periciais e Tutores Inteligentes em Medicina - Diagnóstico, terapia e apoio de Otologia. Master Thesis, FEUP, Porto, Portugal.
Suwa, M., Scott, A. & Shortliffe, E., 1982. An approach to Verifying Completeness and Consistency in a rule based Expert System. AI Magazine (EUA), 3(4), 16-21.
Vale, Z. et al., 1997. SPARSE: An Intelligent Alarm Processor and Operator Assistant. IEEE Expert, Special Track on AI Applications in the Electric Power Industry, 12(3), 86-93.
Vale, Z. et al., 2002. Real-Time Inference for Knowledge-Based Applications in Power System Control Centers. Journal on Systems Analysis Modelling Simulation (SAMS), Taylor & Francis, 42, 961-973.
Vanthienen, J., Mues, C. & Wets, G., 1997. Inter Tabular Verification in an Interactive Environment. Proceedings of the European Symposium on the Verification and Validation of Knowledge Based Systems, 155-165.
Wu, C. & Lee, S., 1997. Enhanced High-Level Petri Nets with Multiple Colors for Knowledge Verification/Validation of Rule-Based Expert Systems. IEEE Transactions on Systems, Man, and Cybernetics, 27(5), 760-773.
Zlatareva, N., 1991. Distributed Verification: A new formal approach for verifying knowledge based systems. In J. Liebowitz, ed. Proceedings Expert Systems World Congress. New York, USA: Pergamon Press, pp. 1021-1029.
