ANN MODELING OF WIRE EDM AND OPTIMIZATION OF CUTTING PARAMETERS BY GA

A DISSERTATION

Submitted in partial fulfillment of the requirements for the award of the degree

of MASTER OF TECHNOLOGY

in MECHANICAL ENGINEERING

(With Specialization in Production & Industrial Systems Engineering)

By

AMANUEL TESGERA BASHA

DEPARTMENT OF MECHANICAL AND INDUSTRIAL ENGINEERING

INDIAN INSTITUTE OF TECHNOLOGY ROORKEE

ROORKEE - 247 667 (INDIA)

JUNE, 2005

CANDIDATE'S DECLARATION

I hereby declare that the work which is being presented in the report entitled "ANN MODELING

OF WIRE EDM AND OPTIMIZATION OF CUTTING PARAMETERS BY GA" in partial

fulfillment of the requirement for the award of the degree of Master of Technology in Mechanical

and Industrial Engineering with specialization in Production and Industrial Systems Engineering,

submitted in the Department of Mechanical and Industrial Engineering, Indian Institute of

Technology, Roorkee is an authentic record of my own work carried out from August 2004 to

June 2005, under the guidance of Dr. H.S. Shan, Professor and Dr. Navneet Arora, Assistant

Professor, Mechanical and Industrial Engineering Department, IIT Roorkee.

The matter embodied in this dissertation has not been submitted by me for the award of any other

degree or diploma.

Date:

Place:

(AMANUEL TESGERA BASHA)

CERTIFICATE

This is to certify that the above statement made by the candidate is correct to the best of my

knowledge.

Date:

Place:

Dr. H.S. Shan

Professor, MIED

Dr. Navneet Arora

Assist. Prof., MIED

Acknowledgement

I express my deep and sincere sense of gratitude from the core of my heart to Dr. H.S. Shan,

Professor, Mechanical and Industrial Engineering Department, Indian Institute of Technology,

Roorkee, for his inspiring and painstaking supervision, encouragement and invaluable help during

the course of this thesis work, without which this work would not have been possible. I am

grateful for the long hours he spent in discussing and explaining minute details of the work. It

has been a wonderful association which I cherish. I consider myself privileged to have worked

under his supervision and guidance.

I am grateful to my co-guide Dr. Navneet Arora, Assistant Professor, Mechanical and Industrial

Engineering Department, Indian Institute of Technology Roorkee, Roorkee (India), for his

suggestions and constant encouragement.

The services of the staff of Machine Tool Laboratory, Mechanical and Industrial Engineering

Department are acknowledged with sincere thanks. I am particularly thankful to Mr. Jasbir Singh,

for providing technical assistance during the experimental work.

I would also like to thank Dr. Pradeep Kumar, Professor, Mechanical and Industrial Engineering

Department, Indian Institute of Technology, Roorkee, for providing facilities to carry out the

experiments.

Last but not least, I would like to express my gratitude to my parents for their kind blessings and

for providing moral support and encouragement throughout my life. Grateful acknowledgements

are also due to all my teachers and friends whose timely help has gone a long way in my studies.

AMANUEL TESGERA BASHA

Abstract

Wire electrical discharge machining (WEDM) technology has been widely used in conductive

material machining especially when intricate shapes and profiles have to be cut. Manufacturers

and users of this process always want to achieve higher machining productivity with a desired

accuracy and surface finish. The performance of the WEDM process, in terms of surface finish and

machining productivity, is however affected by many factors such as applied machining voltage,

ignition pulse current, pulse duration, time between two pulses, servo-speed variation, servo-control

reference voltage, wire speed, wire tension, conductivity of dielectric and injection

pressure for dielectric. The material of the work piece and its height also influence the process. If

the setting of one of the above parameters changes, it affects the process in a complex way.

Because of the many variables and the complex and stochastic nature of the process, achieving

the optimal performance, even for a highly skilled operator with a state-of-the-art WEDM

machine, is rarely possible. An effective way to solve this problem is to discover the relationship

between the performance of the process and its controllable input parameters, i.e., to model the

process through suitable mathematical techniques. However, the complex and stochastic nature of

the WEDM process has made it difficult to establish a conclusive analytical model; therefore, an

empirical method can be adopted. The present study is aimed at exploiting the strong capabilities

of both ANN and GA, which are suitable for solving manufacturing problems that are not amenable

to modeling using traditional methods.

A feed-forward back-propagation neural network based on Taguchi L18 experimental design is

developed to model the machining process. GA is then employed to find the optimal operating

conditions so that the productivity of wire EDM is improved for a given surface finish

requirement. The set of Pareto-optimal solutions is searched for the processing of titanium alloy.

The model was tested with experimental data and good correlation was obtained between the

expected and experimental results.

Table of contents

CANDIDATE'S DECLARATION
ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

Chapter 1 Introduction
1.1 Nontraditional processes defined
1.2 Why Nontraditional Processes are Important
1.3 Classification of Nontraditional Processes by Type of Energy Used
1.4 Thermal Energy Processes - Overview
1.5 Electrical Discharge Processes
1.5.1 Work Material in EDM
1.5.2 Complex nature of the EDM material removal process
1.5.3 EDM Applications

Chapter 2 Wire EDM process
2.1 History
2.2 EDM vs WEDM
2.3 Wire-EDM equipment
2.3.1 Positioning system
2.3.2 Wire drive system
2.3.3 Power supply
2.3.4 Dielectric system
2.4 Wire-EDM process parameters
2.5 Wire-EDM process capabilities
2.6 WEDM applications
2.6.1 Modern tooling applications
2.6.2 Advanced ceramic materials
2.6.3 Modern composite materials

Chapter 3 Literature review
3.1 Process modeling
3.2 Process optimization

Chapter 4 Neural Network Implementation Issues
4.1 Overview of Neural Network Training Methodology
4.1.1 Training and Test Data Selection
4.1.2 Scaling Input Vectors
4.1.3 Initializing Weights
4.1.4 Over fitting
4.1.5 Neural Network Noise
4.1.6 Stopping Criteria and Cross Validation Training

Chapter 5 ANN modeling of WEDM
5.1 Neural network model
5.2 Experimental details
5.3 Results and Discussion
5.4 The effect of the cutting parameters on the performance

Chapter 6 Optimization of wire EDM process parameters
6.1 Why constrained optimization technique?
6.2 Search for Pareto-optimal WEDM process parameters
6.2.1 Discussion

Chapter 7 Summary and Conclusions
Scope for future research

REFERENCES

Appendix - 1 Neural Networks: an overview

Appendix - 2 An Overview of Genetic Algorithms

LIST OF FIGURES

Fig. no. Title

1 (a) EDM overall setup
1 (b) EDM close-up view of gap, showing discharge and metal removal
2 Schematic of wire EDM set up
3 Definition of kerf and over cut in electric discharge wire cutting
4 Complicated shapes produced by wire EDM
5 Schematic diagram of a neuron and a sample of pulse train
6 General symbol of neuron
7 (a) Bipolar continuous activation functions of a neuron
7 (b) Unipolar continuous activation functions of a neuron
8 A standard artificial neuron
9 Configuration and terminology of a multi-layered neural network
10 Neural Network Training Flow Chart
11 Configuration of the neural network
12 Sum of square error vs number of iterations in the training process
13 (a) Surface showing the relationship of Gap Voltage with cutting rate (CR)
13 (b) Surface showing the relationship of Gap Voltage with surface roughness (SR)
14 (a) Surfaces showing the relationship of Ton with cutting rate (CR)
14 (b) Surfaces showing the relationship of Ton with surface roughness (SR)
15 (a) Surfaces showing the relationship of Toff with cutting rate (CR)
15 (b) Surfaces showing the relationship of Toff with surface roughness (SR)
16 (a) Surfaces showing the relationship of Ws with cutting rate (CR)
16 (b) Surfaces showing the relationship of Ws with surface roughness (SR)
17 The basic structure of EA
18 Structure of a single population evolutionary algorithm
19 Roulette Wheel Selection
20 Multi-point Crossover
21 Binary Mutation
22 Structure of the optimization system
23 Machining performance predictions of ANN model
24 Maximized cutting speed vs. surface roughness

LIST OF TABLES

Table no. Title

1 Training data for experiments planned according to Taguchi's method
2 Test data for experiments with randomly selected input parameters
3 Process performance optimization
4 Actual vs predicted WEDM performances
5 Sorted Pareto-optimal points

Chapter 1 Introduction

Today's manufacturing industry is facing challenges from advanced difficult-to-machine

materials (e.g. tough super alloys, ceramics, and composites), stringent design requirements (e.g.

complex shapes, high precision, and high surface quality), and machining costs. In order to cope

with these challenges, it has become necessary to change to more sophisticated tools of

manufacturing.

This need for more sophisticated tools has resulted in the creation of a new, unique family of

manufacturing processes known as nontraditional manufacturing processes. Generally speaking,

non-traditional processes differ from conventional processes either on account of utilizing

energy in novel ways or by applying forms of energy directly for the purpose of manufacturing.

Wire Electrical Discharge Machining (WEDM), one of the widely accepted non-traditional

material removal processes, has certain unique advantages as compared to other prevalent

nontraditional cutting technologies including laser cutting, plasma cutting and water jet cutting.

The most attractive advantages of this process are long cutting edge (maximal cutting edge >

500 mm), a small cutting kerf (minimal kerf < 0.05 mm), a small cutting taper and a

homogeneous surface. Due to these inherent advantages, it offers an effective and economical

alternative to present large-scale machining techniques. The realization of a methodology which

can optimize the productivity and surface quality requirement of this process is of great

significance for promoting the process in the increasingly demanding tool manufacturing industry.

In this thesis work, an attempt is made to model and optimize the wire-EDM process parameters

using the combination of Artificial Neural Network (ANN) and Genetic Algorithm (GA).

In the first chapter a brief introduction to non-traditional machining processes is presented.

Electrical Discharge Machining (EDM) process as one of the thermal energy processes is

discussed in a detailed manner. The process overview based on the widely accepted principle of

thermal conduction and some highlights of its applications are also given. In the second chapter,

an explanation of wire-EDM process is given along with the similarity and difference it has with

the die sinking EDM. History of wire EDM process, process equipment and its applications are

discussed in detail for better understanding of the process. In chapter three, the review of

literature is made in order to understand what has been done so far in the modeling and

optimization of wire EDM process. There are several choices to be made when implementing

neural networks to solve a problem. These choices involve the selection of the training and

testing data, the network architecture, the training method, the data scaling method, and the

error goal. Chapter four is devoted to this part of discussion. The main section of chapter five

focuses on modeling of wire EDM process by using multi-layered back propagation neural

network. The experimental details together with the network architecture developed for this

purpose and the results of training and testing by applying the experimental data to the network

are given in this chapter. Chapter six is devoted to the optimization of wire EDM process. The

use of a genetic algorithm to solve the constrained optimization problem is explored. The importance

of finding Pareto-optimal points and how to find them from the predicted data are also given. The

final part of the thesis gives the conclusions drawn from the results and indicates the future

research direction in WEDM modeling and optimization.

1.1 Nontraditional Processes Defined

Nontraditional processes are a group of processes that remove excess material by various techniques

involving mechanical, thermal, electrical, or chemical energy (or combinations of these energies) but

do not use a sharp cutting tool in the conventional sense. They have been developed since World War II

in response to new and unusual machining requirements that could not be satisfied by conventional methods.

1.2 Why Nontraditional Processes are Important

• Need to machine newly developed metals and non-metals with special properties that

make them difficult or impossible to machine by conventional methods.

• Need for unusual and/or complex part geometries that cannot easily be accomplished by

conventional machining.

• Need to avoid surface damage that often accompanies conventional machining.

1.3 Classification of Nontraditional Processes by the Type of Energy Used

• Mechanical - erosion of work material by a high velocity stream of abrasives or fluid (or

both) is the typical form of mechanical action

• Electrical - electrochemical energy to remove material (reverse of electroplating)

• Thermal — thermal energy usually applied to small portion of work surface, causing that

portion to be removed by fusion and/or vaporization

• Chemical — chemical etchants selectively remove material from portions of workpart,

while other portions are protected by a mask.

1.4 Thermal Energy Processes - Overview

Very high local temperatures are involved; material is removed by fusion or vaporization.

Physical and metallurgical damage to the new work surface is common in this case. In some

cases, resulting surface finish is so poor that subsequent processing is required.

1.4.1 Thermal Energy Processes

• Electric discharge machining

• Wire electrical discharge cutting

• Electron beam machining

• Laser beam machining

• Plasma arc machining

• Conventional thermal cutting processes

1.5 Electrical Discharge Processes

EDM is a non-traditional manufacturing process that uses electric spark discharges to machine

electrically conducting materials. This process is typically used for materials such as tool and

die-steels, ceramics, etc., which are hard to machine using a more traditional approach. During

the process, a voltage is applied between two electrodes, the tool and the workpiece, closely

placed inside a liquid dielectric medium. When electrodes are very close to each other (gap

distance 0.05 mm), an electric spark discharge occurs between them forming a plasma channel

between the cathode and the anode. Fig. 1 shows a close-up of the machining region. The spark

generates enough heat to melt and even vaporize some of the workpiece material. As the spark

collapses, some of the molten and vaporized workpiece material is removed from the rest of the

workpiece and is carried away by the dielectric. Discharge duration is controlled by the process

parameters used and can be anywhere from a few microseconds to hundreds of microseconds.

Although the quantity of material removed per discharge is minuscule, a large number of discharges

occurring over time result in removal of the desired amount of material. As material is removed

from the workpiece the tool slowly moves towards the workpiece surface (aided by servo-

control mechanism) so that a constant gap between the two can be maintained. The liquid

dielectric serves two purposes. It helps to keep the expanding plasma channel confined to a

small diameter so that the intensity of the heat flux is very high over a small surface area of the

electrodes. This ensures that melting, and even vaporization, can occur. The other use of the

dielectric is to flush some of the particles that gather in the gap between the electrodes. EDM

processes can be broadly classified into two categories, die-sinking EDM where the tool shape

complements the final desired shape of the workpiece, and wire-EDM where the discharge takes

place between a thin wire and the workpiece. The wire in wire-EDM applications acts almost

like an electrical saw.

[Figure 1 labels: gap, overcut, tool feed, tool, tool wear, ionized fluid, discharge, metal removed from cavity, flow of dielectric fluid, cavity created by discharge, recast metal]

Figure 1 - Electric discharge machining (EDM): (a) overall setup, and (b) close-up view of gap,

showing discharge and metal removal.

1.5.1 Work material in EDM

• Only electrically conducting work materials

• Hardness and strength of the work material are not factors in EDM

• Material removal rate is related to melting point of work material
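
The dependence of removal rate on melting point can be illustrated with an empirical relation often quoted in manufacturing textbooks. It is not derived or used in this thesis and is shown here only as a rough sketch: R_MR = K I / T_m^1.23, with R_MR in mm^3/s, discharge current I in A, melting temperature T_m in °C, and K taken as about 664 in SI units. The function name, the 10 A current and the approximate melting points below are assumptions chosen for illustration.

```python
# Assumed textbook approximation (not taken from this thesis):
#   R_MR = K * I / T_m**1.23   [mm^3/s, A, deg C]
K_SI = 664.0  # assumed proportionality constant in SI units


def edm_removal_rate(current_a: float, melting_point_c: float) -> float:
    """Approximate EDM metal removal rate in mm^3/s."""
    if current_a <= 0 or melting_point_c <= 0:
        raise ValueError("current and melting point must be positive")
    return K_SI * current_a / melting_point_c ** 1.23


# Approximate melting points in deg C (illustrative handbook values).
melting_points = {"aluminium": 660.0, "titanium": 1668.0, "tungsten": 3410.0}

for metal, tm in melting_points.items():
    rate = edm_removal_rate(10.0, tm)  # 10 A is a hypothetical current
    print(f"{metal:10s} {rate:.3f} mm^3/s")
```

Under this assumed relation, a higher melting point gives a lower removal rate at the same current, which is the trend stated in the list above.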

1.5.2 Complex nature of the EDM material removal process

EDM involves the complex interaction of many physical phenomena. The electric spark

between the anode and the cathode generates a large amount of heat over a small area of the

workpiece. A portion of this heat is conducted through the cathode, a fraction is conducted

through the anode, and the rest is dissipated by the dielectric. The duration of the spark is of the

order of microseconds and during this time a plasma channel is formed between the tool and the

workpiece. Electrons and ions travel through this plasma channel. The plasma channel induces a

large amount of pressure on the workpiece surface as well. This pressure holds back the molten

material in its place. As the plasma starts forming, it displaces the dielectric fluid and a shock

wave passes through the fluid. As soon as the spark duration time is over and the spark collapses,

the dielectric gushes back to fill the void. This sudden removal of pressure results in a violent

ejection of the molten and vaporized material from the workpiece surface [1,2]. Ejected molten

particles quickly solidify as they come in contact with the colder fluid and are eventually

flushed out by the dielectric. Small craters are formed at locations where material has been

removed. Multiple craters overlap each other and the machined surface that is finally produced

consists of numerous overlapping craters. Although molten material ejection is not the only

means of material removal in EDM it is, however, the dominant mode of material removal in

case of metals [2]. In the machining of ceramics which have much higher melting and boiling

points, material spalling is the mechanism for material removal [2]. During machining the local

temperature in the workpiece gets close to the vaporization temperature of the material [1,2].

Thus, phase transformation from solid to liquid as well as liquid to vapor occurs during the

heating cycle. Part of the transformed material is removed but the rest re-solidifies on the

surface of the workpiece. This re-solidified layer is usually called the white layer, as it is not

easily etchable. EDM processes carried out in hydrocarbon dielectrics lead to the partial

breakdown of dielectrics and this further leads to some diffusion of carbon.

1.5.3 EDM Applications

• Tooling for many mechanical processes: molds for plastic injection molding, extrusion

dies, wire drawing dies, forging and heading dies, and sheet metal stamping dies

• Production parts: delicate parts not rigid enough to withstand conventional cutting forces,

hole drilling where hole axis is at an acute angle to surface, and machining of hard and

exotic metals.

Chapter 2 Wire EDM process

The WEDM process differs from the conventional EDM process in that a small wire is engaged

as the tool electrode. The wire unwinding from a wire supply wheel is continuously fed through

the workpiece by the wire traction rollers and taken up by a collection spool. The workpiece,

mounted on the clamp frame, is almost never submerged in the dielectric medium, which is

delivered to the gap between the wire and workpiece via a hose or flushed through the sparking

area coaxially with the wire. The wire-workpiece gap usually ranges from 0.025 to 0.05 mm and

is constantly maintained by a computer-controlled (CNC) positioning system. This positioning

system is also responsible for controlling the movement of the wire to achieve the desired

complex two- and three-dimensional (2- and 3-D) shapes for the workpiece.

[Figure 2 labels: wire supply spool, wire electrode, dielectric fluid flow, cutting path, wire take-up spool, feed motion axes]

Fig. 2 Schematic of wire EDM set up

[Figure 3 labels: wire diameter, overcut]

Figure 3 - Definition of kerf and over cut in electric discharge wire cutting
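
The geometry named in the Figure 3 caption can be sketched numerically. The snippet assumes the usual definition, kerf = wire diameter + 2 x overcut, and treats the overcut as equal to the wire-workpiece gap of 0.025 to 0.05 mm quoted earlier in this chapter. The 0.25 mm wire diameter and the function name are hypothetical.

```python
def kerf_width(wire_diameter_mm: float, overcut_mm: float) -> float:
    """Kerf = wire diameter plus the overcut on both sides of the wire."""
    if wire_diameter_mm <= 0 or overcut_mm < 0:
        raise ValueError("diameter must be positive and overcut non-negative")
    return wire_diameter_mm + 2.0 * overcut_mm


wire_d = 0.25  # mm, hypothetical wire diameter
for overcut in (0.025, 0.05):  # mm, gap range quoted in the text
    print(f"overcut {overcut} mm -> kerf {kerf_width(wire_d, overcut):.2f} mm")
```

Because the overcut is added on both sides, the slot is always wider than the wire, which matters when programming the wire path for a dimensioned part.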

2.1 History

WEDM was first introduced to the manufacturing industry in the late 1960s. The development

of the process was the result of seeking a technique to replace the machined electrode used in

EDM. In 1974, D.H. Dulebohn applied the optical-line follower system to automatically control

the shape of the component to be machined by the WEDM process [1]. By 1975, its popularity

was rapidly increasing, as the process and its capabilities were better understood by the industry

[2]. It was only towards the end of the 1970s, when computer numerical control (CNC) systems

were introduced into WEDM, that a major evolution of the machining process came about. As a

result, the broad capabilities of the WEDM process were extensively exploited for any through-hole

machining, since the wire has to pass through the part to be machined.

2.2 EDM vs WEDM

While the material removal mechanisms of EDM and WEDM are similar, their functional

characteristics are not identical. WEDM uses a thin wire that is fed continuously through the

workpiece under microprocessor control, which enables parts of complex shapes to be machined with

exceptionally high accuracy. A varying degree of taper, ranging from 15° for a 100 mm thick to 30°

for a 400 mm thick workpiece, can also be obtained on the cut surface. The microprocessor

also constantly maintains the gap between the wire and the workpiece, which varies from 0.025

to 0.05 mm [2]. WEDM eliminates the need for elaborate pre-shaped electrodes, which are

commonly required in EDM to perform the roughing and finishing operations. In the case of

WEDM, the wire has to make several machining passes along the profile to be machined to

attain the required dimensional accuracy and surface finish (SF) quality. The typical WEDM

cutting rates (CRs) are 300 mm²/min for a 50 mm thick D2 tool steel and 750 mm²/min for a

150 mm thick aluminium [2], and SF quality is as fine as 0.12-0.25 µm Ra. In addition, WEDM

uses deionized water instead of hydrocarbon oil as the dielectric fluid and contains it within the

sparking zone. The deionized water is not suitable for conventional EDM as it causes rapid

electrode wear, but its low viscosity and rapid cooling rate make it ideal for WEDM [2].
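
The cutting rates quoted above are area rates, that is, cut length multiplied by workpiece thickness per unit time. Assuming that definition, dividing by the thickness gives the linear feed along the profile. The sketch below is only a unit check; the helper name is hypothetical.

```python
def linear_feed_mm_per_min(area_rate_mm2_per_min: float, thickness_mm: float) -> float:
    """Convert an area cutting rate to a linear feed along the cut."""
    if thickness_mm <= 0:
        raise ValueError("thickness must be positive")
    return area_rate_mm2_per_min / thickness_mm


# (area rate in mm^2/min, thickness in mm), figures quoted in the text
cases = {
    "D2 tool steel, 50 mm thick": (300.0, 50.0),
    "aluminium, 150 mm thick": (750.0, 150.0),
}

for label, (area_rate, thickness) in cases.items():
    feed = linear_feed_mm_per_min(area_rate, thickness)
    print(f"{label}: {feed:.1f} mm/min")
```

With the quoted figures this gives 6 mm/min for the tool steel and 5 mm/min for the aluminium, so the thicker aluminium section advances more slowly along the profile despite its higher area rate.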

2.3 Wire-EDM equipment

A wire-EDM machine consists of four sub-systems: the positioning system, the wire drive

system, the power supply, and the dielectric system. All the four subsystems have distinct

differences from conventional EDM.

2.3.1 Positioning system

Wire-EDM positioning systems usually consist of a CNC two-axis table and, in some cases, an

additional multi-axis wire-positioning system. The most distinctive feature of the CNC system is

that it must operate in adaptive control mode to always ensure the consistency of the gap

between the wire and work piece. If the wire should come in contact with the work piece or if a

small piece of material bridges the gap and causes a short circuit, the positioning system must

sense this condition and back up along the programmed path to reestablish the proper cutting

conditions.
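
The back-up behaviour described above can be illustrated with a toy control loop. This is a minimal sketch under strong assumptions, not the controller of any real machine: the programmed path is reduced to a sequence of indexed points, and `is_shorted` is a hypothetical sensor callback that reports a short circuit at a given path index and control cycle.

```python
from typing import Callable, List


def follow_path(n_points: int,
                is_shorted: Callable[[int, int], bool],
                backup_steps: int = 2,
                max_cycles: int = 10_000) -> List[int]:
    """Return the path indices visited while advancing along the cut."""
    visited: List[int] = []
    index = 0
    for cycle in range(max_cycles):
        visited.append(index)
        if index >= n_points - 1:
            break  # end of the programmed path reached
        if is_shorted(index, cycle):
            # Short circuit sensed: retreat along the programmed path.
            index = max(0, index - backup_steps)
        else:
            index += 1  # gap is healthy: advance one point
    return visited


shorts_seen = set()


def short_once_at_3(index: int, cycle: int) -> bool:
    """Hypothetical sensor: one short circuit on first arrival at point 3."""
    if index == 3 and index not in shorts_seen:
        shorts_seen.add(index)
        return True
    return False


trace = follow_path(6, short_once_at_3)
print(trace)
```

In this toy run the controller reaches point 3, senses the short, retreats two points and then cuts forward again to the end of the path.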

2.3.2 Wire drive system

The function of the wire drive system is to continuously deliver fresh wire under constant

tension to the work area. The need for constant wire tension is important to avoid such problems

as taper, machining streaks, wire breaks, and vibration marks.

As the wire passes through the work piece, it is guided by a set of sapphire or diamond guides.

Before being collected by the take-up spool, it passes through a series of tensioning rollers.

Many wire-EDM systems use a massive granite slab as the machine base to further guarantee

wire accuracy and stability.

Automatic wire threading is a recently introduced feature that boosts productivity. It

automatically re-threads the wire after breakage, which enables longer unattended runs. The wire

makes only one pass through the work piece, after which it is discarded.

2.3.3 Power supply

The most pronounced differences between the power supplies used for wire-EDM and

conventional EDM are the frequency of the pulses used and the current. To produce the

smoothest surface finish possible, pulse frequencies as high as 1 MHz may be used with wire-

EDM. Such a high frequency ensures that each spark removes as little material as possible, thus

reducing the size of EDM crater.
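
As a quick check on the numbers, a pulse frequency f corresponds to a pulse period of 1/f. The sketch below assumes that each period is the sum of a pulse on-time and a pulse off-time; the 0.4 µs on-time and the function names are hypothetical.

```python
def pulse_period_us(frequency_hz: float) -> float:
    """Pulse period in microseconds for a given pulse frequency."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1e6 / frequency_hz


def off_time_us(frequency_hz: float, on_time_us: float) -> float:
    """Off-time left in each period, assuming period = on-time + off-time."""
    period = pulse_period_us(frequency_hz)
    if not 0 < on_time_us < period:
        raise ValueError("on-time must lie strictly inside the period")
    return period - on_time_us


f = 1e6  # Hz, the upper frequency quoted in the text
print(f"period   = {pulse_period_us(f):.2f} us")
print(f"off-time = {off_time_us(f, 0.4):.2f} us")  # 0.4 us on-time is hypothetical
```

At 1 MHz each spark has at most one microsecond available, which is consistent with the statement that each spark removes very little material.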

Because the diameter of the wire used is so small, its current-carrying capability is limited.

Because of this limitation, wire-EDM power supplies are rarely built to deliver more than 20 A

of current.

2.3.4 Dielectric system

De-ionized water is the dielectric used for the wire-EDM process. De-ionized water is used for

four reasons: low viscosity, high cooling rate, high material removal rate and absence of fire

hazard.

The small cutting gap used with wire-EDM mandates that a low-viscosity dielectric be used to

ensure adequate flushing. Water meets this criterion. Water can also remove heat from the

cutting area much more efficiently than conventional dielectric oils. More efficient cooling

results in extremely thin recast layers.

Very high specific material removal rates can be achieved when using water as dielectric;

however, the wear rate on the tool (wire) is also high. Because the wire is not reused, the high

tool-wear rate is of no consequence. This explains, however, why water is not commonly used

with conventional EDM.

Finally, because of the slow processing speeds of wire-EDM, many users run their most

time-consuming jobs overnight or over the weekend unattended. With conventional EDM, the use of

flammable dielectric oils presents a fire hazard. When using water for the dielectric, the fire

hazard problem is eliminated.

Rather than submerge the entire part into de-ionized water, local delivery is most often used.

Some systems deliver the dielectric fluid via a hose directed at the cut interface. The most

efficient method of dielectric delivery (with respect to flushing) is to provide a stream of de-

ionized water coaxial with the wire.

2.4 Wire-EDM process parameters

The linear cutting rate for wire-EDM is approximately 38-115 mm/hr in 25 mm thick steel or

approximately 20 mm/hr in 76 mm thick steel. The linear speed is dependent upon the thickness of the

material but not upon the shape of the cut. The linear cutting rate is the same whether a straight

cut or complex curves are being generated.

The speed of the wire passing through the work piece can vary from 8-40 mm/sec depending

upon cutting conditions.

2.5 Wire-EDM process capabilities

Wire-EDM is a specialized process that is capable of machining electrically conductive work

pieces to produce fine finishes, extremely high accuracies and cut edges that have a smooth,

matte finish.

The matte finish is a result of the thousands of microscopic pits remaining from the spark

erosion. When applied to punch-and-die applications, the oil-retaining quality of these micro pits

has been known to increase the die life. Surface finishes ranging from 0.12 to 0.25 µm are

routinely obtained, and by utilizing a second "finish pass", finishes as good as 0.05-0.12 µm

are possible. Many wire-EDM machines are available with a positioning resolution of 0.001 mm

and can routinely obtain accuracies of 0.007 mm [2].

Advantages

• No electrode fabrication required
• No cutting forces
• Unmanned machining
• Die costs reduced by 30-70 %
• Cuts hardened materials
• Intricate shapes can be cut with the same ease as a straight cut
• Very small kerf width

Disadvantages

• High capital cost
• Recast layer
• Electrolysis can occur in some materials
• Slow cutting rates
• Not applicable to very large workpieces

2.6 WEDM applications

• Ideal for stamping die components: because the kerf is so narrow, it is often possible to

fabricate the punch and die in a single cut.

• Other tools and parts with intricate outline shapes, such as lathe form tools, extrusion

dies, flat templates and almost any complicated shape (Fig.4).

Fig.4 Complicated shapes produced by wire EDM

2.6.1 Modern tooling applications

WEDM has been gaining wide acceptance in the machining of various materials used in modern

tooling applications. Several authors [3,4] have investigated the machining performance of

WEDM in the wafering of silicon and machining of compacting dies made of sintered carbide.

The feasibility of using cylindrical WEDM for dressing a rotating metal bond diamond wheel

used for the precision form grinding of ceramics has also been studied [5]. The results show that

the WEDM process is capable of generating precise and intricate profiles with small corner radii

but a high wear rate is observed on the diamond wheel during the first grinding pass. Such an

initial high wheel wear rate is due to the over-protruding diamond grains, which do not bond

strongly to the wheel after the WEDM process [6]. The WEDM of permanent NdFeB and 'soft'

MnZn ferrite magnetic materials used in miniature systems, which require small magnetic parts,

was studied by comparing it with the laser-cutting process [7]. It was found that the WEDM

process yields better dimensional accuracy and SF quality but has a slow CR, 5.5 mm/min for

NdFeB and 0.17 mm/min for MnZn ferrite. A study was also done to investigate the machining

performance of micro-WEDM used to machine a high aspect ratio meso-scale part using a

variety of metals including stainless steel, nitronic austenitic stainless, beryllium copper and

titanium [8].

2.6.2. Advanced ceramic materials

The WEDM process has also evolved into one of the most promising alternatives for the

machining of the advanced ceramics. Sanchez et al. [9] provided a literature survey on the EDM

of advanced ceramics, which have been commonly machined by diamond grinding and lapping.

In the same paper, they studied the feasibility of machining boron carbide (B4C) and silicon

infiltrated silicon carbide (SiC) using EDM and WEDM. Cheng et al. [10] also evaluated the

possibility of machining ZrB2 based materials using EDM and WEDM, whereas Matsuo and

Oshima [11] examined the effects of conductive carbide content, namely niobium carbide (NbC)

and titanium carbide (TiC), on the CR and surface roughness of zirconia ceramics (ZrO2) during

WEDM. Lok and Lee [12] have successfully WEDMed sialon 501 and aluminium oxide-titanium

carbide (Al2O3-TiC). However, they realized that the MRR is very low as compared to

the cutting of metals such as alloy steel SKD-11 and the surface roughness is generally inferior

to the one obtained with the EDM process. Dauw et al. [13] explained that the MRR and surface

roughness are not only dependent on the machining parameters but also on the material of the

part. An innovative method of overcoming the technological limitation of the EDM and WEDM

processes, which require the electrical resistivity of the material to be below threshold values of

approximately 100 Ω/cm [14] or 300 Ω/cm [15], has recently been explored. There are different

grades of engineering ceramics, which Konig et al. [14] classified as non-conductor, natural-conductor

and conductor, the last being a result of doping nonconductors with conductive elements.

Mohri et al. [16] brought a new perspective to the traditional EDM phenomenon by using an

assisting electrode to facilitate the sparking of highly electrically resistive ceramics. Both the

EDM and WEDM processes have been successfully tested by diffusing conductive particles from

assisting electrodes onto the surface of sialon ceramics, which assists the feeding of the electrode through

the insulating material. The same technique has also been tried on other types of

insulating ceramic materials, including oxide ceramics such as ZrO2 and Al2O3, which have very

limited electrical conductivity [17].

2.6.3. Modern composite materials

Among the different material removal processes, WEDM is considered as an effective and

economical tool in the machining of modern composite materials. Several comparative studies

[18, 19] have been made between WEDM and laser cutting in the processing of metal matrix

composites (MMC), carbon fibre and reinforced liquid crystal polymer composites. These

studies showed that WEDM yields better cutting edge quality and has better control of the

process parameters with fewer workpiece surface damages. However, it has a slower MRR for

all the tested composite materials. Gadalla and Tsai [20] compared WEDM with conventional

diamond sawing and discovered that it produces a roughness and hardness that is comparable to

a low speed diamond saw but with a higher MRR. Yan et al. [21] surveyed the various

machining processes performed on the MMC and experimented with the machining of

Al2O3/6061Al composite using rotary EDM coupled with a disk-like electrode. Other studies

[22, 23] have been conducted on the WEDM of Al2O3 particulate reinforced composites

investigating the effect of the process parameters on the WEDM performance measures. It was

found that the process parameters have little influence on the surface roughness but have an

adverse effect on CR.

Chapter 3 Literature review

Wire EDM manufacturers and users always want to achieve higher machining productivity with

a desired accuracy and surface finish. Performance of the WEDM process, however, is affected

by many factors (workpiece material, wire material, dielectric medium, adjustable parameters,

etc.) and a single parameter change will influence the process in a complex way. As surface

finish and cutting speed are the most important parameters in manufacturing, investigations

have been carried out by several researchers [24-27] for improving the surface finish and cutting

speed of the WEDM process. However, because of the many variables and the complex and

stochastic nature of the process [28], achieving the optimal performance, even for a highly

skilled operator with a state-of-the-art WEDM machine, is rarely possible. An effective way to

solve this problem is to discover the relationship between the performance of the process and its

controllable input parameters (i.e., model the process through suitable mathematical techniques),

and then determine the optimal parameters for a given set of conditions.

Investigations into the influences of machining input parameters on the performance of EDM and

WEDM have been reported widely [24-41] and several attempts have been made to model the

process.

3.1 Process modeling

Traditionally, the selection of the most favorable process parameters was based on experience or

handbook values, which produced inconsistent machining performance. However, the

optimization of parameters now relies on process analysis to identify the effect of operating

variables on achieving the desired machining characteristics. The modeling of the WEDM

process by means of mathematical techniques has also been applied to effectively relate the

large number of process variables to the different performance of the process. Spedding and

Wang [42] developed the modeling techniques using the response surface methodology and

artificial neural network technology to predict the process performance such as MR, SQ and

surface waviness within a reasonably large range of input factor levels. Liu and Esterling [43]

proposed a solid modeling method, which can precisely represent the geometry cut by the

WEDM process, whereas Hsue et al. [44] developed a model to estimate the MRR during

geometrical cutting by considering wire deflection with transformed exponential trajectory of

the wire centre. Spur and Schönbeck [45] designed a theoretical model studying the influence

of the workpiece material and the pulse-type properties on the WEDM of a workpiece with an

anodic polarity. Han et al. [46] developed a simulation system, which accurately reproduces the

discharge phenomena of WEDM. The system also applies an adaptive control, which

automatically generates an optimal machining condition for high precision WEDM.

3.2 Process optimization.

Many different types of problem-solving quality tools have been used to investigate the

significant factors and their inter-relationships with the other variables in obtaining an optimal

WEDM CR. Konda et al. [29] classified the various potential factors affecting the WEDM

performance measures into five major categories namely the different properties of the

workpiece material and dielectric fluid, machine characteristics, adjustable machining

parameters, and component geometry. In addition, they applied the design of experiments (DOE)

technique to study and optimize the possible effects of variables during process design and

development, and validated the experimental results using signal-to-noise (S/N) ratio analysis.

Tarng et al. [30] employed a neural network system with the application of a simulated

annealing algorithm for solving the multi-response optimization problem. It was found that the

machining parameters such as the pulse on/off duration, peak current, open circuit voltage,

servo reference voltage, electrical capacitance and table speed are the critical parameters for the

estimation of the CR and SF. Huang et al. [31] argued that several published works are

concerned mostly with the optimization of parameters for the roughing cutting operations and

proposed a practical strategy of process planning from roughing to finishing operations. The

experimental results showed that the pulse on-time and the distance between the wire periphery

and the workpiece surface affect the CR and SF significantly. The effects of the discharge

energy on the CR and SF of a MMC have also been investigated.

Chapter 4 Neural Network Implementation Issues

Due to its ability to address complex and nonlinear problems (problems whose solutions have

not been explicitly formulated), the widely accepted method, artificial neural network (ANN) is

chosen to model the complex behavior between input and output in the WEDM process. It has

been used extensively in many fields such as forecasting, pattern recognition, robotics,

parameter selection, process modeling, monitoring and control. It was originally inspired by

the way the human brain receives and transfers information when making decisions. A simple

model of an ANN consists of an input layer, a hidden layer and an output layer. With sets of

input-output patterns presented at the input and output layers, the hidden layer passes information of

differing strength from the input layer to the output layer through so-called weights. The

weights are adjusted in the learning process, in which all the input-output patterns are

presented in the learning phase repeatedly. There are many learning algorithms available and the

most popular and successful learning algorithm used to train multilayer network is the back

propagation scheme. Any output point can be obtained after this learning phase, and good

results can be achieved. In Appendix 1, a brief review of the fundamentals of multilayered

feed-forward neural networks is provided. For more details, reference may be made to Freeman

and Skapura [47] and Vemuri [48].

Neural networks are highly flexible modeling tools with an ability to learn the mapping between

input variables and output feature spaces. Therefore, neural networks are considered in this

work to model the wire-EDM process with multi-dimensional input and output spaces.

There are several choices to be made when implementing neural networks to solve a problem.

These choices involve the selection of the training and testing data, the network architecture, the

training method, the data scaling method, and the error goal. Since over 90% of all neural

network implementations use back propagation trained multi-layer perceptrons, an attempt has

been made to discuss and implement it in this work.

4.1 Overview of Neural Network Training Methodology

Figure 10 shows the methodology to follow when training a neural network. First we must

collect or generate the data to be used for training and testing the neural network. In the present

case experimental data generated on wire EDM has been used. Once this data is collected, it

must be divided into a training set (Table 1) and a test set (Table 2). The training set should

cover the input space or should at least cover the space in which the network will be expected to

operate. If there is no training data for certain conditions, the output of the network should not

be trusted for those inputs. The division of the data into the training and test sets is somewhat of

an art and somewhat of a trial and error procedure. We want to keep the training set small so

that training is fast, but we also want to exercise the input space well which may require a large

training set.

Once the training set is selected, we must choose the neural network architecture. There are two

lines of thought here. Some designers choose to start with a fairly large network that is sure to

have enough degrees of freedom (neurons in the hidden layer) to train to the desired error goal;

then, once the network is trained, they try to shrink the network until the smallest network that

trains remains. Other designers choose to start with a small network and grow it until the

network trains and its error goal is met. We will use the second method which involves initially

selecting a fairly small network architecture.

After the network architecture is chosen, the weights and biases are initialized and the network

is trained. The network may not reach the error goal due to one or more of the following reasons.

1. The training gets stuck in local minima.

2. The network does not have enough degrees of freedom to fit the desired

input/output model.

3. There is not enough information in the training data to perform the desired

mapping.

[Figure 10: flow chart. Collect Data → Select Training and Test Sets → Select Neural Network Architecture → Initialize Weights → SSE Goal Met? If no: Change Weights or Increase NN Size. If yes: Run Test Set → SSE Goal Met? If no: Reselect Training Set or Collect More Data. If yes: Done.]

Fig.10 Neural Network Training Flow Chart

In case one, the weights and biases are reinitialized and training is restarted. In case two,

additional hidden nodes or layers are added, and network training is restarted. Case three is

usually not apparent unless all else fails. When attempting to train a neural network, you want

to end up with the smallest network architecture that trains correctly (meets the error goal); if

not, you may have over fitting. Over fitting is described in greater detail in Section 4.1.4.

Once the smallest network that trains to the desired error goal is found, it must be tested with the

test data set. The test data set should also cover the operating region well. Testing the network

involves presenting the test set to the network and calculating the error. If the error goal is met,

training is complete. If the error goal is not met, there could be two causes:

1. Poor generalization due to an incomplete training set.

2. Over fitting due to an incomplete training set or too many degrees of freedom in the

network architecture.

The cause of the poor test performance is rarely apparent without using cross validation

checking, which will be discussed in Section 4.1.6. If an incomplete training set is causing the poor

performance, the test patterns that have high error levels should be added to the training set, a

new test set should be chosen, and the network should be retrained. If there is not enough data

left for training and testing, data may need to be collected again or be regenerated.

4.1.1 Training and Test Data Selection

Neural network training data should be selected to cover the entire region where the network is

expected to operate. Usually a large amount of data is collected and a subset of that data is used

to train the network. Another subset of that data is then used as test data to verify the correct

generalization of the network. If the network does not generalize well on several data points,

that data is added to the training data and the network is retrained. This process continues until

the performance of the network is acceptable.

The training data should bound the operating region because a neural network's performance

cannot be relied upon outside the operating region. The ability to predict outside that region is

called a network's extrapolation ability.

4.1.2 Scaling Input Vectors

Training data is scaled for two major reasons. First, input data is usually scaled to give each

input equal importance and to prevent premature saturation of sigmoidal activation functions.

Secondly, output or target data is scaled if the output activation functions have a limited range

and the unscaled targets do not match that range.

There are two popular types of input scaling: linear scaling and z-score scaling. Linear

scaling transforms the data into a new range, which is usually 0.1 to 0.9.
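The two scaling schemes named above can be sketched in a few lines. This is an illustrative sketch only, not the code used in this work: the function names are hypothetical, and the z-score variant uses the standard definition (subtract the mean, divide by the standard deviation), which the text does not spell out.

```python
from statistics import mean, pstdev


def linear_scale(values, lo=0.1, hi=0.9):
    """Map values linearly so that min(values) -> lo and max(values) -> hi."""
    vmin, vmax = min(values), max(values)
    span = vmax - vmin
    if span == 0:
        # A constant input carries no information; place it mid-range.
        return [(lo + hi) / 2 for _ in values]
    return [lo + (hi - lo) * (v - vmin) / span for v in values]


def z_score_scale(values):
    """Centre values on zero and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Example input: the three gap-voltage levels used in the experiments.
gap_voltage = [50, 60, 70]
scaled = linear_scale(gap_voltage)     # lowest level -> 0.1, highest -> 0.9
standardized = z_score_scale(gap_voltage)
```

Keeping the linear range inside 0.1 to 0.9 rather than 0 to 1 leaves the scaled values away from the flat tails of a sigmoid, which is the saturation concern raised above.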

4.1.3 Initializing Weights

As mentioned above, the initial weights should be selected to be small random values in order to

prevent premature saturation of the sigmoidal activation functions. The most common method

is to use the random number generator and pass it the number of inputs plus 1 and the number of

hidden nodes for the first hidden layer weight matrix W1 and pass it the number of outputs and

hidden nodes plus 1 for the output weight matrix W2. One is added to the number of inputs in

W1 and to the number of hidden nodes in W2 to account for the bias. To make the weights somewhat smaller, the

resulting random weight matrix is multiplied by 0.5.
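A minimal sketch of this initialization follows, using the matrix shapes described above and the 5-11-2 network size adopted later in Chapter 5. The helper name `init_weights` and the choice of a uniform distribution on [-1, 1] before scaling are assumptions; the text only specifies small random values multiplied by 0.5.

```python
import random


def init_weights(n_rows, n_cols, scale=0.5, rng=None):
    """Return an n_rows x n_cols matrix of random weights in [-scale, scale].

    Assumption: values are drawn uniformly from [-1, 1] and then multiplied
    by `scale`; the source does not name the distribution.
    """
    rng = rng or random.Random()
    return [[scale * rng.uniform(-1.0, 1.0) for _ in range(n_cols)]
            for _ in range(n_rows)]


n_inputs, n_hidden, n_outputs = 5, 11, 2
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# One extra row in W1 and one extra column in W2 hold the bias weights.
W1 = init_weights(n_inputs + 1, n_hidden, rng=rng)    # (inputs + 1) x hidden
W2 = init_weights(n_outputs, n_hidden + 1, rng=rng)   # outputs x (hidden + 1)
```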

4.1.4 Over fitting

Several parameters affect the ability of a neural network to over fit the data. Over fitting is

apparent when a network's error level for the training data is significantly better than the error

level of the test data. When this happens, the network has learned the peculiarities of the training data,

such as noise, rather than the underlying functional relationship of the model to be learned.

Over fitting can be reduced by:

1. Limiting the number of free parameters (neurons) to the minimum necessary.

2. Increasing the training set size so that the noise averages itself out.

3. Stopping training before over fitting occurs.

4.1.5 Neural Network Noise

As discussed above, when there is noise in the training data, a method to calculate the RMS

error goal needs to be used. If there is significant noise in the data, increasing the number of

patterns in the training set can reduce the amount of over fitting.

4.1.6 Stopping Criteria and Cross Validation Training

The last method of reducing the chance of over fitting is cross validation training. Cross

validation training uses the principle of checking for over fitting during training. This

methodology uses two sets of data during training. One set is used for training and the other is

used to check for over fitting. Since over fitting occurs when the neural network models the

training data better than it would other data, checking data is used during training to test for this

over learning behavior.

At each training epoch, the RMS error is calculated for both the training set and the checking set. If

the network has more than enough neurons to model the data, there will be a point during

training when the training error continues to decrease but the checking error levels off and

begins to increase.
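The stopping rule described here can be sketched as a scan over the per-epoch checking errors. The function name, the `patience` parameter and the error values below are all hypothetical and serve only to illustrate the idea; they are not results from this work.

```python
def best_stopping_epoch(checking_errors, patience=3):
    """Return the epoch with the lowest checking error seen so far.

    The scan stops once the checking error has failed to improve for
    `patience` consecutive epochs (an assumed tolerance, not from the source).
    """
    best_epoch = 0
    best_error = float("inf")
    for epoch, err in enumerate(checking_errors):
        if err < best_error:
            best_error, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch


# Made-up RMS error histories: the training error keeps falling while the
# checking error levels off and then rises, the signature of over fitting.
training_rms = [0.30, 0.20, 0.14, 0.10, 0.08, 0.06, 0.05, 0.04]
checking_rms = [0.32, 0.23, 0.18, 0.16, 0.17, 0.19, 0.22, 0.25]

stop_at = best_stopping_epoch(checking_rms)  # epoch index 3 for this data
```

In practice the weights saved at `stop_at` would be kept, since later epochs only fit the training data more closely.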

In summary, there are four methods to reduce the chance of over fitting:

1. Limiting the number of free parameters.

2. Training to a realistic error goal.

3. Increasing the training set size.

4. Using cross validation training to identify when over fitting occurs.

These methods can be used independently or used together to reduce the chance of over fitting.

Chapter 5 ANN modeling of WEDM

5.1 Neural network model

Commercial software MATLAB Version 6.3 is used for coding the Neural Network program.

The stopping criterion used in the current study was a maximum of 2000 epochs, and the

training set was used to train a multilayer network. Whereas the testing set was set once the

difference between sum square error of the actual and predicted values is g x 10"3.

A feed forward neural network is adopted here to model the wire-EDM process. The feed

forward neural network is composed of many interconnected artificial neurons that are often

grouped into input, hidden and output layers (Fig 11).

[Figure 11: network diagram. Input nodes Cd, GV, Ton, Toff and Ws; one layer of hidden nodes with activation f(Hp); output nodes CR and SR.]

Fig. 11. Configuration of the neural network.

The fundamental equation which defines the input-output relationship can be expressed as follows:

$$Y = f(X, W) \qquad \text{(vi)}$$

Where Y represents the performance parameters, such as the MRR and surface roughness; X is

a vector of the input variables to the neural network, and W is the weight matrix that is

evaluated in the network training process. f (.) represents the model of the process that is to be

built through neural network training.

The modeling phase involves the establishment of the model using multilayer feed forward

neural network architecture. The back propagation algorithm finds the optimum values of the

weights that minimize the error between the target and the calculated (network output)

performance parameters. Fig. 11 shows the network architecture of the developed model.

The following relations were used to combine the inputs of the network at the nodes of the

hidden layer and the output layer, respectively.

$$H_p = \sum_h V_{hp} X_h, \qquad O_q = \sum_p W_{pq} Z_p \qquad \text{(vii)}$$

The outputs at the hidden layer ($Z_p = f(H_p)$) and at the output layer ($Y_q = f(O_q)$) are both calculated using

a sigmoid function, mainly because of its well-known use as a transfer function for many

applications. Combining equations (vi) and (vii), the output of the network is

given by the following relation:

$$Y_q = f(O_q) = f\Big(\sum_p W_{pq} Z_p\Big) = f\Big(\sum_p W_{pq}\, f\Big(\sum_h V_{hp} X_h\Big)\Big)$$

Finally, the output of the network ($Y_q$) was compared with the measured performance ($T_q$) of

the process using a simple sum of square error ($E_q$) as follows:

$$E_q = \sum_{k} (Y_q - T_q)^2$$
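The relations above translate directly into a forward pass. The sketch below is illustrative only: the names `forward` and `sum_square_error` are hypothetical, bias terms are omitted because the relations as written do not show them, and the tiny zero-weight network at the end is a made-up check rather than the trained 5-11-2 model.

```python
import math


def sigmoid(x):
    """Logistic transfer function f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, V, W):
    """Forward pass of a single-hidden-layer network.

    V[h][p] is the weight from input h to hidden node p,
    W[p][q] is the weight from hidden node p to output q.
    """
    n_hidden = len(V[0])
    n_outputs = len(W[0])
    # H_p = sum_h V_hp * X_h, then Z_p = f(H_p)
    H = [sum(V[h][p] * x[h] for h in range(len(x))) for p in range(n_hidden)]
    Z = [sigmoid(Hp) for Hp in H]
    # O_q = sum_p W_pq * Z_p, then Y_q = f(O_q)
    O = [sum(W[p][q] * Z[p] for p in range(n_hidden)) for q in range(n_outputs)]
    return [sigmoid(Oq) for Oq in O]


def sum_square_error(Y, T):
    """Sum of squared differences between network outputs Y and targets T."""
    return sum((y - t) ** 2 for y, t in zip(Y, T))


# Made-up 2-2-1 network with all weights zero: every sigmoid sees 0,
# so the single output is f(0) = 0.5.
Y = forward([1.0, 2.0], V=[[0.0, 0.0], [0.0, 0.0]], W=[[0.0], [0.0]])
error = sum_square_error(Y, [0.1])
```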

The artificial neuron evaluates the inputs and determines the strength of each one through its

weighting factor calculated by the back-propagation learning algorithm [Appendix - 1]. The

weighted inputs are summed to determine the output of the neuron using a sigmoid transfer

function. The output of the neuron is then transmitted along the weighted outgoing connections

to serve as an input to subsequent neurons. In this study, the neurons of the input and output

layers are used to receive the input variable of cutting parameters and to send out the output

variable of cutting performance, respectively. To properly map the input and output

relationships in the wire-EDM process with the neural network, finite discrete samples of

experimental data are required for training the neural network given in section 5.2. During the

training process, the number of neurons in the hidden layer is determined by trial-and-error

experimentation. It is found that a single hidden layer with 11 neurons can provide better

convergence in modeling the wire-EDM process. As shown in Fig. 12, the sum of square error

(SSE) between the desired and predicted outputs is almost reduced to zero after 2000 iterations

during the training process. Therefore, a feed forward neural network with a 5-11-2 type is

adopted here to associate the cutting parameters with the cutting performance.

5.2 Experimental details

Titanium alloy was chosen as the work material and work piece thickness was kept as 5 mm.

Brass wire of 0.25mm was used for all the experiments. Experiments were planned using a

factorial design based on Taguchi's L18 (2^1 × 3^4) orthogonal array. The machining voltage

(Va) was maintained at 80 V and the conductivity of the dielectric (Cd) at 50 and 250 µ-mho. The other

four parameters were maintained at three levels: pulse duration (Ton) at 1.1, 1.2 and 1.3 µs; time

between two pulses (Toff) at 30, 34, 38µs; gap voltage (GV) at 50, 60, 70 volts; and wire speed

(Ws) at 4, 6, 8 m/min. For testing the results, 16 experiments were conducted, on the basis of

randomly selected input parameters. For each set of parameters the workpiece was straight cut

for a length of 10 mm.

The linear cutting rate reading was noted down. Each piece was cleaned and the

surface roughness was measured as the Ra value using a profilometer. The average of six readings

taken perpendicular to the direction of cut was chosen as the surface roughness value. The

results of the experiments given in Table 1 are based on Taguchi's method and Table 2 gives

data obtained by randomly selecting the input parameters.

Table 1 Training data: MRR and surface finish for experiments planned according to Taguchi's method (a)

Sl. no.  Cd (µ-mho)  GV (V)  Ton (µs)  Toff (µs)  Ws (m/min)  CR (mm/min)  SF (micron)
1        50          50      1.1       30         4           4.1          2.88
2        50          50      1.2       34         6           4.1          3.01
3        50          50      1.3       38         8           3.9          3.15
4        50          60      1.1       30         6           3.3          3.02
5        50          60      1.2       34         8           3.2          3.15
6        50          60      1.3       38         4           3.1          3.64
7        50          70      1.1       34         4           2.2          2.74
8        50          70      1.2       38         6           2.1          2.28
9        50          70      1.3       30         8           2.7          3.21
10       250         50      1.1       38         8           3.1          3.18
11       250         50      1.2       30         4           4.2          3.23
12       250         50      1.3       34         6           4.1          3.22
13       250         60      1.1       34         8           2.8          2.71
14       250         60      1.2       38         4           2.9          3.02
15       250         60      1.3       30         6           3.6          3.00
16       250         70      1.1       38         6           1.8          3.06
17       250         70      1.2       30         8           2.5          3.08
18       250         70      1.3       34         4           2.4          3.13

(a) Workpiece height, Hw = 5 mm
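Because the L18 array is balanced, the average response at each level of a factor gives a quick first look at that factor's effect. The sketch below computes the mean cutting rate at each gap-voltage level from the Table 1 data; it is offered as an illustration of how the table can be read, not as an analysis reported in this work, and the helper name is hypothetical.

```python
from collections import defaultdict

# Rows of Table 1: (Cd, GV, Ton, Toff, Ws, CR, SF)
TABLE_1 = [
    (50, 50, 1.1, 30, 4, 4.1, 2.88), (50, 50, 1.2, 34, 6, 4.1, 3.01),
    (50, 50, 1.3, 38, 8, 3.9, 3.15), (50, 60, 1.1, 30, 6, 3.3, 3.02),
    (50, 60, 1.2, 34, 8, 3.2, 3.15), (50, 60, 1.3, 38, 4, 3.1, 3.64),
    (50, 70, 1.1, 34, 4, 2.2, 2.74), (50, 70, 1.2, 38, 6, 2.1, 2.28),
    (50, 70, 1.3, 30, 8, 2.7, 3.21), (250, 50, 1.1, 38, 8, 3.1, 3.18),
    (250, 50, 1.2, 30, 4, 4.2, 3.23), (250, 50, 1.3, 34, 6, 4.1, 3.22),
    (250, 60, 1.1, 34, 8, 2.8, 2.71), (250, 60, 1.2, 38, 4, 2.9, 3.02),
    (250, 60, 1.3, 30, 6, 3.6, 3.00), (250, 70, 1.1, 38, 6, 1.8, 3.06),
    (250, 70, 1.2, 30, 8, 2.5, 3.08), (250, 70, 1.3, 34, 4, 2.4, 3.13),
]


def mean_response_by_level(rows, factor_index, response_index):
    """Average the chosen response over all rows sharing a factor level."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[factor_index]].append(row[response_index])
    return {level: sum(vals) / len(vals) for level, vals in sorted(groups.items())}


# Mean cutting rate (column 5) at each gap-voltage level (column 1):
# roughly 3.92, 3.15 and 2.28 mm/min at 50, 60 and 70 V respectively.
cr_by_gv = mean_response_by_level(TABLE_1, factor_index=1, response_index=5)
```

The falling averages agree with the trend discussed for gap voltage in Section 5.4.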

Table 2 Test data: MRR and surface finish for experiments with randomly selected input parameters (a)

Sl. no.  Cd (µ-mho)  GV (V)  Ton (µs)  Toff (µs)  Ws (m/min)  CR (mm/min)  SR (micron)
1        250         60      1.2       35         7.5         3.0          3.01
2        50          58      1.1       30         7.9         3.5          2.82
3        50          66      1.2       30         7.6         2.9          2.94
4        50          68      1.26      33         5.6         2.7          2.92
5        250         56      1.19      31.7       6.6         3.7          3.33
6        50          65      1.25      30         6.1         3.2          3.15
7        250         64      1.19      34.7       4.8         2.7          3.12
8        50          68      1.23      30.4       5.2         2.8          3.27
9        250         52      1.29      31.8       4.0         4.2          3.11
10       250         68      1.17      32         4.0         2.6          2.99
11       250         62      1.22      34.9       7.9         2.8          2.93
12       50          50      1.2       30         7.2         4.4          2.93
13       250         54      1.26      32.9       7.0         3.9          3.22
14       50          60      1.19      33.6       7.6         3.2          2.97
15       250         58      1.24      32.2       4.5         3.5          3.10
16       50          70      1.24      31.4       4.4         2.7          3.06

(a) Workpiece height, Hw = 5 mm

5.3 Results and Discussion

To properly map the input and output relationships in the wire EDM process with the neural

network, finite discrete samples of experimental data given in Tables 1 and 2 are used for

training and testing the network. During the training process, the number of neurons in the

hidden layer is determined by trial-and-error experimentation as discussed in section 4.1. It is

found that a single hidden layer with 11 neurons can provide better convergence in modeling the

wire EDM process. As shown in Fig. 12, the sum of square errors (SSE) between the desired and

predicted outputs is almost reduced to zero after 2000 iterations during the training process. The

network is further tested by applying test data and showed good prediction capability with sum

of square errors close to 0.01. Therefore, a feed forward neural network with 5-11-2 type (Fig

11) is adopted here to associate the cutting parameters with the cutting performance.

[Figure 12: plot of the sum of square errors for the training data (vertical axis, 0 to 0.035) against the number of epochs (horizontal axis, 0 to 2000).]

Fig. 12 Sum of square error vs number of iterations in the training process

5.4 The effect of the cutting parameters on the performance of the process according to the

developed model

In the following, the effect of the cutting parameters on the cutting performance is studied

one by one based on the developed neural network. In reality, the cutting parameters interact

in their effect on the cutting performance. To separate the effect caused by each cutting parameter, the

other cutting parameters are set to a middle value in the allowable working space when one of

the cutting parameters is varied and analyzed. The effects of the variations of the cutting parameters

on the machining speed and machined surface roughness are shown in figures, accompanied by

an explanation of the effect of each cutting parameter on the machining speed and machined

surface roughness.
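The one-parameter-at-a-time procedure described above can be sketched as follows. The parameter ranges are taken from the experimental levels in Section 5.2; the function name `one_factor_sweep` is hypothetical, and `placeholder_model` is a made-up linear stand-in with no physical meaning, used only so the sketch runs. In the actual study the trained network would take its place.

```python
def one_factor_sweep(model, ranges, factor, n_points=5):
    """Vary one parameter across its range with the others held at mid-range.

    `model` is any callable that maps a dict of parameter values to a number.
    Returns a list of (factor value, model output) pairs.
    """
    midpoint = {name: (lo + hi) / 2 for name, (lo, hi) in ranges.items()}
    lo, hi = ranges[factor]
    results = []
    for i in range(n_points):
        point = dict(midpoint)
        point[factor] = lo + (hi - lo) * i / (n_points - 1)
        results.append((point[factor], model(point)))
    return results


# Working ranges taken from the experimental levels in Section 5.2.
RANGES = {"Cd": (50, 250), "GV": (50, 70), "Ton": (1.1, 1.3),
          "Toff": (30, 38), "Ws": (4, 8)}


def placeholder_model(params):
    """Hypothetical stand-in for the trained network (NOT the real model)."""
    return 10.0 - 0.1 * params["GV"]


sweep = one_factor_sweep(placeholder_model, RANGES, "GV", n_points=3)
```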

(1) Gap voltage (GV): As can be seen in Fig. 13, the higher the gap voltage, the longer the

discharge off time (Toff). To obtain the longer discharge off time, the machining speed needs to

be slowed down. This will lead to a wider average discharge gap. Therefore, the discharge

condition becomes more stable but the number of discharge cycles decreases within a given

period. Owing to this stable machining, surface accuracy becomes better.

[Figure 13: surface plots against GV (volt) and Cd (micro-mho), panels (a) and (b).]

Fig.13 Surfaces show the relationship of GV with (a) cutting rate (CR), (b) surface roughness (SR)

(2) Pulse on time (Ton): It can be seen (Fig. 14) that the machining speed increases with an increase in

the pulse on time. In contrast, the surface finish deteriorates as the pulse on time increases

(Fig. 14). This is because the discharge energy increases with the pulse on time. As a result,

the machining speed becomes faster with the increase of the discharge energy. However, in the

meantime, the discharge gap becomes wider and the surface roughness increases.

[Figure 14: surface plots, panels (a) and (b).]

Fig. 14 Surfaces show the relationship of Ton with (a) cutting rate (CR), (b) surface roughness (SR).

(3) Pulse off time (Toff): As the pulse off time is decreased, the number of discharges within a

given period increases. This leads to a higher machining speed. However, the surface finish

becomes poorer because of the larger number of discharges (Fig.15).

Fig. 15 Surfaces show the relationship of Toff with (a) cutting rate (CR), (b) surface roughness (SR).

(4) Wire feed speed (Ws): Fig. 16 shows that as the wire speed increases, the discharge density
at a particular place and time in the discharge gap decreases, because the capability to
evacuate by-products from the discharge gap increases with wire speed. This in turn means that
the cutting speed decreases owing to the lower input energy per unit time and space. The
surface finish improves owing to more stable machining.

Fig. 16 Surfaces show the relationship of Ws with (a) cutting rate (CR), (b) surface roughness (SR).

Chapter 6 Optimization of wire EDM process parameters

In this phase, the input parameters to the network were coded as chromosomes for genetic
evaluation. Since the modeling part of the problem has already been solved in the previous
phase, the optimization phase is straightforward. In this case, the structure of the network
(Fig. 22) is seen as a black box by the user.


Fig.22 Structure of the optimization system

In this work, the ANN is combined with a GA to find the optimum. (The introductory concepts of
GA and the GA operators are discussed in Appendix-2.) To search for the optimum, the GA requires
the optimized weights of the ANN. The ANN first provides the GA with the final weight settings
of each neuron after training and validation of the network (Fig. 22). Consequently, the GA and
ANN programs must be linked and must exchange data with each other. In the current study, the
procedure adopted is as follows. First, the ANN program writes the selected optimal weight
settings to a text file. The text file is then read by the GA program and used as the ANN
parameters. The GA then optimizes the parametric setting using the constrained optimization
technique. The surface roughness (SR) predicted in this procedure is compared with the
designer's surface roughness requirement (limit). If the value exceeds the limit, the GA
generates new input parameters through the GA operators, i.e. mutation and crossover
(Appendix-2). These steps are repeated until the optimal cutting rate is found for the given
surface roughness limit. At the end of this iterative process, the GA arrives at the set of
machining parameters that produces the optimal cutting rate within the acceptable limit of
surface roughness.
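The text-file hand-off between the two programs might look like the following Python sketch (the thesis programs were written in Matlab; this is only an illustration). The file layout (one row of weights per neuron, bias last), the tanh hidden activation and the linear output layer are all assumptions made here; only the 5-11-2 layer sizes come from the thesis.

```python
import math
import os
import random
import tempfile

N_IN, N_HID, N_OUT = 5, 11, 2  # 5-11-2 layout of the network

def write_weights(path, w_hidden, w_output):
    """ANN side: write one row per neuron (assumed layout: weights, then bias)."""
    with open(path, "w") as fh:
        for row in w_hidden + w_output:
            fh.write(" ".join(repr(v) for v in row) + "\n")

def read_weights(path):
    """GA side: read the rows back and split them into hidden and output layers."""
    with open(path) as fh:
        rows = [[float(tok) for tok in line.split()] for line in fh if line.strip()]
    return rows[:N_HID], rows[N_HID:N_HID + N_OUT]

def predict(x, w_hidden, w_output):
    """Black-box forward pass returning (CR, SR) for a list of scaled inputs x.
    tanh hidden units and linear outputs are assumptions, not taken from the thesis."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    return tuple(sum(w * v for w, v in zip(row, hidden + [1.0])) for row in w_output)

# Usage with random placeholder weights (a trained network would supply real ones).
rng = random.Random(0)
w_h = [[rng.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_o = [[rng.uniform(-1, 1) for _ in range(N_HID + 1)] for _ in range(N_OUT)]
weights_path = os.path.join(tempfile.mkdtemp(), "weights.txt")
write_weights(weights_path, w_h, w_o)
r_h, r_o = read_weights(weights_path)
print(predict([0.1, 0.2, 0.3, 0.4, 0.5], r_h, r_o))
```

Because the GA only calls `predict`, the network remains a black box to the optimizer, as described above.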

6.1 Why constrained optimization technique?

Two questions must be answered with regard to selection of parameters: What is the best

parameter combination? And how can we get it? In the case of multiple objectives, it is known

that no perfect run exists that can result in both the best cutting speed and surface roughness.

However, in the production environment, the surface finish quality of a workpiece, which is

determined by the designer or process engineer, must be fulfilled, and productivity is of

secondary importance, when compared with the quality requirement.

Therefore, the best parameters can be regarded as those that maximize productivity and fulfill

the surface finish quality requirements. For the present approach, the best combination of

parameter levels should produce the maximum cutting speed, while the surface roughness is

within requirement. This problem can be represented and solved by a constrained optimization

technique. The optimization model can be expressed as:

Max speed = CR (Cd, GV, Ton, Toff, Ws)

Subject to 0 < SR (Cd, GV, Ton, Toff, Ws) < a

50 <= Cd <= 250

50 <= GV <= 70

1.1 <= Ton <= 1.3

30 <= Toff <= 38

4 <= Ws <= 8

where a is the maximum allowable Ra value. The value of a should lie within the range of
predicted Ra values, i.e. between 2.28 µm and 3.64 µm.

The functions CR (*) and SR (*), the latter giving the predicted Ra, are represented by the ANN
model. For a given a, the solution of the problem is obtained from the GA optimizer, whose
output is the parameter combination. The GA program is coded in Matlab to solve the
optimization problem.
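The constrained search can be illustrated with a small real-coded GA. The sketch below is in Python rather than the Matlab used in this work, and several pieces are assumptions introduced only for illustration: `surrogate` is a made-up stand-in for the ANN, the constraint is handled with a penalty term, and truncation selection, uniform crossover and Gaussian mutation are chosen for brevity. The thesis does not state which constraint handling or operators were used.

```python
import random

BOUNDS = [(50, 250), (50, 70), (1.1, 1.3), (30, 38), (4, 8)]  # Cd, GV, Ton, Toff, Ws

def surrogate(x):
    """Hypothetical stand-in for the trained ANN; returns (CR, SR)."""
    cd, gv, ton, toff, ws = x
    cr = 2.0 + 8.0 * (ton - 1.1) + 0.1 * (38 - toff) + 0.05 * (8 - ws)
    sr = 2.2 + 4.0 * (ton - 1.1) + 0.08 * (38 - toff) + 0.0002 * (cd - 50)
    return cr, sr

def fitness(x, limit, penalty=100.0):
    """Cutting rate, reduced by a penalty when SR exceeds the allowed limit a."""
    cr, sr = surrogate(x)
    return cr - penalty * max(0.0, sr - limit)

def clip(x):
    """Keep every gene inside its bound constraint."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, BOUNDS)]

def ga_maximize(limit, pop_size=40, generations=60, p_mut=0.2, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda x: fitness(x, limit), reverse=True)
        parents = pop[: pop_size // 2]                 # truncation selection, elite kept
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]          # uniform crossover
            child = [v + rng.gauss(0, 0.05 * (hi - lo)) if rng.random() < p_mut else v
                     for v, (lo, hi) in zip(child, BOUNDS)]           # Gaussian mutation
            children.append(clip(child))
        pop = parents + children
    best = max(pop, key=lambda x: fitness(x, limit))
    return best, surrogate(best)

best, (cr, sr) = ga_maximize(limit=3.0)
print([round(v, 2) for v in best], round(cr, 2), round(sr, 3))
```

With the real ANN in place of `surrogate`, one run per value of a would yield one row of a table such as Table 3.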


Table 3 lists the solutions, from which the best parametric combination can be selected. For
example, if the roughness of the workpiece surface should be less than 3.0 µm, the best
parametric combination would be (227, 54, 1.2, 33, 4), which yields a cutting speed of 3.35
mm/min; other combinations either yield a lower cutting speed or violate the surface finish
requirement.

Table 3 Process performance optimization

Parametric combinations
Cd     GV    Ton    Toff   Ws    CR (mm/min)   SR (micron)
99     54    1.1    32     5     1.92          2.227
62     52    1.2    37     6     2.03          2.289
69     58    1.1    36     7     2.24          2.310
63     62    1.1    37     7     2.28          2.311
62     59    1.1    35     5     2.34          2.320
87     62    1.1    38     7     2.53          2.329
58     61    1.1    35     6     2.74          2.338
81     63    1.1    37     7     2.78          2.349
60     68    1.1    37     7     2.88          2.360
51     67    1.1    36     6     3.01          2.371
50     64    1.1    35     6     3.04          2.381
100    64    1.1    38     7     3.06          2.418
243    60    1.1    30     6     3.10          2.508
85     55    1.2    35     5     3.12          2.599
70     50    1.3    38     7     3.18          2.700
116    50    1.1    34     8     3.18          2.799
228    67    1.1    35     6     3.23          2.899
235    53    1.3    34     4     3.29          2.950
227    54    1.2    33     4     3.35          3.000
54     51    1.1    30     7     3.38          3.15
166    63    1.2    31     7     3.39          3.199
121    50    1.2    30     7     3.48          3.302
88     53    1.3    33     8     3.50          3.400
77     51    1.3    31     7     3.60          3.450
174    58    1.2    38     5     3.66          3.502
93     62    1.3    37     5     3.67          3.549
81     57    1.3    35     4     3.83          3.610
80     68    1.3    36     7     4.14          3.612
110    66    1.3    34     4     4.20          3.619
71     69    1.3    36     4     4.20          3.629
96     58    1.3    34     4     4.27          3.388

The five sample settings of the five cutting parameters obtained from the optimization technique

are listed in Table 4. As indicated in Table 4, the errors between the expected and experimental

performance results are reasonably small.


Table 4 Actual vs predicted WEDM performances

Parametric combinations          CR (mm/min)                        SR (micron)
Cd     GV    Ton    Toff   Ws    Prediction   Actual   Error (%)    Prediction   Actual   Error (%)
69     58    1.1    36     7     2.24         2.3      2.6          2.310        2.4      3.89
60     68    1.1    37     7     2.88         2.65     7.9          2.360        2.53     7.20
243    60    1.1    30     6     3.10         3.3      6.45         2.508        2.43     3.11
227    54    1.2    33     4     3.35         3.45     2.98         3.000        2.89     3.66
80     68    1.3    36     7     4.14         4.4      6.28         3.612        3.89     7.69
Average error (%)                                      5.24                               5.11
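The error columns of Table 4 can be recomputed from the prediction and actual columns. The Python sketch below assumes the error is the absolute difference divided by the predicted value; that normalization is inferred from the tabulated numbers and is not stated in the text, and the recomputed values agree with the table only to within rounding.

```python
# (CR predicted, CR actual, SR predicted, SR actual), copied from Table 4
rows = [
    (2.24, 2.30, 2.310, 2.40),
    (2.88, 2.65, 2.360, 2.53),
    (3.10, 3.30, 2.508, 2.43),
    (3.35, 3.45, 3.000, 2.89),
    (4.14, 4.40, 3.612, 3.89),
]

def percent_error(predicted, actual):
    """Absolute deviation relative to the prediction (assumed convention), in percent."""
    return abs(actual - predicted) / predicted * 100.0

cr_errors = [percent_error(p, a) for p, a, _, _ in rows]
sr_errors = [percent_error(p, a) for _, _, p, a in rows]
mean_cr = sum(cr_errors) / len(cr_errors)
mean_sr = sum(sr_errors) / len(sr_errors)
print(f"mean CR error = {mean_cr:.2f} %, mean SR error = {mean_sr:.2f} %")
```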

6.2 Search for Pareto-optimal WEDM process parameters

In the case of multiple objectives, there may not exist one solution that is the best, or global
optimum, with respect to all objectives. The presence of multiple objectives in a problem
usually gives rise to a family of non-dominated or non-inferior solutions, known as
Pareto-optimal solutions, where each objective component of any solution along the Pareto front
can only be improved by degrading at least one of its other objective components. Since none of
the solutions in the non-dominated set is absolutely better than any other, any one of them is
an acceptable solution. As it is difficult to choose a particular solution for a
multi-objective optimization problem without iterative interaction with the decision maker, one
general approach is to establish the entire set of Pareto-optimal solutions.

By searching for the Pareto-optimal set, one can find multiple optimal solutions. The fitted ANN
model is assumed to represent the relationship between process performance and the controllable
factors, and is used to predict the performance for 625 randomly generated combinations of
input parameter levels. Fig. 23 illustrates the prediction results. This figure does not
directly illustrate the process response with respect to the input factors but gives a visual
demonstration of the relationship between the predicted responses (cutting rate vs. roughness).
Every point corresponds to a particular combination of input parameter levels. Fig. 23 shows an
approximate tendency for a smaller surface roughness to correspond to a slower cutting speed. A
faster cutting speed (higher productivity) will therefore result in a larger roughness (worse
surface finish).

All 625 outputs for cutting speed and surface roughness are plotted in Fig. 23. For
convenience, CR and 1/Ra are taken as the X- and Y-axes, respectively. The Pareto-optimal
solutions have to be searched out from these 625 outputs. Here, a Pareto-optimal solution is
one that is better than any other output with respect to at least one process criterion, i.e.
CR or 1/Ra. If one parameter combination is higher than a second in both process criteria, or
is higher in at least one process criterion and equal in the other, then the second parametric
combination should never be selected in preference to the first. In other words, graphically, a
point is not optimal if there is any other point that lies above and to the right of it. If two
points have the same coordinates, both are considered.
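The dominance rule stated above translates directly into a short filter. The Python sketch below is an illustration only (the thesis used an Excel program for this step); the sample points are four (CR, SR) pairs copied from Table 3 plus a repeated first point added here to show the tie rule.

```python
def dominates(p, q):
    """True if p is at least as good as q in both criteria and strictly better in one."""
    return p[0] >= q[0] and p[1] >= q[1] and (p[0] > q[0] or p[1] > q[1])

def pareto_front(points):
    """Keep every point that no other point dominates; identical points all survive."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# (CR, SR) pairs: first four from Table 3, last one a deliberate duplicate of the first.
samples = [(1.92, 2.227), (2.03, 2.289), (3.18, 2.700), (3.18, 2.799), (1.92, 2.227)]
points = [(cr, 1.0 / sr) for cr, sr in samples]   # both CR and 1/Ra are maximized
front = pareto_front(points)
print(front)
```

Here the point with CR = 3.18 and SR = 2.799 is dropped, because another point reaches the same cutting rate with a lower roughness. The pairwise comparison costs O(n^2), which is negligible for 625 points.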

Fig. 23. Machining performance predictions of the ANN model for all 625 combinations (x-axis: CR, mm/min; y-axis: 1/Ra; markers: non-optimal points and Pareto-optimal points).

6.2.1 Discussion

Some specific points situated at the boundary constitute the Pareto-optimal front, as shown in
Fig. 23. All points outside this set of optimal points are undesirable. An Excel program was
used to extract these optimal points from the set of all 625 points. It was observed that only
38 of the 625 points are optimal. This set of Pareto-optimal solutions is very useful because
the manufacturing engineer can switch between different optimal solutions as and when required.
This is a major advantage of this approach over the constrained optimization technique: once
the Pareto-optimal set is available, there is no need to run the program again. Just by
scanning the chart of optimal solutions, one can readily find the optimum parametric
combination for a given surface roughness requirement. Table 5 contains the list of all 38
optimum parametric combinations, sorted by increasing Ra. This chart may be used as a
technology guideline for optimum machining of titanium alloy (Ti-6Al-4V). For example, if the
required Ra value is less than or equal to 2.7 µm, then the best parametric combination, i.e.
the one that maximizes the cutting speed, is given at serial number 18 in the technology
guideline shown in Table 5. Fig. 24 shows a plot of all these optimum points. From this plot it
can be observed that the surface roughness increases as the maximum cutting speed increases.
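Scanning the guideline chart for a given roughness requirement amounts to a simple lookup. The Python sketch below is illustrative; it uses only a handful of rows copied from the table of sorted Pareto-optimal points (Table 5), and the function name is hypothetical.

```python
# (serial number, CR in mm/min, SR in micron): a few rows copied from Table 5
PARETO_ROWS = [
    (17, 3.47, 2.600), (18, 3.66, 2.626), (19, 3.80, 2.702),
    (24, 4.15, 2.893), (25, 4.28, 2.999), (26, 4.31, 3.081),
]

def best_for_limit(rows, ra_limit):
    """Return the row with the highest CR whose SR does not exceed ra_limit, or None."""
    feasible = [r for r in rows if r[2] <= ra_limit]
    return max(feasible, key=lambda r: r[1]) if feasible else None

print(best_for_limit(PARETO_ROWS, 3.0))
```

For a limit of 3.0 micron this picks serial number 25 among the rows listed; a limit below every tabulated SR returns `None`.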

Fig. 24. Maximized cutting speed vs. surface roughness (x-axis: minimum Ra, micron; plotted series: actual data).

Table 5 Sorted Pareto-optimal points

Parametric combinations
S.no   Cd     GV     Ton    Toff   Ws    CR (mm/min)   SR (micron)
1      57     50     1.1    37     7     1.54          2.152
2      52     50     1.1    37     6.7   1.57          2.157
3      61     51     1.11   36.2   6.5   1.60          2.169
4      54     50.4   1.10   35.4   5.8   1.67          2.187
5      51     50.2   1.10   35.1   5.8   1.68          2.188
6      73     52.4   1.12   35.1   6.2   1.78          2.220
7      66     51.7   1.11   34.2   6.1   1.79          2.230
8      74     52.5   1.12   34.9   5.6   1.92          2.250
9      64     51.5   1.11   34.2   5.3   2.00          2.262
10     57     50.7   1.10   33.6   5.3   2.06          2.274
11     67     51.8   1.11   34.1   5.1   2.13          2.303
12     62     51.3   1.11   33.2   5.3   2.34          2.327
13     104    55.7   1.15   37.5   7.0   2.49          2.412
14     59     51.0   1.10   32.2   5.0   2.98          2.473
15     94     54.6   1.14   34.4   5.2   3.21          2.544
16     88     54.0   1.13   33.7   6.1   3.35          2.580
17     104    55.7   1.15   35.5   5.9   3.47          2.600
18     98     55.0   1.14   34.4   5.8   3.66          2.626
19     90     54.2   1.14   33.2   5.7   3.80          2.702
20     89     54.1   1.13   33.0   5.5   3.92          2.734
21     85     53.7   1.13   32.6   4.9   4.03          2.783
22     99     55.1   1.14   33.5   5.2   4.07          2.814
23     100    55.3   1.15   33.5   5.2   4.14          2.857
24     102    55.5   1.15   33.3   5.7   4.15          2.893
25     102    55.5   1.15   33.0   5.0   4.28          2.999
26     128    58.2   1.17   33.8   5.5   4.31          3.081
27     102    55.5   1.15   32.1   5.3   4.35          3.116
28     102    55.5   1.15   32.3   4.9   4.37          3.124
29     83     53.4   1.13   31.3   4.8   4.37          3.160
30     129    58.3   1.17   32.5   5.0   4.37          3.170
31     128    58.2   1.17   31.8   5.0   4.39          3.212
32     153    60.9   1.20   30.0   4.7   4.39          3.284
33     115    56.9   1.16   31.2   4.9   4.42          3.301
34     110    56.4   1.16   31.6   4.5   4.44          3.380
35     140    59.4   1.19   30.0   4.6   4.44          3.406
36     80     53.1   1.13   30.1   4.6   4.45          3.424
37     111    56.4   1.16   31.0   4.4   4.46          3.453
38     132    58.6   1.18   30.2   4.2   4.46          3.509


Chapter 7

Summary and Conclusions

For optimization of the WEDM process, experiments were planned using a factorial design based
on Taguchi's L18 orthogonal array (2^1 x 3^4) to establish the relationship between the control
variables and the performance and productivity measures. To model the process, pulse width,
time between two pulses, gap voltage, conductivity of the dielectric and wire feed speed were
selected as the control factors. Cutting speed and workpiece surface roughness were selected as
the process outputs.

A 5-11-2 feed-forward back-propagation ANN model was developed to represent the WEDM process.
The test analysis shows a close fit of the developed model to the experimental data. The ANN
model was then used to predict the process performance, and the influence of the various
process parameters on the machining criteria was examined on the basis of this model. Finally,
the process was optimized using a constrained optimization algorithm. The Pareto front for the
process was also found (the 38 Pareto-optimal solutions were searched out from the set of all
625 outputs).

From this thesis work the following conclusions can be drawn.

• The results from the neural network show that the model is able to predict the process
performance, namely cutting speed and surface roughness, within a reasonably large range of
input factor levels. Within the investigated range, the ANN model fits the data satisfactorily
and has good predictive capability for Ra and the cutting speed. From the results presented in
this work, it can be concluded that this technique can be extended to processes exhibiting
similar stochastic character and complexity.


• In the validation experiments, the errors between the expected and experimental cutting
performance results are reasonably small for the process parameter settings optimized using the
constrained optimization method.


• The constrained optimization approach is very useful for maximizing the productivity

while maintaining surface roughness within desired limit.

• The set of 38 Pareto-optimal solutions is very useful and will act as a guideline for

optimum machining of the titanium alloy.

• The technology settings developed by searching the Pareto-optimal front for wire electrical
discharge machining of titanium alloy have potential in modern industrial applications for
efficient manufacturing of precision jobs.

• In addition, the efficiency of determining optimal cutting parameters in the process

planning of wire-EDM can be dramatically improved by using this approach.

7.1 Scope for future research

Further research might attempt to take more factors, such as wire tension, workpiece material,
and workpiece height, into account as process inputs. Other performance criteria, such as the
surface cross-sectional microstructure, might be investigated. The techniques presented in this
study might also be tried on the finishing operation of WEDM or on other machining processes.


Appendix -1

Neural Networks: an overview

Introduction to artificial neural network

There are a number of different answers possible to the question of how to define neural

networks. At one extreme, the answer could be that neural networks are simply a class of

mathematical algorithms, since a network can be regarded essentially as a graphic notation for a

large class of algorithms. Such algorithms produce solutions to a number of specific problems.

At the other end, the reply may be that these are synthetic networks that emulate the biological

neural networks found in living organisms. In light of today's limited knowledge of biological

neural networks and organisms, the more plausible answer seems to be closer to the algorithmic

one.

In search of better solutions for engineering and computing tasks, many avenues have been

pursued. There has been a long history of interest in the biological sciences on the part of

engineers, mathematicians, and physicists endeavoring to gain new ideas, inspirations, and

designs. Artificial neural networks have undoubtedly been biologically inspired, but the close

correspondence between them and real neural systems is still rather weak. Vast discrepancies

exist between both the architectures and capabilities of artificial and natural neural networks.

Knowledge about actual brain functions is so limited, however, that there is little to guide

those who would try to emulate them. No models have been successful in duplicating the

performance of the human brain. Therefore, the brain has been and still is only a metaphor for a

wide variety of neural network configurations that have been developed.

Despite the loose analogy between artificial and natural neural systems, we will briefly review

the biological neuron model. The synthetic neuron model will subsequently be defined in this

chapter and examples of network classes will be discussed. The basic definitions of neuron and

elementary neural networks will also be given. Since no common standards are yet used in the

technical literature, this part of the chapter will introduce notation, graphic symbols, and

terminology used in this text. The basic forms of neural network processing will also be

discussed.


Biological neurons and their artificial models

A human brain consists of approximately 10^11 computing elements called neurons. They
communicate through a connection network of axons and synapses having a density of
approximately 10^4 synapses per neuron.

Biological Neuron

The elementary nerve cell, called a neuron, is the fundamental building block of the biological
neural network. Its schematic diagram is shown in Figure 5. A typical cell has three major
regions: the cell body, which is also called the soma, the axon, and the dendrites. Dendrites
form a dendritic tree, which is a very fine bush of thin fibers around the neuron's body.
Dendrites receive information from neurons through axons, long fibers that serve as
transmission lines. An axon is a long cylindrical connection that carries impulses from the
neuron. The end part of an axon splits into a fine arborization. Each branch of it terminates
in a small end bulb almost touching the dendrites of neighboring neurons. The axon-dendrite
contact organ is called a synapse. The synapse is where the neuron introduces its signal to the
neighboring neuron. The signals reaching a synapse and received by dendrites are electrical
impulses.

Figure 5 Schematic diagram of a neuron and a sample of pulse train (labels: incoming axons from other neurons, dendrites, cell body, axon hillock, axon, synapse, terminal bouton, receiving neuron, impulse).

Neuron Modeling for Artificial Neural Systems

Weights and the neurons' thresholds are fixed in the model and no interaction among network

neurons takes place except for signal flow. Thus, we will consider this model as a starting point

for our neuron modeling discussion. Specifically, the artificial neural systems and computing

algorithms employ a variety of neuron models that have more diversified features than the

model just presented. Below, the main artificial neuron models are introduced that will be used

later in this text.

Figure 6 General symbol of neuron consisting of processing node and synaptic connections.

Every neuron model consists of a processing element with synaptic input connections and a

single output. The signal flow of neuron inputs, x, is considered to be unidirectional as indicated

by arrows, as is a neuron's output signal flow. A general neuron symbol is shown in Figure 6.

This symbolic representation shows a set of weights and the neuron's processing unit, or node.

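The neuron output signal is given by the following relationship:

$$o = f(\mathbf{w}^t \mathbf{x}) \quad \text{(i)}$$

or

$$o = f\left(\sum_{i=1}^{n} w_i x_i\right) \quad \text{(ii)}$$

where w is the weight vector defined as

$$\mathbf{w} = [w_1 \; w_2 \; \cdots \; w_n]^t$$

and x is the input vector:

$$\mathbf{x} = [x_1 \; x_2 \; \cdots \; x_n]^t$$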

(All vectors defined in this text are column vectors; superscript t denotes a transposition.) The

function f(w^t x) is often referred to as an activation function. Its domain is the set of activation

values, net, of the neuron model; we thus often write this function as f(net). The variable net is

defined as a scalar product of the weight and input vector:

$$net = \mathbf{w}^t \mathbf{x} \quad \text{(iii)}$$

The argument of the activation function, the variable net, is an analog of the biological neuron's

membrane potential. Note that temporarily the threshold value is not explicitly used in (i), (ii)

and (iii), but this is only for notational convenience. We have momentarily assumed that the

modeled neuron has n - 1 actual synaptic connections that come from actual variable inputs x_1,

x_2, ..., x_{n-1}. We have also assumed that x_n = -1 and w_n = T. Since threshold plays an important role

for some models, we will sometimes need to extract explicitly the threshold as a separate neuron

model parameter.

The general neuron symbol, shown in Figure 6 and described with expressions (i), (ii) and (iii),

is commonly used in neural network literature. However, different artificial neural network

classes make use of different definitions of f(net). Also, even within the same class of networks,

the neurons are sometimes considered to perform differently during different phases of network

operation. Therefore, it is pedagogically sound to replace, whenever needed, the general neuron

model symbol from Figure 6 with a specific f(net) and a specific neuron model. The model

validity will then usually be restricted to a particular class of network. Two main models

introduced below are often used in this text.

Acknowledging the simplifications that are necessary to model a biological neuron network

with artificial neural networks, the following terminology is introduced: (1) neural networks are

meant to be artificial neural networks consisting of neuron models and (2) neurons are meant to

be artificial neuron models.

Observe from (i) and (ii) that the neuron as a processing node performs the operation of

summation of its weighted inputs, or the scalar product computation to obtain net. Subsequently,

it performs the nonlinear operation f(net) through its activation function. Typical activation

functions used are

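$$f(net) = \frac{2}{1 + \exp(-\lambda \, net)} - 1 \quad \text{(iv)}$$

$$f(net) = \operatorname{sign}(net) = \begin{cases} +1, & net > 0 \\ -1, & net < 0 \end{cases} \quad \text{(v)}$$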

where λ > 0 in (iv) is proportional to the neuron gain determining the steepness of the

continuous function f(net) near net = 0. The continuous activation function is shown in Figure

7(a) for various λ. Notice that as λ → ∞, the limit of the continuous function becomes the

sign(net) function defined in (v). Activation functions (iv) and (v) are called bipolar continuous

and bipolar binary functions, respectively. The word "bipolar" is used to point out that both

positive and negative responses of neurons are produced for this definition of the activation

function.

By shifting and scaling the bipolar activation functions denoted by (iv) and (v), unipolar

continuous and unipolar binary activation functions can be obtained.

Figure 7 Activation functions of a neuron: (a) bipolar continuous and (b) unipolar continuous
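The neuron model in (i) to (v) can be made concrete with a few lines of code. The following Python sketch is illustrative only: the function names and sample values are assumptions rather than part of the model description, and the value returned by the bipolar binary function at net = 0, which (v) leaves undefined, is an arbitrary choice.

```python
import math


def net_input(weights, inputs):
    """Scalar product net = w^t x, as in (iii)."""
    return sum(w * x for w, x in zip(weights, inputs))


def bipolar_continuous(net, lam=1.0):
    """Activation (iv): 2 / (1 + exp(-lam * net)) - 1, with lam > 0."""
    return 2.0 / (1.0 + math.exp(-lam * net)) - 1.0


def bipolar_binary(net):
    """Activation (v): +1 for net > 0, -1 for net < 0.

    Returning 0.0 at net == 0 is an assumption; (v) does not define it.
    """
    if net > 0:
        return 1.0
    if net < 0:
        return -1.0
    return 0.0


def unipolar_continuous(net, lam=1.0):
    """Bipolar function shifted and scaled into (0, 1): (f + 1) / 2."""
    return (bipolar_continuous(net, lam) + 1.0) / 2.0


def neuron_output(weights, inputs, threshold, activation):
    """Fold the threshold T in as an extra input x_n = -1 with weight w_n = T."""
    augmented_w = list(weights) + [threshold]
    augmented_x = list(inputs) + [-1.0]
    return activation(net_input(augmented_w, augmented_x))


if __name__ == "__main__":
    w, x, T = [0.5, -0.25], [1.0, 2.0], 0.1
    print(neuron_output(w, x, T, bipolar_binary))
    for lam in (1.0, 5.0, 50.0):
        print(lam, bipolar_continuous(0.2, lam))
```

As lam grows, the continuous output for a fixed positive net moves towards +1, which mirrors the limiting behaviour described for (iv).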

How Do Neural Networks Work?

The standard artificial neuron is a processing element whose output is calculated by multiplying

its inputs by a weight vector, summing the results, and applying an activation function to the

sum (Fig. 8).

Single Layer Perceptron

The function of the entire neural network is simply an entirely deterministic calculation of the

outputs of all the neurons.

Fig. 8 A standard artificial neuron (inputs x1(n), x2(n), x3(n), ..., x(n); output y(n))

The back propagation training algorithm

Back propagation (BP) is a general method for iteratively solving for a multilayer perceptron's

weights and biases. It uses a steepest descent technique which is very stable when a small

learning rate is used, but has slow convergence properties. Several methods for speeding up BP

have been used including momentum and a variable learning rate.

(a) Derivative of the Activation Functions:

The chain rule that is used in deriving the BP algorithm necessitates the computation of the

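derivative of the activation functions. For linear, logistic, and hyperbolic tangent functions, the

derivatives are as follows:

Linear: $\Phi(I) = I$, $\Phi'(I) = 1$

Logistic: $\Phi(I) = \dfrac{1}{1 + e^{-\alpha I}}$, $\Phi'(I) = \alpha \, \Phi(I)\,(1 - \Phi(I))$

Tanh: $\Phi(I) = \dfrac{e^{\alpha I} - e^{-\alpha I}}{e^{\alpha I} + e^{-\alpha I}}$, $\Phi'(I) = \alpha \, (1 - \Phi(I)^2)$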

Alpha is called the slope parameter. Usually alpha is chosen to be 1 but other slopes may be

used. This formulation for the derivative makes the computation of the gradient more efficient

since the output Φ(I) has already been calculated in the forward pass.
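The point about reusing the forward-pass output can be checked numerically. The sketch below is an illustration under assumed function names: it computes the logistic and tanh derivatives from the already-computed output Φ(I) and compares them with a central finite difference.

```python
import math


def logistic(i, alpha=1.0):
    """Logistic activation with slope parameter alpha."""
    return 1.0 / (1.0 + math.exp(-alpha * i))


def logistic_derivative_from_output(phi, alpha=1.0):
    """Derivative alpha * phi * (1 - phi), using only the output phi."""
    return alpha * phi * (1.0 - phi)


def tanh_derivative_from_output(phi, alpha=1.0):
    """Derivative alpha * (1 - phi^2) for phi = tanh(alpha * I)."""
    return alpha * (1.0 - phi * phi)


def central_difference(f, i, h=1e-6):
    """Numerical derivative of f at i, used here only as a cross-check."""
    return (f(i + h) - f(i - h)) / (2.0 * h)


if __name__ == "__main__":
    alpha = 1.7
    for i in (-2.0, 0.0, 1.5):
        phi = logistic(i, alpha)
        analytic = logistic_derivative_from_output(phi, alpha)
        numeric = central_difference(lambda v: logistic(v, alpha), i)
        print(i, analytic, numeric)
```

No extra exponential is evaluated for the analytic derivative; only the stored output is needed.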


The highest gradient is at I = 0. Since the speed of learning is partially dependent on the size of

the gradient, the internal activation of all neurons should be kept small to expedite training.

This is why we scale the inputs and initialize weights to small random values.

The backpropagation algorithm is an optimization technique designed to minimize an objective

function. The most commonly used objective function is the squared error which is defined as:

$$\varepsilon^2 = [T_q - \Phi_{q.k}]^2$$

Fig. 9 Configuration and terminology of a multi-layered neural network.

The network syntax is defined as in Figure 9. In this notation, the layers are labeled i, j, and k, with m, n, and r neurons respectively, and the neurons in each layer are indexed h, p, and q respectively,

where x = input value

T = target output value

w = weight value

I = internal activation

Φ = neuron output

ε = error term

The outputs for a two layer network with both layers using a logistic activation function are

calculated by the equation:

Φ = logistic{w2 * [logistic(w1 * x + b1)] + b2}

where: w1 = first layer weight matrix

w2 = second layer weight matrix

b1 = first layer bias vector

b2 = second layer bias vector

The input vector can be augmented with a dummy node representing the bias input. This

dummy input of 1 is multiplied by a weight corresponding to the bias value. This results in a

more compact representation of the above equation:

Φ = logistic{W2 * [1; logistic(W1 * X)]}

where X = [1 x]' is the augmented input vector, W1 = [b1 w1], and W2 = [b2 w2].

Note that a dummy hidden node (=1) also needs to be inserted into the equation.
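The equivalence of the explicit-bias and augmented forms can be illustrated with a small sketch. The helper names and the numerical weights below are assumptions chosen only for demonstration; matrices are plain nested lists with one row per neuron.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    """Multiply a row-major matrix (one row per neuron) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def forward_explicit(w1, b1, w2, b2, x):
    """Phi = logistic{w2 * [logistic(w1 * x + b1)] + b2}."""
    hidden = [logistic(s + b) for s, b in zip(matvec(w1, x), b1)]
    return [logistic(s + b) for s, b in zip(matvec(w2, hidden), b2)]


def augment(weights, biases):
    """Build W = [b w]: prepend each neuron's bias to its weight row."""
    return [[b] + list(row) for row, b in zip(weights, biases)]


def forward_augmented(W1, W2, x):
    """Same network with a dummy input of 1 and a dummy hidden node of 1."""
    X = [1.0] + list(x)
    hidden = [1.0] + [logistic(s) for s in matvec(W1, X)]
    return [logistic(s) for s in matvec(W2, hidden)]


if __name__ == "__main__":
    # Hypothetical weights for a 2-input, 3-hidden, 1-output network.
    w1 = [[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]]
    b1 = [0.1, -0.2, 0.05]
    w2 = [[0.3, -0.6, 0.9]]
    b2 = [-0.1]
    x = [1.0, 2.0]
    print(forward_explicit(w1, b1, w2, b2, x))
    print(forward_augmented(augment(w1, b1), augment(w2, b2), x))
```

Both calls print the same output up to floating-point rounding, since the augmented matrices only move the bias into the first column.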


4.3.5 The weight updates

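• The output layer weights are changed in proportion to the negative gradient of the squared error with respect to the weights. These weight changes can be calculated using the chain rule. The symbols and terminology are according to Fig. 9.

$$\begin{aligned}
\Delta w_{pq.k} &= -\eta_{p.q} \frac{\partial \varepsilon^2}{\partial w_{pq.k}} \\
&= -\eta_{p.q} \frac{\partial \varepsilon^2}{\partial \Phi_{q.k}} \frac{\partial \Phi_{q.k}}{\partial I_{q.k}} \frac{\partial I_{q.k}}{\partial w_{pq.k}} \\
&= \eta_{p.q} \, \delta_{pq.k} \, \Phi_{p.j}
\end{aligned}$$

and

$$\delta_{pq.k} = 2\,[T_q - \Phi_{q.k}]\,\Phi_{q.k}\,[1 - \Phi_{q.k}]$$

$$w_{pq.k}(N+1) = w_{pq.k}(N) + \eta_{p.q} \, \delta_{pq.k} \, \Phi_{p.j}$$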

• The hidden layer outputs have no target values. Therefore, a procedure is used to back

propagate the output layer errors to the hidden layer neurons in order to modify their

weights to minimize the error. To accomplish this, we start with the equation for the

gradient with respect to the weights and use the chain rule.

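$$\begin{aligned}
\Delta w_{hp.j} &= -\eta_{h.p} \frac{\partial \varepsilon^2}{\partial w_{hp.j}} = -\eta_{h.p} \sum_{q=1}^{r} \frac{\partial \varepsilon_q^2}{\partial w_{hp.j}} \\
&= -\eta_{h.p} \sum_{q=1}^{r} \frac{\partial \varepsilon_q^2}{\partial \Phi_{q.k}} \frac{\partial \Phi_{q.k}}{\partial I_{q.k}} \frac{\partial I_{q.k}}{\partial \Phi_{p.j}} \frac{\partial \Phi_{p.j}}{\partial I_{p.j}} \frac{\partial I_{p.j}}{\partial w_{hp.j}} \\
&= -\eta_{h.p} \sum_{q=1}^{r} (-2)\,[T_q - \Phi_{q.k}]\,\Phi_{q.k}\,[1 - \Phi_{q.k}]\; w_{pq.k}\; \Phi_{p.j}\,[1 - \Phi_{p.j}]\; x_h \\
&= \eta_{h.p} \sum_{q=1}^{r} \delta_{pq.k}\; w_{pq.k}\; \Phi_{p.j}\,[1 - \Phi_{p.j}]\; x_h
\end{aligned}$$

$$\delta_{hp.j} = \sum_{q=1}^{r} \delta_{pq.k}\; w_{pq.k}\; \Phi_{p.j}\,[1 - \Phi_{p.j}]$$

$$w_{hp.j}(N+1) = w_{hp.j}(N) + \eta_{h.p}\; x_h\; \delta_{hp.j}$$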
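The two update rules can be sketched for a small network with logistic neurons. This is a minimal illustration under several assumptions: a single learning rate eta is used for every weight, the logistic slope is 1, biases are omitted (they can be handled by a dummy input of 1), and all names and numbers are hypothetical.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def forward(w_hidden, w_output, x):
    """Return hidden outputs and network outputs.

    w_hidden[h][p] links input h to hidden neuron p;
    w_output[p][q] links hidden neuron p to output neuron q.
    """
    n_hidden, n_out = len(w_hidden[0]), len(w_output[0])
    phi_j = [logistic(sum(x[h] * w_hidden[h][p] for h in range(len(x))))
             for p in range(n_hidden)]
    phi_k = [logistic(sum(phi_j[p] * w_output[p][q] for p in range(n_hidden)))
             for q in range(n_out)]
    return phi_j, phi_k


def squared_error(w_hidden, w_output, x, targets):
    """Sum over the outputs of [T_q - Phi_q]^2."""
    _, phi_k = forward(w_hidden, w_output, x)
    return sum((t - o) ** 2 for t, o in zip(targets, phi_k))


def backprop_step(w_hidden, w_output, x, targets, eta=0.1):
    """One steepest-descent update; returns new weight lists."""
    phi_j, phi_k = forward(w_hidden, w_output, x)
    # Output layer error term: 2 [T_q - Phi_q] Phi_q [1 - Phi_q]
    delta_k = [2.0 * (t - o) * o * (1.0 - o) for t, o in zip(targets, phi_k)]
    # Hidden layer error term: back-propagated through the old output weights
    delta_j = [sum(d * w_output[p][q] for q, d in enumerate(delta_k))
               * phi_j[p] * (1.0 - phi_j[p]) for p in range(len(phi_j))]
    new_w_output = [[w_output[p][q] + eta * delta_k[q] * phi_j[p]
                     for q in range(len(delta_k))] for p in range(len(phi_j))]
    new_w_hidden = [[w_hidden[h][p] + eta * x[h] * delta_j[p]
                     for p in range(len(phi_j))] for h in range(len(x))]
    return new_w_hidden, new_w_output


if __name__ == "__main__":
    w_h = [[0.1, -0.2], [0.4, 0.3], [-0.5, 0.2]]
    w_o = [[0.3], [-0.6]]
    x, t = [1.0, 0.5, -1.0], [1.0]
    for _ in range(5):
        print(squared_error(w_h, w_o, x, t))
        w_h, w_o = backprop_step(w_h, w_o, x, t)
```

Because each update moves the weights along the negative gradient, the printed squared error is expected to shrink for a sufficiently small eta.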


Appendix -2

An Overview of Genetic Algorithms

Optimization Techniques

Optimization stands for selecting the best alternative among a given set of options. In any

optimization problem there is an objective function or objective that depends on a set of

variables. To reach an optimum does not necessarily mean "maximum". It means the best value

for the function.

Analytical approaches

Analytical approaches (most traditional optimization methods) are used for linear functions.

Derivatives of the objective function and constraints are usually required in these techniques.

For nonlinear problems, the use of such approaches is limited to certain types of functions. For

the functions to be optimized by these techniques, they must be continuous and differentiable

(no integer variables) and to find a globally optimal solution, the function must be convex.

Stochastic approaches

These techniques search randomly for better solutions; when new candidates are built from promising

solutions, they are more efficient than a completely random search. They can handle any

type of problem; their only limitation is long computational time. They are

slower than analytical methods for problems that can be solved analytically. Examples include

evolutionary algorithms (EAs), simulated annealing (SA), tabu search, and numerous variations.

Why evolutionary?

Since classical search and optimization methods use a point-by-point approach, where one

solution in each iteration is modified to a different (hopefully better) solution, the outcome of

using a classical optimization method is a single optimized solution.

Thinking along this working principle, classical search optimization methods could find only a

single optimized solution in a single simulation run.


Since only a single optimized solution could be found, it was therefore necessary to convert the

task of finding multiple trade-off solutions in a multi-objective optimization into one of finding a single

solution of a transformed single-objective optimization problem.

Fig. 17 The basic structure of an EA (labels: problem, coding of solutions, objective function, evolutionary operators, specific knowledge, solution)

However, the field of search and optimization has changed over the last few years by the

introduction of a number of non-classical, unorthodox and stochastic search and optimization

algorithms. Of these, the evolutionary algorithm (EA) (Fig. 17) mimics nature's evolutionary

principles to drive its search towards an optimal solution. One of the most striking differences from

classical search and optimization algorithms is that EAs process a population of solutions

in each iteration, instead of a single solution.

Since a population of solutions is processed in each iteration, the outcome of an EA is also a

population of solutions. If an optimization problem has a single optimum, all EA population

members can be expected to converge to that optimum solution. However, if an optimization

problem has multiple optimal solutions, an EA can be used to capture multiple optimal solutions

in its final population. This ability of an EA to find multiple optimal solutions in one single

simulation run makes EAs unique in solving multi-objective optimization problems. Since the

first step of the ideal strategy for multi-objective optimization requires multiple trade-off

solutions to be found, an EA's population-based search can be suitably utilized to find a number of

solutions in a single simulation run.


This section gives a tutorial introduction to the basic Genetic Algorithm (GA) and outlines the procedures for solving problems with it. Genetic algorithms were developed by John Holland in the 1960s and early 1970s [49]. They are adaptive heuristic search algorithms used for solving demanding search and optimisation problems. Since their introduction, genetic algorithms have spread to almost all areas of research. They have proved to be an effective optimization tool for multi-criteria and multi-parameter problems. Their power lies in a guided random search that imitates the principles of natural evolution, as seen through the "survival of the fittest" law. To apply the genetic algorithm to a given problem, we have to describe an individual and the environment in which the individual has to fit. In other words, it is necessary to determine a coding of the problem variables and a fitness function that measures the quality of a set of variable values. The optimization process starts by creating the initial generation of organisms, which then improve from generation to generation through reproduction, mutation and crossover. Thus, we gradually obtain members (organisms) of ever higher quality that are, in fact, the solutions of the problem (Fig. 18).

Fig. 18: Structure of a single population evolutionary algorithm (flow: start, generate initial population, evaluate objective function, are optimization criteria met? If yes: best individuals, result. If no: generate new population through selection, recombination and mutation, then evaluate again)

The principal steps of the method are:

• Creation of the initial generation of organisms,

• Evaluation of organisms by means of the fitness function,

• Selection of organisms which best solve the set problem, and

• Creation of new generation by crossover, mutation and reproduction.


What are Genetic Algorithms?

The GA is a stochastic global search method that mimics the metaphor of natural biological

evolution. GAs operate on a population of potential solutions applying the principle of survival

of the fittest to produce (hopefully) better and better approximations to a solution. At each

generation, a new set of approximations is created by the process of selecting individuals

according to their level of fitness in the problem domain and breeding them together using

operators borrowed from natural genetics. This process leads to the evolution of populations of

individuals that are better suited to their environment than the individuals that they were created

from, just as in natural adaptation.

Individuals, or current approximations, are encoded as strings, chromosomes, composed over

some alphabet(s), so that the genotypes (chromosome values) are uniquely mapped onto the

decision variable (phenotypic) domain. The most commonly used representation in GAs is the

binary alphabet {0, 1} although other representations can be used, e.g. ternary, integer, real-

valued etc. For example, a problem with two variables, x1 and x2, may be mapped onto the

chromosome structure in the following way:

1011010011 | 010111010100101
    x1     |       x2

where x1 is encoded with 10 bits and x2 with 15 bits, possibly reflecting the level of accuracy

or range of the individual decision variables. Examining the chromosome string in isolation

yields no information about the problem we are trying to solve. It is only with the decoding of

the chromosome into its phenotypic values that any meaning can be applied to the

representation. However, as described below, the search process will operate on this encoding of

the decision variables, rather than the decision variables themselves, except, of course, where

real-valued genes are used. Having decoded the chromosome representation into the decision

variable domain, it is possible to assess the performance, or fitness, of individual members of a

population. This is done through an objective function that characterises an individual's

performance in the problem domain. In the natural world, this would be an individual's ability

to survive in its present environment.
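Decoding a binary chromosome into its phenotypic values can be sketched as follows. The linear mapping onto a range and the ranges themselves are assumptions made for illustration; the text does not state which bounds apply to x1 and x2. The example uses a 25-bit string split 10/15, as in the two-variable case described above.

```python
def decode(chromosome, bit_lengths, bounds):
    """Split a bit string into genes and map each linearly onto [low, high].

    chromosome  -- string of '0'/'1' characters
    bit_lengths -- number of bits per decision variable
    bounds      -- (low, high) pair per decision variable (assumed ranges)
    """
    values = []
    start = 0
    for n_bits, (low, high) in zip(bit_lengths, bounds):
        gene = chromosome[start:start + n_bits]
        integer = int(gene, 2)                      # unsigned binary value
        fraction = integer / (2 ** n_bits - 1)      # scaled into [0, 1]
        values.append(low + (high - low) * fraction)
        start += n_bits
    return values


if __name__ == "__main__":
    chromosome = "1011010011" + "010111010100101"
    # Hypothetical ranges: x1 in [0, 1], x2 in [-5, 5].
    print(decode(chromosome, [10, 15], [(0.0, 1.0), (-5.0, 5.0)]))
```

More bits per variable give a finer grid over the same range, which is one way the bit lengths can reflect the required accuracy.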

Thus, the objective function establishes the basis for selection of pairs of individuals that will be

mated together during reproduction. During the reproduction phase, each individual is assigned

a fitness value derived from its raw performance measure given by the objective function. This

value is used in the selection to bias towards more fit individuals. Highly fit individuals, relative

to the whole population, have a high probability of being selected for mating whereas less fit

individuals have a correspondingly low probability of being selected. Once the individuals have

been assigned a fitness value, they can be chosen from the population, with a probability

according to their relative fitness, and recombined to produce the next generation. Genetic

operators manipulate the characters (genes) of the chromosomes directly, using the assumption

that certain individuals' gene codes, on average, produce fitter individuals. The recombination

operator is used to exchange genetic information between pairs, or larger groups, of individuals.

The simplest recombination operator is that of single-point crossover.

Consider the two parent binary strings:

P1 = 10010110, and P2 = 10111000.

If an integer position, i, is selected uniformly at random between 1 and the string length, l, minus

one [1, l-1], and the genetic information exchanged between the individuals about this point,

then two new offspring strings are produced. The two offspring below are produced when the

crossover point i = 5 is selected,

O1 = 10010000, and O2 = 10111110.

This crossover operation is not necessarily performed on all strings in the population. Instead, it

is applied with a probability P(x) when the pairs are chosen for breeding. A further genetic

operator, called mutation, is then applied to the new chromosomes, again with a set probability,

P(m). Mutation causes the individual genetic representation to be changed according to some

probabilistic rule. In the binary string representation, mutation will cause a single bit to change

its state, 0 to 1 or 1 to 0. For example, mutating the fourth bit of O1 leads to the new string

O1m = 10000000.
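The crossover and mutation examples can be reproduced with a short sketch. The function names are assumptions, and the per-bit mutation routine shown last is one common way to apply a mutation probability, not necessarily the scheme used in this work.

```python
import random


def single_point_crossover(p1, p2, point):
    """Exchange the tails of two equal-length strings after position `point`."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]


def flip_bit(chromosome, position):
    """Flip the bit at a 1-based position."""
    i = position - 1
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]


def mutate(chromosome, p_m, rng=random):
    """Flip each bit independently with probability p_m (assumed scheme)."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < p_m else bit
        for bit in chromosome
    )


if __name__ == "__main__":
    o1, o2 = single_point_crossover("10010110", "10111000", 5)
    print(o1, o2)
    print(flip_bit(o1, 4))
    print(mutate(o1, 0.05, random.Random(0)))
```

The first two printed lines correspond to the offspring and the mutated string given in the text.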

Mutation is generally considered to be a background operator that ensures that the probability of

searching a particular subspace of the problem space is never zero. This has the effect of tending

to inhibit the possibility of converging to a local optimum, rather than the global optimum. After

recombination and mutation, the individual strings are then, if necessary, decoded, the objective

function evaluated, a fitness value assigned to each individual and individuals selected for

mating according to their fitness, and so the process continues through subsequent generations.

In this way, the average performance of individuals in a population is expected to increase, as

good individuals are preserved and bred with one another and the less fit individuals die out.

The GA is terminated when some criteria are satisfied, e.g. a certain number of generations, a

mean deviation in the population, or when a particular point in the search space is encountered.

GAs versus Traditional Methods

From the above discussion, it can be seen that the GA differs substantially from more traditional

search and optimization methods. The four most significant differences are:

• GAs search a population of points in parallel, not a single point.

• GAs do not require derivative information or other auxiliary knowledge; only the

objective function and corresponding fitness levels influence the directions of search.

• GAs use probabilistic transition rules, not deterministic ones.

• GAs work on an encoding of the parameter set rather than the parameter set itself

(except where real-valued individuals are used).

It is important to note that the GA provides a number of potential solutions to a given problem

and the choice of final solution is left to the user. In cases where a particular problem does not

have one individual solution, for example a family of Pareto-optimal solutions, as is the case in

multi-objective optimization and scheduling problems, then the GA is potentially useful for

identifying these alternative solutions simultaneously.

Major Elements of the Genetic Algorithm

The simple genetic algorithm (SGA) is described by Goldberg [49] and is used here to illustrate

the basic components of the GA. A pseudo-code outline of the SGA is shown below. The

population at time t is represented by the time-dependent variable P, with the initial population

of random estimates being P(0). Using this outline of a GA, the remainder of this Section

describes the major elements of the GA.

Procedure in GA

begin
    t = 0;
    initialize P(t);
    evaluate P(t);
    while not finished do
    begin
        t = t + 1;
        select P(t) from P(t-1);
        reproduce pairs in P(t);
        evaluate P(t);
    end
end.
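The outline above can be turned into a small runnable sketch. This is not the implementation used in this work: it assumes a non-negative objective to be maximised, fitness-proportional selection with replacement, single-point crossover, bitwise mutation and a fixed number of generations as the stopping rule. All parameter names and default values are illustrative.

```python
import random


def run_sga(objective, n_bits, pop_size=30, p_x=0.7, p_m=0.01,
            generations=50, seed=0):
    """Maximise a non-negative objective over bit lists of length n_bits >= 2."""
    if pop_size % 2:
        raise ValueError("pop_size must be even so that parents pair up")
    rng = random.Random(seed)

    # t = 0; initialize P(t); evaluate P(t)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    scores = [objective(ind) for ind in population]
    best_score, best = max(zip(scores, population))

    for _ in range(generations):                 # while not finished do
        # select P(t) from P(t-1): fitness-proportional, with replacement.
        # The tiny offset keeps the weights valid if every score is zero.
        weights = [s + 1e-12 for s in scores]
        parents = rng.choices(population, weights=weights, k=pop_size)

        # reproduce pairs in P(t): single-point crossover, then mutation
        population = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < p_x:
                point = rng.randint(1, n_bits - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            for child in (a, b):
                population.append(
                    [1 - g if rng.random() < p_m else g for g in child])

        # evaluate P(t) and remember the best individual seen so far
        scores = [objective(ind) for ind in population]
        gen_score, gen_best = max(zip(scores, population))
        if gen_score > best_score:
            best_score, best = gen_score, gen_best

    return best_score, best


if __name__ == "__main__":
    # Toy objective: count of ones in the string.
    print(run_sga(sum, n_bits=16))
```

The toy objective only demonstrates the control flow; in practice the objective would decode the chromosome and evaluate the problem-specific function.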

Population Representation and Initialization

GAs operate simultaneously on a number of potential solutions, called a population, consisting of some

encoding of the parameter set. Typically, a population is composed of between

30 and 100 individuals, although a variant called the micro GA uses very small populations,

~10 individuals, with a restrictive reproduction and replacement strategy in an attempt to reach

real-time execution [50].

The most commonly used representation of chromosomes in the GA is that of the single-level

binary string. Here, each decision variable in the parameter set is encoded as a binary string and

these are concatenated to form a chromosome. The use of Gray coding has been advocated as a

method of overcoming the hidden representational bias in conventional binary representation as

the Hamming distance between adjacent values is constant [51]. Empirical evidence of Caruana

and Schaffer [52] suggests that large Hamming distances in the representational mapping

between adjacent values, as is the case in the standard binary representation, can result in the

search process being deceived or unable to efficiently locate the global minimum. A further

approach, that of Schmitendorf et al. [53], is the use of logarithmic scaling in the conversion of

binary-coded chromosomes to their real phenotypic values. Although the precision of the

parameter values is possibly less consistent over the desired range, in problems where the spread

of feasible parameters is unknown, a larger search space may be covered with the same number

of bits than with a linear mapping scheme, allowing the computational burden of exploring unknown

search spaces to be reduced to a more manageable level. Whilst binary-coded GAs are most

commonly used, there is an increasing interest in alternative encoding strategies, such as integer

and real-valued representations. For some problem domains, it is argued that the binary

representation is in fact deceptive in that it obscures the nature of the search [54]. In the subset

selection problem [55], for example, the use of an integer representation and look-up tables

provides a convenient and natural way of expressing the mapping from representation to

problem domain. The use of real-valued genes in GAs is claimed by Wright [56] to offer a

number of advantages in numerical function optimization over binary encodings. Efficiency of

the GA is increased as there is no need to convert chromosomes to phenotypes before each

function evaluation; less memory is required as efficient floating-point internal computer

representations can be used directly; there is no loss in precision by discretisation to binary or

other values; and there is greater freedom to use different genetic operators. The use of real-

valued encodings is described in detail by Michalewicz [57] and in the literature on Evolution

Strategies (see, for example, [58]). Having decided on the representation, the first step in the

SGA is to create an initial population. This is usually achieved by generating the required

number of individuals using a random number generator that uniformly distributes numbers in

the desired range. For example, with a binary population of Nind individuals whose

chromosomes are Lind bits long, Nind × Lind random numbers uniformly distributed from the

set {0, 1} would be produced. A variation is the extended random initialisation procedure of

Bramlette whereby a number of random initialisations are tried for each individual and the one

with the best performance is chosen for the initial population. Other users of GAs have seeded

the initial population with some individuals that are known to be in the vicinity of the global

minimum (see, for example, [59] and [60]). This approach is, of course, only applicable if the

nature of the problem is well understood beforehand or if the GA is used in conjunction with a

knowledge based system. The GA code supports binary chromosome representations. Binary

populations may be initialised using the Toolbox function to create binary populations, crtbp.

An additional function, crtbase, is provided that builds a vector describing the integer

representation. Conversion between binary strings and real values is provided by the routine

bs2rv, which supports the use of Gray codes and logarithmic scaling.
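The Gray-coding property and the random initialisation described in this subsection can be illustrated in a few lines. The sketch below is plain Python written for illustration; it does not reproduce the Toolbox routines named above, and its function names are assumptions.

```python
import random


def binary_to_gray(n):
    """Reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Inverse of binary_to_gray."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def create_binary_population(n_ind, l_ind, rng=random):
    """Nind x Lind bits drawn uniformly from {0, 1}."""
    return [[rng.randint(0, 1) for _ in range(l_ind)] for _ in range(n_ind)]


if __name__ == "__main__":
    # Standard binary: 7 (0111) and 8 (1000) differ in four bits.
    print(hamming(7, 8))
    # Gray coded: adjacent integers always differ in exactly one bit.
    print(hamming(binary_to_gray(7), binary_to_gray(8)))
    print(create_binary_population(4, 8, random.Random(0)))
```

The contrast between the two printed distances is the representational bias that Gray coding is meant to remove.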

The Objective and Fitness Functions

The objective function is used to provide a measure of how individuals have performed in the

problem domain. In the case of a minimization problem, the most fit individuals will have the

lowest numerical value of the associated objective function. This raw measure of fitness is

usually only used as an intermediate stage in determining the relative performance of

individuals in a GA. Another function, the fitness function, is normally used to transform the

objective function value into a measure of relative fitness [61], thus:

$$F(x) = g(f(x))$$

where f is the objective function, g transforms the value of the objective function to a non-

negative number and F is the resulting relative fitness. This mapping is always necessary when

the objective function is to be minimized as the lower objective function values correspond to

fitter individuals. In many cases, the fitness function value corresponds to the number of

offspring that an individual can expect to produce in the next generation. A commonly used

transformation is that of proportional fitness assignment (see, for example, [49]). The individual

fitness, F(xi), of each individual is computed as the individual's raw performance, f(xi), relative

to the whole population, i.e.,

$$F(x_i) = \frac{f(x_i)}{\sum_{i=1}^{N_{ind}} f(x_i)}$$

where Nind is the population size and xi is the phenotypic value of individual i. Whilst this

fitness assignment ensures that each individual has a probability of reproducing according to its

relative fitness, it fails to account for negative objective function values. A linear transformation

which offsets the objective function [49] is often used prior to fitness assignment, such that,

$$F(x) = a f(x) + b$$

where a is a positive scaling factor if the optimization is maximizing and negative if we are

minimizing. The offset b is used to ensure that the resulting fitness values are non-negative.

The linear scaling and offsetting outlined above is, however, susceptible to rapid convergence.

The selection algorithm (see below) selects individuals for reproduction on the basis of their

relative fitness. Using linear scaling, the expected number of offspring is approximately

proportional to that individual's performance. As there is no constraint on an individual's

performance in a given generation, highly fit individuals in early generations can dominate the

reproduction causing rapid convergence to possibly sub-optimal solutions. Similarly, if there is

little deviation in the population, then scaling provides only a small bias towards the most fit

individuals. Baker [62] suggests that limiting the reproductive range, so that no individuals

generate an excessive number of offspring, prevents premature convergence. Here, individuals

are assigned fitness according to their rank in the population rather than their raw performance.

One variable, MAX, is used to determine the bias, or selective pressure, towards the most fit

individuals and the fitness of the others is determined by the following rules:

• MIN = 2.0 - MAX

• INC = 2.0 × (MAX - 1.0) / Nind

• LOW = INC / 2.0

where MIN is the lower bound, INC is the difference between the fitness of adjacent individuals

and LOW is the expected number of trials (number of times selected) of the least fit individual.

MAX is typically chosen in the interval [1.1, 2.0]. Hence, for a population size of Nind = 40 and

MAX = 1.1, we obtain MIN = 0.9, INC = 0.005 and LOW = 0.0025. The fitness of individuals in

the population may also be calculated directly as,

$$F(x_i) = 2 - MAX + 2\,(MAX - 1)\,\frac{x_i - 1}{N_{ind} - 1}$$

where x_i is the position in the ordered population of individual i.
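The two fitness assignments discussed in this subsection can be compared in a short sketch. The function names are assumptions. The ranking routine uses the direct formula, with position 1 given to the least fit individual, and needs at least two individuals.

```python
def proportional_fitness(objective_values):
    """F(x_i) = f(x_i) / sum of f over the population (non-negative f assumed)."""
    total = sum(objective_values)
    return [f / total for f in objective_values]


def linear_ranking_fitness(objective_values, max_fitness=1.1, minimize=True):
    """Rank-based fitness: 2 - MAX + 2 (MAX - 1) (pos - 1) / (Nind - 1).

    Position 1 is the least fit individual and position Nind the most fit.
    """
    n = len(objective_values)
    # Least fit first: largest objective when minimising, smallest otherwise.
    order = sorted(range(n), key=lambda i: objective_values[i], reverse=minimize)
    fitness = [0.0] * n
    for position, i in enumerate(order, start=1):
        fitness[i] = (2.0 - max_fitness
                      + 2.0 * (max_fitness - 1.0) * (position - 1) / (n - 1))
    return fitness


if __name__ == "__main__":
    costs = [3.0, 1.0, 2.0, 4.0]          # hypothetical objective values
    print(proportional_fitness(costs))
    print(linear_ranking_fitness(costs, max_fitness=2.0))
```

With ranking, the fitness values depend only on the ordering of the individuals, so a single outstanding objective value cannot dominate reproduction.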


Selection

Selection is the process of determining the number of times, or trials, a particular individual is

chosen for reproduction and, thus, the number of offspring that an individual will produce. The

selection of individuals can be viewed as two separate processes:

1) Determination of the number of trials an individual can expect to receive, and

2) Conversion of the expected number of trials into a discrete number of offspring.

The first part is concerned with the transformation of raw fitness values into a real valued

expectation of an individual's probability to reproduce and is dealt with in the previous

subsection as fitness assignment. The second part is the probabilistic selection of individuals for

reproduction based on the fitness of individuals relative to one another and is sometimes known

as sampling. The remainder of this subsection will review some of the more popular selection

methods in current usage. Baker [63] presented three measures of performance for selection

algorithms: bias, spread and efficiency. Bias is defined as the absolute difference between an

individual's actual and expected selection probability. Optimal zero bias is therefore achieved

when an individual's selection probability equals its expected number of trials. Spread is the

range in the possible number of trials that an individual may achieve. If f(i) is the actual number

of trials that individual i receives, then the "minimum spread" is the smallest spread that

theoretically permits zero bias. Thus, while bias is an indication of accuracy, the spread of a

selection method measures its consistency. The desire for efficient selection methods is

motivated by the need to maintain a GA's overall time complexity. It has been shown in the

literature that the other phases of a GA (excluding the actual objective function evaluations) are

of O(Lind × Nind) or better time complexity, where Lind is the length of an individual and Nind is

the population size. The selection algorithm should thus achieve zero bias whilst maintaining a

minimum spread and not contributing to an increased time complexity of the GA.

Roulette Wheel Selection Methods

Many selection techniques employ a "roulette wheel" mechanism to probabilistically select

individuals based on some measure of their performance. A real-valued interval, Sum, is

determined as either the sum of the individuals' expected selection probabilities or the sum of

the raw fitness values over all the individuals in the current population. Individuals are then

mapped one-to-one into contiguous intervals in the range [0, Sum]. The size of each individual

interval corresponds to the fitness value of the associated individual. For example, in Fig. 19 the

circumference of the roulette wheel is the sum of all six individuals' fitness values. Individual 5

is the most fit individual and occupies the largest interval, whereas individuals 6 and 4 are the

least fit and have correspondingly smaller intervals within the roulette wheel. To select an

individual, a random number is generated in the interval [0, Sum] and the individual whose

segment spans the random number is selected. This process is repeated until the desired number

of individuals have been selected.

The basic roulette wheel selection method is stochastic sampling with replacement (SSR). Here,

the segment size and selection probability remain the same throughout the selection phase and

individuals are selected according to the procedure outlined above. SSR gives zero bias but a

potentially unlimited spread. Any individual with a segment size > 0 could entirely fill the next

population.
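The roulette wheel procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Toolbox routine: the function name roulette_wheel_select and its arguments are hypothetical, fitness values are assumed to be non-negative with a positive sum, and each individual is identified by its index.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_wheel_select(fitness, n_select, rng=None):
    """Stochastic sampling with replacement (SSR).

    Returns the indices of the selected individuals. Assumes all fitness
    values are non-negative and that their sum is positive.
    """
    rng = rng if rng is not None else random.Random()
    # Upper edges of the contiguous segments laid out on [0, Sum].
    edges = list(accumulate(fitness))
    total = edges[-1]
    chosen = []
    for _ in range(n_select):
        r = rng.random() * total  # random point on the wheel
        # First segment whose upper edge lies beyond r spans the point.
        idx = bisect_right(edges, r)
        chosen.append(min(idx, len(edges) - 1))
    return chosen


if __name__ == "__main__":
    # Six individuals; the fifth has the largest segment.
    print(roulette_wheel_select([2, 3, 3, 1, 6, 1], n_select=6))
```

Because the segment sizes never change between spins, nothing prevents the same index from being returned on every spin, which is the unlimited spread noted above.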

Figure 19: Roulette Wheel Selection

Crossover (Recombination)

The basic operator for producing new chromosomes in the GA is that of crossover. Like its

counterpart in nature, crossover produces new individuals that have some parts of both parents'

genetic material. The simplest form of crossover is that of single-point crossover, described in

the Overview of GAs. In this Section, a number of variations on crossover are described and

discussed and the relative merits of each reviewed.

Multi-point Crossover

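For multi-point crossover, m crossover positions, k_i ∈ {1, 2, ..., l-1}, where k_i are the crossover

points and l is the length of the chromosome, are chosen at random with no duplicates and sorted

into ascending order. The bits between successive crossover points are then exchanged between

the two parents to produce two new offspring. The section between the first allele position and

the first crossover point is not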

exchanged between individuals. This process is illustrated in Fig. 20.

Figure 20: Multi-point Crossover (m=5)
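As a rough illustration of multi-point crossover, the following Python sketch draws m distinct crossover points, sorts them, and swaps every other segment between the two parents. The function name multipoint_crossover and its signature are hypothetical and are not those of the Toolbox routine xovmp; chromosomes are assumed to be equal-length lists with m no greater than l-1.

```python
import random


def multipoint_crossover(p1, p2, m, rng=None):
    """Exchange alternate segments of two parents at m random points.

    Returns the two offspring and the sorted crossover points.
    """
    rng = rng if rng is not None else random.Random()
    length = len(p1)
    # m distinct crossover points from {1, ..., l-1}, in ascending order.
    points = sorted(rng.sample(range(1, length), m))
    c1, c2 = list(p1), list(p2)
    bounds = points + [length]
    # The segment before the first point is left untouched; after that,
    # every other segment is swapped between the two offspring.
    for j in range(0, m, 2):
        a, b = bounds[j], bounds[j + 1]
        c1[a:b], c2[a:b] = c2[a:b], c1[a:b]
    return c1, c2, points


if __name__ == "__main__":
    zeros, ones = [0] * 10, [1] * 10
    print(multipoint_crossover(zeros, ones, m=5, rng=random.Random(0)))
```

Using an all-zeros and an all-ones parent makes the swapped segments directly visible in the offspring.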

The idea behind multi-point, and indeed many of the variations on the crossover operator, is that

the parts of the chromosome representation that contribute most to the performance of a

particular individual may not necessarily be contained in adjacent substrings [64]. Further, the

disruptive nature of multi-point crossover appears to encourage the exploration of the search

space, rather than favoring the convergence to highly fit individuals early in the search, thus

making the search more robust [65].

Uniform Crossover

Single and multi-point crossover define cross points as places between loci where a

chromosome can be split. Uniform crossover [66] generalizes this scheme to make every locus a

potential crossover point.

A crossover mask, the same length as the chromosome structures, is created

at random and the parity of the bits in the mask indicates which parent will supply the offspring

with which bits. Consider the following two parents, crossover mask and resulting offspring:

P1 = 1011000111

P2 = 0001111000

Mask = 0011001100

O1 = 0011110100

O2 = 1001001011

Here, the first offspring, O1, is produced by taking the bit from P1 if the corresponding mask bit

is 1 or the bit from P2 if the corresponding mask bit is 0. Offspring O2 is created using the

inverse of the mask or, equivalently, swapping P1 and P2.
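The mask rule can be checked with a short Python sketch that reproduces the example above. The names uniform_crossover and random_mask are hypothetical helpers introduced only for illustration; the parameter p_one, the probability of a 1 in the mask, is an assumption loosely modelled on the parameterised form mentioned later in this Section.

```python
import random


def uniform_crossover(p1, p2, mask):
    """Mask bit '1' takes the bit from p1 for the first offspring, '0' from p2.

    The second offspring uses the inverse of the mask.
    """
    o1 = "".join(a if m == "1" else b for a, b, m in zip(p1, p2, mask))
    o2 = "".join(b if m == "1" else a for a, b, m in zip(p1, p2, mask))
    return o1, o2


def random_mask(length, p_one=0.5, rng=None):
    """Random crossover mask; p_one is the chance that a mask bit is 1."""
    rng = rng if rng is not None else random.Random()
    return "".join("1" if rng.random() < p_one else "0" for _ in range(length))


if __name__ == "__main__":
    o1, o2 = uniform_crossover("1011000111", "0001111000", "0011001100")
    print(o1)  # 0011110100
    print(o2)  # 1001001011
```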

Uniform crossover, like multi-point crossover, has been claimed to reduce the bias associated

with the length of the binary representation used and the particular coding for a given parameter

set. This helps to overcome the bias in single-point crossover towards short substrings without

requiring precise understanding of the significance of individual bits in the chromosome

representation. Spears and De Jong [67] have demonstrated how uniform crossover may be

parameterised by applying a probability to the swapping of bits. This extra parameter can be

used to control the amount of disruption during recombination without introducing a bias

towards the length of the representation used. When uniform crossover is used with real-valued

alleles, it is usually referred to as discrete recombination.

Discussion

The binary operators discussed in this Section have all, to some extent, used disruption in the

representation to help improve exploration during recombination. Whilst these operators may be

used with real-valued populations, the resulting changes in the genetic material after

recombination would not extend to the actual values of the decision variables, although

offspring may, of course, contain genes from either parent. The intermediate and line

recombination operators overcome this limitation by acting on the decision variables themselves.

Like uniform crossover, the real-valued operators may also be parameterised to provide a

control over the level of disruption introduced into offspring. For discrete-valued

representations, variations on the recombination operators may be used that ensure that only

valid values are produced as a result of crossover [68]. The GA Toolbox provides a number of

crossover routines incorporating most of the methods described above. Single-point, double-

point and shuffle crossover are implemented in the Toolbox functions xovsp, xovdp and xovsh,

respectively, and can operate on any chromosome representation. Reduced surrogate crossover

is supported with both single-point, xovsprs, and double-point, xovdprs, crossover and with

shuffle crossover, xovshrs. A further general multi-point crossover routine, xovmp, is also

provided. To support real-valued chromosome representations, discrete, intermediate and line

recombination operators are also included. The discrete recombination operator, recdis,

performs crossover on real-valued individuals in a similar manner to the uniform crossover

operators. Line and intermediate recombination are supported by the functions reclin and recint

respectively. A high-level entry function to all of the crossover operators is provided by the

function recombin.
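The text does not spell out the intermediate and line recombination operators. A minimal Python sketch, assuming the formulation commonly given in the breeder GA literature, is shown here: each offspring variable is P1 + alpha (P2 - P1) with alpha drawn uniformly from [-d, 1 + d]. The function names, the default d = 0.25 and the list-based representation are assumptions and are not taken from the Toolbox functions recint and reclin.

```python
import random


def intermediate_recombination(p1, p2, d=0.25, rng=None):
    """Draw a fresh scaling factor alpha for every decision variable."""
    rng = rng if rng is not None else random.Random()
    return [a + rng.uniform(-d, 1 + d) * (b - a) for a, b in zip(p1, p2)]


def line_recombination(p1, p2, d=0.25, rng=None):
    """Use one alpha for all variables, so the offspring lies on the
    line through the two parents."""
    rng = rng if rng is not None else random.Random()
    alpha = rng.uniform(-d, 1 + d)
    return [a + alpha * (b - a) for a, b in zip(p1, p2)]


if __name__ == "__main__":
    rng = random.Random(0)
    print(intermediate_recombination([0.0, 10.0], [2.0, 20.0], rng=rng))
    print(line_recombination([0.0, 10.0], [2.0, 20.0], rng=rng))
```

In this sketch d plays the role of the disruption control mentioned above: d = 0 keeps offspring between the parents, while larger values allow them to fall slightly outside.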

Mutation

In natural evolution, mutation is a random process where one allele of a gene is replaced by

another to produce a new genetic structure. In GAs, mutation is randomly applied with low

probability, typically in the range 0.001 to 0.01, and modifies elements in the chromosomes.

Usually considered as a background operator, the role of mutation is often seen as providing a

guarantee that the probability of searching any given string will never be zero and acting as a

safety net to recover good genetic material that may be lost through the action of selection and

crossover [49]. The effect of mutation on a binary string is illustrated in Fig. 21 for a 10-bit

chromosome representing a real value decoded over the interval [0, 10] using both standard and

Gray coding and a mutation point of 3 in the binary string. Here, binary mutation flips the value

of the bit at the locus selected to be the mutation point. Given that mutation is generally applied

uniformly to an entire population of strings, it is possible that a given binary string may be

mutated at more than one point. With non-binary representations, mutation is achieved by either

perturbing the gene values or random selection of new values within the allowed range. Janikow

and Michalewicz [69] demonstrate how real-coded GAs may take advantage of higher mutation

rates than binary-coded GAs, increasing the level of possible exploration of the search space

without adversely affecting the convergence characteristics. Indeed, Tate and Smith [70] argue

that for codings more complex than binary, high mutation rates can be both desirable and

necessary and show how, for a complex combinatorial optimization problem, high mutation

rates and non-binary coding yielded significantly better solutions than the normal approach.

                   bit string             binary    Gray
Original string    0 0 0 1 1 0 0 0 1 0    0.9659    0.6634
Mutated string     0 0 1 1 1 0 0 0 1 0    2.2146    1.8439
(mutation point: bit 3)

Figure 21: Binary Mutation
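The bit flip shown in Fig. 21 can be sketched in Python as follows. The helper names flip_bit and mutate_bits are hypothetical and are not the Toolbox functions mut or mutate; the mutation point is counted from 1, as in the figure, and the decoding to real values is deliberately left out.

```python
import random


def flip_bit(bits, point):
    """Flip the bit at the given 1-based mutation point."""
    i = point - 1
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]


def mutate_bits(bits, p_mut, rng=None):
    """Flip each bit independently with probability p_mut, so a string
    may be mutated at more than one point."""
    rng = rng if rng is not None else random.Random()
    return [1 - b if rng.random() < p_mut else b for b in bits]


if __name__ == "__main__":
    original = [0, 0, 0, 1, 1, 0, 0, 0, 1, 0]
    print(flip_bit(original, 3))  # [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
    print(mutate_bits(original, p_mut=0.01))
```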

Many variations on the mutation operator have been proposed. Examples include biasing the

mutation towards individuals with lower fitness values to increase the exploration in the

search without losing information from the fitter individuals [71], and

parameterising the mutation such that the mutation rate decreases with the population

convergence [72]. Mühlenbein [69] has introduced a mutation operator for the real-coded GA

that uses a non-linear term for the distribution of the range of mutation applied to gene values. It

is claimed that by biasing mutation towards smaller changes in gene values, mutation can be

used in conjunction with recombination as a foreground search process. Other mutation

operations include that of trade mutation [55], whereby the contribution of individual genes in a

chromosome is used to direct mutation towards weaker terms, and reorder mutation [55], that

swaps the positions of bits or genes to increase diversity in the decision variable space. Binary

and integer mutation are provided in the Toolbox by the function mut. Real-valued mutation is

available using the function mutbga. A high-level entry function to the mutation operators is

provided by the function mutate.

Reinsertion

Once a new population has been produced by selection and recombination of individuals from

the old population, the fitness of the individuals in the new population may be determined. If

fewer individuals are produced by recombination than the size of the original population, then

the fractional difference between the new and old population sizes is termed a generation gap

[73]. In the case where the number of new individuals produced at each generation is one or two,

the GA is said to be steady-state [74] or incremental [75]. If one or more of the most fit

individuals is deterministically allowed to propagate through successive generations then the

GA is said to use an elitist strategy. To maintain the size of the original population, the new

individuals have to be reinserted into the old population. Similarly, if not all the new individuals

are to be used at each generation or if more offspring are generated than the size of the old

population then a reinsertion scheme must be used to determine which individuals are to exist in

the new population. An important feature of not creating more offspring than the current

population size at each generation is that the generational computational time is reduced, most

dramatically in the case of the steady-state GA, and that the memory requirements are smaller as

fewer new individuals need to be stored while offspring are produced. When selecting which

members of the old population should be replaced the most apparent strategy is to replace the

least fit members deterministically. However, in studies, Fogarty [74] has shown that no

significant difference in convergence characteristics was found when the individuals selected for

replacement were chosen with inverse proportional selection or deterministically as the least fit.

He further asserts that replacing the least fit members effectively implements an elitist strategy

as the most fit will probabilistically survive through successive generations. Indeed, the most

successful replacement scheme was one that selected the oldest members of a population for

replacement. This is reported as being more in keeping with generational reproduction as every

member of the population will, at some time, be replaced. Thus, for an individual to survive

successive generations, it must be sufficiently fit to ensure propagation into future generations.

The GA Toolbox provides a function for reinserting individuals into the population after

recombination, reins. Optional input parameters allow the use of either uniform random or

fitness-based reinsertion. Additionally, this routine can also be selected to reinsert fewer

offspring than those produced at recombination.
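A fitness-based reinsertion step of the kind described above might look like the following Python sketch, which deterministically replaces the least fit members of the old population with the best offspring. It is not the Toolbox function reins: the name reinsert_fitness_based, its arguments, and the convention that a larger fitness value is better are all assumptions made for illustration.

```python
def reinsert_fitness_based(old_pop, old_fit, offspring, off_fit, n_insert=None):
    """Replace the least fit old members with the fittest offspring.

    n_insert limits how many offspring are reinserted (default: all of
    them). The population size is unchanged. Larger fitness is better.
    """
    if n_insert is None:
        n_insert = len(offspring)
    n_insert = min(n_insert, len(offspring), len(old_pop))
    # Indices of the best offspring and of the worst old members.
    best_off = sorted(range(len(offspring)),
                      key=lambda i: off_fit[i], reverse=True)[:n_insert]
    worst_old = sorted(range(len(old_pop)),
                       key=lambda i: old_fit[i])[:n_insert]
    new_pop, new_fit = list(old_pop), list(old_fit)
    for slot, src in zip(worst_old, best_off):
        new_pop[slot] = offspring[src]
        new_fit[slot] = off_fit[src]
    return new_pop, new_fit


if __name__ == "__main__":
    print(reinsert_fitness_based(["a", "b", "c", "d"], [4, 1, 3, 2],
                                 ["x", "y"], [5, 0], n_insert=1))
```

Setting n_insert below the number of offspring corresponds to reinserting fewer offspring than were produced at recombination.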

Termination of the GA

Because the GA is a stochastic search method, it is difficult to formally specify convergence

criteria. As the fitness of a population may remain static for a number of generations before a

superior individual is found, the application of conventional termination criteria becomes

problematic. A common practice is to terminate the GA after a prespecified number of

generations and then test the quality of the best members of the population against the problem

definition. If no acceptable solutions are found, the GA may be restarted or a fresh search

initiated.

Outline of the basic algorithm

0 START : Create a random population of n chromosomes

1 FITNESS : Evaluate the fitness f(x) of each chromosome in the population

2 NEW POPULATION

    0 SELECTION : Based on f(x)

    1 RECOMBINATION : Cross over chromosomes

    2 MUTATION : Mutate chromosomes

    3 ACCEPTANCE : Reject or accept the new one

3 REPLACE : Replace the old population with the new one: the new generation

4 TEST : Test the problem criterion

5 LOOP : Repeat steps 1-4 until the criterion is satisfied
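The outline above can be turned into a compact Python sketch. It is a minimal illustration under several assumptions, not the implementation used in this work: chromosomes are binary lists, fitness is non-negative and maximised, selection is by roulette wheel, crossover is single-point, every offspring is accepted, and the whole population is replaced each generation. The function run_ga and all of its parameter names and default values are hypothetical.

```python
import random


def run_ga(fitness_fn, n_bits, pop_size=20, p_cross=0.7, p_mut=0.01,
           max_gen=100, target=None, seed=0):
    """Basic generational GA on binary strings; returns the best
    chromosome evaluated and its fitness."""
    rng = random.Random(seed)
    # 0 START: random initial population.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best, best_fit = None, float("-inf")
    for _ in range(max_gen):  # 5 LOOP: stop after max_gen generations
        # 1 FITNESS: evaluate every chromosome.
        fit = [fitness_fn(ind) for ind in pop]
        i_best = max(range(pop_size), key=fit.__getitem__)
        if fit[i_best] > best_fit:
            best, best_fit = pop[i_best][:], fit[i_best]
        # 4 TEST: stop early once the target fitness is reached.
        if target is not None and best_fit >= target:
            break
        # 2 NEW POPULATION.
        weights = [f + 1e-9 for f in fit]  # keeps the total weight positive
        new_pop = []
        while len(new_pop) < pop_size:
            # SELECTION: roulette wheel, with replacement.
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            c1, c2 = p1[:], p2[:]
            # RECOMBINATION: single-point crossover.
            if rng.random() < p_cross:
                k = rng.randint(1, n_bits - 1)
                c1, c2 = p1[:k] + p2[k:], p2[:k] + p1[k:]
            # MUTATION: independent bit flips; ACCEPTANCE: keep all offspring.
            for child in (c1, c2):
                new_pop.append([1 - b if rng.random() < p_mut else b
                                for b in child])
        # 3 REPLACE: the offspring become the new generation.
        pop = new_pop[:pop_size]
    return best, best_fit


if __name__ == "__main__":
    # Toy objective (an assumption): maximise the number of ones.
    print(run_ga(sum, n_bits=12, target=12))
```

The toy objective simply counts ones; in an application such as the one considered here, fitness_fn would instead decode the chromosome into process parameters and evaluate the model response.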
