ANN MODELING OF WIRE EDM AND OPTIMIZATION OF CUTTING PARAMETERS BY GA
A DISSERTATION
Submitted in partial fulfillment of the requirements for the award of the degree
of MASTER OF TECHNOLOGY
in MECHANICAL ENGINEERING
(With Specialization in Production & Industrial Systems Engineering)
By
AMANUEL TESGERA BASHA
DEPARTMENT OF MECHANICAL AND INDUSTRIAL ENGINEERING INDIAN INSTITUTE OF TECHNOLOGY ROORKEE
ROORKEE - 247 667 (INDIA) JUNE, 2005
CANDIDATE'S DECLARATION
I hereby declare that the work which is being presented in the report entitled "ANN MODELING
OF WIRE EDM AND OPTIMIZATION OF CUTTING PARAMETERS BY GA" in partial
fulfillment of the requirement for the award of the degree of Master of Technology in Mechanical
and Industrial Engineering with specialization in Production and Industrial systems engineering,
submitted in the Department of Mechanical and Industrial Engineering, Indian Institute of
Technology, Roorkee is an authentic record of my own work carried out from August 2004 to
June 2005, under the guidance of Dr. H.S. Shan, Professor and Dr. Navneet Arora, Assistant
Professor, Mechanical and Industrial Engineering Department, IIT Roorkee.
The matter embodied in this dissertation has not been submitted by me for the award of any other
degree or diploma.
Date:
Place:
(AMANUEL TESGERA BASHA)
CERTIFICATE
This is to certify that the above statement made by the candidate is correct to the best of my
knowledge.
Date:
Place:
Dr. H.S. Shan
Professor, MIED
Dr. Navneet Arora
Assist. Prof., MIED
Acknowledgement
I express my deep and sincere sense of gratitude from the core of my heart to Dr. H.S Shan,
Professor, Mechanical and Industrial Engineering Department, Indian Institute of Technology,
Roorkee, for his inspiring and painstaking supervision, encouragement and invaluable help during
the course of this thesis work, without which this work would not have been possible. I am
grateful for the long hours he spent discussing and explaining minute details of the work. It
has been a wonderful association which I cherish. I consider myself privileged to have worked
under his supervision and guidance.
I am grateful to my co-guide Dr. Navneet Arora, Assistant Professor, Mechanical and Industrial
engineering department, Indian Institute of Technology Roorkee, Roorkee (India), for his
suggestions and constant encouragement.
The services of the staff of Machine Tool Laboratory, Mechanical and Industrial Engineering
department are acknowledged with sincere thanks. I am particularly thankful to Mr. Jasbir Singh,
for providing technical assistance during the experimental work.
I would also like to thank Dr. Pradeep Kumar, Professor, Mechanical and Industrial Engineering
Department, Indian Institute of Technology, Roorkee, for providing facilities to carry out the
experiments.
Last but not least, I would like to express my gratitude to my Parents for their kind blessing and
for providing moral support and encouragement throughout my life. Grateful acknowledgements
are also due to all my teachers and friends whose timely help has gone a long way in my studies.
AMANUEL TESGERA BASHA
Abstract
Wire electrical discharge machining (WEDM) technology has been widely used in conductive
material machining especially when intricate shapes and profiles have to be cut. Manufacturers
and users of this process always want to achieve higher machining productivity with a desired
accuracy and surface finish. The performance of the WEDM process, in terms of surface finish and
machining productivity, is however affected by many factors, such as applied machining voltage,
ignition pulse current, pulse duration, time between two pulses, servo-speed variation,
servo-control reference voltage, wire speed, wire tension, conductivity of the dielectric and injection
pressure for dielectric. The material of the work piece and its height also influence the process. If
the setting of one of the above parameters changes, it affects the process in a complex way.
Because of the many variables and the complex and stochastic nature of the process, achieving
the optimal performance, even for a highly skilled operator with a state-of-the-art WEDM
machine is rarely possible. An effective way to solve this problem is to discover the relationship
between the performance of the process and its controllable input parameters i.e., model the
process through suitable mathematical techniques. However, the complex and stochastic nature of
the WEDM process has made it difficult to establish a conclusive analytical model; therefore, an
empirical method can be adopted. The present study is aimed at exploiting the strong capabilities
of both ANN and GA, which are suitable for solving manufacturing problems that are not amenable
to modeling using traditional methods.
A feed-forward back-propagation neural network based on Taguchi L18 experimental design is
developed to model the machining process. GA is then employed to find the optimal operating
conditions so that the productivity of wire EDM is improved for a given surface finish
requirement. The set of Pareto-optimal solutions is searched for the processing of titanium alloy.
The model was tested with experimental data and good correlation was obtained between the
expected and experimental results.
Table of contents
CANDIDATE'S DECLARATION
ACKNOWLEDGEMENTS
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
1.1 Nontraditional processes defined
1.2 Why Nontraditional Processes are Important
1.3 Classification of Nontraditional Processes by Type of Energy Used
1.4 Thermal Energy Processes - Overview
1.5 Electrical Discharge Processes
1.5.1 Work Material in EDM
1.5.2 Complex nature of the EDM material removal process
1.5.3 EDM Applications
Chapter 2 Wire EDM process
2.1 History
2.2 EDM vs WEDM
2.3 Wire-EDM equipment
2.3.1 Positioning system
2.3.2 Wire drive system
2.3.3 Power supply
2.3.4 Dielectric system
2.4 Wire-EDM process parameters
2.5 Wire-EDM process capabilities
2.6 WEDM applications
2.6.1 Modern tooling applications
2.6.2 Advanced ceramic materials
2.6.3 Modern composite materials
Chapter 3 Literature review
3.1 Process modeling
3.2 Process optimization
Chapter 4 Neural Network Implementation Issues
4.1 Overview of Neural Network Training Methodology
4.1.1 Training and Test Data Selection
4.1.2 Scaling Input Vectors
4.1.3 Initializing Weights
4.1.4 Over fitting
4.1.5 Neural Network Noise
4.1.6 Stopping Criteria and Cross Validation Training
Chapter 5 ANN modeling of WEDM
5.1 Neural network model
5.2 Experimental details
5.3 Results and Discussion
5.4 The effect of the cutting parameters on the performance
Chapter 6 Optimization of wire EDM process parameters
6.1 Why constrained optimization technique?
6.2 Search for Pareto-optimal WEDM process parameters
6.2.1 Discussion
Chapter 7 Summary and Conclusions
Scope for future research
REFERENCES
Appendix - 1 Neural Networks: an overview
Appendix - 2 An Overview of Genetic Algorithms
LIST OF FIGURES
Fig. no. Title
1 (a) EDM Overall setup
1 (b) EDM Close-up view of gap, showing discharge and metal removal
2 Schematic of wire EDM set up
3 Definition of kerf and over cut in electric discharge wire cutting
4 Complicated shapes produced by wire EDM
5 Schematic diagram of a neuron and a sample of pulse train
6 General symbol of neuron
7 (a) Bipolar continuous activation functions of a neuron
7 (b) Unipolar continuous activation functions of a neuron
8 A standard artificial neuron
9 Configuration and terminology of a multi-layered neural network
10 Neural Network Training Flow Chart
11 Configuration of the neural network
12 Sum of square error vs number of iterations in the training process
13 (a) Surface show the relationship of Gap Voltage with cutting rate (CR)
13 (b) Surface show the relationship of Gap Voltage with surface roughness (SR)
14 (a) Surfaces show the relationship of Ton with cutting rate (CR)
14 (b) Surfaces show the relationship of Ton with surface roughness (SR)
15 (a) Surfaces show the relationship of Toff with cutting rate (CR)
15 (b) Surfaces show the relationship of Toff with surface roughness (SR)
16 (a) Surfaces show the relationship of Ws with cutting rate (CR)
16 (b) Surfaces show the relationship of Ws with surface roughness (SR)
17 The basic structure of EA
18 Structure of a single population evolutionary algorithm
19 Roulette Wheel Selection
20 Multi-point Crossover
21 Binary Mutation
22 Structure of the optimization system
23 Machining performance predictions of ANN model
24 Maximized cutting speed vs. surface roughness
LIST OF TABLES
Table no. Title
1 Training data for experiments planned according to Taguchi's method
2 Test data for experiments with randomly selected input parameters
3 Process performance optimization
4 Actual vs predicted WEDM performances
5 Sorted Pareto-optimal points
Chapter 1 Introduction
Today's manufacturing industry is facing challenges from advanced difficult-to-machine
materials (e.g. tough super alloys, ceramics, and composites), stringent design requirements (e.g.
complex shapes, high precision, and high surface quality), and machining costs. In order to cope
with these challenges, it has become necessary to change to more sophisticated tools of
manufacturing.
This need for more sophisticated tools has resulted in the creation of a new, unique family of
manufacturing processes known as nontraditional manufacturing processes. Generally speaking,
non-traditional processes differ from conventional processes either on account of utilizing
energy in novel ways or by applying forms of energy directly for the purpose of manufacturing.
Wire Electrical Discharge Machining (WEDM), one of the widely accepted non-traditional
material removal processes, has certain unique advantages as compared to other prevalent
nontraditional cutting technologies including laser cutting, plasma cutting and water jet cutting.
The most attractive advantages of this process are long cutting edge (maximal cutting edge >
500 mm), a small cutting kerf (minimal kerf < 0.05 mm), a small cutting taper and a
homogeneous surface. Due to these inherent advantages, it offers an effective and economical
alternative to present large-scale machining techniques. A methodology that can optimize
the productivity of this process while meeting surface quality requirements is therefore of great
significance in promoting the process in the increasingly demanding tool manufacturing industry.
In this thesis work, an attempt is made to model and optimize the wire-EDM process parameters
using the combination of Artificial Neural Network (ANN) and Genetic Algorithm (GA).
In the first chapter a brief introduction to non-traditional machining processes is presented.
Electrical Discharge Machining (EDM) process as one of the thermal energy processes is
discussed in a detailed manner. The process overview based on the widely accepted principle of
thermal conduction and some highlights of its applications are also given. In the second chapter,
an explanation of wire-EDM process is given along with the similarity and difference it has with
the die sinking EDM. History of wire EDM process, process equipment and its applications are
discussed in detail for better understanding of the process. In chapter three, the review of
literature is made in order to understand what has been done so far in the modeling and
optimization of wire EDM process. There are several choices to be made when implementing
neural networks to solve a problem. These choices involve the selection of the training and
testing data, the network architecture, the training method, the data scaling method, and the
error goal. Chapter four is devoted to this part of discussion. The main section of chapter five
focuses on modeling of wire EDM process by using multi-layered back propagation neural
network. The experimental details, the network architecture developed for this purpose, and the
results of training and testing the network with the experimental data are given in this chapter.
Chapter six is devoted to the optimization of the wire EDM process. The use of a genetic algorithm
to solve the constrained optimization problem is explored. The importance of finding Pareto-optimal
points and how to find them from the predicted data are also discussed. The
final part of the thesis gives the conclusions drawn from the results and indicates the future
research direction in WEDM modeling and optimization.
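The idea of Pareto-optimal points mentioned for chapter six can be illustrated with a minimal sketch. The candidate (cutting rate, surface roughness) pairs below are made-up numbers, not results from this thesis, and `pareto_front` is a hypothetical helper name. The thesis searches with a GA, whereas this sketch simply filters a small list by brute force.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (cutting rate, surface roughness)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one.

    Cutting rate is to be maximized, surface roughness minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates, sorted by cutting rate."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(front)


# Hypothetical predictions, for illustration only.
candidates = [(1.0, 2.0), (2.0, 2.5), (1.5, 3.0), (3.0, 3.5), (2.5, 3.6)]
front = pareto_front(candidates)
print(front)
```

Each point kept in `front` trades a higher cutting rate against a rougher surface, so none of them can be improved in one objective without giving up the other.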
1.1 Nontraditional Processes Defined
Nontraditional processes are a group of processes that remove excess material by various techniques involving mechanical,
thermal, electrical, or chemical energy (or combinations of these energies) but do not use a
sharp cutting tool in the conventional sense. They have been developed since World War II in response to new
and unusual machining requirements that could not be satisfied by conventional methods.
1.2 Why Nontraditional Processes are Important
• Need to machine newly developed metals and non-metals with special properties that
make them difficult or impossible to machine by conventional methods.
• Need for unusual and/or complex part geometries that cannot easily be accomplished by
conventional machining.
• Need to avoid surface damage that often accompanies conventional machining.
1.3 Classification of Nontraditional Processes by the Type of Energy Used
• Mechanical - erosion of work material by a high velocity stream of abrasives or fluid (or
both) is the typical form of mechanical action
• Electrical - electrochemical energy to remove material (reverse of electroplating)
• Thermal - thermal energy usually applied to a small portion of the work surface, causing that
portion to be removed by fusion and/or vaporization
• Chemical - chemical etchants selectively remove material from portions of the work part,
while other portions are protected by a mask.
1.4 Thermal Energy Processes - Overview
Very high local temperatures are involved; material is removed by fusion or vaporization.
Physical and metallurgical damage to the new work surface is common in this case. In some
cases, resulting surface finish is so poor that subsequent processing is required.
1.4.1 Thermal Energy Processes
• Electric discharge machining
• Wire electrical discharge cutting
• Electron beam machining
• Laser beam machining
• Plasma arc machining
• Conventional thermal cutting processes
1.5 Electrical Discharge Processes
EDM is a non-traditional manufacturing process that uses electric spark discharges to machine
electrically conducting materials. This process is typically used for materials such as tool and
die-steels, ceramics, etc., which are hard to machine using a more traditional approach. During
the process, a voltage is applied between two electrodes, the tool and the workpiece, closely
placed inside a liquid dielectric medium. When electrodes are very close to each other (gap
distance 0.05 mm), an electric spark discharge occurs between them forming a plasma channel
between the cathode and the anode. Fig. 1 shows a close-up of the machining region. The spark
generates enough heat to melt and even vaporize some of the workpiece material. As the spark
collapses, some of the molten and vaporized workpiece material is removed from the rest of the
workpiece and is carried away by the dielectric. Discharge duration is controlled by the process
parameters used and can be anywhere from a few microseconds to hundreds of microseconds.
Although the quantity of material removed per discharge is minuscule, a large number of discharges
occurring over time results in removal of the desired amount of material. As material is removed
from the workpiece, the tool slowly moves towards the workpiece surface (aided by a servo-control
mechanism) so that a constant gap between the two can be maintained. The liquid
dielectric serves two purposes. It helps to keep the expanding plasma channel confined to a
small diameter so that the intensity of the heat flux is very high over a small surface area of the
electrodes. This ensures that melting, and even vaporization, can occur. The other use of the
dielectric is to flush some of the particles that gather in the gap between the electrodes. EDM
processes can be broadly classified into two categories, die-sinking EDM where the tool shape
complements the final desired shape of the workpiece, and wire-EDM where the discharge takes
place between a thin wire and the workpiece. The wire in wire-EDM applications acts almost
like an electrical saw.
Figure 1 - Electric discharge machining (EDM): (a) overall setup, and (b) close-up view of gap,
showing discharge and metal removal. Labels in the figure: tool feed, tool, tool wear, ionized fluid, discharge, flow of dielectric fluid, metal removed from cavity, cavity created by discharge, recast metal.
1.5.1 Work material in EDM
• Only electrically conducting work materials
• Hardness and strength of the work material are not factors in EDM
• Material removal rate is related to melting point of work material
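The dependence of removal rate on melting point can be made concrete with an empirical relation that is not given in this text but is commonly quoted in general manufacturing textbooks: MRR = K I / Tm^1.23, with K of about 664 in SI units (MRR in mm³/s, discharge current I in A, melting temperature Tm in °C). The constant, the exponent, the function name and the example inputs below are all assumptions for illustration and are not taken from the experiments in this thesis.

```python
def edm_removal_rate(current_a: float, melting_temp_c: float, k: float = 664.0) -> float:
    """Estimate EDM metal removal rate in mm^3/s.

    Assumed empirical relation (not from this text): MRR = K * I / Tm**1.23,
    with K = 664 for SI units, I in amperes and Tm in degrees Celsius.
    """
    if current_a < 0 or melting_temp_c <= 0:
        raise ValueError("current must be non-negative and melting temperature positive")
    return k * current_a / melting_temp_c ** 1.23


# Illustrative inputs only: 25 A discharge current, two nominal melting points.
rate_low_melting = edm_removal_rate(25.0, 660.0)    # roughly aluminium
rate_high_melting = edm_removal_rate(25.0, 1083.0)  # roughly copper
print(f"{rate_low_melting:.2f} mm^3/s vs {rate_high_melting:.2f} mm^3/s")
```

Under this assumed relation, a work material with a higher melting point is removed more slowly at the same current, which is the qualitative point made in the list above.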
1.5.2 Complex nature of the EDM material removal process
EDM involves the complex interaction of many physical phenomena. The electric spark
between the anode and the cathode generates a large amount of heat over a small area of the
workpiece. A portion of this heat is conducted through the cathode, a fraction is conducted
through the anode, and the rest is dissipated by the dielectric. The duration of the spark is of the
order of microseconds and during this time a plasma channel is formed between the tool and the
workpiece. Electrons and ions travel through this plasma channel. The plasma channel induces a
large amount of pressure on the workpiece surface as well. This pressure holds back the molten
material in its place. As the plasma starts forming, it displaces the dielectric fluid and a shock
wave passes through the fluid. As soon as the spark duration time is over and the spark collapses,
the dielectric gushes back to fill the void. This sudden removal of pressure results in a violent
ejection of the molten and vaporized material from the workpiece surface [1,2]. Ejected molten
particles quickly solidify as they come in contact with the colder fluid and are eventually
flushed out by the dielectric. Small craters are formed at locations where material has been
removed. Multiple craters overlap each other and the machined surface that is finally produced
consists of numerous overlapping craters. Although molten material ejection is not the only
means of material removal in EDM it is, however, the dominant mode of material removal in
case of metals [2]. In the machining of ceramics which have much higher melting and boiling
points, material spalling is the mechanism for material removal [2]. During machining the local
temperature in the workpiece gets close to the vaporization temperature of the material [1,2].
Thus, phase transformation from solid to liquid as well as liquid to vapor occurs during the
heating cycle. Part of the transformed material is removed but the rest re-solidifies on the
surface of the workpiece. This re-solidified layer is usually called the white layer, as it is not
easily etchable. EDM processes carried out in hydrocarbon dielectrics lead to the partial
breakdown of dielectrics and this further leads to some diffusion of carbon.
1.5.3 EDM Applications
• Tooling for many mechanical processes: molds for plastic injection molding, extrusion
dies, wire drawing dies, forging and heading dies, and sheet metal stamping dies
• Production parts: delicate parts not rigid enough to withstand conventional cutting forces,
hole drilling where hole axis is at an acute angle to surface, and machining of hard and
exotic metals.
Chapter 2 Wire EDM process
The WEDM process differs from the conventional EDM process in that a small wire is engaged
as the tool electrode. The wire unwinding from a wire supply wheel is continuously fed through
the workpiece by the wire traction rollers and taken up by a collection spool. The workpiece,
mounted on the clamp frame, is almost never submerged in the dielectric medium, which is
delivered to the gap between the wire and workpiece via a hose or flushed through the sparking
area coaxially with the wire. The wire-workpiece gap usually ranges from 0.025 to 0.05 mm and
is constantly maintained by a computer-controlled (CNC) positioning system. This positioning
system is also responsible for controlling the movement of the wire to achieve the desired
complex two- and three-dimensional (2- and 3-D) shapes for the workpiece.
Fig. 2 Schematic of wire EDM set up (labels: wire supply spool, wire electrode, dielectric fluid flow, cutting path, wire take-up spool, feed motion axes)
Figure 3 - Definition of kerf and overcut in electric discharge wire cutting (labels: wire diameter, overcut)
2.1 History
WEDM was first introduced to the manufacturing industry in the late 1960s. The development
of the process was the result of seeking a technique to replace the machined electrode used in
EDM. In 1974, D.H. Dulebohn applied the optical-line follower system to automatically control
the shape of the component to be machined by the WEDM process [1]. By 1975, its popularity
was rapidly increasing, as the process and its capabilities were better understood by the industry
[2]. It was only towards the end of the 1970s, when computer numerical control (CNC) systems
were introduced into WEDM, that a major evolution of the machining process came about. As a
result, the broad capabilities of the WEDM process were extensively exploited for any
through-hole machining, since the wire has to pass through the part to be machined.
2.2 EDM vs WEDM
While the material removal mechanisms of EDM and WEDM are similar, their functional
characteristics are not identical. WEDM uses a thin wire fed continuously through the
workpiece under microprocessor control, which enables parts of complex shapes to be machined with
exceptionally high accuracy. A varying degree of taper, ranging from 15° for a 100 mm thick to
30° for a 400 mm thick workpiece, can also be obtained on the cut surface. The microprocessor
also constantly maintains the gap between the wire and the workpiece, which varies from 0.025
to 0.05 mm [2]. WEDM eliminates the need for elaborate pre-shaped electrodes, which are
commonly required in EDM to perform the roughing and finishing operations. In the case of
WEDM, the wire has to make several machining passes along the profile to be machined to
attain the required dimensional accuracy and surface finish (SF) quality. The typical WEDM
cutting rates (CRs) are 300 mm²/min for a 50 mm thick D2 tool steel and 750 mm²/min for a
150 mm thick aluminium [2], and SF quality is as fine as 0.12-0.25 µm Ra. In addition, WEDM
uses deionized water instead of hydrocarbon oil as the dielectric fluid and contains it within the
sparking zone. The deionized water is not suitable for conventional EDM as it causes rapid
electrode wear, but its low viscosity and rapid cooling rate make it ideal for WEDM [2].
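The cutting rates above are area rates, so the corresponding linear feed along the cut path follows by dividing by the workpiece thickness. The short sketch below works this out for the two cases quoted in the paragraph; the function name is hypothetical, and the calculation assumes the area rate is simply thickness multiplied by linear feed.

```python
def linear_feed_mm_per_min(area_rate_mm2_per_min: float, thickness_mm: float) -> float:
    """Convert an area cutting rate (mm^2/min) to a linear feed (mm/min).

    Assumes area rate = workpiece thickness * linear feed along the cut path.
    """
    if thickness_mm <= 0:
        raise ValueError("thickness must be positive")
    return area_rate_mm2_per_min / thickness_mm


# Values quoted in the text: 300 mm^2/min in 50 mm D2 tool steel,
# 750 mm^2/min in 150 mm aluminium.
steel_feed = linear_feed_mm_per_min(300.0, 50.0)
aluminium_feed = linear_feed_mm_per_min(750.0, 150.0)
print(steel_feed, aluminium_feed)
```

Although the aluminium area rate is 2.5 times that of the steel, the linear feed is slightly lower because the workpiece is three times as thick.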
2.3 Wire-EDM equipment
A wire-EDM machine consists of four sub-systems: the positioning system, the wire drive
system, the power supply, and the dielectric system. All the four subsystems have distinct
differences from conventional EDM.
2.3.1 Positioning system
Wire-EDM positioning systems usually consist of a CNC two-axis table and, in some cases, an
additional multi-axis wire-positioning system. The distinguishing feature of the CNC system is
that it must operate in adaptive control mode to ensure a consistent gap
between the wire and work piece at all times. If the wire should come in contact with the work piece or if a
small piece of material bridges the gap and causes a short circuit, the positioning system must
sense this condition and back up along the programmed path to reestablish the proper cutting
conditions.
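The back-up behaviour described above can be sketched as a simple loop over the programmed path. This is only an illustration under stated assumptions: the path is a list of points, the `short_circuit` callable stands in for the machine's gap sensing, and `backoff_steps` and `max_ticks` are hypothetical parameters. Real controllers are considerably more involved.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def follow_path(path: List[Point],
                short_circuit: Callable[[int, int], bool],
                backoff_steps: int = 2,
                max_ticks: int = 1000) -> List[int]:
    """Return the sequence of path indices visited by the wire.

    On each control tick the wire advances one point along the programmed
    path, unless a short circuit is sensed, in which case it retreats
    backoff_steps points along the same path before resuming.
    """
    visited = [0]
    index = 0
    tick = 0
    while index < len(path) - 1 and tick < max_ticks:
        tick += 1
        if short_circuit(index, tick):
            index = max(0, index - backoff_steps)  # back up along the programmed path
        else:
            index += 1
        visited.append(index)
    return visited


# Hypothetical straight cut of five points with one short circuit at tick 3.
programmed_path = [(float(x), 0.0) for x in range(5)]
trace = follow_path(programmed_path, lambda index, tick: tick == 3)
print(trace)
```

The trace shows the wire retreating to the start after the simulated short circuit and then re-cutting the same programmed points, rather than deviating from the path.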
2.3.2 Wire drive system
The function of the wire drive system is to continuously deliver fresh wire under constant
tension to the work area. Constant wire tension is important to avoid such problems
as taper, machining streaks, wire breaks, and vibration marks.
As the wire passes through the work piece, it is guided by a set of sapphire or diamond guides.
Before being collected by the take-up spool, it passes through a series of tensioning rollers.
Many wire-EDM systems use a massive granite slab as the machine base to further guarantee
wire accuracy and stability.
Automatic wire threading is a recently introduced feature that boosts productivity. It
automatically re-threads the wire after breakage and so enables longer runs. The wire makes only one
pass through the work piece and is then discarded.
2.3.3 Power supply
The most pronounced differences between the power supplies used for wire-EDM and
conventional EDM are the frequency of the pulses used and the current. To produce the
smoothest surface finish possible, pulse frequencies as high as 1 MHz may be used with wire-
EDM. Such a high frequency ensures that each spark removes as little material as possible, thus
reducing the size of EDM crater.
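To give a feel for the time scales involved, the sketch below converts a pulse frequency into a pulse period and back. It assumes each pulse cycle consists of one on-time followed by one off-time; the 0.4 µs / 0.6 µs split is a hypothetical example, not a setting reported in this text, and the function names are illustrative.

```python
def pulse_period_us(frequency_hz: float) -> float:
    """Duration of one pulse cycle in microseconds."""
    return 1e6 / frequency_hz


def pulse_frequency_hz(t_on_us: float, t_off_us: float) -> float:
    """Pulse frequency for a cycle made of one on-time and one off-time (both in microseconds)."""
    return 1e6 / (t_on_us + t_off_us)


def duty_factor(t_on_us: float, t_off_us: float) -> float:
    """Fraction of each cycle during which the pulse is on."""
    return t_on_us / (t_on_us + t_off_us)


period = pulse_period_us(1_000_000)         # the 1 MHz upper figure quoted in the text
frequency = pulse_frequency_hz(0.4, 0.6)    # hypothetical on/off split
duty = duty_factor(0.4, 0.6)
print(period, frequency, duty)
```

At 1 MHz the whole cycle lasts one microsecond, so the on-time available to each spark is shorter still, which is consistent with the small crater size mentioned above.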
Because the diameter of the wire used is so small, its current-carrying capability is limited.
Because of this limitation, wire-EDM power supplies are rarely built to deliver more than 20 A
of current.
2.3.4 Dielectric system
De-ionized water is the dielectric used for the wire-EDM process. De-ionized water is used for
four reasons: low viscosity, high cooling rate, high material removal rate and absence of fire
hazard.
The small cutting gap used with wire-EDM mandates that a low-viscosity dielectric be used to
ensure adequate flushing. Water meets this criterion. Water can also remove heat from the
cutting area much more efficiently than conventional dielectric oils. More efficient cooling
results in extremely thin recast layers.
Very high specific material removal rates can be achieved when using water as dielectric;
however, the wear rate on the tool (wire) is also high. Because the wire is not reused, the high
tool-wear rate is of no consequence. This explains however why water is not commonly used
with conventional EDM.
Finally, because of the slow processing speeds of wire-EDM, many users run their most
time-consuming jobs unattended overnight or over the weekend. With conventional EDM, the use of
flammable dielectric oils presents a fire hazard. When using water for the dielectric, the fire
hazard problem is eliminated.
Rather than submerge the entire part into de-ionized water, local delivery is most often used.
Some systems deliver the dielectric fluid via a hose directed at the cut interface. The most
efficient method of dielectric delivery (with respect to flushing) is to provide a stream of de-
ionized water coaxial with the wire.
2.4 Wire-EDM process parameters
The linear cutting rate for wire-EDM is approximately 38-115 mm/hr in 25 mm thick steel or
approximately 20 mm/hr in 76 mm steel. The linear speed is dependent upon the thickness of the
material but not upon the shape of the cut. The linear cutting rate is the same whether a straight
cut or complex curves are being generated.
The speed of the wire passing through the work piece can vary from 8-40 mm/sec depending
upon cutting conditions.
2.5 Wire-EDM process capabilities
Wire-EDM is a specialized process that is capable of machining electrically conductive work
pieces to produce fine finishes, extremely high accuracies and cut edges that have a smooth,
matte finish.
The matte finish is a result of the thousands of microscopic pits remaining from the spark
erosion. When applied to punch-and-die application, the oil-retaining quality of these micro pits
has been known to increase the die life. Surface finishes ranging from 0.12 to 0.25 µm are
routinely obtained, and by utilizing a second "finish pass", finishes as good as 0.05-0.12 µm
are possible. Many wire-EDM machines are available with a positioning resolution of 0.001 mm
and can routinely obtain accuracies of 0.007 mm [2].
Advantages
• No electrode fabrication required
• No cutting forces
• Unmanned machining
• Die costs reduced by 30-70 %
• Cuts hardened materials
• Intricate shapes can be cut with the same ease as a straight cut
• Very small kerf width
Disadvantages
• High capital cost
• Recast layer
• Electrolysis can occur in some materials
• Slow cutting rates
• Not applicable to very large workpieces
2.6 WEDM applications
• Ideal for stamping die components: because the kerf is so narrow, it is often possible to
fabricate punch and die in a single cut.
• Other tools and parts with intricate outline shapes, such as lathe form tools, extrusion
dies, flat templates and almost any complicated shapes (Fig.4).
Fig.4 Complicated shapes produced by wire EDM
2.6.1 Modern tooling applications
WEDM has been gaining wide acceptance in the machining of various materials used in modern
tooling applications. Several authors [3,4] have investigated the machining performance of
WEDM in the wafering of silicon and machining of compacting dies made of sintered carbide.
The feasibility of using cylindrical WEDM for dressing a rotating metal bond diamond wheel
used for the precision form grinding of ceramics has also been studied [5]. The results show that
the WEDM process is capable of generating precise and intricate profiles with small corner radii
but a high wear rate is observed on the diamond wheel during the first grinding pass. Such an
initial high wheel wear rate is due to the over-protruding diamond grains, which do not bond
strongly to the wheel after the WEDM process [6]. The WEDM of permanent NdFeB and 'soft'
MnZn ferrite magnetic materials used in miniature systems, which requires small magnetic parts,
was studied by comparing it with the laser-cutting process [7]. It was found that the WEDM
process yields better dimensional accuracy and SF quality but has a slow CR, 5.5 mm/min for
NdFeB and 0.17 mm/min for MnZn ferrite. A study was also done to investigate the machining
performance of micro-WEDM used to machine a high aspect ratio meso-scale part using a
variety of metals including stainless steel, nitronic austenitic stainless steel, beryllium copper and
titanium [8].
2.6.2. Advanced ceramic materials
The WEDM process has also evolved as one of the most promising alternatives for the
machining of advanced ceramics. Sanchez et al. [9] provided a literature survey on the EDM
of advanced ceramics, which have been commonly machined by diamond grinding and lapping.
In the same paper, they studied the feasibility of machining boron carbide (B4C) and silicon
infiltrated silicon carbide (SiC) using EDM and WEDM. Cheng et al. [10] also evaluated the
possibility of machining ZrB2-based materials using EDM and WEDM, whereas Matsuo and
Oshima [11] examined the effects of conductive carbide content, namely niobium carbide (NbC)
and titanium carbide (TiC), on the CR and surface roughness of zirconia ceramics (ZrO2) during
WEDM. Lok and Lee [12] have successfully WEDMed sialon 501 and aluminium oxide-titanium
carbide (Al2O3-TiC). However, they realized that the MRR is very low as compared to
the cutting of metals such as alloy steel SKD-11 and the surface roughness is generally inferior
to the one obtained with the EDM process. Dauw et al. [13] explained that the MRR and surface
roughness are not only dependent on the machining parameters but also on the material of the
part. An innovative method of overcoming the technological limitation of the EDM and WEDM
processes, which require the electrical resistivity of the material to be below threshold values of
approximately 100 Ω/cm [14] or 300 Ω/cm [15], has recently been explored. There are different
grades of engineering ceramics, which Konig et al. [14] classified as non-conductor,
natural-conductor and conductor, the last being a result of doping non-conductors with
conductive elements.
Mohri et al. [16] brought a new perspective to the traditional EDM phenomenon by using an
assisting electrode to facilitate the sparking of highly electrical-resistive ceramics. Both the
EDM and WEDM processes have been successfully tested by diffusing conductive particles from
assisting electrodes onto the surface of sialon ceramics, assisting the feeding of the electrode through
the insulating material. The same technique has also been tried on other types of
insulating ceramic materials, including oxide ceramics such as ZrO2 and Al2O3, which have very
limited electrical conductivity [17].
2.6.3. Modern composite materials
Among the different material removal processes, WEDM is considered as an effective and
economical tool in the machining of modern composite materials. Several comparative studies
[18, 19] have been made between WEDM and laser cutting in the processing of metal matrix
composites (MMC), carbon fibre and reinforced liquid crystal polymer composites. These
studies showed that WEDM yields better cutting edge quality and has better control of the
process parameters with fewer workpiece surface damages. However, it has a slower MRR for
all the tested composite materials. Gadalla and Tsai [20] compared WEDM with conventional
diamond sawing and discovered that it produces a roughness and hardness that is comparable to
a low speed diamond saw but with a higher MRR. Yan et al. [21] surveyed the various
machining processes performed on the MMC and experimented with the machining of
Al2O3/6061Al composite using rotary EDM coupled with a disk-like electrode. Other studies
[22, 23] have been conducted on the WEDM of Al2O3 particulate reinforced composites,
investigating the effect of the process parameters on the WEDM performance measures. It was
found that the process parameters have little influence on the surface roughness but have an
adverse effect on CR.
Chapter 3 Literature review
Wire EDM manufacturers and users always want to achieve higher machining productivity with
a desired accuracy and surface finish. Performance of the WEDM process, however, is affected
by many factors (workpiece material, wire material, dielectric medium, adjustable parameters,
etc.) and a single parameter change will influence the process in a complex way. As surface
finish and cutting speed are the most important parameters in manufacturing, investigations
have been carried out by several researchers [24-27] for improving the surface finish and cutting
speed of the WEDM process. However, because of the many variables and the complex and
stochastic nature of the process [28], achieving the optimal performance, even for a highly
skilled operator with a state-of-the-art WEDM machine, is rarely possible. An effective way to
solve this problem is to discover the relationship between the performance of the process and its
controllable input parameters (i.e., model the process through suitable mathematical techniques),
and then determine the optimal parameters for a given set of conditions.
Investigations into the influence of machining input parameters on the performance of EDM and
WEDM have been reported widely [24-41] and several attempts have been made to model the
process.
3.1 Process modeling
Traditionally, the selection of the most favorable process parameters was based on experience or
handbook values, which produced inconsistent machining performance. However, the
optimization of parameters now relies on process analysis to identify the effect of operating
variables on achieving the desired machining characteristics. The modeling of the WEDM
process by means of mathematical techniques has also been applied to effectively relate the
large number of process variables to the different performance of the process. Spedding and
Wang [42] developed the modeling techniques using the response surface methodology and
artificial neural network technology to predict the process performance such as MR, SQ and
surface waviness within a reasonably large range of input factor levels. Liu and Esterling [43]
proposed a solid modeling method, which can precisely represent the geometry cut by the
WEDM process, whereas Hsue et al. [44] developed a model to estimate the MRR during
geometrical cutting by considering wire deflection with transformed exponential trajectory of
the wire centre. Spur and Schönbeck [45] designed a theoretical model studying the influence
of the workpiece material and the pulse-type properties on the WEDM of a workpiece with an
anodic polarity. Han et al. [46] developed a simulation system, which accurately reproduces the
discharge phenomena of WEDM. The system also applies an adaptive control, which
automatically generates an optimal machining condition for high precision WEDM.
3.2 Process optimization.
Many different types of problem-solving quality tools have been used to investigate the
significant factors and their inter-relationships with the other variables in obtaining an optimal
WEDM CR. Konda et al. [29] classified the various potential factors affecting the WEDM
performance measures into five major categories, namely the different properties of the
workpiece material and dielectric fluid, machine characteristics, adjustable machining
parameters, and component geometry. In addition, they applied the design of experiments (DOE)
technique to study and optimize the possible effects of variables during process design and
development, and validated the experimental results using signal-to-noise (S/N) ratio analysis.
Tarng et al. [30] employed a neural network system with the application of a simulated
annealing algorithm for solving the multi-response optimization problem. It was found that the
machining parameters such as the pulse on/off duration, peak current, open circuit voltage,
servo reference voltage, electrical capacitance and table speed are the critical parameters for the
estimation of the CR and SF. Huang et al. [31] argued that several published works are
concerned mostly with the optimization of parameters for the roughing cutting operations and
proposed a practical strategy of process planning from roughing to finishing operations. The
experimental results showed that the pulse on-time and the distance between the wire periphery
and the workpiece surface affect the CR and SF significantly. The effects of the discharge
energy on the CR and SF of a MMC have also been investigated.
Chapter 4 Neural Network Implementation Issues
Due to its ability to address complex and nonlinear problems (problems whose solutions have
not been explicitly formulated), the widely accepted artificial neural network (ANN) method is
chosen to model the complex behavior between input and output in the WEDM process. It has
been used extensively in many fields such as forecasting, pattern recognition, robotics,
parameter selection, process modeling, monitoring and control. It was originally inspired by
the way humans receive and transfer information in making decisions. A simple
model of ANN consists of an input layer, a hidden layer and an output layer. With sets of
input-output patterns presented at the input and output layers, the hidden layer passes
information from the input to the output layer with different strengths, through so-called
weights. The weights are adjusted in the learning process, in which all the input-output
patterns are presented repeatedly. There are many learning algorithms available, and the
most popular and successful learning algorithm used to train multilayer networks is the back
propagation scheme. Any output point can be obtained after this learning phase, and good
results can be achieved. In Appendix 1, a brief review of the fundamentals of multilayered
feed-forward neural networks is provided. For more details, reference may be made to Freeman
and Skapura [47] and Vemuri [48].
Neural networks are highly flexible modeling tools with an ability to learn the mapping between
input variables and output feature spaces. Therefore, neural networks are considered in this
work to model the wire-EDM process with multi-dimensional input and output spaces.
There are several choices to be made when implementing neural networks to solve a problem.
These choices involve the selection of the training and testing data, the network architecture, the
training method, the data scaling method, and the error goal. Since over 90% of all neural
network implementations use back propagation trained multi-layer perceptrons, an attempt has
been made to discuss and implement it in this work.
4.1 Overview of Neural Network Training Methodology
Figure 10 shows the methodology to follow when training a neural network. First we must
collect or generate the data to be used for training and testing the neural network. In the present
case experimental data generated on wire EDM has been used. Once this data is collected, it
must be divided into a training set (Table 1) and a test set (Table 2). The training set should
cover the input space or should at least cover the space in which the network will be expected to
operate. If there is no training data for certain conditions, the output of the network should not
be trusted for those inputs. The division of the data into the training and test sets is somewhat of
an art and somewhat of a trial and error procedure. We want to keep the training set small so
that training is fast, but we also want to exercise the input space well which may require a large
training set.
Once the training set is selected, we must choose the neural network architecture. There are two
lines of thought here. Some designers choose to start with a fairly large network that is sure to
have enough degrees of freedom (neurons in the hidden layer) to train to the desired error goal;
then, once the network is trained, they try to shrink the network until the smallest network that
trains remains. Other designers choose to start with a small network and grow it until the
network trains and its error goal is met. We will use the second method, which involves initially
selecting a fairly small network architecture.
After the network architecture is chosen, the weights and biases are initialized and the network
is trained. The network may not reach the error goal due to one or more of the following reasons.
1. The training gets stuck in local minima.
2. The network does not have enough degrees of freedom to fit the desired
input/output model.
3. There is not enough information in the training data to perform the desired
mapping.
Fig.10 Neural Network Training Flow Chart (Collect Data; Select Training and Test Sets; Select
Neural Network Architecture; Initialize Weights; if the SSE goal is not met, Change Weights or
Increase NN Size; otherwise Run Test Set; if the SSE goal is then not met, Reselect Training Set
or Collect More Data; otherwise Done)
In case one, the weights and biases are reinitialized and training is restarted. In case two,
additional hidden nodes or layers are added, and network training is restarted. Case three is
usually not apparent unless all else fails. When attempting to train a neural network, you want
to end up with the smallest network architecture that trains correctly (meets the error goal); if
not, you may have over fitting. Over fitting is described in greater detail in Section 4.1.4.
Once the smallest network that trains to the desired error goal is found, it must be tested with the
test data set. The test data set should also cover the operating region well. Testing the network
involves presenting the test set to the network and calculating the error. If the error goal is met,
training is complete. If the error goal is not met, there could be two causes:
1. Poor generalization due to an incomplete training set.
2. Over fitting due to an incomplete training set or too many degrees of freedom in the
network architecture.
The cause of the poor test performance is rarely apparent without using cross validation
checking, which is discussed in Section 4.1.6. If an incomplete training set is causing the poor
performance, the test patterns that have high error levels should be added to the training set, a
new test set should be chosen, and the network should be retrained. If there is not enough data
left for training and testing, data may need to be collected again or be regenerated.
4.1.1 Training and Test Data Selection
Neural network training data should be selected to cover the entire region where the network is
expected to operate. Usually a large amount of data is collected and a subset of that data is used
to train the network. Another subset of that data is then used as test data to verify the correct
generalization of the network. If the network does not generalize well on several data points,
that data is added to the training data and the network is retrained. This process continues until
the performance of the network is acceptable.
The training data should bound the operating region because a neural network's performance
cannot be relied upon outside the region covered by the training data. The ability to perform
outside that region is called a network's extrapolation ability.
4.1.2 Scaling Input Vectors
Training data is scaled for two major reasons. First, input data is usually scaled to give each
input equal importance and to prevent premature saturation of sigmoidal activation functions.
Secondly, output or target data is scaled if the output activation functions have a limited range
and the unscaled targets do not match that range.
There are two popular types of input scaling: linear scaling and z-score scaling. Linear
scaling transforms the data into a new range, which is usually 0.1 to 0.9.
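The two scaling methods named above can be sketched as follows. This is a minimal illustration in Python with NumPy, not the program used in this work; the function names and the three sample rows are assumptions chosen only to show the arithmetic.

```python
import numpy as np

def linear_scale(x, lo=0.1, hi=0.9):
    """Map each column of x linearly onto the range [lo, hi]."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=0)
    x_max = x.max(axis=0)
    return lo + (hi - lo) * (x - x_min) / (x_max - x_min)

def zscore_scale(x):
    """Give each column of x zero mean and unit standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

# Illustrative values only: three (GV, Ton) pairs.
samples = np.array([[50.0, 1.1], [60.0, 1.2], [70.0, 1.3]])
scaled = linear_scale(samples)        # every column now spans 0.1 to 0.9
standardized = zscore_scale(samples)  # every column now has mean 0, std 1
```

Both functions work column by column, so that each input variable is scaled independently of the others.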
4.1.3 Initializing Weights
As mentioned above, the initial weights should be selected to be small random values in order to
prevent premature saturation of the sigmoidal activation functions. The most common method
is to use the random number generator and pass it the number of inputs plus 1 and the number of
hidden nodes for the first hidden layer weight matrix W1, and pass it the number of outputs and
hidden nodes plus 1 for the output weight matrix W2. One is added to the number of inputs in
W1 and to the number of hidden nodes in W2 to account for the bias. To make the weights
somewhat smaller, the resulting random weight matrix is multiplied by 0.5.
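A minimal sketch of this initialization is given below in Python with NumPy. The choice of a standard normal generator, the seed, and the row/column orientation of W1 and W2 are assumptions made for illustration; the text fixes only the matrix sizes and the factor 0.5.

```python
import numpy as np

def init_weights(n_inputs, n_hidden, n_outputs, scale=0.5, seed=0):
    """Return small random weight matrices W1 and W2, each with one bias column."""
    rng = np.random.default_rng(seed)  # assumed generator; any random source would do
    # One row per hidden node; columns are the inputs plus one bias term.
    w1 = scale * rng.standard_normal((n_hidden, n_inputs + 1))
    # One row per output node; columns are the hidden nodes plus one bias term.
    w2 = scale * rng.standard_normal((n_outputs, n_hidden + 1))
    return w1, w2

# Sizes of a network with 5 inputs, 11 hidden nodes and 2 outputs.
w1, w2 = init_weights(n_inputs=5, n_hidden=11, n_outputs=2)
```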
4.1.4 Over fitting
Several parameters affect the tendency of a neural network to over fit the data. Over fitting is
apparent when a network's error level for the training data is significantly better than the error
level of the test data. When this happens, the network has learned the peculiarities of the training data,
such as noise, rather than the underlying functional relationship of the model to be learned.
Over fitting can be reduced by:
1. Limiting the number of free parameters (neurons) to the minimum necessary.
2. Increasing the training set size so that the noise averages itself out.
3. Stopping training before over fitting occurs.
4.1.5 Neural Network Noise
As discussed above, when there is noise in the training data, a method to calculate the RMS
error goal needs to be used. If there is significant noise in the data, increasing the number of
patterns in the training set can reduce the amount of over fitting.
4.1.6 Stopping Criteria and Cross Validation Training
The last method of reducing the chance of over fitting is cross validation training. Cross
validation training uses the principle of checking for over fitting during training. This
methodology uses two sets of data during training. One set is used for training and the other is
used to check for over fitting. Since over fitting occurs when the neural network models the
training data better than it would other data, checking data is used during training to test for this
over learning behavior.
At each training epoch, the RMS error is calculated for both the training set and the checking set. If
the network has more than enough neurons to model the data, there will be a point during
training when the training error continues to decrease but the checking error levels off and
begins to increase.
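The stopping rule described above can be sketched as follows in Python. The function name, the `patience` parameter and the error values are hypothetical and serve only to illustrate the idea of stopping once the checking error no longer improves.

```python
def best_stopping_epoch(checking_errors, patience=3):
    """Return the epoch with the lowest checking error seen before the error
    has failed to improve for `patience` consecutive epochs."""
    best_epoch = 0
    best_error = float("inf")
    epochs_without_gain = 0
    for epoch, error in enumerate(checking_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break  # checking error has levelled off or is rising
    return best_epoch

# Made-up RMS checking errors: they fall, level off, then rise as over fitting sets in.
checking_rms = [0.30, 0.21, 0.15, 0.12, 0.11, 0.12, 0.14, 0.17]
stop_at = best_stopping_epoch(checking_rms)
```

In practice the weights saved at the returned epoch would be kept as the final network.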
In summary, there are four methods to reduce the chance of over fitting:
1. Limiting the number of free parameters.
2. Training to a realistic error goal.
3. Increasing the training set size.
4. Using cross validation training to identify when over fitting occurs.
These methods can be used independently or used together to reduce the chance of over fitting.
Chapter 5 ANN modeling of WEDM
5.1 Neural network model
Commercial software MATLAB Version 6.3 is used for coding the neural network program.
The stopping criterion used in the current study was set at a maximum of 2000 epochs, and
the training set was used to train a multilayer network. The testing criterion was met once the
difference between the sum of square errors of the actual and predicted values was g x 10^-3.
A feed forward neural network is adopted here to model the wire-EDM process. The feed
forward neural network is composed of many interconnected artificial neurons that are often
grouped into input, hidden and output layers (Fig 11).
Fig. 11. Configuration of the neural network (input nodes: Cd, GV, Ton, Toff, Ws; hidden nodes
with transfer function f(Hp); output nodes: CR, SR).
The fundamental equation which defines the input-output relationship can be expressed as follows:
$$Y = f(X, W) \qquad \text{(vi)}$$
Where Y represents the performance parameters, such as the MRR and surface roughness; X is
a vector of the input variables to the neural network, and W is the weight matrix that is
evaluated in the network training process. f (.) represents the model of the process that is to be
built through neural network training.
The modeling phase involves the establishment of the model using multilayer feed forward
neural network architecture. The back propagation algorithm finds the optimum values of the
weights that minimize the error between the target and the calculated (network output)
performance parameters. Fig. 11. shows the network architecture of the developed model.
The following relations were used to combine the inputs of the network at the nodes of the
hidden layer and the output layer, respectively:
$$H_p = \sum_h V_{hp} X_h, \qquad O_q = \sum_p W_{pq} Z_p \qquad \text{(vii)}$$
The outputs at both the hidden layer ($Z_p = f(H_p)$) and the output layer ($Y_q = f(O_q)$) are
calculated using the sigmoid function, mainly because of its well-known use as a transfer
function for many applications. Combining equations (vi) and (vii), the output of the network is
given by the following relation:
$$Y_q = f(O_q) = f\Big(\sum_p W_{pq} Z_p\Big) = f\Big(\sum_p W_{pq}\, f\Big(\sum_h V_{hp} X_h\Big)\Big)$$
Finally, the output of the network ($Y_q$) was compared with the measured performance ($T_q$) of
the process using a simple sum of square error ($E_q$) as follows:
$$E_q = (Y_q - T_q)^2$$
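The relations above (weighted sums at the hidden and output nodes, the sigmoid transfer function, and the squared error against the measured value) can be sketched in Python with NumPy as follows. The array shapes, the seed and the sample input are assumptions for illustration, and bias terms are omitted to mirror the relations as written.

```python
import numpy as np

def sigmoid(a):
    """Sigmoid transfer function f(.)."""
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, v, w):
    """Network output Y for input vector x.
    v holds the input-to-hidden weights V_hp, shape (inputs, hidden);
    w holds the hidden-to-output weights W_pq, shape (hidden, outputs)."""
    h = x @ v        # H_p: weighted sum of the inputs at each hidden node
    z = sigmoid(h)   # Z_p = f(H_p)
    o = z @ w        # O_q: weighted sum of hidden outputs at each output node
    return sigmoid(o)  # Y_q = f(O_q)

def squared_error(y, t):
    """E_q = (Y_q - T_q)^2 for each output node."""
    return (np.asarray(y) - np.asarray(t)) ** 2

# Illustrative call with 5 inputs, 11 hidden nodes, 2 outputs and random weights.
rng = np.random.default_rng(0)
x = np.array([0.1, 0.5, 0.5, 0.9, 0.1])   # hypothetical scaled inputs
y = forward(x, 0.5 * rng.standard_normal((5, 11)), 0.5 * rng.standard_normal((11, 2)))
```

With all weights set to zero, every weighted sum is zero and each output equals f(0) = 0.5, which is a convenient check of the implementation.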
The artificial neuron evaluates the inputs and determines the strength of each one through its
weighting factor calculated by the back-propagation learning algorithm [Appendix - 1]. The
weighted inputs are summed to determine the output of the neuron using a sigmoid transfer
function. The output of the neuron is then transmitted along the weighted outgoing connections
to serve as an input to subsequent neurons. In this study, the neurons of the input and output
layers are used to receive the input variable of cutting parameters and to send out the output
variable of cutting performance, respectively. To properly map the input and output
relationships in the wire-EDM process with the neural network, finite discrete samples of
experimental data are required for training the neural network given in section 5.2. During the
training process, the number of neurons in the hidden layer is determined by trial-and-error
experimentation. It is found that a single hidden layer with 11 neurons can provide better
convergence in modeling the wire-EDM process. As shown in Fig. 12, the sum of square error
(SSE) between the desired and predicted outputs is almost reduced to zero after 2000 iterations
during the training process. Therefore, a feed forward neural network with a 5-11-2 type is
adopted here to associate the cutting parameters with the cutting performance.
5.2 Experimental details
Titanium alloy was chosen as the work material and the work piece thickness was kept at 5 mm.
Brass wire of 0.25 mm diameter was used for all the experiments. Experiments were planned using a
factorial design based on Taguchi's L18 orthogonal array with 2^1 x 3^4 levels. The machining voltage
(Va) was maintained at 80 V and the conductivity of the dielectric (Cd) at two levels, 50 and 250 µ-mho.
The other four parameters were maintained at three levels: pulse duration (Ton) at 1.1, 1.2 and 1.3 µs;
time between two pulses (Toff) at 30, 34 and 38 µs; gap voltage (GV) at 50, 60 and 70 volts; and wire
speed (Ws) at 4, 6 and 8 m/min. For testing the results, 16 experiments were conducted on the basis of
randomly selected input parameters. For each set of parameters the workpiece was straight cut
for a length of 10 mm.
The linear cutting rate reading was noted down. Each piece was cleaned and the
surface roughness was measured as the Ra value using a profilometer. The average of six readings
taken perpendicular to the direction of cut was chosen as the surface roughness value. The
results of the experiments given in Table 1 are based on Taguchi's method and Table 2 gives
data obtained by randomly selecting the input parameters.
Table 1 Training data: MRR and surface finish for experiments planned according to Taguchi's method (a)
Sl. no.  Cd (µ-mho)  GV (V)  Ton (µs)  Toff (µs)  Ws (m/min)  CR (mm/min)  SF (micron)
1        50          50      1.1       30         4           4.1          2.88
2        50          50      1.2       34         6           4.1          3.01
3        50          50      1.3       38         8           3.9          3.15
4        50          60      1.1       30         6           3.3          3.02
5        50          60      1.2       34         8           3.2          3.15
6        50          60      1.3       38         4           3.1          3.64
7        50          70      1.1       34         4           2.2          2.74
8        50          70      1.2       38         6           2.1          2.28
9        50          70      1.3       30         8           2.7          3.21
10       250         50      1.1       38         8           3.1          3.18
11       250         50      1.2       30         4           4.2          3.23
12       250         50      1.3       34         6           4.1          3.22
13       250         60      1.1       34         8           2.8          2.71
14       250         60      1.2       38         4           2.9          3.02
15       250         60      1.3       30         6           3.6          3.00
16       250         70      1.1       38         6           1.8          3.06
17       250         70      1.2       30         8           2.5          3.08
18       250         70      1.3       34         4           2.4          3.13
(a) Workpiece height, Hw = 5 mm
Table 2 Test data: MRR and surface finish for experiments with randomly selected input parameters (a)
Sl. no.  Cd (µ-mho)  GV (V)  Ton (µs)  Toff (µs)  Ws (m/min)  CR (mm/min)  SR (micron)
1        250         60      1.2       35         7.5         3.0          3.01
2        50          58      1.1       30         7.9         3.5          2.82
3        50          66      1.2       30         7.6         2.9          2.94
4        50          68      1.26      33         5.6         2.7          2.92
5        250         56      1.19      31.7       6.6         3.7          3.33
6        50          65      1.25      30         6.1         3.2          3.15
7        250         64      1.19      34.7       4.8         2.7          3.12
8        50          68      1.23      30.4       5.2         2.8          3.27
9        250         52      1.29      31.8       4.0         4.2          3.11
10       250         68      1.17      32         4.0         2.6          2.99
11       250         62      1.22      34.9       7.9         2.8          2.93
12       50          50      1.2       30         7.2         4.4          2.93
13       250         54      1.26      32.9       7.0         3.9          3.22
14       50          60      1.19      33.6       7.6         3.2          2.97
15       250         58      1.24      32.2       4.5         3.5          3.10
16       50          70      1.24      31.4       4.4         2.7          3.06
(a) Workpiece height, Hw = 5 mm
5.3 Results and Discussion
To properly map the input and output relationships in the wire EDM process with the neural
network, finite discrete samples of experimental data given in Tables 1 and 2 are used for
training and testing the network. During the training process, the number of neurons in the
hidden layer is determined by trial-and-error experimentation as discussed in Section 4.1. It is
found that a single hidden layer with 11 neurons provides better convergence in modeling the
wire EDM process. As shown in Fig. 12, the sum of square errors (SSE) between the desired and
predicted outputs is reduced almost to zero after 2000 iterations during the training process. The
network is further tested by applying the test data and shows good prediction capability, with a sum
of square errors close to 0.01. Therefore, a feed forward neural network of the 5-11-2 type (Fig
11) is adopted here to associate the cutting parameters with the cutting performance.
Fig. 12 Sum of square error vs number of iterations in the training process (plot of the sum of
square errors for the training data; vertical axis from 0 to 0.035, horizontal axis from 0 to 2000
epochs)
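As an illustration of the training step, the sketch below trains a 5-11-2 sigmoid network on the Table 1 data by batch back propagation in Python with NumPy. It is not the MATLAB program used in this work: the learning rate, the random seed, the 0.1 to 0.9 scaling of inputs and targets, and the averaging of the gradient over the patterns are all assumptions, so the error values it produces will differ from those in Fig. 12.

```python
import numpy as np

# Table 1 rows: Cd, GV, Ton, Toff, Ws, CR (mm/min), SF (micron)
TABLE1 = np.array([
    [50, 50, 1.1, 30, 4, 4.1, 2.88], [50, 50, 1.2, 34, 6, 4.1, 3.01],
    [50, 50, 1.3, 38, 8, 3.9, 3.15], [50, 60, 1.1, 30, 6, 3.3, 3.02],
    [50, 60, 1.2, 34, 8, 3.2, 3.15], [50, 60, 1.3, 38, 4, 3.1, 3.64],
    [50, 70, 1.1, 34, 4, 2.2, 2.74], [50, 70, 1.2, 38, 6, 2.1, 2.28],
    [50, 70, 1.3, 30, 8, 2.7, 3.21], [250, 50, 1.1, 38, 8, 3.1, 3.18],
    [250, 50, 1.2, 30, 4, 4.2, 3.23], [250, 50, 1.3, 34, 6, 4.1, 3.22],
    [250, 60, 1.1, 34, 8, 2.8, 2.71], [250, 60, 1.2, 38, 4, 2.9, 3.02],
    [250, 60, 1.3, 30, 6, 3.6, 3.00], [250, 70, 1.1, 38, 6, 1.8, 3.06],
    [250, 70, 1.2, 30, 8, 2.5, 3.08], [250, 70, 1.3, 34, 4, 2.4, 3.13],
])

def scale(a, lo=0.1, hi=0.9):
    """Linearly scale each column of a onto [lo, hi]."""
    a_min, a_max = a.min(axis=0), a.max(axis=0)
    return lo + (hi - lo) * (a - a_min) / (a_max - a_min)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def with_bias(a):
    """Append a column of ones so the last weight row acts as the bias."""
    return np.hstack([a, np.ones((a.shape[0], 1))])

def train(x, t, n_hidden=11, epochs=2000, rate=0.5, seed=0):
    """Batch gradient descent on the sum of square errors; returns weights and SSE history."""
    rng = np.random.default_rng(seed)
    w1 = 0.5 * rng.standard_normal((x.shape[1] + 1, n_hidden))
    w2 = 0.5 * rng.standard_normal((n_hidden + 1, t.shape[1]))
    xb = with_bias(x)
    sse_history = []
    for _ in range(epochs):
        z = sigmoid(xb @ w1)               # hidden layer outputs
        zb = with_bias(z)
        y = sigmoid(zb @ w2)               # network outputs
        error = y - t
        sse_history.append(float(np.sum(error ** 2)))
        # Back propagate the error through the sigmoid derivatives.
        delta_out = error * y * (1.0 - y)
        delta_hid = (delta_out @ w2[:-1].T) * z * (1.0 - z)
        # Constant factors of the gradient are folded into the learning rate.
        w2 -= rate * (zb.T @ delta_out) / len(x)
        w1 -= rate * (xb.T @ delta_hid) / len(x)
    return w1, w2, sse_history

x = scale(TABLE1[:, :5])   # scaled cutting parameters
t = scale(TABLE1[:, 5:])   # scaled CR and SF targets
w1, w2, sse_history = train(x, t)
```

Plotting `sse_history` against the epoch number gives a curve of the same kind as Fig. 12.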
5.4 The effect of the cutting parameters on the performance of the process according to the
developed model
In the following, the effect of the cutting parameters on the cutting performance is studied
one by one based on the developed neural network. In reality, the cutting parameters interact
with one another in their effect on the cutting performance. To separate the effect caused by each
cutting parameter, the other cutting parameters are set to a middle value in the allowable working
space when one of the cutting parameters is varied and analyzed. The effects of the variations of
the cutting parameters on the machining speed and machined surface roughness are shown in
figures, each accompanied by an explanation of the effect of that cutting parameter on the
machining speed and machined surface roughness.
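The one-parameter-at-a-time procedure can be sketched as follows in Python with NumPy. The ranges are taken from the parameter levels of Section 5.2; `placeholder_model` is a hypothetical stand-in with arbitrary coefficients, not the trained network, and is used only so that the sketch runs on its own.

```python
import numpy as np

# Working ranges of the five cutting parameters (levels from Section 5.2).
RANGES = {"Cd": (50.0, 250.0), "GV": (50.0, 70.0), "Ton": (1.1, 1.3),
          "Toff": (30.0, 38.0), "Ws": (4.0, 8.0)}

def sweep(model, name, n_points=5):
    """Vary one parameter across its range while the others stay at mid-range."""
    names = list(RANGES)
    mid = np.array([(lo + hi) / 2.0 for lo, hi in RANGES.values()])
    grid = np.tile(mid, (n_points, 1))
    lo, hi = RANGES[name]
    grid[:, names.index(name)] = np.linspace(lo, hi, n_points)
    responses = np.array([model(row) for row in grid])
    return grid, responses

def placeholder_model(x):
    """Hypothetical stand-in for the trained network; NOT fitted to any data."""
    cd, gv, ton, toff, ws = x
    return np.array([10.0 - 0.1 * gv, 2.0 + ton])   # (CR, SR), arbitrary numbers

grid, responses = sweep(placeholder_model, "GV")
```

Replacing `placeholder_model` with the trained network's prediction function gives the data behind each response curve.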
(1) Gap voltage (GV): As can be seen in Fig. 13, the higher the gap voltage, the longer the
discharge off time (Toff). To obtain the longer discharge off time, the machining speed needs to
be slowed down. This will lead to a wider average discharge gap. Therefore, the discharge
condition becomes more stable but the number of discharge cycles decreases within a given
period. Owing to this stable machining, surface accuracy becomes better.
Fig.13 Surfaces show the relationship of GV with (a) cutting rate (CR), (b) surface roughness
(SR)
(2) Pulse on time (Ton): It can be seen (Fig. 14) that machining speed increases with an increase in
the pulse on time. On the contrary, surface finish deteriorates with increasing pulse on time
(Fig. 14). This is because the discharge energy increases with the pulse on time. As a result,
machining speed becomes faster with the increase of the discharge energy. However, in the
meantime, the discharge gap becomes wider, which increases surface roughness.
Fig. 14 Surfaces show the relationship of Ton with (a) cutting rate (CR), (b) surface
roughness (SR).
(3) Pulse off time (Toff): As the pulse off time is decreased, the number of discharges within a
given period increases. This leads to a higher machining speed. However, the surface finish
becomes poorer because of the larger number of discharges (Fig. 15).
Fig.15 Surfaces show the relationship of Toff with (a) cutting rate (CR), (b) surface
roughness (SR).
(4) Wire feed speed (Ws): Fig. 16 reveals that, as the wire speed increases, the discharge density at
a particular space and time in the discharge gap decreases. This is because the capability to
evacuate the by-products from the discharge gap increases with wire speed. This in turn
means that the cutting speed decreases due to the lower input energy per unit time and space. The
surface finish improves due to more stable machining.
Fig. 16 Surfaces show the relationship of Ws with (a) cutting rate (CR), (b) surface
roughness (SR).
Chapter 6 Optimization of wire EDM process parameters
In this phase, the input parameters to the network were coded as chromosomes for genetic
evaluation. Since the modeling part of the problem has already been solved in the previous
phase, the optimization phase is straightforward. In this case, the structure of the network
(Fig. 22) is seen as a black box by the user.
Fig.22 Structure of the optimization system
In this work, the ANN is combined with a GA to find the optimum. (The introductory concepts of
GA and the GA operators are discussed in Appendix 2.) To search for the optimum, the GA requires
the optimized weights of the ANN. The ANN first provides the GA with the final weight settings
of each neuron after training and validation of the network (Fig. 22). Consequently, the GA and
ANN programs must be linked and exchange data with each other. In the current study, the
procedure adopted is as follows. First, the ANN writes the selected optimal weight settings to
a text file. The text file is then read by the GA and used as the ANN parameters. Then, the GA
optimizes the parametric setting using the constrained optimization technique. The surface
roughness (SR) generated as an output of this procedure is compared with the designer's
surface roughness requirement (limit). If the value exceeds the limit, the GA generates new
input parameters using the GA operators, i.e. mutation and crossover [Appendix 2]. These steps
are repeated until the optimal cutting rate is found for the given surface roughness limit. At
the end of this iterative process, the GA arrives at the set of machining parameters that
produces the optimal cutting rate within the acceptable limit of surface roughness.
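The hand-off of weights through a text file can be sketched as follows. This is only an illustration in Python, not the Matlab programs used in this work: the file layout, the function names (save_weights, load_weights, predict), the tanh hidden layer with linear outputs, and the random placeholder weights are all assumptions made for the example. Only the 5-11-2 layout is taken from the reported model.

```python
import math
import os
import random
import tempfile

N_IN, N_HID, N_OUT = 5, 11, 2  # 5-11-2 layout reported for the ANN model


def save_weights(path, w_hid, w_out):
    """Write one row of weights per neuron; the last value in a row is the bias."""
    with open(path, "w") as fh:
        for row in w_hid + w_out:
            fh.write(" ".join(repr(v) for v in row) + "\n")


def load_weights(path):
    """Read the rows back and split them into hidden and output layers."""
    with open(path) as fh:
        rows = [[float(v) for v in line.split()] for line in fh if line.strip()]
    return rows[:N_HID], rows[N_HID:N_HID + N_OUT]


def predict(x, w_hid, w_out):
    """Forward pass: tanh hidden layer, linear outputs (assumed activations)."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in w_hid]
    return [sum(w * h for w, h in zip(row[:-1], hidden)) + row[-1]
            for row in w_out]


# Placeholder weights standing in for the trained network (not real values).
rng = random.Random(0)
w_hid = [[rng.uniform(-1, 1) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_out = [[rng.uniform(-1, 1) for _ in range(N_HID + 1)] for _ in range(N_OUT)]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ann_weights.txt")
    save_weights(path, w_hid, w_out)       # "ANN side" writes the file
    r_hid, r_out = load_weights(path)      # "GA side" reads it back

x_scaled = [0.5, 0.2, 0.0, -0.3, 0.1]      # inputs assumed already normalized
direct = predict(x_scaled, w_hid, w_out)
via_file = predict(x_scaled, r_hid, r_out)
```

Writing each value with repr keeps the full floating-point precision, so the network rebuilt on the GA side gives the same (CR, SR) predictions as the original.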
6.1 Why constrained optimization technique?
Two questions must be answered with regard to selection of parameters: What is the best
parameter combination? And how can we get it? In the case of multiple objectives, it is known
that no perfect run exists that can result in both the best cutting speed and surface roughness.
However, in the production environment, the surface finish quality of a workpiece, which is
determined by the designer or process engineer, must be fulfilled, and productivity is of
secondary importance, when compared with the quality requirement.
Therefore, the best parameters can be regarded as those that maximize productivity and fulfill
the surface finish quality requirements. For the present approach, the best combination of
parameter levels should produce the maximum cutting speed, while the surface roughness is
within requirement. This problem can be represented and solved by a constrained optimization
technique. The optimization model can be expressed as:
Max speed = CR (Cd, GV, Ton, Toff, Ws)
Subject to 0 < SR (Cd, GV, Ton, Toff, Ws) < a
50 <= Cd <= 250
50 <= GV <= 70
1.1 <= Ton <= 1.3
30 <= Toff <= 38
4 <= Ws <= 8
where a is the maximum allowable Ra value. The value of a should lie within the range of
predicted Ra values, i.e. between 2.28 µm and 3.64 µm.
The functions CR (*) and SR (*) are represented by the ANN model (SR denotes the surface
roughness Ra). For a given a, the solution of the problem is obtained from the GA optimizer,
whose output is a parameter combination. The GA program is coded in Matlab to solve the
optimization problem.
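A minimal sketch of this constrained search is given below in Python rather than Matlab. It is not the GA program of this work. The two surrogate functions cr_model and sr_model are made-up linear stand-ins for the ANN outputs, chosen only so that the example runs. The operators (binary tournament selection, arithmetic crossover, Gaussian mutation, elitism) and all numeric settings are likewise assumptions. Only the parameter bounds come from the model above.

```python
import random

# Parameter bounds from the optimization model: Cd, GV, Ton, Toff, Ws.
BOUNDS = [(50, 250), (50, 70), (1.1, 1.3), (30, 38), (4, 8)]


def _unit(x):
    """Scale each parameter to [0, 1] using its bounds."""
    return [(v - lo) / (hi - lo) for v, (lo, hi) in zip(x, BOUNDS)]


def cr_model(x):
    """Stand-in for the ANN cutting-rate output (made-up linear surrogate)."""
    cd, gv, ton, toff, ws = _unit(x)
    return 2.5 + 0.5 * cd - 0.4 * gv + 1.2 * ton - 0.8 * toff - 0.4 * ws


def sr_model(x):
    """Stand-in for the ANN surface-roughness output (made-up surrogate)."""
    cd, gv, ton, toff, ws = _unit(x)
    return 2.9 + 0.05 * cd - 0.2 * gv + 0.6 * ton - 0.4 * toff - 0.2 * ws


def fitness(x, sr_limit):
    """Feasible points score their cutting rate; infeasible ones score below zero."""
    excess = sr_model(x) - sr_limit
    return cr_model(x) if excess <= 0 else -excess


def clip(x):
    """Force every parameter back inside its bounds."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, BOUNDS)]


def optimize(sr_limit, pop_size=40, generations=60, seed=1):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in BOUNDS] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda x: fitness(x, sr_limit), reverse=True)
        children = ranked[:2]                      # elitism: keep the two best
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=lambda x: fitness(x, sr_limit))
            p2 = max(rng.sample(pop, 2), key=lambda x: fitness(x, sr_limit))
            # Arithmetic crossover followed by Gaussian mutation of one gene.
            t = rng.random()
            child = [t * a + (1 - t) * b for a, b in zip(p1, p2)]
            g = rng.randrange(len(BOUNDS))
            lo, hi = BOUNDS[g]
            child[g] += rng.gauss(0, 0.1 * (hi - lo))
            children.append(clip(child))
        pop = children
    return max(pop, key=lambda x: fitness(x, sr_limit))


best = optimize(sr_limit=3.0)
```

Because the surrogate cutting rate is always positive and infeasible points always score below zero, any feasible point outranks every infeasible one, and elitism keeps the best feasible point found so far.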
Table 3 illustrates the solutions. By using Table 3, the best parametric combination can be
selected. For example, if the roughness of the workpiece surface should not exceed 3.0 µm, the
best parametric combination would be (227 54 1.2 33 4), which yields a cutting speed of 3.35
mm/min. Other combinations will either yield a lower cutting speed or violate the surface
finish requirement.
Table 3 Process performance optimization
Parametric combinations
Cd    GV   Ton   Toff   Ws   CR (mm/min)   SR (micron)
99    54   1.1   32     5    1.92          2.227
62    52   1.2   37     6    2.03          2.289
69    58   1.1   36     7    2.24          2.310
63    62   1.1   37     7    2.28          2.311
62    59   1.1   35     5    2.34          2.320
87    62   1.1   38     7    2.53          2.329
58    61   1.1   35     6    2.74          2.338
81    63   1.1   37     7    2.78          2.349
60    68   1.1   37     7    2.88          2.360
51    67   1.1   36     6    3.01          2.371
50    64   1.1   35     6    3.04          2.381
100   64   1.1   38     7    3.06          2.418
243   60   1.1   30     6    3.10          2.508
85    55   1.2   35     5    3.12          2.599
70    50   1.3   38     7    3.18          2.700
116   50   1.1   34     8    3.18          2.799
228   67   1.1   35     6    3.23          2.899
235   53   1.3   34     4    3.29          2.950
227   54   1.2   33     4    3.35          3.000
54    51   1.1   30     7    3.38          3.15
166   63   1.2   31     7    3.39          3.199
121   50   1.2   30     7    3.48          3.302
88    53   1.3   33     8    3.50          3.400
77    51   1.3   31     7    3.60          3.450
174   58   1.2   38     5    3.66          3.502
93    62   1.3   37     5    3.67          3.549
81    57   1.3   35     4    3.83          3.610
80    68   1.3   36     7    4.14          3.612
110   66   1.3   34     4    4.20          3.619
71    69   1.3   36     4    4.20          3.629
96    58   1.3   34     4    4.27          3.388
The five sample settings of the five cutting parameters obtained from the optimization technique
are listed in Table 4. As indicated in Table 4, the errors between the expected and experimental
performance results are reasonably small.
Table 4 Actual vs predicted WEDM performances
Parametric combinations        CR (mm/min)                        SR (micron)
Cd    GV   Ton   Toff   Ws     Prediction   Actual   Error (%)    Prediction   Actual   Error (%)
69    58   1.1   36     7      2.24         2.3      2.6          2.310        2.4      3.89
60    68   1.1   37     7      2.88         2.65     7.9          2.360        2.53     7.20
243   60   1.1   30     6      3.10         3.3      6.45         2.508        2.43     3.11
227   54   1.2   33     4      3.35         3.45     2.98         3.000        2.89     3.66
80    68   1.3   36     7      4.14         4.4      6.28         3.612        3.89     7.69
Average error (%)                                    5.24                               5.11
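The error columns of Table 4 can be recomputed with a few lines of Python. The sketch below assumes that each error is the absolute difference expressed as a percentage of the predicted value. That definition is inferred from the tabulated numbers rather than stated in the text, and the recomputed values differ slightly from the table where the table entries appear to have been truncated.

```python
# (predicted, actual) pairs copied from Table 4.
cr_pairs = [(2.24, 2.3), (2.88, 2.65), (3.10, 3.3), (3.35, 3.45), (4.14, 4.4)]
sr_pairs = [(2.310, 2.4), (2.360, 2.53), (2.508, 2.43), (3.000, 2.89), (3.612, 3.89)]


def percent_errors(pairs):
    """Absolute prediction error as a percentage of the predicted value."""
    return [abs(pred - act) / pred * 100.0 for pred, act in pairs]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


cr_err = percent_errors(cr_pairs)
sr_err = percent_errors(sr_pairs)
print([round(e, 2) for e in cr_err], round(mean(cr_err), 2))
print([round(e, 2) for e in sr_err], round(mean(sr_err), 2))
```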
6.2 Search for Pareto-optimal WEDM process parameters
In the case of multiple objectives, there may not exist one solution that is the best, or
global optimum, with respect to all objectives. The presence of multiple objectives in a
problem usually gives rise to a family of non-dominated or non-inferior solutions, widely known
as Pareto-optimal solutions, where each objective component of any solution along the Pareto
front can only be improved by degrading at least one of its other objective components. Since
none of the solutions in the non-dominated set is absolutely better than any other, any one of
them is an acceptable solution. As it is difficult to choose any particular solution for a
multi-objective optimization problem without iterative interaction with the decision maker, one
general approach is to establish the entire set of Pareto-optimal solutions.
Searching for the Pareto-optimal set yields multiple optimal solutions. The fitted ANN model is
assumed to represent the relationship between process performance and controllable factors and
is used to predict the performance for 625 randomly generated combinations of input parameter
levels. Fig. 23 illustrates the prediction result. This figure does not directly illustrate the
process response with respect to the input factors but gives a visual demonstration of the
relationship between the predicted responses (cutting rate vs. roughness). Every point
corresponds to a particular combination of input parameter levels. Fig. 23 shows an approximate
tendency for a smaller surface roughness to correspond to a slower cutting speed. Faster
cutting speed (higher productivity) will therefore result in larger roughness (worse surface
finish).
All 625 outputs for cutting speed and surface roughness are plotted in Fig. 23. For
convenience, CR and 1/Ra are taken as the X- and Y-axis, respectively. The Pareto-optimal
solutions have to be searched out from these 625 outputs. Here, a Pareto-optimal solution is
one that is better than any other output with respect to at least one process criterion, i.e.
CR or 1/Ra. If one parameter combination is higher than a second in both process criteria, or
higher in at least one criterion and equal in the other, then the second parametric combination
should never be selected in preference to the first. In other words, graphically, a point is
not optimal if there is any other point above and to the right of it. If two points have the
same coordinates, both are considered.
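The dominance rule stated above can be written as a short filter. The Python sketch below is an illustration only (the search in this work was carried out in Excel). The function names and the sample (CR, 1/Ra) pairs are made up for the example and are not the 625 ANN predictions.

```python
def dominates(p, q):
    """p dominates q if p is at least as good in both criteria and better in one."""
    return p[0] >= q[0] and p[1] >= q[1] and (p[0] > q[0] or p[1] > q[1])


def pareto_front(points):
    """Return the points not dominated by any other point (both criteria maximized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Illustrative (CR, 1/Ra) pairs; made-up numbers, not the ANN predictions.
samples = [(1.5, 0.46), (2.0, 0.44), (2.0, 0.40), (3.0, 0.38),
           (2.5, 0.36), (4.0, 0.30), (4.0, 0.30), (3.5, 0.28)]
front = pareto_front(samples)
```

Two points with identical coordinates do not dominate each other, so both are kept, in line with the rule that points with the same coordinates are both considered. The pairwise comparison costs on the order of n squared checks, which is negligible for 625 points.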
[Figure: scatter plot of 1/Ra against CR (mm/min); legend: non-optimal points, Pareto-optimal points]
Fig. 23. Machining performance predictions of ANN model for all 625 combinations.
6.2.1 Discussion
Some specific points situated at the boundary constitute a Pareto-optimal front, as shown in
Fig. 23. All points other than this set of optimum points are not desirable. An Excel program
was used to find these optimum points from the set of all 625 points. It was observed that out
of 625 points there were only 38 optimum points. This set of Pareto-optimal solutions is very
useful because the manufacturing engineer can adopt different optimal solutions as and when
required. This is a major advantage of this approach over the constrained optimization
technique. Once the Pareto-optimal set is available, there is no need to run the program again.
Just by scanning the chart of optimal solutions, one can readily find the optimum parametric
combination for a given surface roughness requirement. Table 5 contains the sorted list
(increasing Ra) of all 38 optimum parametric combinations. This chart may be used as a
technology guideline for optimum machining of titanium alloy (Ti-6Al-4V). For example, if the
required Ra value is less than or equal to 2.7 µm, then the best parametric combination, which
will optimize the cutting speed, is given at serial number 13 in the technology guideline shown
in Table 5. Fig. 24 shows a plot of all these optimum points. From this plot it can be observed
that the surface roughness increases as the maximum cutting speed increases.
[Figure: maximized cutting speed against minimum Ra (micron); legend: actual data]
Fig. 24. Maximized cutting speed vs. surface roughness.
Table 5 Sorted Pareto-optimal points
Parametric combinations
S.no   Cd    GV     Ton    Toff   Ws    CR (mm/min)   SR (micron)
1      57    50     1.1    37     7     1.54          2.152
2      52    50     1.1    37     6.7   1.57          2.157
3      61    51     1.11   36.2   6.5   1.60          2.169
4      54    50.4   1.10   35.4   5.8   1.67          2.187
5      51    50.2   1.10   35.1   5.8   1.68          2.188
6      73    52.4   1.12   35.1   6.2   1.78          2.220
7      66    51.7   1.11   34.2   6.1   1.79          2.230
8      74    52.5   1.12   34.9   5.6   1.92          2.250
9      64    51.5   1.11   34.2   5.3   2.00          2.262
10     57    50.7   1.10   33.6   5.3   2.06          2.274
11     67    51.8   1.11   34.1   5.1   2.13          2.303
12     62    51.3   1.11   33.2   5.3   2.34          2.327
13     104   55.7   1.15   37.5   7.0   2.49          2.412
14     59    51.0   1.10   32.2   5.0   2.98          2.473
15     94    54.6   1.14   34.4   5.2   3.21          2.544
16     88    54.0   1.13   33.7   6.1   3.35          2.580
17     104   55.7   1.15   35.5   5.9   3.47          2.600
18     98    55.0   1.14   34.4   5.8   3.66          2.626
19     90    54.2   1.14   33.2   5.7   3.80          2.702
20     89    54.1   1.13   33.0   5.5   3.92          2.734
21     85    53.7   1.13   32.6   4.9   4.03          2.783
22     99    55.1   1.14   33.5   5.2   4.07          2.814
23     100   55.3   1.15   33.5   5.2   4.14          2.857
24     102   55.5   1.15   33.3   5.7   4.15          2.893
25     102   55.5   1.15   33.0   5.0   4.28          2.999
26     128   58.2   1.17   33.8   5.5   4.31          3.081
27     102   55.5   1.15   32.1   5.3   4.35          3.116
28     102   55.5   1.15   32.3   4.9   4.37          3.124
29     83    53.4   1.13   31.3   4.8   4.37          3.160
30     129   58.3   1.17   32.5   5.0   4.37          3.170
31     128   58.2   1.17   31.8   5.0   4.39          3.212
32     153   60.9   1.20   30.0   4.7   4.39          3.284
33     115   56.9   1.16   31.2   4.9   4.42          3.301
34     110   56.4   1.16   31.6   4.5   4.44          3.380
35     140   59.4   1.19   30.0   4.6   4.44          3.406
36     80    53.1   1.13   30.1   4.6   4.45          3.424
37     111   56.4   1.16   31.0   4.4   4.46          3.453
38     132   58.6   1.18   30.2   4.2   4.46          3.509
Chapter 7
Summary and Conclusions
For optimization of the WEDM process, experiments were planned using a factorial design
based on Taguchi's L18 orthogonal array (2^1 x 3^4) to establish the relationship between the
control variables and the performance and productivity. In order to model the process, pulse
width, time between two pulses, gap voltage, conductivity of the dielectric and wire feed speed
were selected as the control factors. Cutting speed and workpiece surface roughness were
selected as the process outputs.
A 5-11-2 feed-forward back-propagation ANN model was developed to represent the WEDM
process. A close fit of the developed model to the experimental data is observed from the test
analysis. The ANN model developed was used to predict the process performance. Based on the
developed model, the influence of the various process parameters on the machining criteria was
observed. Finally, the process was optimized using a constrained optimization algorithm. The
Pareto front for the process has also been found (the 38 Pareto-optimal solutions were
searched out from the set of all 625 outputs).
From this thesis work the following conclusions can be drawn.
• The results from the neural network show that the model is able to predict the process
performance, such as cutting speed and surface roughness, within a reasonably large
range of input factor levels. In the region investigated, the ANN model is found to fit the
data satisfactorily and to have good predictive capability for Ra and the cutting speed.
From the results presented in this work, it can be concluded that this technique can be
extended to processes exhibiting similar stochastic character and complexity.
• In the validation experiments, the errors between the expected and experimental
cutting performance results are reasonably small for the process parameter settings
optimized using the constrained optimization method.
• The constrained optimization approach is very useful for maximizing productivity
while maintaining surface roughness within the desired limit.
• The set of 38 Pareto-optimal solutions is very useful and will act as a guideline for
optimum machining of the titanium alloy.
• The technology settings developed by searching the Pareto-optimal front for wire
electrical discharge machining of titanium alloy have potential in modern industrial
applications for efficient manufacturing of precision jobs.
• In addition, the efficiency of determining optimal cutting parameters in the process
planning of wire EDM can be dramatically improved by using this approach.
7.1 Scope for future research
Further research might attempt to take more factors, such as wire tension, workpiece material,
and workpiece height, into account as process inputs. Other performance criteria, such as the
surface cross-sectional microstructure, might be investigated. The techniques presented in this
study might also be tried on the finishing operation of WEDM or other machining processes.
REFERENCES
[1] E.C. Jameson, Description and development of electrical discharge machining (EDM),
Electrical Discharge Machining, Society of Manufacturing Engineers, Dearbem,
Michigan, 2001, pp. 16.
[2] G.F. Benedict, Electrical discharge machining (EDM), Non-Traditional Manufacturing
Processes, Marcel Dekker, Inc, New York & Basel, 1987, pp. 231-232.
[3] Y.F. Luo, C.G. Chen, Z.F. Tong, Investigation of silicon wafering by wire EDM, J.
Mater. Sci. 27 (21) (1992) 5805-5810.
[4] G.N. Levy, R. Wertheim, EDM-machining of sintered carbide compacting dies, Ann.
CIRP 37 (1) (1988) 175-178.
[5] B.K. Rhoney, A.J. Shih, R.O. Scattergood, J.L. Akemon, D.J. Grant, M.B. Grant, Wire
electrical discharge machining of metal bond diamond wheels for ceramic grinding, Inter.
J. Mach. Tools Manuf. 42 (12) (2002) 1355-1362.
[6] B.K. Rhoney, A.J. Shih, R.O. Scattergood, R. Ott, S.B. McSpadden, Wear mechanism of
metal bond diamond wheels trued by wire electrical discharge machining, Wear 252 (7-
8) (2002) 644-653.
[7] A. Kruusing, S. Leppavuori, A. Uusimaki, B. Petretis, 0. Makarova, Micromachining of
magnetic materials, Sensors Actuators 74 (1-3) (1999) 45-51.
[8] G.L. Benavides, L.F. Bieg, M.P. Saavedra, E.A. Bryce, High aspect ratio meso-scale
parts enables by wire micro-EDM, Microsys. Technol. 8 (6) (2002) 395-401.
[9] J.A. Sanchez, I. Cabanes, L.N. Lopez de Lacalle, A. Lamikiz, Development of optimum
electro discharge machining technology for advanced ceramics, Inter. J. Adv. Manuf.
Technol.18 (12) (2001) 897-905.
[10] ' Y.M. Cheng, P.T. Eubank, A.M. Gadalla, Electrical discharge machining of ZrB2-
based ceramics, Mater. Manuf. Processes 11 (4) (1996) 565-574.
[11] T. Matsuo, E. Oshima, Investigation on the optimum carbide content and machining
condition for wire EDM of zirconia ceramics, Ann. CIRP 41 (1) (1992) 231-234.
[12] Y.K. Lok, T.C. Lee; Processing of advanced ceramics using the wire-cut EDM
process, J. Mater. Process. Technol. 63 (1-3) (1997) 839-843.
39
[13] D.F. Dauw, C.A. Brown, J.P. Van griethuysen, J.F.L.M. Albert, Surface topography
investigations by fractal analysis of spark-eroded, electrically conductive- ceramics, Ann.
CIRP 39 (1) (1990) 161-165.
[14] W. Konig, D.F. Dauw, G. Levy, U. Panten, EDM-future steps,towards the machining
of ceramics, Ann. CIRP 37 (2) (1988) 623--631.
[15] R.F. Firestone, Ceramic—Applications in Manufacturing, Society of Manufacturing
Engineers, Michigan, 1988, pp. 133.
[ 16] N. Mohri, Y. Fukuzawa, T. Tani, N. Saito, K. Furutani, Assisting electrode method
for machining insulting ceramics, Ann. CIRP 45 (1) (1996) 201-204.
[17] N. Mohri, Y. Fukuzawa, T. Tani, T. Sata, Some considerations to machining
characteristics of insulating ceramics—towards practical use in industry, Ann. CIRP 51
(1) (2002) 161-164. I
[18] W.S. Lau, W.B. Lee, A comparison between EDM wire-cut and laser cutting of
carbon fibre composite materials, Mater. Manuf. Processes 6 (2) (1991) 331-342.
[19] W.S. Lau, T.M. Yue, T.C. Lee, W.B.-Lee, Un-conventional machining of composite
materials, J. Mater. Process. Technol. 48 (1-4) (1995.) 199-205.
[20] A.M. Gadalla, W. Tsai, Machining of WC-Co composites, Mater Manuf. Processes 4
(3) (1989) 411-423.
[21] B.H. Yan, C.C. Wang, W.D. Liu, F.Y. Huang, Machining characteristics of
A1203/6061A1 composite using rotary EDM with a dislike electrode, Inter. J. Adv.
Manuf. Technol. 16 (5) (2000) 322-333.
[22] T.M. Yue, Y. Dai, W.S. Lau, An examination of wire electrical discharge machining
(WEDM) of A1203 particulate reinforced aluminium based composites, Mater. Manuf.
Processess 11 (3) (1996) 341-350.
[23] Z.N. Guo, X. Wang, Z.G. Huang, T.M. Yue, Experimental investigation into shaping
particle-reinforced material by WEDM-HS, J. Mater. Process. Technol. 129 (1-3) (2002)
56-59.
[24] Y.S. Tamg, S.C. Ma, L.K. Chung, Determination of optimal cutting parameters in
wire electrical discharge machining, Inter, J. Mach. Tools Manuf. 35 (12) (1995) 1693-
1701.
40
[25] J.T. Huang, Y.S. Liao, W.J. Hsue, Determination of finish-cutting operation number
and machining-parameters setting in wire electrical discharge machining, J. Mater.
Process. Technol. 87 (1-3) (1999) 69-81.
[26] D. Scott, S. Boyina, K.P. Rajurkar, Analysis and optimization of parameter
combination in wire electrical discharge machining, Inter. J. Prod. Res. 29 (11) (1991)
2189-2207.
[27] Y.S. Liao, J.T. Huang, H.C. Su, A study on the machining parameters optimization
of wire electrical discharge machining, J. Mater. Process. Technol. 71 (3) (1997) 487-
493.
[28] R.E. Williams, K.P. Rajurkar, Study of wire electrical discharge machined surface
characteristics, J. Mater. Process. Technol. 28 (1-2) (1991) 127-138.
[29] R. Konda, K.P. Rajurkar, R.R. Bishu, A. Guha, M. Parson, Design of experiments to
study and optimize process performance, Inter. J. Qual. Reliab. Manage. 16 (1) (1999)
56-71.
[30] Y.S. Tamg, S.C. Ma, L.K. Chung, Determination of optimal cutting parameters in
wire electrical discharge machining, Inter, J. Mach. Tools Manuf. 35 (12) (1995) 1693-
1701.
[31] J.T. Huang, Y.S. Liao, W.J. Hsue, Determination of finish-cutting operation number
and machining-parameters setting in wire electrical discharge machining, J. Mater.
Process. Technol. 87 (1-3) (1999) 69-81.
[32] D. Scott, S. Boyina, K.P. Rajurkar, Analysis and optimization of parameter
combination in wire electrical discharge machining, Inter. J. Prod. Res. 29 (11) (1991)
2189-2207.
[33] Y.S. Liao, J.T. Huang, H.C. Su, A study on the machining parameters optimization
of wire electrical discharge machining, J. Mater. Process. Technol. 71 (3) (1997) 487-
493.
[34] M. Rozenek, J. Kozak, L. Dabrowski, K. Lubkowski, Electrical discharge machining
characteristics of metal matrix composites, J. Mater. Process. Technol. 109 (3) (2001)
367-370.
[35] J.T. Huang, Y.S. Liao, Optimization of machining parameters of wire-EDM based on
grey relational and statistical analyses, Inter. J. Prod. Res. 41 (8) (2003) 1707-1720.
41
[36] K.P. Rajurkar, W.M. Wang, Thermal modelling and on-line monitoring of wire-
EDM, J. Mater. Process. Technol. 38 (1-2) (1993) 417-430.
[37] M.I. Go kler, A.M. Ozano"zgu" , Experimental investigation of effects of cutting
parameters on surface roughness in the WEDM process, Inter. J. Mach. Tools Manuf. 40
(13) (2000) 1831-1848.
[38] N. Tosun, C. Cogun, A. Ivan, The effect of cutting parameters on workpiece surface
roughness in wire EDM, Machining Sci. Technol. 7 (2) (2003) 209-219.
[39] K.N. Anand, Development of process technology in wire-cut operation for
improving machining quality, Total Quality Management 7 (1) (1996) 11-28.
[40] T.A. Spedding, Z.Q. Wang, Parametric optimization and surface characterization of
wire electrical discharge machining process, Precision Eng. 20 (1) (1997) 5-15.
[41] R.E. Williams, K.P. Rajurkar, Study of wire electrical discharge machined surface
characteristics, J. Mater. Process. Technol. 28 (1-2) (1991) 127-138.
[42] T.A. Spedding, Z.Q. Wang, Study on modeling of wire EDM process, J. Mater.
Process. Technol. 69 (1-3) (1997) 18-28.
[43] C.L. Liu, D. Esterling, Solid modeling of 4-axis wire EDM cut geometry, Computer-
Aided Des. 29 (12) (1997) 803-810.
[44] W.J. Hsue, Y.S. Liao, S.S. Lu, Fundamental geometry analysis of wire electrical
discharge machining in corner cutting, Inter. J. Mach. Tools Manuf. 39 (4) (1999) 651-
667.
[45] G. Spur, J. Scho"nbeck, Anode erosion in wire-EDM—a theoretical model, Ann.
CIRP 42 (1) (1993) 253-256.
[46] F. Han, M. Kunieda, T. Sendai, Y. Imai, High precision simulation of WEDM using
parametric programming, Ann. CIRP 51 (1) (2002) 165-168.
[47] Freeman, J. A. and Skapura, D. M. Neural Networks, Algorithm, Application, and
Programming Techniques. Reading, MA: Addison-Wesley, 1992
[48] Vemuri, V. R. Artificial Neural Networks: Concepts and Control Application, New
York: IEEE Computer Society Press,1992
[49] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning,
Addison Wesley Publishing Company, January 1989.
[50] C. L. Karr, "Design of an Adaptive Fuzzy Logic Controller Using a Genetic
Algorithm", Proc. ICGA 4, pp. 450-457, 1991.
42
[51] R. B. Holstien, Artificial Genetic Adaptation in Computer Control Systems, PhD
Thesis, Department of Computer and Communication Sciences, University of Michigan,
Ann Arbor, 1971.
[52] R. A. Caruana and J. D. Schaffer, "Representation and Hidden Bias: Gray vs. Binary
Coding", Proc. 6th Int. Conf Machine Learning, ppl53-161, 1988.
[53] W. E. Schmitendorgf, O. Shaw, R. Benson and S. Forrest, "Using Genetic
Algorithms for Controller Design: Simultaneous Stabilization and Eigenvalue Placement
in a Region", Technical Report No. CS92-9, Dept. Computer Science, College of
Engineering, University of New Mexico, 1992.
[54] M. F. Bramlette, "Initialization, Mutation and Selection Methods in Genetic
Algorithms for Function Optimization", Proc ICGA 4, pp. 100-107, 1991.
[55] C. B. Lucasius and G. Kateman, "Towards Solving Subset Selection Problems with
the Aid of the Genetic Algorithm", In Parallel Problem Solving from Nature 2, R.
Manner and B. Manderick, (Eds.), pp. 239-247, Amsterdam: North-Holland, 1992.
[56] A. H. Wright, "Genetic Algorithms for Real Parameter Optimization", In
Foundations of Genetic Algorithms, J. E. Rawlins (Ed.), Morgan Kaufmann, pp. 205-218,
1991.
[57] Z. Michalewicz, Genetic Algorithms + Data Structures = Evolution Programs,
Springer Verlag, 1992.
[58] T. Back, F. Hoffineister and H.-P. Schwefel, "A Survey of Evolution Strategies",
Proc. ICGA 4, pp. 2-10, 1991.
[59] J. J. Grefenstette, "Incorporating Problem Specific Knowledge into Genetic
Algorithms", In Genetic Algorithms and Simulated Annealing, pp. 42-60, L. Davis (Ed.),
Morgan Kaufmann, 1987.
[60] D. Whitley, K. Mathias and P. Fitzhorn, "Delta Coding: An Iterative Search Strategy
for Genetic Algorithms", Proc. ICGA 4, pp. 77-84, 1991.
[61] K. A. De Jong, Analysis of the Behaviour of a Class of Genetic Adaptive Systems,
PhD Thesis, Dept. of Computer and Communication Sciences, University of Michigan,
Ann Arbor, 1975.
[62] J. E. Baker, "Adaptive Selection Methods for Genetic Algorithms", Proc. ICGA 1,
pp. 101-111, 1985.
43
[63] J. E. Baker, "Reducing bias and inefficiency in the selection algorithm", Proc. ICGA
2, pp. 14-21, 1987.
[64] L. Booker, "Improving search in genetic algorithms," In Genetic Algorithms and
Simulated Annealing, L. Davis (Ed.), pp. 61-73, Morgan Kaufmann Publishers, 1987.
[65] W. M. Spears and K. A. De Jong, "An Analysis of Multi-Point Crossover", In
Foundations of Genetic Algorithms, J. E. Rawlins (Ed.), pp. 301-315, 1991.
[66] G. Syswerda, "Uniform crossover in genetic algorithms", Proc. ICGA 3, pp. 2-9,
1989.
[67] W. M. Spears and K. A. De Jong, "On the Virtues of Parameterised Uniform
Crossover", Proc. ICGA 4, pp.230-236, 1991.
[68] R. A. Caruana, L. A. Eshelman, J. D. Schaffer, "Representation and hidden bias II:
Eliminating defining length bias in genetic search via shuffle crossover", In Eleventh
International Joint Conference on Artificial Intelligence, N. S. Sridharan (Ed.), Vol. 1,
pp. 750-755, Morgan Kaufmann Publishers, 1989.
[69] H. Muhlenbein and D. Schlierkamp-Voosen, "Predictive Models for the Breeder
Genetic Algorithm", Evolutionary Computation, Vol. 1, No. 1, pp. 25- 49, 1993.
[70] H. Furuya and R. T. Haftka, "Genetic Algorithms for Placing Actuators on Space
Structures", Proc. ICGA 5, pp. 536-542, 1993.
[71] C. Z. Janikow and Z. Michalewicz, "An Experimental Comparison of Binary and
Floating Point Representations in Genetic Algorithms", Proc. ICGA 4, pp. 31-36, 1991.
[72] D. M. Tate and A. E. Smith, "Expected Allele Convergence and the Role of
Mutation in Genetic Algorithms", Proc. ICGA 5, pp.31-37, 1993.
[73] L. Davis, "Adapting Operator Probabilities in Genetic Algorithms", Proc. ICGA 3,
pp. 61-69, 1989.
[74] T. C. Fogarty, "Varying the Probability of Mutation in the Genetic Algorithm", Proc.
ICGA 3, pp. 104-109, 1989..
[75] K. A. De Jong and J. Sarma, "Generation Gaps Revisited", In Foundations of
Genetic Algorithms 2, L. D. Whitley (Ed.), Morgan Kaufmann Publishers, 1993.
44
Appendix -1
Neural Networks: an overview
Introduction to artificial neural network
There are a number of different answers possible to the question of how to define neural
networks. At one extreme, the answer could be that neural networks are simply a class of
mathematical algorithms, since a network can be regarded essentially as a graphic notation for a
large class of algorithms. Such algorithms produce solutions to a number of specific problems.
At the other end, the reply may be that these are synthetic networks that emulate the biological
neural networks found in living organisms. In light of today's limited knowledge of biological
neural networks and organisms, the more plausible answer seems to be closer to the algorithmic
one.
In search of better solutions for engineering and computing tasks, many avenues have been
pursued. There has been a long history of interest in the biological sciences on the part of
engineers, mathematicians, and physicists endeavoring to gain new ideas, inspirations, and
designs. Artificial neural networks have undoubtedly been biologically inspired, but the close
correspondence between them and real neural systems is still rather weak. Vast discrepancies
exist between both the architectures and capabilities of artificial and natural neural networks.
Knowledge about actual brain functions is so limited, however, that there is little to guide
those who would try to emulate them. No models have been successful in duplicating the
performance of the human brain. Therefore, the brain has been and still is only a metaphor for a
wide variety of neural network configurations that have been developed.
Despite the loose analogy between artificial and natural neural systems, we will briefly review
the biological neuron model. The synthetic neuron model will subsequently be defined in this
chapter and examples of network classes will be discussed. The basic definitions of neuron and
elementary neural networks will also be given. Since no common standards are yet used in the
technical literature, this part of the chapter will introduce notation, graphic symbols, and
terminology used in this text. The basic forms of neural network processing will also be
discussed.
Biological neurons and their artificial models
A human brain consists of approximately $10^{11}$ computing elements called neurons. They
communicate through a connection network of axons and synapses having a density of
approximately $10^{4}$ synapses per neuron.
Biological Neuron
The elementary nerve cell, called a neuron, is the fundamental building block of the biological
neural network. Its schematic diagram is shown in Figure 5. A typical cell has three major
regions: the cell body, which is also called the soma, the axon, and the dendrites. Dendrites
form a dendritic tree, which is a very fine bush of thin fibers around the neuron's body.
Dendrites receive information from neurons through axons, long fibers that serve as
transmission lines. An axon is a long cylindrical connection that carries impulses from the
neuron. The end part of an axon splits into a fine arborization. Each branch of it terminates
in a small end bulb almost touching the dendrites of neighboring neurons. The axon-dendrite
contact organ is called a synapse. The synapse is where the neuron introduces its signal to the
neighboring neuron. The signals reaching a synapse and received by dendrites are electrical
impulses.
[Figure: neuron with incoming axons from other neurons, dendrites, cell body, axon hillock, axon, synapse, terminal bouton, receiving neuron, and impulse train]
Figure 5 Schematic diagram of a neuron and a sample of pulse train
Neuron Modeling for Artificial Neural Systems
Weights and the neurons' thresholds are fixed in the model and no interaction among network
neurons takes place except for signal flow. Thus, we will consider this model as a starting point
for our neuron modeling discussion. Specifically, the artificial neural systems and computing
algorithms employ a variety of neuron models that have more diversified features than the
model just presented. Below, the main artificial neuron models are introduced that will be used
later in this text.
Figure 6 General symbol of neuron consisting of processing node and synaptic
connections.
Every neuron model consists of a processing element with synaptic input connections and a
single output. The signal flow of neuron inputs, x, is considered to be unidirectional as
indicated by arrows, as is a neuron's output signal flow. A general neuron symbol is shown in
Figure 6. This symbolic representation shows a set of weights and the neuron's processing unit,
or node. The neuron output signal is given by the following relationship:
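$$o = f(\mathbf{w}^t \mathbf{x}) \qquad \text{(i)}$$
or
$$o = f\left(\sum_{i=1}^{n} w_i x_i\right) \qquad \text{(ii)}$$
where w is the weight vector defined as
$$\mathbf{w} = [\,w_1 \;\; w_2 \;\; \cdots \;\; w_n\,]^t$$
and x is the input vector:
$$\mathbf{x} = [\,x_1 \;\; x_2 \;\; \cdots \;\; x_n\,]^t$$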
(All vectors defined in this text are column vectors; superscript t denotes a transposition.) The
function f(w`x) is often referred to as an activation function. Its domain is the set of activation
values, net, of the neuron model, we thus often use this function as f (net). The variable net is
defined as a scalar product of the weight and input vector.
net = wx (iii)
The argument of the activation function, the variable net, is an analog of the biological neuron's
membrane potential. Note that the threshold value is not explicitly used in (i), (ii)
and (iii), but this is only for notational convenience. We have momentarily assumed that the
modeled neuron has n - 1 actual synaptic connections that come from actual variable inputs x_1,
x_2, ..., x_{n-1}. We have also assumed that x_n = -1 and w_n = T. Since the threshold plays an important role
for some models, we will sometimes need to extract the threshold explicitly as a separate neuron
model parameter.
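The threshold convention described above can be illustrated with a minimal Python sketch. The function names, the example weights and the choice of mapping net = 0 to +1 are assumptions made only for this illustration.

```python
def neuron_output(weights, inputs, threshold, activation):
    """Compute o = f(w^t x) with the threshold folded in as w_n = T, x_n = -1."""
    w = list(weights) + [threshold]   # augmented weight vector
    x = list(inputs) + [-1.0]         # augmented input vector
    net = sum(wi * xi for wi, xi in zip(w, x))  # scalar product of w and x
    return activation(net)

def hard_limiter(net):
    """Bipolar binary activation; net == 0 is mapped to +1 here by assumption."""
    return 1.0 if net >= 0 else -1.0

# Two actual inputs plus the threshold acting as a third (dummy) connection.
o = neuron_output([0.5, -1.0], [2.0, 1.0], threshold=0.5, activation=hard_limiter)
```

Folding the threshold into the weight vector means that the same scalar product handles both the weighted sum and the comparison against T.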
The general neuron symbol, shown in Figure 6 and described with expressions (i), (ii) and (iii),
is commonly used in neural network literature. However, different artificial neural network
classes make use of different definitions of f(net). Also, even within the same class of networks,
the neurons are sometimes considered to perform differently during different phases of network
operation. Therefore, it is pedagogically sound to replace, whenever needed, the general neuron
model symbol from Figure 6 with a specific f(net) and a specific neuron model. The model
validity will then usually be restricted to a particular class of network. Two main models
introduced below are often used in this text.
Acknowledging the simplifications that are necessary to model a biological neuron network
with artificial neural networks, the following terminology is introduced: (1) neural networks are
meant to be artificial neural networks consisting of neuron models and (2) neurons are meant to
be artificial neuron models.
Observe from (i) and (ii) that the neuron as a processing node performs the operation of
summation of its weighted inputs, or the scalar product computation to obtain net. Subsequently,
it performs the nonlinear operation f(net) through its activation function. Typical activation
functions used are
$$f(net) = \frac{2}{1 + \exp(-\lambda \, net)} - 1 \tag{iv}$$
$$f(net) = \operatorname{sign}(net) = \begin{cases} +1, & net > 0 \\ -1, & net < 0 \end{cases} \tag{v}$$
where λ > 0 in (iv) is proportional to the neuron gain determining the steepness of the
continuous function f(net) near net = 0. The continuous activation function is shown in Figure
7(a) for various λ. Notice that as λ → ∞, the limit of the continuous function becomes the
sign(net) function defined in (v). Activation functions (iv) and (v) are called bipolar continuous
and bipolar binary functions, respectively. The word "bipolar" is used to point out that both
positive and negative responses of neurons are produced for this definition of the activation
function.
By shifting and scaling the bipolar activation functions defined by (iv) and (v), unipolar
continuous and unipolar binary activation functions can be obtained.
Figure 7 Activation functions of a neuron: (a) bipolar continuous and (b) unipolar
continuous
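The relation between the bipolar and unipolar activation functions can be checked numerically. The following Python sketch is only an illustration; the function names and the default gain of 1 are assumptions, and the treatment of net = 0 in the binary function is an arbitrary choice because (v) leaves it undefined.

```python
import math

def bipolar_continuous(net, lam=1.0):
    """Expression (iv): 2 / (1 + exp(-lam * net)) - 1, with values in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-lam * net)) - 1.0

def bipolar_binary(net):
    """Expression (v); net == 0 is not covered by (v) and is mapped to -1 here."""
    return 1.0 if net > 0 else -1.0

def unipolar_continuous(net, lam=1.0):
    """Shift by +1 and scale by 1/2, giving the logistic function in (0, 1)."""
    return (bipolar_continuous(net, lam) + 1.0) / 2.0

def unipolar_binary(net):
    """The same shift and scale applied to the bipolar binary function."""
    return (bipolar_binary(net) + 1.0) / 2.0

# A large gain pushes the continuous function towards the binary one.
steep = bipolar_continuous(1.0, lam=50.0)
```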
How Do Neural Networks Work?
The standard artificial neuron is a processing element whose output is calculated by multiplying
its inputs by a weight vector, summing the results, and applying an activation function to the
sum (Fig. 8).
The function of the entire neural network is simply an entirely deterministic calculation of the
outputs of all the neurons.
Fig. 8 A standard artificial neuron (single layer perceptron; inputs x1(n), x2(n), x3(n), ..., output y(n))
The back propagation training algorithm
Back propagation (BP) is a general method for iteratively solving for a multilayer perceptron's
weights and biases. It uses a steepest descent technique, which is very stable when a small
learning rate is used but converges slowly. Several methods for speeding up BP
have been used, including momentum and a variable learning rate.
(a) Derivative of the Activation Functions:
The chain rule that is used in deriving the BP algorithm necessitates the computation of the
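derivative of the activation functions. For logistic, hyperbolic tangent, and linear functions, the
derivatives are as follows:
Linear: $\Phi(I) = I$, $\Phi'(I) = 1$
Logistic: $\Phi(I) = \frac{1}{1 + e^{-\alpha I}}$, $\Phi'(I) = \alpha \Phi(I) (1 - \Phi(I))$
Tanh: $\Phi(I) = \frac{e^{\alpha I} - e^{-\alpha I}}{e^{\alpha I} + e^{-\alpha I}}$, $\Phi'(I) = \alpha (1 - \Phi(I)^2)$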
Alpha is called the slope parameter. Usually alpha is chosen to be 1 but other slopes may be
used. This formulation for the derivative makes the computation of the gradient more efficient
since the output Φ(I) has already been calculated in the forward pass.
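The saving mentioned above, computing the derivative from the already available output, can be illustrated as follows. This is a hedged Python sketch with alpha fixed to 1; the helper names and the finite-difference check are assumptions added for illustration only.

```python
import math

ALPHA = 1.0  # slope parameter, taken as 1 as in the usual choice

def logistic(i, alpha=ALPHA):
    return 1.0 / (1.0 + math.exp(-alpha * i))

def logistic_deriv_from_output(phi, alpha=ALPHA):
    """Derivative of the logistic function expressed through its own output."""
    return alpha * phi * (1.0 - phi)

def tanh_deriv_from_output(phi, alpha=ALPHA):
    """Derivative of tanh(alpha * I) expressed through its own output."""
    return alpha * (1.0 - phi * phi)

def numeric_deriv(f, i, h=1e-6):
    """Central finite difference, used here only as an independent check."""
    return (f(i + h) - f(i - h)) / (2.0 * h)

phi = logistic(0.3)                          # value from the forward pass
d_phi = logistic_deriv_from_output(phi)      # no further exponential needed
```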
The highest gradient is at I = 0. Since the speed of learning is partially dependent on the size of
the gradient, the internal activation of all neurons should be kept small to expedite training.
This is why we scale the inputs and initialize weights to small random values.
The backpropagation algorithm is an optimization technique designed to minimize an objective
function. The most commonly used objective function is the squared error which is defined as:
$$\varepsilon^2 = [T_q - \Phi_{q.k}]^2$$
Fig. 9 Configuration and terminology of a multi-layered neural network.
The network syntax is defined as in Figure 9:
In this notation, the layers are labeled i, j, and k; with m, n, and r neurons respectively; and the
neurons in each layer are indexed h, p, and q respectively.
where x = input value
T = target output value
w = weight value
I = internal activation
Φ = neuron output
ε = error term
The outputs for a two layer network with both layers using a logistic activation function are
calculated by the equation:
Φ = logistic{w2 * [logistic(w1 * x + b1)] + b2}
where: w1 = first layer weight matrix
w2 = second layer weight matrix
b1 = first layer bias vector
b2 = second layer bias vector
The input vector can be augmented with a dummy node representing the bias input. This
dummy input of 1 is multiplied by a weight corresponding to the bias value. This results in a
more compact representation of the above equation:
Φ = logistic{W2 * [1; logistic(W1 * X)]}
where X = [1 x]' is the augmented input vector, W1 = [b1 w1] and W2 = [b2 w2].
Note that a dummy hidden node (= 1) also needs to be inserted into the equation.
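The equivalence of the explicit-bias form and the augmented form can be verified with a small Python sketch. The weights, biases and input used here are arbitrary illustrative values, and the helper names are assumptions of this sketch.

```python
import math

def logistic(values):
    return [1.0 / (1.0 + math.exp(-v)) for v in values]

def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def forward(w1, b1, w2, b2, x):
    """Explicit-bias form: logistic(w2 * logistic(w1 * x + b1) + b2)."""
    hidden = logistic([s + b for s, b in zip(matvec(w1, x), b1)])
    return logistic([s + b for s, b in zip(matvec(w2, hidden), b2)])

def forward_augmented(W1, W2, x):
    """Augmented form: a dummy 1 is prepended to the input and to the hidden layer."""
    X = [1.0] + list(x)
    hidden = [1.0] + logistic(matvec(W1, X))
    return logistic(matvec(W2, hidden))

# Illustrative 2-2-1 network (values are arbitrary).
w1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.05, -0.1]
w2, b2 = [[0.7, -0.5]], [0.2]
W1 = [[b] + row for b, row in zip(b1, w1)]   # W1 = [b1 w1]
W2 = [[b] + row for b, row in zip(b2, w2)]   # W2 = [b2 w2]
x = [1.0, 2.0]

out_explicit = forward(w1, b1, w2, b2, x)
out_augmented = forward_augmented(W1, W2, x)
```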
4.3.5 The weight updates
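• The output layer weights are changed in proportion to the negative gradient of the
squared error with respect to the weights. These weight changes can be calculated using
the chain rule. The symbols and terminology follow Fig. 9.
$$\Delta w_{pq.k} = -\eta_{p.q} \frac{\partial \varepsilon^2}{\partial w_{pq.k}} = -\eta_{p.q} \frac{\partial \varepsilon^2}{\partial \Phi_{q.k}} \frac{\partial \Phi_{q.k}}{\partial I_{q.k}} \frac{\partial I_{q.k}}{\partial w_{pq.k}} = \eta_{p.q} \, \delta_{pq.k} \, \Phi_{p.j}$$
and
$$\delta_{pq.k} = 2 [T_q - \Phi_{q.k}] \, \Phi_{q.k} [1 - \Phi_{q.k}]$$
$$w_{pq.k}(N+1) = w_{pq.k}(N) + \eta_{p.q} \, \delta_{pq.k} \, \Phi_{p.j}$$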
• The hidden layer outputs have no target values. Therefore, a procedure is used to back
propagate the output layer errors to the hidden layer neurons in order to modify their
weights to minimize the error. To accomplish this, we start with the equation for the
gradient with respect to the weights and use the chain rule.
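$$\Delta w_{hp.j} = -\eta_{h.p} \frac{\partial \varepsilon^2}{\partial w_{hp.j}} = -\eta_{h.p} \sum_{q=1}^{r} \frac{\partial \varepsilon_q^2}{\partial w_{hp.j}}$$
$$= -\eta_{h.p} \sum_{q=1}^{r} \frac{\partial \varepsilon_q^2}{\partial \Phi_{q.k}} \frac{\partial \Phi_{q.k}}{\partial I_{q.k}} \frac{\partial I_{q.k}}{\partial \Phi_{p.j}} \frac{\partial \Phi_{p.j}}{\partial I_{p.j}} \frac{\partial I_{p.j}}{\partial w_{hp.j}}$$
$$\frac{\partial \varepsilon^2}{\partial w_{hp.j}} = \sum_{q=1}^{r} (-2) [T_q - \Phi_{q.k}] \, \Phi_{q.k} [1 - \Phi_{q.k}] \, w_{pq.k} \, \Phi_{p.j} [1 - \Phi_{p.j}] \, x_h = -\sum_{q=1}^{r} \delta_{pq.k} \, w_{pq.k} \, \Phi_{p.j} [1 - \Phi_{p.j}] \, x_h$$
$$\delta_{hp.j} = \sum_{q=1}^{r} \delta_{pq.k} \, w_{pq.k} \, \Phi_{p.j} [1 - \Phi_{p.j}]$$
$$w_{hp.j}(N+1) = w_{hp.j}(N) + \eta_{h.p} \, x_h \, \delta_{hp.j}$$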
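The two update rules can be combined into one training step. The following Python sketch is a minimal illustration under several assumptions: logistic activations with slope 1, no bias terms, a single learning rate eta for all weights, and arbitrary example weights. It is not the code used in this work.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w_j, w_k, x):
    """w_j[h][p]: input-to-hidden weights; w_k[p][q]: hidden-to-output weights."""
    n, r = len(w_k), len(w_k[0])
    phi_j = [logistic(sum(x[h] * w_j[h][p] for h in range(len(x)))) for p in range(n)]
    phi_k = [logistic(sum(phi_j[p] * w_k[p][q] for p in range(n))) for q in range(r)]
    return phi_j, phi_k

def backprop_step(w_j, w_k, x, target, eta=0.5):
    """One steepest-descent update of both layers for a single training pattern."""
    phi_j, phi_k = forward(w_j, w_k, x)
    n, r = len(w_k), len(w_k[0])
    # Output layer error terms: 2 (T - Phi) Phi (1 - Phi)
    delta_k = [2.0 * (target[q] - phi_k[q]) * phi_k[q] * (1.0 - phi_k[q]) for q in range(r)]
    # Hidden layer error terms, back propagated through the old output weights
    delta_j = [sum(delta_k[q] * w_k[p][q] for q in range(r)) * phi_j[p] * (1.0 - phi_j[p])
               for p in range(n)]
    new_w_k = [[w_k[p][q] + eta * delta_k[q] * phi_j[p] for q in range(r)] for p in range(n)]
    new_w_j = [[w_j[h][p] + eta * x[h] * delta_j[p] for p in range(n)] for h in range(len(x))]
    return new_w_j, new_w_k

def squared_error(w_j, w_k, x, target):
    _, phi_k = forward(w_j, w_k, x)
    return sum((t - o) ** 2 for t, o in zip(target, phi_k))

w_j = [[0.1, -0.2], [0.3, 0.4]]   # 2 inputs, 2 hidden neurons
w_k = [[0.5], [-0.3]]             # 2 hidden neurons, 1 output
x, target = [1.0, 0.5], [1.0]
error_before = squared_error(w_j, w_k, x, target)
w_j_new, w_k_new = backprop_step(w_j, w_k, x, target)
error_after = squared_error(w_j_new, w_k_new, x, target)
```

The hidden layer error terms are computed before the output weights are overwritten, so both layers are updated from the same gradient.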
Appendix -2
An Overview of Genetic Algorithms
Optimization Techniques
Optimization stands for selecting the best alternative among a given set of options. In any
optimization problem there is an objective function that depends on a set of
variables. An optimum is not necessarily a maximum; it is the best value of the function,
which may be a minimum or a maximum depending on the problem.
Analytical approaches
Analytical approaches (most traditional optimization methods) are used for linear functions.
Derivatives of the objective function and constraints are usually required in these techniques.
For nonlinear problems, the use of such approaches is limited to certain types of functions. For
the functions to be optimized by these techniques, they must be continuous and differentiable
(no integer variables) and to find a globally optimal solution, the function must be convex.
Stochastic approaches
These techniques search randomly for better solutions. If new candidates are built from promising
solutions, they are more efficient than a completely random search. They can handle any
type of problem; the only limitation of this category is long computational time. They are
slower than analytical methods for problems that can be solved analytically. Examples include
evolutionary algorithms (EAs), simulated annealing (SA), tabu search, and numerous variations.
Why evolutionary?
Since classical search and optimization methods use a point-by-point approach, where one
solution in each iteration is modified to a different (hopefully better) solution, the outcome of
using a classical optimization method is a single optimized solution.
Thinking along this working principle, classical search optimization methods could find only a
single optimized solution in a single simulation run.
Since only a single optimized solution could be found, it was necessary to convert the
task of finding multiple trade-off solutions in a multi-objective optimization into one of finding a single
solution of a transformed single-objective optimization problem.
Fig. 17 The basic structure of an EA (problem, coding of solutions, objective function, evolutionary operators, specific knowledge, solution)
However, the field of search and optimization has changed over the last few years with the
introduction of a number of non-classical, unorthodox and stochastic search and optimization
algorithms. Of these, the evolutionary algorithm (EA) (Fig. 17) mimics nature's evolutionary
principles to drive its search towards an optimal solution. One of the most striking differences from
classical search and optimization algorithms is that EAs use a population of solutions, processed
in each iteration, instead of a single solution.
Since a population of solutions is processed in each iteration, the outcome of an EA is also a
population of solutions. If an optimization problem has a single optimum, all EA population
members can be expected to converge to that optimum solution. However, if an optimization
problem has multiple optimal solutions, an EA can be used to capture multiple optimal solutions
in its final population. This ability of an EA to find multiple optimal solutions in one single
simulation run makes EAs unique in solving multi-objective optimization problems. Since the
first step of the ideal strategy for multi-objective optimization requires multiple trade-off
solutions to be found, an EA's population-based search can be suitably utilized to find a number of
solutions in a single simulation run.
This Section gives a tutorial introduction to the basic Genetic Algorithm (GA) and outlines the
procedures for solving problems using the GA. Genetic algorithms were
developed by John Holland in the 1960s and early 1970s [49]. They are adaptive heuristic search
algorithms used for solving demanding search and optimisation problems. Since their
introduction, genetic algorithms have spread to almost all areas of research. They have proved
to be an effective optimization tool for multi-criteria and multi-parameter problems. Their
power lies in a guided random search that imitates the principles of natural evolution, as seen
through the "survival of the fittest" law. To implement the genetic algorithm on a certain
problem we have to describe an individual and an environment in which the individual has to fit.
In other words, it is necessary to determine a coding of the problem variables and the
fitness function that enables the quality of the variables to be calculated. The optimization process
starts by creating the initial generation of organisms, which then improve by reproduction,
mutation and crossover from generation to generation. Thus, we gradually obtain members
(organisms) of ever higher quality that are, in fact, the solutions of the problem (Fig. 18).
Fig. 18: Structure of a single population evolutionary algorithm (start; generate initial population; evaluate objective function; if the optimization criteria are met, the best individuals are the result; otherwise generate a new population by selection, recombination and mutation)
The principal steps of the method are:
• Creation of the initial generation of organisms,
• Evaluation of organisms by means of the fitness function,
• Selection of organisms which best solve the set problem, and
• Creation of new generation by crossover, mutation and reproduction.
What are Genetic Algorithms?
The GA is a stochastic global search method that mimics the metaphor of natural biological
evolution. GAs operate on a population of potential solutions, applying the principle of survival
of the fittest to produce (hopefully) better and better approximations to a solution. At each
generation, a new set of approximations is created by the process of selecting individuals
according to their level of fitness in the problem domain and breeding them together using
operators borrowed from natural genetics. This process leads to the evolution of populations of
individuals that are better suited to their environment than the individuals that they were created
from, just as in natural adaptation.
Individuals, or current approximations, are encoded as strings, chromosomes, composed over
some alphabet(s), so that the genotypes (chromosome values) are uniquely mapped onto the
decision variable (phenotypic) domain. The most commonly used representation in GAs is the
binary alphabet {0, 1}, although other representations can be used, e.g. ternary, integer, real-
valued etc. For example, a problem with two variables, x1 and x2, may be mapped onto the
chromosome structure in the following way:
1011010011 | 010111010100101
    x1     |       x2
where x1 is encoded with 10 bits and x2 with 15 bits, possibly reflecting the level of accuracy
or range of the individual decision variables. Examining the chromosome string in isolation
yields no information about the problem we are trying to solve. It is only with the decoding of
the chromosome into its phenotypic values that any meaning can be applied to the
representation. However, as described below, the search process will operate on this encoding of
the decision variables, rather than the decision variables themselves, except, of course, where
real-valued genes are used. Having decoded the chromosome representation into the decision
variable domain, it is possible to assess the performance, or fitness, of individual members of a
population. This is done through an objective function that characterises an individual's
performance in the problem domain. In the natural world, this would be an individual's ability
to survive in its present environment.
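Decoding a binary chromosome into its phenotypic values can be sketched in Python as follows. The linear mapping, the variable ranges [0, 1] and [-5, 5], and the function names are assumptions chosen for illustration; the 25-bit string is an illustrative chromosome with 10 bits for x1 and 15 bits for x2.

```python
def decode(bits, lower, upper):
    """Map a binary string linearly onto the interval [lower, upper]."""
    value = int(bits, 2)
    return lower + (upper - lower) * value / (2 ** len(bits) - 1)

def decode_chromosome(chromosome, lengths, bounds):
    """Split a chromosome into genes of the given lengths and decode each one."""
    values, start = [], 0
    for length, (lower, upper) in zip(lengths, bounds):
        values.append(decode(chromosome[start:start + length], lower, upper))
        start += length
    return values

chromosome = "1011010011" + "010111010100101"   # 10 bits for x1, 15 bits for x2
x1, x2 = decode_chromosome(chromosome, [10, 15], [(0.0, 1.0), (-5.0, 5.0)])
```

With this mapping the 15-bit gene resolves its range in finer steps than the 10-bit gene, which is one way the bit length reflects the required accuracy.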
Thus, the objective function establishes the basis for selection of pairs of individuals that will be
mated together during reproduction. During the reproduction phase, each individual is assigned
a fitness value derived from its raw performance measure given by the objective function. This
value is used in the selection to bias towards more fit individuals. Highly fit individuals, relative
to the whole population, have a high probability of being selected for mating whereas less fit
individuals have a correspondingly low probability of being selected. Once the individuals have
been assigned a fitness value, they can be chosen from the population, with a probability
according to their relative fitness, and recombined to produce the next generation. Genetic
operators manipulate the characters (genes) of the chromosomes directly, using the assumption
that certain individuals' gene codes, on average, produce fitter individuals. The recombination
operator is used to exchange genetic information between pairs, or larger groups, of individuals.
The simplest recombination operator is that of single-point crossover.
Consider the two parent binary strings:
P1 = 10010110, and P2 = 10111000.
If an integer position, i, is selected uniformly at random between 1 and the string length, l, minus
one [1, l-1], and the genetic information is exchanged between the individuals about this point,
then two new offspring strings are produced. The two offspring below are produced when the
crossover point i = 5 is selected,
O1 = 10010000, and O2 = 10111110.
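The single-point crossover example can be reproduced with a few lines of Python. The function name is an assumption of this sketch; the parents and the crossover point are those of the example.

```python
def single_point_crossover(p1, p2, i):
    """Exchange the tails of two equal-length strings after position i (1 <= i <= l-1)."""
    if len(p1) != len(p2) or not 1 <= i <= len(p1) - 1:
        raise ValueError("parents must have equal length and 1 <= i <= l-1")
    return p1[:i] + p2[i:], p2[:i] + p1[i:]

o1, o2 = single_point_crossover("10010110", "10111000", 5)
```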
This crossover operation is not necessarily performed on all strings in the population. Instead, it
is applied with a probability P(x) when the pairs are chosen for breeding. A further genetic
operator, called mutation, is then applied to the new chromosomes, again with a set probability,
P(m). Mutation causes the individual genetic representation to be changed according to some
probabilistic rule. In the binary string representation, mutation will cause a single bit to change
its state, 0 to 1 or 1 to 0. For example, mutating the fourth bit of O1 leads to the new string
O1m = 10000000.
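Bit-flip mutation can be sketched in Python as follows. The per-bit rule used in mutate (each bit flipped independently with probability pm) is one common probabilistic rule and is an assumption of this sketch, as are the function names.

```python
import random

def flip_bit(chromosome, position):
    """Flip the bit at a 1-based position of a binary string."""
    k = position - 1
    flipped = "1" if chromosome[k] == "0" else "0"
    return chromosome[:k] + flipped + chromosome[k + 1:]

def mutate(chromosome, pm, rng=random):
    """Flip each bit independently with probability pm."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < pm else bit
        for bit in chromosome
    )

o1m = flip_bit("10010000", 4)   # mutate the fourth bit of the first offspring
```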
Mutation is generally considered to be a background operator that ensures that the probability of
searching a particular subspace of the problem space is never zero. This has the effect of tending
to inhibit the possibility of converging to a local optimum, rather than the global optimum. After
recombination and mutation, the individual strings are then, if necessary, decoded, the objective
function evaluated, a fitness value assigned to each individual and individuals selected for
mating according to their fitness, and so the process continues through subsequent generations.
In this way, the average performance of individuals in a population is expected to increase, as
good individuals are preserved and bred with one another and the less fit individuals die out.
The GA is terminated when some criteria are satisfied, e.g. a certain number of generations, a
mean deviation in the population, or when a particular point in the search space is encountered.
GAs versus Traditional Methods
From the above discussion, it can be seen that the GA differs substantially from more traditional
search and optimization methods. The four most significant differences are:
• GAs search a population of points in parallel, not a single point.
• GAs do not require derivative information or other auxiliary knowledge; only the
objective function and corresponding fitness levels influence the directions of search.
• GAs use probabilistic transition rules, not deterministic ones.
• GAs work on an encoding of the parameter set rather than the parameter set itself
(except where real-valued individuals are used).
It is important to note that the GA provides a number of potential solutions to a given problem
and the choice of final solution is left to the user. In cases where a particular problem does not
have one individual solution, for example a family of Pareto-optimal solutions, as is the case in
multi-objective optimization and scheduling problems, then the GA is potentially useful for
identifying these alternative solutions simultaneously.
Major Elements of the Genetic Algorithm
The simple genetic algorithm (SGA) is described by Goldberg [49] and is used here to illustrate
the basic components of the GA. A pseudo-code outline of the SGA is shown below. The
population at time t is represented by the time-dependent variable P, with the initial population
of random estimates being P(0). Using this outline of a GA, the remainder of this Section
describes the major elements of the GA.
Procedure in GA
begin
    t = 0;
    initialize P(t);
    evaluate P(t);
    while not finished do
    begin
        t = t + 1;
        select P(t) from P(t-1);
        reproduce pairs in P(t);
        evaluate P(t);
    end
end.
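The pseudo-code outline can be turned into a small runnable Python sketch. Everything specific here is an assumption made for illustration: the toy fitness function (number of 1 bits), fitness-proportional selection, single-point crossover, per-bit mutation, the parameter values and the fixed number of generations as the stopping rule. It is not the implementation used in this work.

```python
import random

def one_max(chromosome):
    """Toy fitness (an assumption for this sketch): the number of 1 bits."""
    return chromosome.count("1")

def flip(bit):
    return "1" if bit == "0" else "0"

def sga(fitness, n_bits=16, pop_size=30, generations=40, px=0.7, pm=0.01, seed=0):
    """Simple GA on binary strings; pop_size is assumed to be even."""
    rng = random.Random(seed)
    # t = 0: initialize P(0) with uniformly random bit strings
    pop = ["".join(rng.choice("01") for _ in range(n_bits)) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):                 # "while not finished"
        scores = [fitness(c) for c in pop]       # evaluate the current population
        # select: fitness-proportional sampling with replacement
        weights = scores if sum(scores) > 0 else None
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # reproduce pairs: single-point crossover with probability px
        offspring = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < px:
                i = rng.randint(1, n_bits - 1)
                a, b = a[:i] + b[i:], b[:i] + a[i:]
            offspring += [a, b]
        # mutation: flip each bit with probability pm
        pop = ["".join(flip(g) if rng.random() < pm else g for g in c) for c in offspring]
        best = max(pop + [best], key=fitness)    # remember the best individual seen
    return pop, best

population, best = sga(one_max)
```

The best individual is tracked separately because the simple GA has no elitism, so the fittest string of one generation may be lost in the next.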
Population Representation and Initialization
GAs operate simultaneously on a number of potential solutions, called a population, consisting of some
encoding of the parameter set. Typically, a population is composed of between
30 and 100 individuals, although a variant called the micro GA uses very small populations,
of about 10 individuals, with a restrictive reproduction and replacement strategy in an attempt to reach
real-time execution [50].
The most commonly used representation of chromosomes in the GA is that of the single-level
binary string. Here, each decision variable in the parameter set, is encoded as a binary string and
these are concatenated to form a chromosome. The use of Gray coding has been advocated as a
method of overcoming the hidden representational bias in conventional binary representation, as
the Hamming distance between adjacent values is constant [51]. Empirical evidence of Caruana
and Schaffer [52] suggests that large Hamming distances in the representational mapping
between adjacent values, as is the case in the standard binary representation, can result in the
search process being deceived or unable to efficiently locate the global minimum. A further
approach of Schmitendorf et al. [53] is the use of logarithmic scaling in the conversion of
binary-coded chromosomes to their real phenotypic values. Although the precision of the
parameter values is possibly less consistent over the desired range, in problems where the spread
of feasible parameters is unknown, a larger search space may be covered with the same number
of bits than a linear mapping scheme allowing the computational burden of exploring unknown
search spaces to be reduced to a more manageable level. Whilst binary-coded GAs are most
commonly used, there is an increasing interest in alternative encoding strategies, such as integer
and real-valued representations. For some problem domains, it is argued that the binary
representation is in fact deceptive in that it obscures the nature of the search [54]. In the subset
selection problem [55], for example, the use of an integer representation and look-up tables
provides a convenient and natural way of expressing the mapping from representation to
problem domain. The use of real-valued genes in GAs is claimed by Wright [56] to offer a
number of advantages in numerical function optimization over binary encodings. Efficiency of
the GA is increased as there is no need to convert chromosomes to phenotypes before each
function evaluation; less memory is required as efficient floating-point internal computer
representations can be used directly; there is no loss in precision by discretisation to binary or
other values; and there is greater freedom to use different genetic operators. The use of real-
valued encodings is described in detail by Michalewicz [57] and in the literature on Evolution
Strategies (see, for example, [58]). Having decided on the representation, the first step in the
SGA is to create an initial population. This is usually achieved by generating the required
number of individuals using a random number generator that uniformly distributes numbers in
the desired range. For example, with a binary population of Nind individuals whose
chromosomes are Lind bits long, Nind x Lind random numbers uniformly distributed from the
set {0, 1 } would be produced. A variation is the extended random initialisation procedure of
Bramlette whereby a number of random initialisations are tried for each individual and the one
with the best performance is chosen for the initial population. Other users of GAs have seeded
the initial population with some individuals that are known to be in the vicinity of the global
minimum (see, for example, [59] and [60]). This approach is, of course, only applicable if the
nature of the problem is well understood beforehand or if the GA is used in conjunction with a
knowledge-based system. The GA code supports binary chromosome representations. Binary
populations may be initialised using the Toolbox function for creating binary populations, crtbp.
An additional function, crtbase, is provided that builds a vector describing the integer
representation. Conversion between binary strings and real values is provided by the routine
bs2rv, which supports the use of Gray codes and logarithmic scaling.
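The Hamming distance property of Gray coding discussed in this Section can be demonstrated with a short Python sketch. It uses the standard reflected binary Gray code; the function names are assumptions of this sketch and it does not reproduce any Toolbox routine.

```python
def binary_to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def gray_to_binary(g):
    """Inverse mapping: recover the integer from its Gray code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")

# Adjacent values 7 and 8 differ in four bits in standard binary (0111 vs 1000)
# but in a single bit after Gray coding.
distance_binary = hamming(7, 8)
distance_gray = hamming(binary_to_gray(7), binary_to_gray(8))
```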
The Objective and Fitness Functions
The objective function is used to provide a measure of how individuals have performed in the
problem domain. In the case of a minimization problem, the most fit individuals will have the
lowest numerical value of the associated objective function. This raw measure of fitness is
usually only used as an intermediate stage in determining the relative performance of
individuals in a GA. Another function, the fitness function, is normally used to transform the
objective function value into a measure of relative fitness [61], thus:
$$F(x) = g(f(x))$$
where f is the objective function, g transforms the value of the objective function to a non-
negative number and F is the resulting relative fitness. This mapping is always necessary when
the objective function is to be minimized as the lower objective function values correspond to
fitter individuals. In many cases, the fitness function value corresponds to the number of
offspring that an individual can expect to produce in the next generation. A commonly used
transformation is that of proportional fitness assignment (see, for example, [49]). The individual
fitness, F(x_i), of each individual is computed as the individual's raw performance, f(x_i), relative
to the whole population, i.e.,
$$F(x_i) = \frac{f(x_i)}{\sum_{i=1}^{N_{ind}} f(x_i)}$$
where Nind is the population size and x_i is the phenotypic value of individual i. Whilst this
fitness assignment ensures that each individual has a probability of reproducing according to its
relative fitness, it fails to account for negative objective function values. A linear transformation
which offsets the objective function [49] is often used prior to fitness assignment, such that,
$$F(x) = a f(x) + b$$
where a is a positive scaling factor if the optimization is maximizing and negative if we are
minimizing. The offset b is used to ensure that the resulting fitness values are non-negative.
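Linear scaling followed by proportional fitness assignment can be sketched in Python as below. The objective values, the scaling factor a = -1 and the rule for choosing b (the smallest offset that keeps all values non-negative) are assumptions used only to illustrate a minimization problem with a negative objective value.

```python
def linear_scaled_fitness(objective_values, a, b):
    """F(x) = a * f(x) + b for every individual."""
    return [a * f + b for f in objective_values]

def proportional_fitness(values):
    """Each value relative to the population total (values must be non-negative)."""
    total = sum(values)
    return [v / total for v in values]

raw = [-2.0, 1.0, 3.0]               # objective values, to be minimised
a = -1.0                             # negative because we are minimising
b = -min(a * f for f in raw)         # smallest offset giving non-negative fitness
scaled = linear_scaled_fitness(raw, a, b)
relative = proportional_fitness(scaled)
```

The individual with the lowest objective value receives the largest share, and the worst individual receives a share of zero under this particular choice of b.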
The linear scaling and offsetting outlined above is, however, susceptible to rapid convergence.
The selection algorithm (see below) selects individuals for reproduction on the basis of their
relative fitness. Using linear scaling, the expected number of offspring is approximately
proportional to that individual's performance. As there is no constraint on an individual's
performance in a given generation, highly fit individuals in early generations can dominate the
reproduction, causing rapid convergence to possibly sub-optimal solutions. Similarly, if there is
little deviation in the population, then scaling provides only a small bias towards the most fit
individuals. Baker [62] suggests that limiting the reproductive range, so that no individuals
generate an excessive number of offspring, prevents premature convergence. Here, individuals
are assigned fitness according to their rank in the population rather than their raw performance.
One variable, MAX, is used to determine the bias, or selective pressure, towards the most fit
individuals and the fitness of the others is determined by the following rules:
• MIN = 2.0 - MAX
• INC = 2.0 x (MAX - 1.0) / Nind
• LOW = INC / 2.0
where MIN is the lower bound, INC is the difference between the fitness of adjacent individuals
and LOW is the expected number of trials (number of times selected) of the least fit individual.
MAX is typically chosen in the interval [1.1, 2.0]. Hence, for a population size of Nind = 40 and
MAX = 1.1, we obtain MIN = 0.9, INC = 0.005 and LOW = 0.0025. The fitness of individuals in
the population may also be calculated directly as,
$$F(x_i) = 2 - MAX + 2 (MAX - 1) \frac{x_i - 1}{N_{ind} - 1}$$
where x_i is the position in the ordered population of individual i.
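The ranking rules and the direct fitness formula can be evaluated with a short Python sketch. The function names are assumptions; position 1 is taken to be the least fit individual and position Nind the most fit, which is the ordering under which the formula yields MIN for the worst and MAX for the best individual.

```python
def ranking_parameters(max_fit, n_ind):
    """Return (MIN, INC, LOW) from the selective pressure MAX and population size."""
    min_fit = 2.0 - max_fit
    inc = 2.0 * (max_fit - 1.0) / n_ind
    low = inc / 2.0
    return min_fit, inc, low

def rank_fitness(position, max_fit, n_ind):
    """Fitness of the individual at a 1-based position in the ordered population."""
    return 2.0 - max_fit + 2.0 * (max_fit - 1.0) * (position - 1) / (n_ind - 1)

min_fit, inc, low = ranking_parameters(1.1, 40)
fitness_values = [rank_fitness(pos, 1.1, 40) for pos in range(1, 41)]
```

Because the fitness values are spread linearly between MIN and MAX, they average to one, so their sum equals the population size.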
Selection
Selection is the process of determining the number of times, or trials, a particular individual is
chosen for reproduction and, thus, the number of offspring that an individual will produce. The
selection of individuals can be viewed as two separate processes:
1) Determination of the number of trials an individual can expect to receive, and
2) Conversion of the expected number of trials into a discrete number of offspring.
The first part is concerned with the transformation of raw fitness values into a real valued
expectation of an individual's probability to reproduce and is dealt with in the previous
subsection as fitness assignment. The second part is the probabilistic selection of individuals for
reproduction based on the fitness of individuals relative to one another and is sometimes known
as sampling. The remainder of this subsection will review some of the more popular selection
methods in current usage. Baker [63] presented three measures of performance for selection
algorithms, bias, spread and efficiency. Bias is defined as the absolute difference between an
individual's actual and expected selection probability. Optimal zero bias is therefore achieved
when an individual's selection probability equals its expected number of trials. Spread is the
range in the possible number of trials that an individual may achieve. If f(i) is the actual number
of trials that individual i receives, then the "minimum spread" is the smallest spread that
theoretically permits zero bias. Thus, while bias is an indication of accuracy, the spread of a
selection method measures its consistency. The desire for efficient selection methods is
motivated by the need to maintain a GA's overall time complexity. It has been shown in the
literature that the other phases of a GA (excluding the actual objective function evaluations) are
of O(Lind · Nind) or better time complexity, where Lind is the length of an individual and Nind is
the population size. The selection algorithm should thus achieve zero bias whilst maintaining a
minimum spread and not contributing to an increased time complexity of the GA.
Roulette Wheel Selection Methods
Many selection techniques employ a "roulette wheel" mechanism to probabilistically select
individuals based on some measure of their performance. A real-valued interval, Sum, is
determined as either the sum of the individuals' expected selection probabilities or the sum of
the raw fitness values over all the individuals in the current population. Individuals are then
mapped one-to-one into contiguous intervals in the range [0, Sum]. The size of each individual
interval corresponds to the fitness value of the associated individual. For example, in Fig. 19 the
circumference of the roulette wheel is the sum of all six individuals' fitness values. Individual 5
is the most fit individual and occupies the largest interval, whereas individuals 6 and 4 are the
least fit and have correspondingly smaller intervals within the roulette wheel. To select an
individual, a random number is generated in the interval [0, Sum] and the individual whose
segment spans the random number is selected. This process is repeated until the desired number
of individuals have been selected.
The basic roulette wheel selection method is stochastic sampling with replacement (SSR). Here,
the segment size and selection probability remain the same throughout the selection phase and
individuals are selected according to the procedure outlined above. SSR gives zero bias but a
potentially unlimited spread. Any individual with a segment size > 0 could entirely fill the next
population.
Figure 19: Roulette Wheel Selection
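Stochastic sampling with replacement can be sketched in Python as follows. The fitness values and the function name are assumptions for illustration; the wheel is represented by the cumulative sums of the fitness values, and each spin is located with a binary search.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel_select(fitness, n_select, rng=random):
    """Return the indices of n_select individuals chosen by SSR."""
    cumulative = list(accumulate(fitness))   # right edge of each segment in [0, Sum]
    total = cumulative[-1]
    chosen = []
    for _ in range(n_select):
        r = rng.random() * total             # spin: a point on the wheel
        index = bisect_right(cumulative, r)  # first segment whose right edge exceeds r
        chosen.append(min(index, len(fitness) - 1))  # guard against rounding at the top
    return chosen

# Three individuals; the first has a zero-width segment and cannot be selected.
picks = roulette_wheel_select([0.0, 1.0, 3.0], 1000, random.Random(1))
```

Segment sizes stay fixed for every spin, so the same individual may be chosen repeatedly, which is the source of the unlimited spread noted above.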
Crossover (Recombination)
The basic operator for producing new chromosomes in the GA is that of crossover. Like its
counterpart in nature, crossover produces new individuals that have some parts of both parents'
genetic material. The simplest form of crossover is that of single-point crossover, described in
the Overview of GAs. In this Section, a number of variations on crossover are described and
discussed and the relative merits of each reviewed.
Multi point Crossover
For multi-point crossover, m crossover positions, k_i ∈ {1, 2, ..., l-1}, where k_i are the crossover
points and l is the length of the chromosome, are chosen at random with no duplicates and sorted
into ascending order. The bits between successive crossover points are then exchanged between
the two parents to produce two new offspring. The section between the first allele position and
the first crossover point is not exchanged between individuals. This process is illustrated in Fig. 20.
Figure 20: Multi-point Crossover (m=5)
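Multi-point crossover can be sketched in Python as below. The function names and the example parents are assumptions for illustration; the first section is left unexchanged and every second section after it is swapped.

```python
import random

def multi_point_crossover(p1, p2, points):
    """Swap alternate sections of p1 and p2 delimited by the crossover points."""
    cuts = sorted(set(points))
    o1, o2 = [], []
    swap, prev = False, 0                # the first section is never exchanged
    for cut in cuts + [len(p1)]:
        a, b = p1[prev:cut], p2[prev:cut]
        o1.append(b if swap else a)
        o2.append(a if swap else b)
        swap = not swap
        prev = cut
    return "".join(o1), "".join(o2)

def random_points(length, m, rng=random):
    """Choose m distinct crossover positions in {1, ..., length-1}, sorted."""
    return sorted(rng.sample(range(1, length), m))

o1, o2 = multi_point_crossover("00000000", "11111111", [2, 5])
```

With a single crossover point the function reduces to single-point crossover.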
The idea behind multi-point, and indeed many of the variations on the crossover operator, is that
the parts of the chromosome representation that contribute most to the performance of a
particular individual may not necessarily be contained in adjacent substrings [64]. Further, the
disruptive nature of multi-point crossover appears to encourage the exploration of the search
space, rather than favoring the convergence to highly fit individuals early in the search, thus
making the search more robust [65].
Uniform Crossover
Single and multi-point crossover define cross points as places between loci where a
chromosome can be split. Uniform crossover [66] generalizes this scheme to make every locus a
potential crossover point.
A crossover mask, the same length as the chromosome structures, is created
at random and the parity of the bits in the mask indicates which parent will supply the offspring
with which bits. Consider the following two parents, crossover mask and resulting offspring:
P1   = 1011000111
P2   = 0001111000
Mask = 0011001100
O1   = 0011110100
O2   = 1001001011
Here, the first offspring, O1, is produced by taking the bit from P1 if the corresponding mask bit
is 1 or the bit from P2 if the corresponding mask bit is 0. Offspring O2 is created using the
inverse of the mask or, equivalently, swapping P1 and P2.
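The example above can be reproduced with a short Python sketch. The function names uniform_crossover and random_mask are hypothetical, and chromosomes are assumed to be strings of the characters 0 and 1. The random mask uses an equal chance of 0 or 1 at every locus, which is an assumption of this sketch.

```python
import random


def uniform_crossover(p1, p2, mask):
    """Build two offspring from two parents and a crossover mask.

    The first offspring takes the bit from p1 where the mask bit is 1 and
    from p2 where it is 0. The second offspring uses the inverse mask.
    """
    if not len(p1) == len(p2) == len(mask):
        raise ValueError("parents and mask must have the same length")
    o1 = "".join(a if m == "1" else b for a, b, m in zip(p1, p2, mask))
    o2 = "".join(b if m == "1" else a for a, b, m in zip(p1, p2, mask))
    return o1, o2


def random_mask(length, rng=None):
    """Random crossover mask with an equal chance of 0 or 1 at every locus."""
    rng = rng or random.Random()
    return "".join(rng.choice("01") for _ in range(length))


o1, o2 = uniform_crossover("1011000111", "0001111000", "0011001100")
print(o1)  # 0011110100
print(o2)  # 1001001011
```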
Uniform crossover, like multi-point crossover, has been claimed to reduce the bias associated
with the length of the binary representation used and the particular coding for a given parameter
set. This helps to overcome the bias in single-point crossover towards short substrings without
requiring precise understanding of the significance of individual bits in the chromosome
representation. Spears and De Jong [67] have demonstrated how uniform crossover may be
parameterised by applying a probability to the swapping of bits. This extra parameter can be
used to control the amount of disruption during recombination without introducing a bias
towards the length of the representation used. When uniform crossover is used with real-valued
alleles, it is usually referred to as discrete recombination.
Discussion
The binary operators discussed in this Section have all, to some extent, used disruption in the
representation to help improve exploration during recombination. Whilst these operators may be
used with real-valued populations, the resulting changes in the genetic material after
recombination would not extend to the actual values of the decision variables, although
offspring may, of course, contain genes from either parent. The intermediate and line
recombination operators overcome this limitation by acting on the decision variables themselves.
Like uniform crossover, the real-valued operators may also be parameterised to provide a
control over the level of disruption introduced into offspring. For discrete-valued
representations, variations on the recombination operators may be used that ensure that only
valid values are produced as a result of crossover [68]. The GA Toolbox provides a number of
crossover routines incorporating most of the methods described above. Single-point, double-
point and shuffle crossover are implemented in the Toolbox functions xovsp, xovdp and xovsh,
respectively, and can operate on any chromosome representation. Reduced surrogate crossover
is supported with both single-point, xovsprs, and double-point, xovdprs, crossover and with
shuffle crossover, xovshrs. A further general multi-point crossover routine, xovmp, is also
provided. To support real-valued chromosome representations, discrete, intermediate and line
recombination operators are also included. The discrete recombination operator, recdis,
performs crossover on real-valued individuals in a similar manner to the uniform crossover
operators. Line and intermediate recombination are supported by the functions reclin and recint
respectively. A high-level entry function to all of the crossover operators is provided by the
function recombin.
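The difference between intermediate and line recombination can be illustrated with the sketch below. It is not the code of recint or reclin. It assumes the commonly used formulation in which each offspring variable is P1 + alpha (P2 - P1), with the scaling factor alpha drawn uniformly from [-d, 1 + d] and d = 0.25; the function names and the default value of d are assumptions of this sketch.

```python
import random


def intermediate_recombination(p1, p2, d=0.25, rng=None):
    """Offspring variable i = p1[i] + alpha_i * (p2[i] - p1[i]).

    A new alpha is drawn for every decision variable, so the offspring can
    lie anywhere in a hypercube slightly larger than the one spanned by
    the parents.
    """
    rng = rng or random.Random()
    return [a + rng.uniform(-d, 1 + d) * (b - a) for a, b in zip(p1, p2)]


def line_recombination(p1, p2, d=0.25, rng=None):
    """Same rule, but one alpha is shared by all decision variables.

    The offspring therefore lies on the line through the two parents.
    """
    rng = rng or random.Random()
    alpha = rng.uniform(-d, 1 + d)
    return [a + alpha * (b - a) for a, b in zip(p1, p2)]
```

Unlike discrete recombination, both operators can produce values of the decision variables that are present in neither parent.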
Mutation
In natural evolution, mutation is a random process where one allele of a gene is replaced by
another to produce a new genetic structure. In GAs, mutation is randomly applied with low
probability, typically in the range 0.001 to 0.01, and modifies elements in the chromosomes.
Usually considered as a background operator, the role of mutation is often seen as providing a
guarantee that the probability of searching any given string will never be zero and acting as a
safety net to recover good genetic material that may be lost through the action of selection and
crossover [49]. The effect of mutation on a binary string is illustrated in Fig. 21 for a 10-bit
chromosome representing a real value decoded over the interval [0, 10] using both standard and
Gray coding and a mutation point of 3 in the binary string. Here, binary mutation flips the value
of the bit at the locus selected to be the mutation point. Given that mutation is generally applied
uniformly to an entire population of strings, it is possible that a given binary string may be
mutated at more than one point. With non-binary representations, mutation is achieved by either
perturbing the gene values or random selection of new values within the allowed range. Janikow
and Michalewicz [69] demonstrate how real-coded GAs may take advantage of higher mutation
rates than binary-coded GAs, increasing the level of possible exploration of the search space
without adversely affecting the convergence characteristics. Indeed, Tate and Smith [70] argue
that for codings more complex than binary, high mutation rates can be both desirable and
necessary and show how, for a complex combinatorial optimization problem, high mutation
rates and non-binary coding yielded significantly better solutions than the normal approach.
                  (mutation point = 3)     binary    Gray
Original string - 0 0 0 1 1 0 0 0 1 0      0.9659    0.6634
Mutated string  - 0 0 1 1 1 0 0 0 1 0      2.2146    1.8439
Figure 21: Binary Mutation
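Binary mutation as shown in Fig. 21 can be sketched in Python as follows. The function names flip_bit and mutate_bits are hypothetical and are not the Toolbox function mut. Chromosomes are assumed to be strings of the characters 0 and 1, and the mutation point is counted from 1 at the left-hand end of the string.

```python
import random


def flip_bit(bits, point):
    """Flip the bit at the given mutation point (1-based, from the left)."""
    i = point - 1
    flipped = "1" if bits[i] == "0" else "0"
    return bits[:i] + flipped + bits[i + 1:]


def mutate_bits(bits, rate=0.01, rng=None):
    """Flip each bit independently with probability `rate`.

    Because every bit is tested separately, a string may be mutated at
    more than one point, or not at all.
    """
    rng = rng or random.Random()
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < rate else b
        for b in bits
    )


print(flip_bit("0001100010", 3))  # 0011100010
```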
Many variations on the mutation operator have been proposed. Examples include biasing the
mutation towards individuals with lower fitness values, to increase the exploration in the
search without losing information from the fitter individuals [71], and
parameterising the mutation such that the mutation rate decreases with the population
convergence [72]. Muhlenbein [69] has introduced a mutation operator for the real-coded GA
that uses a non-linear term for the distribution of the range of mutation applied to gene values. It
is claimed that by biasing mutation towards smaller changes in gene values, mutation can be
used in conjunction with recombination as a foreground search process. Other mutation
operations include that of trade mutation [55], whereby the contribution of individual genes in a
chromosome is used to direct mutation towards weaker terms, and reorder mutation [55], that
swaps the positions of bits or genes to increase diversity in the decision variable space. Binary
and integer mutation are provided in the Toolbox by the function mut. Real-valued mutation is
available using the function mutbga. A high-level entry function to the mutation operators is
provided by the function mutate.
Reinsertion
Once a new population has been produced by selection and recombination of individuals from
the old population, the fitness of the individuals in the new population may be determined. If
fewer individuals are produced by recombination than the size of the original population, then
the fractional difference between the new and old population sizes is termed a generation gap
[73]. In the case where the number of new individuals produced at each generation is one or two,
the GA is said to be steady-state [74] or incremental [75]. If one or more of the most fit
individuals is deterministically allowed to propagate through successive generations then the
GA is said to use an elitist strategy. To maintain the size of the original population, the new
individuals have to be reinserted into the old population. Similarly, if not all the new individuals
are to be used at each generation or if more offspring are generated than the size of the old
population then a reinsertion scheme must be used to determine which individuals are to exist in
the new population. An important feature of not creating more offspring than the current
population size at each generation is that the generational computational time is reduced, most
dramatically in the case of the steady-state GA, and that the memory requirements are smaller as
fewer new individuals need to be stored while offspring are produced. When selecting which
members of the old population should be replaced the most apparent strategy is to replace the
least fit members deterministically. However, in studies, Fogarty [74] has shown that no
significant difference in convergence characteristics was found when the individuals selected for
replacement were chosen with inverse proportional selection or deterministically as the least fit.
He further asserts that replacing the least fit members effectively implements an elitist strategy
as the most fit will probabilistically survive through successive generations. Indeed, the most
successful replacement scheme was one that selected the oldest members of a population for
replacement. This is reported as being more in keeping with generational reproduction as every
member of the population will, at some time, be replaced. Thus, for an individual to survive
successive generations, it must be sufficiently fit to ensure propagation into future generations.
The GA Toolbox provides a function for reinserting individuals into the population after
recombination, reins. Optional input parameters allow the use of either uniform random or
fitness-based reinsertion. Additionally, this routine can also be selected to reinsert fewer
offspring than those produced at recombination.
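Fitness-based reinsertion can be sketched as follows. This is an illustration only and does not reproduce the interface of reins: the function name reinsert and its arguments are hypothetical, and a larger fitness value is assumed to be better. The best n_insert offspring replace the least fit members of the old population, so the population size is unchanged and the fittest old individuals survive.

```python
def reinsert(old_pop, old_fit, offspring, off_fit, n_insert=None):
    """Replace the least fit old members by the best offspring.

    Returns the new population and its fitness values. If n_insert is
    smaller than the number of offspring, only the best offspring are
    reinserted and the remainder are discarded.
    """
    if n_insert is None:
        n_insert = len(offspring)
    n_insert = min(n_insert, len(offspring), len(old_pop))
    # indices of the offspring, best first
    best_off = sorted(range(len(offspring)),
                      key=lambda i: off_fit[i], reverse=True)[:n_insert]
    # indices of the old population, least fit first
    worst_old = sorted(range(len(old_pop)),
                       key=lambda i: old_fit[i])[:n_insert]
    new_pop, new_fit = list(old_pop), list(old_fit)
    for slot, src in zip(worst_old, best_off):
        new_pop[slot] = offspring[src]
        new_fit[slot] = off_fit[src]
    return new_pop, new_fit
```

For an old population a, b, c, d with fitness 4, 1, 3, 2 and offspring x, y, z with fitness 5, 0, 6, reinserting two offspring gives the population a, z, c, x.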
Termination of the GA
Because the GA is a stochastic search method, it is difficult to formally specify convergence
criteria. As the fitness of a population may remain static for a number of generations before a
superior individual is found, the application of conventional termination criteria becomes
problematic. A common practice is to terminate the GA after a prespecified number of
generations and then test the quality of the best members of the population against the problem
definition. If no acceptable solutions are found, the GA may be restarted or a fresh search
initiated.
Outline of the basic algorithm
0 START : Create a random population of n chromosomes
1 FITNESS : Evaluate the fitness f(x) of each chromosome in the population
2 NEW POPULATION
    0 SELECTION : Select parents based on f(x)
    1 RECOMBINATION : Cross over the chromosomes
    2 MUTATION : Mutate the chromosomes
    3 ACCEPTANCE : Accept or reject the new chromosome
3 REPLACE : Replace the old population with the new population (the new generation)
4 TEST : Test the termination criterion
5 LOOP : Repeat steps 1 to 4 until the criterion is satisfied
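The outline above can be turned into a small runnable Python sketch. It is only one possible reading of the outline and not the procedure used later in this work. The following are assumptions made for illustration: a binary chromosome, roulette wheel selection, single-point crossover, bit-flip mutation, an elitist strategy that copies the best individual into each new generation, and termination after a fixed number of generations. The function name run_ga and all parameter values are hypothetical.

```python
import random


def run_ga(fitness, n_bits=20, pop_size=30, generations=60,
           p_cross=0.7, p_mut=0.01, seed=0):
    """Basic GA that maximises a non-negative `fitness` over bit lists."""
    rng = random.Random(seed)
    # Step 0: create a random population of chromosomes
    pop = [[rng.randint(0, 1) for _ in range(n_bits)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    # Steps 4 and 5: stop after a prespecified number of generations
    for _ in range(generations):
        # Step 1: evaluate the fitness of every chromosome
        fits = [fitness(ind) for ind in pop]
        weights = [f + 1e-9 for f in fits]  # avoids an all-zero wheel
        new_pop = [best[:]]  # elitist strategy (assumption of this sketch)
        while len(new_pop) < pop_size:
            # Step 2.0: roulette wheel selection of two parents
            a, b = rng.choices(pop, weights=weights, k=2)
            c1, c2 = a[:], b[:]
            # Step 2.1: single-point crossover with probability p_cross
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                c1, c2 = a[:cut] + b[cut:], b[:cut] + a[cut:]
            for child in (c1, c2):
                # Step 2.2: flip each bit with probability p_mut
                for i in range(n_bits):
                    if rng.random() < p_mut:
                        child[i] ^= 1
                # Step 2.3: accept the child while there is room
                if len(new_pop) < pop_size:
                    new_pop.append(child)
        # Step 3: the new population replaces the old one
        pop = new_pop
        best = max(pop, key=fitness)
    return best, fitness(best)


# Toy problem: maximise the number of 1 bits in the chromosome
best, score = run_ga(sum)
print(best, score)
```

Because the best individual is always carried over, the best fitness in the population cannot decrease from one generation to the next. If no acceptable solution is found within the generation limit, the search can simply be run again with a different seed.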