

Static and Dynamic Modelling of Materials Forging

Coryn A.L. Bailer-Jones, David J.C. MacKay
Cavendish Laboratory, University of Cambridge

email: [email protected]; [email protected]

Tanya J. Sabin and Philip J. Withers
Department of Materials Science and Metallurgy, University of Cambridge

email: [email protected]; [email protected]

    ABSTRACT

The ability to model the thermomechanical processing of materials is an increasingly important requirement in many areas of engineering. This is particularly true in the aerospace industry where high material and process costs demand models that can reliably predict the microstructures of forged materials. We analyse two types of forging, cold forging in which the microstructure develops statically upon annealing, and hot forging for which it develops dynamically, and present two different models for predicting the resultant material microstructure. For the cold forging problem we employ the Gaussian process model. This probabilistic model can be seen as a generalisation of feedforward neural networks with equally powerful interpolation capabilities. However, as it lacks weights and hidden layers, it avoids ad hoc decisions regarding how complex a 'network' needs to be. Results are presented which demonstrate the excellent generalisation capabilities of this model. For the hot forging problem we have developed a type of recurrent neural network architecture which makes predictions of the time derivatives of state variables. This approach allows us to simultaneously model multiple time series operating on different time scales and sampled at non-constant rates. This architecture is very general and likely to be capable of modelling a wide class of dynamic systems and processes.

Australian Journal on Intelligent Information Processing Systems, 5(1), 10–16, 1998

    1. Introduction

The problem in the modelling of materials forging can be broadly stated as follows: given a certain material which undergoes a specified forging process, what are the final properties of this material? Typical final properties in which we are interested are the microstructural properties, such as the mean grain size and shape and the extent of grain recrystallisation. Relevant forging process control variables are the strain, strain rate and temperature, all of which may be functions of time.

A trial-and-error approach to solving this problem has often been taken in the materials industry, with many different forging conditions attempted to achieve a given final product. The obvious drawbacks of this approach are large time and financial costs and the lack of any reliable predictive capability. Another method is to develop a parameterised, physically-motivated model, and to solve for the parameters using empirical data [2]. However, the limitation with this approach is that in terms of the physical theory the microstructural evolution depends upon several "intermediate" microscopic variables which have to be measured in order to apply the model. Some of these variables, such as dislocation density, are difficult and time-consuming to measure, making it impracticable to apply such an approach to large-scale industrial processes.

Our approach to the prediction of forged microstructures is therefore to develop an empirical model in which we define a parameterised, non-linear relationship between the microstructural variables of interest and those easily measured process variables. Such a model could be implemented, for example, as a neural network with the hidden nodes essentially playing a role analogous to the "intermediate" microscopic variables.

    2. Materials Forging

When a material is deformed, potential energy is put into the system by virtue of work having been done to move crystal planes relative to one another. The material is therefore not in equilibrium and has a tendency to lower its potential energy by atomic rearrangement, through the competing processes of recovery, recrystallisation and grain growth. These processes are encouraged by raising the temperature of the material (annealing). Forge deformation processes can be divided into two classes. In cold working the recrystallisation rate is so low that recrystallisation essentially does not occur during forging. Recrystallisation is subsequently


Fig. 1: Deformation geometries. (a) (left) Plane-strain diametrical compression. The workpiece is subsequently sectioned into many nominally identical specimens which are annealed at different combinations of temperature and time. The compression gives rise to a non-linear distribution of strains across the specimen (see Figure 2b). These allow us to obtain many input training vectors, w, for our model using a single compression test. (b) (right) Axisymmetric axial compression.

achieved statically by annealing. In contrast, hot working refers to the high temperature forging of materials in which recrystallisation occurs dynamically during forging. This process is considerably more complex than cold working as now the final microstructure of the material is generally a path-dependent function of the history of the process variables. This is particularly true of the Aluminium–Magnesium alloy considered here, which has a relatively long 'memory' of the process, thus necessitating a model which keeps track of the history of the material. We shall consider a model for this dynamic process in Section 4.

The ultimate goal of forge modelling is the inverse problem: given a set of desired final properties for a component, what is the optimal material and forging process which will realise these properties? This is a considerably harder problem since there may be a one-to-many mapping between the desired properties and the necessary forging process. This problem will not be addressed in this paper.

    3. Static Modelling

Cold forging can in general be modelled with the equation

x = f(w)    (1)

where x is a microstructural variable, w is the set of process variables and f is some non-linear function. In our particular implementation we are interested in predicting a single microstructural variable, namely grain size, in a given material (an Al-1%Mg alloy) as a function of the total strain, ε, annealing temperature, T, and annealing time, t. The experimental set-up for obtaining these data is as follows. A workpiece of the material is compressed in plane-strain compression at room temperature, as shown in Figure 1a. After the specimen has been annealed, it is etched and the grain sizes measured with

Fig. 2: (a) The left half of this diagram shows the microstructure of half of a sectioned specimen which has been deformed under a plane-strain compression. The material has been annealed for 30 mins, producing many recrystallised grains. (b) The right half of this diagram is the corresponding strain contour map produced by the Finite Element model. Note that the areas of high strain in (b) correspond to small grains in (a).

an optical microscope. The local strain experienced at each point in the material is evaluated using a Finite Element (FE) model, the parameters of this model being determined by the known material properties, forging geometries, friction factors and so on. Figure 2b shows an example of an FE map. Many grain sizes within a single small area are averaged to give a mean grain size. Thus we now have a set of model inputs, ε, T and t, associated with a single mean grain size which can be used to develop a static microstructural model of forging. Further details of the experimental procedure can be found in Sabin et al. [9].

    3.1. The Gaussian Process Model

The Gaussian process model [4][10] assumes that the prior joint probability distribution of a set of any N observations is given by an N-dimensional Gaussian, i.e.

P(x_N | {w_n}, μ, C_N)    (2)

  = (1/Z) exp( −½ (x_N − μ)^T C_N^{−1} (x_N − μ) )    (3)

where x_N = ( x(w_1), x(w_2), …, x(w_N) )^T is the set of N observations corresponding to the set of N input vectors {w_n} = {w_1, w_2, …, w_N}. μ and C_N, respectively the mean and covariance matrix for the distribution, parameterise this model. The elements of the covariance matrix are specified by the covariance function, which is a function of the input vectors {w_n} and a set of hyperparameters. A typical form of the covariance function is

C_pq = θ1 exp( −½ Σ_{i=1…L} (w_i^{(p)} − w_i^{(q)})² / r_i² ) + θ2 + θ3 δ_pq    (4)

This equation gives the covariance between any two values x_p and x_q with corresponding L-dimensional input vectors w_p and w_q respectively, and is capable of implementing a wide class of functions, f, that could appear in equation 1. (The Gaussian process model has a scalar 'output', x; to model several microstructural variables we would use several independent models.) The first


term in equation 4 expresses our belief that the function we are modelling is smoothly varying, where r_i is the length scale over which the function varies in the i-th input dimension. The second term allows the functions to have a constant offset and the third is a noise term: this particular form is a model for input-independent Gaussian noise. The hyperparameters r_i (i = 1…L), θ1, θ2 and θ3 specify the function, and are generally inferred from a set of training data in a fashion analogous to training a neural network. They are called hyperparameters rather than parameters because they explicitly parameterise a probability distribution rather than the function itself. This distinguishes them from weights in a neural network, which are rather "arbitrary", in that adding another hidden node could change the weights yet leave the input–output mapping essentially unaltered.

Once the hyperparameters are known, the probability distribution of a new predicted value, x_{N+1}, corresponding to a new 'input' variable, w_{N+1}, is

P(x_{N+1} | x_N, {w_n}, w_{N+1}, C_{N+1})    (5)

  ∝ exp( −(x_{N+1} − x̂_{N+1})² / (2 σ²_{x̂}) )    (6)

i.e. a one-dimensional Gaussian, where x̂_{N+1} and σ_{x̂} are evaluated in terms of the covariance function and the training data. We would typically report our prediction as x̂_{N+1} ± σ_{x̂}. These errors reflect both the noise in the data (third term in equation 4) and the model uncertainty in interpolating the training data. The fact that the Gaussian process model naturally produces confidence intervals on its predictions is important in the materials industry where material properties must often be specified within certain tolerances.
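The covariance function of equation 4 and the predictive mean and error of equations 5 and 6 can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation; the hyperparameter values and the toy one-dimensional data are invented purely for the demonstration (the prior mean μ is taken as zero).

```python
import numpy as np

def cov(W1, W2, r, th1, th2, th3, noise=False):
    """Covariance matrix (equation 4) between two sets of input vectors (rows)."""
    d = (W1[:, None, :] - W2[None, :, :]) / r            # scaled input differences
    K = th1 * np.exp(-0.5 * np.sum(d ** 2, axis=-1)) + th2
    if noise:                                            # theta3 * delta_pq term
        K += th3 * np.eye(len(W1))
    return K

def gp_predict(W, x, Wstar, r, th1, th2, th3):
    """Predictive mean and 1-sigma error at new inputs Wstar (zero prior mean)."""
    C = cov(W, W, r, th1, th2, th3, noise=True)          # training covariance C_N
    k = cov(W, Wstar, r, th1, th2, th3)                  # cross-covariances
    mean = k.T @ np.linalg.solve(C, x)
    kss = cov(Wstar, Wstar, r, th1, th2, th3, noise=True)
    var = np.diag(kss) - np.sum(k * np.linalg.solve(C, k), axis=0)
    return mean, np.sqrt(var)

# toy 1-D demo: noisy samples of a smooth function
rng = np.random.default_rng(0)
W = np.linspace(0, 1, 20)[:, None]
x = np.sin(2 * np.pi * W[:, 0]) + 0.05 * rng.standard_normal(20)
mean, err = gp_predict(W, x, np.array([[0.25]]), r=np.array([0.2]),
                       th1=1.0, th2=0.1, th3=0.05 ** 2)
```

Because the returned variance includes the θ3 noise term, the reported error bar plays the role of σ_{x̂} in equation 6: it combines data noise with interpolation uncertainty, growing away from the training inputs.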

Our model assumes that the measurement noise and the prior probability of the unknown function can be described by a Gaussian distribution. In our application it is more sensible to assume that it is the logarithm of the grain sizes which is distributed as a Gaussian, rather than the grain sizes themselves. This is because uncertainties in measuring grain size scale with the mean grain size, and are therefore more appropriately expressed as a fraction of the mean grain size rather than a fixed absolute grain size. Moreover, empirical evidence suggests that grain size distributions are well described by a log-normal distribution [7].
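The practical consequence of working in log space can be illustrated in a couple of lines: a symmetric ± σ interval on the log grain size becomes a multiplicative error bar on the grain size itself. The numerical values here are hypothetical, chosen only to show the transform.

```python
import numpy as np

# Hypothetical GP prediction made in log space: mean and 1-sigma error of
# log(grain size). Exponentiating gives the median of the implied log-normal
# distribution, with multiplicative (asymmetric) error bars.
mean_log, sigma_log = 2.5, 0.1
d_median = np.exp(mean_log)            # median grain size
d_lo = d_median * np.exp(-sigma_log)   # lower 1-sigma bound
d_hi = d_median * np.exp(sigma_log)    # upper 1-sigma bound
```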

    3.2. Model Predictions

A Gaussian process model was trained using a set of 46 data pairs obtained from the plane-strain geometry, with the strain ε, the annealing temperature T and the annealing time t (1 min ≤ t ≤ 60 min) as the inputs. Once trained, the model was used to produce predictions of grain sizes for a range of the input variables. These predictions, shown in Figure 3, agree well with metallurgical expectations.

One of the assumptions implicit in our model of cold forging is that given the local strain conditions, the microstructure is independent of the material shape and

Fig. 3: Grain size predictions obtained with the Gaussian process model trained on data from the plane-strain compression geometry. In each of the three plots, two of the input variables are held constant at fixed reference values and the other varied. The crosses in the strain plot are data from the training set. As the Gaussian process is an interpolation model, predictions at any values of the inputs are constrained by the entire training set.

forge geometry. In other words, we assume that predictions can be obtained given only the local accumulated strain (and annealing conditions). This is an important requirement as it means that a single model could be applied to a range of industrial forging geometries, provided that the local strains could be obtained (e.g. with an FE model). We tested the validity of this assumption by using the Gaussian process model trained on plane-strain data to predict grain sizes in a material compressed using a different geometry, namely an axial compression (Figure 1b). As before, after compression the material was annealed, sectioned and grain sizes measured. A new FE model gave the concomitant local strains. These process inputs were then used to obtain predictions of the grain sizes using the previous Gaussian process model. Figure 4 plots these predictions against the measurements. We see remarkable agreement, well within the predicted errors, thus validating our modelling approach. A practical application of our model is to produce diagrams such as that shown in Figure 5, a map of the grain sizes. Such a map is important for engineers who need to know the grain sizes at different points in the material, and can thus assess its resistance to phenomena such as creep and fatigue.

It should be noted that this method contains other implicit assumptions. The final material microstructure is very strongly dependent upon the material composition. It is well known that even small changes in the fractions of the alloying constituents (and, by extension, impurities) can have a strong effect on the thermomechanical processing of the material. One way forward is to include further input variables corresponding to composition [3]. A second implicit assumption has been the constancy of the initial microstructure. Depending upon the material and the degree of thermomechanical processing, the final microstructure may retain some 'memory' of its initial microstructure, thus necessitating a model which has "initial conditions" as additional input variables.

Fig. 4: Gaussian process model predictions compared with measured values. The Gaussian process model was trained on data from one compressional geometry (plane-strain) and its performance evaluated using data from another geometry (axial compression) which was not seen during training. The line is to guide the eye. Note that not even a perfect model would produce predictions on this line due to finite noise in the data.

Fig. 5: The left half is an image of the microstructure in the axially compressed specimen. The right half is the corresponding grain size predictions from the Gaussian process model shown as a contour map.

4. A Recurrent Neural Network for Dynamic Process Modelling

    4.1. Model Description

For the hot working problem, we assume that there are two sets of variables which are relevant in describing the behaviour of the dynamical system. The first, x, are external variables which influence the behaviour of the system, such as the strain, strain rate and temperature. It is assumed that all of these can be measured. The second set of variables, y, are the state variables which describe the system itself. These are split into


Fig. 6: A recurrent neural network architecture ('dynet') for modelling dynamical systems. The outputs, z, from the network are the time derivatives of the state variables of the dynamical system. The recurrent inputs, y, are these state variables. The values of y at the next time step are evaluated (using equation 9) from the outputs (via the recurrent connections) and the previous values of y. All connections and the two bias nodes are shown.

two categories. The first are measured, such as grain size, and the second are unmeasured, such as dislocation density. Note that the unmeasured variables are not intrinsically unmeasurable: this is simply a category for all of the state variables which we believe to be relevant but which, for whatever reason, we do not measure.

Both x and y are functions of time. A general dynamical equation which describes the temporal evolution of the state variables in response to the external variables is

dy(t)/dt = g( x(t), y(t) )    (7)

where g is some non-linear function. To a first-order approximation, we can write

y(t + δt) = y(t) + (dy(t)/dt) δt.    (8)

This dynamical system can be modelled with the recurrent neural network architecture shown in Figure 6. This is a discrete-time network in which the input data are provided as a discrete list of values separated by known time intervals. The input–output mapping of this network implements equation 7 directly: rather than producing the state variables at the output of the network, as is often the case with recurrent networks (e.g. [8]), we produce the time derivatives of the state variables, for reasons that are expounded on below. The hidden nodes compute a non-linear function of both the external and the recurrent inputs with a sigmoid function (e.g. tanh), as conventionally used in feedforward networks. A linear hidden–output function is used to allow for an arbitrary scale of the outputs. The recurrent part of the dynamical system, viz. equation 8, is implemented with the recurrent loops shown in Figure 6, by setting the weights of these recurrent loops to the size of the time


step, δt(n), between successive epochs. Explicitly, the j-th recurrent input at time step n is given by

y_j(n) = y_j(n−1) + z_j(n−1) δt(n)    (9)

where z_j(n−1) = dy_j(n−1)/dt and δt(n) is the time between epoch (n−1) and epoch n.

The principal reason for developing a network which predicts the time derivatives of the state variables is that it can be trained on time-series data in which the separations between the epochs, δt(n), need not be constant: at each epoch n we simply set the weights of the recurrent feedback loops to δt(n). Furthermore, the network can be trained on multiple time series in which the time scales for each time series may be very different. This is important in forging applications as the forging of large components would occur over a longer time scale than for small components, whereas the microscopic behaviour of the materials would essentially be the same (for a given material). In such a case we would want to incorporate data from both forgings into the same model, but without having to obtain measurements at the same rate in both cases.
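The forward pass of such a network (equations 7–9) can be sketched as follows. This is a minimal illustration of the architecture with invented, untrained random weights, not the authors' code: a tanh hidden layer takes the external and recurrent inputs, a linear output layer produces the derivatives z = dy/dt, and an Euler step with a per-epoch δt closes the recurrent loop.

```python
import numpy as np

def dynet_forward(x_seq, dt_seq, y0, W_in, b_h, W_out, b_out):
    """Run one time series. x_seq is (T, n_x) external inputs, dt_seq is (T,)
    time steps (need not be constant), y0 the initial state variables."""
    y = y0.copy()
    ys = [y.copy()]
    for x_t, dt in zip(x_seq, dt_seq):
        h = np.tanh(W_in @ np.concatenate([x_t, y]) + b_h)  # hidden layer (tanh)
        z = W_out @ h + b_out         # network output z = dy/dt (linear output)
        y = y + z * dt                # Euler update, equation 9, variable dt
        ys.append(y.copy())
    return np.array(ys)

rng = np.random.default_rng(1)
n_x, n_y, n_h = 2, 2, 6               # illustrative sizes
W_in = 0.1 * rng.standard_normal((n_h, n_x + n_y))
W_out = 0.1 * rng.standard_normal((n_y, n_h))
x_seq = rng.random((81, n_x))         # 81 epochs of external inputs
dt_seq = np.full(81, 0.1)             # could vary from epoch to epoch
ys = dynet_forward(x_seq, dt_seq, np.zeros(n_y),
                   W_in, np.zeros(n_h), W_out, np.zeros(n_y))
```

Because δt enters only through the recurrent weights, training data sampled at different (even irregular) rates can be mixed freely: only `dt_seq` changes between series.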

While our network is similar to that of Jordan [5], our architecture has the important attributes that: 1) the outputs are time derivatives of the state variables, and 2) in training the network the error derivatives can be propagated via the recurrent connections to the arbitrarily distant past. Our training algorithm can be seen as a generalisation of the method described by Williams & Zipser [11] extended to multiple time series. Although necessarily only the feedforward weights are trainable, the input–hidden weights, for example, are nonetheless dependent upon the values of the hidden nodes by virtue of the recurrent connections, and this dependency is taken into account. Training proceeds by minimizing an error function, typically the sum of squares error, by gradient descent or a conjugate gradient algorithm. The weights can be updated after each epoch of each time series (i.e. Real Time Recurrent Learning [11]), after all epochs of all patterns, or at any intermediate point.

We use a Bayesian implementation of the training procedure [6]: multiple weight decay constants regularize the determination of the weights, and a noise term specifies the assumed noise levels in the targets (y). Some form of regularisation is probably essential on account of the many intermediate output values for which there are no targets. The noise and weight decay hyperparameters are currently set by hand, but could in principle be inferred from the data.

To train the network we need at least one target value at at least one epoch. Note that the training algorithm is not restricted to use targets only for the 'outputs': errors can be propagated from any node. Generally we would have values of the state variables (recurrent inputs) for the final epoch. However, in metallurgical applications we may sometimes be able to obtain additional measurements at some intermediate epochs, thus improving the accuracy of the derived input–output function. We

will of course not have any target values for the 'unmeasured' state variables. Hence these variables may not even correspond to any physical variables, instead acting as 'hidden' variables which convey some state information not contained in the 'measured' state variables. Nonetheless we may be able to provide some loose physical interpretation for unmeasured variables.

Once trained, the network produces a complete time sequence for all state variables given a sequence of the external inputs, i.e. the details of the forging process.

    4.2. Model Demonstration

We now demonstrate the performance of the model on one particular synthetic problem in which there are two external input variables, x1(t) and x2(t), and two state variables, y1(t) and y2(t), defined by

dy1/dt = x1 − 2 y1 + 4 y2 − x1 y1    (10)

dy2/dt = x2 − y1 + y2 − x2 y2    (11)

To mimic real processes in which the external inputs are often constrained to be positive (e.g. temperature), the x1 and x2 sequences were generated from constrained random walks: at each epoch, x1 (x2) changes with a probability of 0.1 (0.5) by a random amount uniformly distributed between −0.5 and +0.5 (−1 and +1). The modulus of x is taken to ensure a positive sequence. The initial y values were randomly selected from a uniform distribution between −1 and +1.

Equations 10 and 11 are not analytically solvable, so we used the first-order Taylor expansion in equation 8 to evaluate the y1(t) and y2(t) sequences. A very small value of δt was used to yield an accurate approximation to the true continuous sequence.

One hundred synthetic time series were generated in this fashion, with different sequences for x1 and x2 and different initial values of y1 and y2. The sequences were generated from t = 0 to t = 8 inclusive. We chose to sample these sequences at a constant epoch separation of δt = 0.1, giving a sequence length of 81 epochs. This time extent and value of δt samples the dynamical sequence well: if all x dependence is removed in equations 10 and 11, we are left with damped sinusoidal oscillations in y1 and y2, with oscillation period 1 and damping timescale 2. The x terms make the series somewhat more complex, but the overall oscillatory behaviour is retained.
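The data generation just described can be sketched as follows. The probabilities, step ranges and sampling grid follow the text; the coefficients of the coupled system are our reading of equations 10 and 11 from a degraded scan, so treat them as illustrative. A fine internal Euler step stands in for the "very small δt" integration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_series(t_end=8.0, dt_out=0.1, dt_fine=0.001):
    """One synthetic time series: constrained random walks for x1, x2 and
    fine-step Euler integration of the coupled ODEs, sampled every dt_out."""
    n_out = int(round(t_end / dt_out)) + 1          # 81 epochs for t in [0, 8]
    x1, x2 = rng.random(), rng.random()             # positive starting inputs
    y1, y2 = rng.uniform(-1, 1), rng.uniform(-1, 1) # initial state variables
    xs, ys = [], []
    for _ in range(n_out):
        # external inputs: rare/frequent random-walk steps, kept positive
        if rng.random() < 0.1:
            x1 = abs(x1 + rng.uniform(-0.5, 0.5))
        if rng.random() < 0.5:
            x2 = abs(x2 + rng.uniform(-1.0, 1.0))
        xs.append((x1, x2)); ys.append((y1, y2))
        for _ in range(int(round(dt_out / dt_fine))):   # fine Euler steps
            dy1 = x1 - 2 * y1 + 4 * y2 - x1 * y1        # equation 10 (as read)
            dy2 = x2 - y1 + y2 - x2 * y2                # equation 11 (as read)
            y1, y2 = y1 + dy1 * dt_fine, y2 + dy2 * dt_fine
    return np.array(xs), np.array(ys)

xs, ys = make_series()
```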

The network was trained on 50 of the time series. In doing so, the network was only given the initial and final y values and the complete x sequences for each time series: it did not see the intermediate y values. Once trained, the network was applied to the other 50 sequences, i.e. given the initial y values and the x sequences, the network produces sequences for y. We see from Figure 7 that the network makes excellent predictions of y1 and y2 at the final epoch.


Fig. 7: Network predictions of the final values of the y1 and y2 sequences for the 50 time series in the test data set plotted against the target values. The rms errors are 0.0067 and 0.0084 for y1 and y2 respectively.

It is now interesting to ask whether the network has managed to correctly learn the entire y sequences for the test data, or whether it has learned some simpler input–output function which just gives correct y values at the final epoch. Figure 8 shows the x input sequences, and the predicted y sequences in comparison with the true sequences: the agreement is generally very good. Some of the predicted sequences (such as the lower one shown) deviate from the true sequence at early epochs. This is probably due to the difference between the stationary distributions of the state variables at t = 8 and at early epochs: the network is explicitly trained to give predictions for the target values of y. These lie in the ranges shown in Figure 8, which is narrower than the range of y values at t = 0. Therefore, we would not expect the network to give such good predictions in the non-overlapping part of this latter range. As the dynamical system is damped, the poorer predictions tend to occur earlier on. Nonetheless, we see that the network has learned a good approximation of the true underlying dynamical sequence for the range of y values present in the targets.

The network has been applied to a number of other synthetic problems (including ones with non-linear, e.g. y1y2, terms), and a good predictive capability is observed. The network has also been able to learn from training data sets in which δt varies within each sequence, and the lengths of the sequences differ [1].

Fig. 8: External input and state variable sequences for two different time series in the test data set. In each plot the solid lines from top to bottom are y1, y2, x1 and x2. For y1 and y2 the solid lines are the network predictions and the dashed lines the true sequences. For clarity, the sequences for y2, x1 and x2 have been offset from their true positions by −1, −3 and −6 respectively.

    5. Summary

We have introduced two techniques for modelling materials forging, one in the static domain relevant to cold forging, and another in the dynamic domain relevant to hot forging. The Gaussian process model used in the former case has been demonstrated to provide a good predictive capability, even when using a small, noisy data set. Furthermore, the sufficiency of our input variables (for a given material) has been demonstrated from the successful application of the trained Gaussian process model to a different forging geometry.

For the hot forging problem we have introduced a general neural network architecture for modelling dynamical processes. The power of this network was demonstrated on a synthetic problem. Further details of this model, as well as its application to a range of synthetic problems, will be presented in a forthcoming paper [1]. Future work with this model will focus on its application to real data sets obtained from hot forging.


Acknowledgements

The authors are grateful to the EPSRC (grant number GR/L10239), DERA and INCO Alloys Ltd. for financial support and to Mark Gibbs for use of his Gaussian process software.

    References

[1] C.A.L. Bailer-Jones, D.J.C. MacKay, in preparation, 1998.

[2] T. Furu, H.R. Shercliff, C.M. Sellars, M.F. Ashby, "Physically-based modelling of strength, microstructure and recrystallisation during thermomechanical processing of Al–Mg alloys", Materials Sci. Forum, 217–222, pp. 453–458, 1996.

[3] L. Gavard, H.K.D.H. Bhadeshia, D.J.C. MacKay, S. Suzuki, "Bayesian neural network model for austenite formation in steels", Materials Sci. Technol., vol. 12, pp. 453–463, 1996.

[4] M.N. Gibbs, Bayesian Gaussian processes for regression and classification. PhD thesis, University of Cambridge, 1997.

[5] M.I. Jordan, "Attractor dynamics and parallelism in a connectionist sequential machine", Proc. of the Eighth Ann. Conf. of the Cognitive Sci. Soc., Hillsdale, NJ: Erlbaum, 1986.

[6] D.J.C. MacKay, "Probable networks and plausible predictions: a review of practical Bayesian methods for supervised neural networks", Network: Computation in Neural Systems, vol. 6, pp. 469–505, 1995.

[7] F.N. Rhines, B.R. Patterson, "Effect of the degree of prior cold work on the grain volume distribution and the rate of grain growth of recrystallised aluminium", Metallurgical Transactions A, vol. 13A, pp. 985–993, 1982.

[8] A.J. Robinson, F. Fallside, "A recurrent error propagation network speech recognition system", Computer Speech and Language, vol. 5, pp. 259–274, 1991.

[9] T.J. Sabin, C.A.L. Bailer-Jones, S.M. Roberts, D.J.C. MacKay, P.J. Withers, "Modelling the evolution of microstructures in cold-worked and annealed aluminium alloy", in T. Chandra, T. Sakai (eds), Proc. of the Int. Conf. on Thermomechanical Processing, Pennsylvania, PA: The Minerals, Metals and Materials Society, 1997.

[10] C.K.I. Williams, C.E. Rasmussen, "Gaussian processes for regression", in D.S. Touretzky, M.C. Mozer, M.E. Hasselmo (eds), Neural Information Processing Systems 8, Boston, MA: MIT Press, 1996.

[11] R.J. Williams, D. Zipser, "A learning algorithm for continually running fully recurrent neural networks", Neural Computation, vol. 1, pp. 270–280, 1989.
