How to Make a Computer Think for You
Jeff Knisley, The Institute for Quantitative Biology, East Tennessee State University
ALABAMA MAA STATE DINNER AND LECTURE, Feb, 2006
This is a Neuron
[Figure: a neuron, with the soma, dendrites, axon, and synapses labeled]
- Signals propagate from dendrites to soma
- Signals decay at the soma if below a certain threshold
- Signals may arrive close together
- If the threshold is exceeded, then the neuron “fires,” sending a signal along its axon.
Neurons Form Networks
Artificial Neural Network (ANN)
Made of artificial neurons, each of which:
- Sums inputs from other neurons
- Compares the sum to a threshold
- Sends a signal to other neurons if above threshold
Synapses have weights:
- Model relative ion collections
- Model the efficacy (strength) of a synapse
Artificial Neuron
[Figure: inputs $x_1, x_2, x_3, \ldots, x_n$ enter neuron $i$ through synaptic weights $w_{i1}, w_{i2}, w_{i3}, \ldots, w_{in}$]
$w_{ij}$ = synaptic weight between the $i$th and $j$th neurons
$\theta_j$ = threshold of the $j$th neuron
$\sigma$ = "firing" function that maps state to output
State: $s_i = \sum_j w_{ij} x_j$; output: $x_i = \sigma(s_i)$
Nonlinear firing function
Firing Functions are Sigmoidal
$\sigma_j(s_j) = \dfrac{1}{1 + e^{-(s_j - \theta_j)}}$
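A minimal sketch, not from the original slides, of one artificial neuron with this sigmoidal firing function in Python; the weights, inputs, and threshold are made-up values.

    import math

    def sigmoid(s, theta=0.0):
        """Sigmoidal firing function: maps state to an output in (0, 1)."""
        return 1.0 / (1.0 + math.exp(-(s - theta)))

    def fire(weights, inputs, theta):
        """State s_i = sum_j w_ij x_j; output x_i = sigma(s_i - theta_i)."""
        s = sum(w * x for w, x in zip(weights, inputs))
        return sigmoid(s, theta)

    # Hypothetical neuron with three inputs and threshold 0.5
    print(fire([0.4, -0.2, 0.7], [1.0, 0.0, 1.0], theta=0.5))   # ~0.65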
Hopfield Network
Imagine complete connectivity with weights $w_{ij}$ between the $i$th and $j$th neurons
[Figure: a grid of binary neuron states; blue = 1, white = 0]
Update rule: $x_i^{new} = \sigma\!\left(\sum_j w_{ij}\, x_j^{old}\right)$
Choose ith neuron at random and calculate its new state
Energy
Define the energy to be
$E = -\dfrac{1}{2} \sum_{j=1}^{n} \sum_{l=1}^{n} w_{jl}\, x_j x_l + \sum_{j=1}^{n} \theta_j x_j$
Theorem: If the weights are symmetric, then the energy decreases each time a neuron fires.
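Not from the original deck: a minimal Python sketch of a Hopfield network with symmetric, zero-diagonal weights, using the random asynchronous update above; the weights, thresholds, and initial state are arbitrary, and the printed energy illustrates the theorem.

    import random
    import numpy as np

    def energy(W, theta, x):
        # E = -1/2 sum_jl w_jl x_j x_l + sum_j theta_j x_j
        return -0.5 * x @ W @ x + theta @ x

    def update(W, theta, x):
        """Choose the ith neuron at random and recompute its state."""
        i = random.randrange(len(x))
        x[i] = 1.0 if W[i] @ x >= theta[i] else 0.0
        return x

    rng = np.random.default_rng(0)
    n = 8
    A = rng.normal(size=(n, n))
    W = (A + A.T) / 2            # symmetric weights
    np.fill_diagonal(W, 0.0)     # no self-connections
    theta = np.zeros(n)
    x = rng.integers(0, 2, size=n).astype(float)

    for step in range(20):
        x = update(W, theta, x)
        print(step, energy(W, theta, x))   # non-increasing, per the theorem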
Applications: Handwriting Recognition
http://faculty.etsu.edu/knisleyj/neural/neuralnet3.htm
Universal Classifier
http://faculty.etsu.edu/knisleyj/neural/neuralnet4.htm
Expert Systems: rule-based rather than sequential programs (air traffic control, industrial controls)
Robotics: usually using a 3-layer network
3 Layer Neural Network
[Figure: three layers; Input feeds Hidden (usually much larger), which feeds Output]
The output layer may consist of a single neuron
Neural Nets can “Think”
A Neural Network can “think” for itself:
- Can be trained (programmed) to make decisions
- Can be trained to classify information
This tiny 3-Dimensional Artificial Neural Network, modeled after neural networks in the human brain, is helping machines better visualize their surroundings.
http://www.nasa.gov/vision/universe/roboticexplorers/robots_like_people.html
The Mars Rovers
Must choose where to explore:
- Programmed to avoid “rough” terrain
- Programmed to choose “smooth” terrain
An ANN decides between “rough” and “smooth”:
- “rough” and “smooth” are ambiguous
- Programming is by means of many “examples” (lessons)
Illustration: Colors = Terrains
As a robot moves, it defines 8 squares of size A that define the directions it can move in:
- It should avoid red (rough) terrain
- It should prefer green (smooth) terrain
- It should be indifferent to blue (normal) terrain
It is impossible to program every possible shade and variation. Instead, a neural network is constructed, and terrain block → color class input/output patterns are used to train the network.
[Figure: the robot at the center of a 3×3 grid, surrounded by squares numbered 1-8]
Train ANN to Classify Colors
[Figure: terrain examples feed the input layer, through a hidden layer, to a color-class output layer with Red, Green, and Blue neurons]

Training Set (input terrain example → output <R,G,B>):
<1,0,0>  (red)
<0,1,0>  (green)
<0,0,1>  (blue)
<1,0,0>
<0,0,0>
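As an illustration (my encoding, not the slides'), the terrain/color training patterns might be stored as numeric arrays; the average-pixel input features are an assumption.

    import numpy as np

    # Hypothetical encoding: each terrain block is reduced to an average
    # <R,G,B> pixel vector (input); the desired color class is an <R,G,B>
    # indicator vector (output), as in the training set above.
    terrain_inputs = np.array([
        [0.9, 0.1, 0.1],   # a red (rough) terrain block
        [0.1, 0.8, 0.2],   # a green (smooth) terrain block
        [0.1, 0.2, 0.9],   # a blue (normal) terrain block
    ])
    color_targets = np.array([
        [1, 0, 0],         # class "red"
        [0, 1, 0],         # class "green"
        [0, 0, 1],         # class "blue"
    ])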
“Houston, we have a problem!”
[Figure: a terrain block whose target output is <0,0,0> produces network output <0.9, 0.1, 0.1>]
ANN’s Can Also Think for Us
Mars Rovers do what Humans can do better:
- They do not “learn” on their own
- They are taught by Humans who could make the same or better decisions in their place
- They can use their “learning” independently
What about problems Humans can’t solve at all?
- Neural Networks can be used as savants dedicated to a single problem too complex for humans to decipher
Examples:
- Agent-Based Modeling
- RNA, Protein Structures, DNA analysis
- Data Mining
Division of Labor in Wasp Nests
A wasp can alternate between laborer, water forager, and pulp forager
Hypothesis: Individual wasps choose roles so that total water in the nest reaches equilibrium
Hypothesis: Pulp production is maximized when total water in the nest is stable
Limited confirmation of the hypotheses (Karsai and Wenzel, 2002)
System of 3 ODEs (Karsai, *Phillips, and Knisley, 2005)
Computer Simulation
ANN Wasp Societies
Problems with current models:
- Role assigned randomly w.r.t. a Weibull distribution
- Parameters selected so the model “works”
ANN: Role is a decision of each wasp:
- A wasp is a special type of Artificial Neuron
- The wasp decides its own role
- The wasp learns to make good choices
System is deterministic yet unpredictable
[Figure: wasps deciding their roles: “I’m empty, I need water!” … “No luck, so I’m a water forager.” … “I have water!” … “I’m so good at getting water, I think I should go forage for pulp.”]
Division of Labor by choice?
The number of foragers goes up and down, but the water level in the nest becomes stable
Data Mining
We often consider very large data sets:
- Microarrays contain about 20,000 data points
- Typical studies use 70-100 microarrays
- Most of the data is not relevant
ANN’s can be trained to find hidden patterns:
- Input layer = Genes
- The network is trained repeatedly with microarrays collected in various physiological states
- ANN’s predict which genes are responsible for a given state
ANN’s in Data Mining
- Each neuron acts as a “linear classifier”
- Competition among neurons via the nonlinear firing function = “local linear classifying”
Method for Genes:
- Train the network until it can classify between the control and experimental groups
- Eliminating weights sufficiently close to 0 does not change the local classification scheme
First results were obtained with a Perceptron ANN
Simple Perceptron Model
[Figure: inputs x1 = Gene 1, x2 = Gene 2, ..., xn = Gene n feed through weights w1, w2, ..., wn into a single output neuron y]
The output y is the “physiological state” due to the relative gene expression levels used as inputs.
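A minimal Python sketch of this perceptron (not from the slides); the expression levels, weights, and threshold are made-up.

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    def perceptron(x, w, theta):
        """y = sigma(sum_i w_i x_i - theta): one output neuron over n gene inputs."""
        return sigmoid(np.dot(w, x) - theta)

    # Hypothetical expression levels for n = 4 genes, with made-up weights
    x = np.array([0.2, 1.5, 0.0, 0.8])    # relative gene expression levels
    w = np.array([0.5, -1.0, 0.0, 2.0])   # synaptic weights (gene "significance")
    print(perceptron(x, w, theta=0.1))    # output interpreted as physiological state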
Simple Perceptron Model
Features:
- The $w_i$ measure gene “significance”
- Detects genes across n samples & references
- Ref: Artificial Neural Networks for Reducing the Dimensionality of Gene Expression Data, A. Narayanan et al., 2004.
Drawbacks:
- The Perceptron is a globally linear classifier (i.e., it only classifies linearly separable data)
- We are now using a more sophisticated model
[Figure: linearly separable data]
[Figure: separation using hyperplanes]
[Figure: data that cannot be separated linearly]
How do we select w’s?
Define an energy function
$E(x_1, \ldots, x_n) = \dfrac{1}{2} \sum_{j=1}^{m} (y_j - t_j)^2$
where $x_1, \ldots, x_n$ are the inputs, $y$ is the output, and $t_1, \ldots, t_m$ are the outputs associated with the patterns to be “learned.”
For a perceptron, $y = \sigma\!\left(\sum_i w_i x_i - \theta\right)$.
Key: Neural networks minimize energy.
Back Propagation: Minimize the Energy Function
Choose the $w_i$ so that $\dfrac{\partial E}{\partial w_k} = 0$
In practice, this is hard. Back Propagation:
- For each pattern $t_i$: feed forward and calculate E
- Increment the weights using a delta rule: $w_k^{new} = w_k + \eta\,(t - y)\,y(1 - y)\,x_k$
- Repeat until E is sufficiently close to 0
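A sketch of delta-rule training for a single perceptron (my own illustration; the AND data and learning rate are made-up choices):

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    # Made-up, linearly separable patterns: logical AND
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    t = np.array([0., 0., 0., 1.])

    rng = np.random.default_rng(1)
    w = rng.normal(scale=0.1, size=2)
    theta = 0.0
    eta = 0.5                                    # learning rate (assumed)

    for epoch in range(5000):
        E = 0.0
        for x, target in zip(X, t):
            y = sigmoid(np.dot(w, x) - theta)    # feed forward
            E += 0.5 * (y - target) ** 2         # energy for this pattern
            delta = (target - y) * y * (1 - y)   # delta rule factor
            w += eta * delta * x                 # increment weights
            theta -= eta * delta                 # threshold updated the same way
        if E < 1e-3:                             # E sufficiently close to 0
            break

    print(w, theta, E)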
Perceptron for Microarray Data Mining
1. Remove a percentage of the genes whose synaptic weights are close to 0
2. Create an ANN classifier on the reduced arrays
3. Repeat 1 and 2 until only the genes that most influence the classifier problem remain
The remaining genes are the most important in classifying experimentals versus controls. A sketch of this loop appears below.
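A hedged sketch of the prune-and-retrain loop on synthetic data; train_perceptron reuses the delta rule above, and all sizes and cut-offs are arbitrary choices.

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    def train_perceptron(X, t, epochs=2000, eta=0.5):
        """Delta-rule training (as sketched earlier); returns weights and threshold."""
        rng = np.random.default_rng(0)
        w, theta = rng.normal(scale=0.1, size=X.shape[1]), 0.0
        for _ in range(epochs):
            for x, target in zip(X, t):
                y = sigmoid(np.dot(w, x) - theta)
                delta = (target - y) * y * (1 - y)
                w += eta * delta * x
                theta -= eta * delta
        return w, theta

    # Synthetic "microarrays": 20 samples x 50 genes; 1 = control, 0 = experimental
    rng = np.random.default_rng(2)
    X = rng.normal(size=(20, 50))
    t = (X[:, 3] - X[:, 17] > 0).astype(float)      # only genes 3 and 17 matter here
    genes = np.arange(X.shape[1])

    while len(genes) > 5:                           # until few genes remain
        w, _ = train_perceptron(X[:, genes], t)
        keep = np.argsort(np.abs(w))[len(w) // 4:]  # drop the 25% of weights nearest 0
        genes = genes[np.sort(keep)]

    print("most influential genes:", genes)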
Functional Viewpoint
An ANN is a mapping $f: \mathbb{R}^n \to \mathbb{R}$
Can we train a perceptron so that $f(x_1, \ldots, x_n) = 1$ if the x vector is from a control and $f(x_1, \ldots, x_n) = 0$ if x is from an experimental?
Answer: Yes if the data can be linearly separated, but no otherwise unless we use better ANN’s!
General ANN’s also have problems:
- Spurious states: sometimes ANN’s get the wrong answer
- Hard margins: the training set must be perfect
Multilayer Network
[Figure: inputs $x_1, x_2, x_3, \ldots, x_n$ feed hidden neurons $1, 2, \ldots, N$; hidden neuron $j$ computes $\sigma(\mathbf{w}_j \cdot \mathbf{x} - t_j)$]
$\mathrm{out} = \sum_{j=1}^{N} \alpha_j \, \sigma(\mathbf{w}_j \cdot \mathbf{x} - t_j)$
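A minimal forward-pass sketch of this multilayer form (my illustration; all parameter values are random placeholders):

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    def multilayer(x, W, t, alpha):
        """out = sum_j alpha_j * sigma(w_j . x - t_j)."""
        return alpha @ sigmoid(W @ x - t)

    # Made-up network: n = 3 inputs, N = 4 hidden neurons
    rng = np.random.default_rng(3)
    W = rng.normal(size=(4, 3))     # rows are the hidden weight vectors w_j
    t = rng.normal(size=4)          # hidden thresholds t_j
    alpha = rng.normal(size=4)      # output weights alpha_j
    print(multilayer(np.array([0.5, -1.0, 2.0]), W, t, alpha))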
How do we select w’s?
Define an energy function
$E = \dfrac{1}{2} \sum_{i=1}^{n} (\mathrm{out}_i - t_i)^2$
The t vectors are the information to be “learned.” Neural networks minimize energy: the “information” in the network is equivalent to the minima of the total squared energy function.
Back Propagation: Minimize the Energy Function
Choose the $w_j$ and $t_j$ so that $\dfrac{\partial E}{\partial w_{ij}} = 0$ and $\dfrac{\partial E}{\partial t_j} = 0$
In practice, this is hard. Back Propagation with a continuous sigmoidal:
- Feed forward and calculate E
- Modify the weights $w_j$ and thresholds $t_j$ using a delta rule to obtain $w_j^{new}$ and $t_j^{new}$
- Repeat until E is sufficiently close to 0
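A sketch of back propagation for this network by gradient descent; the update formulas are derived from the energy function above (my derivation, not the slide's exact rule), and XOR is a made-up training set that a single perceptron cannot learn.

    import numpy as np

    def sigmoid(s):
        return 1.0 / (1.0 + np.exp(-s))

    # Made-up training set: XOR, which is not linearly separable
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    targets = np.array([0., 1., 1., 0.])

    rng = np.random.default_rng(4)
    N = 4                                      # hidden neurons
    W = rng.normal(size=(N, 2))                # hidden weights w_j
    t = rng.normal(size=N)                     # hidden thresholds t_j
    alpha = rng.normal(size=N)                 # output weights alpha_j
    eta = 0.5

    for epoch in range(10000):
        E = 0.0
        for x, target in zip(X, targets):
            h = sigmoid(W @ x - t)             # hidden outputs sigma(w_j . x - t_j)
            out = alpha @ h                    # feed forward
            E += 0.5 * (out - target) ** 2
            d_out = out - target               # dE/d(out)
            d_h = d_out * alpha * h * (1 - h)  # back-propagate through the sigmoids
            alpha -= eta * d_out * h           # gradient steps on alpha_j, w_j, t_j
            W -= eta * np.outer(d_h, x)
            t += eta * d_h
        if E < 1e-3:                           # E sufficiently close to 0
            break

    print("final energy:", E)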
ANN as Classifier
(Cybenko) For any $\varepsilon > 0$, the function $f(x_1, \ldots, x_n) = 1$ if the x vector is from a control and $f(x_1, \ldots, x_n) = 0$ if x is from an experimental can be approximated to within $\varepsilon$ by a multilayer neural network.
The weights no longer have a one-to-one correspondence to genes, so we test significance using Monte Carlo techniques.
ANN and Monte Carlo Methods
Monte Carlo methods have been a big success story with ANN’s:
- Error estimates with network predictions
- ANN’s are very fast in the forward direction
Example: ANN + MC implement and outperform Kalman filters (recursive linear filters used in navigation and elsewhere) (De Freitas et al., 2000)
Recall: Multilayer Network
[Figure: the same multilayer network, with N gene inputs and an N-node hidden layer; $\mathrm{out} = \sum_{j=1}^{N} \alpha_j \, \sigma(\mathbf{w}_j \cdot \mathbf{x} - t_j)$]
N genes, N-node hidden layer
The hidden-layer parameters correspond to genes, but do not directly depend on a single gene.
Naïve Monte Carlo ANN Method
1. Randomly choose a subset S of genes
2. Train using Back Propagation
3. Prune based on the values of the $w_j$ (or the thresholds, or both)
4. Repeat 2-3 until a small subset of S remains
5. Increase the “count” of the genes in the small subset
6. Repeat 1-5 until each gene has a 95% probability of appearing at least some minimum number of times in a subset
7. The most frequent genes are the predicted ones
A sketch of this loop in code follows.
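A hedged sketch of the counting loop; train_and_prune here is a stand-in for steps 2-4 (it simply returns random survivors), and all sizes are arbitrary.

    import numpy as np
    from collections import Counter

    rng = np.random.default_rng(5)
    n_genes = 200
    counts = Counter()

    def train_and_prune(subset):
        """Stand-in for steps 2-4 (back propagation plus weight-based pruning);
        a real version would return the genes surviving the pruning."""
        return rng.choice(subset, size=5, replace=False)

    for trial in range(2000):                                  # step 6: many repetitions
        subset = rng.choice(n_genes, size=30, replace=False)   # step 1
        survivors = train_and_prune(subset)                    # steps 2-4
        counts.update(survivors.tolist())                      # step 5

    print("predicted genes:", [g for g, _ in counts.most_common(10)])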
Additional Considerations
- If a gene is up-regulated or down-regulated for a certain condition, then put it into a subset in step 1 with probability 1. This is a simple-minded Bayesian method; Bayesian analysis can make it much better.
- The algorithm distributes naturally across a multi-processor cluster or machine: choose the subsets first, distribute the subsets to different machines, and tabulate the results from all the machines. A sketch follows.
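A sketch of the distributed version using Python's multiprocessing; train_and_prune is again a stand-in, and the pool of worker processes plays the role of the cluster machines.

    from collections import Counter
    from multiprocessing import Pool
    import numpy as np

    def train_and_prune(subset):
        """Stand-in for training and pruning one gene subset on one worker."""
        rng = np.random.default_rng(int(subset.sum()))   # illustrative seed only
        return rng.choice(subset, size=5, replace=False).tolist()

    if __name__ == "__main__":
        rng = np.random.default_rng(6)
        subsets = [rng.choice(200, size=30, replace=False)   # choose the subsets first
                   for _ in range(2000)]
        with Pool() as pool:                                 # distribute the subsets
            results = pool.map(train_and_prune, subsets)
        counts = Counter(g for survivors in results for g in survivors)  # tabulate
        print("predicted genes:", [g for g, _ in counts.most_common(10)])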
Summary
- ANN’s are designed to make decisions in a similar fashion to how we make decisions
- In this way, they can “think” for themselves
- They can be considered supplements to existing hardware and software tools
- The ability of ANN’s to make decisions allows them to think for us as well!
- They can find patterns in large data sets that we humans would likely never uncover
- They can “think” for days/months on end
Any Questions?
References
Cybenko, G. Approximation by Superpositions of a Sigmoidal Function. Mathematics of Control, Signals, and Systems, 2(4), 1989, pp. 303-314.
De Freitas, J. F. G., et al. Sequential Monte Carlo Methods to Train Neural Network Models. Neural Computation, 12(4), 1 April 2000, pp. 955-993.
Glenn, L., and J. Knisley. Solutions for Transients in Arbitrarily Branching and Tapering Cables. In Modeling in the Neurosciences: From Biological Systems to Neuromimetic Robotics, ed. R. Lindsay, R. Poznanski, G. N. Reeke, J. R. Rosenberg, and O. Sporns. CRC Press, London, 2004.
Narayanan, A., et al. Artificial Neural Networks for Reducing the Dimensionality of Gene Expression Data. Neurocomputing, 2004.