Outline
Intro to Representation and Heuristic Search
Machine Learning (Clustering) and My Research
Introduction to Representation
The function of a representation is to capture the critical features of a problem and make that information accessible to a problem-solving procedure
Expressiveness (what the abstracted features can convey) and efficiency (the computational complexity) are the major dimensions for evaluating a knowledge representation
Introduction to Search
Consider “tic-tac-toe”
Starting with an empty board, the first player can place an X in any one of nine places
Each move yields a different board that allows the opponent eight possible responses, and so on
Introduction to Search
We can represent this collection of possible moves by regarding each board as a state in a graph
The links of the graph represent legal moves
The resulting structure is a state space graph
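The state space graph described above can be sketched in a few lines. This is a minimal illustration, assuming a board is stored as a tuple of nine cells read row by row; the names `EMPTY` and `successors` are hypothetical and not taken from the slides.

```python
# A board is a tuple of 9 cells, row by row; " " marks an empty cell.
EMPTY = (" ",) * 9

def successors(board, player):
    """Return every board reachable from `board` by one legal move of `player`.

    Each returned board is a neighbouring state in the state space graph;
    the move that produces it is the link between the two states.
    """
    return [
        board[:i] + (player,) + board[i + 1:]
        for i, cell in enumerate(board)
        if cell == " "
    ]

first_moves = successors(EMPTY, "X")
replies = successors(first_moves[0], "O")
print(len(first_moves), len(replies))
```

The two lengths printed correspond to the nine opening moves and the eight possible responses mentioned on the earlier slide.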
“tic-tac-toe” state space graph
Introduction to Search
Humans use intelligent search
Humans do not search exhaustively; they rely on rules that guide the search toward promising states
These rules are known as heuristics, and they constitute one of the central topics of AI search
State Space Representation
State space search characterizes problem solving as the process of finding a solution path from the start state to a goal
A goal may describe a state, such as a winning board in tic-tac-toe
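As a rough illustration of finding a solution path from a start state to a goal, the sketch below runs a plain breadth-first search over tic-tac-toe states until it reaches a board on which X has won. Breadth-first search is chosen here only because it is short and uninformed; the slides do not prescribe a particular search procedure, and all function names are assumptions.

```python
from collections import deque

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def wins(board, player):
    """True if `player` occupies all three cells of some line."""
    return any(all(board[i] == player for i in line) for line in LINES)

def successors(state):
    """A state is (board, player_to_move); return all states one move away."""
    board, player = state
    other = "O" if player == "X" else "X"
    return [(board[:i] + (player,) + board[i + 1:], other)
            for i, cell in enumerate(board) if cell == " "]

def find_path(start, is_goal, successors):
    """Breadth-first search; returns the list of states from start to a goal."""
    frontier = deque([start])
    parent = {start: None}          # also serves as the visited set
    while frontier:
        state = frontier.popleft()
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for child in successors(state):
            if child not in parent:
                parent[child] = state
                frontier.append(child)
    return None

start = ((" ",) * 9, "X")
path = find_path(start, lambda s: wins(s[0], "X"), successors)
print(len(path) - 1, "moves to reach a winning board for X")
```

The goal test is passed in as a function, which mirrors the idea that a goal describes a kind of state rather than one fixed board.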
Introduction
Consider heuristics in the game of tic-tac-toe
A simple analysis puts the total number of states for exhaustive search at 9!
Symmetry reduction decreases the search space
Thus, there are not 9 but 3 initial moves:
to a corner
to the center of a side
to the center of the grid
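The symmetry argument can be checked with a short sketch. It assumes that two boards count as the same state when one can be turned into the other by rotating or mirroring the grid (the 8 symmetries of a square), and it picks the smallest equivalent board as a representative. The helper names are hypothetical.

```python
from math import factorial

# Cells are indexed 0..8 row by row. A symmetry is stored as a permutation
# `perm`, applied to a board as new_board[i] = board[perm[i]].
def rotate(perm):
    """Rotate the 3x3 grid of indices by a quarter turn."""
    return tuple(perm[3 * (2 - c) + r] for r in range(3) for c in range(3))

def reflect(perm):
    """Mirror the 3x3 grid of indices left to right."""
    return tuple(perm[3 * r + (2 - c)] for r in range(3) for c in range(3))

def symmetries():
    """All 8 rotations and reflections of the board."""
    perms, p = [], tuple(range(9))
    for _ in range(4):
        perms.append(p)
        perms.append(reflect(p))
        p = rotate(p)
    return perms

SYMS = symmetries()

def canonical(board):
    """Representative of a board's symmetry class (smallest equivalent board)."""
    return min(tuple(board[i] for i in perm) for perm in SYMS)

def distinct_boards(plies):
    """Boards reachable after `plies` moves, counted once per symmetry class."""
    boards = {(" ",) * 9}
    for ply in range(plies):
        mark = "X" if ply % 2 == 0 else "O"
        boards = {
            canonical(b[:i] + (mark,) + b[i + 1:])
            for b in boards
            for i in range(9)
            if b[i] == " "
        }
    return boards

print("move sequences without symmetry:", factorial(9))
print("distinct boards after the first move:", len(distinct_boards(1)))
print("distinct boards after the second move:", len(distinct_boards(2)))
```

Under these assumptions the first-move count should agree with the three initial moves listed above; the same function can be applied one move deeper to see how far symmetry shrinks the second level.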
Introduction
Use of symmetry on the second level further reduces the number of paths to 12 × 7! (the three first moves leave 5, 5, and 2 distinct replies, 12 second-level boards in all)
A simple heuristic can almost eliminate search entirely: we may move to the state in which X has the most winning opportunities
In this case, X takes the center of the grid as the first step
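One way to make the "most winning opportunities" heuristic concrete is sketched below. The slides do not define how a winning opportunity is counted, so this example assumes it means the number of rows, columns, and diagonals through a square that the opponent has not yet blocked. The function names are illustrative only.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def open_lines_through(board, cell, opponent="O"):
    """Count lines through `cell` that contain no opponent mark (assumed
    definition of a 'winning opportunity')."""
    return sum(
        1 for line in LINES
        if cell in line and all(board[i] != opponent for i in line)
    )

def best_move(board, opponent="O"):
    """Pick the empty cell with the most open lines through it."""
    empty = [i for i, cell in enumerate(board) if cell == " "]
    return max(empty, key=lambda c: open_lines_through(board, c, opponent))

board = [" "] * 9
for name, cell in [("corner", 0), ("side", 1), ("center", 4)]:
    print(name, open_lines_through(board, cell))
print("chosen first move:", best_move(board))
```

On the empty board the center lies on more lines than a corner or a side, which is consistent with X taking the center as the first step.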
Outline
Intro to Representation and Heuristic Search
Machine Learning (Clustering) and My Research
Clustering
Clustering tries to find groups of similar items based on the given dimensions
It is known as unsupervised learning
K-means Clustering
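The slides for this part are figures, so as a stand-in here is a minimal sketch of the standard K-means procedure (Lloyd's algorithm): assign each point to its nearest center, move each center to the mean of its points, and repeat until the assignments stop changing. It uses squared Euclidean distance and random initial centers, which are assumptions; it is not the distance measure or initialization used in the research described later.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))

def kmeans(points, k, iterations=100, seed=0):
    """Return (centers, labels) after running K-means on `points`."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)      # k distinct points as initial centers
    labels = None
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:         # converged: nothing moved
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:                  # keep the old center if a cluster is empty
                centers[j] = mean(members)
    return centers, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
print(labels)
```

No class labels are supplied anywhere, which is what makes this an unsupervised method: the groups come only from the distances between points.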
Experiment setup: HSSP matrix: 1b25
Representation of Segment
Sliding window size: 9
Each window corresponds to a sequence segment, which is represented by a 9 × 20 matrix plus nine corresponding items of secondary structure information obtained from DSSP.
More than 560,000 segments (413 MB) are generated by this method.
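The segment representation above can be sketched as follows. The example assumes a protein profile is available as a list of rows with 20 values each (one row per residue) and the DSSP annotation as a string with one character per residue; it also assumes the window advances one residue at a time, which the slides do not state. The function name `sliding_segments` and the toy data are hypothetical.

```python
def sliding_segments(profile, secondary, window=9):
    """Cut a profile into overlapping segments of `window` residues.

    profile   : list of rows, one per residue, each with 20 values
    secondary : string with one secondary structure character per residue
    Returns a list of (matrix, structure) pairs, where matrix is
    window x 20 and structure has `window` characters.
    """
    if len(profile) != len(secondary):
        raise ValueError("profile and secondary structure differ in length")
    segments = []
    # Assumed stride of 1: a sequence of length n yields n - window + 1 segments.
    for start in range(len(profile) - window + 1):
        matrix = profile[start:start + window]
        structure = secondary[start:start + window]
        segments.append((matrix, structure))
    return segments

# Toy input: 12 residues with placeholder values, not real HSSP data.
toy_profile = [[0.05] * 20 for _ in range(12)]
toy_secondary = "HHHHEEEECCCC"
segments = sliding_segments(toy_profile, toy_secondary)
print(len(segments), len(segments[0][0]), len(segments[0][0][0]))
```

Each segment keeps its 9 × 20 matrix and its nine structure characters together, so a clustering step can group segments by the matrix and inspect the structure afterwards.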
DSSP: Obtaining Secondary Structure Information
HSSP-BLOSUM62 Measure
Research Topics
Part 1: Bioinformatics Knowledge and Dataset Collection
Part 2: Discovering Protein Sequence Motifs
Part 3: Motif Information Extraction
Part 4: Mining the Relations between Motifs and Motifs
Part 5: Protein Local Tertiary Structure Prediction
Future Works