
TRANSCRIPT

Page 1:

PAC Learning

8/5/2005

Page 2:

purpose

• Effort to understand the negative selection algorithm from totally different perspectives:
– Statistics
– Machine learning

• What is machine learning, in a very informal way?

• Looking for a mathematical tool to describe, analyze, and evaluate either a learning algorithm or a learning problem.

Page 3:

background

• PAC learning framework is a branch of computational learning theory.

• Computational learning theory is a mathematical field concerned with the analysis of machine learning algorithms. It is often considered a branch of statistics.

• Machine learning algorithms take a training set, form hypotheses or models, and make predictions about the future. Because the training set is finite and the future is uncertain, learning theory usually does not yield absolute guarantees of performance of the algorithms. Instead, probabilistic bounds on the performance of machine learning algorithms are quite common.
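For a flavor of such probabilistic bounds (my addition, not from the slides), Hoeffding's inequality bounds how far the empirical error of a fixed hypothesis can drift from its true error after m i.i.d. examples:

```python
import math

def hoeffding_bound(m, eps):
    """Two-sided Hoeffding bound: with m i.i.d. examples, the empirical
    error of a fixed hypothesis deviates from its true error by more
    than eps with probability at most 2 * exp(-2 * m * eps**2)."""
    return 2 * math.exp(-2 * m * eps ** 2)

print(hoeffding_bound(m=1000, eps=0.05))  # ~0.0135: 1000 samples pin the
                                          # error to within 0.05 w.p. > 98%
```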

Page 4:

More about computational learning theory

• In addition to performance bounds, computational learning theorists study the time complexity and feasibility of learning.

• In computational learning theory, a computation is considered feasible if it can be done in polynomial time.

Page 5:

More about computational learning theory

• There are several different approaches to computational learning theory, which are often mathematically incompatible.

• This incompatibility arises from:
– using different inference principles: principles which tell you how to generalize from limited data;
– differing definitions of probability (frequency probability, Bayesian probability).

Page 6:

More about computational learning theory

• The different approaches include:
– Probably approximately correct learning (PAC learning), proposed by Leslie Valiant;
– VC theory, proposed by Vladimir Vapnik;
– Bayesian inference, arising from work first done by Thomas Bayes;
– Algorithmic learning theory, from the work of E. M. Gold.

• Computational learning theory has led to practical algorithms. For example, PAC theory inspired boosting.

Page 7:

What is this for?

• The PAC framework allowed accurate mathematical analysis of learning.

Page 8:

Basic facts of PAC learning

• Probably approximately correct learning (PAC learning) is a framework for learning that was proposed by Leslie Valiant in his paper "A theory of the learnable".

• In this framework the learner gets samples that are classified according to a function from a certain class. The aim of the learner is to find an approximation of the function with high probability. We demand that the learner be able to learn the concept under any arbitrary approximation ratio, probability of success, or distribution of the samples. (A toy sketch of this framework follows below.)

• How does negative selection fit in? We only deal with a very special distribution of the samples: one-class samples. Is it a PAC learning algorithm?
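As a concrete toy instance of this framework (my addition, not from the slides), here is a minimal sketch of Valiant's classic example: learning a monotone conjunction over {0,1}ⁿ by elimination. The function and variable names are hypothetical.

```python
import random

def learn_monotone_conjunction(samples, n):
    """Valiant-style elimination: start with x1 AND ... AND xn and let
    every positive example eliminate the variables it sets to 0."""
    relevant = set(range(n))
    for x, label in samples:
        if label == 1:
            relevant -= {i for i in range(n) if x[i] == 0}
    return relevant  # hypothesis: AND of the surviving variables

# Hypothetical target concept c(x) = x0 AND x2 over {0,1}^5,
# with D = uniform distribution over the instance space.
n, target = 5, {0, 2}
c = lambda x: int(all(x[i] == 1 for i in target))
samples = [(x, c(x)) for x in
           (tuple(random.randint(0, 1) for _ in range(n)) for _ in range(200))]
print(learn_monotone_conjunction(samples, n))  # -> {0, 2} with high probability
```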

Page 9:

• “The intent of the PAC model is that successful learning of an unknown target concept should entail obtaining, with high probability, a hypothesis that is a good approximation of it.”

• We can consider this target concept as an unknown function, e.g. f: {0,1}ⁿ → {0,1}; the result to pursue is an approximation of f, called a hypothesis here.

• The purpose of the discussion of PAC is to decide whether an algorithm for finding the approximation is (1) good enough and (2) feasible.

• “If we wish to define a model of learning from (random) samples, a crucial point is to formulate ‘correctly’ the notion of success.” (quoted but corrected and highlighted)

Page 10:

• To make the discussion simple, let us use the simple setup f: {0,1}ⁿ → {0,1}
– Instance space: {0,1}ⁿ

Page 11:

Given a probability distribution D defined on {0,1}ⁿ, the error of a hypothesis h with respect to a fixed target concept c is defined as

error(h) = D(h Δ c)

where Δ denotes the symmetric difference.

error(h) is the probability that h and c will disagree on an instance drawn according to D.

The hypothesis h is a good approximation of the target concept c if error(h) is small. (Note that error(h) depends on D.)
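As a sanity check on this definition (my addition), error(h) can be estimated by Monte Carlo sampling from D; the uniform D and the particular h and c below are illustrative assumptions.

```python
import random

def estimate_error(h, c, draw, trials=100_000):
    """Monte Carlo estimate of error(h) = D(h Δ c) = Pr_{x~D}[h(x) != c(x)]."""
    return sum(h(x) != c(x) for x in (draw() for _ in range(trials))) / trials

n = 4
draw = lambda: tuple(random.randint(0, 1) for _ in range(n))  # D = uniform
c = lambda x: x[0] & x[1]   # target concept
h = lambda x: x[0]          # hypothesis that ignores x1
print(estimate_error(h, c, draw))  # ~0.25: disagreement iff x0 = 1, x1 = 0
```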

Page 12:

Definition of “PAC Learnability”

• This definition is the centerpiece of the PAC learning model.

• It defines when the concept class C is:
– PAC learnable by the hypothesis space H
– properly PAC learnable
– PAC learnable

Page 13:

• What is the concept class C?
C = {Cₙ : n ≥ 1}, where Cₙ is the set of target concepts over {0,1}ⁿ.

• What is the hypothesis space H?
H = {Hₙ : n ≥ 1}, where Hₙ is the set of hypotheses over {0,1}ⁿ.

Page 14:

• Definition of “PAC learnable by the hypothesis space H”:
– The concept class C is PAC learnable by the hypothesis space H if there exist a polynomial-time algorithm A and a polynomial p(·,·,·) such that for all n ≥ 1, all target concepts c ∈ Cₙ, all probability distributions D on the instance space {0,1}ⁿ, and all ε and δ with 0 < ε, δ < 1: if the algorithm A is given at least p(n, 1/ε, 1/δ) independent random examples of c drawn according to D, then with probability at least 1 − δ, A returns a hypothesis h ∈ Hₙ with error(h) ≤ ε.

• Note: this talks about the existence of A, not what exactly A is.

• The smallest such polynomial p is called the sample complexity of the learning algorithm.
– This is as essential to a learning algorithm as time complexity is to a general algorithm.
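For intuition about what such a polynomial p can look like (my addition, not from the slides): for a finite hypothesis space H, the standard bound says a consistent learner needs about (1/ε)(ln |H| + ln(1/δ)) examples, which is polynomial in n, 1/ε, and 1/δ whenever ln |H| is polynomial in n.

```python
import math

def sample_complexity(h_size, eps, delta):
    """m >= (1/eps) * (ln|H| + ln(1/delta)) examples suffice for any
    consistent learner over a finite hypothesis space H to return,
    with probability >= 1 - delta, a hypothesis with error <= eps."""
    return math.ceil((math.log(h_size) + math.log(1 / delta)) / eps)

# Conjunctions over {0,1}^n: |H| = 3^n (each variable appears positively,
# negated, or not at all), so ln|H| = n * ln 3 is polynomial in n.
print(sample_complexity(3 ** 20, eps=0.1, delta=0.05))  # -> 250
```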

Page 15:

• Definition of “properly PAC learnable”:
– C is properly PAC learnable if it is PAC learnable with H = C.

• Definition of “PAC learnable”:
– C is PAC learnable if there exists some hypothesis space H such that hypotheses in H can be evaluated on given instances in polynomial time and C is PAC learnable by H.

• This extension is from “for a given H” to “existence of some H”.
• If C is properly PAC learnable, it is obviously PAC learnable (assuming hypotheses in C can be evaluated on a given instance in polynomial time).

Page 16:

• There are many variants of the basic definition.

• It can be shown that they are equivalent.

• The model can be extended in various directions.

Page 17:

• We ask for a single algorithm A that works for all distributions.

• Not that, for every distribution D, there exists an algorithm designed for that specific distribution D.

• That means: algorithm A does not know the distribution.

Page 18:

• A key part of PAC learning, and the potential link to the negative selection algorithm that we are trying to make (if one exists at all): the probability distribution D.

• “The error probability is measured with respect to the same distribution according to which the random examples are chosen.” “If the learning algorithm will get random examples from a distribution which provides only samples whose first bit is 0, and the error will be measured with respect to a distribution on strings whose first bit is 1, then clearly the learning algorithm has no chance to do much.” (A small simulation of this scenario follows below.)

• NSA, at least my method, seems to be doing exactly the “no chance to do much” thing described above, with a little help from the magic “self threshold (or self radius)”.
– Is NSA’s notion of success not well defined?
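A tiny simulation (my addition) of the quoted “no chance” scenario: a hypothesis that is perfect on a training distribution confined to first-bit-0 strings can be arbitrarily bad when the error is measured on first-bit-1 strings. The target and hypothesis below are deliberately contrived.

```python
import random

n = 6
c = lambda x: x[-1]  # hypothetical target: the last bit
# Adversarial hypothesis: matches c whenever x[0] == 0, flips it otherwise.
h = lambda x: c(x) if x[0] == 0 else 1 - c(x)

def sample(first_bit, count):
    """Draw strings from {0,1}^n whose first bit is fixed."""
    return [tuple([first_bit] + [random.randint(0, 1) for _ in range(n - 1)])
            for _ in range(count)]

train, test = sample(0, 500), sample(1, 500)
print(sum(h(x) != c(x) for x in train) / len(train))  # 0.0 on the training D
print(sum(h(x) != c(x) for x in test) / len(test))    # 1.0 on the other D
```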

Page 19:

• What does it mean that “L is a PAC learning algorithm”:
– For any given ε, δ > 0, there is a sample size m₀ such that, for all computable target functions t and all probability distributions P, m ≥ m₀ implies Pᵐ(error(L(s), t) > ε) < δ.
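To read this criterion operationally (a sketch of my own, using a deliberately tiny concept class of “dictator” functions t(x) = xᵢ): fix ε, draw many independent training sets of size m, and estimate Pᵐ(error(L(s), t) > ε) as the fraction of bad runs.

```python
import random

n, m, eps, trials = 8, 20, 0.1, 1000
t = 3  # hidden target concept: t(x) = x[3]

def learn(s):
    """Return any bit index consistent with all labeled samples."""
    return next(i for i in range(n) if all(x[i] == y for x, y in s))

def error(i, probes=2000):
    """Estimate disagreement between x[i] and the target x[t] under uniform P."""
    xs = (tuple(random.randint(0, 1) for _ in range(n)) for _ in range(probes))
    return sum(x[i] != x[t] for x in xs) / probes

bad = 0
for _ in range(trials):
    s = [(x, x[t]) for x in
         (tuple(random.randint(0, 1) for _ in range(n)) for _ in range(m))]
    if error(learn(s)) > eps:
        bad += 1
print(bad / trials)  # empirical P^m(error(L(s), t) > eps); ~0 for m = 20
```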

Page 20:

Unanswered questions

• How does negative selection algorithm fit into the model of PAC learning?

• Does NSA count as a learning process or algorithm at all?

Page 21:

references

• D. Haussler. Probably approximately correct learning. In Proceedings of the Eighth National Conference on Artificial Intelligence (AAAI-90), Boston, MA, pages 1101–1108. American Association for Artificial Intelligence, 1990. http://citeseer.ist.psu.edu/haussler90probably.html

• http://en.wikipedia.org/wiki/Probably_approximately_correct_learning

• http://en.wikipedia.org/wiki/Computational_learning_theory

• …