
Page 1:

CORRECTIONS

• L2 regularization is ||w||_2^2, not ||w||_2

• On exams, show the second derivative is positive or negative, or show convexity directly
– The latter is easier (e.g. x^2)

• Loss = error associated with one data point
• Risk = sum of all losses
• Pseudoinverse gives the least-squares solution, NOT an exact solution (see the sketch below)
• Magnitude of w matters for SVMs.
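A minimal NumPy sketch of the pseudoinverse point, with a made-up overdetermined system for illustration: when Xw = y has no exact solution, the pseudoinverse returns the w that minimizes ||Xw – y||^2, matching np.linalg.lstsq, and Xw does not reproduce y exactly.

```python
import numpy as np

# Made-up overdetermined system: 4 equations, 2 unknowns, no exact solution.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([0.0, 1.1, 1.9, 3.2])

# Pseudoinverse solution: minimizes ||Xw - y||^2, it does NOT solve Xw = y exactly.
w_pinv = np.linalg.pinv(X) @ y

# Same answer from the dedicated least-squares routine.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(w_pinv, w_lstsq))   # True: same least-squares solution
print(np.allclose(X @ w_pinv, y))     # False: residual is nonzero
```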

Page 2:

HW 3

• Will be released today.
• Probably harder than HW1 or HW2.
• Due Oct 6 (two Tuesdays from now).
• HW party: Oct 1.
• I wrote (some of) it.

Page 3:

Downsides of using kernels

• Speed & memory
– Need to store all training data; each test point must be computed against each training point (see the sketch below)
• SVMs only need a subset of the data (the support vectors)
• Overfitting
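A minimal sketch of the speed & memory point, using kernel ridge regression with an RBF kernel as the example; the data and hyperparameters are made up for illustration. Prediction touches every stored training point, and the kernel matrix alone costs O(n^2) memory.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel values k(a, b) = exp(-gamma * ||a - b||^2).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Made-up data and hyperparameters, just to illustrate the cost.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))          # all 200 points must be kept around
y_train = np.sin(X_train[:, 0]) + 0.1 * rng.normal(size=200)

lam = 1e-2
K = rbf_kernel(X_train, X_train)             # O(n^2) memory for the kernel matrix
alpha = np.linalg.solve(K + lam * np.eye(len(K)), y_train)

def predict(X_test):
    # Each test point is compared against EVERY training point.
    return rbf_kernel(X_test, X_train) @ alpha

print(predict(rng.normal(size=(5, 3))))
```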

Page 4:

3 Perspectives on Linear Regression

Page 5:

1. Minimize Loss (see lecture)

• Take the derivative of ||Xw – y||^2 and set it to 0.
• Result: X'Xw = X'y (the normal equations; see the sketch below).
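A minimal NumPy check of this perspective, with made-up data for illustration: solving X'Xw = X'y gives the same w as a least-squares solver, and the gradient 2X'(Xw – y) is numerically zero there.

```python
import numpy as np

# Made-up data for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = rng.normal(size=50)

# Solve the normal equations X'Xw = X'y.
w = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient of ||Xw - y||^2 is 2 X'(Xw - y); it vanishes at the solution.
grad = 2 * X.T @ (X @ w - y)
print(np.allclose(grad, 0))              # True

# Matches the library least-squares solution.
w_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(w, w_ref))             # True
```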

Page 6:

2. Projections

Page 7:

2. Projections

Page 8:

2. Projections
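A minimal numeric sketch of the projection view, with made-up data for illustration: the least-squares fit Xw is the orthogonal projection of y onto the column space of X, so the residual y – Xw is orthogonal to every column of X (which is just the normal equations rearranged).

```python
import numpy as np

# Made-up data for illustration.
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))
y = rng.normal(size=30)

w, *_ = np.linalg.lstsq(X, y, rcond=None)
residual = y - X @ w

# Projection view: the residual is orthogonal to the column space of X,
# i.e. X'(y - Xw) = 0.
print(np.allclose(X.T @ residual, 0))   # True
```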

Page 9:

3. Gaussian noise

Page 10:

3. Gaussian noise

Page 11:

3. Gaussian noise

• HW 3 – first problem has a question on this
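A minimal sketch of the Gaussian noise view, with made-up data and an assumed noise level sigma: under y = Xw + ε with ε ~ N(0, σ²I), the negative log-likelihood is ||Xw – y||²/(2σ²) plus constants, so maximizing the likelihood gives the same w as least squares.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up data; sigma is an assumed noise level.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
sigma = 0.3
y = X @ w_true + sigma * rng.normal(size=100)

def neg_log_likelihood(w):
    # -log p(y | X, w) under Gaussian noise, dropping terms constant in w:
    # ||Xw - y||^2 / (2 sigma^2)
    r = X @ w - y
    return r @ r / (2 * sigma ** 2)

w_mle = minimize(neg_log_likelihood, x0=np.zeros(3)).x
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(w_mle, w_ls, atol=1e-4))   # True: MLE matches least squares
```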

Page 12:

Bias & Variance

• Bias
– Incorrect assumptions in your model.
– Your algorithm is only able to capture models of complexity <= C, but the true model complexity is C' > C.

• Variance
– Sensitivity of your algorithm to noise in the data.
– How much your model changes per “unit” change in the data.

Page 13:

Bias & Variance

• Bias vs. variance is a tradeoff.
• Bias
– You assume the data is linear when it’s actually nonlinear.
• Variance
– You assume the data could be polynomial when it’s actually always linear.
– By assuming the data could be polynomial, you have lots of free parameters that move around if the training data changes.
– High variance = “overfitting”

Page 14:

Bias & Variance

• If variance is too high, we will often add bias in order to reduce variance.
• This is the reason regularization exists (see the sketch after this list).
– Increase bias, reduce variance.
• Usually depends on the amount of data.
– More data pins down all those free parameters.
• Will revisit this with random forests.
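A minimal sketch of the regularization point, using ridge (L2-regularized) regression as the example; the data, lambda values, and the way variance is measured here are all made up for illustration. Fitting a high-degree polynomial on many resampled noisy datasets, the spread of the fitted weights shrinks as lambda grows (less variance), while the fit becomes more constrained (more bias).

```python
import numpy as np

rng = np.random.default_rng(4)

def ridge_fit(X, y, lam):
    # Ridge regression: minimize ||Xw - y||^2 + lam * ||w||_2^2
    # Closed form: w = (X'X + lam I)^(-1) X'y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def poly_features(x, degree=9):
    return np.vander(x, degree + 1, increasing=True)

# Made-up data, lambda values, and variance measurement for illustration:
# how much do the fitted weights move across resampled noisy datasets?
for lam in [0.0, 0.1, 10.0]:
    weights = []
    for _ in range(200):
        x = rng.uniform(-1, 1, size=30)
        y = np.sin(3 * x) + 0.3 * rng.normal(size=30)   # nonlinear truth + noise
        weights.append(ridge_fit(poly_features(x), y, lam))
    spread = np.std(weights, axis=0).mean()
    print(f"lambda={lam:5.1f}  avg weight std across datasets = {spread:.3f}")
```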

Page 15:

Problem 1

• a) Do at home.
• b) Follow the Gaussian noise interpretation of linear regression.

Page 16:

Problem 2 (credit: Yun Park)

Page 17:

Problem 2 (credit: Yun Park)

Page 18:

Problems 3 & 4

• 3) Write the loss function, find the derivative.
• 4) Practice problems.
– “Extra for experts” is inaccurate – there is a very simple answer.