On the Spectral Bias of Neural Networks Nasim Rahaman* Aristide Baratin* Devansh Arpit Felix Draxler Min Lin Fred A. Hamprecht Yoshua Bengio Aaron Courville


Page 1:

On the Spectral Bias of Neural Networks

Nasim Rahaman*, Aristide Baratin*, Devansh Arpit, Felix Draxler, Min Lin, Fred A. Hamprecht, Yoshua Bengio, Aaron Courville

Page 2:

The good old question:

Why do massive neural networks generalize when they can learn random labels?

Page 4:

Our proposal:

Neural networks learn simpler functions first.

Page 5:

But how do we quantify simplicity?

Our approach: We use the (Fourier) spectrum.

[Figure panels: Lower Frequency Functions; Higher Frequency Functions]
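As a rough illustration of using the Fourier spectrum as a simplicity measure, the sketch below samples two sine waves and reports where each one's spectrum peaks. The helper names (`dft_amplitudes`, `dominant_frequency`), the grid size, and the two example functions are assumptions made for this example; they are not taken from the slides.

```python
import cmath
import math


def dft_amplitudes(samples):
    """Return |DFT| / N for frequencies k = 0 .. N//2 of a real-valued signal."""
    n = len(samples)
    amps = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        amps.append(abs(coeff) / n)
    return amps


def dominant_frequency(samples):
    """Index of the largest non-constant Fourier component."""
    amps = dft_amplitudes(samples)
    return max(range(1, len(amps)), key=lambda k: amps[k])


N = 64  # number of sample points on [0, 1); an arbitrary choice
grid = [t / N for t in range(N)]

# Two hypothetical example functions: 2 cycles and 20 cycles over [0, 1).
low = [math.sin(2 * math.pi * 2 * x) for x in grid]
high = [math.sin(2 * math.pi * 20 * x) for x in grid]

print(dominant_frequency(low), dominant_frequency(high))  # prints: 2 20
```

Under this measure, the first function counts as "simpler" because its spectrum is concentrated at a lower frequency.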

Page 6:

Our proposal becomes:

Neural networks learn lower frequencies first.

Page 7:

[Figure. Axis label: Training Iteration. Colorbar labels: Fully learned, Not learned.]

Colorbar shows the Fourier amplitude of the network relative to the target.
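The quantity on the colorbar, the Fourier amplitude of the network relative to the target, can be sketched as a per-frequency ratio. In the example below, the target frequencies and the `learned_fraction` snapshot are hard-coded, hypothetical values standing in for a network partway through training; nothing here is a measured result from the talk.

```python
import cmath
import math


def fourier_coefficient(samples, k):
    """Normalized DFT coefficient of a sampled signal at integer frequency k."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n)
               for t, x in enumerate(samples)) / n


def relative_amplitudes(prediction, target, frequencies):
    """|F_prediction(k)| / |F_target(k)| for each k in frequencies."""
    return {
        k: abs(fourier_coefficient(prediction, k)) / abs(fourier_coefficient(target, k))
        for k in frequencies
    }


N = 128
grid = [t / N for t in range(N)]
freqs = [5, 10, 20]  # hypothetical target frequencies

# Target: a sum of unit-amplitude sinusoids.
target = [sum(math.sin(2 * math.pi * k * x) for k in freqs) for x in grid]

# Hypothetical snapshot (assumed, not measured): the lowest frequency is fully
# fitted, the middle one half fitted, the highest one not fitted at all.
learned_fraction = {5: 1.0, 10: 0.5, 20: 0.0}
prediction = [
    sum(learned_fraction[k] * math.sin(2 * math.pi * k * x) for k in freqs)
    for x in grid
]

ratios = relative_amplitudes(prediction, target, freqs)
for k in freqs:
    print(k, round(ratios[k], 3))
```

A ratio near 1 corresponds to "Fully learned" and a ratio near 0 to "Not learned"; evaluating the ratios at successive training iterations would give one row of the plot per iteration.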

Page 8:

Why should I care?

One of the many reasons:

NN training is vulnerable to low-frequency label noise.

Page 9:

Training with label noise

[Figure: Validation loss on MNIST (w.r.t. pure targets)]

High-frequency label noise leads to a dip in the validation loss.

Low-frequency label noise does not… :(
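One way to see why the two kinds of noise behave differently is a toy stand-in, sketched below. It assumes that a network early in training acts roughly like a low-pass filter on its training labels, which is a simplification of "lower frequencies first" and not the MNIST experiment from the talk. The signal, the noise frequencies, the amplitude, and `CUTOFF` are all hypothetical choices.

```python
import cmath
import math


def low_pass(samples, cutoff):
    """Keep only Fourier components with frequency <= cutoff (real signal)."""
    n = len(samples)
    coeffs = [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)) / n
        for k in range(n)
    ]
    # Index k and n - k describe the same physical frequency for a real signal.
    kept = [c if min(k, n - k) <= cutoff else 0 for k, c in enumerate(coeffs)]
    return [
        sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(kept)).real
        for t in range(n)
    ]


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


N = 64
grid = [t / N for t in range(N)]
clean = [math.sin(2 * math.pi * 3 * x) for x in grid]  # noise-free targets


def noisy_labels(noise_frequency, amplitude=0.5):
    """Clean targets plus sinusoidal label noise at the given frequency."""
    return [c + amplitude * math.sin(2 * math.pi * noise_frequency * x)
            for c, x in zip(clean, grid)]


CUTOFF = 8  # assumed highest frequency fitted so far by the stand-in model

fit_high = low_pass(noisy_labels(20), CUTOFF)  # noise above the cutoff
fit_low = low_pass(noisy_labels(1), CUTOFF)    # noise below the cutoff

print(mse(fit_high, clean))  # close to 0: the high-frequency noise is not fitted yet
print(mse(fit_low, clean))   # close to 0.125: the low-frequency noise is fitted
```

In this toy setting, stopping early filters out high-frequency noise, which matches the dip in validation loss, whereas low-frequency noise is fitted together with the signal.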

Page 10:

To learn how manifold complexity attenuates the spectral bias, drop by our poster!

#72

Page 11:

Spoilers

Learning gets easier with increasing manifold complexity.

To express complex functions, the parameters must “work together in harmony”.

Page 12:

Thank you for your attention!