This repository provides TensorFlow and PyTorch reference implementations of PAL. PAL is an efficient and effective line search approach for DNNs which exploits the almost parabolic shape of the loss in the negative gradient direction to automatically estimate good step sizes.
If you have any questions or suggestions, please do not hesitate to contact me: maximus.mutschler(at)uni-tuebingen.de
Fig1: PAL's basic idea
PAL is based on the empirical observation that the loss function can be approximated by a one-dimensional parabola in the negative gradient (line) direction.
To fit this parabola, only one additional loss value has to be measured on the line.
PAL performs an update step by jumping to the minimum of the approximated parabola.
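As a minimal sketch of this idea (our own illustrative code, not taken from the reference implementations): the parabola l(s) ≈ a·s² + b·s + c is fitted from the current loss l(0), the directional derivative l'(0) along the normalized line direction, and one additional loss value l(μ) measured at distance μ; its minimum lies at s = -b / (2a).

```python
def parabolic_minimum_step(loss_0, loss_mu, dir_derivative_0, mu):
    """Step size to the minimum of the parabola fitted through l(0), l(mu)
    and l'(0) along the normalized line direction (illustrative sketch).

    loss_0:           current loss l(0)
    loss_mu:          loss measured mu further along the line, l(mu)
    dir_derivative_0: directional derivative l'(0); negative if the line
                      points in the negative gradient direction
    mu:               measuring step size
    """
    b = dir_derivative_0
    a = (loss_mu - loss_0 - b * mu) / mu ** 2   # curvature estimate
    # A proper minimum requires a > 0; the reference implementations
    # handle the degenerate cases (a <= 0) separately.
    return -b / (2.0 * a)
```

For example, with loss_0 = 1.0, loss_mu = 0.9, dir_derivative_0 = -1.2 and mu = 0.1, the estimated step to the parabola's minimum is 0.3.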
PAL surpasses SLS, ALIG, SGD-HD and COCOB and competes against ADAM, SGD and RMSProp on ResNet-32, MobileNetV2, DenseNet-40 and EfficientNet architectures trained on CIFAR-10 and CIFAR-100.
However, the latter are tuned with piecewise constant step size schedules, whereas PAL derives its own learning rate schedule.
PAL surpasses all of these optimizers when they are trained without such a schedule.
Therefore, PAL could be used in scenarios where default schedules fail.
For a detailed explanation, please refer to our paper: https://arxiv.org/abs/1903.11991
Fig2: Exemplary performance of PAL with data augmentation
Fig3: Exemplary performance of PAL without data augmentation (note that omitting augmentation leads to severe overfitting)
PAL introduces the following hyperparameters, whose default values already lead to good training and test errors.
Usually only the measuring step size has to be adapted slightly.
Its sensitivity is not as high as that of SGD's learning rate.
Abbreviation | Name | Description | Default parameter intervals | Sensitivity compared to SGD learning rate |
---|---|---|---|---|
μ | measuring step size | distance to the second sampled training loss value | [0.1, 1] | medium |
α | update step adaptation | multiplier to the update step | [1.0, 1.2, 1.7] | low |
β | direction adaptation factor | adapts the line direction depending on previous line directions | [0, 0.4] | low |
smax | maximum step size | maximum step size on the line | [3.6] | low |
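The following sketch shows roughly where each hyperparameter enters a single PAL-style update. It is our own illustration of the procedure described in the paper (function and variable names are ours), not the reference implementation; `state` is simply a dict that persists between steps and holds the previous line direction.

```python
import torch

def pal_style_step(params, closure, state, mu=0.1, alpha=1.0, beta=0.0, s_max=3.6):
    """One illustrative PAL-style update on an iterable of parameters.

    closure() must recompute the loss for the *same* batch, so that both
    measurements lie on the same deterministic loss function.
    """
    params = list(params)

    # Loss and gradient at the current point.
    for p in params:
        p.grad = None
    loss_0 = closure()
    loss_0.backward()
    grads = [p.grad.detach().clone() for p in params]

    # Line direction: negative gradient, mixed with the previous direction
    # via the direction adaptation factor beta.
    prev = state.get("dirs")
    dirs = [-g + beta * d for g, d in zip(grads, prev)] if prev else [-g for g in grads]
    state["dirs"] = dirs
    norm = float(torch.sqrt(sum((d ** 2).sum() for d in dirs)))

    with torch.no_grad():
        # Measure the loss a distance mu along the normalized direction.
        for p, d in zip(params, dirs):
            p.add_(d, alpha=mu / norm)
        loss_mu = float(closure())

    # Fit the parabola (as in the sketch above) and jump to its minimum,
    # scaled by the update step adaptation alpha and clipped to s_max.
    b = float(sum((g * d).sum() for g, d in zip(grads, dirs))) / norm  # l'(0)
    a = (loss_mu - float(loss_0) - b * mu) / mu ** 2
    s_upd = alpha * (-b / (2.0 * a)) if a > 0 and b < 0 else mu  # crude fallback
    s_upd = min(s_upd, s_max)

    with torch.no_grad():
        # The parameters currently sit at distance mu; move the remainder.
        for p, d in zip(params, dirs):
            p.add_(d, alpha=(s_upd - mu) / norm)
    return loss_0
```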
- No limitations. Can be used in the same way as any other PyTorch optimizer; a usage sketch follows directly after this list.
- Runs with PyTorch 1.4
- Uses tensorboardX for plotting
- Parabola approximations and loss lines can be plotted
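Because PAL needs two loss evaluations per step, the optimizer is driven with a closure that recomputes the loss on the current batch, similar to torch.optim.LBFGS. The class name, import path and constructor arguments below (PalOptimizer, mu, alpha, beta, s_max, train_loader) are assumptions for illustration only; see the repository's example code for the actual interface.

```python
import torch.nn as nn
import torch.nn.functional as F
# Hypothetical import path; check the repository for the actual module name.
from pal_optimizer import PalOptimizer

model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
optimizer = PalOptimizer(model.parameters(), mu=0.1, alpha=1.0, beta=0.0, s_max=3.6)

for inputs, targets in train_loader:  # assumes an existing DataLoader
    def closure():
        # Must be deterministic: PAL evaluates the loss twice per step
        # on the same batch (no Dropout or other random components).
        optimizer.zero_grad()
        loss = F.cross_entropy(model(inputs), targets)
        loss.backward()
        return loss

    optimizer.step(closure)
```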
- Limitations:
- The DNN must not contain any random components such as Dropout or ShakeDrop. This is because PAL requires two loss values of the same deterministic function (= two network inferences) to determine an update step. Otherwise the function would not be continuous and a parabolic approximation would not be possible. However, if these random component implementations were changed so that the drawn random numbers can be reused for at least two inferences, PAL would also support these operations.
- If using Dropout, it has to be replaced with the adapted implementation we provide, which works with PAL; a sketch of the underlying idea is given after the lists below.
- With TensorFlow 1.15 and 2.0 it was not possible for us to write a completely graph-based optimizer. Therefore, it has to be used slightly differently than other optimizers; have a look at the example code! This is not the case with PyTorch.
- The TensorFlow implementation does not support the Keras and Estimator APIs.
- Runs with TensorFlow 1.15
- Uses tensorboard for plotting
- Parabola approximations and loss lines can be plotted
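The adapted Dropout idea mentioned in the limitations above boils down to reusing the drawn random mask for both inferences of one step. Below is a minimal sketch of this concept (our own illustration, not the adapted implementation shipped with this repository).

```python
import torch
import torch.nn as nn

class MaskReusingDropout(nn.Module):
    """Dropout whose random mask stays fixed until resample() is called,
    so consecutive inferences see the same deterministic function."""

    def __init__(self, p=0.5):
        super().__init__()
        self.p = p
        self.mask = None

    def resample(self):
        # Call once per optimization step, before the first inference.
        self.mask = None

    def forward(self, x):
        if not self.training or self.p == 0.0:
            return x
        if self.mask is None or self.mask.shape != x.shape:
            keep = 1.0 - self.p
            self.mask = (torch.rand_like(x) < keep).to(x.dtype) / keep
        return x * self.mask
```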
A virtual environment capable of executing the provided code can be created from the provided python_virtual_env_requirements.txt.