PeakLab v1 Documentation
Whittaker Baselines
Whittaker-Based Baseline Estimation
Whittaker smoothing is the modern "gold standard" for baseline correction in high-resolution chromatography and spectroscopy (DAD, FID, Mass Spec). Unlike morphological filters (Rolling Ball, Convex Hull) which treat the signal as a physical shape, Whittaker methods treat the baseline as a penalized least squares problem.
The Mathematical Foundation
The Whittaker algorithm seeks to find a baseline that minimizes a cost function balancing two competing goals:
Fidelity
How closely the baseline follows the original data.
Smoothness
How much the baseline resists rapid changes.
This is expressed via second-order finite differences, minimizing

Q = Σ_i (y_i − z_i)² + λ Σ_i (Δ²z_i)²,  with Δ²z_i = z_(i−1) − 2z_i + z_(i+1)

where y is the raw data and z is the estimated baseline.
Where lambda is the smoothing parameter. In PeakLab, we use reweighted Whittaker algorithms, which apply a weight vector w to the fidelity term. This allows the algorithm to "ignore" positive-going peaks while staying pinned to the noise floor. These are iterative fitting algorithms.
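As a minimal sketch of the underlying solve (not PeakLab's internal code), the penalized least squares problem reduces to a banded linear system that SciPy can solve sparsely; the function name, test signal, and lambda value below are illustrative:

```python
# Sketch of a single weighted Whittaker solve. Minimizing
#   sum_i w_i*(y_i - z_i)^2 + lam * sum_i (second difference of z)^2
# leads to the banded (pentadiagonal) system  (W + lam*D'D) z = W y.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def whittaker_baseline(y, w, lam):
    """One weighted Whittaker solve; w down-weights peak regions."""
    n = len(y)
    # Second-order difference operator: row i holds [1, -2, 1]
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    A = (sparse.diags(w) + lam * (D.T @ D)).tocsc()  # pentadiagonal system
    return spsolve(A, w * y)

# Noisy-free linear baseline plus one narrow peak; with unit weights and a
# large lambda, the smoother flattens the peak while keeping the linear trend.
x = np.linspace(0, 1, 200)
y = 2.0 + 0.5 * x + np.exp(-((x - 0.5) ** 2) / 0.001)
z = whittaker_baseline(y, np.ones_like(y), lam=1e5)
```

The reweighted algorithms described below repeat this solve, updating w between passes so points judged to lie in peaks lose their influence on the fit.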
Strengths and Applications
Noise Centering
Ideal for detectors where noise is centered around the baseline (e.g., UV-Vis, DAD).
Numerical Stability
Optimized via pentadiagonal matrix solvers, ensuring rapid convergence even for large datasets.
Flexibility
Different weighting schemes (arPLS, asPLS, drPLS) allow for varying levels of peak suppression and curvature adaptation.
Comparison: When to Use Whittaker vs. Morphological
| Feature    | Whittaker (asPLS, arPLS, etc.)                | Morphological (Rolling Ball, Convex Hull)                 |
| Noise Type | Gaussian / Centered Noise                     | "Count" type / Poisson / Unipolar Noise                   |
| Peak Shape | Best for varying widths and overlapping peaks | Best for isolated, well-defined peaks                     |
| Curvature  | Adapts to complex, shifting baselines         | Can "dip" into wide peaks if the ball radius is too small |
Core References
Whittaker, E. T. (1922). A new method of graduation. Proceedings of the Edinburgh Mathematical Society. (The original foundation).
Eilers, P. H. C. (2003). A Perfect Smoother. Analytical Chemistry. (The definitive modern adaptation for analytical chemistry).
Baek, S.-H., et al. (2015). Baseline correction using asymmetrically reweighted penalized least squares. Analyst. (The "arPLS" standard).
Whittaker Parameter Tuning (lambda and p)
Most Whittaker-based routines (asls, arpls, airpls, etc.) rely on two primary variables that define the "stiffness" and "directionality" of the baseline.
Smoothness (lambda)
Physical Meaning: The "Tension" of the baseline.
High lambda: The baseline behaves like a rigid steel rod; it approaches a straight line and ignores all fluctuations.
Low lambda: The baseline behaves like a flexible string; it follows every curve and noise wiggle.
Asymmetry (p)
Physical Meaning: The "Directional Bias".
Definition: In algorithms like asls, p is the weight assigned to positive residuals (peaks) and 1-p is assigned to negative residuals (noise/valleys).
The Value: Usually set very small (e.g., 0.001 to 0.01).
Mechanism: By setting p low, the algorithm "penalizes" the baseline for going into a peak but allows it to stay centered within the noise floor.
Note on 'arpls/iarpls': These use a self-tuning weighting function instead of a fixed p, but they still use lambda to define the overall stiffness.
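A minimal asls-style loop makes the role of lambda and p concrete; this is an illustrative sketch under the definitions above, not PeakLab's implementation:

```python
# Minimal asls-style iteration (illustrative sketch): weight p is applied
# where the data sit above the current baseline (peaks), 1-p where they sit
# below it (noise/valleys), then the penalized fit is repeated.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def asls(y, lam=1e4, p=0.001, n_iter=10):
    n = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    H = lam * (D.T @ D)                # smoothness penalty, scaled by lambda
    w = np.ones(n)
    for _ in range(n_iter):
        z = spsolve((sparse.diags(w) + H).tocsc(), w * y)
        w = np.where(y > z, p, 1.0 - p)  # directional bias from p
    return z

# Linear baseline plus a single peak; the fit recovers the baseline under it.
x = np.linspace(0, 1, 300)
base = 1.0 + x
y = base + 5.0 * np.exp(-((x - 0.5) ** 2) / 0.002)
z = asls(y)
```

Because p is so small, points above the baseline contribute almost nothing to the fidelity term, so the baseline interpolates smoothly beneath the peak instead of being dragged up into it.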
Whittaker Algorithms
arpls - Asymmetrically Reweighted Penalized Least Squares
Iteratively suppresses positive residuals (peaks) more than negative ones to estimate a smooth baseline without being biased by signal peaks.
lsrpls - Locally Symmetric Reweighted Penalized Least Squares
Applies symmetric weighting to penalized least squares, improving baseline fit in symmetric peak regions.
brpls - Bayesian Reweighted Penalized Least Squares
Uses Bayesian modeling to estimate peak proportions and reweight the baseline fit.
drpls - Doubly Reweighted Penalized Least Squares
Applies two layers of reweighting to better suppress peaks and noise during baseline estimation.
aspls - Adaptive Smoothness Penalized Least Squares
Dynamically adjusts the smoothing parameter across the signal to better handle variable baseline curvature.
iarpls - Improved Asymmetrically Reweighted Penalized Least Squares
Enhances arpls by refining the weighting scheme for better convergence and peak suppression.
psalsa - Peaked Signal's Asymmetric Least Squares Algorithm
Like asls but uses exponential decay weighting for values above the baseline, allowing better handling of noisy, peak-heavy data.
derpsalsa - Derivative Peak-Screening Asymmetric Least Squares Algorithm
Enhances psalsa by screening peaks using smoothed first and second derivatives before reweighting.
asls - Asymmetric Least Squares Smoothing
Classic baseline method that penalizes positive residuals more than negative ones to suppress peaks.
iasls - Improved Asymmetric Least Squares Smoothing
Extends asls by incorporating both first and second derivatives of residuals.
airpls - Adaptive Iteratively Reweighted Penalized Least Squares
Iteratively updates weights based on residuals, adaptively suppressing peaks and improving convergence.
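As an illustration of the self-tuning weighting used by arpls, the weight update from Baek et al. (2015) can be sketched as follows; this is a simplified reading of the published formula, not PeakLab's code:

```python
# Sketch of the arpls reweighting step (after Baek et al., 2015): each
# point's weight is a logistic function of its residual, scaled by the mean
# and standard deviation of the negative residuals, so no fixed p is needed.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

def arpls(y, lam=1e5, n_iter=20, tol=1e-6):
    n = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    H = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        z = spsolve((sparse.diags(w) + H).tocsc(), w * y)
        d = y - z
        dn = d[d < 0]                  # residuals on the noise side
        if dn.size == 0:
            break
        m, s = dn.mean(), max(dn.std(), 1e-12)
        arg = np.clip(2.0 * (d - (2.0 * s - m)) / s, -500, 500)
        w_new = 1.0 / (1.0 + np.exp(arg))
        if np.linalg.norm(w_new - w) / max(np.linalg.norm(w), 1e-12) < tol:
            w = w_new
            break
        w = w_new
    return z

# Noisy linear baseline plus a peak; the logistic weights suppress the peak.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 300)
base = 1.0 + x
y = base + 5.0 * np.exp(-((x - 0.5) ** 2) / 0.002) + 0.05 * rng.standard_normal(300)
z = arpls(y)
```

Note how the weighting adapts to the noise statistics on each pass: points far above the estimated baseline receive weights near zero, while points within the noise band keep weights near one, which is why no asymmetry parameter p appears.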
Whittaker Algorithm Optimization to Match Human Designed Baselines

The best way to determine which algorithm to use is to bring in the human element. In the Baseline procedure, you will see an Optimize button that opens the dialog shown above. We recommend the SD Variation setting for the Baseline Detection and the Non-Parm Linear setting for the Model.
To save time, adjust the automatic settings to get the baseline reasonably close, and then use the mouse to highlight and unhighlight the baseline and peak regions until, to the human eye, you have what you perceive as the perfect human-designed baseline. You will probably want to set the non-parametric points to the minimum of 3 if you have only very small baseline-resolved zones to work with.

Once you have created the optimum human-designed baseline, as in the example above, we recommend saving this manual baseline for future use with similar data sets, or as a starting point for future optimizations.
Do not make any further changes to the dialog settings; if you do, the automated algorithms will re-estimate or re-fit the baseline points. Because a human-designed baseline takes appreciable effort to create, save it as soon as it is complete. Right-click the graph of the human baseline for these save options:
Save this Baseline/Non-Baseline State
Use this option to save an ASCII CSV containing this baseline state information. This can be subsequently imported to recreate the zones in time that you have specified as baseline in any data set irrespective of its x range or sampling rate.
Import this Data Set's Baseline/Non-Baseline State
Import All Data Sets' Baseline/Non-Baseline State
These options import the saved baseline state information for the current data, or for all data sets currently loaded.
Save this Baseline as an XY File
You can also save the actual baseline curve as an XY file. This saves the fitted baseline (the white line in the example), evaluated at the x values in the data set.
Import this Data Set's Baseline from XY File
This option will import the XY fitted baseline and apply it to the current data set.
After optionally saving the human target upon which the optimizations are to train, click the Optimize button. You must leave the algorithm set to the non-parametric model, or to whatever model was used to construct the human baseline.
Once you have the best human-engineered baseline you can devise, click the Optimize button. Select the algorithms you wish to train to match your human baseline. Click OK to initiate a set of genetic-algorithm (differential evolution) optimizations in which the parameter(s) of each Whittaker algorithm are adjusted to match your human-designed baseline as closely as possible. PeakLab launches an embedded Python procedure to perform these training optimizations.
Note that you are training these algorithms to process just this specific type of separation or spectral analysis. Baselines with an entirely different rate of change, or data with significantly different S/N, sampling rate, or peak widths, will require separate optimizations.
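The training step can be sketched with SciPy's differential evolution optimizer; the asls trainee, the synthetic "human" baseline, and the parameter bounds below are illustrative stand-ins, not PeakLab's embedded procedure:

```python
# Hypothetical sketch: train one algorithm's parameters to match a
# human-designed baseline by minimizing RMSE with differential evolution.
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve
from scipy.optimize import differential_evolution

def asls(y, lam, p, n_iter=5):
    """Plain asls fit used here as the trainee algorithm (illustrative)."""
    n = len(y)
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    H = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        z = spsolve((sparse.diags(w) + H).tocsc(), w * y)
        w = np.where(y > z, p, 1.0 - p)
    return z

# Synthetic stand-ins: 'human' is the baseline a person would draw.
x = np.linspace(0, 1, 150)
human = 1.0 + 0.5 * np.sin(3.0 * x)
y = human + 3.0 * np.exp(-((x - 0.6) ** 2) / 0.003)

def rmse(params):
    log_lam, p = params
    return float(np.sqrt(np.mean((asls(y, 10.0 ** log_lam, p) - human) ** 2)))

# Search log10(lambda) and p for the closest match to the human baseline.
result = differential_evolution(rmse, bounds=[(2.0, 9.0), (1e-4, 0.1)],
                                seed=0, maxiter=10, popsize=8, tol=1e-6)
best_log_lam, best_p = result.x
```

Searching in log10(lambda) rather than lambda itself keeps the search space well-scaled, since useful lambda values span many orders of magnitude.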
When the optimizations are complete, you will see a summary similar to the following:
In this example, the arpls, lsrpls, and drpls algorithms outperformed the others in matching the human-designed baseline. When you click OK, the parameter(s) for each algorithm are automatically updated in the dialog so that you can view the optimizations.
Choose the Partial option each time it appears so that the human-designed baseline is never updated, if you wish to keep it displayed. The Whittaker baseline algorithms override your selected points, as each determines its own set of peak and baseline points in the data. Your own specified points, or those automatically determined by the selected method, are not used, even though they are shown on the plot as a reference.

If the arpls algorithm is chosen as the Model after this optimization, the optimized arpls lambda will be used, generating the white baseline above. You can then select each of the different baselines if you wish to see which one best manages your specific baseline.
BEADS and XPS Baseline Algorithms
The BEADS and XPS baseline algorithms are non-Whittaker algorithms that are also offered in the Baseline option. Please be wary of optimizing the beads algorithm alongside the Whittaker algorithms. BEADS is notoriously hard to tune manually, and as such the optimization may be of appreciable value, but it is a computationally expensive iterative algorithm, and the GA must optimize five different parameters. For large data sets, it can be excruciatingly slow.