IRF Estimation 2 - Standards Sectioned to Contain Only 'Fronted' Peaks
In this second approach to estimating an IRF, we will use the same standard as in the IRF Estimation 1 approach, but we will section the data to fit only the first fronted peak in the 5, 10, and 25 ppm standards.
The Problem with Tailed Peaks and IRF Estimation
We will begin by looking at the IRF fits in the IRF Estimation 1 example, and plot the most fronted and tailed peaks in one of the 25 ppm concentration fits:
The red curve is the IRF-deconvolved fit, the fit determined by the parameters absent the IRF component. The red is thus the peak that would be seen if no IRF were present; that is, the instrument smears the red curve into the peak that appears in the data. For the fronted peak, because the skew of the peak and the skew of the IRF are in opposite directions, and because fronted peaks are often the earliest to elute and thus narrower in width, there tends to be a sharp difference between the peak with and without the IRF. For the tailed peak, the skew of the peak and the skew of the IRF are in the same direction, and further, strongly tailed peaks are often the last to elute and thus of a greater width. The difference between the peak with and without the IRF is much less.
In statistical terms, the IRF and tailed peak parameters are more likely to be correlated, each somewhat indistinguishable from the other in its impact upon the model. In other words, the tailing of the IRF blends in with the tailing of the native peak. When the two are almost indistinguishable, as in the plot above, the fit algorithm will have difficulty resolving the parameters for the two different tailing processes.
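The smearing described above can be sketched numerically. The following is a minimal illustration, not PeakLab's implementation: it assumes the <ge> IRF is a one-sided half-Gaussian plus a one-sided exponential mixed by the Gaussian's area fraction, it uses a plain Gaussian as a stand-in for the IRF-free peak rather than the GenNLC model, and the sampling interval and the function name `ge_irf` are hypothetical.

```python
import numpy as np

def ge_irf(t, g_sd, e_tau, g_frac):
    """One-sided <ge> IRF sampled on t >= 0: half-Gaussian plus exponential.

    Each component is normalized to unit discrete sum, then mixed so that
    g_frac is the area fraction of the half-Gaussian (assumed convention).
    """
    half_gauss = np.exp(-0.5 * (t / g_sd) ** 2)
    expo = np.exp(-t / e_tau)
    return g_frac * half_gauss / half_gauss.sum() + (1.0 - g_frac) * expo / expo.sum()

dt = 0.001                      # sampling interval (assumed)
t = np.arange(4096) * dt        # time axis starting at 0
irf = ge_irf(t, g_sd=0.0080, e_tau=0.0432, g_frac=0.621)

# Stand-in "true" peak: a plain Gaussian, NOT the GenNLC model.
true_peak = np.exp(-0.5 * ((t - 2.3) / 0.02) ** 2)

# Causal convolution; keep the first len(t) samples so the axes line up.
observed = np.convolve(true_peak, irf)[: t.size]

print("apex without IRF:", t[true_peak.argmax()])
print("apex with IRF:   ", t[observed.argmax()])
print("height ratio:    ", observed.max() / true_peak.max())
```

The area is preserved, while the apex moves later and drops in height. A narrow early peak changes visibly under this smearing, which is the distinction the fit needs in order to resolve the IRF parameters.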
Fitting Only the Most Fronted Peak with the GenNLC<ge> Model
Just as one can generally safely assume the IRF will be close to constant across concentration, one can usually assume the IRF is constant across time within the chromatogram, whether a peak is strongly fronted, narrow, and the first peak to appear, or strongly tailed, wide, and the last peak to appear.
In this estimation, we will isolate the first eluting peak in the standard used in the IRF Estimation 1 example and fit the same nine data sets (three series across time at 5, 10, and 25 ppm concentration) to the GenNLC<ge> model:
Average for 9 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99999486 | 0.99999482 | 0.01852498 | 35,613,035 | 5.13756651
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 4.71457477 | 2.31818277 | 0.00024211 | -0.0059113 | 1.17888115 | 0.00681811 | 0.04313462 | 0.62990383
CV Percent for 9 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.00032206 | 0.00032450 | 78.6187622 | 33.7527815 | 62.6876684
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 67.4922715 | 0.94823526 | 4.45440137 | 66.1591631 | 12.0786590 | 52.1341809 | 0.75540920 | 3.02022061
The values we estimated for the <ge> IRF in the IRF Estimation 1 example were [0.0080, 0.04316, 0.621]. Here we have almost exactly the same 'e' exponential tau, and the same area fraction for the Gaussian. The average half-Gaussian response width, however, is somewhat less. Let's look at this data differently and average the three fits at each of the 5, 10, and 25 ppm concentrations:
Average for three 5 ppm Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99999565 | 0.99999561 | 0.00731264 | 35,938,653 | 4.35397757
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 1.77347599 | 2.33957221 | 0.00024529 | -0.0022722 | 1.13489593 | 0.00309406 | 0.04319880 | 0.63763542
CV Percent for three 5 ppm Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
9.7136e-5 | 9.788e-5 | 9.99223089 | 19.8595198 | 22.3120398
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 0.93539385 | 0.54395807 | 2.30976597 | 1.25213762 | 22.6172011 | 140.114352 | 1.21257714 | 5.31354888
Average for three 10 ppm Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99999587 | 0.99999584 | 0.01328945 | 39,256,195 | 4.12542900
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 3.53759334 | 2.32080542 | 0.00022951 | -0.0044967 | 1.25229684 | 0.00855179 | 0.04324721 | 0.62334698
CV Percent for three 10 ppm Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.00012923 | 0.00013019 | 14.5424412 | 30.3054744 | 31.3228332
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 1.18737301 | 0.41425904 | 0.82651015 | 0.40048571 | 4.28786134 | 6.68955456 | 0.45352651 | 1.59413959
Average for three 25 ppm Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99999307 | 0.99999301 | 0.03497286 | 31,644,258 | 6.93329297
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 8.83265496 | 2.29417067 | 0.00025151 | -0.0109652 | 1.14945069 | 0.00880847 | 0.04295784 | 0.62872909
CV Percent for three 25 ppm Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.00056198 | 0.00056621 | 41.3070800 | 58.4231917 | 81.0544013
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 0.93954335 | 0.47140661 | 2.62973527 | 0.03698853 | 0.60398245 | 4.60720109 | 0.46421850 | 1.06490869
While the 10 ppm and 25 ppm fits come closer to the 0.008 'g' width we found in fitting all six peaks in the standard and averaging across samples and concentrations, here the 'g' for the 5 ppm samples averages to just 0.003. Given that the fit is to just a single peak, and given the lower S/N and lower chromatographic distortion in the 5 ppm samples, the 0.008 of the higher S/N and higher a3 samples is more likely correct.
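The averaging arithmetic can be checked directly from the per-concentration tables. The sketch below copies the a5 ('g' SD), a6 ('e' tau), and a7 (Gaussian area fraction) averages from the 5, 10, and 25 ppm tables. Because each group holds three fits, the mean of the three group averages reproduces the nine-fit average. The CV% here is computed across the three concentration averages only, using the sample standard deviation (an assumed definition), so it is not the nine-fit CV% reported in the tables.

```python
from statistics import mean, stdev

# Average 'g' SD (a5), 'e' tau (a6), and Gaussian area fraction (a7)
# copied from the 5, 10, and 25 ppm tables above.
group_means = {
    5:  {"a5": 0.00309406, "a6": 0.04319880, "a7": 0.63763542},
    10: {"a5": 0.00855179, "a6": 0.04324721, "a7": 0.62334698},
    25: {"a5": 0.00880847, "a6": 0.04295784, "a7": 0.62872909},
}

def average(param, concentrations):
    """Mean of one parameter over the chosen concentration groups."""
    return mean(group_means[c][param] for c in concentrations)

def cv_percent(values):
    """CV% as 100 * sample standard deviation / mean (an assumed definition)."""
    values = list(values)
    return 100.0 * stdev(values) / mean(values)

for p in ("a5", "a6", "a7"):
    all_three = average(p, (5, 10, 25))
    high_only = average(p, (10, 25))
    spread = cv_percent(group_means[c][p] for c in (5, 10, 25))
    print(f"{p}: all={all_three:.5f}  10+25 ppm={high_only:.5f}  CV%={spread:.1f}")
```

The a6 exponential tau barely moves between groups, while the a5 half-Gaussian width is dominated by the low 5 ppm value.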
Confirmation with Fourier Deconvolution
If we enter the .0087, .0431, and .625 from the 10 and 25 ppm averages in the IRF Deconvolution option and zoom in on the first peak, we see the following:
The Fourier deconvolution across all nine data sets, using these <ge> parameters and zoomed in on the first peak, isn't perfect across all nine sets, but it is close. If we instead enter .003 for the 'g' SD, from the 5 ppm fits, the deconvolution errors are clearly greater than in the plots above, and this is especially obvious at the 25 ppm concentration:
If we enter .015 for the 'g' SD, we see an incomplete deconvolution at the 5 and 10 ppm concentration, and a clear breakdown in the deconvolution on the 25 ppm concentration samples.
When one is having difficulty consistently fitting an IRF, inspection of the different parameters in a Fourier deconvolution helps visualize what is happening. The IRF Deconvolution procedure also offers a genetic algorithm which will seek to optimize an IRF by maximizing the amount of baseline.
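To make the idea of this cross check concrete, here is a minimal sketch of a Fourier deconvolution on synthetic, noise-free data. It is not the program's IRF Deconvolution procedure: it divides the spectra directly with no noise filtering, it assumes the same half-Gaussian plus exponential form for the <ge> IRF, and it uses a plain Gaussian as a stand-in for the true peak. The function names and sampling interval are hypothetical.

```python
import numpy as np

def ge_irf(t, g_sd, e_tau, g_frac):
    """One-sided <ge> IRF: half-Gaussian plus exponential, unit discrete sum."""
    half_gauss = np.exp(-0.5 * (t / g_sd) ** 2)
    expo = np.exp(-t / e_tau)
    return g_frac * half_gauss / half_gauss.sum() + (1.0 - g_frac) * expo / expo.sum()

def fourier_deconvolve(signal, irf):
    """Divide the spectra; no noise filter, so suitable for clean data only."""
    return np.fft.ifft(np.fft.fft(signal) / np.fft.fft(irf)).real

dt = 0.001                                            # sampling interval (assumed)
t = np.arange(4096) * dt
true_peak = np.exp(-0.5 * ((t - 2.3) / 0.02) ** 2)    # Gaussian stand-in

# Synthesize "measured" data with the <ge> values quoted in the text.
irf_true = ge_irf(t, 0.0087, 0.0431, 0.625)
measured = np.fft.ifft(np.fft.fft(true_peak) * np.fft.fft(irf_true)).real

def max_error(g_sd, e_tau, g_frac):
    """Largest deviation of the deconvolved trace from the known true peak."""
    trial = fourier_deconvolve(measured, ge_irf(t, g_sd, e_tau, g_frac))
    return np.abs(trial - true_peak).max()

err_matched = max_error(0.0087, 0.0431, 0.625)   # IRF used to build the data
err_small_g = max_error(0.0030, 0.0431, 0.625)   # 'g' SD too small
err_large_g = max_error(0.0150, 0.0431, 0.625)   # 'g' SD too large

print(err_matched, err_small_g, err_large_g)
```

With real data the true peak is unknown, so the judgment is visual, as in the plots: a mismatched IRF leaves residual tailing or produces overshoot at the baseline.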
An IRF Models System Effects Which May Not Be Truly Constant
As evident here in these single peak fits, an IRF may model system effects which vary somewhat with sample concentrations, temperature, additives, and other prep or run factors. The 'g' component in this example is not the 'primary' component of the IRF, even though it represents better than half the overall IRF on an area basis. There is a significant amount of this probabilistic distortion occurring, but its impact on the overall peak shape is much less because it is quite small in width. This secondary 'g' component may be modeling system nonidealities which are not absolutely constant, such as axial dispersion. In such cases, you will want a good average of this 'g' component accepting that a certain compromise will be necessary to assume a constant IRF.
Again, the Fourier Deconvolution option furnishes a near immediate cross check of the validity of the IRF before removing it. In the above example, we approximately halved and doubled the 'g' component's width. In varying 'g' one observes only slowly changing effects. Let us now visually adjust the much larger and highly constant primary 'e' component. Let us reduce 'e' from 0.0428 to 0.0418. Clearly the width is now too small.
Let us now increase 'e' from 0.0428 to 0.0438. Clearly the width is too large. The larger width in an IRF is actually this sensitive in a deconvolution and rather easily visualized.
A fronted peak is preferable for this genetic algorithm estimation of the IRF since it exhibits this lovely transition at the baseline. While we will not say that this genetic procedure fails with tailed peaks, we can note it is harder. If you can manage it, you may want to add an early eluting fronted peak to your standard specifically for instrumental modeling. Please remember that if you do this and the peak appears tailed, it may actually be fronted, with the tailing you see coming from the instrumental distortion. You will need to fit such a peak to be certain.
Fitting Only the Most Fronted Peak with the GenNLC<e2> Model
Average for 9 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99999562 | 0.99999558 | 0.01740951 | 41,476,901 | 4.38248857
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<e2> | 4.71548080 | 2.31799021 | 0.00024086 | -0.0059039 | 1.24802205 | 0.00561802 | 0.04377159 | 0.63718806
CV Percent for 9 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.00032834 | 0.00033098 | 83.5864512 | 37.0837027 | 74.9193363
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<e2> | 67.4956866 | 0.83754286 | 4.98583029 | 65.6072824 | 9.92539336 | 6.856777526 | 0.61770387 | 2.845947898
In this instance the <e2> IRF has no issue at the 5 ppm concentration. You should not automatically assume that a two-exponential IRF is going to be more correlated, and less stable, than the sum of the half-Gaussian and exponential IRF. It is quite possible a fast and slow kinetic distortion is a more accurate picture of a system's distortions. Here we see nearly the same slow 'e', .0437, and it has close to the same area fraction as in the <ge> IRF. Here the fast 'e' is .0056, and stable, with only a 6.86% CV across the nine fits of the one fronted peak.
Again, we suspect the fast component of both the <e2> and <ge> distortions is probably modeling a mix of kinetic and probabilistic non-idealities, which is why each of these IRFs appears about equally viable in many of the data sets where we've done IRF determinations.
In this example, the <e2> IRF looks slightly more stable than the <ge> across the different samples and concentrations in this deconvolution visualization.
Solvent or System Peaks
In many chromatograms, there may be a ready-made early eluting peak in the form of a solvent or system peak which appears just after the column dead time. In our experience, such peaks may appear tailed, but because there is virtually no chromatographic separation, they are probably absent any a3 chromatographic fronting or tailing. Nearly the whole of the tailing should be from the IRF.
The non-retained peak at the dead time has no chromatographic separation and fitting a model based upon such is dubious, although the GenHVL and GenNLC models will adjust for the lack of chromatographic distortion (a3 iterating to near 0). We have been successful in fitting a clean dead time peak, even though such will be a mix of all unretained components. We were able to get the higher width exponential correct, but the narrow width component often iterated to zero. The bigger problem in using the unretained peak for IRF estimation is that it is not a single entity. Everything unretained is there, and certain impurities may be ever so slightly retained, distorting this t0 shape. Further, in the case of IC, those different unretained components may have different ionic charges, resulting in anything but a coherent peak.
As such, you may wish to consider fitting a peak with a tiny measure of chromatographic separation that occurs immediately after the dead-time pulse, if such exists.
Here are the solvent/system peaks in the data series we have been using for these various IRF estimation examples. There is a second peak following the t0 peak that is unrelated to concentration of the solutes. There is the t0 peak which is not cleanly defined in any of the runs, but which can possibly be fitted in the 10 ppm and 25 ppm concentration data sets in the second and third rows in the plot above.
We simply proceed as if these were solute peaks, reversing the sign of the signal where needed, performing a baseline subtraction, and then transforming for the t0=1.15 dead time.
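These three preparation steps can be sketched as follows. This is an illustration under stated assumptions, not the program's procedure: the baseline is taken as a straight line through the averaged end points of the sectioned data, the dead-time transform is assumed to be a simple shift of the time axis by t0, and the function name and its arguments are hypothetical.

```python
def prepare_system_peak(times, signal, t0=1.15, invert=False, edge_points=5):
    """Return (shifted_times, corrected_signal) for a sectioned system peak."""
    if len(times) != len(signal) or len(times) < 2 * edge_points:
        raise ValueError("need matching arrays with enough edge points")

    # Step 1: reverse the sign for negative-going peaks.
    y = [-v for v in signal] if invert else list(signal)

    # Step 2: subtract a linear baseline through the mean of the first
    # and last few points (an assumed, very simple baseline model).
    x_lo = sum(times[:edge_points]) / edge_points
    x_hi = sum(times[-edge_points:]) / edge_points
    y_lo = sum(y[:edge_points]) / edge_points
    y_hi = sum(y[-edge_points:]) / edge_points
    slope = (y_hi - y_lo) / (x_hi - x_lo)
    corrected = [v - (y_lo + slope * (x - x_lo)) for x, v in zip(times, y)]

    # Step 3: shift the time axis by the dead time (assumed form of the
    # t0 transform).
    shifted = [x - t0 for x in times]
    return shifted, corrected

# Synthetic negative-going peak on a sloping baseline (illustrative data only).
times = [1.20 + 0.01 * i for i in range(50)]
signal = [-(2.0 + 0.5 * x) - (3.0 if 20 <= i < 30 else 0.0)
          for i, x in enumerate(times)]
shifted, corrected = prepare_system_peak(times, signal, invert=True)
print(shifted[0], max(corrected))
```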
Fitting the Peak Just After t0 with the GenNLC<ge> Model
If we look at only this post-dead-time peak in the View and Compare Data option using a contour with no area or amplitude normalization, we see that the samples without the PDCA additive have a much smaller magnitude second peak and these elute sooner. The samples with PDCA elute later, and the greater the PDCA amount in the prep, the later the elution. The magnitude of the peak is strongly correlated with the amount of PDCA on the 5 ppm samples. The 10 ppm samples have this same trend, do not evidence a higher measure of broadening, and are lower in overall magnitude. This peak on the 25 ppm samples is actually less broadened and, for the higher PDCA concentrations, is even smaller in magnitude. In all cases, if you look closely at the contours and peak shapes, there is a tailing in the peaks as they are registered by the instrument.
We have been suitably warned. A peak which is clearly related to additive concentration, in both location and magnitude, exists when no additive is present. The peak also narrows and decreases in magnitude with higher concentration. If there are two (or more) components mixed within this post t0 peak, any IRF estimation will be suspect.
In order to get fits that are useful on such peaks, we must either fit the NLC<ge> model (absent the ZDD asymmetry adjustment for the third moment), or we must lock this a4 asymmetry (third moment ZDD adjustment) parameter at a value typical of the GenNLC<ge> model. We will do the latter, setting the a4 asymmetry to the program's default of 1.186. This locked value, which is somewhat higher than the Giddings' 0.5 asymmetry in the NLC, is used to constrain the ZDD of these fits to values traditionally seen with such models. One of the GenNLC<ge> peaks had an anomalous shape and a weak fit, and was omitted from the average.
Average for 11 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99997517 | 0.99997483 | 0.01808092 | 14,749,422 | 24.8323020
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 2.03591564 | 0.66395635 | 0.00062278 | 0.00202550 | 1.18580000 | 0.00680595 | 0.04652329 | 0.61186396
CV Percent for 11 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.00265098 | 0.00268737 | 41.4368373 | 102.472614 | 106.752769
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 62.0529065 | 14.0095544 | 15.8265837 | 115.358597 | 0.00000000 | 98.9459839 | 15.9917944 | 15.7660537
If you have been following the previous three examples, we realized an IRF of [0.0080, 0.0432, 0.621] from fitting all fronted and tailed peaks in these standards (IRF Estimation 1), and [0.0087, 0.0431, 0.625] from fitting the first fronted peak in these standards. Our average IRF here of [0.0068, 0.0465, 0.612] is surprisingly close. The CV percents are high compared to those of the solute peaks, suggesting that many more such system peaks should be averaged in order to arrive at a sound estimate of the IRF.
The a3 did fit to a tailed shape.
Fitting the t0 'Mash' with the GenNLC<ge> Model
At this point, we will fit the mash of everything that elutes at the dead time volume, that which is nowhere retained. Here we must use only the 10 and 25 ppm samples since the 5 ppm peaks oscillated about the baseline, certainly suggestive of various components with different ionic charges.
Although none of the eight t0 peaks are clean in their rise, and there are differences between additive-bearing and non-additive runs, the additive influence is much less. We also see the familiar indeterminate shape we described in the IRF Estimation 1 example for a fronted peak that appears tailed as a consequence of the tailing in the IRF.
These fits are poorer, as expected from the visible perturbation that occurs near the t0 rise of the peak, as well as the mix of components lumped into this one peak. A peak whose component ionic charges and locations are such that the 5 ppm samples are unfittable, having an oscillating signal instead of a peak, should be considered a last resort in an IRF estimation.
Average for 8 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99974842 | 0.99974462 | 0.68344060 | 366,073 | 251.575019
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 16.4734538 | 0.10436431 | 0.00212077 | -0.0146913 | -0.0718759 | 0.00272253 | 0.04477119 | 0.79476011
CV Percent for 8 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.01161848 | 0.01179423 | 53.5207178 | 45.6786505 | 46.1713533
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6 | a7
1 | GenNLC<ge> | 67.9831477 | 13.3472578 | 6.23123076 | 161.064853 | 255.892590 | 100.667339 | 6.25773213 | 26.3491419
If we compare this [0.0027, 0.0448, .794] IRF estimate with the [0.0080, 0.0432, 0.621], [0.0087, 0.0431, 0.625], of the other estimations, and the [0.0068, 0.0465, 0.612] from this second system peak, we can't be particularly encouraged, but the stability of the a6 'e' exponential parameter is impressive.
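A short calculation makes this comparison concrete. The sketch below uses only the four rounded <ge> estimates quoted in the text and reports each parameter's percent deviation from the IRF Estimation 1 values. The choice of that estimate as the reference is ours, for illustration only.

```python
# <ge> IRF estimates quoted in the text: ('g' SD, 'e' tau, 'g' area fraction).
estimates = {
    "all peaks (IRF Estimation 1)": (0.0080, 0.0432, 0.621),
    "first fronted peak":           (0.0087, 0.0431, 0.625),
    "peak just after t0":           (0.0068, 0.0465, 0.612),
    "t0 unretained peak":           (0.0027, 0.0448, 0.794),
}
names = ("g SD", "e tau", "g area fraction")
reference = estimates["all peaks (IRF Estimation 1)"]

def percent_deviation(value, ref):
    """Signed deviation of value from ref, in percent of ref."""
    return 100.0 * (value - ref) / ref

deviations = {
    label: tuple(percent_deviation(v, r) for v, r in zip(values, reference))
    for label, values in estimates.items()
}

for label, devs in deviations.items():
    row = "  ".join(f"{n}: {d:+6.1f}%" for n, d in zip(names, devs))
    print(f"{label:30s} {row}")
```

The 'e' tau stays within a few percent of the reference in every estimate, while the 'g' SD from the t0 peak is well under half the reference value.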
In the instance of the data used in all of these IRF Estimation examples, there was never an instance where any single exponential model (GenNLC<e> or NLC<e>) successfully fit the IRF. We have, in fact, never seen a chromatographic data set, LC or GC, where a simple exponential model accurately modeled the IRF. We furnish an <e> IRF model, for noisy data and for the instances of very slow detectors where the exponential response overrides all else.
When fitting the t0 unretained peak, and wanting to reduce the variability in the estimated parameters, one might wish to try the simpler NLC<ge> model, where the intrinsic zero-distortion density's asymmetry is locked at the Giddings value of the pure NLC model, or perhaps lock the GenNLC a4 at a specific asymmetry. If you do this, look closely at the degradation in the goodness of fit. The asymmetry in the generalized model may well be accommodating different components that make up this unretained peak. In other words, an odd asymmetry, positive or negative, is likely from the mix of unretained components.
For example, here is this same fit of this t0 peak to this simpler NLC<ge> model:
Average for 8 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.99902472 | 0.99901184 | 1.81627818 | 148,354 | 975.279214
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6
1 | NLC<ge> | 16.3183264 | 0.10099324 | 0.00193297 | -0.0067235 | 0.00145546 | 0.04513449 | 0.65620302
CV Percent for 8 Fits
Fitted Parameters
r2 Coef Det | DF Adj r2 | Fit Std Err | F-value | ppm uVar
0.06572511 | 0.06659403 | 87.2424546 | 69.4973578 | 67.3253413
Peak | Type | a0 | a1 | a2 | a3 | a4 | a5 | a6
1 | NLC<ge> | 67.7079295 | 9.38896271 | 41.9126207 | 102.003329 | 82.7015984 | 8.97717472 | 2.18900418
The CV% of the area fraction parameter of the <ge> IRF is much improved. This [0.0015, 0.0451, .656] IRF does look somewhat improved, although the estimate for the half-Gaussian is even worse. The key item to note is that the unaccounted variance, the error in the fitting, has nearly quadrupled. Note also that the NLC<ge> model's F-statistic is considerably worse, 148K vs 366K for the GenNLC<ge>.
About the only merit of this unretained mixture peak fit is that at t0 there can be no argument one is looking at anything complicated by the chromatographic separation. It is likely a pure IRF, merely difficult to estimate in such a peak consisting of a mixture of different components.
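The size of this degradation can be verified from the two summary tables. The tabulated ppm uVar values are consistent with the unexplained fraction of variance, 1 - r2, expressed in parts per million. The sketch below assumes that definition and uses only numbers copied from the tables. The helper name is hypothetical.

```python
def ppm_unaccounted_variance(r2):
    """Unexplained fraction of variance, 1 - r2, in parts per million."""
    return (1.0 - r2) * 1e6

# Averages for the 8 fits of the t0 peak, copied from the tables.
gen_nlc = {"r2": 0.99974842, "uvar": 251.575019, "F": 366_073}
nlc = {"r2": 0.99902472, "uvar": 975.279214, "F": 148_354}

for name, fit in (("GenNLC<ge>", gen_nlc), ("NLC<ge>", nlc)):
    print(name, round(ppm_unaccounted_variance(fit["r2"]), 2), fit["uvar"])

uvar_ratio = nlc["uvar"] / gen_nlc["uvar"]   # growth in unaccounted variance
f_ratio = gen_nlc["F"] / nlc["F"]            # drop in the F-statistic
print(f"uVar ratio: {uvar_ratio:.2f}, F ratio: {f_ratio:.2f}")
```

The unaccounted variance grows by a factor just under four, and the F-statistic falls by a factor of roughly two and a half.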
Inclusion of the Tailed Peaks in IRF Determination
The <ge> IRF we found for the 5-25 ppm fits in the IRF Estimation 1 approach that included all peaks, both fronted and tailed, was [0.0080, 0.0432, 0.621]. In this exercise where only the first fronted peak was fit, we used [.0087, .0431, .625] from the 10 and 25 ppm concentrations as the IRF. These are essentially the same IRF parameters, a useful confirmation of the accuracy of the IRF determination.
The beauty of the first approach is that you estimate the IRF every time you fit your standard. In this second approach, where the data are partitioned to fit only one or more fronted peaks, an additional fit is needed if the standard also contains tailed peaks. We will note that <irf> bearing model fits require a convolution integral to be computed in the Fourier domain, and this is done on a peak by peak basis. It is much faster to fit a single peak, a matter of seconds for the data sets in this example, than to fit the much larger data sets with all of the peaks, as was done in the IRF Estimation 1 example, which required minutes for the fitting.