Commercial gas furnace producing carbon dioxide
Commercial gas furnace producing carbon dioxide
2. IDENTIFICATION OF A GAS FURNACE
2.1 Introduction
The data were obtained from a commercial gas furnace producing carbon dioxide. A schematic diagram of the system is shown in Fig. 2.1. The output variable is the concentration of carbon dioxide measured as a percentage of the outlet gas from the furnace. The concentration is affected by two input variables, the air rate and the gas rate. In the experiment described here, the input air rate was fixed so that the transfer function between the input gas rate and output concentration could be determined using a single input, single output model.
2.2 Correlation Analysis
The demonstration consists of the following five steps.
2.2.1 Display the Raw Data
In the MATLAB environment load in the gas furnace data by typing load gasdata.dat and then run the demonstration by typing gas_ident. The raw input and output data are displayed in Fig.2.2.
2.2.2 Remove the Means and Linear Trends from the Raw Data
The process of removing the trend from the input and output in Eq. (2.2) is called remove trend operation. The data after the mean and trends have been removed is shown in Fig. 2.4.
2.2.3 Cross Correlation
The cross correlation function between the raw input and output and the cross correlation function between the detrended input and output are shown in Fig.2.5. Notice that the cross correlation function is not symmetrical about zero and has a well defined peak at , indicating that the output lags behind the input. In addition, the cross correlation are negative for some small , this is to be expected since an increase in the input produces a decrease in the output (see Fig.2.22.4) .
The cross correlation between the input and output will provide an estimate of the system impulse response, if and only if the input is a white noise sequence. Most signals are not white and therefore in order to produce an estimate of the system impulse response using the correlation the raw inputs must be prewhitened. This is achieved by considering the input to be a filtered white noise sequence
"*": the cross correlation functions between the detrended input and output.
The parameters can be estimated by fitting an autoregressive model to using a least squares algorithm. Once the estimates of the autoregressive model or prewhitening filter are available the input and output can be filtered using the filter to give
The inputoutput relationship of the system will be unchanged since the same filter has been applied to both input and output. The quality of the filter can be checked by computing the autocorrelation of which should be white. The autocorrelation function of a signal is defined as
where denotes either or in the present application. The autocorrelation function of the detrended input shown in Fig.2.6 clearly indicates that the input signal is not a white noise sequence.
Prewhitening the input by estimating the filter with , and then filtering the input and output produces the prewhitened signals shown in Fig. 2.7. The autocorrelation function in Fig. 2.8 clearly shows that the filter has been correctly estimated. Finally an estimate of the cross correlation function is shown in Fig. 2.9. This is an estimate of the system impulse response function because the input was prewhitened and shown to have an autocorrelation which is an impulse. The shape of the impulse response in Fig. 2.9 suggests there is a time delay of 2 samples and that the system is probably second order. A first order system would have an impulse response resembling an exponential decay. To find the model of the system we would need to fit a model to Fig. 2.9. But it is easier to do this using the original detrended data using parameter estimation, see Section 2.4.
2.3 Spectral Analysis
An estimate of the spectral density function of a signalis defined as
(2.8) where are a set of weights called the lag window, and is called the truncation point. It is assumed that the sampling interval in Eq. (2.8) for the purpose of computing convenience. If , then the real spectrum can still be estimated from (2.8) by multiplying by and plotting the estimates against instead of .
An estimate of the cross spectral density function between two signals and is defined as
Associated with the above cross spectral density function, the cross amplitude and phase spectra etc. can be derived as follows.
(i) The real part of is
This quantity measures the linear correlation between the two components of the bivariate process at frequency and is analogous to the square of the usual correlation function. The closer is to one, the more closely related are the two processes at frequency .
where is the distribution with degree rand confidence level .
MATLAB provides a tool for cross spectral analysis, in which the default value for the number of points in the FFT is 256 (or is equal to the length of the data if the data length is less then 256), the default value for the confidence level is (i.e., 95% confidence interval). In the gas furnace data, only 296 points can be used. In order to get better spectral estimates using crossspectral analysis methods, the detrended gas furnace data was initially extended periodically to inputoutput points. The period extension method is as follows:
The power spectral density functions of the detrended input and output for the gas furnace are shown in Fig. 2.10 and Fig. 2.11. The 95% confidence intervals for these spectra are shown as dotted lines. The gain spectrum, the cross phase spectrum and the squared coherence between the detrended input and output have been analysed, and are shown in Fig. 2.122.14. The coherency is close to one between 00.25Hz suggesting that the estimates are good over this range.
2.4 Parameter Estimation
Consider an ARMAX(n,m,k,d) (AutoRegressive Moving Average with eXogenous input) model
and are called regression terms, is a white noise;'s,'s and 's are constants; n,m ,k and d are the model orders and time delay, respectively.
The task in the gas furnace identification is to find an optimal submodel from a prespecified full model set. This will include term selection and parameter estimation. The detrended data shown in Fig. 2.4 will be used throughout. The identification results are listed in Table 2.1, where the symbol "+" indicates the residual correlation test is passed, and the "x" indicates the residual correlation test failed.
An optimal model with the estimated parameters listed in Table 2.2 is finally identified as
Table 2.1 Identification results for detrended gas furnace data
n=1, m=2, k=10 
n=2, m=3, k=10 

d 
LF 
FPE 
d 
LF 
FPE 

0 
0.0897 
0.1048 
+ 
x 
1 
0.0560 
0.0662 
+ 
+ 

1 
0.1201 
0.1330 
+ 
x 
2 
0.0553 
0.0655 
+ 
+ 

2 
0.0983 
0.1081 
+ 
x 
3 
0.0854 
0.0953 
+ 
+ 

3 
0.0579 
0.0676 
+ 
+ 
4 
0.0623 
0.0738 
+ 
x 

d=2, k=10 

n 
m 
LF 
FPE 
Note:d : Time delay; + : The residual correlation test passed; x : The residual correlation test failed. LF: Loss function; MO: Model order; FPE: Final prediction error 

1 
2 
0.0893 
0.1081 
+ 
x 

2 
3 
0.0553 
0.0655 
+ 
+ 

3 
4 
0.0536 
0.0645 
+ 
+ 

4 
5 
0.0534 
0.0650 
+ 
+ 
Table 2.2 Identification results for detrended gas furnace data
terms 

parameter estimates 
1.0810 
0.3159 
0.0076 
0.0443 
0.0223 
0.4282 
0.4724 
0.3173 
terms 

parameter estimates 
0.2195 
0.7674 
0.1525 
0.1293 
0.0765 
0.0054 
0.0002 

Loss function: 0.0553; FPE(final prediction error): 0.0655 
The purpose of model validation is to test if the identification model satisfies certain criteria. In this demonstration the criteria are the autocorrelation test of the model residuals and the crosscorrelation test between the input and the model residuals. These are illustrated in Fig. 2.15 for the model described in Eq. (2.24) and (2.25). Since the values of both the correlation functions are inside the 95% confidence bands (the dotted lines parallel to the xaxis) this indicates that the model is unbiased and provides a good representation of the data. A comparison of the onestepahead predicted output of the model
which is a much better indictor of model performance than the onestepahead predicted output, is illustrated in Fig 2.17.
The residuals and deterministic prediction errors are shown in Fig.2.18. The comparison between the system impulse response estimated using correlation analysis on the prewhitened input in step 2.2.3 and the impulse response computed from the estimated model in (2.24) illustrated in Fig. 19 shows a good correspondence and confirms the model prediction and model validations.
2.5 MATLAB Commands and Programmes Used for the Gas Furnace Identification
2.5.1 Plot the Raw Data
>> load gasdata.dat;
>> subplot(2,1,1),plot(gasdata(:,1)),
>> subplot(2,1,1),title('OUTPUT #1',),
>> subplot(2,1,2),plot(gasdata(:,2)),
>> subplot(2,1,2),title('INPUT #1'),
2.5.2 Remove the Means and Linear Trend from the Raw Data
The command detrend(x,0) is to remove the mean from a signal matrix x with one or more columns; the command detrend(x,1) is to remove the linear trend from a signal matrix x with one or more columns.
>> gas_mean_removed=detrend(gasdata,0); % Remove mean only;
>> gas_mean_removed(:,2)=gasdata(:,2)mean(gasdata(:,2));
% Remove mean from input;
>> gas_mean_removed(:,1)=gasdata(:,1)mean(gasdata(:,1));
% Remove mean from output;
>> gas_detrended=detrend(gasdata,1); % Remove a linear trend;
2.5.3 Prewhitening and Onedimensional Digital Filtering
See the function filter in MATLAB environment.
2.5.4 AutoCorrelation and Cross Correlation Analysis
The command xcorr(x,y) can be used for the autocorrelation and crosscorrelation analysis. Please notice that the command xcorr(x,y) only calculates the cross correlation covariance, instead of the cross correlation function defined by (2.3). In order to calculate the cross correlation function , some auxiliary operations should be used.
>> xm=xmean(x); y=ymean(y); % Remove means;
>> xvar=length(x)*var(x);
>> yvar=length(x)*var(y);
>> ccf_xy=xcorr(xm,ym)/sqrt(xvar*yvar); % Coss correlation
% function;
The local function xcorr_acse(x,y) can be used to directly calculate the autocorrelation function and cross correlation function . The following command using the local function is equivalent to the above commands:
>> ccf_xy=xcorr_acse(x,y); % Cross correlation function ;
2.5.5 Spectral Analysis
The command spectrum(x, y) can be used for cross spectral analysis between two signals x(t) and y(t). spectrum_acse(x, y)is a local function similar to spectrum(x, y).
>> y=gas_detrended(:,1); % y is the output;
>> u=gas_detrended(:,2); % u is the input;
>> spectrum_acse(u,y); % Calculate the cross spectrum between
% the input u(t) and the output y(t).
See Fig. 2.102.14.
2.5.6 Parameter Estimation
In this identification package, both the global functions, including AR (autoregressive), ARX (autoregressive with eXogenous input) andARMAX(autoregressive moving average with eXogenous input), and local functions, including AR_ACSE, ARX_ACSE andARMAX_ACSEwere used for parameter estimation. The general form of the ARMAX model is described in (2.22).
>> y=gas_detrended(:,1); % y is the output;
>> u=gas_detrended(:,2); % u is the input;
>> na=2; % Set the model order for output;
>> nb=3; % Set the model order for input;
>> nc=10; % Set the model order for noise;
>> nk=2; % Set the model order for time delay;
>> order=[na,nb,nc,nk];
>> model=armax([y,u],order); or %Estimate the parameters of ARMAX
>> th=armax_acse([y,u],order); % model with the given orders.
>> [A,B,C]=th2poly_acse(th);
% Get the parameters;
>> [e,r]=resid(model,[y,u]); or
>> [e,r]=resid_acse([y,u],th); % Model validaty test;
2.5.7 Programmes for the Gas Furnace System Identification
The MATLAB programme gas_idnet_acse.m was designed for the gas furnace identification.
>> help gas_ident % Get overview of the programme;
>> load gasdata.dat; % Load the inputoutput data of a gas
% furnace system;
>> gas_ident % Execute the programme to analyse and
% identify the gas furnace system;
3. THE EFFECTS OF BIAS ON PARAMETER ESTIMATION
3.1 Introduction
The main task of system identification is to identify a suitable model based on the system inputoutput data. For a given system, the identified model structure should theoretically be identical to the true system structure. The system structure, however, is generally not available because of the lack of a priori knowledge of the system. In practice therefore the identified model structure will not necessarily match identically with the true system structure. In order to avoid bias in parameter estimation resulting from either overfitting or underfitting, the effectiveness of an identified model should always be verified. Several methods have been proposed for verifying the effectiveness of a model, including cross correlation analysis based on the residuals, which was used in the gas furnace identification.
An important property of a model is the prediction ability. Onestepahead predictions are usually used for testing model effectiveness. The following subsection, however, will show that sometimes onestepahead predictions can provide misleading results, in other words, it is very easy to get a bad model which satisfies the onestepahead prediction test. Model predicted outputs, Eq. (2.26), are much better for testing the adequacy of a model.
3.2 An Example Showing the Effects of Bias in Parameter Estimation
Using a band limited white noise (BLWN) sequence with zero mean, covariance of 10 and initial seed of 33333 as the input, the system was simulated with a sampling period of 1 second in SIMULINK and 1000 input and output data were obtained and shown in Fig. 3.1. The first half of the data was used for model estimation and the second half was used for model testing. The detrended input and output data will be used for system identification and parameter estimation throughout.
Both the two models were identified from the bandlimited white noise driven inputoutput data produced in model (3.1) using different model order settings in the parameter identification. Notice that the model orders have been deliberately set to incorrect values to illustrate incorrect model fitting. Simulation results show that the onestepahead predicted outputs based on each of the above three models are almost perfect. Even the firstorder model (3.2a) predicts very well as shown in Fig. 3.2.
Clearly the good onestepahead prediction results do not necessarily mean that the two models are good. This can be shown by verifying unit step response and the frequency responses properties of the models. Take the model (3.2b) as an example. The unit step response of model (3.2b) and model (3.1) are shown in Fig. 3.3. The frequency responses of these two models are illustrated in Fig. 3.4. In Fig. 3.5 and Fig. 3.6 the Bode diagrams of model (3.1) and (3.2b) are shown. These figures reveal that model (3.2b) is incorrect, although the onestepahead predictions are very good.
In practice the true model of a given system is usually unknown, therefore it is impossible to use unit step and frequency response comparisons. An effective approach for verifying a model is to use the model predicted output and compute the correlation functions and . The residual correlation test for model (3.2b) is displayed in Fig. 3.7, which clearly shows that model (3.2b) is invalid.
A valid model for the above bandlimited white noise driven system is ARMAX(3,4,21,1), which is given by
The residual correlation test for model (3.3) is displayed in Fig. 3.8. The unit step response of model (3.3) and model (3.1) are shown in Fig. 3.9. The frequency responses of these two models are illustrated in Fig. 3.10. The 100stepaheadprediction and the corresponding prediction errors are shown in Fig. 3.11. These figures testify the validity of the model.
(solid line model (3.1); dashed line—model (3.3))
3. 3 MATLAB Programmes Used to Show the Effect of Bias in Parameter Estimation
3.3.1 Simulate the Heat Exchanger System Model Eq. (3.1)
In the MATLAB environment type blwn_driven to activate the SIMULINK block diagram model blwn_driven, see Fig.3.12(a) or Fig. 3.12(b).
Set the parameter values:
Start time=0.0
Stop time=1000
Type=Fixed step ODE4(RungKutta)
Fixed step size=1
Then Start to simulate. The input and output data are saved in the files blwn_driven_in.mat and blwn_driven_out.mat, respectively. Notice that for time discrete models, the parameter Type should be set as Fixed Step Discrete.
3.3.2 Fit ARX/ARMAX Models for the Heat Exchanger System
In the MATLAB environment type the following commands:
>> load blwn_driven_in
>> load blwn_driven_out
>> heat_exchander % Fit an ARX or ARMAX model for the heat exchanger
% based on the loaded band limited white noise driven
% inputoutput data;
The program fit_ARX and fit_ARMAX will ask the user to enter some parameters interactively.
3.3.3 Step Response Analysis
In the MATLAB environment type:
>> step_response_31
>> step_response_32b
>> step_response_33
>> step_response_34
The SIMULINK block diagram models for step response analysis are then activated. The block diagram models for model Eq. (3.1) and (3.2b) are shown in Fig. 3.12 and Fig.3.13. The block diagram models for other systems are similar to Fig. 3.12 or Fig.3.13.
3.3.4 Frequency Property Analysis
In the MATLAB environment type the function name:
>> freq_analysis
The frequency response of the model (3.1), (3.2b) and (3.3) will be calculated and compared with that of model (3.1). The Bode diagrams for model (3.1) and (3.2b) are also compared.
4. TIME SERIES PREDICTION
4.1 The ARMA Model
The autoregressive moving average model, or ARMA model, provides a useful description for a large number of processes in the real word. An autoregressive moving average process of order (n, m), denoted by ARMA(n, m), satisfies an equation of the form
where, is a white noise sequence and are the model parameters; n and m are the autoregressive and moving average order, respectively. If m=0 then (4.1) becomes AR (n).
4.2 The AR and ARMA Model s for Sunspot Series Data
The Wolf sunspot data series records the monthly and annual sunspot indices from 1700 onwards. Fig.4.1 shows 300 annually recorded data from 1700 to 1999. Fig 4.2 shows 3000 smoothed monthly sunspot indices from July1749 to June 1999. It will be seen that the main feature of this series is a cycle of activity varying in duration between 9 and 14 years (110 and 170 months), with an average period of approximately 11.4 years (136 months). Another feature of the series is that there are different gradients of ascensions and descensions, i.e., in each cycle the rise to the maximum has a steeper gradient than that of the fall to the next minimum. This suggests that a nonlinear model might be more appropriate than a linear one. However, the main purpose here is to show how an ARMA model can be used to describe the sunspot series to demonstrate the use of ARMA models in real time series prediction problems.
4.2.1 One and Twostep Ahead Predictions for the Annually Recorded Sunspot Series
(A) AR models
The onestep and twostepahead predictions and the residual correlation tests are shown in Fig. 4.3, Fig. 4.4 and Fig. 4.5, respectively. The standard derivation for the onestepahead prediction errors is 14.90, and the standard derivation for twostepaheadprediction errors is 23.11.
(B) ARMA models
One of the "best" models identified is ARMA(6,5), with estimated coefficients:
AR Coefficients 

2.758 
3.213 
1.357 
0.9284 
1.462 
0.6279 

MA Coefficients 

1.688 
1.165 
0.1599 
0.7606 
0.3284 
The onestep and twostep ahead prediction and the residual correlation test are shown in Fig. 4.6, Fig. 4.7 and Fig. 4.8, respectively. The standard derivation for the onestepahead prediction errors is 14.92, and the standard derivation for twostepaheadprediction errors is 22.68
4.2.2 Multistep Ahead Prediction for the Monthly Recorded Sunspot Series
One of the "best" models identified is ARMA(5,17), with estimated coefficients:
AR Coefficients 

3.2525 
4.8382 
4.4050 
2.3479 
0.5277 

MA Coefficients 

0.9099 
0.3854 
0.4874 
0.6546 
0.1682 
0.0789 

0.0326 
0.0952 
0.0145 
0.1091 
0.0474 
0.7722 

0.7854 
0.2157 
0.4444 
0.6962 
0.1007 
The residual correlation test, the onestep, 6step, 12step and 24stepahead predictions are shown in Fig. 4.94.13. The standard derivations for these prediction errors are 0.72, 7.21, 16.01, and 30.33.
4.2.3 Remarks
As pointed out earlier, the sunspot time series is a nonlinear process. Therefore, better prediction results could be obtained using a nonlinear modelling methodology, such as the NARMAX (Nonlinear AutoRegressive Moving Average with eXogenous input) method proposed by Billings. The NARMAX model takes the form of following nonlinear difference equation:
where is an unknown nonlinear mapping, and are input and output sequences, and are the maximum input and output lags, respectively. The noise variable with maximum lag , is immeasurable but is assumed to be bounded and uncorrelated with the inputs. The model (4.2) relates the inputs and outputs and takes into account the combination effects of measurement noise, modelling errors and unmeasured disturbances represented by the variable .
4. 3 MATLAB Commands and Programmes for Time Series Analysis
The functions AR, ARX, ARMAXandPREDICT can be used for linear system modelling and prediction. The general form of the ARMA models is where A, and C are polynomials in the delay operator with the order of and .
4.3.1 Commands for fitting AR/ARMA Models
>> load sunspot_yearly.dat; % Load the annually recorded sunspot
% time series data;
>> y=sunspot_yearly(:,2); % Put the sunspot series in y;
>> na=9; % Set the model order;
>> th=ar_acse(y,na); or %
>> th=arx_acse(y,na); or % Fit an AR/ARX model with order na;
>> model=ar(y,na); or %
>> model=arx(y,na); %
>> na=6; % Set the model order for AR;
>> nc=5; % Set the model order for MA;
>> th=armax_acse(y,[na,nc]); or
>> model=armax(y,[na,nc]); % Fit an ARMAX model with the given
% orders.
>> [A,B,C]=th2poly_acse(th); or
>> [A,B,C]=polydata(model); % Get the parameters
4.3.2 Commands for Model Validity Test and Prediction
>> [e,r]=resid_acse(y,th); or
>> [e,r]=resid(model,y); % Model validity test;
>> yp=predict(model,y,np); % npstepahead predition;
4.3.3 MATLAB Programmes for Sunspot Time Series Modelling
Two programmes were designed for sunspot time series modelling and prediction:
sunspot_AR.m—AR models for sunspot series.
sunspot_ARMA.m—ARMA models for sunspot series.
>> load sunspot_yearly.dat; % Load the annually recorded sunspot
>> load sunspot_monthly.dat;% Load the annually recorded sunspot
>> sunspot_AR % Execute the program to fit an AR
% model for the sunspot series.
>> sunspot_ARMA % Execute the program to fit an ARMA
% model for the sunspot series.