Identification of kinetic models of methanol oxidation on silver in the presence of uncertain catalyst behaviour

Funding information Engineering and Physical Sciences Research Council, Grant/Award Number: EP/J017833/1; Erasmus+ Traineeship Studentship 2015/2016; Hugh Walter Stern PhD Scholarship, Department of Chemical Engineering, University College London Abstract Catalytic oxidation of methanol to formaldehyde is an important industrial process due to the value of formaldehyde either as a final product or as a precursor of numerous chemicals. The study of kinetics in this system is hindered by sources of uncertainty that are inherently associated to the nature and state of the catalyst (e.g., uncertain reactivity level, deactivation phenomena), the measurement system and the structure of the kinetic model equations. In this work, a simplified kinetic model is identified from data collected from continuous flow microreactor systems where catalysts with assorted levels of reactivity are employed. Tailored model-based data mining methods are proposed and applied for the effective estimation of the kinetic parameters and for identifying robust experimental conditions to be exploited for the kinetic characterization of catalysts with different reactivity, whose kinetic behavior is yet to be investigated.


| INTRODUCTION
The oxidation of methanol to formaldehyde over silver catalyst is an industrial process of great significance due to the importance of formaldehyde as a precursor to the production of valuable chemicals. The versatility demonstrated by this organic compound led to its usage in numerous production chains. Agriculture, medicine, cosmetics industry, pesticides and fertilizers production, and dyes and explosive synthesis are only some of the areas in which formaldehyde has found successful application. 1 Under industrial conditions, the process is usually carried out at atmospheric pressure and relatively high temperature (T = 850-923 K) on silver catalyst. 1 The overall mechanism involves the conversion of methanol CH 3 OH to formaldehyde CH 2 O, through both oxidation and dehydrogenation reactions: The principal by-products are: hydrogen H 2 , water H 2 O, carbon dioxide CO 2 , formic acid HCOOH, and traces of carbon monoxide CO. The industrial process is carried out adopting rich inlet methanol/oxygen mixtures, introducing steam for achieving high selectivity. The main undesired reactions occurring in the system, affecting the selectivity of formaldehyde, involve the complete oxidation of methanol and the oxidation of formaldehyde into carbon dioxide: Despite the great industrial importance of this process, the key phenomena occurring on the catalyst surface are yet to be completely understood. Many researchers investigated the catalytic role of silver in the reaction and the possible mechanisms occurring on the surface of the catalyst film. [2][3][4][5][6][7] In 2003, a micro-kinetic model for methanol oxidation on silver was proposed by Andreasen et al 6 to explain the surface phenomena in experimental conditions of interest for industrial application. However, the complexity of the proposed mechanism made unrealistic its use for engineering purposes and, for this reason, a simplified model derived from the micro-kinetic one was proposed by the same authors. 7 One of the main issues making this system so complicated to be unraveled is related to the highly uncertain nature of the catalyst.
Indeed, the reactivity is strongly dependent on the density and distribution of the active sites on the catalyst surface. This crucial aspect has a strong impact on the catalyst behavior and can be influenced by a multitude of factors, from the production of the catalyst to the operating conditions the catalyst has to withstand. 8 The identification of a kinetic model in a reacting system affected by such uncertainty relies on the collection of valuable information from the experiments. Microreactor platforms are suitable for collecting information with the aim of identifying kinetic models for fast exothermic and endothermic reactions. 9 The small dimensions characterizing the channel of these devices permit the execution of the reaction in isothermal conditions and in the absence of mass transfer limitations. However, the information content of an experiment does not rely exclusively on the experimental setup, but also on the thoughtful choice of the experimental conditions to investigate.
Model-based design of experiment (MBDoE) techniques have been proposed in literature to identify the best experimental conditions depending on the purpose of the experimental investigation: selecting a model out of a set of proposed candidates (i.e., MBDoE for model discrimination) [10][11][12] ; improving the parameter estimates of an already selected model (i.e., MBDoE for parameter precision) [12][13][14][15] ; or both (i.e., joint-MBDoE). [16][17][18][19] However, only few works have been proposed in the literature to address the problem of model identification in the presence of high system uncertainty (i.e., uncertain behavior of the physical system and structural uncertainty in the model equations). [18][19][20][21] In this manuscript, data obtained from experiments performed on microreactors using silver catalysts for formaldehyde synthesis are Task 1 is fulfilled employing machine learning technologies. 22,23 Statistical learning methods are increasingly being applied in surface and material sciences for the discovery and characterization of heterogeneous catalysts. 24,25 Specifically, data mining techniques 26 have been employed for supporting the recognition of patterns in kinetic datasets and suggest improved catalyst designs. 27,28  reactivity. This allows for the identification of experimental conditions whose information is insensitive to the catalyst behavior. These conditions represent the best compromise for starting the experimental activity with the aim of identifying kinetic models of methanol oxidation whenever the behavior of the catalyst is still highly uncertain.

| KINETIC MODEL IDENTIFICATION PROCEDURE
The identification of a reliable set of equations to model the kinetics of a catalytic reaction is not a straightforward process. Frequently, the process of kinetic modeling requires dealing with the impossibility of measuring directly certain physical quantities, or the difficulty of separating the key mechanisms because of an overlapping of their effects in the reacting system (e.g., uncertainty on the structure of complex reaction networks). 33 Furthermore, there might be some phenomena occurring in the system (catalyst deactivation is a typical example) that are too complex to describe. For these reasons, proposed kinetic model structures frequently embody a certain degree of approximation and uncertainty.
The precise identification of the parameters in these models requires the identification of highly informative experimental conditions whose evaluation may not be a trivial task because of the following reasons: • The range of experimental conditions in which the simplifying (possibly inappropriate) modeling hypotheses are acceptable (i.e., the range of conditions in which the model gives accurate predictions), may not be known a priori; • The model may be capable of computing reliable predictions only for a subset of its predicted quantities; • Some measurements in the dataset available to fit the nonmeasurable model parameters may contain gross errors; • The location of experimental conditions carrying valuable Fisher information (i.e., information for estimating nonmeasurable model parameters through data fitting) 32 in the experimental design space may be very sensitive to the reactivity of the specific catalyst analyzed; reactivity is normally unknown before preliminary kinetic experiments are performed.
A framework is here proposed for the systematic identification of kinetic models in catalytic systems embracing the aforementioned sources of uncertainty. The scheme of the proposed procedure is given in Figure 1. The procedure starts from the availability of: (a) a highly reliable (benchmark) dataset, assumed as reference; (b) a num- where the model is expected to provide a good fitting.
The procedure takes full advantage of previously observed catalyst behaviors in order to minimize the experimental effort required for investigating the kinetics in new catalysts that are yet to be analyzed. In the following sections, the properties of the MBDM method will be discussed.
Two tailored MBDM filters are then defined for the specific application presented in this paper. Robust MBDoE methods are then detailed.

| MBDM for parameter estimation
A model structure is proposed to describe a certain physical system. A model in its standard reduced form can be expressed in terms of the following system of equations: where quantities x and u 2 U represent vectors of state variables and control variables, respectively, t is time, θ 2 Θ is a set of N θ non measurable model parameters andŷ is a vector of N m predictions associated to N m measurable quantities of the physical system. The estimation of θ is achieved fitting experimental data through the optimization of an opportune scalar function. A popular technique for parameter estimation is the maximum likelihood (ML) approach, which demonstrated to produce good estimates in a wide range of situations. 36 Assume that N exp experiments were performed and that in each trial the values of the N m measurable variables were collected.
The set of data Ψ, defined as in Equation (6), is then available to define a parameter estimation problem.
In Equation (6) y ij is the i-th measured response in the j-th experiment. Measurements are assumed to be affected by uncorrelated Gaussian noise with zero mean and known SDs σ ij . Ifŷ ij θ ð Þ is the model prediction of y ij , the computed value of the parametersθ is called maximum likelihood estimate if it maximizes the likelihood function L (θ|Ψ). In case of known Gaussian measurement noise, however, maximizing L(θ|Ψ) is equivalent to minimizing the sum of normalized squared residuals.θ = arg min θ2Θ X Nexp The least squares estimator represents the cornerstone of classical statistics. 30 However, statisticians soon realized that traditional least squares methods were inappropriate in any situation where some measurements follow distributions that are significantly different from the postulated ones. 29 The concept of breakdown point was developed to assess the sensitivity of a parameter estimator towards the presence of outliers in the dataset. 29 Hampel defines the breakdown point as "the

| Data mining formulation
The weighted least squares method plays an important role in robust statistics. 30,36 By weighting the contribution of each residual in the objective function, it is possible to give fitting priority to relevant data and to reduce the effect of outliers on the parameter estimation. Rousseeuw and Leroy proposed the use of weights as switchers to activate or deactivate the presence of single residuals in the objective function to be minimised. 30 In Equation (9) the quantity λ ij represents the weight associated to the residual r ij . In Equation (10) the switcher behavior of the weights is defined where the hyperparameter c defines the maximum threshold of acceptability for an absolute normalized residual to be included in the fitting. Given a robust estimation of the SDs σ ij and a suitable choice of the hyperparameter c the estimator (9) subject to conditions (10) can attain a breakdown point up to 50% (i.e., the maximum breakdown point theoretically achievable), thus making it an extremely robust estimator that can deal with a high level of outlier contamination in the dataset. 30 In the field of gross error detection, the recommended value for the hyperparameter c is in the range c = 2.5-3.0, which is equivalent to treating as an outlier any normalized residual exceeding the ±2.5σ ij or ± 3.0σ ij confidence range. 30 (11), including switcher conditions on the residuals (12).θ DM, exp = arg min θ2Θ Similarly, an MBDM filter acting on model output variables whose associated residuals are incompatible with measurement noise is defined as in Equation (13) with the respective switcher conditions on the residuals (14).θ DM,m = arg min θ2Θ If the fitting of the whole dataset Ψ results in unacceptably high residuals, the MBDM estimators defined above shall be employed to detect if the bad fitting is caused by specific experiments or by specific model output variables. The quantitiesθ DM, exp andθ DM,m represent robust maximum likelihood estimates obtained from the fitting of a possibly reduced dataset.
Notice that whenever an experiment or an output variable is labeled as an outlier by an MBDM estimator, no direct information is returned on whether the outlier is caused by gross measurement errors or whether the outlier is caused by an inappropriate mathematical structure of the model (5). :

| Robust model-based Design of Experiments
From the computed covariance matrix V θ it is possible to assess the statistical significance of each parameter through a t-test. Whenever some parameter estimate is found to be statistically unsatisfactory, its precision can be improved collecting new experimental data.
Equation (16) can be evaluated in the entire experimental design space adopting the best parameter estimate available. In order to summarize the multidimensional nature ofĤ θ , the information is frequently evaluated with a convenient scalar measure ψ of the matrix. Popular choices for ψ are: the trace Tr(Ĥ θ ), the determinant Det(Ĥ θ ) and the largest eigenvalue. 13 Whenever the model (5) is nonlinear in the parameters, the distribution of ψ in the experimental design space is highly sensitive to the values of θ adopted for the evaluation of Equation (16). At the beginning of the experimental activity, some model parameters may be affected by a wide range of uncertainty (e.g., the values of parameters correlated to catalyst reactivity are typically affected by high uncertainty before any experiment is performed). In this situation, the aim of the scientist is planning preliminary experiments at robust conditions that can provide high Fisher information for all the possible values of the uncertain parameters. 19,39 Optimal robust conditions u R are identified by solving the following min-max optimization problem 34 : In Equation (17), ψ MIN is maximized acting on the experimental conditions u for the worst case scenario, that is, considering the value for θ 2 Θ that minimizes ψ across the experimental design space. In the proposed framework in Figure 1, the execution of kinetic experiments on catalysts with assorted levels of reactivity results in the estimation of different sets of parameters. The sets of kinetic parameterŝ θ CLR andθ CHR identified for the catalyst with low reactivity (CLR) and for the catalyst with high reactivity (CHR) can then be used to define the expected hyperbox of variability for the kinetic parameters, that An approximated definition of ψ MIN is used in this work to represent graphically the distribution of the information for the worst case scenario. For this purpose, the following assumptions are made: the input space U & R 2 is assumed to be bi-dimensional for visualization purposes; it is assumed that the parameter values θ that are representative of the worst case scenarios (i.e., the parameters θ 2 Θ minimizing ψ according to Equation (18)) are the kinetic parameters associated to the least reactive and to the most reactive catalyst, that is, the distribution of information ψ MIN representative of the worst case scenario is evaluated according to Equation (19).
An example, derived from the case study presented in this manuscript, is reported in Figure 2 to illustrate the adopted procedure. In the top-left corner of Figure 2, a plot shows the information surface ψ computed adopting the kinetic parameters resulting in the most divergent catalyst behavior: ψ u,θ CLR computed adopting the kinetic parametersθ CLR associated with the catalyst with lowest reactivity and ψ u,θ CHR , computed adopting the parametersθ CHR related to the catalyst with highest reactivity observed in the experiments. The distribution of ψ MIN is then obtained according to Equation (19) and it is represented in the plot to the right of Figure 2. Experimental conditions along the line of intersection between the CLR and CHR information surfaces (i.e., red region in the plot on the right of Figure 2) represent a good compromise for beginning the experimental activity when the level of reactivity is still unknown, that is, the conditions with high information content for a wide range of catalyst behaviors.

| DATASETS-MATERIALS AND METHODS
Data collected with microreactors are analyzed for the purpose of identifying simplified kinetic models of methanol oxidation over silver.
A description of the experimental setups is given in this section. The available datasets and the investigated experimental conditions are presented and a graphical assessment of the catalyst reactivity is proposed. The modeling assumptions are then illustrated and discussed.
Eventually, the assumptions for the application of the methodology ( Figure 1) to this specific case study are detailed. water, formaldehyde, hydrogen, carbon dioxide, and carbon monoxide were the measured species (i.e., y OUT CH3OH , y OUT O2 , y OUT H2O , y OUT CH2O , y OUT H2 , y OUT CO2 , and y OUT CO , respectively). Different ranges of experimental conditions were investigated in the setups. A summary of the values for the considered independent variables of the system is given in Table 1 (the detailed datasets including also the values measured for the output variables can be found in Supporting Information Appendix B).

Methanol conversion at comparable conditions of temperature and inlet
composition is plotted in Figure 3 for the various datasets as a function of the residence time. The plot in Figure 3 allows for a preliminary qualitative comparison of the datasets, demonstrating the presence of different levels of reactivity in the reactors. In this reacting system, catalyst reactivity is influenced by numerous independent factors that are complex to control and to model. Difficulties in controlling the deposition process of the silver film contribute to the final uncertainty on the catalyst behavior. Furthermore, reactivity is also sensitive to the experimental history, that is, during its lifetime, the catalyst undergoes an ageing process (typically a decay in the reactivity) with dynamics that depend on the experimental campaign conducted on the catalyst. 8 Another source of uncertainty is due to the nonuniform deactivation along the catalyst film. The first part of the catalyst film is typically more stressed and undergoes a more rapid deactivation process than the rest of the film. 41 Since different quantities of catalyst (i.e., different catalyst film lengths) were present in the five microreactor chips (see Supporting F I G U R E 2 Graphical approach for the identification of robust experimental conditions in catalytic systems with uncertain behavior in the special case of bi-dimensional design space defined by temperature and residence time. Fisher information surfaces evaluated for a simplified kinetic model adopting the kinetic parameters associated with the lowest (CLR) and with the highest (CHR) reactivity observed in the experiments (top-left). The distribution of the information metric ψ MIN is evaluated as the smallest between the CLR and the CHR information distributions (bottom-left). This represents the distribution of the information across the design space for the worst case scenario (right). The region in proximity of the intersection of the CHR and CLR information surfaces (i.e., the red colored region in the colormap on the right) identifies conditions that are insensitive to the catalyst behavior where high information is expected for a wide range of catalyst behaviors [Color figure can be viewed at wileyonlinelibrary.com] Information Appendix A), the nonuniform deactivation may result in different observed dynamics of deactivation.

| Reactor model
where  volume. 1 The axial coordinate of the channel is expressed by z.
P represents the pressure, T is the temperature (since the reaction is assumed to occur isothermally, temperature is space independent). ν ij and O ij are stoichiometric coefficient and the order associated to the i-th component in the j-th reaction. A linear pressure profile is assumed between the two extrema of the catalyst film region. The values of pressure at the extrema of the reactor part occupied by the catalyst film were evaluated from the values measured at the inlet and outlet of the reactors adopting Poiseuille's law for laminar flow, corrected for microchannels. 42 More details on the evaluation of the pressure at the extrema of the catalyst film stage are given in Supporting Information Appendix C.

| Assumptions for kinetic modeling
Different behaviors among catalysts are assumed to be a consequence of different active sites densities on the catalyst surface. In terms of model parameters, the active sites density is proportional to the pre-exponential factors of the catalytic reactions. The catalyst is primarily involved in promoting the partial methanol oxidation and formaldehyde oxidation, that is, reactions (1) and (2). 5,6 Evidence reported in the literature suggests that the hydrogen oxidation reaction occurs slowly on the catalyst surface. 5,44 Therefore, the catalytic effect of silver on reaction (3) is neglected and it is assumed that only parameters A 1 and A 2 are metrics of catalyst reactivity.

| Methods
Since different levels of reactivity were detected in the different microreactor chips, the five available datasets are treated independently for identifying five different sets of kinetic parameters. The steps detailed in this section follow directly from the general procedure presented in Figure 1.

| Quantification of catalyst reactivity
It is assumed that variability in the catalytic behavior among the datasets can be explained by different active site densities. As a direct consequence of this assumption, only parameters A 1 and A 2 have to be re-estimated for adapting the reference set of parameters identified for C1 to the other datasets. Instead of estimating directly A 1 and A 2 , two scaling parameters κ 1 and κ 2 are added to the reaction rates: 1 In the present case study, since all microchannels are characterized by the same depth (see Supporting Information Appendix A for more details on the geometry of the microchannels), parameter a is the same for all setups and will be therefore not considered explicitly in the following kinetic analysis.
At this stage, kinetic parameters A 1 , A 2 , A 3 , E a1 , E a2 , and E a3 are

| Analysis of the distribution of information
The following analysis focuses only on the kinetic models identified for the most reactive and the least reactive catalysts towards reaction (1) (i.e., models with highest and lowest κ 1 value in Equation (24)).
These are selected as representatives of a CLR with kinetic parametersθ CLR and a CHR with parametersθ CHR respectively. The trace of the information matrix (16)  computing approximately the distribution of the information for the worst case scenario, that is, the distribution of ψ MIN , in the special case of a bi-dimensional design space. With reference to aim 1, it is chosen to limit the investigation to three cases assuming different bidimensional design spaces and a fixed reactor geometry: • Case 1: the information surface for CLR and CHR is evaluated in the design subspace identified by reactor temperature and residence time at fixed inlet composition; • Case 2: the information surface for CLR and CHR is evaluated in the design subspace identified by reactor temperature and inlet ratio methanol/oxygen; • Case 3: the impact of the inlet molar fraction of water on the information surface for CHR is assessed in the experimental design subspace identified by temperature and residence time.
In Table 2 values and ranges of variation for the input variables are given. Residence time τ is defined as the ratio between the catalystcontaining reactor volume and the inlet flowrate at standard conditions (i.e., at temperature T * = 273 K and pressure P * = 101,325 Pa). In all cases, a uniform pressure profile was assumed along the reactor with P = 160,000 Pa and the inlet molar fraction of oxygen was set at y IN O2 = 0:044.

| Reference model
The parameter estimates obtained fitting dataset C1 are given in Table 3 with related statistics, expressed in terms of t-values. A t-test performed on the estimates showed that the only parameter that was not estimated with satisfactory precision is the activation energy of the third reaction E a3 . Other parameters passed the t-test. The parity plot in Figure 4 compares measurements against model predictions.
The distribution of the points in proximity of the diagonal and within the error range demonstrates the good fitting achieved by the proposed kinetic model for the considered dataset.

| MBDM for catalyst reactivity quantification
In this section, the kinetic model identified for C1 is employed as reference, together with MBDM estimators (11) and (13)  quantification of catalyst reactivity in datasets C2, R1, R2, and R3. The kinetic model (24) was initially fitted to the datasets adopting a conventional maximum likelihood method, thus fitting all the experimental data available. Parameter estimates computed for each dataset are given in Table 4. A comparison between measurements and model predictions is given in Figure 5, where residuals associated to all the datasets are shown. As one can see from the parity plot, a number of data points lay outside the error range. The plot highlights the impossibility of representing all the data points with the given model. (11) and (13) were employed in series for obtaining a more significant estimation of parameters κ 1 and κ 2 . The parameter estimates obtained from the application of the two-stage MBDM method to datasets C2 to R3 are given in Table 5. No critical data were identified for the sets collected on C2 and R2 therefore leading to the same parameter estimates obtained with the conventional likelihood method. Regarding dataset R1, no weaknesses in the fitting of specific measured species were highlighted. However, as one can see from  Table 1).

MBDM estimators
Since the model represents an approximation of the comprehensive mechanism, it may not provide accurate predictions outside the range of data fitted for its identification. Concerning R3, only experiments 1-3 were retained, while trials 4-8 were excluded from the parameter estimation by the solver. The incompatibility between experiments 1-3 and experiments 4-8 (for which a graphical assessment is given in Supporting Information Appendix D), is interpreted as a consequence of a reduction in the catalyst reactivity that occurred during the campaign of experiments. Also notice that the reference model was found unable to describe the molar fraction of water at the outlet, which was therefore neglected for the purpose of parameter estimation. Estimates for κ 1 and κ 2 in R3 were obtained fitting only three experiments, that is, only three distinct experimental conditions. The fitting of the reduced dataset results in a negative t-test for parameter κ 2 in R3. Since it is known that the model is identifiable, the statistical quality of κ 2 in R3 can be effectively enhanced by the execution of additional experiments designed employing MBDoE methodologies for parameter precision.
In Figure 6, a parity plot shows the distribution of the residuals for all the fitted data. that is, significantly different estimates for κ 1 and κ 2 in R1, R2, and R3 (see Table 4). However, the silver catalysts sputtered on the microchannels of the serpentine reactors were synthesized in the same batch in 2015. Furthermore, the serpentine reactor chips were characterized by the same microchannel geometry (see Supporting Information Appendix A) and, although the catalyst film lengths were different, the same sputtering protocol was adopted. Comparable levels of reactivity were therefore expected in these setups. The application of MBDM allows for highlighting the fact that the catalyst sputtered in R1, R2, and R3 is actually the same, that is, similar values for κ 1 and κ 2 in R1, R2, and R3 (see Table 5), and that the different parameter estimates obtained with the maximum likelihood estimator are consequence of fitting data lying outside the domain of model reliability.

| Reactivity impact on the information content of the experiments
In this section, the impact of the catalyst reactivity on the information content of a kinetic experiment is assessed across the experimental design space. This is done evaluating the trace of the expected Fisher information matrix (16) with the kinetic parameters estimated for C1 The information surfaces obtained in Case 1 are given in Figure 7. The optimal experimental conditions in the presence of a CLR and a CHR, identified by the crests of information in the plot, are located in different positions of the design space. The plot highlights the fact that information is particularly sensitive to catalyst reactivity and that robust MBDoE methodologies shall be employed to take this aspect into account for the design of informative kinetic experiments. The kinetic mechanism associated to a CLR shall be investigated adopting longer residence time and higher temperature with respect to a CHR.
Being the peaks of information clearly distinct, the study of a CHR through the investigation of CLR optimal conditions (and vice versa),

| Case 3: Effect of inlet water concentration on information
In this case, the impact of different inlet fractions of water through the analysis of the effect produced on the information surface in the design space identified by temperature and residence time is assessed. Only the CHR case was considered. As one can see from  (23)). Since reaction (1) generates the reactants for the other reactions included in the mechanism, adding water at the inlet slows down the overall process. If all the other experimental conditions are fixed and the amount of water in the system is increased, higher temperature is required to achieve the optimal level of reactant conversion and product generation required to maximize the expected information from the experiment.

| Robust design of kinetic experiments
In the previous section, it was highlighted that the distribution of information across the design space is highly sensitive to catalyst reactivity. This high sensitivity justifies the employment of robust experimental design methodologies. The existence of robust experimental conditions is demonstrated by the fact that information surfaces associated with CLR and CHR intersect in the design space (see