Introduction

High-entropy alloys (HEAs) constitute a new family of high-interest materials because some have shown exceptional properties with high potential for applications. Their near-equiatomic composition distinguishes them from conventional alloys, which are primarily based on a single major element. In HEAs, all components contribute significantly to determining the equilibrium phase and microstructure; thus, even small changes in the alloy composition can result in a distinguishable set of properties. Consequently, a fundamental challenge in the development of HEAs is the exploration of vast high-dimensional composition spaces [1]. While experimental approaches have been successful in studying HEAs, they tend to be costly and time-consuming; therefore, the development of computational tools to aid and accelerate the discovery of new alloys is essential.

Recent publications on \(\text {Al}_{n}\text {CrFe}_{2}\text {Ni}_{2}\)-based medium-entropy alloys (MEAs) have demonstrated the excellent mechanical properties of this system [2]. Experimental studies have shown the microstructure to consist mainly of a combination of the face-centered cubic (FCC-A1) phase and body-centered cubic phases in a disordered (BCC-A2) and an ordered (BCC-B2) state [3]. In the pseudo-binary phase diagram region of interest (Fig. 1), a miscibility gap exists where the A2 phase separates from the parent B2, either via nucleation and growth or via spinodal decomposition. Moreover, with increasing aluminum content, the A1 field overlaps part of the miscibility gap, creating a region where the three phases can coexist at equilibrium. This combination of phases makes it possible to obtain a large variety of microstructure morphologies, finely controlled by the alloy composition and the processing route [4].

Figure 1

\(\text {Al}_{n}\text {CrFe}_{2}\text {Ni}_{2}\) medium-entropy alloy phase diagram, plotted as a function of the mole fraction of aluminum x and temperature. The labels for the FCC (A1), disordered (A2) and ordered (B2) BCC, and liquid phases are shown

Figure 2

Analysis of neural network architecture. a The absolute error distribution is plotted for a series of neural network architectures using box plots. The x-axis indicates the number of nodes on each layer and the different colors are used to identify the number of layers, which are two, three and four according to the legend. b The plot contains the wall time needed to compute one prediction with each architecture in seconds. Both plots are used to judge the optimal neural network architecture

Figure 3

Analysis of absolute error on prediction of site fractions and the nondimensional thermodynamic quantities using a 3-layer 32-nodes neural network. a The box plot shows the absolute error distribution for prediction of each site fraction variable. The site fractions are indicated on the x-axis by their respective component name and sublattice number. b The absolute error distribution on the thermodynamic quantities calculated using only the site fractions of the first sublattice

Figure 4

Simulated microstructures and analysis of composition profile. The microstructures for alloys one to four are displayed in the first row of the figure as degree of ordering (DOO) maps, in which yellow represents the B2 and blue the A2 phase. In the second, third and fourth rows, the distributions of the composition of aluminum, chromium and iron are plotted, respectively. In addition, the peaks that identify the equilibrium composition of each phase are marked on the x-axis

As-cast \(\text {Al}_{n}\text {CrFe}_{2}\text {Ni}_{2}\) alloys with composition \(n\approx 1\) have shown exceptional tensile strength and ductility [2, 5]. These properties have been attributed to the peculiar dual-phase microstructure consisting of A1 and B2 regions, formed simultaneously with the decomposition in the B2 phase [4]. Furthermore, the use of additive manufacturing techniques, such as laser powder bed fusion [6] and laser metal deposition [7], has revealed a strong dependency of the phase fractions and morphology on the cooling conditions. To understand the process of microstructure evolution in this material and the correlation between phase transformation and phase separation, a computational model that takes diffusion of the alloy components into account in a thermodynamically consistent manner must be employed.

Phase-field models (PFMs) are a common choice for simulating microstructure evolution at the mesoscopic scale. If diffusion of chemical species is considered, the amount of each element is often represented by a mole fraction field. When combined with CALPHAD Gibbs free energy models, PFM simulations can be used to predict quantitatively the system compositions, the fractions of existing phases, and their morphology as a function of time. The parameters of a CALPHAD model are assessed based on experiments and first-principles data and are stored in thermodynamic databases (TDB). Many approaches have been proposed to combine CALPHAD free energy models with PFM simulations [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23]. However, their applicability and efficiency depend on the CALPHAD Gibbs energy model.

Sublattice models [24,25,26] are commonly used to describe the Gibbs free energy of solid phases, with multiple sublattices being employed to model the preferential occupation of specific sites in the crystal structure by certain atoms. The amount of each element is quantified by one or multiple internal variables called constituent or site fractions, depending on the number of sublattices in the model. For solid solution phases, only one sublattice is used as the atoms can occupy any position in the crystal structure, i.e., the model is equivalent to a substitutional model. Consequently, there is only one site fraction variable for each element, which can be directly related to the mole fractions in conventional PFMs. Intermetallic phases, however, are usually described with multiple sublattices, resulting in Gibbs free energy models expressed as a function of site fraction variables for which there is no straightforward analytical relation with the mole fractions used as variables in a PFM. More complex models often consider additional physics, as is the case for the B2 phase, which is modeled with a two-sublattice order-disorder model that introduces vacancies as a constituent and includes a magnetic energy contribution [3, 27]. Therefore, to combine these models with a PFM, the values of the site fraction variables must be obtained using a numerical procedure in which the Gibbs free energy is minimized for a given composition, temperature, and pressure.

A solution, proposed in the literature, is the formulation of PFM equations as a function of site fraction variables [12, 16]. This approach, however, increases the computational cost of simulations as a continuity equation is introduced for each site fraction variable. For the B2 phase of the quaternary \(\text {Al}_{n}\text {CrFe}_{2}\text {Ni}_{2}\) alloy, eight continuity equations would be necessary to describe the evolution of the site fractions, instead of three when using mole fractions. Moreover, it is still unclear how to handle the interaction between phases with a different number of sublattices and site fractions, limiting the application of this approach for multiphase models.

Here, we propose a new approach that uses neural networks to employ sublattice free energy models in PFMs formulated as a function of mole fraction variables. The relation between site and mole fractions is learned by a neural network from data obtained from equilibrium calculations with thermodynamic software. This procedure is described in the methods section, together with the PFM and the CALPHAD sublattice model. In the results and discussion section, we show the efficacy of the proposed approach when applied to the B2 phase free energy model. Finally, we conduct phase separation simulations for the \(\text {Al}_{n}\text {CrFe}_{2}\text {Ni}_{2}\) MEA to test the applicability of this new approach. The method can be generalized to any system containing phases for which a CALPHAD Gibbs free energy expression based on a sublattice model is available, which is the model of choice for metallic and ceramic systems.

Methods

Phase-field model

The temporal and spatial evolution of the microstructure of the quaternary \((C=4)\) alloy is described with three \((N=3)\) mole fraction fields \(X_i(\mathbf {r},t)\), one for each independent component: \(X_{\mathrm {Al}}\) represents the mole fraction of aluminum, \(X_{\mathrm {Cr}}\) of chromium and \(X_{\mathrm {Fe}}\) of iron. Nickel is assumed to be the dependent component and is indirectly tracked with \(X_{\mathrm {Ni}} = 1.0 - X_{\mathrm {Al}} - X_{\mathrm {Cr}} - X_{\mathrm {Fe}}\). Taking a volume-fixed frame of reference [28], the continuity equation for each independent component is given by

$$\begin{aligned} \left( \frac{1}{V_{\mathrm {m}}}\right) \frac{\partial {X_{i}}}{\partial {t}} = -\nabla \cdot \mathbf {J}_{i}, \end{aligned}$$
(1)

in which \(V_{\mathrm {m}}\) is the constant molar volume and \(\mathbf {J}_{i}\) is the diffusion flux, defined as

$$\begin{aligned} \mathbf {J}_{i} = \sum _{j}^{N} - L_{ij} \nabla \frac{\delta F}{\delta X_{j}}. \end{aligned}$$
(2)

The kinetic parameters \(L_{ij}\) relate the fluxes of components i with all driving forces, which are the derivatives of a free energy functional, of the form [29],

$$\begin{aligned} F = \int _{V} \left[ f_{0} + \sum _{j}^{N} \frac{\kappa _{j}}{2}(\nabla X_{j})^{2}\right] dV, \end{aligned}$$

with respect to the mole fractions \(X_{j}\), i.e.,

$$\begin{aligned} \frac{\delta F}{\delta X_{j}} = \frac{\partial f_{0}}{\partial X_{j}} - \nabla \cdot \left( \kappa _{j}\nabla X_{j}\right) . \end{aligned}$$
(3)

The gradient energy coefficient \(\kappa _{j}\) controls the interface energy and width in the PFM, and \(f_{0}\) is the free energy density that guides the system towards equilibrium. The complete model is obtained by replacing Eq. (3) in Eq. (2), and Eq. (2) in Eq. (1), and results in a Cahn–Hilliard type equation [29],

$$\begin{aligned} \left( \frac{1}{V_{\mathrm {m}}}\right) \frac{\partial {X_{i}}}{\partial {t}} = \nabla \cdot \sum _{j}^{N} L_{ij} \nabla \left[ \frac{\partial f_{0}}{\partial X_{j}} - \kappa _{j}\nabla ^{2} X_{j}\right] . \end{aligned}$$
(4)

The derivatives of the free energy density of the B2 phase w.r.t. the mole fractions are obtained from the relation

$$\begin{aligned} \frac{\partial f_{0}}{\partial X_{j}} = \frac{1}{V_{\mathrm {m}}}\left( \mu _{j} - \mu _{\text {Ni}} \right) = \frac{1}{V_{\mathrm {m}}} \tilde{\mu }_{j}. \end{aligned}$$

Here, \(\mu _{j}\) is the chemical potential of component j and \(\mu _{\text {Ni}}\) the chemical potential of nickel. The thermodynamic quantity \(\tilde{\mu }_{j}\) is referred to as the diffusion potential and is presented in the "Thermodynamic model" section. The models for the material properties \(L_{ij}\) and \(\kappa _{j}\) are explained in the "Mobility" and "Gradient energy coefficient" sections, respectively, while the numerical implementation of Eq. (4) and the simulation parameters are discussed in the "Implementation and numerical details" section.
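The structure of Eqs. (1) to (4) can be illustrated with a deliberately reduced sketch: a single mole fraction field in one dimension with periodic boundaries, a constant mobility, and the diffusion potential taken as in Eq. (3) with a constant gradient energy coefficient. The double-well free energy, the explicit Euler time stepping, and all numerical values are assumptions made for illustration only; they are not the CALPHAD energy or the MOOSE solver used in this work.

```python
import math

def laplacian(u, dx):
    """Second-order finite-difference Laplacian with periodic boundaries."""
    n = len(u)
    return [(u[(i + 1) % n] - 2.0 * u[i] + u[i - 1]) / dx**2 for i in range(n)]

def dfdc(c, w=1.0):
    """Derivative of the hypothetical double well f0 = w c^2 (1 - c)^2."""
    return 2.0 * w * c * (1.0 - c) * (1.0 - 2.0 * c)

def cahn_hilliard_step(c, dt, dx, mobility=1.0, kappa=1.0):
    """One explicit Euler step of dc/dt = M lap(df0/dc - kappa lap(c))."""
    lap_c = laplacian(c, dx)
    mu = [dfdc(ci) - kappa * li for ci, li in zip(c, lap_c)]  # Eq. (3), constant kappa
    lap_mu = laplacian(mu, dx)
    return [ci + dt * mobility * lm for ci, lm in zip(c, lap_mu)]

# Small sinusoidal perturbation around c = 0.5 (inside the spinodal of the toy well).
n, dx, dt = 64, 1.0, 0.01
c0 = [0.5 + 0.01 * math.sin(2.0 * math.pi * i / n) for i in range(n)]
c = list(c0)
for _ in range(200):
    c = cahn_hilliard_step(c, dt, dx)

# The conservative form of the equation keeps the total amount of the component fixed.
mass_drift = abs(sum(c) - sum(c0))
```

Because the right-hand side is the divergence of a flux, the sum of the field over the periodic domain is preserved up to round-off, which is the discrete counterpart of the continuity equation.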

Thermodynamic model

The Gibbs free energy of the B2 phase is formulated with an order-disorder model [24, 25], in which the disordered A2 and ordered B2 states are combined in a single description. The sublattice model of the A2 phase is \((\text {Al,Cr,Fe,Ni,Va})_{1.0}\), meaning that there is one site fraction for each component and one additional to model vacant lattice sites. The variables describing the occupation of each constituent in a given sublattice are the site fractions and are represented as \(x_i\), with \(i=\text {Al, Cr, Fe, Ni, and Va}\), for the A2 phase, while the sublattice model of the B2 phase is \((\text {Al,Cr,Fe,Ni,Va})_{0.5}(\text {Al,Cr,Fe,Ni,Va})_{0.5}\). Therefore, there are two sublattices (\(S=2\)), two site fraction variables for each component and two additional for vacancies. The site fractions of the B2 phase are represented as \(y_i^{(s)}\), with \(i=\text {Al, Cr, Fe, Ni, and Va}\), and s is the sublattice index. The number in the parenthesis subscript is the stoichiometric factor, which is \(N^{(s)}=1.0\) for A2 and \(N^{(s)}=0.5\) for B2 in both sublattices.

The expression for the molar Gibbs free energy of the B2 phase is defined as

$$\begin{aligned} G_{\text {m}}^{\text {B2}} = G_{\text {m}}^{\text {dis}} + \Delta G_{\text {m}}^{\text {ord}}, \end{aligned}$$

with \(G_{\text {m}}^{\text {dis}}=G_{\text {m}}^{\text {A2}}\) being the contribution from the disordered A2 phase, and

$$\begin{aligned} \Delta G_{\text {m}}^{\text {ord}} = G_{\text {m}}^{\text {ord}}(y_i^{(s)}) - G_{\text {m}}^{\text {ord}}(y_i^{(s)}=x_i), \end{aligned}$$

is used to describe chemical ordering. In this term, \(G_{\text {m}}^{\text {ord}}\) is the ordering contribution to the molar Gibbs free energy, which is evaluated once as a function of the site fractions of the B2 phase, \(G_{\text {m}}^{\text {ord}}(y_i^{(s)})\), and once with the site fractions of B2 set equal to those of the A2 phase, \(G_{\text {m}}^{\text {ord}}(y_i^{(s)}=x_i)\). This procedure guarantees that \(\Delta G_{\text {m}}^{\text {ord}}=0\) when all \(y_i^{(s)}=x_i\), i.e., the phase is in a disordered state, and \(\Delta G_{\text {m}}^{\text {ord}}\ne 0\) when all or some \(y_i^{(s)}\ne x_i\), i.e., the phase is partially or fully ordered. The site fractions of the A2 phase can be calculated from the site fractions of the B2 phase with the relation \(x_i=\sum ^{S}_{s}N^{(s)}y_i^{(s)}\), so that \(G_{\text {m}}^{\text {B2}}\) can be written as a function of the site fractions of the B2 phase only.

The mole fractions in the PFM are calculated as a function of the site fraction with the relation

$$\begin{aligned} X_{i} = \frac{\sum _{s}^{S} N^{(s)} y_{i}^{(s)}}{\sum _{s}^{S} N^{(s)}\sum _{j\ne \text {Va}}y_{j}^{(s)}} = \frac{\sum _{s}^{S} N^{(s)} y_{i}^{(s)}}{\text {n}_{\text {f}}}. \end{aligned}$$
(5)

\(\text {n}_{\text {f}}\) in Eq.(5) is the proportion of the sublattice positions occupied by constituents that contribute to the content of matter, i.e., excluding vacancies. When the site fractions are given, Eq.(5) can be used to calculate the corresponding mole fractions. However, when composition is given in mole fractions, it is generally not possible to compute the corresponding site fraction. Therefore, the values of the site fractions must be obtained through the minimization of the Gibbs free energy.
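The forward direction of Eq. (5), from site fractions to mole fractions, is a direct evaluation and can be sketched as follows. The function name, the dictionary layout, and the numerical site fractions are hypothetical and chosen only to illustrate the role of the normalization factor.

```python
SPECIES = ("Al", "Cr", "Fe", "Ni")  # vacancies (Va) do not contribute matter

def mole_fractions(y, ratios=(0.5, 0.5)):
    """Evaluate Eq. (5). `y` holds one dict of site fractions per sublattice."""
    # n_f: share of sublattice positions occupied by real atoms
    n_f = sum(N * sum(ys[sp] for sp in SPECIES) for N, ys in zip(ratios, y))
    X = {sp: sum(N * ys[sp] for N, ys in zip(ratios, y)) / n_f for sp in SPECIES}
    return X, n_f

# Illustrative (made-up) B2 site fractions with vacancies on the first sublattice.
y_example = [
    {"Al": 0.05, "Cr": 0.30, "Fe": 0.40, "Ni": 0.20, "Va": 0.05},
    {"Al": 0.45, "Cr": 0.10, "Fe": 0.15, "Ni": 0.30, "Va": 0.00},
]
X, n_f = mole_fractions(y_example)
```

Many different sets of site fractions map to the same mole fractions, which is why the inverse direction requires the energy minimization described above.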

The degree of ordering [30], which is defined as

$$\begin{aligned} DOO = \frac{1}{C}\sum _{s}^{S}\sum _{i}^{C}\frac{N^{(s)}\left| y_{i}^{(s)}-X_{i}\right| }{\sum _{s}^{2}N^{(s)}y_{i}^{(s)}}, \end{aligned}$$
(6)

is used to identify the phase and visualize the microstructure. We assume that the alloy is disordered if \(DOO < 10^{-2}\) and ordered or partially ordered if \(DOO \ge 10^{-2}\). In Eq. (6), C is the number of components, which is equal to four in the alloy under study.
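A minimal sketch of how Eq. (6) can be evaluated and thresholded is given below. It assumes that the deviation of each site fraction from the mole fraction enters as an absolute value, so that the contributions of the two sublattices do not cancel; the function name and the example site fractions are hypothetical.

```python
def degree_of_ordering(y, X, ratios=(0.5, 0.5)):
    """Evaluate Eq. (6), taking |y - X| as the per-sublattice deviation (assumption)."""
    total = 0.0
    for sp in X:
        denom = sum(N * ys[sp] for N, ys in zip(ratios, y))  # must be non-zero
        total += sum(N * abs(ys[sp] - X[sp]) for N, ys in zip(ratios, y)) / denom
    return total / len(X)

X = {"Al": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25}

# Disordered state: both sublattices carry the overall composition.
disordered = [dict(X), dict(X)]
# Ordered state (made-up values): Al prefers sublattice 1, Ni sublattice 2.
ordered = [
    {"Al": 0.45, "Cr": 0.30, "Fe": 0.20, "Ni": 0.05},
    {"Al": 0.05, "Cr": 0.20, "Fe": 0.30, "Ni": 0.45},
]

doo_disordered = degree_of_ordering(disordered, X)
doo_ordered = degree_of_ordering(ordered, X)
is_ordered = doo_ordered >= 1e-2  # threshold used in the text
```

With these inputs the disordered state evaluates to zero and the ordered state lies well above the threshold, which is the behavior needed to color the microstructure maps.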

Both \(G_{\text {m}}^{\text {A2}}\) and \(G_{\text {m}}^{\text {ord}}\) models are formulated according to the CALPHAD method, containing the terms,

$$\begin{aligned} G_{\text {m}} = G_{\text {ref}} + G_{\text {mix}} + G_{\text {exs}} + G_{\text {phys}}, \end{aligned}$$
(7)

in which \(G_{\text {ref}}\) describes a surface of reference formed by the free energy of the pure components or end-members. \(G_{\text {mix}}\) is known as the ideal mixing term and models the configurational entropy of the phase. \(G_{\text {exs}}\) is the excess term and is used to model any contribution that deviates from the ideal behavior. The last term, \(G_{\text {phys}}\), can include the contribution from additional physics; for the B2 phase, it contains a model describing magnetic ordering.

The form of each term in Eq. (7) is well-known in the literature [24,25,26] and is not presented here. All terms in the Gibbs free energy model of the B2 phase in Eq. (7) are formulated as a function of site fractions, i.e., \(G_{\text {m}}^{\text {B2}}(y_{\text {Al}}^{(1)},y_{\text {Cr}}^{(1)},y_{\text {Fe}}^{(1)},y_{\text {Ni}}^{(1)},y_{\text {Va}}^{(1)},y_{\text {Al}}^{(2)},y_{\text {Cr}}^{(2)},y_{\text {Fe}}^{(2)},y_{\text {Ni}}^{(2)},y_{\text {Va}}^{(2)})\), while the free energy in Eq. (4) is formulated as a function of mole fractions, \(f_0(X_{\text {Al}},X_{\text {Cr}},X_{\text {Fe}},X_{\text {Ni}})\).

The derivatives of the free energy density in Eq. (7), i.e., the diffusion potentials, are obtained as

$$\begin{aligned} \frac{\partial f_{0}}{\partial X_{j}} = \frac{1}{V_{\text {m}}}\tilde{\mu }_{j} = \frac{1}{V_{\text {m}}N^{(1)}}\left( \frac{\partial G_{\text {m}}}{\partial y^{(1)}_{j}} - \frac{\partial G_{\text {m}}}{\partial y^{(1)}_{\text {Ni}}}\right) , \end{aligned}$$

and its derivatives,

$$\begin{aligned} \frac{\partial ^2 f_{0}}{\partial X_{j}\partial X_{k}} = \frac{1}{V_{\text {m}}}\frac{\partial \tilde{\mu }_{j}}{\partial X_{k}} = \frac{1}{V_{\text {m}}N^{(1)}}\left( \frac{\partial ^2 G_{\text {m}}}{\partial y^{(1)}_{j}\partial y^{(1)}_{k}} - \frac{\partial ^2 G_{\text {m}}}{\partial y^{(1)}_{\text {Ni}} \partial y^{(1)}_{k}}\right) , \end{aligned}$$

which are required to numerically solve Eq. (4). According to [26], any sublattice can be selected for the computation of the derivatives, and we consistently used the first sublattice in all cases.

Mobility

The kinetic parameter \(L_{ij}\) in Eq. (4) is a phenomenological coefficient expressing the mobility of component i with respect to a diffusion potential gradient of component j and is defined as

$$\begin{aligned} L_{ij} = \sum ^{C}_{k} (\delta _{ik} - X_{i})(\delta _{jk} - X_{j})X_{k}M_{k}, \end{aligned}$$
(8)

where \(\delta _{ik}\) and \(\delta _{jk}\) are Kronecker delta functions, and \(M_{k}\) is the atomic mobility of component k given as

$$\begin{aligned} M_{k} = M_{k}^{0}\,\exp {\left( \frac{-Q_{k}}{\text {R}T}\right) }\frac{1}{\text {R}T}, \end{aligned}$$

where \(M_{k}^{0}\) is a frequency factor and \(Q_{k}\) the activation energy. The order-disorder transition is modeled similarly to the Gibbs free energy, with the term

$$\begin{aligned} Q_{k}= Q_{k}^{\text {dis}} + \Delta Q_{k}^{\text {ord}}. \end{aligned}$$

\(Q_{k}^{\text {dis}}=Q_{k}^{\text {A2}}\) is the contribution from the disordered A2 phase, and

$$\begin{aligned} \Delta Q_{k}^{\text {ord}} = \sum _{l}^{C}\sum _{m\ne l}^{C}\Delta Q_{k}^{\text {ord}}\left( y_{l}^{(1)}y_{m}^{(2)} - X_{l}X_{m}\right) . \end{aligned}$$
(9)

The parameters \(Q_{k}\) and \(M_{k}^{0}\) have their composition dependency represented by a Redlich–Kister polynomial [31] and are acquired from open databases available in the literature [31,32,33]. The site fractions in Eq. (9) are obtained from the minimization of the Gibbs free energy; consequently, their values must also be provided for the computation of the mobilities.
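Equation (8) and the Arrhenius expression for the atomic mobility can be sketched as below. The frequency factors and activation energies are placeholders rather than values from the cited databases, and the ordering contribution of Eq. (9) is left out, so the sketch corresponds to a disordered state only.

```python
import math

R = 8.3145  # molar gas constant, J mol^-1 K^-1

def atomic_mobility(m0, q, T):
    """M_k = M_k^0 exp(-Q_k / (R T)) / (R T)."""
    return m0 * math.exp(-q / (R * T)) / (R * T)

def kinetic_matrix(X, M):
    """Eq. (8) for all C components; X and M are dicts keyed by component."""
    comps = list(X)
    L = {}
    for i in comps:
        for j in comps:
            # (i == k) plays the role of the Kronecker delta
            L[i, j] = sum(
                ((i == k) - X[i]) * ((j == k) - X[j]) * X[k] * M[k] for k in comps
            )
    return L

T = 1500.0
X = {"Al": 0.25, "Cr": 0.25, "Fe": 0.25, "Ni": 0.25}
# Hypothetical (M0, Q) pairs, for illustration only.
params = {"Al": (1.0e-4, 2.4e5), "Cr": (2.0e-4, 2.6e5),
          "Fe": (1.5e-4, 2.5e5), "Ni": (1.2e-4, 2.7e5)}
M = {k: atomic_mobility(m0, q, T) for k, (m0, q) in params.items()}
L = kinetic_matrix(X, M)
row_sums = {i: sum(L[i, j] for j in X) for i in X}
```

Two properties follow from the form of Eq. (8) whenever the mole fractions sum to one: the matrix is symmetric, and each row summed over all C components vanishes, which reflects the fact that the fluxes in the volume-fixed frame add up to zero.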

Gradient energy coefficient

The model for the gradient energy coefficient is formulated as

$$\begin{aligned} \kappa _{j} = \max \left[ \frac{a^2}{4} \frac{\sum _{k \ne j} \frac{X_k}{V_{\text {m}}N^{(1)}}\left| \frac{\partial ^2 G_{\text {exs}}}{\partial y^{(1)}_{j}\partial y^{(1)}_{k}}\right| }{1 - X_j}\right] . \end{aligned}$$
(10)

Here, \(a=2.85\,\text {\AA }\) is an effective interaction distance, which is assumed to be equal to the interatomic distance of the A2 phase and is obtained for the equiatomic composition from [34]. Eq. (10) was originally introduced in [35, 36] as a composition-dependent expression. We modify this model by taking the maximum value of \(\kappa _{j}\) over the entire composition domain, which makes this material property a constant and simplifies the numerical solution of Eq. (4) by avoiding higher-order derivatives of \(G_{\text {exs}}\).

Surrogate model

To compute the Gibbs free energy with Eq. (7) in Eq. (4), the equilibrium site fractions for a given set of mole fractions must be known, i.e., we assume that an underlying function of the form,

$$\begin{aligned} f(X_{\text {Al}},X_{\text {Cr}},X_{\text {Fe}})=y_{i}^{(s)}, \end{aligned}$$
(11)

exists for each site fraction on each sublattice, and for the B2 phase, ten of these functions are required. However, an analytical expression for (11) does not exist as its values can only be obtained through minimization of the Gibbs free energy.

Our strategy is to train a regression model with data computed with thermodynamic software and use it as a surrogate of Eq. (11) for each site fraction. The model inputs are therefore the mole fractions, and the outputs are the site fractions. The Gibbs free energy and diffusion potentials required in the PFM are then computed by evaluating each term in Eq. (7) with the site fractions obtained from this regression model.

We use Thermo-Calc 2020b TC-Toolbox for MATLAB R2020a (script provided in [37]) and the thermodynamic database from [3] to sample equilibrium data of the B2 phase in the \(\mathrm {AlCrFeNi}\) system. The computed data are saved as a dataset (example provided in [37]) containing \(10^{4}\) entries. Each entry is calculated by generating a random composition, i.e., assigning a random value to the mole fraction of each element while respecting the constraint \(\sum X_{i}=1.0\), at a constant temperature of \({1500}\,{\hbox {K}}\). Once the Gibbs free energy is minimized for a given condition, the values obtained for the Gibbs free energy, its derivatives and the corresponding site fractions are stored in the dataset. On a Windows computer with an i7-7700 CPU and 16 GB of RAM, this sampling procedure took approximately 14 minutes.
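The composition sampling step can be sketched independently of the thermodynamic software. The text does not state how the random mole fractions are drawn, so the sketch below assumes independent uniform draws that are normalized to satisfy the sum constraint; this choice is not uniform over the composition simplex, and a Dirichlet draw would be an alternative. The function name and the seed are hypothetical.

```python
import random

def random_composition(rng, components=("Al", "Cr", "Fe", "Ni")):
    """Draw one random composition with mole fractions summing to 1."""
    draws = [rng.random() for _ in components]
    total = sum(draws)
    return {c: d / total for c, d in zip(components, draws)}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [random_composition(rng) for _ in range(10**4)]
```

Each sampled composition would then be passed, together with the temperature of 1500 K, to the equilibrium calculation that returns the site fractions and the thermodynamic quantities.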

Analysis of the sampled dataset showed that the maximum values of the site fractions of vacancies encountered on each sublattice are \(\max \left( y^{(1)}_{\text {Va}}\right) =0.2356\) and \(\max \left( y^{(2)}_{\text {Va}}\right) =1.8 \times 10^{-4}\). This information is used to reduce the number of site fractions that are modeled. The normalization factor is simplified, as it can be calculated with the site fractions on the first sublattice only, as

$$\begin{aligned} \text {n}_{\text {f}} = 0.5\left( y^{(1)}_{\text {Al}} + y^{(1)}_{\text {Cr}} + y^{(1)}_{\text {Fe}} + y^{(1)}_{\text {Ni}} + 1.0\right) , \end{aligned}$$

since the sum of all site fractions on the second sublattice, excluding vacancies, is approximately 1.0.

Therefore, \(y_{\text {i}}^{(2)}\) can be calculated if \(X_{\text {i}}\), \(y_{\text {i}}^{(1)}\) and \(\text {n}_{\text {f}}\) are available, by modifying Eq.(5) as

$$\begin{aligned} y_{\text {i}}^{(2)} = \frac{X_{\text {i}}\text {n}_{\text {f}} - N^{(1)}y_{\text {i}}^{(1)}}{N^{(2)}} = 2X_{\text {i}}\text {n}_{\text {f}} - y_{\text {i}}^{(1)}. \end{aligned}$$

Additionally, we use the constraint \(\sum y^{(1)}_{i} = 1.0\) to indirectly track the site fraction of vacancies on the first sublattice as \(y^{(1)}_{\text {Va}} = 1.0 - y^{(1)}_{\text {Al}} - y^{(1)}_{\text {Cr}} - y^{(1)}_{\text {Fe}} - y^{(1)}_{\text {Ni}}\). With these simplifications, only the site fractions \(y^{(1)}_{\text {Al}}\), \(y^{(1)}_{\text {Cr}}\), \(y^{(1)}_{\text {Fe}}\) and \(y^{(1)}_{\text {Ni}}\) are required to compute Eq. (7).
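The reduction described above can be sketched as a small helper that rebuilds all ten site fractions from the mole fractions and the four modeled first-sublattice site fractions. The second-sublattice relation is obtained by solving Eq. (5) for \(y_i^{(2)}\) with \(N^{(1)}=N^{(2)}=0.5\); the function name and the numerical values are hypothetical.

```python
SPECIES = ("Al", "Cr", "Fe", "Ni")

def complete_site_fractions(X, y1, n1=0.5, n2=0.5):
    """Rebuild all site fractions from X and the first-sublattice values y1."""
    # Second sublattice assumed free of vacancies, so its atoms sum to 1.0.
    n_f = n1 * sum(y1[sp] for sp in SPECIES) + n2 * 1.0
    # Eq. (5) solved for the second sublattice.
    y2 = {sp: (X[sp] * n_f - n1 * y1[sp]) / n2 for sp in SPECIES}
    y1_full = dict(y1, Va=1.0 - sum(y1[sp] for sp in SPECIES))
    y2_full = dict(y2, Va=0.0)
    return y1_full, y2_full, n_f

# Made-up reference state used to check the round trip.
y1 = {"Al": 0.05, "Cr": 0.30, "Fe": 0.40, "Ni": 0.20}
y2_true = {"Al": 0.45, "Cr": 0.10, "Fe": 0.15, "Ni": 0.30}
n_f_true = 0.5 * sum(y1.values()) + 0.5 * sum(y2_true.values())
X = {sp: (0.5 * y1[sp] + 0.5 * y2_true[sp]) / n_f_true for sp in SPECIES}

y1_full, y2_full, n_f = complete_site_fractions(X, y1)
```

In a simulation, `y1` would come from the four regression models and `X` from the phase-field variables, so only four quantities need to be predicted per evaluation.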

Neural network

Neural networks (NNs) are known as universal function approximators and are used here as regression models that learn the relation between mole and site fractions. The input layer consists of three nodes (\(n_{x}=3\)), which are the mole fractions of aluminum, chromium and iron; the mole fraction of nickel is omitted since it is the dependent component. The output layer consists of a single site fraction variable (\(n_{y}=1\)); therefore, four NNs are needed for this approach. Taking into consideration that the NN is evaluated multiple times during the phase-field simulation, a simple and efficient NN architecture is desired. To select the NN parameters, a search is conducted for the optimal number of hidden layers L and number of neurons on each layer \(n_{h}\).

TensorFlow version 2.4.1 is used to construct and optimize the NNs, with tanh selected as the activation function for all hidden layers and a linear activation for the output layer. This choice of activation functions is commonly suggested in the literature when NNs are used as regression models of nonlinear functions, and in a preliminary study, it exhibited the best performance compared to other options. All training and validation data are scaled to the interval \([-1,1]\) in a pre-processing step. The Adam optimizer is used with a learning rate of \(10^{-4}\) and the mean squared error as loss function, with a batch size of 32 and for \(10^{4}\) epochs. The dataset obtained as described in the "Surrogate model" section is used for the NN optimization, with 80% of the entries assigned as training data and 20% as validation data. The choice of these parameters was also investigated in a preliminary study.

The training is conducted on an HPC cluster using a single core of an 18-core Xeon Gold 6140 CPU and 5 GB of RAM for each NN. No significant improvements in performance are observed when the training is conducted using multiple cores or a GPU since the NN architecture is relatively small in terms of trainable parameters. Therefore, multiple NNs are trained simultaneously on an HPC node, with an average duration of five hours for the given number of epochs. The weights and biases of the optimized model are saved after the training procedure is concluded.

Implementation and numerical details

The Multiphysics Object-Oriented Simulation Environment (MOOSE) is an open-source finite element framework [38, 39] in which the Cahn–Hilliard model Eq. (4) [40, 41] is efficiently implemented as part of the phase-field module. All simulations presented here are conducted using MOOSE, and the simulation input files are included in [37]. The 2D spatial domain of the simulations is discretized using 4-node quadrilateral elements, and the mole fraction fields are interpolated using Lagrange shape functions with periodic boundary conditions in all directions. Additionally, mesh adaptivity is employed to improve the simulation performance by reducing the number of required finite elements. The domain size of each simulation is 100 by 100 nanometers, initially discretized with 400 by 400 grid points. The preconditioned Jacobian-free Newton–Krylov (PJFNK) method is used to solve the system of equations, with the convergence tolerance for the linear solve set to \(10^{-4}\), and with relative and absolute tolerances of \(10^{-8}\) and \(10^{-10}\), respectively, for the nonlinear solve.

For the numerical solution of Eq. (4), the material properties are converted to their dimensionless form (identified with a star superscript) with the equations

$$\begin{aligned} \tilde{\mu }_{j}^{*} = \frac{\tilde{\mu }_{j}}{e_{c}}, \quad \kappa _{j}^{*} = \frac{\kappa _{j}}{e_{c}\,l_{c}^{2}}, \quad \text {and} \quad L_{ij}^{*} = \frac{L_{ij}\,e_{c}\,t_{c}}{l_{c}^{2}}. \end{aligned}$$

The characteristic length \(l_{c}\), energy \(e_{c}\) and time \(t_{c}\) are defined as

$$\begin{aligned} l_{c} = 10^{-8}\,\text {m}, \quad e_{c} = \frac{RT}{V_{\text {m}}}, \quad \text {and} \quad t_{c} = \frac{l_{c}^{2}}{e_{c}\,V_{\text {m}}\,\bar{L}}, \end{aligned}$$

with the molar gas constant \(R = 8.3145\,\hbox {J\,mol}^{-1}\,\hbox {K}^{-1}\), the temperature \(T = 1500\,\hbox {K}\), the molar volume \(V_{\mathrm {m}} = 7.7142 \times 10^{\,\text {-6}}\,\hbox {m}^{3}\,\hbox {mol}^{-1}\) (obtained from Thermo-Calc) and the average mobility \(\bar{L}=2.0 \times 10^{\,\text {-17}}\, \hbox {m}^{2}\hbox {mol\,J}^{-1}\,\hbox {s}^{-1}\), calculated with Eq. (8).
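The characteristic scales follow directly from the constants quoted above and can be checked with a few lines. The variable and function names below are hypothetical; the formulas are those of the two preceding equations.

```python
R = 8.3145        # molar gas constant, J mol^-1 K^-1
T = 1500.0        # temperature, K
V_M = 7.7142e-6   # molar volume, m^3 mol^-1
L_BAR = 2.0e-17   # average mobility, m^2 mol J^-1 s^-1
L_C = 1.0e-8      # characteristic length, m

e_c = R * T / V_M                    # characteristic energy density, J m^-3
t_c = L_C**2 / (e_c * V_M * L_BAR)   # characteristic time, s

def to_dimensionless(mu, kappa, L_ij):
    """Apply the scaling relations to one set of material properties."""
    return mu / e_c, kappa / (e_c * L_C**2), L_ij * e_c * t_c / L_C**2
```

With these inputs, the characteristic energy is approximately \(1.62 \times 10^{9}\,\hbox {J\,m}^{-3}\) and the characteristic time is approximately \(4.0 \times 10^{-4}\,\hbox {s}\).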

The NNs are implemented in MOOSE as materials objects (see [37]), with the equation for predictions given as

$$\begin{aligned} \mathbf {a}^{(n)}=g(\mathbf {W}^{(n)}\mathbf {a}^{(n-1)}+\mathbf {b}^{(n)}), \end{aligned}$$
(12)

where \(\mathbf {a}^{(n)}\) is a vector containing the neurons in layer n, g is the activation function (tanh or linear), \(\mathbf {W}^{(n)}\) is a matrix containing the weights and \(\mathbf {b}^{(n)}\) is a vector containing the biases. Using Eq. (12), the values of the neurons in a given layer \(\mathbf {a}^{(n)}\) can be calculated from the neurons in the previous layer \(\mathbf {a}^{(n-1)}\). Once all layers are computed, a feedforward pass through the NN is completed and the value of the output layer is obtained.
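A feedforward pass based on Eq. (12) can be sketched without any machine-learning library, which is essentially what an implementation inside a simulation code has to do. The tiny 3-2-1 network and its weights below are invented for illustration; the scaling of inputs and outputs to \([-1,1]\) is omitted.

```python
import math

def dense(weights, bias, a_prev, activation):
    """One layer of Eq. (12): a = g(W a_prev + b); `weights` is a list of rows."""
    return [
        activation(sum(w * a for w, a in zip(row, a_prev)) + b)
        for row, b in zip(weights, bias)
    ]

def predict(layers, x):
    """Apply tanh on all hidden layers and a linear activation on the last one."""
    a = list(x)
    for W, b in layers[:-1]:
        a = dense(W, b, a, math.tanh)
    W, b = layers[-1]
    return dense(W, b, a, lambda z: z)

# Hypothetical weights: 3 inputs (X_Al, X_Cr, X_Fe), 2 hidden nodes, 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.0, 0.3, -0.4]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.25]),
]
y_pred = predict(layers, [0.25, 0.25, 0.25])
```

In practice, the lists of weights and biases would be the values saved after training, and one such evaluation is needed per site fraction at every quadrature point and time step.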

The uncertainty in the fitting of the outputs (site fractions) to the input features (mole fractions) is contained in the values of the weights \(\mathbf {W}^{(n)}\) and biases \(\mathbf {b}^{(n)}\) and is cumulatively passed through each layer of the NN. Additionally, the uncertainty in the optimization of the CALPHAD parameters [42, 43] is propagated to the NN model. An uncertainty quantification and propagation analysis including the latter was not conducted in this work, as it requires access to the optimization procedure of the CALPHAD parameters. The validation of the NN models is based only on the errors measured by comparing the NN models with the validation dataset, see the "Neural network validation" section, and on the comparison of quantities obtained from thermodynamic equilibrium calculations with the simulation results, see the "Phase-field simulations" section.

The equation for the Gibbs free energy model of the B2 phase, Eq. (7), is constructed using the coefficients from the thermodynamic optimization in [3] and the pycalphad [30] Python package. The symbolic derivatives of Eq. (7) are obtained with the SymPy package [44]. Eq. (7) and its first and second derivatives are implemented as a materials object in MOOSE. The phase-field simulations are performed on an HPC cluster using 10 nodes, each with two 18-core Xeon Gold 6140 CPUs and 192 GB of RAM. The duration of each simulation (wall time) is given in Table 2.

Table 1 The composition used for each of the simulation cases is given in the table
Table 2 The equilibrium composition and the volume fraction of the B2 and A2 phases obtained from Thermo-Calc are given for each component of each alloy in the eq. column, and the values obtained from the simulations are given in the sim. column. Additionally, the absolute error between these quantities is displayed in the err. column. The simulation wall time (SWT) is given in hours and the number of time steps (NTS) is also provided

Results and discussion

Neural network validation

A series of NNs is trained with a varied number of layers \(L=2,3,4\) and nodes \(n_{h}=4,8,16,32,64\) in search of an optimal architecture. This procedure is conducted for the site fractions \(y^{(1)}_{\text {Al}}\), \(y^{(1)}_{\text {Cr}}\), \(y^{(1)}_{\text {Fe}}\) and \(y^{(1)}_{\text {Ni}}\), with a total of 60 NNs being trained. In Fig. 2a, the absolute error distribution of the NNs after training is plotted using box plots. The errors of all site fractions are combined in the same distribution for each architecture, i.e., each combination of L and \(n_{h}\). Additionally, in Fig. 2b, the wall time required to make a single prediction with each NN architecture is plotted.
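The size of the search can be made concrete by enumerating the architectures and counting their trainable parameters. The sketch assumes fully connected layers with three inputs, L hidden layers of equal width and one output, as described in the methods; the function name is hypothetical.

```python
def count_parameters(n_hidden_layers, n_nodes, n_in=3, n_out=1):
    """Weights plus biases of a fully connected network."""
    sizes = [n_in] + [n_nodes] * n_hidden_layers + [n_out]
    return sum(a * b + b for a, b in zip(sizes[:-1], sizes[1:]))

layer_options = (2, 3, 4)
node_options = (4, 8, 16, 32, 64)
site_fractions = ("y1_Al", "y1_Cr", "y1_Fe", "y1_Ni")

table = {(L, nh): count_parameters(L, nh) for L in layer_options for nh in node_options}
n_networks = len(table) * len(site_fractions)
```

The 15 architectures combined with four site fractions give the 60 networks mentioned above. A network with three hidden layers has 625 parameters for 16 nodes per layer and 2273 for 32 nodes per layer, since the count grows roughly quadratically with the layer width.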

From the analysis of Fig. 2, an NN with \(L=3\) and \(n_{h}=16\) is chosen as the optimal architecture and is used in further validation and for the phase-field simulations. Increasing the number of hidden layers from \(L=2\) to \(L=3\) provides a significant increase in accuracy, while an increase from \(L=3\) to \(L=4\) has a relatively small effect. A continuous decrease in the error is observed when increasing the number of nodes from \(n_{h}=4\) to \(n_{h}=16\), but improvements become limited for higher \(n_{h}\). The wall time needed to make predictions with each NN appears to be more sensitive to the number of nodes than to the number of layers under consideration. For instance, moving from \(n_{h}=16\) to \(n_{h}=32\) increases the wall time by a factor of two. Since the NN is evaluated at many points of the simulation mesh and at every time step, the prediction time is a crucial criterion in the choice of architecture.

The absolute error on each site fraction for the selected architecture is presented as box plots in Fig. 3a. The absolute error distribution on the dimensionless thermodynamic quantities, namely the Gibbs free energy \(G_{\text {m}}\) and the diffusion potentials \(\tilde{\mu }_{i}\), is plotted in Fig. 3b.

In Fig. 3a, we observe that the absolute error is similar for all site fractions and that the upper quartile is smaller than \(10^{-3}\) in all cases. The error distributions show that all variables behave alike, which facilitates modeling each of them with the same NN architecture. The thermodynamic quantities in Fig. 3b also display similar errors, with a slightly larger error occurring on the diffusion potential of aluminum \(\tilde{\mu }_{\text {Al}}\). By analyzing all terms in \(\tilde{\mu }_{i}\), we observed that the term \(G_{\text {mix}}\) in Eq. (7) gives the largest contribution to the error. For \(\tilde{\mu }_{\text {Al}}\), this term equals \(RT\log (Y^{(1)}_{\text {Al}}/Y^{(1)}_{\text {Ni}})\) and is the most sensitive to deviations in the values of the site fractions.
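This sensitivity can be made explicit with a first-order error propagation. The expression below is a sketch, assuming that log denotes the natural logarithm and that the site-fraction errors \(\delta Y^{(1)}_{\text {Al}}\) and \(\delta Y^{(1)}_{\text {Ni}}\) (symbols introduced here for illustration) are small compared with the site fractions themselves:

\[
\delta \left[ RT\log \frac{Y^{(1)}_{\text {Al}}}{Y^{(1)}_{\text {Ni}}}\right] \approx RT\left( \frac{\delta Y^{(1)}_{\text {Al}}}{Y^{(1)}_{\text {Al}}}-\frac{\delta Y^{(1)}_{\text {Ni}}}{Y^{(1)}_{\text {Ni}}}\right)
\]

The logarithmic term therefore responds to the relative rather than the absolute error, so an absolute error of fixed size weighs more heavily wherever one of the two site fractions is small.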

Phase-field simulations

In the pseudo-binary phase diagram in Fig. 1, the x-axis represents the mole fraction of aluminum in the \(\text {Al}_{n}\text {CrFe}_{2}\text {Ni}_{2}\) system, and the y-axis the temperature in kelvin. The liquid, FCC (A1), disordered BCC (A2), and ordered BCC (B2) phases are present in the displayed section. The miscibility gap, where the separation of the BCC phase occurs, is identified by the A2 + B2 label. The three green circled labels, numbered one to three, mark the alloy compositions considered in the phase-field simulations. This information is also provided in Table 1, which additionally includes a fourth alloy that does not lie within the displayed phase-diagram section. This fourth alloy has an equimolar composition.

The simulation results are presented in Fig. 4. The first row contains the DOO map of each alloy at the last time step for visualization purposes. The second, third and fourth rows contain the distribution of the composition map of aluminum, chromium and iron, respectively, for each alloy. Two peaks are observed in each distribution, which correspond to the compositions of the A2 and B2 phases. The numerical values of the peaks are displayed on the x-axis and in Table 2. The latter also contains the expected equilibrium composition, obtained from Thermo-Calc for each component and alloy, and the equilibrium volume fractions of the A2 (\(N^{\text {A2}}\)) and B2 (\(N^{\text {B2}}\)) phases. The absolute error between the equilibrium and simulated quantities is also displayed, and in the last two rows, the simulation duration in wall time (SWT) and the number of time steps (NTS) are given for each case. Each simulation is stopped once 10 seconds of simulated physical time have elapsed, and because of the adaptive time stepping, the NTS is different for each simulation.
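The peak extraction and error calculation described above can be sketched as follows. This is a minimal illustration on synthetic data, not the post-processing used in this work: the function name `two_phase_peaks`, the midpoint threshold used to separate the two modes, and the reference values are all assumptions.

```python
import numpy as np

def two_phase_peaks(composition, bins=100):
    """Locate the two histogram peaks of a bimodal composition map."""
    values = np.asarray(composition, dtype=float).ravel()
    counts, edges = np.histogram(values, bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    # Crude separation of the two modes: midpoint of the observed range.
    threshold = 0.5 * (values.min() + values.max())
    low_side = centers < threshold
    low_peak = centers[low_side][np.argmax(counts[low_side])]
    high_peak = centers[~low_side][np.argmax(counts[~low_side])]
    # Share of mesh points belonging to the phase richer in this component.
    high_fraction = np.mean(values >= threshold)
    return low_peak, high_peak, high_fraction

# Synthetic stand-in for a simulated composition map (not data from the paper).
rng = np.random.default_rng(0)
field = np.concatenate([rng.normal(0.10, 0.005, 3000),
                        rng.normal(0.30, 0.005, 7000)])
low_peak, high_peak, high_fraction = two_phase_peaks(field)

# Hypothetical reference values standing in for the Thermo-Calc equilibrium.
eq_low, eq_high = 0.10, 0.30
abs_err = (abs(low_peak - eq_low), abs(high_peak - eq_high))
print(low_peak, high_peak, high_fraction, abs_err)
```

In a real composition map, points inside the diffuse interface fall between the two peaks, so a fraction estimated with a simple threshold carries an uncertainty that grows with the interface area.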

The overall error between the equilibrium mole fractions and those obtained from the simulations has a mean value of 0.007. In all cases, the highest error is encountered on the mole fraction of chromium, since chromium is the component closest to dilute concentrations. The chosen NN architecture thus provides good accuracy at a low computational cost. The simulation duration in wall time, shown in Table 2, is longer for alloys two and three due to the smaller size of the A2 particles and the consequently larger interface area.

The simplifications discussed in the "Surrogate model" section, which allowed us to reduce the number of modeled site fractions to four, are possible due to the specifications of the B2 sublattice model. If these simplifications are not possible, e.g., for sublattice models that do not have all components present in all sublattices, it is still feasible to model all site fractions in all sublattices. Therefore, the approach presented here is generally applicable to CALPHAD sublattice models.
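For reference, the standard CALPHAD relation that maps site fractions to mole fractions can be sketched for an arbitrary sublattice model. The sketch assumes a vacancy-free model; the function name, the site ratios, and the numerical values are illustrative and are not taken from the B2 description used in this work. A component that is absent from a sublattice simply has a zero entry in that row.

```python
import numpy as np

def mole_fractions(site_fractions, site_ratios):
    """x_i = sum_s a_s * y_i^(s) / sum_s a_s, assuming no vacancies.

    site_fractions: array (n_sublattices, n_components); each row sums to 1.
    site_ratios:    array (n_sublattices,) of stoichiometric factors a_s.
    """
    y = np.asarray(site_fractions, dtype=float)
    a = np.asarray(site_ratios, dtype=float)
    return a @ y / a.sum()

# Hypothetical two-sublattice example with equal site ratios.
y = [[0.40, 0.10, 0.20, 0.30],   # sublattice 1: Al, Cr, Fe, Ni
     [0.10, 0.20, 0.30, 0.40]]   # sublattice 2: Al, Cr, Fe, Ni
x = mole_fractions(y, [0.5, 0.5])
print(x, x.sum())
```

This forward map is cheap to evaluate, whereas recovering the site fractions from given mole fractions is the direction modeled by the NNs.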

Conclusion

A new general approach for using CALPHAD sublattice models in mole-fraction-based PFM equations is proposed, in which neural networks are used to model the relation between mole and site fractions. It is shown that a simple NN architecture can be used to aid the calculation of the molar Gibbs free energy and its derivatives with good accuracy. The NN training parameters are kept as close to the defaults as possible to facilitate the reproduction of this approach. Nonetheless, more complex architectures might be beneficial in reducing the observed error or the evaluation time.

In future work, temperature could also be considered as an NN input, which would allow for the simulation of solidification or phase transformations over a temperature range. Furthermore, Bayesian NNs might be employed to address uncertainty quantification and error propagation in the model by providing a probabilistic approach to the regression problem. However, since these models are more computationally demanding than the conventional NNs used in this work, the duration of the NN training step is expected to increase.

Finally, this approach can be extended to multiphase systems in which the number of sublattices, the stoichiometric factors, and the sublattice occupancy are not necessarily the same for all phases. This is possible only because the phase-field model is expressed as a function of the mole fractions.