Control of a bioreactor using reinforcement learning
Zero-dimensional model of a bioreactor
A bioreactor is a dynamical system that can be modeled as a zero-dimensional, perfectly stirred reactor of constant volume \(V\) where biomass is produced from a substrate. The substrate is continually added to the bioreactor at the rate \(F\) (in \(L/h\)), and the products of the bioreactor are expelled at the same rate, thus maintaining a constant volume in the reactor.
This process can be described by an initial value problem for a system of two ordinary differential equations (ODEs):
\(\begin{equation} \begin{cases} \frac{d X}{d t} = \left( \mu(S) - D \right) X \\ \frac{d S}{d t} = D \left( S_{\text{in}} - S \right) - \frac{\mu(S)}{Y_d} X \end{cases} \end{equation}\)
with the initial conditions
\(\begin{equation} \begin{cases} X(t = 0) = X_0 \\ S(t = 0) = S_0 \end{cases} \end{equation}\)
where \(X\) is the biomass concentration in \(g/L\), \(S\) is the substrate concentration in \(g/L\), \(S_{\text{in}}\) is the substrate concentration in the feed in \(g/L\), \(D = F / V\) is the dilution rate in \(1/h\), \(\mu(S)\) is the specific biomass growth rate in \(1/h\), and \(Y_d\) is the yield coefficient, i.e., the mass of biomass obtained per unit mass of substrate consumed.
The biomass growth rate is often modeled using Monod kinetics:
\(\begin{equation} \mu(S) = \mu_{\text{max}} \frac{S}{K_S + S} \end{equation}\)
where \(\mu_{\text{max}}\) is the maximum growth rate in \(1/h\) and \(K_S\) is the half-saturation constant in \(g/L\).
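The right-hand side of the ODE system with Monod kinetics can be sketched in a few lines of Python. The function names `monod` and `bioreactor_rhs` are hypothetical and chosen only for illustration. This is a minimal sketch, not a reference implementation.

```python
def monod(S, mu_max, K_S):
    """Monod growth rate in 1/h for a substrate concentration S in g/L."""
    return mu_max * S / (K_S + S)


def bioreactor_rhs(X, S, D, S_in, Y_d, mu_max, K_S):
    """Return (dX/dt, dS/dt) of the two-equation bioreactor model."""
    mu = monod(S, mu_max, K_S)
    dXdt = (mu - D) * X                      # growth minus dilution
    dSdt = D * (S_in - S) - mu / Y_d * X     # feed minus consumption
    return dXdt, dSdt
```

Note that `monod(K_S, mu_max, K_S)` evaluates to `mu_max / 2`, which is why \(K_S\) is called the half-saturation constant.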
For more realism, we can also model the gradual decay of biomass when substrate is no longer supplied. In the model above, \(X\) stays constant once \(D \rightarrow 0\) and \(S \rightarrow 0\), because \(\mu(0) = 0\) under Monod kinetics. Realistically, without new substrate, the biomass eventually starves. This can be modeled by adding a decay rate constant, \(k_d\) (in \(1/h\)), to the first ODE:
\(\begin{equation} \frac{d X}{d t} = \left( \mu(S) - D - k_d \right) X \end{equation}\)
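To illustrate the effect of the decay term, the sketch below evaluates \(dX/dt\) in the starved limit \(D = 0\), \(S = 0\), with and without \(k_d\). In that limit the equation reduces to \(dX/dt = -k_d X\), whose solution is \(X(t) = X_0 e^{-k_d t}\). The function names and the numerical values (\(\mu_{\text{max}} = 0.5\), \(K_S = 0.5\), \(k_d = 0.02\)) are assumptions made for illustration only.

```python
import math


def biomass_rate(X, S, D, mu_max, K_S, k_d=0.0):
    """dX/dt with Monod growth, dilution, and an optional decay rate k_d."""
    mu = mu_max * S / (K_S + S)
    return (mu - D - k_d) * X


def starved_biomass(X0, k_d, t):
    """Closed-form X(t) in the limit D = 0 and S = 0: exponential decay."""
    return X0 * math.exp(-k_d * t)


# Assumed illustrative values, not taken from the text.
no_decay = biomass_rate(X=3.0, S=0.0, D=0.0, mu_max=0.5, K_S=0.5)
with_decay = biomass_rate(X=3.0, S=0.0, D=0.0, mu_max=0.5, K_S=0.5, k_d=0.02)
half_life = math.log(2.0) / 0.02   # time for the starved biomass to halve, in h
print(no_decay, with_decay, half_life)
```

Without the decay term the rate is exactly zero, so the biomass would persist indefinitely. With \(k_d > 0\) the rate is negative and the biomass halves every \(\ln 2 / k_d\) hours.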
Control action
This dynamical system is controlled by choosing the dilution rate, \(D\), so as to maximize the rate at which biomass is expelled from the reactor, \(D X\), also known as the reactor’s productivity. If \(D\) is too low, too little substrate is supplied, biomass growth is hampered, and the biomass output is correspondingly small. If \(D\) is too high, the reactor can wash out, i.e., biomass is removed faster than it can grow and, over time, disappears from the reactor completely.
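For a constant \(D\), this trade-off can be made concrete by setting both time derivatives to zero. Assuming Monod kinetics and the model with the decay term, the nontrivial steady state satisfies \(\mu(S^*) = D + k_d\), and it exists only while \(D + k_d < \mu(S_{\text{in}})\); otherwise the only steady state is washout, \(X^* = 0\). The sketch below encodes this derivation. The function name `steady_state` and all parameter values are hypothetical, chosen only for illustration.

```python
def steady_state(D, S_in, Y_d, mu_max, K_S, k_d=0.0):
    """Return (X_ss, S_ss, productivity) for a constant dilution rate D > 0."""
    if D <= 0.0:
        raise ValueError("this sketch assumes a positive dilution rate")
    mu_feed = mu_max * S_in / (K_S + S_in)   # largest growth rate the feed can sustain
    if D + k_d >= mu_feed:
        return 0.0, S_in, 0.0                # washout: no biomass, no productivity
    S_ss = K_S * (D + k_d) / (mu_max - D - k_d)    # from mu(S_ss) = D + k_d
    X_ss = Y_d * D * (S_in - S_ss) / (D + k_d)     # from dS/dt = 0
    return X_ss, S_ss, D * X_ss


# Assumed illustrative parameters, not taken from the text.
params = dict(S_in=10.0, Y_d=0.5, mu_max=0.5, K_S=0.5, k_d=0.02)

# Coarse grid search for the constant D with the highest steady-state productivity.
grid = [0.01 * i for i in range(1, 100)]
best_D = max(grid, key=lambda D: steady_state(D, **params)[2])
print(best_D, steady_state(best_D, **params))
```

The grid search only identifies the best constant dilution rate at steady state. It says nothing about transients, which is where a feedback controller such as a reinforcement learning agent could plausibly add value.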
Below, we visualize three scenarios, all starting from the initial conditions
\(\begin{equation} \begin{cases} X(t = 0) = 1.0 \frac{g}{L} \\ S(t = 0) = 5.0 \frac{g}{L} \end{cases} \end{equation}\)
First, a good control case, where the dilution rate is neither too low nor too high. With just the right amount of substrate continually supplied to the reactor, the biomass concentration settles at a steady state at a high value.
Second, a zero dilution rate causes the biomass to gradually decay due to starvation once the initial substrate has been consumed.
Third, too high a dilution rate causes washout: over time, the biomass is completely removed from the bioreactor and its production cannot be restored.
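The three scenarios can be reproduced qualitatively with a short simulation. The sketch below integrates the model with the decay term using a hand-written classical fourth-order Runge-Kutta scheme and a constant \(D\). Only the initial conditions come from the text. The kinetic parameters, the three dilution rates, the time horizon, and the function name `simulate` are assumptions made for illustration.

```python
def simulate(D, X0=1.0, S0=5.0, t_end=200.0, dt=0.01,
             mu_max=0.5, K_S=0.5, Y_d=0.5, S_in=10.0, k_d=0.02):
    """Integrate the model at a constant dilution rate D with classical RK4.

    Returns a list of (t, X, S) samples. All defaults except X0 and S0
    are assumed values chosen for illustration.
    """

    def rhs(X, S):
        mu = mu_max * S / (K_S + S)           # Monod kinetics
        dX = (mu - D - k_d) * X               # growth, dilution, decay
        dS = D * (S_in - S) - mu / Y_d * X    # feed minus consumption
        return dX, dS

    n_steps = int(round(t_end / dt))
    X, S = X0, S0
    history = [(0.0, X, S)]
    for i in range(n_steps):
        k1 = rhs(X, S)
        k2 = rhs(X + 0.5 * dt * k1[0], S + 0.5 * dt * k1[1])
        k3 = rhs(X + 0.5 * dt * k2[0], S + 0.5 * dt * k2[1])
        k4 = rhs(X + dt * k3[0], S + dt * k3[1])
        X += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        S += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        history.append(((i + 1) * dt, X, S))
    return history


# Assumed dilution rates (in 1/h) for the three scenarios.
scenarios = {"good control": 0.3, "starvation": 0.0, "washout": 0.8}
for name, D in scenarios.items():
    t, X, S = simulate(D)[-1]
    print(f"{name:>12}: D = {D:.1f} 1/h, X({t:.0f} h) = {X:.3g} g/L, "
          f"D*X = {D * X:.3g} g/(L h)")
```

With these assumed parameters, the first case settles at a high biomass concentration, the second first grows on the initial substrate and then decays, and the third drives \(X\) toward zero while \(S\) approaches \(S_{\text{in}}\).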