
[MATPY00] Exponential Trends for the PH COVID-19 Data


Our MATPY00 class collected the data on the confirmed cases of COVID-19 in the Philippines (data is available through this link). The purpose of this post is to provide a data-based analysis (an "experiment") that applies what we learned about the exponential function, as detailed in the previous post.

First, we noted that the first infected individual in the Philippines was reported on January 30, 2020. The class then collected data from official bulletins of the Department of Health (DOH) and from news reports to reconstruct the time series of the number of infected individuals. The increase in the number of infections is plotted in Figure 1 below.

Figure 1. The number of COVID-19 infected individuals in the Philippines, plotted with day number (with day 0 corresponding to January 30, 2020). The rapid rise, especially after day 40, is indicative of exponential growth. 
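
To make the day-numbering convention concrete, here is a minimal Python sketch that converts calendar dates to day numbers, with day 0 set to January 30, 2020 as in Figure 1. The names `DAY_ZERO` and `day_number` are illustrative and are not part of the class materials.

```python
from datetime import date

# Day 0 is the date of the first reported case, as stated in the post.
DAY_ZERO = date(2020, 1, 30)

def day_number(d: date) -> int:
    """Return the number of days elapsed since day 0 (January 30, 2020)."""
    return (d - DAY_ZERO).days

print(day_number(date(2020, 3, 5)))   # 35
print(day_number(date(2020, 3, 15)))  # 45
```

Note that 2020 is a leap year, so February contributes 29 days to the count.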


Note: If you tried doing the Try this yourself portion from the previous post, you will notice that the obtained plot is similar in shape to the trend of the exponential function. 

Now, let us go back to the derivation of the exponential function. Recall that the exponential trend results from the following differential equation: 

\[\begin{equation}\label{eqn:expde} \frac{dy}{dt} = ry \end{equation}\]
i.e. the rate of change in the quantity \(y\) is proportional to its current value, with the rate \(r\), the constant of proportionality, assumed to be constant. The solution, as we obtained previously, is:

\[\begin{equation}\label{eqn:expsoln} y(t) = y_0 e^{rt} \end{equation}\]
where \(y_0\) is the value at \(t = 0\). 
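
As a quick numerical check of this solution, the sketch below steps the differential equation forward in small time increments (forward Euler) and compares the result with \(y_0 e^{rt}\). The values of `y0`, `r`, and `t_end` are illustrative assumptions and are not taken from the COVID-19 data.

```python
import math

def euler_growth(y0, r, t_end, steps):
    """Integrate dy/dt = r*y from t = 0 to t = t_end with forward Euler steps."""
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += r * y * dt  # the change in y is proportional to the current y
    return y

# Illustrative values only (assumptions, not fitted to any data).
y0, r, t_end = 1.0, 0.2, 10.0

numeric = euler_growth(y0, r, t_end, steps=100_000)
exact = y0 * math.exp(r * t_end)

print(numeric, exact)  # the two values should agree closely
```

The smaller the time step, the closer the stepped value comes to the exponential formula.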

Now, let us go back to our data. One readily notices that the number of infected individuals does not rise for a long period, until about March 6. During this period there is no rate of increase to speak of, i.e. \(\frac{dy}{dt} = 0\) (zero rate of change), so we will not use these data for fitting the exponential function. We will therefore start the fitting from March 5 (day 35) onwards. Using the regression functions in Excel, we obtain the fit shown in Figure 2:

Figure 2. Best-fit exponential trend for data from day 35 (March 5, 2020) onwards. 

The exponential fit (blue line) is described by the equation shown in the figure, \(y = 0.0027 e^{0.2272x}\) (where, here, \(t \rightarrow x\)). Let us give some remarks about the fit obtained.

Comparing the obtained equation of the fit with Equation (\(\ref{eqn:expsoln}\)), we will find that: 
  • \(y_0 \rightarrow 0.0027\). Why is it a fractional number? Remember that we did not start the exponential trend at \(t = 0\); in fact, from Figure 2, you can see that the starting day of the exponential curve is at \(t = 35\). The fitted \(y_0\) is the value the curve would have if it were extrapolated back to \(t = 0\), and this translation in time results in the fractional value. It is still related to the number of infected individuals at the start of the fit, since \(y_0 e^{rt} = \left[y_0 e^{35r}\right] e^{r(t - 35)}\).
  • \(r \rightarrow 0.2272\) per day. This rate is also fractional; while this may sound promising, please take note that we are just at the start of the data collection, and the value may change as the identification of cases improves.
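
To get a feel for what these two numbers mean, the following Python sketch evaluates the fitted curve at the start of the fit and converts the rate into a doubling time and a day-to-day growth factor. The values 0.0027 and 0.2272 are the ones reported above; the function name `fitted_cases` and the derived quantities are illustrative additions, not part of the original Excel analysis.

```python
import math

Y0 = 0.0027  # fitted prefactor reported in the post
R = 0.2272   # fitted rate reported in the post, per day

def fitted_cases(t):
    """Value of the fitted exponential y0 * exp(r * t) at day t."""
    return Y0 * math.exp(R * t)

# Time for the fitted curve to double: solve exp(r * T) = 2 for T.
doubling_time = math.log(2) / R

# Fractional increase of the fitted curve from one day to the next.
daily_growth = math.exp(R) - 1

print(f"fitted value at day 35: {fitted_cases(35):.1f}")
print(f"doubling time: {doubling_time:.2f} days")
print(f"daily growth: {daily_growth:.1%}")
```

Under these fitted values the curve doubles roughly every three days, so a rate that looks small as a number still describes rapid growth.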


Before proceeding any further, let us look at a technique that shows how exponential functions can be fitted to our data: linearizing the exponential trend given by Equation (\(\ref{eqn:expsoln}\)). To do this, we again use the inverse function of the exponential, the natural logarithm, and apply it to both sides.

\[\begin{equation}\label{eqn:lin1} \ln y = \ln \left[y_0 e^{rt}\right] \end{equation}\] 
\[\begin{equation}\label{eqn:lin2} \ln y = rt + \ln y_0 \end{equation}\] 
Now, let us denote \(\ln y \rightarrow y'\) (the prime here marks the transformed variable, not a derivative). In other words, instead of plotting the \(y\)-axis in a linear scale, we can plot it in logarithmic scale. This results in an equation that looks like a linear equation: 

\[\begin{equation}\label{eqn:lin3} y' = (r)t + \ln y_0 \end{equation}\] 
where the slope is \(r\) and the y-intercept is \(\ln y_0\). 
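
The linearization can be turned into a fitting recipe: take the logarithm of the data, fit a straight line, and read the rate off the slope. The sketch below does this with an ordinary least-squares line written out by hand. The data here are synthetic and noise-free, generated from the fit values reported in this post purely for illustration; they are not the class data set, and `linear_fit` is a hypothetical helper name.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic, noise-free "cases" generated from the reported fit (an assumption
# made for illustration; real counts would scatter around the line).
days = list(range(35, 60))
cases = [0.0027 * math.exp(0.2272 * t) for t in days]

# Fit a straight line to ln(y) versus t.
log_cases = [math.log(y) for y in cases]
r_fit, intercept = linear_fit(days, log_cases)

# The slope is the rate; exponentiating the intercept gives the prefactor.
y0_fit = math.exp(intercept)

print(r_fit, y0_fit)
```

With real data the recovered slope and prefactor would only approximate the underlying values, and the flat early days would have to be excluded before taking logarithms of the trend.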

True enough, when the y-axis is logarithmic, an exponential trend appears as a straight line. This is presented in Figure 3. 

Figure 3. The same trend with the y-axis plotted logarithmically. The linear-looking trend starts from day 35.


From Figure 3, it is easier to see why we did not include the earlier days and started the trend at day 35: the linear trend starts there, and prior to that the number of cases is flat at 3. Let us now look back at the obtained exponential fit, this time with a logarithmic scale for the y-axis, in Figure 4: 

Figure 4. Exponential fit for the data appears linear in a logarithmic y-axis.


This time, the exponential trend looks like a line when the vertical axis is logarithmically scaled. 

There are still questions, though. For example, you may think that there are actually two linear trends in Figure 4: a steeper one from day 35 to day 45 (incidentally, day 45 corresponds to March 15, around the time of the lockdown), and a less steep one afterwards. Can this mean that the exponential trend was really fast at the start, and the lockdown helped slow it down? Or, maybe we can ask: did the number of infections just catch up with the limit on the number of people that can be tested, thereby giving us an artificially slower trend when there are, in fact, more people infected? These are good research questions, and we need more data to verify which of these explanations is more accurate. 
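
One way to explore the "two trends" question is to fit a separate straight line to the logarithm of the data on each side of day 45 and compare the two slopes. The sketch below shows the idea on synthetic data only: the rates `R_EARLY` and `R_LATE` and the starting count are hypothetical values chosen for illustration, not estimates from the class data, and a difference in slopes by itself cannot tell the two explanations apart.

```python
import math

def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

START_DAY, BREAK_DAY = 35, 45
R_EARLY, R_LATE = 0.30, 0.15  # hypothetical rates per day, for illustration only
START_COUNT = 3.0             # illustrative starting count

def synthetic_cases(t):
    """Synthetic curve: growth at R_EARLY up to BREAK_DAY, then at R_LATE."""
    if t <= BREAK_DAY:
        return START_COUNT * math.exp(R_EARLY * (t - START_DAY))
    at_break = START_COUNT * math.exp(R_EARLY * (BREAK_DAY - START_DAY))
    return at_break * math.exp(R_LATE * (t - BREAK_DAY))

days = list(range(START_DAY, 61))
log_cases = [math.log(synthetic_cases(t)) for t in days]

# Split the log-data at the break day and fit each segment separately.
early = [(t, v) for t, v in zip(days, log_cases) if t <= BREAK_DAY]
late = [(t, v) for t, v in zip(days, log_cases) if t >= BREAK_DAY]

slope_early = fit_slope([t for t, _ in early], [v for _, v in early])
slope_late = fit_slope([t for t, _ in late], [v for _, v in late])

print(slope_early, slope_late)
```

On real data, the choice of break day and the scatter in the counts would both affect the two slopes, so the comparison should be read with care.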

In the meantime, I hope that this exercise helped us appreciate the exponential trend, in the context of the crisis we are facing right now. 


Next meeting, we will explain the "flattening the curve" statement. 




