World country dataset. From the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) dataset, which is updated daily in DATA1. The names of the latest time series files (since 22/3) are:

- time_series_covid19_confirmed_global.csv for cumulative confirmed cases.
- time_series_covid19_deaths_global.csv for cumulative deaths.
- time_series_covid19_recovered_global.csv for cumulative recovered cases.
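As an illustration of how one of these files could be read, the sketch below parses a tiny made-up sample in the wide layout of the global time series (four identifying columns followed by one column per date) and sums provinces into country totals. The sample values, the function name `cumulative_by_country` and the aggregation by country are assumptions made for the example, not a description of the actual pipeline.

```python
import csv
import io

# Made-up sample in the wide layout of the JHU CSSE global time series:
# Province/State, Country/Region, Lat, Long, then one column per date (m/d/yy).
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Spain,40.0,-4.0,0,1,3
Hubei,China,30.9,112.2,444,444,549
Beijing,China,40.1,116.4,14,22,36
"""

def cumulative_by_country(text):
    """Sum the rows of each country; return (dates, {country: cumulative counts})."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    dates = header[4:]
    totals = {}
    for row in reader:
        country = row[1]
        counts = [int(v) for v in row[4:]]
        previous = totals.get(country, [0] * len(dates))
        totals[country] = [a + b for a, b in zip(previous, counts)]
    return dates, totals

dates, totals = cumulative_by_country(SAMPLE)
print(dates)
print(totals["China"])
```

Replacing `SAMPLE` with the contents of a downloaded file should work in the same way, provided the column layout has not changed.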

Spanish region dataset. Confirmed, hospitalised, intensive care unit (ICU), death and recovered cases by Autonomous Community of Spain, available at Situation of COVID-19 in Spain from Instituto de Salud Carlos III. Data are updated daily in DATA2. The structure of this file is not stable over time. The current variables are: CCAA, FECHA, CASOS, PCR+, TestAc+, Hospitalizados, UCI, Fallecidos, Recuperados. Please read the notes at the end of the CSV.

Italian region dataset. Confirmed, hospitalised, intensive care unit (ICU), death and recovered cases by region of Italy, available at COVID-19 Italia - Monitoraggio situazione from Presidenza del Consiglio dei Ministri - Dipartimento della Protezione Civile. Data are updated daily in DATA3.

Catalonia region dataset. These data come from the RSAcovid19 registry of the Health Department and give the accumulated positive cases, that is, people who tested positive on some diagnostic test (PCR or rapid test). They also include the accumulated suspected cases: people who presented symptoms at some point and were classified by a health professional as a possible case, but who do not have a diagnostic test (PCR or rapid test) with a positive result. All cases were recorded by the surveillance service and assigned to the area of residence indicated on each person's health card. The information is updated daily as open data at Dades obertes de Catalunya.

- Cumulative cases at day \(t\): \(x_t^{(j)}\), with \(j\in \{1,\ldots,5\}\) indexing confirmed, deaths, hospitalized, ICU and recovered cases, respectively.
- New cases at day \(t\): \(x_t^{(j)} - x_{t-1}^{(j)}\)
- Growth rate of cases at horizon \(k\) (H\(_k\)): \(r_{k}^{(j)}(t)=\frac{x_{t+k}^{(j)} - x_{t}^{(j)}}{x_{t}^{(j)} + 1}\) for \(t=\ldots,t_0-1\) and \(k=1,\ldots,5\).
- Active cases at day \(t\): \(a_t = x_{t}^{(1)} - x_{t}^{(2)} - x_{t}^{(5)}\).
- Hospitalised and ICU cases are only available for regions of Spain.

Note: the number of new active cases can be negative on days when new recoveries plus new deaths exceed new confirmed cases.
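The definitions above translate directly into a few lines of code. The following is a minimal sketch on made-up cumulative counts. The function names are chosen for the example, and the last day is built so that the new active cases come out negative.

```python
def new_cases(x):
    """Daily increments x_t - x_{t-1} of a cumulative series."""
    return [b - a for a, b in zip(x, x[1:])]

def growth_rate(x, k=1):
    """r_k(t) = (x_{t+k} - x_t) / (x_t + 1) for every t with t + k in range."""
    return [(x[t + k] - x[t]) / (x[t] + 1) for t in range(len(x) - k)]

def active_cases(confirmed, deaths, recovered):
    """a_t = x_t^(1) - x_t^(2) - x_t^(5)."""
    return [c - d - r for c, d, r in zip(confirmed, deaths, recovered)]

# Made-up cumulative counts for four consecutive days.
confirmed = [99, 119, 149, 159]
deaths = [2, 3, 5, 8]
recovered = [10, 15, 25, 45]

active = active_cases(confirmed, deaths, recovered)
print(new_cases(confirmed))       # new confirmed cases per day
print(growth_rate(confirmed, 1))  # growth rate at horizon 1
print(active)                     # active cases per day
print(new_cases(active))          # last value is negative
```

On the last day there are 10 new confirmed cases but 3 new deaths and 20 new recoveries, so the active count falls by 13.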

Related to the idea of “flattening the curve”, we consider the curve \(r_{1}^{(j)}(t)\), which captures how the growth rate changes over time. We also smooth this signal to reduce the effect of sudden changes in reporting (such as the weekend effect).
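The text does not say which smoother is used. Purely as an illustration, the sketch below applies a centered 7-day moving average, a common way to damp a weekly reporting pattern. Both the window length and the function name `moving_average` are assumptions.

```python
def moving_average(values, window=7):
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed

# Made-up daily growth rates with a dip every 7th day (a "weekend effect").
raw = [0.10, 0.10, 0.10, 0.10, 0.10, 0.10, 0.03] * 2
print(moving_average(raw, window=7))
```

Away from the ends, every full 7-day window contains exactly one dip, so the smoothed values there are constant.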

Objective: predict the growth rate at horizon \(k\) using the last 15 days of the growth rate H\(_1\):

\[R_{1}(0)=\{r_1^{(j)}(-14),\ldots,r_1^{(j)}(0)\}\]

Filtering:

- Data from some regions are excluded because of inconsistencies in the records: “Diamond Princess”, “Iran”, “Japan”, “Bahrain” and “Qatar”.
- For the \(r_{k}^{(1)}(t)\) response (confirmed cases), we use the countries or regions with more than 200 confirmed cases at time \(t\).
- For the \(r_{k}^{(2)}(t)\) response (deaths), we use the countries or regions with more than 30 deaths at time \(t\).
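A possible way to combine the filtering rules with the 15-day predictor window is sketched below for the confirmed-cases response. The names `predictor_window`, `EXCLUDED`, `WINDOW` and `MIN_CONFIRMED` are hypothetical, and the indexing convention (day `t` is a position in the cumulative series) is an assumption.

```python
EXCLUDED = {"Diamond Princess", "Iran", "Japan", "Bahrain", "Qatar"}
WINDOW = 15          # days of H_1 growth rate used as predictor
MIN_CONFIRMED = 200  # threshold for the confirmed-cases response

def growth_rate_h1(x):
    """r_1(t) = (x_{t+1} - x_t) / (x_t + 1)."""
    return [(x[t + 1] - x[t]) / (x[t] + 1) for t in range(len(x) - 1)]

def predictor_window(name, cumulative, t, threshold=MIN_CONFIRMED):
    """Return [r_1(t-14), ..., r_1(t)], or None if the region is filtered out."""
    if name in EXCLUDED or cumulative[t] <= threshold:
        return None
    rates = growth_rate_h1(cumulative)
    # r_1(t) needs x[t + 1], so the window ending at t uses counts up to day t + 1.
    if t - (WINDOW - 1) < 0 or t >= len(rates):
        return None
    return rates[t - (WINDOW - 1): t + 1]

# Made-up cumulative confirmed counts for 20 days.
series = [250 + 10 * i for i in range(20)]
window = predictor_window("Spain", series, 14)
print(len(window))
print(predictor_window("Iran", series, 14))
```

For the deaths response the same function could be called with `threshold=30` on the cumulative deaths series.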

Fit the model. Three functional regression models of the general form \(r_{k}^{(j)}(0) = f(R_{1}(0)) + \epsilon\) are constructed, which differ in the form of \(f\):

- FLM: \(f\) is a linear function, \(f(R_{1}(0))= \int{R_{1}(t)\beta(t)dt}\).
- FNP: \(f\) is a nonparametric kernel estimate.
- SAM: \(f\) is an additive combination of smooth functions of the main functional principal components.
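To give an idea of what a nonparametric kernel estimate of \(f\) can look like, the sketch below implements a Nadaraya-Watson estimator over discretised curves. The Gaussian kernel, the discretised L2 distance, the bandwidth value and all function names are assumptions made for the example. The text does not specify these choices, and this is not presented as the authors' implementation.

```python
import math

def l2_distance(u, v):
    """Discretised L2 distance between two curves sampled on the same grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def kernel_predict(curves, responses, new_curve, bandwidth):
    """Nadaraya-Watson estimate: kernel-weighted mean of the training responses."""
    weights = [
        math.exp(-0.5 * (l2_distance(c, new_curve) / bandwidth) ** 2)
        for c in curves
    ]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, responses)) / total

# Two made-up 15-day growth-rate curves with their observed responses.
curves = [[0.1] * 15, [0.3] * 15]
responses = [0.5, 1.5]

# A curve halfway between both training curves gets (almost) equal weights.
print(kernel_predict(curves, responses, [0.2] * 15, bandwidth=1.0))
# With a small bandwidth the nearest training curve dominates.
print(kernel_predict(curves, responses, [0.1] * 15, bandwidth=0.05))
```

Training curves close to the new curve receive large weights, so the prediction is a local average of the responses. The bandwidth controls how local that average is.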

Predictions:

- Re-estimate the functional models (Step 2) when new data are available (all countries and regions of DATA1, DATA2 and DATA3).
- Reconstruct the expected number of accumulated cases and deduce the new cases for each horizon (confirmed, deaths and active).
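The reconstruction follows from inverting the growth-rate definition: \(\hat{x}_{t+k} = x_t + \hat{r}_k(t)\,(x_t + 1)\), and the new cases at horizon \(k\) are the difference between consecutive reconstructed totals. A minimal sketch with made-up predicted rates follows. The function name `reconstruct` is hypothetical.

```python
def reconstruct(x_t, predicted_rates):
    """Invert r_k = (x_{t+k} - x_t) / (x_t + 1) for k = 1, ..., len(predicted_rates).

    Returns the expected cumulative cases per horizon and the implied new cases.
    """
    cumulative = [x_t + r * (x_t + 1) for r in predicted_rates]
    previous = [x_t] + cumulative[:-1]
    new = [c - p for c, p in zip(cumulative, previous)]
    return cumulative, new

# Last observed cumulative count and made-up predicted growth rates for k = 1, 2, 3.
cumulative, new = reconstruct(999, [0.05, 0.10, 0.15])
print(cumulative)
print(new)
```

Because each horizon is predicted separately, the implied new cases are not guaranteed to be non-negative. A predicted rate at horizon \(k\) that is lower than the one at \(k-1\) gives a negative difference.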

This work has been supported by Project MTM2016-76969-P from Ministerio de Economía y Competitividad - Agencia Estatal de Investigación and European Regional Development Fund (ERDF) and IAP network StUDyS from Belgian Science Policy.

Thanks to Diego Campanario for creating the Shiny server.

The file obtained from Instituto de Salud Carlos III (ISCIII) has undergone changes over time in the units of the variables. Typically, the historical data are not reconstructed.

- Apr, 4th, 2020. Hospitalized - Extremadura. Adjustment (-36)
- Apr, 8th, 2020. ICU - C. Valenciana. Cumulative instead of prevalence.
- Apr, 11th, 2020. Hospitalized - Castilla La Mancha. Cumulative instead of prevalence.
- Apr, 12th, 2020. ICU - Castilla La Mancha. Cumulative instead of prevalence.
- Apr, 16th, 2020. ICU - Castilla y León. Cumulative instead of prevalence.
- Apr, 16th, 2020. ICU - Aragón. Adjustment (-51)
- Apr, 17-18th, 2020. Recovered - Galicia. Missing data.
- Apr, 23rd, 2020. ICU - Extremadura. Adjustment.
- Apr, 26th, 2020. ICU and Hospitalized - Madrid. Cumulative instead of prevalence.
- Apr, 28th, 2020. ICU - Galicia. Cumulative instead of prevalence (+235)
- Apr, 28th, 2020. Recovered - Galicia. Increased the number of recovered people at home (+3552)
- Apr, 28th, 2020. Hospitalized - Galicia. Adjustment (-22)
- Apr, 29th, 2020. Confirmed - Galicia. Adjustment (-769)
- May, 21st, 2020. Due to the new surveillance and control strategy, there is a change in the reporting by the Spanish regions (CCAA). The predictions for Spain and its regions will be updated when this information becomes available again. Source: ISCIII.