The Coronavirus Story Through Numbers in Jakarta

Andriyan Saputra
6 min read · Jun 13, 2020

During the Covid-19 pandemic, I tried to analyze the spread of Covid-19 in Jakarta. I used Python modules to gather the data, ArcGIS Map for the geospatial analysis (Esri's Hot Spot Analysis and Geographically Weighted Regression tools), and Power BI for data visualization.

POWER BI: Map of The Spread of Covid-19

Source: https://corona.jakarta.go.id/id/peta-persebaran

For the analysis, I looked at different pairs of variables and measured the relationship between them:

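  1. Relationship between the number of positive cases and the population density of each region.
  2. Relationship between the number of PDP (Pasien Dalam Pengawasan, patients under surveillance) and the number of positive cases.
  3. Relationship between the number of self-isolation patients and the population density of each region.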

For the spatial analysis, I used Hot Spot Analysis to highlight where high and low Covid-19 values cluster on the map, and Geographically Weighted Regression (GWR) to measure the relationship between different variables.

Hot Spot Analysis

The Hot Spot Analysis tool calculates the Getis-Ord Gi* statistic (pronounced G-i-star) for each feature in a dataset. The resultant z-scores and p-values tell you where features with either high or low values cluster spatially. This tool works by looking at each feature within the context of neighboring features. A feature with a high value is interesting but may not be a statistically significant hot spot. To be a statistically significant hot spot, a feature will have a high value and be surrounded by other features with high values as well. The local sum for a feature and its neighbors is compared proportionally to the sum of all features; when the local sum is very different from the expected local sum, and when that difference is too large to be the result of random chance, a statistically significant z-score results.
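To make the statistic more concrete, here is a minimal Python sketch of the Gi* z-score. It assumes a simple binary fixed-distance neighborhood in which every feature also counts as its own neighbor. The function name `getis_ord_gi_star`, its parameters, and the toy grid are made up for illustration; this is not the ArcGIS implementation, which offers more ways to define neighbors.

```python
import numpy as np


def getis_ord_gi_star(values, coords, distance_band):
    """Return the Getis-Ord Gi* z-score of every feature.

    Assumption: binary weights, w_ij = 1 when feature j lies within
    `distance_band` of feature i (including i itself), otherwise 0.
    The sketch does not guard against a neighborhood that covers all
    features or against identical values (both give a zero denominator).
    """
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    n = x.size

    # Pairwise distances between all features.
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    w = (dist <= distance_band).astype(float)

    mean = x.mean()
    s = np.sqrt((x ** 2).mean() - mean ** 2)  # global standard deviation

    w_sum = w.sum(axis=1)
    w_sq_sum = (w ** 2).sum(axis=1)

    # Local sum compared with the local sum expected under randomness.
    numerator = w @ x - mean * w_sum
    denominator = s * np.sqrt((n * w_sq_sum - w_sum ** 2) / (n - 1))
    return numerator / denominator


# Toy data (not real Jakarta figures): a 5 x 5 grid of regions with
# high case counts in one corner.
grid = [(i, j) for i in range(5) for j in range(5)]
cases = [10.0 if (i < 2 and j < 2) else 1.0 for i, j in grid]

z = getis_ord_gi_star(cases, grid, distance_band=1.0)
print(z.reshape(5, 5).round(2))
```

In this toy grid the corner with high counts gets a large positive z-score (a hot spot), because a high value there is surrounded by other high values.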

Geographically Weighted Regression (GWR)

Geographically Weighted Regression (GWR) is one of several spatial regression techniques used in geography and other disciplines. GWR evaluates a local model of the variable or process you are trying to understand or predict by fitting a regression equation to every feature in the data set. GWR constructs these separate equations by incorporating the dependent and explanatory variables of the features falling within the neighborhood of each target feature. The shape and extent of each neighborhood analyzed is based on the Neighborhood Type and Neighborhood Selection Method parameters. GWR should be applied to datasets with several hundred features. It is not an appropriate method for small datasets and does not work with multi point data.
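The idea of fitting one regression per feature can be sketched in a few lines of Python. The sketch below assumes a Gaussian kernel with a fixed bandwidth, which is only one possible neighborhood definition. The function name `gwr_coefficients`, its parameters, and the synthetic data are assumptions for illustration; the ArcGIS tool also selects the neighborhood and reports diagnostics, none of which is reproduced here.

```python
import numpy as np


def gwr_coefficients(coords, x, y, bandwidth):
    """Fit a weighted least-squares regression at every feature.

    Returns an array with one row per feature: [intercept, slope(s)].
    Assumption: Gaussian distance-decay weights with a fixed bandwidth.
    """
    xy = np.asarray(coords, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(y)), np.asarray(x, dtype=float)])

    betas = np.empty((len(y), X.shape[1]))
    for i in range(len(y)):
        # Nearby features get more weight in the local equation.
        dist = np.linalg.norm(xy - xy[i], axis=1)
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)
        XtW = X.T * w  # same as X.T @ diag(w)
        betas[i] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas


# Synthetic data (not real Jakarta figures): the effect of density on
# cases grows from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(50, 2))
density = rng.uniform(0, 5, size=50)
cases = 2.0 + (1.0 + 0.2 * coords[:, 0]) * density

betas = gwr_coefficients(coords, density, cases, bandwidth=3.0)
print(betas[:5].round(2))  # local intercept and slope of the first regions
```

Mapping the slope column of `betas` by region is what produces a coefficient map like the ones shown later in this article.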

Relationship Between Number of Positive Cases and Population Density

POWER BI: Map of The Spread of Covid-19 (click the POWER BI title for a better view)

Hot Spot Analysis

On the map, high Hot Spot Analysis (HA) values are shown in red and low values in blue. High numbers of positive cases are found in several different places, whereas high population density is concentrated in the central part of Jakarta.

Geographically Weighted Regression (GWR) Analysis

The relationship between the two variables can be shown with the Geographically Weighted Regression (GWR) method.

Relationship between positive cases and population density in March 2020 (top image) and June 2020 (bottom image)

Low coefficients are shown in blue and high coefficients in red. The images above show how positive Covid-19 cases developed from March to June 2020. The spread varied from region to region: it began with a few cases at various locations, and over time the growth in positive cases concentrated in only a few regions.

As of June 2020, the table on the left lists the regions with the highest numbers of positive patients. A high coefficient (red) indicates that high population density is associated with an increase in the number of Covid-19 positive patients.

In several regions, the standardized residuals of the model for positive cases and population density are close to zero (0). This means the model's predictions there are close to the actual data. However, several regions still have high standardized residual values.

Regions with high or low standardized residuals tend to lie close to each other and form several clusters. This suggests that neighboring regions share a similar relationship between positive cases and population density.
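As a rough illustration of what a standardized residual measures, the Python sketch below divides each residual by the standard deviation of all residuals. This is a simplification and not necessarily the exact formula the ArcGIS tool applies; the function name `standardized_residuals` and the numbers are made up for the example.

```python
import numpy as np


def standardized_residuals(observed, predicted):
    """Scale residuals so that they are comparable across regions.

    Assumption: every residual is divided by the sample standard
    deviation of all residuals (a simplification for illustration).
    """
    resid = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return resid / resid.std(ddof=1)


# Made-up case counts for four regions (not real Jakarta figures).
observed = [10, 12, 9, 30]
predicted = [11, 11, 10, 20]

std_resid = standardized_residuals(observed, predicted)
print(std_resid.round(2))
```

Values near zero mean the prediction is close to the observed count; the region with the largest absolute value is the one the model fits worst.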

Relationship Between Number of PDP and Positive Case Number

POWER BI: Map of The Spread of Covid-19 (click the POWER BI title for a better view)

According to the Hot Spot Analysis maps, there is a correlation between the number of PDP (left image) and the number of positive cases (right image): both show high and low values in the same areas.

Relationship between PDP and positive cases in March 2020 (top image) and June 2020 (bottom image)

According to the map above, in most regions an increase in positive cases is accompanied by an increase in PDP. However, this does not hold for every region: in some places we even find the opposite relationship, where positive cases increase while PDP decreases.

The standardized residuals are spread throughout the region with varying values. Compared with the previous spatial analysis (positive cases and population density), the values from the PDP and positive case regression model do not cluster: if a region has a high value (red), its neighboring regions do not necessarily show the same condition.

Relationship Between Number of Self-isolation Patients and Population Density

POWER BI: Map of The Spread of Covid-19 (click the POWER BI title for a better view)

According to the Hot Spot Analysis maps, there is a correlation between the number of self-isolation patients (left image) and population density (right image).

Relationship between self-isolation patients and population density in March 2020 (top image) and June 2020 (bottom image)

According to the images above, high coefficients were initially spread across several regions and ended up in only a few particular regions. This suggests that population density contributes to the increase in self-isolation patients in some regions.

The standardized residuals of the self-isolation and population density model are close to zero (0) in most regions, which means the model describes the actual data well. The standardized residual values are not uniform across the region, which suggests that population density contributes to the increase in self-isolation patients in only a few regions.

Since the analysis is based only on statistics, it would be better to support these assumptions with qualitative information such as news reports and medical records.

Conclusion

  1. During the Covid-19 pandemic, cases in Jakarta first spread across the region and then concentrated into a few clusters with high numbers of positive cases. The highest case counts are in the North Jakarta and Central Jakarta regions. The local government should pay more attention to these high-case clusters, for example by increasing the frequency of rapid tests and tightening lockdown policy down to smaller areas.
  2. Spatial analytics methods (Hot Spot Analysis and Geographically Weighted Regression (GWR)) give a better understanding of the spread of Covid-19 over time.

Recommendation

  1. The results are based only on statistical case counts. The judgment would be stronger if we added qualitative conditions that support or contradict the relationships found in the results.
  2. The results should be followed up with news or cases reported in the media that could support or contradict the statistical results.

Closing Remark

This publication is produced for educational and informational purposes only. If there are any mistakes in the data, judgment, or methodology that I used to produce this publication, please contact the writer using the contact information on the profile. I would like to discuss and share more about the topic. Thank you.

Best Regards,

Andriyan Saputra

Just an ordinary person who is curious about the world.