Abstract

Panel data consist of many observations, each measured periodically over a period of time. Clustering observations on the basis of such data is known as time series clustering. Many methods have been developed for time series data with fluctuating conditions. However, missing data causes problems in this analysis. Missing data refers to a value that is unavailable for an observation because no information about it was recorded. This study proposes an alternative method for clustering observations whose time series contain missing data: correlation matrices are converted into Euclidean distance matrices, to which a hierarchical clustering method is then applied. A simulation was carried out to compare the performance of the alternative method with that of the commonly used method on data with 0%, 10%, 20%, and 40% missing values. The clustering accuracy of the proposed alternative method was always better than that of the commonly used method. Furthermore, the method was applied to the annual Gini ratio data of each province in Indonesia from 2007 to 2017, which contained missing data for North Kalimantan Province. The analysis yielded two clusters of provinces with different characteristics.
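
The pipeline described here (correlation matrix, conversion to a distance matrix, hierarchical clustering) can be illustrated with a minimal Python sketch. Several details are assumptions made for illustration and are not stated in this abstract: correlations are computed from pairwise-complete observations, the conversion uses the common formula d = sqrt(2(1 - r)), average linkage is used, and the function name and toy data are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_series_with_missing(panel, n_clusters=2, method="average"):
    """Cluster the columns of `panel` (rows = time points, columns = series).

    Hypothetical helper: the correlation-to-distance formula and the
    linkage method are assumptions, not taken from the paper.
    """
    # Pairwise-complete Pearson correlation: for each pair of columns,
    # pandas uses only the time points where both values are present.
    corr = panel.corr(method="pearson", min_periods=3)

    # Assumed conversion of correlation r to a distance: d = sqrt(2 * (1 - r)).
    dist = np.sqrt(np.clip(2.0 * (1.0 - corr.to_numpy()), 0.0, None))
    np.fill_diagonal(dist, 0.0)

    # SciPy expects the condensed (upper-triangle) form of the distance matrix.
    # linkage raises an error if any pair had too few shared time points (NaN).
    tree = linkage(squareform(dist, checks=False), method=method)
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return pd.Series(labels, index=panel.columns)


# Toy data (made up): two rising and two falling series over 11 periods.
rng = np.random.default_rng(0)
t = np.arange(11, dtype=float)
panel = pd.DataFrame({
    "A": t + rng.normal(0, 0.3, 11),
    "B": t + rng.normal(0, 0.3, 11),
    "C": -t + rng.normal(0, 0.3, 11),
    "D": -t + rng.normal(0, 0.3, 11),
})
panel.loc[0:4, "D"] = np.nan  # series D is missing its first five periods

labels = cluster_series_with_missing(panel, n_clusters=2)
print(labels)
```

In this toy setting, series D can still be compared with the others because each correlation uses only the periods that both series share, so no imputation or row deletion is needed before clustering.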