Cluster Analysis: Theory, Methodology, and Applications

Authors

  • Sanja Nikolic, Ph.D, Tanja Sekulic, Ph.D, Branko Medic, D.Sc

DOI:

https://doi.org/10.70135/seejph.vi.4515

Abstract

Cluster analysis is a statistical technique for grouping objects into sets, called clusters, based on their similarity. Clustering is one of the fundamental tasks in data analysis and is widely applied in fields such as market research, biomedical studies, biotechnology, pattern recognition, and big data analysis, as well as in other areas where the structure of the data must be identified. The primary goal is to partition the data so that objects within each cluster are as similar as possible, while objects in different clusters are as dissimilar as possible. In other words, cluster analysis aims to maximize the internal homogeneity and the external heterogeneity of clusters. Academics and market researchers often encounter situations best addressed by defining groups of homogeneous objects, whether these are individuals, companies, products, or even behaviors. Strategic decisions based on identifying groups within a population, such as segmentation and targeted marketing, would not be possible without an objective methodology. The same need arises in other areas, from the physical to the social sciences. In all cases, researchers seek the natural structure among observations based on multiple profiles, and the most commonly used technique for this purpose is cluster analysis.
An important feature of cluster analysis is that it is not a method of strict statistical inference, in which the selected sample is necessarily assumed to be representative of a given population. Rather, it determines the structural characteristics of measured properties on a strictly mathematical, not statistical, basis. Therefore, for the results of cluster analysis to be meaningful, assumptions concerning the representativeness of the sample and the multicollinearity of the variables must be established. In cluster analysis, neither the group membership of the objects nor the final number of groups is known in advance. The goal is to identify homogeneous groups, or clusters.
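The idea of internal homogeneity and external heterogeneity can be made concrete with a small numerical sketch. The Python example below is illustrative only and is not taken from the article: the toy points, the two candidate groupings, and the helper names (`centroid`, `sq_dist`, `homogeneity_report`) are all assumptions. It scores a given grouping by its within-cluster sum of squares (lower suggests more homogeneous clusters) and its between-cluster sum of squares (higher suggests better separated clusters), which is one common way, among several, to quantify these notions.

```python
from statistics import fmean


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    return tuple(fmean(coord) for coord in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def homogeneity_report(points, labels):
    """Return (within_ss, between_ss) for a given partition.

    within_ss:  sum of squared distances from each point to the centroid
                of its own cluster (small = high internal homogeneity).
    between_ss: size-weighted squared distances from each cluster centroid
                to the overall centroid (large = high external heterogeneity).
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    overall = centroid(points)
    within = 0.0
    between = 0.0
    for members in clusters.values():
        c = centroid(members)
        within += sum(sq_dist(p, c) for p in members)
        between += len(members) * sq_dist(c, overall)
    return within, between


# Hypothetical toy data: two visually obvious groups in the plane.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

# A grouping that follows the visible structure, and one that ignores it.
good = homogeneity_report(points, [0, 0, 0, 1, 1, 1])
poor = homogeneity_report(points, [0, 1, 0, 1, 0, 1])

print("good grouping (within, between):", good)
print("poor grouping (within, between):", poor)
```

For a fixed data set the two quantities always add up to the same total sum of squares, so lowering the within-cluster value necessarily raises the between-cluster value. This is why many clustering procedures only need to optimize one of the two.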

Published

2025-02-12

How to Cite

Sanja Nikolic, Ph.D, Tanja Sekulic, Ph.D, Branko Medic, D.Sc. (2025). Cluster Analysis: Theory, Methodology, and Applications. South Eastern European Journal of Public Health, 673–681. https://doi.org/10.70135/seejph.vi.4515

Issue

Section

Articles