Empirical Study of Performance of Classification and Clustering Algorithms on Binary Data with Real-World Applications

Abstract

This thesis compares statistical algorithms, paired with dissimilarity measures, on their ability to identify clusters in benchmark binary datasets. Three families of techniques are examined: visualization, classification, and clustering. Parallel coordinates plots and heatmaps were used to explore the data visually for clusters. The classification algorithms were neural networks and classification trees. The clustering algorithms were partitioning around centroids, partitioning around medoids, hierarchical agglomerative clustering, and hierarchical divisive clustering. The clustering algorithms were evaluated on their ability to identify the optimal number of clusters. The "goodness" of the resulting clustering structures was assessed, and the clustering results were compared with the known classes in the data using purity and entropy measures. Experimental design was employed to test whether the algorithms and/or dissimilarity measures had a statistically significant effect on the optimal number of clusters chosen by our methods, and whether the algorithms and dissimilarity measures performed differently from one another.
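
To illustrate the purity and entropy measures mentioned above, here is a minimal Python sketch. It is not taken from the thesis. The function names `purity` and `clustering_entropy` are hypothetical, and the entropy variant shown (a size-weighted average of per-cluster Shannon entropy, base 2) is one common definition that is assumed here, since the abstract does not state which form was used.

```python
from collections import Counter
from math import log2


def _class_counts_per_cluster(clusters, classes):
    """Map each cluster label to a Counter of the known classes inside it."""
    by_cluster = {}
    for cluster, known_class in zip(clusters, classes):
        by_cluster.setdefault(cluster, Counter())[known_class] += 1
    return by_cluster


def purity(clusters, classes):
    """Fraction of points that belong to the majority class of their cluster."""
    by_cluster = _class_counts_per_cluster(clusters, classes)
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(classes)


def clustering_entropy(clusters, classes):
    """Size-weighted average of per-cluster Shannon entropy (base 2).

    Assumed definition: 0 means every cluster contains a single class.
    """
    n = len(classes)
    total = 0.0
    for counts in _class_counts_per_cluster(clusters, classes).values():
        size = sum(counts.values())
        h = -sum((m / size) * log2(m / size) for m in counts.values())
        total += (size / n) * h
    return total


# Made-up toy labels, for illustration only (not data from the thesis).
toy_clusters = [0, 0, 0, 1, 1, 1]
toy_classes = ["a", "a", "b", "b", "b", "b"]
print(purity(toy_clusters, toy_classes))
print(clustering_entropy(toy_clusters, toy_classes))
```

Higher purity and lower entropy both indicate closer agreement between the clustering and the known classes. Both measures tend to favour solutions with many small clusters, so they are usually read alongside the number of clusters chosen.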

Rights Notes

Copyright © 2014 the author(s). Theses may be used for non-commercial research, educational, or related academic purposes only. Such uses include personal study, research, scholarship, and teaching. Theses may only be shared by linking to Carleton University Institutional Repository and no part may be used without proper attribution to the author. No part may be used for commercial purposes directly or indirectly via a for-profit platform; no adaptation or derivative works are permitted without consent from the copyright owner.

Title	Date Uploaded	Visibility
nahmias-empiricalstudyofperformanceofclassification.pdf	2023-05-04	Public