Deep Generative Models for Unsupervised Scale-Based and Position-Based Disentanglement of Concepts from Face Images.


Creator: 

Abdolahnejad Bahramabadi, Mahla

Date: 

2022

Abstract: 

Among the different categories of natural images, face images are especially important because of the role they play in human social interaction. Despite recent advances in artificial intelligence using deep neural networks, computers still struggle to achieve a rich and flexible understanding of face images comparable to human face perception. This thesis aims to find fully unsupervised ways of learning a transformation from the pixel space of face images to a representation space in which the underlying facial concepts are captured and disentangled. We propose that clues from the real 3D world can be used to guide the representation learner toward disentangling facial concepts, and we conduct two studies to test this hypothesis. First, we propose a deep autoencoder model for extracting facial concepts based on their scales. We introduce an adaptive resolution reconstruction loss, inspired by the fact that different categories of concepts are encoded in (and can be captured from) different resolutions of face images. With this new reconstruction loss, the deep autoencoder receives a real face image and computes a representation vector that not only allows the input image to be reconstructed faithfully, but also separates the concepts associated with specific scales. Second, we introduce a new scheme that enables generative adversarial networks to learn a representation for face images composed of the representations of smaller facial components. This is inspired by the fact that all face images share the same underlying structure; a face image can therefore be divided into parts with fixed positions, each containing specific facial components only. Learning a separate distribution for each of these parts is equivalent to disentangling those components in the representation space.
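The adaptive resolution reconstruction loss described above can be illustrated with a minimal NumPy sketch: the input and its reconstruction are compared at several resolutions obtained by average pooling, and the per-resolution errors are combined with weights. The function names, pooling factors, and weights here are illustrative assumptions, not the exact formulation used in the thesis.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a square (H, W) image by an integer factor."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def adaptive_resolution_loss(x, x_hat, factors=(1, 2, 4), weights=(1.0, 1.0, 1.0)):
    """Weighted sum of MSE reconstruction errors computed at several resolutions.

    Coarse resolutions emphasize large-scale concepts (e.g., overall shape),
    while the full resolution captures fine-scale detail.
    """
    total = 0.0
    for f, w in zip(factors, weights):
        total += w * np.mean((downsample(x, f) - downsample(x_hat, f)) ** 2)
    return total
```

A practical implementation would operate on batched tensors inside a deep learning framework, but the principle is the same: one scalar loss aggregating reconstruction errors across scales.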
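The second study's premise, that an aligned face image can be divided into fixed-position parts each containing specific facial components, can be sketched as a simple deterministic partition. The grid layout below is a hypothetical example; the thesis's actual part layout may differ.

```python
import numpy as np

def split_face(img, rows=2, cols=2):
    """Split an aligned face image (H, W, C) into a fixed grid of parts.

    Because faces share the same underlying structure, each grid cell
    consistently contains the same facial components across images,
    so a generative model can learn a separate distribution per part.
    """
    h, w = img.shape[0] // rows, img.shape[1] // cols
    return [img[r * h:(r + 1) * h, c * w:(c + 1) * w]
            for r in range(rows) for c in range(cols)]
```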

Subject: 

Artificial Intelligence

Language: 

English

Publisher: 

Carleton University

Thesis Degree Name: 

Doctor of Philosophy (Ph.D.)

Thesis Degree Level: 

Doctoral

Thesis Degree Discipline: 

Engineering, Electrical and Computer

Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).