VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY
DOAN THI HIEN
DEEP LEARNING-BASED APPROACH FOR
WATER CRYSTAL CLASSIFICATION
MASTER THESIS
Major: Computer Science
HA NOI - 2021
Abstract
Most of the Earth's surface is covered by water. As pointed out in the 2020
edition of the World Water Development Report, climate change challenges the
sustainability of water resources. Monitoring water quality is therefore important for
preserving sustainable water resources. Water quality can be related to the structure of
water crystals, the solid state of water, so methods for understanding water crystals can
help improve water quality. As a first step, an exploratory analysis of water crystals was
initiated in cooperation with the Emoto Peace Project (EPP). The 5K EPP dataset was
created as the first worldwide small dataset of water crystals. Our research focused on
reducing the inherent limitations of fitting machine learning models to the 5K EPP
dataset. One major result is the classification of water crystals and a way to split our
small dataset into the most closely related groups. Using the 5K EPP dataset, human
observations, and past research on snow crystal classification, we provide a simple set
of visual labels, with 12 categories, for naming water crystal shapes. A deep
learning-based method was used to perform the classification automatically on a subset
of the labeled dataset. The classification achieved high accuracy when fine-tuning a
pretrained ResNet model.
Keywords: Water crystal, Deep learning, Fine-tuning, Supervised, Classification.
Acknowledgements
I would first like to thank my thesis supervisor, Dr. Tran Quoc Long, Head of the
Department of Computer Science at the University of Engineering and Technology, for
his insightful comments on both my work and this thesis, for his support, and for many
motivating discussions.
I also want to acknowledge my co-supervisor, Dr. Frederic Andres of the National
Institute of Informatics (NII), Japan, for offering me internship opportunities at NII
and for guiding my work on diverse and exciting projects. Without his support and
experience, I could not have achieved today's results.
In addition, I have been very privileged to get to know and to work with many
other great collaborators.
Finally, I must express my very profound gratitude to my family for providing me
with unfailing support and continuous encouragement throughout my years of study and
through the process of researching and writing this thesis. This accomplishment would
not have been possible without them.
Declaration
I declare that this thesis has been composed by myself and that the work has not been
submitted for any other degree or professional qualification. I confirm that the work
submitted is my own, except where work which has formed part of jointly-authored
publications has been included. My contribution and those of the other authors to this
work have been explicitly indicated below. I confirm that appropriate credit has been
given within this thesis where reference has been made to the work of others.
This study was conceived by all of the authors. I carried out the main idea(s) and
implemented all the model(s) and material(s).
I certify that, to the best of my knowledge, my thesis does not infringe upon any-
one’s copyright nor violate any proprietary rights and that any ideas, techniques, quota-
tions, or any other material from the work of other people included in my thesis, pub-
lished or otherwise, are fully acknowledged in accordance with the standard referencing
practices. Furthermore, to the extent that I have included copyrighted material, I certify
that I have obtained written permission from the copyright owner(s) to include such
material(s) in my thesis and have full authorship to improve these materials.
Master student
Doan Thi Hien