
CHAPTER 1
LITERATURE REVIEW
1.1 Plant identification from images of a single organ
There are a large number of automatic plant identification methods. Among the
different organs of a plant, the leaf is the most widely used [4] because leaves are
usually present throughout the year. Identification results on leaf scans are often the
best when compared with other organs [5]. Another popular organ is the flower,
because its appearance (e.g., color, shape, texture) is highly distinctive [6]. In addition,
other organs such as the fruit, stem, and branch are also used to identify plants. There
are two main approaches to plant identification based on images of plant organs: the
first relies on hand-designed features, while the second employs deep learning methods.
The hand-designed feature-based approach consists of two main stages: training
and testing. Each stage consists of four main components: image acquisition, prepro-
cessing, feature extraction, and classification [7]. Feature extraction can be considered
the most important component of the system. Its purpose is to reduce the dimension-
ality of the data while providing a good representation of that data. Features include
global features (color, texture, shape) and local, organ-specific features. For example,
the leaf has organ-specific features such as vein structure, margin, and teeth, and its
shape plays the most important role [4]. Shape and color are important features for a
flower. Previous studies often combine two or more feature types for each organ
because no single feature is strong enough to separate all categories.
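To make this pipeline concrete, the following minimal sketch (not drawn from any of the cited works) extracts simple global color, shape, and texture descriptors from leaf images with OpenCV and trains an SVM classifier with scikit-learn; the variables train_images, train_labels, and query_image are hypothetical placeholders for a labeled leaf dataset.

    import cv2
    import numpy as np
    from sklearn.svm import SVC

    def extract_global_features(image_bgr):
        # Color: a coarse hue/saturation histogram, normalized and flattened.
        hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
        color_hist = cv2.calcHist([hsv], [0, 1], None, [8, 8], [0, 180, 0, 256])
        color_hist = cv2.normalize(color_hist, None).flatten()

        # Shape: area, perimeter, and circularity of the largest foreground contour.
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
        _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        largest = max(contours, key=cv2.contourArea)
        area = cv2.contourArea(largest)
        perimeter = cv2.arcLength(largest, True)
        circularity = 4.0 * np.pi * area / (perimeter ** 2 + 1e-6)

        # Texture: mean and standard deviation of gray levels (a very crude proxy).
        texture = [gray.mean(), gray.std()]

        return np.concatenate([color_hist, [area, perimeter, circularity], texture])

    # Training stage: feature extraction followed by classifier training.
    X_train = np.array([extract_global_features(img) for img in train_images])
    classifier = SVC(kernel="rbf").fit(X_train, train_labels)

    # Testing stage: the same features are extracted from a query image and classified.
    predicted_species = classifier.predict([extract_global_features(query_image)])

In practice, the descriptors above would be replaced or augmented with the organ-specific features mentioned earlier (e.g., vein structure or margin descriptors for leaves), but the overall structure of the two stages remains the same.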
The second approach employs deep learning methods. Recently, learning feature
representations with Convolutional Neural Networks (CNNs) has shown a number of
successes in different topics in the field of computer vision, such as object detection,
segmentation, and image classification [8]. A CNN learns features automatically: each
layer extracts features from the output of the previous layer. The first layers of the
network extract very simple features such as lines, curves, or blobs from the input
image. This information is then used as input for the next layers, which have the more
difficult task of extracting the components of the objects in the image. Finally, the
highest layers of the network perform the classification of the objects in the image.
Typical CNN architectures are AlexNet, VGG, GoogLeNet, and ResNet. Teams utiliz-
ing deep learning techniques have been the top winners of the LifeCLEF competition.
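As a brief illustration of how such networks are typically applied to plant identification, the sketch below fine-tunes a ResNet-50 pretrained on ImageNet using PyTorch and torchvision; the class count num_plant_species and the tensors images and labels are hypothetical placeholders, and the simplified training step shown is an assumed example rather than the procedure of any particular LifeCLEF team.

    import torch
    import torch.nn as nn
    from torchvision import models

    # Load a ResNet-50 pretrained on ImageNet and replace its final layer so that
    # it outputs scores for num_plant_species classes (a placeholder count).
    num_plant_species = 100
    model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    model.fc = nn.Linear(model.fc.in_features, num_plant_species)

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

    def train_step(images, labels):
        # One optimization step: the lower layers respond to simple patterns
        # (edges, blobs), deeper layers to object parts, and the final layer
        # produces the class scores.
        optimizer.zero_grad()
        logits = model(images)            # forward pass through the feature hierarchy
        loss = criterion(logits, labels)
        loss.backward()                   # backpropagate gradients through all layers
        optimizer.step()                  # update the network parameters
        return loss.item()

Transfer learning of this kind, where a network pretrained on a large generic dataset is adapted to a plant dataset, is a common way to exploit CNN feature hierarchies when the amount of labeled plant data is limited.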