MINISTRY OF EDUCATION AND TRAINING

VIETNAM ACADEMY OF SCIENCE AND TECHNOLOGY

GRADUATE UNIVERSITY OF SCIENCE AND TECHNOLOGY

LUONG THI HONG LAN

SOME EXTENSIONS OF THE COMPLEX FUZZY INFERENCE SYSTEM FOR DECISION SUPPORT PROBLEM

Major: Computer science

Code: 9 48 01 01

SUMMARY OF COMPUTER DOCTORAL THESIS

Ha Noi - 2021

The doctoral thesis was completed at the Graduate University of Science and Technology, Vietnam Academy of Science and Technology.

Supervisor 1: Assoc. Prof. Dr. Le Hoang Son

Supervisor 2: Assoc. Prof. Dr. Nguyen Long Giang

Reviewer 1:

Reviewer 2:

Reviewer 3:

This doctoral thesis will be defended before the Board of Examiners of the Graduate University of Science and Technology, Vietnam Academy of Science and Technology, at hour….., date….. month….. 2021.

This doctoral thesis can be consulted at:

- Library of the Graduate University of Science and Technology

- National Library of Vietnam


PREFACE

Fuzzy set (FS) [1] proposed by Zadeh in 1965 is considered as an effective tool to solve the problems

with uncertain properties. Various extensions and operations of FS have been presented in recent years [2-6].

One of the most important techniques in FS is Fuzzy Inference System (FIS), which is widely applied in many

decision-making and classification/prediction problems such as green supplier selection, personnel selection,

company strategy, etc. In these applications, FIS was used to generate a set of fuzzy rules to detect, predict or

classify objects such as lung cancer detection, detection of diabetes mellitus, heart disease prediction,

evaluation of green supply chain management performance, penetration index estimation in rock mass [7-13].

An extended version of FIS embedded with neural network and gradient-based learning is the Adaptive Neuro

Fuzzy Inference System (ANFIS) [14], which also demonstrated good performance in coronary artery disease

prognosis, estimating thermal conductivity enhancement of metal and metal oxide, flood prediction, etc. [15-

21].

Recently, with the growth of decision-making problems driven by time variation or phase changes, an extension of the fuzzy set, namely the Complex Fuzzy Set (CFS), whose membership functions include both an amplitude term and a phase term, has been proposed [37]. CFS has attracted considerable attention, with studies on new fuzzy aggregation operators, complex fuzzy soft information, distance measures, and complex fuzzy concept lattices [37]–[43]. The advantage of CFS is its capability to model phenomena and events through the phase term, which shows their overall progress within a given context.

For example, to determine whether the blood pressure of a patient is ‘High’ or ‘Low’, a sample of 30 measurements is recorded, and the mean and variance are computed. Using the fuzzification of FIS in the CFS, the blood pressure of the patient over the recorded period can be easily described as ‘Low’ amplitude with ‘Low’ phase (i.e. small mean and variance values). If the blood pressure is measured only at a single time stamp, a wrong decision may be made.

Another example is disease diagnosis: if the diagnosis is based only on the disease attribute values without considering other attributes, the result will be inaccurate, because the conclusion depends not only on each disease attribute value but also on factors related to that disease. Moreover, many scenarios involve a phase term, which is encountered in data with a periodic trend, such as rainfall recorded in a region or the sound waves produced by a musical instrument. It is therefore evident that complex numbers must be given a place in the literature on fuzzy inference systems as well. This is the main motivation of this thesis.

The ordinary fuzzy inference systems such as the Mamdani, Sugeno and Tsukamoto systems and

various versions of the ANFIS architectures are only able to handle phenomena that are not periodic or

seasonal. In order to handle time-series data in time-periodic phenomena, FISs and ANFISs employ two

general strategies: 1) ignore the information related to the phase term; 2) represent the amplitude and phase

terms separately using two fuzzy sets. This would cause loss of information and produce unreliable results (if

information related to the phase terms are ignored), distortion of information, and a reduction in computational

efficiency (if information related to the amplitude and phase are represented separately) as it becomes more

time-consuming due to the increased number of sets that need to be dealt with.

Complex fuzzy inference systems are considered to be an effective tool for solving problems with

periodicity and uncertainty. The first system introduced by Ramot [44] is called Complex Fuzzy Logic System,

which is developed from the usual fuzzy logic system but replaces the fuzzy set and the implication rule by its

version in the complex plane. Another study by Man et al. [45] is based on the combination of inductive learning with inference systems in complex fuzzy sets. Another version, which embeds learning with a neural fuzzy network on CFS and is called the Adaptive Neural Complex Fuzzy Inference System (ANCFIS), was introduced by Chen et al. [46]. Two improvements of ANCFIS aimed at increasing the computational speed are given in [47-48]. However, most of the so-called CFISs that have been proposed in the literature are not truly complex systems.

The existing studies show that complex fuzzy systems still have the following limitations:

- Complex fuzzy systems have not provided an overall procedure for building complex fuzzy inference systems for decision support systems.

- The rule set is established only from experience and logical reasoning, without addressing the problem of optimizing complex fuzzy inference systems.

- Complex fuzzy systems have not been studied for application to new data that are not included in the training data used to generate the inference model.

- The complex fuzzy t-norm and t-conorm operators have not yet received attention or been applied in decision support systems.

Research objectives of the thesis.

The thesis focuses on researching complex fuzzy inference systems and applying them to decision support problems, as follows:

1) Research the theories of complex fuzzy sets, complex fuzzy logic and measures based on complex fuzzy sets.

2) Research and develop fuzzy inference systems based on complex fuzzy sets.

3) Research techniques to reduce and optimize fuzzy rules in complex fuzzy inference systems.

4) Research how to represent rules by fuzzy knowledge graphs to reduce inference computation time on the test set and to deal with cases where new data are not present in the training data set.

The thesis consists of four main chapters, together with an Introduction, a Conclusion and a list of references. The Introduction presents an overview of the research problem, the reasons for choosing the topic, and the objectives and research content of the thesis. The Conclusion summarizes the results achieved in the thesis and directions for future research. The main chapters are organized as follows.

Chapter 1 presents the basic concepts and background knowledge that will be used in the next chapters,

such as fuzzy set, complex fuzzy set, fuzzy measure, complex fuzzy measure and related studies on inference

system based on complex fuzzy set in recent years. On that basis, the thesis analyzes the remaining problems and clarifies its research motivation: using complex fuzzy inference systems to solve decision support problems. In addition, the experimental datasets and the measures used for experimental evaluation are detailed in this first chapter.

Chapter 2 presents two main research results: the first is the definition of complex fuzzy t-norm and t-conorm operations together with an example of using these operations in a decision support system; the second is the development of the Mamdani inference system on complex fuzzy sets. The choice of aggregation operator, the methods of determining the rule firing strength and the defuzzification methods are presented in this chapter.

Chapter 3 proposes the M-CFIS-R model, a new Mamdani Complex Fuzzy Inference System with Rule Reduction Using Complex Fuzzy Measures in Granular Computing. Several fuzzy similarity measures, such as the Complex Fuzzy Cosine Similarity Measure (CFCSM), Complex Fuzzy Dice Similarity Measure (CFDSM), and Complex Fuzzy Jaccard Similarity Measure (CFJSM), together with their weighted versions, are proposed. Those measures are integrated into the M-CFIS-R system through the idea of Granular Computing so that only important and dominant rules are kept in the system.

Chapter 3 considers the problem of reducing the rule base in the Mamdani complex fuzzy inference system. Based on the theory of granular computing, the thesis proposes three complex fuzzy measures and combines them with granular computing to optimize the rule base of the Mamdani complex fuzzy inference system proposed in Chapter 2, giving the M-CFIS-R complex fuzzy inference system. Numerical examples and experimental results demonstrate the effectiveness of rule reduction in the Mamdani complex fuzzy inference system.

While Chapter 3 focuses only on rule reduction and optimization in the training phase, Chapter 4 focuses on improving inference on the testing set by applying the theory of fuzzy knowledge graphs. In addition, the thesis proposes some extensions of M-CFIS-R: Sugeno Complex Fuzzy Inference Systems (S-CFIS-R), Tsukamoto Complex Fuzzy Inference Systems (T-CFIS-R), and complex fuzzy measures and complex fuzzy integrals in M-CFIS-R.

Finally, the conclusion section presents the contributions of the thesis, development direction and issues

of concern of the author.

CHAPTER 1. INTRODUCTION

1.1. Introduction

Fuzzy set theory and Complex Fuzzy Set are approaches for representing and processing vagueness

found abundantly in the real world.

Figure 1.1. Fuzzy Inference Systems in Decision Support Systems

1.2. Problem of Fuzzy Inference Systems in Decision Support Systems

General process of using fuzzy systems in decision support systems:

Firstly, based on the training sample data, a rule generation procedure is applied to generate a set of fuzzy rules. This rule set is the core collection of rules and knowledge extracted from the training dataset. Next, each new input is applied to each rule to compute the rule outputs, and an aggregation process combines the results from the rules to produce a final value. Finally, at the decision-making step, this value is adjusted and normalized to make the final decision.

1.3. An overview of related works

1.3.1. Fuzzy Inference System

Fuzzy Inference System (FIS) is a popular computational framework based on the concept of fuzzy theory and is commonly applied when constructing decision support models. There are three types of FIS: Mamdani FIS, Sugeno FIS (or Takagi–Sugeno), and Tsukamoto FIS.


1.3.2. Complex Fuzzy Inference Systems

1.3.2.1 Complex Fuzzy Logic System of Ramot

Ramot proposed a complex fuzzy logic system (CFLS) that consists of three stages: the fuzzification module, the fuzzy inference stage and the defuzzification process. In CFLS, Ramot et al. did not outline any specific method of defuzzification to reduce the complex fuzzy outputs into crisp outputs.

1.3.2.2. CANFIS Model of Li and Jang

Li and Jang proposed an FIS based on CFSs called the Complex Neuro-Fuzzy Inference System

(CANFIS). This system is however not truly complex in nature, as the real and imaginary parts of the input

membership functions are dealt with separately using two type-1 fuzzy sets. Separating the real and imaginary

parts also leads to an increased number of rules which makes this system computationally expensive.

1.3.2.3. ANCFIS model of Chen et al

The ANCFIS structure was proposed by Chen et al. in 2010 and is similar to the complex-valued neural network structure. The ANCFIS models by Man [12] and Chen [13] used the vector dot product for the aggregation stage and treated the complex-valued inputs as real values, thereby enabling them to obtain scalar values for the dot product. This would not be possible if the inputs were indeed treated as complex values, as the dot product of two complex numbers is a complex number and not a scalar value. The ANCFIS system is therefore not truly complex, as the outputs of the system will not be representative of the periodicity of the elements.

1.3.2.4. Other Fuzzy Inference Systems on Complex Fuzzy Sets

Besides the studies above, complex fuzzy sets have also attracted interest and been developed by many research groups. Malekzadeh and Akbarzadeh [27] proposed another inference system based on complex fuzzy sets, called the Complex-valued Adaptive Neuro-fuzzy Inference System (CANFIS), which is a hybrid of CFIS and fuzzy neural networks. However, the authors did not present any method for the defuzzification of the complex-valued outputs to crisp values, and chose to consider only the real part of the output. Deshmukh et al. introduced a complex fuzzy logic module and applied it to the design process of a fuzzy microprocessor using the VLSI approach. However, the authors did not implement rule interference and did not provide a valid defuzzification module.

1.3.3. Remaining problems of the current CFIS system

From previous studies, most of the so-called CFISs that have been proposed in the literature are not

truly complex systems. In order to handle time-series data in time-periodic phenomena, FISs and ANFISs

employ two general strategies: 1) ignore the information related to the phase term; 2) represent the amplitude

and phase terms separately using two fuzzy sets. This would cause loss of information and produce unreliable

results (if information related to the phase terms are ignored), distortion of information, and a reduction in

computational efficiency (if information related to the amplitude and phase are represented separately) as it

becomes more time-consuming due to the increased number of sets that need to be dealt with.

1.4. Theoretical basis

1.4.1. Fuzzy set

The concept of fuzzy sets was introduced by Lotfi A. Zadeh in 1965 [1] with the aim of describing the concept of "unclear sets" in the study of uncertain factors.

1.4.2. Complex Fuzzy Set

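A Complex Fuzzy Set $S$ on a universe $U$ is characterized by a membership function $\mu_S(x)$ that lies within the unit circle in the complex plane and has the form:

$\mu_S(x) = r_S(x)\, e^{j\omega_S(x)}$, (0.1)

where the amplitude $r_S(x)$ and the phase $\omega_S(x)$ are both real-valued, with the condition $r_S(x) \in [0, 1]$ and $j = \sqrt{-1}$.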
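As a minimal sketch of this definition, assuming the usual polar form of a membership grade (amplitude times $e^{j \cdot \text{phase}}$), a grade can be built with Python's standard `cmath` module. The function name `membership_grade` is hypothetical and used only for illustration.

```python
import cmath
import math


def membership_grade(amplitude: float, phase: float) -> complex:
    """Return amplitude * e^{j*phase}; the amplitude must lie in [0, 1]."""
    if not 0.0 <= amplitude <= 1.0:
        raise ValueError("amplitude must lie in [0, 1]")
    # cmath.rect converts polar coordinates (r, phi) to a complex number.
    return cmath.rect(amplitude, phase)


# Illustrative grade with amplitude 0.8 and phase pi/2 (values chosen arbitrarily).
grade = membership_grade(0.8, math.pi / 2)
```

Because the amplitude is bounded by 1, the modulus of every grade stays inside the unit circle, which is the condition the definition states.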

1.4.3. Some basic operations of CFS:

1.4.3.1 Complement of a Complex Fuzzy Set

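Let $A$ and $B$ be two complex fuzzy sets on $U$, and let $\mu_A(x) = r_A(x)\, e^{j\omega_A(x)}$ and $\mu_B(x) = r_B(x)\, e^{j\omega_B(x)}$. The complement of $A$ (denoted as $\bar{A}$) is specified by the function:

$\mu_{\bar{A}}(x) = r_{\bar{A}}(x)\, e^{j\omega_{\bar{A}}(x)}$ (1.4)

where $r_{\bar{A}}(x) = 1 - r_A(x)$ and $\omega_{\bar{A}}(x) = 2\pi - \omega_A(x)$.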

1.4.3.2. Union and intersection of two Complex Fuzzy Sets

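- The union of $A$ and $B$ is denoted as $A \cup B$:

$\mu_{A \cup B}(x) = [r_A(x) \oplus r_B(x)]\, e^{j\omega_{A \cup B}(x)}$ (1.5)

where $\oplus$ is a t-conorm, for instance, the Max operator.

- The intersection of two Complex Fuzzy Sets $A$ and $B$ (denoted as $A \cap B$):

$\mu_{A \cap B}(x) = [r_A(x) \star r_B(x)]\, e^{j\omega_{A \cap B}(x)}$ (1.6)

where $\star$ is a t-norm, for example, the Min operator.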
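The basic operations can be sketched in Python under explicit assumptions. The amplitude rules (one minus the amplitude for the complement, a t-conorm for the union, a t-norm for the intersection) follow the text; the phase rules used here ($2\pi$ minus the phase for the complement, max for the union, min for the intersection) are only one common choice and are an assumption of this sketch, as are all function names.

```python
import cmath
import math

TWO_PI = 2.0 * math.pi


def polar(z):
    """Split a grade into (amplitude, phase), with the phase mapped into [0, 2*pi)."""
    r, w = cmath.polar(z)
    return r, w % TWO_PI


def complement(z):
    # Amplitude: 1 - r. Phase: 2*pi - omega (assumed convention).
    r, w = polar(z)
    return cmath.rect(1.0 - r, TWO_PI - w)


def union(z1, z2):
    # Max t-conorm on amplitudes; max on phases is an assumption.
    (r1, w1), (r2, w2) = polar(z1), polar(z2)
    return cmath.rect(max(r1, r2), max(w1, w2))


def intersection(z1, z2):
    # Min t-norm on amplitudes; min on phases is an assumption.
    (r1, w1), (r2, w2) = polar(z1), polar(z2)
    return cmath.rect(min(r1, r2), min(w1, w2))


a = cmath.rect(0.3, 1.0)
b = cmath.rect(0.7, 2.0)
```

With these choices, `union(a, b)` keeps the larger amplitude and phase, while `intersection(a, b)` keeps the smaller ones; other t-norm and t-conorm pairs can be substituted in the same place.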

1.4.4. Complex Fuzzy Logic

A complex fuzzy logic system (CFLS) is built on a base of complex fuzzy rules (CFL) defined on complex fuzzy sets. A CFL represents a complex fuzzy implication relation between

complex fuzzy propositions p and q, where p ∼ “X is A” and q ∼ “Y is B,” respectively. A complex fuzzy

implication is then defined as (1.14)

1.4.5. Fuzzy measures and complex fuzzy measures

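Definition: [44] A distance of complex fuzzy sets is a function $\rho : F^*(U) \times F^*(U) \to [0, 1]$ that, for any $A, B, C \in F^*(U)$, satisfies:

- $\rho(A, B) \geq 0$, and $\rho(A, B) = 0$ if and only if $A = B$

- $\rho(A, B) = \rho(B, A)$

- $\rho(A, B) \leq \rho(A, C) + \rho(C, B)$ (1.16)

where $F^*(U)$ is the set of all complex fuzzy sets in $U$.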
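To make the three distance conditions concrete, the sketch below checks them numerically for one candidate distance on a finite universe. The particular distance (the larger of the maximal amplitude difference and the maximal phase difference scaled by $1/2\pi$) is an assumption chosen for illustration, not necessarily the one used in the thesis; each set is represented as a list of hypothetical `(amplitude, phase)` pairs.

```python
import math


def cfs_distance(a, b):
    """Candidate distance between two complex fuzzy sets on the same finite universe.

    a, b: lists of (amplitude, phase) pairs, with phases assumed in [0, 2*pi].
    """
    amp_gap = max(abs(ra - rb) for (ra, _), (rb, _) in zip(a, b))
    phase_gap = max(abs(wa - wb) for (_, wa), (_, wb) in zip(a, b))
    return max(amp_gap, phase_gap / (2.0 * math.pi))


A = [(0.2, 0.5), (0.9, 3.0)]
B = [(0.4, 1.0), (0.5, 2.0)]
C = [(0.0, 0.0), (1.0, 6.0)]

# The three conditions of the definition, checked on these examples.
identity_holds = cfs_distance(A, A) == 0.0
symmetry_holds = cfs_distance(A, B) == cfs_distance(B, A)
triangle_holds = cfs_distance(A, B) <= cfs_distance(A, C) + cfs_distance(C, B)
```

A numerical check on a few sets does not prove the axioms, but it is a quick way to reject a candidate function that violates them.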

1.5. Experiment datasets

1.5.1. Benchmark datasets

To illustrate the proposed models, the thesis uses five commonly available benchmark datasets taken from the UCI Machine Learning Repository: Wisconsin Breast Cancer Diagnosis (WBCD), Diabetes, Wine Quality, Cardiotocography and Arrhythmia.


1.5.2. Real dataset

The real dataset was received from Gangthep Hospital and Thai Nguyen National Hospital, Vietnam, and includes 4156 patients who came to the hospital for examination of their liver function. Based on these results, the physician may ask the patient to undergo additional examinations to improve the diagnosis.

1.5.3. Experimental evaluation measures

The metrics used to evaluate the effectiveness of proposed model for the decision support system

include: Accuracy, Precision measure, Recall measure and computational time.
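The three classification measures can be sketched for the binary case as follows. The function name and the convention of returning 0.0 when a denominator is zero are assumptions of this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators (assumed convention: report 0.0).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


acc, prec, rec = classification_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

In this toy example there are two true positives, one false positive, one false negative and one true negative, so accuracy is 3/5 and both precision and recall are 2/3.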

1.6. Chapter conclusion

Chapter 1 presents some basic concepts of complex fuzzy set theory, existing fuzzy inference systems and complex fuzzy inference systems, and an overview of research on fuzzy inference systems based on complex fuzzy sets. This content provides the background knowledge used in the next chapters of the thesis.

CHAPTER 2. CONSTRUCT MAMDANI COMPLEX FUZZY INFERENCE SYSTEM (M-CFIS)

2.1. Overview

This chapter introduces Mamdani complex fuzzy inference system with details of the components as

well as implementation and operators. Moreover, the complex fuzzy t-norm and t-conorm also are proposed

and applied in decision support problems.

2.2. Proposed complex fuzzy t-norm and t-conorm

2.2.1. t-norm and t-conorm

This section briefly presents the t-norm and t-conorm operators.

2.2.2. Complex fuzzy t-norms and t-conorms

Definition 2.3. Let $T : D \times D \to D$ be a mapping, where $D$ is the unit complex disk in the set of complex numbers. Then $T$ is called a complex T-norm if the following conditions hold for all , where are the complex fuzzy membership grades with

(1)

(2) , if ,

(3)

(4)

Definition 2.4. Let $S : D \times D \to D$ be a mapping, where $D$ is the unit complex disk in the set of complex numbers. Then $S$ is called a complex T-conorm if the following conditions hold for all , where are the complex fuzzy membership grades:

(1)

(2) , if ,

(3)

(4)

Definition 2.5. If the complex fuzzy t-norm function is continuous and for all , then it is called an Archimedean complex fuzzy t-norm. If an Archimedean complex fuzzy t-norm is strictly increasing with respect to each variable for all , then it is called a strict Archimedean complex fuzzy t-norm.

Definition 2.6. If a complex fuzzy t-conorm function is continuous and for all , then it is called an Archimedean complex fuzzy t-conorm. If an Archimedean complex fuzzy t-conorm is strictly increasing with respect to each variable for all , then it is called a strict Archimedean complex fuzzy t-conorm.

Definition 2.7. Let , is called a negation function if:

(1)

(2) when

Definition 2.8. A negation function is called strict if:

(1) it is continuous, and

(2) it is strictly decreasing: if for all

Definition 2.9. A negation function is called involutive if it is strict and for all .

2.2.3. An example in multi-criteria decision making algorithm

In this section, we apply the t-norm operators to develop a multi-criteria decision making

(MCDM) algorithm, which consists of the following steps:

Step 1. Consider a MCDM problem where there are alternatives and criteria

. The decision maker constructs the decision matrix where represents the

degree that the decision maker prefers the alternative with respect to the criterion . The weights of the

criteria are expressed as the CFNs , where indicates the amplitude

function/degree that the

decision maker prefers criterion and indicates the phase term/degree.

Step 2. Transform the decision matrix into the normalized decision matrix ,

where

Step 3. Utilize the operators in Example 2.3 to compute the Lukasiewicz complex fuzzy T-norms.

Step 4. Sum up the complex membership grades.

Step 5. Consider the highest score as the candidate for the best ranking.


2.3. Mamdani complex fuzzy inference system (M-CFIS)

2.3.1. Proposed Mamdani complex fuzzy inference system

Figure 2.1. The framework of M-CFIS

2.3.2. Choices used in M-CFIS

2.3.2.1. Complex fuzzy membership function

In the M-CFIS model, the classic complex membership function is used, as follows:

$\mu_S(x) = r_S(x)\, e^{j\omega_S(x)}$

with $r_S(x)$ and $\omega_S(x)$ representing the amplitude and phase terms of the membership grade.

2.3.2.2. Operations used in the M-CFIS

In this research, the operations that will be used in our M-CFIS are given below:

1. The minimum T-norm is used for calculating the firing strength of a complex fuzzy rule with AND

connecting the antecedents.

2. The maximum T-conorm is used for calculating the firing strength of a complex fuzzy rule with

OR connecting the antecedents.

3. The Mamdani implication rule for complex fuzzy sets is used to calculate the values of the

consequent of each complex fuzzy rule:

2.3.2.3. Vector aggregation for CFSs

M-CFIS uses the following dot product between complex-valued vectors:

2.3.2.4. Aggregation of the output distribution

The output distribution: with are complex functions. This way, we can be sure of obtaining a truly complex CFIS in which the information pertaining to the phase is not disregarded but taken into consideration in every step of the decision-making process.

2.3.3. Structure of the Mamdani CFIS

The proposed Mamdani CFIS consists of six stages which must be completed before an output is obtained. These stages are as given below:

Stage 1: Determine a set of complex fuzzy rules

Establish a set of complex fuzzy rules of the form:

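CFR1: If $x_{1,1}$ is $A_{1,1}$ $O_{1,1}$ $x_{1,2}$ is $A_{1,2}$ $O_{1,2}$ … $x_{1,n_1}$ is $A_{1,n_1}$ then $y_1$ is $C_1$

CFR2: If $x_{2,1}$ is $A_{2,1}$ $O_{2,1}$ $x_{2,2}$ is $A_{2,2}$ $O_{2,2}$ … $x_{2,n_2}$ is $A_{2,n_2}$ then $y_2$ is $C_2$

…

CFRk: If $x_{k,1}$ is $A_{k,1}$ $O_{k,1}$ $x_{k,2}$ is $A_{k,2}$ $O_{k,2}$ … $x_{k,n_k}$ is $A_{k,n_k}$ then $y_k$ is $C_k$

in which: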

with (a)

, with . and (b)

, with and . (c)

(d) is a T-norm, and is the S-norm (i.e. the T-conorm) that corresponds to 𝑇0.

with (e)

(f) , where

.

(i) iff

(ii) iff

Stage 2: Fuzzification.

This stage involves finding the fuzzified input membership function values:

with

Stage 3: Establishing the rule firing strength.

Compute the firing strengths for each complex fuzzy rule , where:

Stage 4: Calculating the consequence of the complex fuzzy rules

We form the consequent of :

Stage 5: Aggregation for the output distribution

The output distribution is defined as:


.

Stage 6: Defuzzification.

Choose a defuzzification function and determine the value of the output. For instance, we may choose an approximation computed with the trapezoidal rule.
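The trapezoidal rule mentioned in the defuzzification stage can be sketched as below. The integrand used by M-CFIS is not recoverable from this summary, so the function `f` here is a placeholder and the helper name `trapezoid` is hypothetical; the sketch only shows that the rule works unchanged for complex-valued integrands.

```python
import cmath
import math


def trapezoid(f, a, b, n=1000):
    """Composite trapezoidal approximation of the integral of f over [a, b].

    f may return real or complex values; n is the number of subintervals.
    """
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Placeholder integrand: e^{jy} on [0, pi], whose exact integral is 2j.
approx = trapezoid(lambda y: cmath.exp(1j * y), 0.0, math.pi)
```

A crisp output would then be derived from the resulting complex value by whatever defuzzification function is chosen, for example its modulus; that choice is left open here.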

2.4. Experimental results

This section evaluates the performance of the proposed M-CFIS against the Mamdani fuzzy inference system (M-FIS) on benchmark UCI datasets and a real medical dataset from Gangthep Hospital and Thai Nguyen National Hospital. The experimental results are shown in Figure 2.2, Figure 2.3 and Figure 2.4.

Figure 2.2. Performance on the WBCD

Figure 2.3. Performance on the Diabetes

Figure 2.4. Performance on the Liver

CHAPTER 3. MAMDANI COMPLEX FUZZY INFERENCE SYSTEM WITH RULE REDUCTION (M-CFIS-R)

3.1. Overview

One limitation of the existing M-CFIS is that its rule base, which is built from the calculation of rule strength, may be redundant for a specific dataset. To handle this problem, this chapter presents an improvement that optimizes the rule base of M-CFIS by combining similarity measures with granular computing.


3.2. Proposed complex fuzzy measures

3.2.1. Complex Fuzzy Cosine Similarity Measure

Definition 3.1. Assume that there are two complex fuzzy sets and in for all , with both amplitude and phase term in [0,1]. A Complex Fuzzy Cosine Similarity Measure (CFCSM) between and is:

(3.1)

where ; ; ;

Definition 3.2. Weighted Complex Fuzzy Cosine Similarity Measure (WCFCSM)

Assume that there are two complex fuzzy sets and in for all . A Weighted Complex Fuzzy Cosine Similarity Measure between and is:

(3.2) with

3.2.2. Complex Fuzzy Dice Similarity Measure

Definition 3.3. Assume that there are two complex fuzzy sets and in for all , with both amplitude and phase term in [0,1]. A Complex Fuzzy Dice Similarity Measure (CFDSM) between and is:

(3.3)

where ; ; ;

Definition 3.4. Weighted Complex Fuzzy Dice Similarity Measure (WCFDSM)

Assume that there are two complex fuzzy sets and in for all , with both amplitude and phase term in [0,1]. A Weighted Complex Fuzzy Dice Similarity Measure between and is:

(3.4) with

3.2.3. Complex Fuzzy Jaccard Similarity Measure

Definition 3.5. Assume that there are two complex fuzzy sets and in for all , with both amplitude and phase term in [0,1]. A Complex Fuzzy Jaccard Similarity Measure (CFJSM) between and is:

(3.5)

with ; ; ;

Definition 3.6. Weighted Complex Fuzzy Jaccard Similarity Measure (WCFJSM)

Assume that there are two complex fuzzy sets and in for all , with both amplitude and phase term in [0,1]. A Weighted Complex Fuzzy Jaccard Similarity Measure between and is:

(3.6) with
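The exact formulas (3.1) to (3.6) are not recoverable from this summary, so the following is only a sketch of how cosine, Dice and Jaccard measures are commonly built. It assumes each membership grade is treated as a two-component vector (amplitude, phase), both in [0,1], applies the classical vector form of each measure per element, and then takes a plain or weighted average over the universe. All function names are hypothetical.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    denom = math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v))
    return _dot(u, v) / denom if denom else 0.0


def dice(u, v):
    denom = _dot(u, u) + _dot(v, v)
    return 2.0 * _dot(u, v) / denom if denom else 0.0


def jaccard(u, v):
    denom = _dot(u, u) + _dot(v, v) - _dot(u, v)
    return _dot(u, v) / denom if denom else 0.0


def mean_similarity(measure, set_a, set_b, weights=None):
    """Average a per-element measure over two sets of (amplitude, phase) pairs.

    weights: optional per-element weights assumed to sum to 1;
    when omitted, every element gets weight 1/n.
    """
    n = len(set_a)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(w * measure(a, b) for w, a, b in zip(weights, set_a, set_b))


S1 = [(0.6, 0.8), (1.0, 0.0)]
S2 = [(0.8, 0.6), (1.0, 0.0)]
weighted_cos = mean_similarity(cosine, S1, S2, weights=[0.25, 0.75])
```

Under these assumptions all three measures equal 1 for identical sets and decrease as the amplitude and phase components diverge, which is the behaviour a rule-similarity score needs.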

3.3. Proposed M-CFIS-R System

3.3.1. Main ideas

M-CFIS-R is divided into two main parts: the Training phase, which creates the original complex fuzzy rule base and improves it by Granular Computing with Complex Fuzzy Measures; and the Testing phase, which tests the performance of the rule base obtained in the Training phase.

3.3.2. Training

Figure 3.1. Training diagram for the proposed model

3.3.2.1. Real and Imaginary Data Selection.

From the Training data, we build the real and imaginary data as follows. The real data are defined as the original data values. The imaginary data at row P of attribute (column) Q are determined as var.P(row) + var.Q(column), where var.P(row) is the variance along row P and var.Q(column) is the variance along column Q.
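The construction of the imaginary data can be sketched directly from this rule: the value at row P and column Q is the variance of row P plus the variance of column Q. The summary does not say whether the population or the sample variance is meant; the sketch assumes the population variance (`statistics.pvariance`), and the function name is hypothetical.

```python
from statistics import pvariance


def imaginary_part(data):
    """Return a matrix whose (P, Q) entry is var(row P) + var(column Q).

    data: list of equal-length rows of numeric values (the real data).
    Population variance is an assumption of this sketch.
    """
    row_var = [pvariance(row) for row in data]
    col_var = [pvariance(col) for col in zip(*data)]
    return [[rv + cv for cv in col_var] for rv in row_var]


real_data = [[1.0, 3.0], [5.0, 9.0]]
imag_data = imaginary_part(real_data)
```

For this small matrix the row variances are 1 and 4 and the column variances are 4 and 9, so the imaginary matrix is [[5, 10], [8, 13]].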


3.3.2.2. Fuzzy C-Means (FCM)

The Fuzzy C-Means clustering method is used for dividing the data according to each attribute into

several groups. The number of clusters specified for each attribute is different based on the semantic value of

the attribute.

3.3.2.3 Granular Complex Fuzzy Measures

Assume that the outputs of three similarity measures are three corresponding square matrices whose elements are the correlations between pairs of complex fuzzy rules ( ). Then, we determine the final degree of similarity between complex fuzzy rules based on the aggregation to be determined. For each set of labels, we obtain :

For rules with different labels, . From these, we obtain the matrix .

A new complex fuzzy rule base is found from F by removing rules having a high or maximal degree of similarity within a group. Then, we proceed to the next steps to evaluate the performance of the new rule base. If the performance of the new complex fuzzy rule base is worse than that of the current rule base, we return to the steps of computing the complex fuzzy measures and granular computing for the new complex fuzzy rule base. The iteration stops either when the performance of the new complex fuzzy rule base is better than that of the current base or when the number of rules for any label equals 1.

3.3.3. Testing

We perform a similar procedure with M-CFIS [23] for testing the performance of the system with

the reduced complex fuzzy rule base found in the Training phase.

3.4. Experiments

3.4.1. Experimental Results on the Benchmark UCI Datasets

Using 3-fold cross-validation method, the values of criteria obtained by applying M-CFIS and M-

CFIS-R on the UCI datasets are visually presented in Figures 3.3 and 3.4, respectively.

Figure 3.3. Performance on the WBCD dataset, panels (a) to (e)

Figure 3.3 shows the results of applying M-CFIS and M-CFIS-R on the first dataset, WBCD. The accuracy, recall and precision of M-CFIS-R on the training and testing data are higher than those of M-CFIS. The computation time of the two methods can be considered equal, with M-CFIS-R using 36 fewer rules than M-CFIS.

Figure 3.4. Performance on the Diabetes dataset, panels (a) to (e); the number-of-rules panel shows 106 rules for M-CFIS and 101 for M-CFIS-R

The values of the validity indices (Figure 3.4a,b) obtained from M-CFIS-R are higher than those of M-CFIS by more than 1%, with a small SD. However, the running time (Figure 3.4c) of M-CFIS-R is higher than that of M-CFIS by only 0.02 s on the training data and 0.086 s on the testing data. The average number of rules (Figure 3.4d) of M-CFIS-R is 5 less than that of M-CFIS, with an SD of 0.94.

3.4.2. Experimental Results on the Real Datasets

On the real datasets, the classification quality evaluation between our proposed method M-CFIS-R and

M-CFIS is shown in Figure 3.5.

Figure 3.5. Performance on the Liver dataset, panels (a) to (e); the number-of-rules panel shows 839 rules for M-CFIS and 770 for M-CFIS-R

Figure 3.5 shows the performance of M-CFIS-R and M-CFIS on the Liver dataset. It is clear that the accuracy, recall and precision values of M-CFIS-R on the training data are higher than those of M-CFIS. Although the recall of M-CFIS-R on the testing data is 0.4% smaller than that of M-CFIS, the SD is very small (only 0.03). This is caused by the decrease in the number of rules. On the Liver dataset, the number of rules in M-CFIS-R is 69 less than that of M-CFIS. This is the reason for M-CFIS-R being more time-consuming than M-CFIS.

CHAPTER 4. EXTENSION OF THE MAMDANI COMPLEX FUZZY INFERENCE SYSTEM WITH KNOWLEDGE GRAPH (M-CFIS-FKG)

4.1. Overview

The M-CFIS-R model was designed to use granular computing with complex similarity measures to reduce the rule base and so gain better performance in decision-making problems. However, M-CFIS-R has the following limitations: (1) testing data are checked by matching against each rule in the rule base, which leads to a high computational cost; (2) if the testing data contain records that cannot be inferred from the rule base, no output can be generated; (3) the M-CFIS-R model is based on the Mamdani inference model and needs to be extended to the Sugeno and Tsukamoto inference models; (4) other complex fuzzy integral and measure concepts also need to be considered. For these reasons, Chapter 4 presents a new approach based on the knowledge graph to overcome the limitations of the M-CFIS-R model proposed in Chapter 3.

4.2. Some extensions of M-CFIS-R

4.2.1. Sugeno and Tsukamoto complex fuzzy inference system

- Sugeno complex fuzzy inference system (S-CFIS-R): The steps in S-CFIS-R are as follows.

Step 1. Rule formation.

A complex fuzzy rule CFRi can be expressed as follows: CFRi: If is is … is then ; where: are the complex fuzzy sets taken by the rule antecedent variables; is the complex T-norm or T-conorm operator, depending on the practical application; and is a polynomial function of the rule’s consequent.

Step 2: Fuzzification.

Find the fuzzified input values with the complex fuzzy membership function.

Step 3: Aggregation for firing strengths.

Calculate the firing strength value for each rule by the function ;

Step 4: Determine the consequence value of the complex fuzzy rules:

Step 5: Aggregation for the output distribution.

Denote and for all m. The output is obtained by using the

weighted average formula as given below:

- Tsukamoto complex fuzzy inference system (T-CFIS-R): The inference process of the Tsukamoto complex fuzzy inference system is similar to that of the Sugeno complex fuzzy inference system. Each rule consequent in the Tsukamoto complex fuzzy inference system is specified by a monotonic function on a complex fuzzy set. Thus, the inference output of each rule is obtained based on the predicate. Finally, the final output is calculated by a weighted average formula (the same as in S-CFIS-R).
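The weighted average used by both S-CFIS-R and T-CFIS-R can be sketched as follows. The exact formula is not recoverable from this summary; the sketch assumes the standard Sugeno form, the sum of firing strength times consequent divided by the sum of firing strengths, with complex firing strengths and real consequent values. The function name is hypothetical.

```python
def weighted_average(strengths, consequents):
    """Standard weighted average: sum(w_i * z_i) / sum(w_i).

    strengths: complex (or real) firing strengths, one per rule.
    consequents: consequent values, one per rule.
    """
    total = sum(strengths)
    if total == 0:
        raise ZeroDivisionError("firing strengths sum to zero")
    return sum(w * z for w, z in zip(strengths, consequents)) / total


# Two rules with equal firing strength and consequents 2.0 and 4.0.
output = weighted_average([0.5 + 0j, 0.5 + 0j], [2.0, 4.0])
```

With complex strengths the result is in general complex, so a further step (for example taking the modulus or the real part) would be needed to obtain a crisp decision; which step the thesis uses is not stated here.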


4.2.2. Complex fuzzy measures based on set theory

Definition 4.1. Given a non-empty complex fuzzy set on universe of discourse . A subset of is an algebra of complex fuzzy sets on if it satisfies the following conditions:

(1)

(2) If then

(3) If then

Definition 4.2. Given the complex fuzzy measurable space . A mapping is defined as a complex fuzzy measure on if it satisfies:

(1) and

(2) for any , with and

Definition 4.3. Given , are complex fuzzy measurable spaces and a mapping . Mapping is called an isomorphism between and if the following conditions hold:

(1) is a bijective mapping with

(2) and with

(3) There exists a bijective mapping with , and

Definition 4.4. Given complex fuzzy measure spaces and . A mapping is called an isomorphism mapping between and if the following conditions are satisfied:

(1) is an isomorphism mapping between and .

(2) with

Definition 4.5. A complex fuzzy space is a cardinal space if the following hold:

(1) , then ,

(2) for any and any permutation on .

4.2.3. Complex fuzzy integral

Definition 4.6. Given a complex fuzzy measure space with , a mapping

and - measure .

Complex fuzzy integral - of on is calculated as follows:

with

4.2.3.1. Complex fuzzy integral

Definition 4.7. Given a complex fuzzy measurable space and , an algebra of sets

is a crisp representation of the algebra

if and only if there is

on for any such that:

and if for , then .

4.2.3.2. A relation to the Sugeno integral

Theorem 4.4. Given a complete divisible residuated lattice , a complex fuzzy measure space

with and mapping .

Then we have:

is the inner complex fuzzy measure on

where .

4.2.3.3. Properties of the complex fuzzy integral

Theorem 4.6. Given with .


with and is crisp, then If

that is specified by the below

is the complex fuzzy measure on . function:

4.3. Proposed Mamdani complex fuzzy inference system with fuzzy knowledge graph (M-CFIS-FKG)

4.3.1. Main ideas

To make the inference process in Testing faster, we extend M-CFIS-R as follows. The initial data are divided into three parts, namely Training, Validation, and Testing. From the Training data, we build the real and imaginary data. Using the M-CFIS-R model, we obtain a complex fuzzy rule base with a suitable and effective number of rules. Then, we construct the Fuzzy Knowledge Graph (FKG) from the rule base and represent it by an adjacency matrix. For Testing, we design the Fast Inference Search Algorithm (FISA) to derive the outputs from the FKG.

4.3.2. Construct fuzzy knowledge graph

Suppose that we have the following complex fuzzy rule base with X1, X2, …, Xm as the attributes in the rule dataset. We gradually build the FKG for each rule . For each pair , with , let us construct an edge as where is the linguistic variable in attribute . With each , an edge is constructed where is the label of rule t. Suppose that is the weight of the edge in rule t with , , then : (1)

The weight is the relationship of the attribute to the label l where , , . Then is calculated as (2)


Figure 4.1. The Training of the proposed model

Figure 4.2. The Testing


For example, suppose we have the following six fuzzy rules:

R1: If x1 is Medium1 and x2 is High2 and x3 is High3 then k is 1

R2: If x1 is High1 and x2 is Low2 and x3 is Low3 then k is 2

R3: If x1 is Low1 and x2 is Medium2 and x3 is High3 then k is 1

R4: If x1 is Low1 and x2 is High2 and x3 is Medium3 then k is 1

R5: If x1 is High1 and x2 is Low2 and x3 is Medium3 then k is 2

R6: If x1 is Medium1 and x2 is Low2 and x3 is Low3 then k is 2

Applying the above calculation steps, we get the FKG of the six rules as follows:

Figure 4.5. FKG for 6 rules
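To make the graph construction concrete, the following Python sketch builds a small graph from the six example rules and stores it as an adjacency matrix. It is a sketch under assumptions: the names build_fkg and adjacency_matrix are hypothetical, and the edge weights are plain co-occurrence frequencies (number of rules containing the pair divided by the number of rules), used here as a stand-in for the weights defined in Section 4.3.2.

```python
from collections import Counter
from itertools import combinations

# The six example rules: (linguistic terms of x1, x2, x3) and the label k.
RULES = [
    (("Medium1", "High2", "High3"), 1),
    (("High1", "Low2", "Low3"), 2),
    (("Low1", "Medium2", "High3"), 1),
    (("Low1", "High2", "Medium3"), 1),
    (("High1", "Low2", "Medium3"), 2),
    (("Medium1", "Low2", "Low3"), 2),
]


def build_fkg(rules):
    """Return term-term and term-label edges with frequency weights.

    ASSUMPTION: weight = (rules containing the pair) / (number of rules).
    """
    n_rules = len(rules)
    pair_counts = Counter()
    label_counts = Counter()
    for terms, label in rules:
        # One edge per pair of linguistic terms appearing in the same rule.
        for left, right in combinations(terms, 2):
            pair_counts[(left, right)] += 1
        # One edge from each linguistic term to the label of the rule.
        for term in terms:
            label_counts[(term, label)] += 1
    term_edges = {edge: c / n_rules for edge, c in pair_counts.items()}
    label_edges = {edge: c / n_rules for edge, c in label_counts.items()}
    return term_edges, label_edges


def adjacency_matrix(term_edges):
    """Represent the term-term edges as a symmetric adjacency matrix."""
    nodes = sorted({term for edge in term_edges for term in edge})
    index = {term: k for k, term in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for (left, right), weight in term_edges.items():
        matrix[index[left]][index[right]] = weight
        matrix[index[right]][index[left]] = weight
    return nodes, matrix


term_edges, label_edges = build_fkg(RULES)
nodes, matrix = adjacency_matrix(term_edges)
print(len(nodes), "nodes,", len(term_edges), "term-term edges")
```

In this sketch the pair (Low2, Low3) appears in R2 and R6, so its edge carries more weight than a pair seen in a single rule, and Low2 is linked to label 2 by three of the six rules.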

4.3.3. Fast inference search algorithm (FISA)

Using the Fuzzy Knowledge Graph of the rule base, the Testing process assigns a label to each input testing record (Figure 4.2). FISA (Fast Inference Search Algorithm) is deployed on the FKG to assign labels to the inputs after fuzzification as follows:

Firstly, we calculate the linguistic value to label for each rule based on the FKG by the following formula:

where t is rule in which the value of relation with label l.

Based on Approximate Reasoning, the linguistic attributes of a new record in the Testing set are approximated by the corresponding attributes in the FKG by the MIN-MAX operator as: . The final label for the record is determined by the MAX rule: if .
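The label decision described above can be sketched in Python as follows. This is one plausible reading only: it assumes that the MIN-MAX operator scores a label by adding the largest and the smallest of its per-attribute relation values, and that the MAX rule picks the label with the highest score. The names aggregate_scores and pick_label and the sample values are hypothetical.

```python
def aggregate_scores(relation_values):
    """Score each label from its per-attribute relation values.

    ASSUMPTION: MIN-MAX is read as max(values) + min(values) per label.
    relation_values maps a label to a list with one value per attribute.
    """
    return {label: max(values) + min(values)
            for label, values in relation_values.items()}


def pick_label(relation_values):
    """MAX rule: return the label with the highest aggregated score."""
    scores = aggregate_scores(relation_values)
    return max(scores, key=scores.get)


# Hypothetical relation values of one test record for labels 1 and 2.
record_values = {1: [0.2, 0.5, 0.1], 2: [0.4, 0.3, 0.3]}
scores = aggregate_scores(record_values)
label = pick_label(record_values)
print(scores, label)
```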

FISA Algorithm

Input: Testing Data, Fuzzy Knowledge Graph

Output: Labels of Testing Data

Begin

1: Build the real and imaginary data

- The real data are determined by using the original values of data.

- The imaginary data are calculated by the function Var.R (record) + Var.A (attribute), in which Var.R (record) is the variance value at record P and Var.A (attribute) is the variance value in attribute Q.

2: Fuzzification to achieve linguistic values

3: For t in dataset

4:   For l in label

5:     For i in attribute

6:       Calculate:

7:   Calculate:

8:   Identify the label: if

9: Get label and repeat the steps 1-8 for the new records until end

End
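Step 1 of FISA can be illustrated with a short Python sketch that builds the imaginary data as Var.R (record) + Var.A (attribute). The sketch rests on assumptions: the function names imaginary_part and to_complex are hypothetical, the population variance is used (the text does not say whether population or sample variance is intended), and the small data matrix is made up for illustration.

```python
from statistics import pvariance


def imaginary_part(data):
    """Imaginary value of cell (P, Q) = variance of record P + variance of attribute Q.

    ASSUMPTION: population variance (statistics.pvariance) is used.
    data is a list of records, each a list of numeric attribute values.
    """
    record_var = [pvariance(record) for record in data]
    attribute_var = [pvariance(column) for column in zip(*data)]
    return [[rv + av for av in attribute_var] for rv in record_var]


def to_complex(data):
    """Pair each original (real) value with its imaginary value."""
    imag = imaginary_part(data)
    return [[complex(value, im) for value, im in zip(record, imag_row)]
            for record, imag_row in zip(data, imag)]


# Hypothetical dataset: two records with two attributes each.
data = [[1.0, 3.0], [2.0, 6.0]]
imag = imaginary_part(data)
complex_data = to_complex(data)
print(imag)
```

The real part keeps the original value unchanged, so the resulting complex matrix can be passed to fuzzification in step 2.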

4.4. Experimental results

4.4.1. Experiment

In the evaluation, we use both two-label datasets and multi-label datasets. The two-label datasets are three benchmark datasets (Breast Wisconsin, Diabetes and Liver) taken from the UCI Machine Learning Repository. The multi-label datasets consist of three other benchmark UCI datasets (Wine, Cardiotocography (CTG) and Arrhythmia). Two scenarios have been designed, as given in Tables 4.2 and 4.3.

Table 4.2. Scenario 1

Data | Number of examples for each label
Training | 2/3 * 2/3 * 0.6 * (Number of examples for each label)
Validation | 2/3 * 1/3 * 0.6 * (Number of examples for each label)
Testing | 1/3 * 0.6 * (Number of examples for each label)
New data | 0.4 * (Number of examples for each label)

Table 4.3. Scenario 2

Data | If (Number of examples for each label / Total number of examples) > 5% | If (Number of examples for each label / Total number of examples) < 5%
Training | 2/3 * 2/3 * 0.3 * (Number of examples for each label) | 2/3 * 2/3 * 0.05 * (Number of examples for each label)
Validation | 2/3 * 1/3 * 0.3 * (Number of examples for each label) | 2/3 * 1/3 * 0.05 * (Number of examples for each label)
Testing | 1/3 * 0.3 * (Number of examples for each label) | 1/3 * 0.05 * (Number of examples for each label)
New data | 0.7 * (Number of examples for each label) | 0.95 * (Number of examples for each label)
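The split sizes of the two scenarios can be computed with a small Python helper. This is a sketch: the function name split_sizes is hypothetical, the sizes are returned without rounding (the tables do not state a rounding rule), and a label share of exactly 5% is assigned to the second column, which is an assumption because the tables only cover the cases above and below 5%.

```python
def split_sizes(n_label, n_total, scenario):
    """Per-label sizes of Training, Validation, Testing and New data.

    n_label: number of examples of one label; n_total: total number of examples.
    ASSUMPTION: a share of exactly 5% is treated like the "< 5%" column.
    """
    if scenario == 1:
        used = 0.6
    elif scenario == 2:
        used = 0.3 if n_label / n_total > 0.05 else 0.05
    else:
        raise ValueError("scenario must be 1 or 2")
    return {
        "training": 2 / 3 * 2 / 3 * used * n_label,
        "validation": 2 / 3 * 1 / 3 * used * n_label,
        "testing": 1 / 3 * used * n_label,
        "new_data": (1 - used) * n_label,
    }


# Hypothetical label with 90 of 180 examples.
scenario1 = split_sizes(90, 180, scenario=1)
scenario2 = split_sizes(90, 180, scenario=2)
print(scenario1)
print(scenario2)
```

In both scenarios the four parts add up to the number of examples of the label; Scenario 2 simply moves most of the data into the New data part.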

4.4.2. Experimental results

For two-label datasets, the experimental results compare the proposed model M-CFIS-FKG with the model M-CFIS-R on two evaluation criteria: computation time and accuracy.

4.4.2.1. Experimental results on 2-label datasets

Figure 4.8. Experimental results on WBCD dataset

As shown in Figure 4.8, there is not much difference between the two scenarios. The accuracy of M-CFIS-FKG in Scenario 1 is lower than that of M-CFIS-R (i.e. 13.93% lower on average). For Scenario 2, this number is around 3.44%. However, the computation time of M-CFIS-FKG is much lower than that of M-CFIS-R in both scenarios (about 97% lower on average). Moreover, on the new data, the time consumed by M-CFIS-R (i.e. 1.98s for Scenario 1 and 2.44s for Scenario 2) is much higher than that of M-CFIS-FKG. This means that M-CFIS-FKG works well in Scenario 2 thanks to its Approximate Reasoning capability.

Figure 4.9. Experimental results on Diabetes dataset

As shown in Figure 4.9, the accuracy of M-CFIS-FKG is slightly lower than that of M-CFIS-R in both scenarios (i.e. 6.89% and 3.82% on average), and the time consumed by M-CFIS-FKG is still much lower. Especially on the new data in Scenario 2, the computation time of M-CFIS-R is 2.31 times higher than that of M-CFIS-FKG while the accuracy values are the same. Thus, M-CFIS-FKG works better on the new data of Scenario 2.

The experimental results on the Liver dataset are presented in Figure 4.10. In Scenario 1, the accuracy of M-CFIS-FKG is 4.27% lower than that of M-CFIS-R while the computation time decreases by 3.77 times on average. In Scenario 2, M-CFIS-FKG is more effective than M-CFIS-R, with only 2.23% lower accuracy and a computation time decrease of 4.1 times (on average). Moreover, on the new data of Scenario 2, M-CFIS-FKG produces results with nearly the same accuracy (only 1.14% lower) in 33.56% of the run time of M-CFIS-R.


Figure 4.10. Experimental results on Liver dataset

4.4.2.2. Experimental results on multi-label datasets

On the multi-label datasets, data distribution in each data group in these datasets is quite different from

that in two-label datasets. This leads to the changes in experimental results.

Figure 4.11. Experimental results on Wine dataset

As shown in Figure 4.11(a) for the Wine dataset, the accuracy of M-CFIS-FKG is considerably lower than that of M-CFIS-R, except for the new data of Scenario 2. As with the previous datasets, the computation time of M-CFIS-FKG is much lower than that of M-CFIS-R. Especially on the new data of Scenario 2, M-CFIS-FKG is only 0.37% lower in accuracy while its time consumption is 2.88 times lower.

Figure 4.12. Experimental results on CTG dataset

For the CTG dataset, the accuracy and computation time obtained by the two models are given in Figure 4.12, with characteristics similar to those of the other datasets mentioned above. M-CFIS-FKG is better than M-CFIS-R on the new data of Scenario 2 in terms of time consumption.

For the Arrhythmia dataset, the data distribution in Scenario 2 differs from those of the other multi-label datasets. Among the six selected datasets, this is the case in which the accuracy of M-CFIS-FKG is higher than that of M-CFIS-R, while the time consumption is 3.88 times lower. This indicates that M-CFIS-FKG is very effective in discovering new information that does not appear in the training or even the testing process.

Figure 4.13. Experimental results on Arrhythmia dataset

In Scenario 1 of all datasets, the appearance of labels in each data group is similar. In this case, all the labels are involved in the Training, Testing and new data. M-CFIS-FKG has an advantage in reducing time consumption. Besides, the accuracy gap between M-CFIS-FKG and M-CFIS-R on the multi-label datasets is smaller than that on the two-label datasets. This shows that the performance of M-CFIS-FKG on the multi-label datasets is better than that on the two-label ones.

In Scenario 2, especially in the new data, there are labels that are not involved in either the Training or the Testing set. Thus, the results on the new data show the reasoning ability of the new model. In other words, the new model M-CFIS-FKG is able to uncover new information from unfamiliar samples. According to the experimental results on the new data of Scenario 2 on all datasets, M-CFIS-FKG classifies samples as correctly as M-CFIS-R (even more correctly on the Arrhythmia dataset) with a much lower run time (6.45 times lower on average). This shows that M-CFIS-FKG works effectively in reasoning and inference.

The results affirmed that the variances of accuracy in these instances are the same. The same results are obtained for the computational time. This again shows the effectiveness of the new M-CFIS-FKG algorithm in reducing the computational time of inference with acceptable accuracy and approximate reasoning capability for the new datasets.

4.5. Chapter conclusion

In this chapter, we first propose four extensions of M-CFIS-R: Sugeno Complex Fuzzy Inference Systems (S-CFIS-R), Tsukamoto Complex Fuzzy Inference Systems (T-CFIS-R), Complex fuzzy measures in M-CFIS-R, and Complex fuzzy integrals in M-CFIS-R. Specifically, the Complex fuzzy measures and Complex fuzzy integrals are equipped with new notions and theorems for their properties in different contexts. In order to handle the limitations of M-CFIS-R in computational time and in the reasoning capability for new datasets, we construct a Fuzzy Knowledge Graph (FKG) from the Complex Fuzzy Rule Base in the Training stage. Then, FISA is proposed for the Testing phase to select the suitable rules quickly by matching the records in the Testing data with the complex fuzzy rules in the obtained complex rule base.

CONCLUSION

1) Main contributions:

With the aim of developing a Mamdani inference system based on complex fuzzy sets and applying it to decision support system problems, the main contributions of the thesis include:

1) Proposing the Mamdani complex fuzzy inference system and the t-norm and t-conorm operations based on complex fuzzy sets. The components and operations of the Mamdani complex fuzzy inference system model are clearly stated and applied to the decision support system. Experiments with UCI datasets and real data from Gangthep Hospital and Thai Nguyen National Hospital also show that the proposed model is better than the Mamdani fuzzy inference system model on the evaluation indicators Accuracy, Recall and Precision.

2) Proposing the M-CFIS-R model: the thesis proposes complex fuzzy similarity measures and a method to reduce rules in the Mamdani complex fuzzy inference system (M-CFIS) model based on the combination of granular computing with complex fuzzy similarity measures. The experimental results show that the proposed rule reduction method reduces the number of rules in the M-CFIS model and improves the accuracy of the new model compared to the original M-CFIS model.

3) Proposing the M-CFIS-FKG model: the thesis extends the complex fuzzy inference system according to the Sugeno and Tsukamoto models, and proposes complex fuzzy measures and complex fuzzy integrals based on set theory. In addition, the thesis proposes a method of representing fuzzy rules on fuzzy knowledge graphs and constructs the M-CFIS-FKG model, which is considered an improved model of M-CFIS-R in decision making problems. Experiments on the 2-label and multi-label datasets also show the approximate inference ability of the proposed method, especially in cases where the record is not in the Training dataset.

2) Future works:

(1) Continue to research and propose the composition operator on the complex fuzzy set and apply the proposed operators in the decision support system model.

(2) Continue to research and propose learning algorithms such as transfer learning, collaborative learning, etc. in the fuzzy rule reduction process with the goal of optimizing the rule system.

(3) Continue to research and propose methods to represent complex fuzzy rule systems and new inference methods for the purpose of improving the ability to search on fuzzy knowledge graphs.

(4) Test the models proposed in the thesis with more complex datasets in different fields such as health, economy, geography, etc.

THE LIST OF WORKS OF THE AUTHOR RELATED TO THE THESIS

1. Tran Thi Ngan, Luong Thi Hong Lan, Mumtaz Ali, Dan Tamir, Le Hoang Son, Tran Manh Tuan, Naphtali Rishe, Abe Kandel (2018), “Logic Connectives of Complex Fuzzy Sets”, Romanian Journal of Information Science and Technology, Vol. 21, No. 4, pp. 344-358 (ISSN: 1453-8245, SCIE, 2020 IF = 0.760), DOI = http://www.romjist.ro/abstract-606.html.

2. Ganeshsree Selvachandran, Shio Gai Quek, Luong Thi Hong Lan, Le Hoang Son, Nguyen Long Giang, Weiping Ding, Mohamed Abdel-Basset, Victor Hugo C. de Albuquerque (2021), “A New Design of Mamdani Complex Fuzzy Inference System for Multi-attribute Decision Making Problems”, IEEE Transactions on Fuzzy Systems, Vol. 29, No. 4, pp. 716-730 (ISSN: 1063-6706, SCI, 2019 IF = 9.518), DOI = http://dx.doi.org/10.1109/TFUZZ.2019.2961350.

3. Tran Manh Tuan, Luong Thi Hong Lan, Shuo-Yan Chou, Tran Thi Ngan, Le Hoang Son, Nguyen Long Giang, Mumtaz Ali (2020), “M-CFIS-R: Mamdani Complex Fuzzy Inference System with Rule Reduction Using Complex Fuzzy Measures in Granular Computing”, Mathematics, Vol. 8, No. 5, pp. 707-731 (ISSN: 2227-7390, SCIE, 2019 IF = 1.747), DOI = https://doi.org/10.3390/math8050707.

4. Luong Thi Hong Lan, Tran Manh Tuan, Tran Thi Ngan, Le Hoang Son, Nguyen Long Giang, Vo Truong Nhu Ngoc, Pham Van Hai (2020), “A New Complex Fuzzy Inference System with Fuzzy Knowledge Graph and Extensions in Decision Making”, IEEE Access, Vol. 8, pp. 164899-164921 (ISSN: 2169-3536, SCIE, 2019 IF = 3.745), DOI = http://dx.doi.org/10.1109/ACCESS.2020.3021097.