
Book online «Data Mining by Mehmed Kantardzic (inspirational novels TXT) 📗». Author Mehmed Kantardzic



examples.

Figure 14.13. Multifactorial-evaluation model.

14.5.1 A Cloth-Selection Problem

Assume that the basic factors of interest in the selection of cloth are f1 = style, f2 = quality, and f3 = price, that is, F = {f1, f2, f3}. The verbal grades used for the selection are e1 = best, e2 = good, e3 = fair, and e4 = poor, that is, E = {e1, e2, e3, e4}. For a particular piece of cloth u, the single-factor evaluation may be carried out through a survey of professionals or customers. For example, if the survey results of the “style” factor f1 are 60% for the best, 20% for the good, 10% for the fair, and 10% for the poor, then the single-factor evaluation vector R1(u) is

R1(u) = {0.6, 0.2, 0.1, 0.1}

Similarly, we can obtain the following single-factor evaluation vectors for f2 and f3:

Based on single-factor evaluations, we can build the following evaluation matrix:

If a customer’s weight vector with respect to the three factors is

then it is possible to apply the multifactorial-evaluation model to compute the evaluation for a piece of cloth u. “Multiplication” of matrices W(u) and R(u) is based on the max–min composition of fuzzy relations, where the resulting evaluation is in the form of a fuzzy set D(u) = [d1, d2, d3, d4]:

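where, for example, d1 is calculated through the following steps:

d1 = (w1 ∧ r11) ∨ (w2 ∧ r21) ∨ (w3 ∧ r31)

Here wi is the ith component of W(u), and ri1 is the entry of R(u) for factor fi and grade e1.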

The values for d2, d3, and d4 are found similarly, where ∧ and ∨ represent the operations min and max, respectively. Because the largest components of D(u) are d1 = 0.4 and d2 = 0.4 at the same time, the analyzed piece of cloth receives a rating somewhere between “best” and “good.”
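The max–min composition described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own listing. Only R1(u) follows from the survey percentages given in the text. The rows for f2 and f3 and the weight vector W are assumed values, chosen so that the result agrees with the stated outcome d1 = d2 = 0.4. The function name `max_min_composition` is hypothetical.

```python
def max_min_composition(weights, matrix):
    """Return D = W o R, where d_j = max over i of min(w_i, r_ij)."""
    if len(weights) != len(matrix):
        raise ValueError("need one weight per factor (row of the matrix)")
    n_grades = len(matrix[0])
    return [
        max(min(w, row[j]) for w, row in zip(weights, matrix))
        for j in range(n_grades)
    ]


grades = ["best", "good", "fair", "poor"]

# Row 1 (style) comes from the survey percentages in the text.
# Rows 2 and 3 (quality, price) are ASSUMED for illustration only.
R = [
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.5, 0.3, 0.1],
    [0.1, 0.3, 0.4, 0.2],
]

# ASSUMED customer weights for style, quality, and price.
W = [0.4, 0.4, 0.2]

D = max_min_composition(W, R)

# Grades that share the largest membership value in D.
top = [g for g, d in zip(grades, D) if d == max(D)]

print(D)    # [0.4, 0.4, 0.3, 0.2]
print(top)  # ['best', 'good']
```

With these assumed inputs, the first component is (0.4 ∧ 0.6) ∨ (0.4 ∧ 0.1) ∨ (0.2 ∧ 0.1) = 0.4, and the tie between the first two components reproduces the rating between “best” and “good.”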

14.5.2 A Problem of Evaluating Teaching

Assume that the basic factors that influence students’ evaluation of teaching are f1 = clarity and understandability, f2 = proficiency in teaching, f3 = liveliness and stimulation, and f4 = writing neatness or clarity, that is, F = {f1, f2, f3, f4}. Let E = {e1, e2, e3, e4} = {excellent, very good, good, poor} be the verbal grade set. We evaluate a teacher u. By selecting an appropriate group of students and faculty, we can have them respond with their ratings on each factor and then obtain the single-factor evaluation. As in the previous example, we can combine the single-factor evaluation into an evaluation matrix. Suppose that the final matrix R(u) is

For a specific weight vector W(u) = {0.2, 0.3, 0.4, 0.1}, describing the importance of the teaching-evaluation factor fi and using the multifactorial-evaluation model, it is easy to find

Analyzing the evaluation results D(u), because d2 = 0.4 is a maximum, we may conclude that teacher u should be rated as “very good.”

14.6 EXTRACTING FUZZY MODELS FROM DATA

In the context of different data-mining analyses, it is of great interest to see how fuzzy models can automatically be derived from a data set. Besides prediction, classification, and all other data-mining tasks, understandability is of prime concern, because the resulting fuzzy model should offer an insight into the underlying system. To achieve this goal, different approaches exist. Let us explain a common technique that constructs grid-based rule sets using a global granulation of the input and output spaces.

Grid-based rule sets usually model each input variable through a small set of linguistic values. The resulting rule base uses all, or a subset, of the possible combinations of these linguistic values across the variables, which results in a global granulation of the feature space into rectangular regions. Figure 14.14 illustrates this approach in two dimensions, with three linguistic values (low, medium, high) for the first dimension x1 and two linguistic values (young, old) for the second dimension x2.

Figure 14.14. A global granulation for a two-dimensional space using three membership functions for x1 and two for x2.
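As a small sketch of what this global granulation means for the rule base, the following Python snippet enumerates every combination of the linguistic values named for Figure 14.14. Each combination corresponds to one rectangular region and therefore to one candidate rule antecedent. The helper name `grid_antecedents` is hypothetical, and the consequents are deliberately left open.

```python
from itertools import product

# Linguistic values per dimension, as named in the text for Figure 14.14.
x1_values = ["low", "medium", "high"]
x2_values = ["young", "old"]


def grid_antecedents(*linguistic_values):
    """Return every combination of linguistic values, one per grid cell."""
    return list(product(*linguistic_values))


antecedents = grid_antecedents(x1_values, x2_values)

for v1, v2 in antecedents:
    # The consequent is not fixed by the grid; it must be found from data.
    print(f"IF x1 is {v1} AND x2 is {v2} THEN ...")

print(len(antecedents))  # 6
```

The number of cells is the product of the numbers of linguistic values per dimension, so the grid grows quickly as dimensions are added. This is one reason a rule base may use only a subset of the combinations.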

Extracting grid-based fuzzy models from data is straightforward when the input granulation is fixed, that is, the antecedents of all rules are predefined. Then, only a matching consequent for each rule needs to be found. This approach, with fixed grids, is usually called the Mamdani model. After predefinition of the granulation of all input variables and also the output variable, one sweeps through the entire data set and determines the closest example to the geometrical center of each rule, assigning the closest fuzzy value output to the corresponding rule. Using graphical interpretation in a 2-D space, the global steps of the procedure are illustrated through an example in which only one input x and one output dimension y exist. The formal analytical specification, even with more than one input/output dimension, is very easy to establish.
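One possible reading of this sweep, for a single input x and a single output y, is sketched below in Python. It assumes equidistant triangular membership functions and the same four linguistic values for input and output. The data points, the value ranges, and the names `centers`, `triangular`, and `extract_rules` are invented for illustration, so the rules it prints are not those of the book's figures.

```python
def centers(lo, hi, n):
    """Peak positions of n equidistant triangular MFs covering [lo, hi]."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]


def triangular(v, center, width):
    """Membership of v in a triangle peaking at center with half-base width."""
    return max(0.0, 1.0 - abs(v - center) / width)


def extract_rules(data, x_range, y_range, labels):
    """Map each input linguistic value to one output linguistic value."""
    n = len(labels)
    x_centers = centers(x_range[0], x_range[1], n)
    y_centers = centers(y_range[0], y_range[1], n)
    y_width = (y_range[1] - y_range[0]) / (n - 1)
    rules = {}
    for label, cx in zip(labels, x_centers):
        # Example whose input lies closest to the center of this antecedent.
        _, y = min(data, key=lambda p: abs(p[0] - cx))
        # Output linguistic value with the highest membership for that y.
        best = max(range(n), key=lambda k: triangular(y, y_centers[k], y_width))
        rules[label] = labels[best]
    return rules


labels = ["low", "below average", "above average", "high"]

# HYPOTHETICAL (x, y) samples; not taken from the book.
data = [(0.1, 2.2), (0.9, 1.8), (1.4, 0.3), (2.1, 0.9), (2.9, 0.2)]

rules = extract_rules(data, x_range=(0.0, 3.0), y_range=(0.0, 3.0), labels=labels)

for antecedent, consequent in rules.items():
    print(f"IF x is {antecedent}, THEN y is {consequent}.")
```

Because the antecedents are fixed in advance, the loop only has to pick a consequent per rule. The sample (1.4, 0.3) is ignored because it is not the closest example to any antecedent center, which shows how this scheme discards data away from the grid centers.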

1. Granulate the Input and Output Space. Divide each variable xi into ni equidistant triangular membership functions (MFs). In our example, both input x and output y are granulated using the same four linguistic values: low, below average, above average, and high. A representation of the input–output granulated space is given in Figure 14.15.

2. Analyze the Entire Data Set in the Granulated Space. First, enter a data set in the granulated space and then find the points that lie closest to the centers of the granulated regions. Mark these points and the centers of the region. In our example, after entering all discrete data, the selected center points (closest to the data) are additionally marked with x, as in Figure 14.16.

3. Generate Fuzzy Rules from Given Data. The representative data points directly select regions in the granulated space, and these regions may be described with the corresponding fuzzy rules. In our example, four regions are selected, one for each fuzzy input linguistic value, and they are represented in Figure 14.17 with a corresponding crisp approximation (a thick line through the middle of the regions). These regions are the graphical representation of fuzzy rules. The same rules may be expressed linguistically as a set of IF-THEN constructions:

R1: IF x is low, THEN y is above average.
R2: IF x is below average, THEN y is above average.
R3: IF x is above average, THEN y is
