Title: | Subgroup Discovery with Evolutionary Fuzzy Systems |
---|---|
Description: | Implementation of evolutionary fuzzy systems for the data mining task called "subgroup discovery". In particular, the algorithms presented in this package are: M. J. del Jesus, P. Gonzalez, F. Herrera, M. Mesonero (2007) <doi:10.1109/TFUZZ.2006.890662> M. J. del Jesus, P. Gonzalez, F. Herrera (2007) <doi:10.1109/MCDM.2007.369416> C. J. Carmona, P. Gonzalez, M. J. del Jesus, F. Herrera (2010) <doi:10.1109/TFUZZ.2010.2060200> C. J. Carmona, V. Ruiz-Rodado, M. J. del Jesus, A. Weber, M. Grootveld, P. González, D. Elizondo (2015) <doi:10.1016/j.ins.2014.11.030> It also provides a Shiny app to ease the analysis. The algorithms work with data sets provided in KEEL, ARFF and CSV formats and also with data.frame objects. |
Authors: | Angel M. Garcia [aut, cre], Pedro Gonzalez [aut, cph], Cristobal J. Carmona [aut, cph], Francisco Charte [ctb], Maria J. del Jesus [aut, cph] |
Maintainer: | Angel M. Garcia <[email protected]> |
License: | LGPL (>= 3) |
Version: | 0.7.22 |
Built: | 2024-10-26 03:55:42 UTC |
Source: | https://github.com/cran/SDEFSR |
Filter rules in a SDEFSR_Rules object, returning a new SDEFSR_Rules object
Generates a new SDEFSR_Rules
object containing the rules that pass the specified filter.
## S3 method for class 'SDEFSR_Rules' SDEFSR_RulesObject[condition = T]
SDEFSR_RulesObject |
The SDEFSR_Rules object to filter |
condition |
Expression to filter the SDEFSR_Rules object |
This function allows filtering the rule set by a given quality measure. The quality measures
that are available are: nVars, Coverage, Unusualness, Significance, FuzzySupport,
Support, FuzzyConfidence, Confidence, TPr and FPr.
library(SDEFSR)
# Apply filter by unusualness
habermanRules[Unusualness > 0.05]
# Also, you can make the filter as complex as you want
# Filter by Unusualness and TPr
habermanRules[Unusualness > 0.05 & TPr > 0.9]
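As a reference for the measures listed above, unusualness is commonly defined as the weighted relative accuracy (WRAcc). The following base-R sketch is illustrative only; the function name and the counts-based formulation are assumptions, not code from this package:

```r
# Illustrative sketch (not the package's internal code): unusualness as
# weighted relative accuracy, WRAcc = p(Cond) * (p(Class|Cond) - p(Class)).
# nCond: examples covered by the rule; nCondClass: covered examples of the
# target class; nClass: examples of the target class; N: total examples.
unusualness <- function(nCond, nCondClass, nClass, N) {
  (nCond / N) * (nCondClass / nCond - nClass / N)
}

unusualness(10, 8, 50, 100)  # 0.1 * (0.8 - 0.5) = 0.03
```

A positive value means the rule concentrates the target class more than the whole data set does.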
S3 function to convert into a data.frame the SDEFSR dataset
This function converts a SDEFSR_Dataset object into a data.frame
## S3 method for class 'SDEFSR_Dataset' as.data.frame(x, ...)
x |
The SDEFSR_Dataset object to convert |
... |
Additional arguments passed to the as.data.frame function |
Internally, a SDEFSR_Dataset
object has a list of examples
that are coded numerically. This function decodes these examples and converts the list into a data.frame.
A data.frame with the dataset decoded. Numeric attributes are of class "numeric", while categorical attributes are of class "factor".
@examples
as.data.frame(habermanTra)
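The numeric-to-factor decoding described above can be illustrated with plain base R. This is an assumption about the general idea, not the package's actual coding scheme:

```r
# Illustrative only: categorical values stored as integer codes are mapped
# back to their attribute value names and returned as a factor.
codes <- c(1, 3, 2, 1)                   # hypothetical internal coding
values <- c("low", "medium", "high")     # hypothetical attribute values
decoded <- factor(values[codes], levels = values)
decoded  # low high medium low
```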
Training data for the car dataset
A SDEFSR_Dataset class with 1382 instances, 6 variables (without the target variable) and 4 values for the target variable. Three labels for each variable are defined.
Car Evaluation Database was derived from a simple hierarchical decision model. The model evaluates cars according to six input attributes: buying, maint, doors, persons, lug_boot, safety.
M. Bohanec and V. Rajkovic: Knowledge acquisition and explanation for multi-attribute decision making. In 8th Intl Workshop on Expert Systems and their Applications, Avignon, France. pages 59-78, 1988.
B. Zupan, M. Bohanec, I. Bratko, J. Demsar: Machine learning by function decomposition. ICML-97, Nashville, TN. 1997 (to appear).
carTra$data
carTra$attributeNames
Test data for the car dataset
A SDEFSR_Dataset class with 346 instances, 6 variables (without the target variable) and 4 values for the target variable. Three labels for each variable are defined.
Car Evaluation Database was derived from a simple hierarchical decision model. The model evaluates cars according to six input attributes: buying, maint, doors, persons, lug_boot, safety.
M. Bohanec and V. Rajkovic: Knowledge acquisition and explanation for multi-attribute decision making. In 8th Intl Workshop on Expert Systems and their Applications, Avignon, France. pages 59-78, 1988.
B. Zupan, M. Bohanec, I. Bratko, J. Demsar: Machine learning by function decomposition. ICML-97, Nashville, TN. 1997 (to appear).
carTst$data
carTst$attributeNames
Performs a subgroup discovery task using the FuGePSD algorithm.
FUGEPSD(
  paramFile = NULL,
  training = NULL,
  test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
  seed = 0,
  nLabels = 3,
  t_norm = "product_t-norm",
  ruleWeight = "Certainty_Factor",
  frm = "Normalized_Sum",
  numGenerations = 300,
  numberOfInitialRules = 100,
  crossProb = 0.5,
  mutProb = 0.2,
  insProb = 0.15,
  dropProb = 0.15,
  tournamentSize = 2,
  globalFitnessWeights = c(0.7, 0.1, 0.05, 0.2),
  minCnf = 0.6,
  ALL_CLASS = TRUE,
  targetVariable = NA
)
paramFile |
The path of the parameters file. |
training |
A SDEFSR_Dataset object with the training data |
test |
A SDEFSR_Dataset object with the test data |
output |
Character vector with the paths where the information file, rules file and test quality measures file are stored, respectively. For the rules and quality measures files, the algorithm generates four files, one for each fuzzy confidence filter. |
seed |
An integer to set the seed used to generate random numbers. |
nLabels |
Number of linguistic labels for numerical variables. By default 3. We recommend an odd number between 3 and 9. |
t_norm |
A string with the t-norm to use when computing the compatibility degree of the rules. Use |
ruleWeight |
String with the method to calculate the rule weight. Possible values are:
|
frm |
A string specifying the Fuzzy Reasoning Method to use. Possible values are:
|
numGenerations |
An integer to set the number of generations to perform before stopping the evolutionary process. |
numberOfInitialRules |
An integer to set the number of individuals or rules in the initial population. |
crossProb |
Sets the crossover probability. A number in [0,1]. |
mutProb |
Sets the mutation probability. A number in [0,1]. |
insProb |
Sets the insertion probability. A number in [0,1]. |
dropProb |
Sets the dropping probability. A number in [0,1]. |
tournamentSize |
Sets the number of individuals that will be chosen in the tournament selection procedure. This number must be greater than or equal to 2. |
globalFitnessWeights |
A numeric vector of length 4 specifying the weights used in the computation of the Global Fitness Parameter. |
minCnf |
A value in [0,1] setting the minimum confidence a rule must have in order to be returned |
ALL_CLASS |
If TRUE, the algorithm returns, at least, the best rule for each target class, even if it does not pass the filters. If FALSE, it returns the best rule only when no rules pass the filters. |
targetVariable |
The name or index position of the target variable (or class). It must be a categorical one. |
By default, this function sets as target variable the last one that appears in the SDEFSR_Dataset
object. If you want to change it, use the targetVariable
argument.
The target variable MUST be categorical; if it is not, an error is thrown. Also, the default behaviour is to find
rules for all possible values of the target variable. targetClass
sets a single value of the target variable so that the
algorithm only finds rules about that value.
If you specify in paramFile
something other than NULL,
the rest of the parameters are
ignored and the algorithm tries to read the specified file. See "Parameters file structure" below
if you want to use a parameters file.
@return The algorithm shows in console the following results:
Information about the parameters used in the algorithm.
Results for each filter:
Rules generated that pass the filter.
The test quality measures for each rule in that filter.
Also, these results are saved in one file with the rules and another with the quality measures, one file per filter.
@section How does this algorithm work?: This algorithm performs an EFS based on a genetic programming algorithm. It starts with an initial population generated randomly, where individuals are represented through the "chromosome = individual" approach, including both the antecedent and the consequent of the rule. Representing the consequent has the advantage of obtaining rules for every target class with only one execution of the algorithm.
The algorithm employs a cooperative-competitive approach where the rules of the population cooperate and compete with each other in order to obtain the optimal solution. Hence, the algorithm performs two evaluations: one of individual rules, for competition, and another of the whole population, for cooperation.
The algorithm evolves by generating an offspring population of the same size as the initial one through the application of the genetic operators over the main population. Once applied, both populations are joined and a token competition is performed in order to maintain the diversity of the generated rules. This token competition also reduces the population size by deleting those rules that are not competitive.
After the evolutionary process, a screening function is applied over the best population. This function keeps the rules that reach a minimum level of confidence and sensitivity: the level is 0.6 for sensitivity, and four filters of 0.6, 0.7, 0.8 and 0.9 are applied for fuzzy confidence.
Also, the user can force the algorithm to return at least one rule for every target class value, even if it does not pass the screening function. This behaviour is controlled by the ALL_CLASS parameter.
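The token competition step described above can be sketched as follows. This is an illustrative implementation under the usual assumptions (rules visited in decreasing fitness order, each example a token seized by the first rule that covers it), not the package's actual code; the names `tokenCompetition` and `coverageMatrix` are hypothetical:

```r
# coverageMatrix: logical matrix, one row per rule, one column per example;
# fitness: numeric vector with one fitness value per rule.
tokenCompetition <- function(coverageMatrix, fitness) {
  taken <- rep(FALSE, ncol(coverageMatrix))
  keep <- logical(length(fitness))
  for (r in order(fitness, decreasing = TRUE)) {
    seized <- coverageMatrix[r, ] & !taken
    if (any(seized)) {  # the rule survives only if it seizes a free token
      keep[r] <- TRUE
      taken <- taken | seized
    }
  }
  keep
}

cover <- rbind(c(TRUE, TRUE, FALSE),   # rule 1 covers examples 1 and 2
               c(TRUE, FALSE, FALSE),  # rule 2 covers example 1 only
               c(FALSE, FALSE, TRUE))  # rule 3 covers example 3
tokenCompetition(cover, fitness = c(0.9, 0.8, 0.5))  # TRUE FALSE TRUE
```

Rule 2 is dropped because its only example was already seized by the fitter rule 1, which is how the competition keeps only diverse, competitive rules.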
@section Parameters file structure:
The paramFile
argument points to a file containing the necessary parameters to execute FuGePSD.
This file must contain, at least, these parameters (separated by a carriage return):
algorithm
Specify the algorithm to execute. In this case, "FUGEPSD"
inputData
Specify two paths to KEEL files, for training and test respectively. If only the file name is specified, the path is assumed to be the working directory.
seed
Sets the seed for the random number generator
nLabels
Sets the number of fuzzy labels to create when reading the files
nEval
Sets the maximum number of rule evaluations before stopping the genetic process
popLength
Sets the number of individuals of the main population
eliteLength
Sets the number of individuals of the elite population. Must be less than popLength
crossProb
Crossover probability of the genetic algorithm. Value in [0,1]
mutProb
Mutation probability of the genetic algorithm. Value in [0,1]
Obj1
Sets the objective number 1.
Obj2
Sets the objective number 2.
Obj3
Sets the objective number 3.
Obj4
Sets the objective number 4.
RulesRep
Representation of each chromosome of the population. "can" for canonical representation. "dnf" for DNF representation.
targetClass
Value of the target variable to search subgroups for. The target variable is always the last variable. Use null
to search for every value of the target variable
An example of parameter file could be:
algorithm = FUGEPSD
inputData = "banana-5-1tra.dat" "banana-5-1tst.dat"
outputData = "Parameters_INFO.txt" "Rules.txt" "TestMeasures.txt"
seed = 23783
Number of Labels = 3
T-norm/T-conorm for the Computation of the Compatibility Degree = Normalized_Sum
Rule Weight = Certainty_Factor
Fuzzy Reasoning Method = Normalized_Sum
Number of Generations = 300
Initial Number of Fuzzy Rules = 100
Crossover probability = 0.5
Mutation probability = 0.2
Insertion probability = 0.15
Dropping Condition probability = 0.15
Tournament Selection Size = 2
Global Fitness Weight 1 = 0.7
Global Fitness Weight 2 = 0.1
Global Fitness Weight 3 = 0.05
Global Fitness Weight 4 = 0.2
All Class = true
Written on R by Angel M. Garcia <[email protected]>
A fuzzy genetic programming-based algorithm for subgroup discovery and the application to one problem of pathogenesis of acute sore throat conditions in humans, Carmona, C.J., Ruiz-Rodado V., del Jesus M.J., Weber A., Grootveld M., Gonzalez P., and Elizondo D. , Information Sciences, Volume 298, p.180-197, (2015)
FUGEPSD(training = habermanTra,
        test = habermanTst,
        output = c(NA, NA, NA),
        seed = 23783,
        nLabels = 3,
        t_norm = "Minimum/Maximum",
        ruleWeight = "Certainty_Factor",
        frm = "Normalized_Sum",
        numGenerations = 20,
        numberOfInitialRules = 15,
        crossProb = 0.5,
        mutProb = 0.2,
        insProb = 0.15,
        dropProb = 0.15,
        tournamentSize = 2,
        globalFitnessWeights = c(0.7, 0.1, 0.3, 0.2),
        ALL_CLASS = TRUE)

## Not run:
# Execution with a parameters file called 'ParamFile.txt' in the working directory:
FUGEPSD("ParamFile.txt")
## End(Not run)
Training data for the german dataset
A SDEFSR_Dataset class with 800 instances, 20 variables (without the target variable) and 2 values for the target class.
A numerical version of the Statlog German Credit Data data set. Here, the task is to classify customers as good (1) or bad (2), depending on 20 features about them and their bank accounts.
https://sci2s.ugr.es/keel/dataset.php?cod=88
germanTra$data
Test data for the german dataset
A SDEFSR_Dataset class with 200 instances, 20 variables (without the target variable) and 2 values for the target class.
A numerical version of the Statlog German Credit Data data set. Here, the task is to classify customers as good (1) or bad (2), depending on 20 features about them and their bank accounts.
https://sci2s.ugr.es/keel/dataset.php?cod=88
germanTst$data
Rules generated by the SDIGA algorithm with the default parameters for the haberman
dataset.
The rule set contains only two rules, one for each value of the target variable.
Haberman, S. J. (1976). Generalized Residuals for Log-Linear Models, Proceedings of the 9th International Biometrics Conference, Boston, pp. 104-122.
Landwehr, J. M., Pregibon, D., and Shoemaker, A. C. (1984), Graphical Models for Assessing Logistic Regression Models (with discussion), Journal of the American Statistical Association 79: 61-83.
Lo, W.-D. (1993). Logistic Regression Trees, PhD thesis, Department of Statistics, University of Wisconsin, Madison, WI.
habermanRules
Training data for the Haberman dataset.
A SDEFSR_Dataset class with 306 instances, 3 variables (without the target variable) and 2 values for the target variable. Three fuzzy labels for each numerical variable are defined.
This data set contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer. The task is to determine if the patient survived 5 years or longer (positive) or if the patient died within 5 years (negative).
Haberman, S. J. (1976). Generalized Residuals for Log-Linear Models, Proceedings of the 9th International Biometrics Conference, Boston, pp. 104-122.
Landwehr, J. M., Pregibon, D., and Shoemaker, A. C. (1984), Graphical Models for Assessing Logistic Regression Models (with discussion), Journal of the American Statistical Association 79: 61-83.
Lo, W.-D. (1993). Logistic Regression Trees, PhD thesis, Department of Statistics, University of Wisconsin, Madison, WI.
habermanTra$data
habermanTra$fuzzySets
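The three uniformly distributed triangular fuzzy labels mentioned above can be sketched in base R. This is an assumption about the usual uniform triangular partition, not the package's internal code, and the function name is hypothetical:

```r
# Membership degree of value x in label i out of nLabels triangular fuzzy
# sets spread uniformly over the variable's range [lo, hi].
triangularMembership <- function(x, i, nLabels, lo, hi) {
  halfWidth <- (hi - lo) / (nLabels - 1)   # distance between label centers
  center <- lo + (i - 1) * halfWidth
  max(0, 1 - abs(x - center) / halfWidth)
}

# With nLabels = 3 over [0, 1], the label centers sit at 0, 0.5 and 1:
triangularMembership(0.5, 2, 3, 0, 1)   # 1    (exactly at the center)
triangularMembership(0.25, 1, 3, 0, 1)  # 0.5  (halfway between labels 1 and 2)
```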
Test data for the Haberman dataset.
A SDEFSR_Dataset class with 62 instances, 3 variables (without the target variable) and 2 values for the target variable. Three fuzzy labels for each numerical variable are defined.
This data set contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer. The task is to determine if the patient survived 5 years or longer (positive) or if the patient died within 5 years (negative).
Haberman, S. J. (1976). Generalized Residuals for Log-Linear Models, Proceedings of the 9th International Biometrics Conference, Boston, pp. 104-122.
Landwehr, J. M., Pregibon, D., and Shoemaker, A. C. (1984), Graphical Models for Assessing Logistic Regression Models (with discussion), Journal of the American Statistical Association 79: 61-83.
Lo, W.-D. (1993). Logistic Regression Trees, PhD thesis, Department of Statistics, University of Wisconsin, Madison, WI.
habermanTst$data
habermanTst$fuzzySets
Performs a subgroup discovery task executing the MESDIF algorithm.
MESDIF(
  paramFile = NULL,
  training = NULL,
  test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
  seed = 0,
  nLabels = 3,
  nEval = 10000,
  popLength = 100,
  eliteLength = 3,
  crossProb = 0.6,
  mutProb = 0.01,
  RulesRep = "can",
  Obj1 = "CSUP",
  Obj2 = "CCNF",
  Obj3 = "null",
  Obj4 = "null",
  targetVariable = NA,
  targetClass = "null"
)
paramFile |
The path of the parameters file. |
training |
A SDEFSR_Dataset object with the training data |
test |
A SDEFSR_Dataset object with the test data |
output |
Character vector with the paths where the information file, rules file and quality measures file are stored, respectively. |
seed |
An integer to set the seed used to generate random numbers. |
nLabels |
Number of linguistic labels that represent numerical variables. |
nEval |
An integer to set the maximum number of evaluations in the evolutionary process. Large values of this parameter increase the computing time. |
popLength |
An integer to set the number of individuals in the population. |
eliteLength |
An integer to set the number of individuals in the elite population. |
crossProb |
Sets the crossover probability. A number in [0,1]. |
mutProb |
Sets the mutation probability. A number in [0,1]. |
RulesRep |
Representation used in the rules. "can" for canonical rules, "dnf" for DNF rules. |
Obj1 |
Sets the Objective number 1. See |
Obj2 |
Sets the Objective number 2. See |
Obj3 |
Sets the Objective number 3. See |
Obj4 |
Sets the Objective number 4. See |
targetVariable |
The name or index position of the target variable (or class). It must be a categorical one. |
targetClass |
A string specifying the value of the target variable. |
By default, this function sets as target variable the last one that appears in the SDEFSR_Dataset
object. If you want to change it, use the targetVariable
argument.
The target variable MUST be categorical; if it is not, an error is thrown. Also, the default behaviour is to find
rules for all possible values of the target variable. targetClass
sets a single value of the target variable so that the
algorithm only finds rules about that value.
If you specify in paramFile
something other than NULL,
the rest of the parameters are
ignored and the algorithm tries to read the specified file. See "Parameters file structure" below
if you want to use a parameters file.
The algorithm shows in the console the following results:
The parameters used in the algorithm
The rules generated.
The test quality measures of every rule and the global results. The global results show the number of rules generated and the mean value of each quality measure.
Also, the algorithm saves those results in the files specified in the output
parameter of the algorithm or
in the outputData
parameter of the parameters file.
Additionally, a SDEFSR_Rules
object is returned with this information.
This algorithm performs a multi-objective genetic algorithm based on elitism (following the SPEA2 approach). The elite population has a fixed size and is filled with non-dominated individuals.
An individual is non-dominated when (! all(ObjI1 <= ObjI2) & any(ObjI1 < ObjI2)),
where ObjI1
is the vector of objective values of our individual and ObjI2 is the vector of objective values of another individual.
The number of individuals dominated by each one determines, together with a niching technique that considers
the proximity among objective values, a fitness value for the selection.
The number of non-dominated individuals might be greater or smaller than the elite population size; in those cases, MESDIF applies a truncation operator or a fill operator, respectively. Then, genetic operators are applied.
At the end of the evolutionary process the algorithm returns the rules stored in the elite population. Therefore, the number of rules is fixed by the eliteLength
parameter.
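The dominance test quoted above can be written as a small helper. This is an illustrative sketch assuming all objectives are maximised, not the package's actual code:

```r
# TRUE when the individual with objectives obj1 dominates the one with obj2:
# no worse in every objective and strictly better in at least one.
dominates <- function(obj1, obj2) {
  all(obj1 >= obj2) && any(obj1 > obj2)
}

dominates(c(0.9, 0.5), c(0.8, 0.5))  # TRUE
dominates(c(0.9, 0.4), c(0.8, 0.5))  # FALSE: worse in the second objective
```

An individual is then non-dominated when no other individual in the population dominates it.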
The paramFile
argument points to a file containing the necessary parameters for MESDIF to work.
This file must contain, at least, these parameters (separated by a carriage return):
algorithm
Specify the algorithm to execute. In this case. "MESDIF"
inputData
Specify two paths to KEEL files, for training and test respectively. If only the file name is specified, the path is assumed to be the working directory.
seed
Sets the seed for the random number generator
nLabels
Sets the number of fuzzy labels to create when reading the files
nEval
Sets the maximum number of rule evaluations before stopping the genetic process
popLength
Sets the number of individuals of the main population
eliteLength
Sets the number of individuals of the elite population. Must be less than popLength
crossProb
Crossover probability of the genetic algorithm. Value in [0,1]
mutProb
Mutation probability of the genetic algorithm. Value in [0,1]
Obj1
Sets the objective number 1.
Obj2
Sets the objective number 2.
Obj3
Sets the objective number 3.
Obj4
Sets the objective number 4.
RulesRep
Representation of each chromosome of the population. "can" for canonical representation. "dnf" for DNF representation.
targetVariable
The name or index position of the target variable (or class). It must be a categorical one.
targetClass
Value of the target variable to search subgroups for. The target variable is always the last variable. Use null
to search for every value of the target variable
An example of parameter file could be:
algorithm = MESDIF
inputData = "irisd-10-1tra.dat" "irisd-10-1tst.dat"
outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
seed = 0
nLabels = 3
nEval = 500
popLength = 100
eliteLength = 3
crossProb = 0.6
mutProb = 0.01
RulesRep = can
Obj1 = comp
Obj2 = unus
Obj3 = null
Obj4 = null
targetClass = Iris-setosa
@section Objective values: You can use the following quality measures in the ObjX values of the parameters file, using these codes:
Unusualness -> unus
Crisp Support -> csup
Crisp Confidence -> ccnf
Fuzzy Support -> fsup
Fuzzy Confidence -> fcnf
Coverage -> cove
Significance -> sign
If you don't want to use an objective value, you must specify null
Berlanga, F., Del Jesus, M., Gonzalez, P., Herrera, F., & Mesonero, M. (2006). Multiobjective Evolutionary Induction of Subgroup Discovery Fuzzy Rules: A Case Study in Marketing.
Zitzler, E., Laumanns, M., & Thiele, L. (2001). SPEA2: Improving the Strength Pareto Evolutionary Algorithm.
MESDIF(paramFile = NULL,
       training = habermanTra,
       test = habermanTst,
       output = c(NA, NA, NA),
       seed = 0,
       nLabels = 3,
       nEval = 300,
       popLength = 100,
       eliteLength = 3,
       crossProb = 0.6,
       mutProb = 0.01,
       RulesRep = "can",
       Obj1 = "CSUP",
       Obj2 = "CCNF",
       Obj3 = "null",
       Obj4 = "null",
       targetClass = "positive")

## Not run:
# Execution for all classes, see 'targetClass' parameter
MESDIF(paramFile = NULL,
       training = habermanTra,
       test = habermanTst,
       output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
       seed = 0,
       nLabels = 3,
       nEval = 300,
       popLength = 100,
       eliteLength = 3,
       crossProb = 0.6,
       mutProb = 0.01,
       RulesRep = "can",
       Obj1 = "CSUP",
       Obj2 = "CCNF",
       Obj3 = "null",
       Obj4 = "null",
       targetClass = "null")
## End(Not run)
Performs a subgroup discovery task executing the NMEEF-SD algorithm
NMEEF_SD(
  paramFile = NULL,
  training = NULL,
  test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
  seed = 0,
  nLabels = 3,
  nEval = 10000,
  popLength = 100,
  mutProb = 0.1,
  crossProb = 0.6,
  Obj1 = "CSUP",
  Obj2 = "CCNF",
  Obj3 = "null",
  minCnf = 0.6,
  reInitCoverage = "yes",
  porcCob = 0.5,
  StrictDominance = "yes",
  targetVariable = NA,
  targetClass = "null"
)
paramFile |
The path of the parameters file. |
training |
A SDEFSR_Dataset object with the training data |
test |
A SDEFSR_Dataset object with the test data |
output |
Character vector with the paths where the information file, rules file and test quality measures file are stored, respectively. |
seed |
An integer to set the seed used to generate random numbers. |
nLabels |
Number of linguistic labels for numerical variables. |
nEval |
An integer to set the maximum number of evaluations in the evolutionary process. |
popLength |
An integer to set the number of individuals in the population. |
mutProb |
Sets the mutation probability. A number in [0,1]. |
crossProb |
Sets the crossover probability. A number in [0,1]. |
Obj1 |
Sets the Objective number 1. See |
Obj2 |
Sets the Objective number 2. See |
Obj3 |
Sets the Objective number 3. See |
minCnf |
Sets the minimum confidence a rule in the Pareto front must have in order to be returned. A number in [0,1]. |
reInitCoverage |
Sets whether the algorithm must perform the reinitialization based on coverage when needed. A string with "yes" or "no". |
porcCob |
Sets the maximum percentage of variables that participate in the rules generated by the reinitialization based on coverage. A number in [0,1] |
StrictDominance |
Sets whether the comparison between individuals must be done by strict dominance or not. A string with "yes" or "no". |
targetVariable |
The name or index position of the target variable (or class). It must be a categorical one. |
targetClass |
A string specifying the value of the target variable. |
By default, this function sets as target variable the last one that appears in the SDEFSR_Dataset
object. If you want to change it, use the targetVariable
argument.
The target variable MUST be categorical; if it is not, an error is thrown. Also, the default behaviour is to find
rules for all possible values of the target variable. targetClass
sets a single value of the target variable so that the
algorithm only finds rules about that value.
If you specify in paramFile
something other than NULL,
the rest of the parameters are
ignored and the algorithm tries to read the specified file. See "Parameters file structure" below
if you want to use a parameters file.
The algorithm shows in the console the following results:
The parameters used in the algorithm
The rules generated.
The quality measures for test of every rule and the global results.
Also, the algorithm saves those results in the files specified in the output
parameter of the algorithm or
in the outputData
parameter of the parameters file.
NMEEF-SD is a multi-objective genetic algorithm based on an NSGA-II approach. The algorithm first makes a selection based on binary tournament and saves the selected individuals in an offspring population. Then, NMEEF-SD applies the genetic operators over the individuals of the offspring population.
To generate the population that participates in the next iteration of the evolutionary process, NMEEF-SD calculates the dominance among all individuals (joining the main and offspring populations) and then applies the NSGA-II fast sorting algorithm to order the population by fronts of dominance: the first front is the non-dominated (Pareto) front, the second front contains the individuals dominated by one individual, the third those dominated by two, and so on.
To promote diversity, NMEEF-SD has a reinitialization mechanism based on coverage that is applied if the Pareto front does not evolve during 5% of the total number of evaluations.
At the end of the evolutionary process, the algorithm returns only the individuals in the Pareto front that have a confidence greater than the minimum confidence level.
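The fronts-of-dominance ordering described above starts by extracting the non-dominated (Pareto) front. A minimal sketch, assuming every objective is maximised (illustrative only; the function name `paretoFront` is hypothetical and this is not the package's code):

```r
# objs: numeric matrix, one row per individual, one column per objective.
# Returns the row indices of the individuals in the first (Pareto) front.
paretoFront <- function(objs) {
  isDominated <- function(i) {
    for (j in seq_len(nrow(objs))) {
      if (j != i && all(objs[j, ] >= objs[i, ]) && any(objs[j, ] > objs[i, ]))
        return(TRUE)
    }
    FALSE
  }
  which(!vapply(seq_len(nrow(objs)), isDominated, logical(1)))
}

objs <- rbind(c(1, 2), c(2, 1), c(0, 0))
paretoFront(objs)  # 1 2: the third individual is dominated by both others
```

Removing the first front and repeating the test on the remaining individuals yields the second front, the third, and so on, which is the fronts-of-dominance ordering sketched here in its simplest (quadratic) form.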
The paramFile
argument points to a file containing the necessary parameters for NMEEF-SD to work.
This file must contain, at least, these parameters (separated by a carriage return):
algorithm
Specify the algorithm to execute. In this case. "NMEEFSD"
inputData
Specify two paths to KEEL files, for training and test respectively. If only the file name is specified, the path is assumed to be the working directory.
seed
Sets the seed for the random number generator
nLabels
Sets the number of fuzzy labels to create when reading the files
nEval
Sets the maximum number of rule evaluations before stopping the genetic process
popLength
Sets the number of individuals of the main population
ReInitCob
Sets whether NMEEF-SD performs the reinitialization based on coverage. Values: "yes" or "no"
crossProb
Crossover probability of the genetic algorithm. Value in [0,1]
mutProb
Mutation probability of the genetic algorithm. Value in [0,1]
RulesRep
Representation of each chromosome of the population. "can" for canonical representation. "dnf" for DNF representation.
porcCob
Sets the maximum percentage of variables that participate in a rule when doing the reinitialization based on coverage. Value in [0,1]
Obj1
Sets the objective number 1.
Obj2
Sets the objective number 2.
Obj3
Sets the objective number 3.
minCnf
Minimum confidence for returning a rule of the Pareto front. Value in [0,1]
StrictDominance
Sets whether the comparison of individuals when calculating dominance must use strict dominance or not. Values: "yes" or "no"
targetClass
Value of the target variable to search subgroups for. The target variable is always the last variable. Use null
to search for every value of the target variable
An example of parameter file could be:
algorithm = NMEEFSD
inputData = "irisd-10-1tra.dat" "irisd-10-1tra.dat" "irisD-10-1tst.dat"
outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
seed = 1
RulesRep = can
nLabels = 3
nEval = 500
popLength = 51
crossProb = 0.6
mutProb = 0.1
ReInitCob = yes
porcCob = 0.5
Obj1 = comp
Obj2 = unus
Obj3 = null
minCnf = 0.6
StrictDominance = yes
targetClass = Iris-setosa
You can use the following quality measures in the ObjX values of the parameters file, using these codes:
Unusualness -> unus
Crisp Support -> csup
Crisp Confidence -> ccnf
Fuzzy Support -> fsup
Fuzzy Confidence -> fcnf
Coverage -> cove
Significance -> sign
If you do not want to use an objective value, you must specify null
Carmona, C., Gonzalez, P., del Jesus, M., & Herrera, F. (2010). NMEEF-SD: Non-dominated Multi-objective Evolutionary algorithm for Extracting Fuzzy rules in Subgroup Discovery.
NMEEF_SD(paramFile = NULL, training = habermanTra, test = habermanTst,
         output = c(NA, NA, NA), seed = 0, nLabels = 3, nEval = 300,
         popLength = 100, mutProb = 0.1, crossProb = 0.6,
         Obj1 = "CSUP", Obj2 = "CCNF", Obj3 = "null", minCnf = 0.6,
         reInitCoverage = "yes", porcCob = 0.5, StrictDominance = "yes",
         targetClass = "positive")

## Not run:
NMEEF_SD(paramFile = NULL, training = habermanTra, test = habermanTst,
         output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
         seed = 0, nLabels = 3, nEval = 300, popLength = 100,
         mutProb = 0.1, crossProb = 0.6, Obj1 = "CSUP", Obj2 = "CCNF",
         Obj3 = "null", minCnf = 0.6, reInitCoverage = "yes",
         porcCob = 0.5, StrictDominance = "yes", targetClass = "null")
## End(Not run)
This function plots the rule set by means of a bar graph that shows the TPR vs FPR quality measures of each rule
## S3 method for class 'SDEFSR_Rules' plot(x, ...)
x |
an SDEFSR_Rules object |
... |
additional arguments passed to the plot |
This function depends on the ggplot2 package to generate the graph. If ggplot2 is not installed, the function asks the user to install it. After installation, it loads the package and shows the graph.
A TPR vs FPR graph shows the precision of a rule. Quality rules have high TPR values and low FPR values. High values of both quality measures indicate that the rule is too general and therefore too obvious. Low values of both indicate that the rule is too specific and would be an invalid rule.
A TPR vs FPR graph generated by ggplot2
plot(habermanRules)
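Because the filter operator shown earlier returns a new SDEFSR_Rules object, filtering and plotting can be combined. A small sketch using the bundled habermanRules object:

```r
library(SDEFSR)
# Plot only the rules with a high true positive rate; the filter
# returns a new SDEFSR_Rules object that plot() accepts directly.
plot(habermanRules[TPr > 0.5])
```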
S3 function to print in console the contents of the dataset
This function shows the decoded data matrix.
## S3 method for class 'SDEFSR_Dataset' print(x, ...)
x |
The SDEFSR_Dataset object to print |
... |
Additional arguments passed to the print function |
This function shows the data matrix. Internally, a SDEFSR_Dataset
object has a list of examples,
and these examples are coded numerically. This function decodes the examples and converts the list into a matrix.
A matrix with the decoded dataset.
print(habermanTra)
This function reads a KEEL (.dat), ARFF (.arff) or CSV dataset file and stores the information
in a SDEFSR_Dataset
class.
read.dataset(file, sep = ",", quote = "\"", dec = ".", na.strings = "?")
file |
The path of the file to read |
sep |
The separator character to use |
quote |
The set of quoting characters |
dec |
The character used for decimal points |
na.strings |
The string used to denote missing data |
A KEEL data file must have the following structure:
@relation: Name of the data set
@attribute: Description of an attribute (one for each attribute)
@inputs: List with the names of the input attributes
@output: Name of the output attribute (not used in this implementation of the algorithms)
@data: Starting tag of the data
The rest of the file contains all the examples belonging to the data set, expressed in comma-separated values format. The ARFF file format is a well-known dataset format from the WEKA data mining tool. CSV (comma-separated values) is a format where each example is on a line and the values of its variables are separated by commas.
Angel M. Garcia <[email protected]>
J. Alcala-Fdez, A. Fernandez, J. Luengo, J. Derrac, S. Garcia, L. Sanchez, F. Herrera. KEEL Data-Mining Software Tool: Data Set Repository, Integration of Algorithms and Experimental Analysis Framework. Journal of Multiple-Valued Logic and Soft Computing 17:2-3 (2011) 255-287.
KEEL Dataset Repository (Standard Classification): https://sci2s.ugr.es/keel/category.php?cat=clas
## Not run:
# Reads a KEEL dataset from a file.
read.dataset(file = "C:\\KEELFile.dat")
read.dataset(file = "C:\\KEELFile.dat", nLabels = 7)
# Reads an ARFF dataset from a file.
read.dataset(file = "C:\\ARFFFile.arff")
read.dataset(file = "C:\\ARFFFile.arff", nLabels = 7)
## End(Not run)
The SDEFSR package provides a tool for reading KEEL datasets and four evolutionary fuzzy rule-based algorithms for subgroup discovery.
The provided algorithms work with datasets in KEEL, ARFF or CSV format and also with data.frame
objects.
The package also provides a Shiny app that performs the same tasks and can display additional information about the data for exploratory analysis.
The algorithms provided are Evolutionary Fuzzy Systems (EFS), which take advantage of evolutionary algorithms to maximize more than one quality measure, and of fuzzy logic, which provides a representation of numerical variables that is more understandable to humans and more robust to noise.
The algorithms in the SDEFSR package support target variables with more than two values. However, the target variable must be categorical. Thus, if you have a numeric target variable, a discretization must be performed before executing the method.
MESDIF
Multiobjective Evolutionary Subgroup DIscovery Fuzzy rules (MESDIF) Algorithm.
NMEEF_SD
Non-dominated Multi-objective Evolutionary algorithm for Extracting Fuzzy rules in Subgroup Discovery (NMEEF-SD).
read.dataset
reads a KEEL, ARFF or CSV format file.
SDIGA
Subgroup Discovery Iterative Genetic Algorithm (SDIGA).
SDEFSR_GUI
Launch the Shiny app in your browser.
FUGEPSD
Fuzzy Genetic Programming-based learning for Subgroup Discovery (FuGePSD) Algorithm.
plot.SDEFSR_Rules
Plots the rules discovered by an SDEFSR algorithm.
sort.SDEFSR_Rules
Sorts the discovered rules by a given quality measure.
SDEFSR_DatasetFromDataFrame
Reads a data.frame and creates a SDEFSR_Dataset
object to be executed by an algorithm of this package.
Angel M. Garcia-Vico <[email protected]>
SDEFSR_Dataset
object from a data.frame
Creates a SDEFSR_Dataset
object from a data.frame
and creates fuzzy labels for numerical variables too.
SDEFSR_DatasetFromDataFrame(
  data,
  relation,
  names = NA,
  types = NA,
  classNames = NA
)
data |
A data.frame object with the data |
relation |
A string that indicates the name of the relation. |
names |
An optional character vector indicating the name of the attributes. |
types |
An optional character vector indicating 'c' if the variable is categorical, 'r' if it is real and 'e' if it is an integer |
classNames |
An optional character vector indicating the values of the target class. |
The information of the data.frame must be stored with instances in rows and variables in columns. If you do not specify any of the optional parameters, the function tries to obtain them automatically.
For 'names'
if it is NA, the function takes the name of the columns by colnames
.
For 'types'
if it is NA, the function determines the type of an attribute from the type of the corresponding column of the data.frame.
If it is 'character'
it is assumed that it is categorical, and if 'numeric'
it is assumed that it is a real number.
PLEASE, PAY ATTENTION TO THIS BEHAVIOUR. It can cause transformation errors, taking a numeric variable as categorical or vice versa.
For 'classNames'
if it is NA, the function takes the unique values of the last attribute of the data.frame, which is considered the class attribute.
A SDEFSR_Dataset
object with all the information of the dataset.
Angel M Garcia <[email protected]>
library(SDEFSR)
df <- data.frame(matrix(runif(1000), ncol = 10))
#Add class attribute
df[,11] <- c("0", "1", "2", "3")
SDEFSR_DatasetObject <- SDEFSR_DatasetFromDataFrame(df, "random")
invisible()
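When the automatic type detection described above is not appropriate, the optional parameters can be supplied explicitly. A sketch (the attribute names X1..X10 are made up, and the types vector is assumed to cover every column, including the class):

```r
library(SDEFSR)
df <- data.frame(matrix(runif(1000), ncol = 10))
df[, 11] <- c("0", "1", "2", "3")
# Declare names, types ('r' = real, 'c' = categorical) and class
# values explicitly instead of relying on automatic detection.
myDataset <- SDEFSR_DatasetFromDataFrame(df, relation = "random",
                                         names = c(paste0("X", 1:10), "class"),
                                         types = c(rep("r", 10), "c"),
                                         classNames = c("0", "1", "2", "3"))
```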
Launches a Shiny-based interface for the package in your browser.
SDEFSR_GUI()
SDEFSR_GUI()
The package SDEFSR
provides a simple, Shiny-based web interface to perform the tasks
easily. The interface only works with datasets loaded directly into the platform.
The web application is structured as follows:
The first thing you have to do is load your training and test files. These files must be valid KEEL format files.
After choosing your datasets, you can view information about the dataset or execute an algorithm.
You can choose the target variable or the variable to visualize, and choose the target value or execute the algorithm for all values.
Once the target variable is chosen, you can select the algorithm to execute and change its parameters with the controls provided.
After that, you can execute the algorithm. The results are shown in three tabs at the top of the page, just to the right of the "Exploratory Analysis" tab.
The tables can be sorted by each column, and you can also search and filter values.
## Not run:
library(SDEFSR)
SDEFSR_GUI()
## End(Not run)
Performs a subgroup discovery task executing the SDIGA algorithm
SDIGA(
  parameters_file = NULL,
  training = NULL,
  test = NULL,
  output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
  seed = 0,
  nLabels = 3,
  nEval = 10000,
  popLength = 100,
  mutProb = 0.01,
  RulesRep = "can",
  Obj1 = "CSUP",
  w1 = 0.7,
  Obj2 = "CCNF",
  w2 = 0.3,
  Obj3 = "null",
  w3 = 0,
  minConf = 0.6,
  lSearch = "yes",
  targetVariable = NA,
  targetClass = "null"
)
parameters_file |
The path of the parameters file. |
training |
A SDEFSR_Dataset object with the training data. |
test |
A SDEFSR_Dataset object with the test data. |
output |
A character vector with the paths where the information file, rules file and test quality measures file are stored, respectively. |
seed |
An integer to set the seed used to generate random numbers. |
nLabels |
Number of linguistic labels that represent numerical variables. |
nEval |
An integer to set the maximum number of evaluations in the evolutionary process. |
popLength |
An integer to set the number of individuals in the population. |
mutProb |
Sets the mutation probability. A number in [0,1]. |
RulesRep |
Representation used in the rules. "can" for canonical rules, "dnf" for DNF rules. |
Obj1 |
Sets the Objective number 1. See |
w1 |
Sets the weight of Obj1. Value in [0,1]. |
Obj2 |
Sets the Objective number 2. See |
w2 |
Sets the weight of Obj2. Value in [0,1]. |
Obj3 |
Sets the Objective number 3. See |
w3 |
Sets the weight of Obj3. Value in [0,1]. |
minConf |
Sets the minimum confidence that the rule returned by the genetic algorithm must have after the local optimization phase. A number in [0,1]. |
lSearch |
Sets whether the local optimization phase must be performed. A string with "yes" or "no". |
targetVariable |
A string with the name or an integer with the index position of the target variable (or class). It must be a categorical one. |
targetClass |
A string specifying the value of the target variable. |
This function sets as target variable the last one that appears in the SDEFSR_Dataset
object. If you want
to change the target variable, you can set targetVariable
to that variable.
The target variable MUST be categorical; if it is not, an error is thrown. Also, the default behaviour is to find
rules for all possible values of the target variable. targetClass
sets a value of the target variable so that the
algorithm only finds rules about this value.
If you specify in paramFile
something other than NULL
the rest of the parameters are
ignored and the algorithm tries to read the specified file. See "Parameters file structure" below
if you want to use a parameters file.
The algorithm shows in the console the following results:
The parameters used in the algorithm
The rules generated.
The quality measures for test of every rule and the global results.
Also, the algorithm saves those results in the files specified in the output
parameter of the algorithm or
in the outputData
parameter in the parameters file.
This algorithm has a genetic algorithm at its core. This genetic algorithm returns only the best rule of the population and is executed repeatedly until a stopping condition is reached. The stopping condition is that the returned rule must cover at least one new example (not covered by previous rules) and must have a confidence greater than a minimum.
After returning the rule, a local improvement can be applied to make the rule more general. This improvement is done by means of a hill-climbing local search.
The genetic algorithm crosses only the two best individuals, but the mutation operator is applied over the whole population, including the individuals resulting from the crossover.
The parameters_file
argument points to a file which has the necessary parameters for SDIGA to work.
This file must contain, at least, these parameters (separated by a carriage return):
algorithm
Specify the algorithm to execute. In this case. "SDIGA"
inputData
Specify two paths of KEEL files for training and test. If only the file name is specified, the working directory is used as the path.
seed
Sets the seed for the random number generator
nLabels
Sets the number of fuzzy labels to create when reading the files
nEval
Sets the maximum number of rule evaluations before stopping the genetic process
popLength
Sets the number of individuals of the main population
mutProb
Mutation probability of the genetic algorithm. Value in [0,1]
RulesRep
Representation of each chromosome of the population. "can" for canonical representation. "dnf" for DNF representation.
Obj1
Sets the objective number 1.
w1
Sets the weight assigned to the objective number 1. Value in [0,1]
Obj2
Sets the objective number 2.
w2
Sets the weight assigned to the objective number 2. Value in [0,1]
Obj3
Sets the objective number 3.
w3
Sets the weight assigned to the objective number 3. Value in [0,1]
minConf
Sets the minimum confidence of the rule for checking the stopping criterion of the iterative process
lSearch
Sets whether the local search algorithm is performed after the execution of the genetic algorithm. Values: "yes" or "no"
targetVariable
The name or index position of the target variable (or class). It must be a categorical one.
targetClass
Value of the target variable to search for subgroups. The target variable is always the last variable. Use null
to search for every value of the target variable
An example of parameter file could be:
algorithm = SDIGA
inputData = "irisD-10-1tra.dat" "irisD-10-1tst.dat"
outputData = "irisD-10-1-INFO.txt" "irisD-10-1-Rules.txt" "irisD-10-1-TestMeasures.txt"
seed = 0
nLabels = 3
nEval = 500
popLength = 100
mutProb = 0.01
minConf = 0.6
RulesRep = can
Obj1 = Comp
Obj2 = Unus
Obj3 = null
w1 = 0.7
w2 = 0.3
w3 = 0.0
lSearch = yes
You can use the following quality measures in the ObjX values of the parameter file using these values:
Unusualness -> unus
Crisp Support -> csup
Crisp Confidence -> ccnf
Fuzzy Support -> fsup
Fuzzy Confidence -> fcnf
Coverage -> cove
Significance -> sign
If you do not want to use an objective value, you must specify null
M. J. del Jesus, P. Gonzalez, F. Herrera, and M. Mesonero, "Evolutionary Fuzzy Rule Induction Process for Subgroup Discovery: A case study in marketing," IEEE Transactions on Fuzzy Systems, vol. 15, no. 4, pp. 578-592, 2007.
SDIGA(parameters_file = NULL, training = habermanTra, test = habermanTst,
      output = c(NA, NA, NA), seed = 0, nLabels = 3, nEval = 300,
      popLength = 100, mutProb = 0.01, RulesRep = "can",
      Obj1 = "CSUP", w1 = 0.7, Obj2 = "CCNF", w2 = 0.3, Obj3 = "null",
      w3 = 0, minConf = 0.6, lSearch = "yes", targetClass = "positive")

## Not run:
SDIGA(parameters_file = NULL, training = habermanTra, test = habermanTst,
      output = c("optionsFile.txt", "rulesFile.txt", "testQM.txt"),
      seed = 0, nLabels = 3, nEval = 300, popLength = 100, mutProb = 0.01,
      RulesRep = "can", Obj1 = "CSUP", w1 = 0.7, Obj2 = "CCNF", w2 = 0.3,
      Obj3 = "null", w3 = 0, minConf = 0.6, lSearch = "yes",
      targetClass = "positive")
## End(Not run)
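The documented parameters can be combined in other ways. For instance, a sketch using the DNF rule representation, assuming the objective codes listed in the parameters-file table ("unus", "fcnf") are also accepted by the R arguments:

```r
library(SDEFSR)
# As the example above, but with DNF representation and
# unusualness / fuzzy confidence as weighted objectives.
SDIGA(parameters_file = NULL, training = habermanTra, test = habermanTst,
      output = c(NA, NA, NA), seed = 0, nLabels = 3, nEval = 300,
      popLength = 100, mutProb = 0.01, RulesRep = "dnf",
      Obj1 = "unus", w1 = 0.7, Obj2 = "fcnf", w2 = 0.3, Obj3 = "null",
      w3 = 0, minConf = 0.6, lSearch = "yes", targetClass = "positive")
```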
Return an ordered rule set by a given quality measure
This function sorts a rule set in descending order by a given quality measure that is available on the object
## S3 method for class 'SDEFSR_Rules' sort(x, decreasing = TRUE, ...)
x |
The rule set passed as a SDEFSR_Rules object. |
decreasing |
A logical indicating if the sort should be increasing or decreasing. By default, decreasing. |
... |
Additional parameters, such as "by", a string with the name of the quality measure to order by (see Details). |
The additional argument in "..." is the 'by' argument, which is a
string with the name of the quality measure to order by. Valid values are:
nVars, Coverage, Unusualness, Significance, FuzzySupport, Support, FuzzyConfidence, Confidence, Tpr, Fpr
.
Another SDEFSR_Rules
object with the rules sorted
sort(habermanRules)
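Using the 'by' argument described in Details, the rule set can be ordered by any of the listed quality measures:

```r
library(SDEFSR)
# Order the bundled rule set by unusualness, best rules first
sort(habermanRules, decreasing = TRUE, by = "Unusualness")
```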
Summarizes relevant data of a SDEFSR_Dataset
dataset.
## S3 method for class 'SDEFSR_Dataset' summary(object, ...)
object |
A SDEFSR_Dataset object to summarize |
... |
Additional arguments to the summary function. |
This function shows important information about the SDEFSR_Dataset
dataset for the user. Note that it does not
show all the information available. The rest is only for the algorithms. The values that appear are accessible by the
$
operator, e.g. dataset$relation or dataset$examplesPerClass.
summary(carTra)
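As noted in Details, the fields shown by summary() can also be read directly with the $ operator:

```r
library(SDEFSR)
# Access individual fields of the SDEFSR_Dataset object directly
carTra$relation         # the relation name
carTra$examplesPerClass # number of examples per class value
```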