Binary classification rule generation from decomposed data

Piotr Hońko

Abstract

Learning classification rules from data that do not fit in the available memory is a challenging task. The goal of this study is to develop an approach for generating binary classification rules from decomposed data that are equivalent in terms of quality to those found over the whole data. In the proposed approach, each class is divided into the same arbitrary, small number of subtables. For each pair of subtables from different classes, a rule set is induced using any sequential covering algorithm. Rule sets generated from the same positive class subtable and different negative class subtables are merged using an operator constructed on the basis of Cartesian product and conjunction operators. The rule sets obtained in this way are joined into one set. During the rule merging, unnecessary rules are removed. It is proven that for training data, the quality of the rule set generated using the approach is the same as that for the whole data. It is experimentally verified that for test data, the quality of classification is comparable with that obtained using a nondecomposed data approach.
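
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the rule representation (a frozenset of attribute-value conditions), the toy learner `learn_rules`, the greedy pruning in `merge`, and the round-robin `split` are all assumptions made for illustration, and the sketch assumes consistent training data (no positive example identical to a negative one). It shows only the general idea that conjoining one rule per negative subtable yields rules that exclude every negative example while still covering the positive subtable.

```python
from itertools import product

def covers(rule, example):
    """A rule is a frozenset of (attribute_index, value) conditions."""
    return all(example[a] == v for a, v in rule)

def learn_rules(pos, neg):
    """Toy sequential covering (assumed learner, not the paper's)."""
    rules, uncovered = [], list(pos)
    while uncovered:
        seed, rule = uncovered[0], frozenset()
        bad = [n for n in neg if covers(rule, n)]
        while bad:
            # Add the seed condition that excludes the most remaining negatives.
            cond = max(enumerate(seed),
                       key=lambda c: sum(n[c[0]] != c[1] for n in bad))
            if all(n[cond[0]] == cond[1] for n in bad):
                raise ValueError("seed is indistinguishable from a negative")
            rule |= {cond}
            bad = [n for n in bad if covers(rule, n)]
        rules.append(rule)
        uncovered = [p for p in uncovered if not covers(rule, p)]
    return rules

def merge(rule_sets, pos):
    """Conjoin one rule from each set (Cartesian product), then prune.

    A conjunction is kept only if it covers a positive example that no
    previously kept rule covers (a simple stand-in for rule removal).
    """
    kept, remaining = [], list(pos)
    for combo in product(*rule_sets):
        rule = frozenset().union(*combo)  # conjunction = union of conditions
        if any(covers(rule, p) for p in remaining):
            kept.append(rule)
            remaining = [p for p in remaining if not covers(rule, p)]
    return kept

def split(rows, k):
    """Round-robin decomposition of one class into k subtables."""
    return [rows[i::k] for i in range(k)]

def decomposed_rules(pos, neg, k):
    """Learn per pair of subtables, merge per positive subtable, join all."""
    rules = []
    for p_part in split(pos, k):
        per_neg = [learn_rules(p_part, n_part) for n_part in split(neg, k)]
        rules += merge(per_neg, p_part)
    return list(dict.fromkeys(rules))  # drop duplicates, keep order

# Hypothetical training data: positive when at least two attributes equal 1.
pos = [(1, 0, 1), (1, 1, 0), (0, 1, 1), (1, 1, 1)]
neg = [(0, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 0)]
rules = decomposed_rules(pos, neg, k=2)
complete = all(any(covers(r, p) for r in rules) for p in pos)
consistent = not any(covers(r, n) for r in rules for n in neg)
```

In this sketch each call to `learn_rules` sees only one positive and one negative subtable, so only those two need to be in memory at once. Conjunctions with contradictory conditions cover nothing and are dropped by the pruning step.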
Author: Piotr Hońko (FCS / DISCN), Department of Information Systems and Computer Networks
Journal series: International Journal of Intelligent Systems, ISSN 0884-8173, e-ISSN 1098-111X, (N/A 100 pts)
Issue year: 2019
Vol: 34
No: 12
Pages: 3123-3138
Publication size in sheets: 0.75
ASJC Classification: 1702 Artificial Intelligence; 1709 Human-Computer Interaction; 1712 Software; 2614 Theoretical Computer Science
DOI: 10.1002/int.22181
Internal identifier: ROC 19-20
Language: en (English)
Score (nominal): 100
Score source: journalList
Score: Ministerial score = 100.0, 24-03-2020, ArticleFromJournal
Publication indicators: Scopus SNIP (Source Normalised Impact per Paper): 2018 = 2.027; WoS Impact Factor: 2018 = 7.229 (2), 2018 = 5.861 (5)