Multi-GPU approach for big data mining - global induction of decision trees

Krzysztof Jurczuk, Marcin Czajkowski, Marek Krętowski

Abstract

This paper identifies the scalability bounds of evolutionarily induced decision trees (DTs). To overcome the barriers posed by large-scale data, we propose a novel multi-GPU approach that combines knowledge of global DT induction with evolutionary algorithm (EA) parallelization. The search for the tree structure and tests is performed sequentially by the CPU, while the fitness calculations are delegated to the GPUs, so the core evolution is unchanged. The results show that evolutionary induction is accelerated several thousand times by using up to 4 GPUs on datasets with up to 1 billion objects.
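
The division of labour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain Python functions stand in for GPU kernels, the fitness is assumed to be simple classification accuracy, and all names (`Node`, `split_into_chunks`, `count_errors_on_chunk`, `fitness`) are hypothetical. The idea shown is only that the dataset is partitioned across devices, each device evaluates the same candidate tree on its own part, and the CPU reduces the partial results.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, class label)


@dataclass
class Node:
    # Internal node: tests x[feature] <= threshold. Leaf: predicts `label`.
    feature: Optional[int] = None
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    label: Optional[int] = None


def predict(tree: Node, x: Sequence[float]) -> int:
    """Route one object from the root to a leaf and return the leaf label."""
    node = tree
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label


def split_into_chunks(data: List[Sample], n_devices: int) -> List[List[Sample]]:
    """Contiguous, near-equal partitions, one per (hypothetical) device."""
    if not data or n_devices < 1:
        raise ValueError("need a non-empty dataset and at least one device")
    size = (len(data) + n_devices - 1) // n_devices
    return [data[i:i + size] for i in range(0, len(data), size)]


def count_errors_on_chunk(tree: Node, chunk: List[Sample]) -> int:
    """Stand-in for the per-device work: misclassified objects in one chunk."""
    return sum(1 for x, y in chunk if predict(tree, x) != y)


def fitness(tree: Node, chunks: List[List[Sample]]) -> float:
    """CPU-side reduction of per-device error counts into accuracy."""
    total = sum(len(c) for c in chunks)
    errors = sum(count_errors_on_chunk(tree, c) for c in chunks)
    return 1.0 - errors / total
```

In this sketch the evolutionary loop on the CPU would call `fitness` for each candidate tree. Only the inner `count_errors_on_chunk` step would change when moved to real devices, which mirrors the abstract's point that the core evolution stays unchanged.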
Authors: Krzysztof Jurczuk (FCS / SD, Software Department), Marcin Czajkowski (FCS / SD, Software Department), Marek Krętowski (FCS / SD, Software Department)
Book: López-Ibáñez Manuel (eds.): The Genetic and Evolutionary Computation Conference: GECCO 2019, 2019, Association for Computing Machinery
Keywords in English: evolutionary data mining, big data, decision trees, scalability bounds, parallel computing, graphics processing unit (GPU), CUDA
Internal identifier: ROC 19-20
Language: en (English)
Score (nominal): 140
Score source: conferenceList
Score: Ministerial score = 140.0, 04-03-2020, ChapterFromConference
Citation count*: 1 (2020-04-03)

* The presented citation count is obtained through Internet information analysis and is close to the number calculated by the Publish or Perish system.