Distributed Pattern Recognition in RapidMiner

Alexander Arimond, Christian Kofler, Faisal Shafait
RapidMiner Community Meeting and Conference, Dortmund, Germany, Online, 9/2010


RapidMiner already provides easy-to-use interfaces for developing and evaluating pattern recognition and machine learning applications. However, it has only limited support for parallelization and lacks functionality to spread long-running computations over multiple machines. One solution is distributed computing with programming models such as MapReduce. In this paper, we present a system called DisPaRe, which integrates distributed computing frameworks into RapidMiner, with a special focus on MapReduce as the programming model. The frameworks GridGain and Oracle Coherence are reviewed and evaluated for their suitability for integration with RapidMiner. The system lets RapidMiner processes use these frameworks transparently and parallelize their computations within a distributed environment.
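
As a rough illustration of the MapReduce programming model mentioned above, the following minimal sketch computes per-class feature means over partitioned training data. It is not DisPaRe's implementation and uses neither GridGain nor Oracle Coherence: a local thread pool stands in for the worker nodes, and all names (`map_partition`, `reduce_partials`, `class_means`) and the sample data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partition(partition):
    """Map step: per-class feature sums and counts for one data partition."""
    partial = {}
    for features, label in partition:
        sums, count = partial.get(label, ([0.0] * len(features), 0))
        sums = [s + x for s, x in zip(sums, features)]
        partial[label] = (sums, count + 1)
    return partial


def reduce_partials(left, right):
    """Reduce step: merge two partial results into one."""
    merged = dict(left)
    for label, (sums, count) in right.items():
        if label in merged:
            old_sums, old_count = merged[label]
            merged[label] = (
                [a + b for a, b in zip(old_sums, sums)],
                old_count + count,
            )
        else:
            merged[label] = (sums, count)
    return merged


def class_means(partitions, workers=2):
    """Run the map step in parallel, then reduce and normalise to means."""
    # Assumption: threads stand in for remote nodes of a distributed framework.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_partition, partitions))
    totals = reduce(reduce_partials, partials, {})
    return {
        label: [s / count for s in sums]
        for label, (sums, count) in totals.items()
    }


# Hypothetical toy data: two partitions of (feature vector, class label) pairs.
partitions = [
    [([1.0, 2.0], "a"), ([3.0, 4.0], "b")],
    [([3.0, 6.0], "a"), ([5.0, 8.0], "b")],
]
means = class_means(partitions)
```

Because the reduce step only merges sums and counts, partial results can be combined in any order, which is what makes this kind of computation easy to spread over several machines.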




@inproceedings{ ARIM2010,
	Title = {Distributed Pattern Recognition in RapidMiner},
	Author = {Alexander Arimond and Christian Kofler and Faisal Shafait},
	BookTitle = {RapidMiner Community Meeting and Conference},
	Month = {9},
	Year = {2010},
	Publisher = {Online}
}

Last modified: 30.08.2016