Author: | | Mark Hall <mhall{[at]}pentaho.com> |
Category: | | Distributed |
Changes: | | Replaced the p^2 quantile computation with TDigest estimators from the stream-lib project (https://github.com/addthis/stream-lib). These are streaming estimators that can be applied in parallel, with the results merged afterwards. The ARFF header job now uses them to compute quantiles in one pass over the data. Also fixed a bug in the handling of names files that affected Windows users. |
Date: | | 2014-07-30 |
Depends: | | weka (>=3.7.11), distributedWekaBase (>=1.0.7) |
Description: | | Provides loaders and savers for HDFS, plus Hadoop jobs and tasks that wrap the tasks provided in distributedWekaBase. Includes libraries for Apache Hadoop 1.1.2. |
License: | | GPL 3.0 |
Maintainer: | | Mark Hall <mhall{[at]}pentaho.com> |
PackageURL: | | http://prdownloads.sourceforge.net/weka/distributedWekaHadoop1.0.10.zip?download |
URL: | | http://markahall.blogspot.co.nz/2013/10/weka-and-hadoop-part-1.html |
Version: | | 1.0.10 |