public class SparkUtils
extends java.lang.Object
Modifier and Type | Class and Description |
---|---|
static class | SparkUtils.WekaLoggingPrintWriter Used for hooking into Spark's log4j logging via a WriterAppender. |
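
To illustrate the idea behind WekaLoggingPrintWriter, here is a minimal sketch of a writer that forwards complete lines to a log sink. log4j's WriterAppender writes formatted log events to a java.io.Writer, so a writer like this could redirect them elsewhere. The class names ForwardingWriterSketch and LineForwardingWriter and the Consumer-based sink are assumptions made for this example, not part of SparkUtils, and log4j itself is left out so the sketch runs on the standard library alone.

```java
import java.io.PrintWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ForwardingWriterSketch {

  /** Hypothetical writer: buffers characters and hands each complete line to a sink. */
  static class LineForwardingWriter extends Writer {
    private final Consumer<String> sink;
    private final StringBuilder buffer = new StringBuilder();

    LineForwardingWriter(Consumer<String> sink) {
      this.sink = sink;
    }

    @Override
    public void write(char[] cbuf, int off, int len) {
      for (int i = off; i < off + len; i++) {
        char c = cbuf[i];
        if (c == '\n') {
          emit();                // end of line: forward what has been collected
        } else if (c != '\r') {
          buffer.append(c);      // skip carriage returns, keep everything else
        }
      }
    }

    @Override
    public void flush() {
      // Nothing to do: complete lines are forwarded as soon as they are written.
    }

    @Override
    public void close() {
      if (buffer.length() > 0) {
        emit();                  // forward a trailing line that had no newline
      }
    }

    private void emit() {
      sink.accept(buffer.toString());
      buffer.setLength(0);
    }
  }

  public static void main(String[] args) {
    // A plain list stands in for the destination log (an assumption of this sketch).
    List<String> destinationLog = new ArrayList<>();
    PrintWriter writer =
        new PrintWriter(new LineForwardingWriter(destinationLog::add), true);
    writer.println("INFO Starting job");
    writer.print("WARN partial line");
    writer.close();
    System.out.println(destinationLog);
  }
}
```

In a real setup the sink would write to the application's own log rather than a list; how WekaLoggingPrintWriter does this internally is not stated on this page.
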
Constructor and Description |
---|
SparkUtils() |
Modifier and Type | Method and Description |
---|---|
static java.lang.String | addSubdirToPath(java.lang.String parent, java.lang.String subdirName) Adds a subdirectory to a parent path. |
static boolean | checkFileExists(java.lang.String file) Checks that the named file exists on either the local file system or HDFS. |
static void | deleteDirectory(java.lang.String path) Deletes a directory (and all contents). |
static org.apache.hadoop.conf.Configuration | getFSConfigurationForPath(java.lang.String path, java.lang.String[] pathOnly) Returns a Configuration object configured with the name node and port present in the supplied path (hdfs://host:port/path). |
static long | getSizeInBytesOfPath(java.lang.String path) Gets the size in bytes of a file/directory. |
static org.apache.log4j.WriterAppender | initSparkLogAppender(distributed.core.DistributedJob job) Initializes and returns an appender for hooking into Spark's log4j logging and directing it to Weka's log. |
static java.io.InputStream | openFileForRead(java.lang.String file) Opens the named file for reading on either the local file system or HDFS. |
static java.io.OutputStream | openFileForWrite(java.lang.String file) Opens the named file for writing on either the local file system or HDFS. |
static java.io.PrintWriter | openTextFileForWrite(java.lang.String file) Opens the named file as a text file for writing on either the local file system or any other protocol-specific file system supported by Hadoop. |
static void | removeSparkLogAppender(org.apache.log4j.WriterAppender appender) Removes the supplied appender from Spark's logging. |
static java.lang.String | resolveLocalOrOtherFileSystemPath(java.lang.String original) Takes an input path and returns a fully qualified absolute one. |
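
Several of the methods summarized above accept either a local path or a protocol-qualified one such as hdfs://host:port/path. The following is a rough sketch of how such a distinction could be made and how a subdirectory could be appended. The names LocalOrRemoteSketch, hasProtocol, addSubdir and openForRead are hypothetical, the separator handling is an assumption, and remote access is deliberately left out because it needs the Hadoop client. This is not the SparkUtils implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalOrRemoteSketch {

  /** Returns true if the path starts with a protocol, e.g. hdfs://host:port/path. */
  static boolean hasProtocol(String path) {
    return path.matches("[A-Za-z][A-Za-z0-9+.-]*://.*");
  }

  /**
   * Appends subdirName to parent with exactly one "/" between them.
   * Assumption: "/" is the separator, as it is in HDFS-style paths.
   */
  static String addSubdir(String parent, String subdirName) {
    return parent.endsWith("/") ? parent + subdirName : parent + "/" + subdirName;
  }

  /** Opens local paths with java.nio; remote paths are out of scope for this sketch. */
  static InputStream openForRead(String file) throws IOException {
    if (hasProtocol(file)) {
      throw new UnsupportedOperationException(
          "Remote file systems need the Hadoop client: " + file);
    }
    return Files.newInputStream(Paths.get(file));
  }

  public static void main(String[] args) throws IOException {
    // Create a small local file so the local branch can be exercised.
    Path tmp = Files.createTempFile("sketch", ".txt");
    Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
    try (InputStream in = openForRead(tmp.toString())) {
      System.out.println("first byte: " + in.read());
    } finally {
      Files.delete(tmp);
    }

    System.out.println(hasProtocol("hdfs://host:port/path"));
    System.out.println(hasProtocol("/bob/george"));
    System.out.println(addSubdir("hdfs://host:port/output", "model"));
  }
}
```

With the real class, the caller would pass the same kind of path string to SparkUtils.openFileForRead and let it choose the file system.
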
public static java.lang.String addSubdirToPath(java.lang.String parent, java.lang.String subdirName)
Adds a subdirectory to a parent path.
Parameters:
parent - the parent (may include the hdfs://host:port part)
subdirName - the name of the subdirectory to add

public static org.apache.hadoop.conf.Configuration getFSConfigurationForPath(java.lang.String path, java.lang.String[] pathOnly)
Returns a Configuration object configured with the name node and port present in the supplied path (hdfs://host:port/path). Also returns the path-only part of the URI. Note that absolute paths require an extra /, e.g. hdfs://host:port//users/fred/input. Also handles local file system paths if no protocol is supplied, e.g. bob/george for a relative path (relative to the current working directory) or /bob/george for an absolute path.
Parameters:
path - the URI or local path from which to configure
pathOnly - will hold the path-only part of the URI

public static java.lang.String resolveLocalOrOtherFileSystemPath(java.lang.String original) throws java.io.IOException
Takes an input path and returns a fully qualified absolute one. An absolute HDFS path requires an extra /, e.g. hdfs://host:port//users/fred/input; otherwise it will be treated as relative (to the user's home directory in HDFS). In either case, the returned path will be an absolute one.
Parameters:
original - original path (either relative or absolute) on a file system
Throws:
java.io.IOException - if a problem occurs

public static void deleteDirectory(java.lang.String path) throws java.io.IOException
Deletes a directory (and all contents).
Parameters:
path - the path to the directory to delete
Throws:
java.io.IOException - if the path is not a directory or a problem occurs

public static java.io.InputStream openFileForRead(java.lang.String file) throws java.io.IOException
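Opens the named file for reading on either the local file system or HDFS. HDFS files should use the form "hdfs://host:port/<path>".
Parameters:
file - the file to open for reading on either the local or HDFS file system
Throws:
java.io.IOException - if a problem occurs

public static java.io.OutputStream openFileForWrite(java.lang.String file) throws java.io.IOException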
Opens the named file for writing on either the local file system or HDFS. HDFS files should use the form "hdfs://host:port/<path>". Note that, on the local file system, the directory path must exist. Under HDFS, the path is created automatically.
Parameters:
file - the file to write to
Throws:
java.io.IOException - if a problem occurs

public static java.io.PrintWriter openTextFileForWrite(java.lang.String file) throws java.io.IOException
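Opens the named file as a text file for writing on either the local file system or any other protocol-specific file system supported by Hadoop. Protocol-specific files should use the form "protocol://host:port/<path>". Note that, on the local file system, the directory path must exist.
Parameters:
file - the file to write to
Throws:
java.io.IOException - if a problem occurs

public static boolean checkFileExists(java.lang.String file) throws java.io.IOException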
Checks that the named file exists on either the local file system or HDFS.
Parameters:
file - the file to check
Throws:
java.io.IOException - if a problem occurs

public static long getSizeInBytesOfPath(java.lang.String path) throws java.io.IOException
Gets the size in bytes of a file/directory.
Parameters:
path - the path to the file/directory
Throws:
java.io.IOException - if a problem occurs

public static org.apache.log4j.WriterAppender initSparkLogAppender(distributed.core.DistributedJob job)
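Initializes and returns an appender for hooking into Spark's log4j logging and directing it to Weka's log.
Parameters:
job - the job to initialize the appender for

public static void removeSparkLogAppender(org.apache.log4j.WriterAppender appender)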
Removes the supplied appender from Spark's logging.
Parameters:
appender - the appender to remove
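
The path conventions described for getFSConfigurationForPath and resolveLocalOrOtherFileSystemPath (an extra / for absolute HDFS paths, no protocol for local paths) can be sketched with plain string handling. The class HdfsPathSplitSketch and its split method are hypothetical. The sketch rests on one reading of the documentation, namely that the first / after host:port acts as a separator so that a second / marks an absolute path. It does not build a Hadoop Configuration and may differ from what SparkUtils actually does.

```java
public class HdfsPathSplitSketch {

  /**
   * Splits a path into {authority, pathOnly}.
   * authority is the host:port part, or null for a local path with no protocol.
   * Assumption: the first "/" after the authority is a separator only, so
   * "hdfs://host:port//users/fred/input" yields the absolute path
   * "/users/fred/input", while a single "/" yields a relative path.
   */
  static String[] split(String path) {
    int protocolEnd = path.indexOf("://");
    if (protocolEnd < 0) {
      // No protocol: treat as a local path and return it unchanged.
      return new String[] {null, path};
    }
    int authorityStart = protocolEnd + 3;
    int firstSlash = path.indexOf('/', authorityStart);
    if (firstSlash < 0) {
      // Only protocol and authority were given; the path-only part is empty.
      return new String[] {path.substring(authorityStart), ""};
    }
    return new String[] {
      path.substring(authorityStart, firstSlash),
      path.substring(firstSlash + 1)
    };
  }

  public static void main(String[] args) {
    String[] examples = {
      "hdfs://host:port//users/fred/input", // absolute HDFS path (extra /)
      "hdfs://host:port/users/fred/input",  // relative HDFS path
      "bob/george",                         // relative local path
      "/bob/george"                         // absolute local path
    };
    for (String example : examples) {
      String[] parts = split(example);
      System.out.println(example + " -> authority=" + parts[0]
          + ", pathOnly=" + parts[1]);
    }
  }
}
```

The real method reports the path-only part through its pathOnly array argument instead of a return value; the sketch returns both pieces together only to stay self-contained.
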