sparkR.session {SparkR} | R Documentation |
SparkSession is the entry point into SparkR. sparkR.session gets the existing SparkSession or initializes a new SparkSession. Additional Spark properties can be set in ..., and these named parameters take priority over values in master, appName, and the named list sparkConfig.
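The precedence rule described above can be illustrated with a minimal base R sketch. It does not call SparkR, and `resolve_session_args` is a hypothetical helper written for this illustration, not part of the package. It assumes that the named properties `spark.master` and `spark.app.name` are the ones that correspond to the `master` and `appName` arguments.

```r
# Hypothetical helper: models the documented precedence only, not SparkR internals.
resolve_session_args <- function(master = "", appName = "SparkR",
                                 sparkConfig = list(), ...) {
  named <- list(...)
  # Named properties win over the master and appName arguments
  # (assumed mapping: spark.master -> master, spark.app.name -> appName).
  if (!is.null(named[["spark.master"]])) master <- named[["spark.master"]]
  if (!is.null(named[["spark.app.name"]])) appName <- named[["spark.app.name"]]
  # Named properties also overwrite same-named entries in sparkConfig.
  list(master = master, appName = appName,
       sparkConfig = utils::modifyList(sparkConfig, named))
}

# Illustrative values: the named properties conflict with the other arguments.
resolved <- resolve_session_args(
  master = "local[2]",
  sparkConfig = list(spark.executor.memory = "2g"),
  spark.master = "local[4]",
  spark.executor.memory = "4g"
)
str(resolved)
```

In this sketch the values supplied as named properties replace both the `master` argument and the conflicting `sparkConfig` entry, while `appName` keeps its default because no named property overrides it.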
sparkR.session(master = "", appName = "SparkR", sparkHome = Sys.getenv("SPARK_HOME"), sparkConfig = list(), sparkJars = "", sparkPackages = "", enableHiveSupport = TRUE, ...)
master | the Spark master URL. |
appName | application name to register with cluster manager. |
sparkHome | Spark Home directory. |
sparkConfig | named list of Spark configuration to set on worker nodes. |
sparkJars | character vector of jar files to pass to the worker nodes. |
sparkPackages | character vector of package coordinates. |
enableHiveSupport | enable support for Hive; falls back to a session without Hive if Spark is not built with Hive support. Once set, this cannot be turned off on an existing session. |
... | named Spark properties passed to the method. |
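As a hedged usage sketch of the arguments above, the call can be assembled as a named list and passed with `do.call`. All values here are illustrative placeholders, and the session is only started when SparkR is installed and `SPARK_HOME` is set, so the snippet is safe to run on a machine without Spark.

```r
# Illustrative argument values; adjust them for a real cluster.
session_args <- list(
  master = "local[2]",
  appName = "SparkR-example",
  sparkConfig = list(spark.executor.memory = "2g"),
  enableHiveSupport = FALSE
)

# Start a session only when SparkR is installed and SPARK_HOME is set.
can_start <- requireNamespace("SparkR", quietly = TRUE) &&
  nzchar(Sys.getenv("SPARK_HOME"))

if (can_start) {
  do.call(SparkR::sparkR.session, session_args)
  # Read one property back from the running session.
  print(SparkR::sparkR.conf("spark.executor.memory"))
  # Stop the session when finished.
  SparkR::sparkR.session.stop()
}
```

Setting `enableHiveSupport = FALSE` up front matters in this sketch because, as noted in the argument description, Hive support cannot be turned off on an existing session.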
For details on how to initialize and use SparkR, refer to the SparkR programming guide at http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession.
sparkR.session since 2.0.0
## Not run:
##D sparkR.session()
##D df <- read.json(path)
##D
##D sparkR.session("local[2]", "SparkR", "/home/spark")
##D sparkR.session("yarn-client", "SparkR", "/home/spark",
##D                list(spark.executor.memory = "4g"),
##D                c("one.jar", "two.jar", "three.jar"),
##D                c("com.databricks:spark-avro_2.10:2.0.1"))
##D sparkR.session(spark.master = "yarn-client", spark.executor.memory = "4g")
## End(Not run)