corr {SparkR} | R Documentation |
Computes the Pearson Correlation Coefficient for two Columns.
Calculates the correlation of two columns of a SparkDataFrame. Currently only supports the Pearson Correlation Coefficient. For Spearman Correlation, consider using RDD methods found in MLlib's Statistics.
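For reference, the Pearson correlation coefficient named above is conventionally defined as below. The symbols are notation introduced here for illustration and do not come from the SparkR documentation: x_i and y_i are the paired values of the two columns, n is the number of rows, and \bar{x}, \bar{y} are the column means.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,\quad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i
```

The value lies between -1 and 1; it is undefined when either column is constant, because the denominator is then zero.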
## S4 method for signature 'Column'
corr(x, col2)

corr(x, ...)

## S4 method for signature 'SparkDataFrame'
corr(x, colName1, colName2, method = "pearson")
x | a Column or a SparkDataFrame.
col2 | a (second) Column.
... | additional argument(s). If
colName1 | the name of the first column.
colName2 | the name of the second column.
method | Optional. A character string specifying the method for calculating the correlation. Only "pearson" is currently supported.
The Pearson Correlation Coefficient as a Double.
corr since 1.6.0
Other math_funcs: acos, acos,Column-method; asin, asin,Column-method; atan2, atan2,Column-method; atan, atan,Column-method; bin, bin,Column-method; bround, bround,Column-method; cbrt, cbrt,Column-method; ceil, ceil,Column-method, ceiling, ceiling,Column-method; conv, conv,Column,numeric,numeric-method; cosh, cosh,Column-method; cos, cos,Column-method; covar_pop, covar_pop,characterOrColumn,characterOrColumn-method; cov, cov,SparkDataFrame-method, cov,characterOrColumn-method, covar_samp, covar_samp,characterOrColumn,characterOrColumn-method; expm1, expm1,Column-method; exp, exp,Column-method; factorial, factorial,Column-method; floor, floor,Column-method; hex, hex,Column-method; hypot, hypot,Column-method; log10, log10,Column-method; log1p, log1p,Column-method; log2, log2,Column-method; log, log,Column-method; pmod, pmod,Column-method; rint, rint,Column-method; round, round,Column-method; shiftLeft, shiftLeft,Column,numeric-method; shiftRightUnsigned, shiftRightUnsigned,Column,numeric-method; shiftRight, shiftRight,Column,numeric-method; sign, sign,Column-method, signum, signum,Column-method; sinh, sinh,Column-method; sin, sin,Column-method; sqrt, sqrt,Column-method; tanh, tanh,Column-method; tan, tan,Column-method; toDegrees, toDegrees,Column-method; toRadians, toRadians,Column-method; unhex, unhex,Column-method
Other stat functions: approxQuantile, approxQuantile,SparkDataFrame,character,numeric,numeric-method; cov, cov,SparkDataFrame-method, cov,characterOrColumn-method, covar_samp, covar_samp,characterOrColumn,characterOrColumn-method; crosstab, crosstab,SparkDataFrame,character,character-method; freqItems, freqItems,SparkDataFrame,character-method; sampleBy, sampleBy,SparkDataFrame,character,list,numeric-method
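## Not run:
df <- read.json("/path/to/file.json")

# Column method: builds a Column expression for use in select() or agg()
corr(df$c, df$d)

# SparkDataFrame method: returns the coefficient as a double
# (result stored in corrValue to avoid shadowing the corr function)
corrValue <- corr(df, "title", "gender")
corrValue <- corr(df, "title", "gender", method = "pearson")
## End(Not run)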