sampleBy {SparkR} | R Documentation
Returns a stratified sample without replacement based on the fraction given on each stratum.
sampleBy(x, col, fractions, seed)

## S4 method for signature 'SparkDataFrame,character,list,numeric'
sampleBy(x, col, fractions, seed)
x | A SparkDataFrame.
col | Name of the column that defines the strata.
fractions | A named list giving the sampling fraction for each stratum. A stratum that is not listed is treated as having fraction zero, so none of its rows are sampled.
seed | Random seed.
A new SparkDataFrame that represents the stratified sample
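To make the role of fractions concrete, the following is a minimal local sketch in base R, not SparkR's implementation. It assumes each row is kept independently with probability equal to its stratum's fraction; that per-row behavior is an assumption here and is not stated on this page. The data frame local_df and the variables lookup, row_fraction, and keep are hypothetical names introduced only for illustration.

```r
# Hypothetical local stand-in for a SparkDataFrame: a plain data.frame
# with three strata ("0", "1", "2") of 1000 rows each.
set.seed(36)
local_df <- data.frame(
  key = rep(c("0", "1", "2"), each = 1000),
  value = seq_len(3000),
  stringsAsFactors = FALSE
)

# Same shape as the fractions argument: a named list, one entry per stratum.
# Stratum "2" is deliberately left out.
fractions <- list("0" = 0.1, "1" = 0.2)

# Look up the fraction for every row; strata missing from the list get 0.
lookup <- unlist(fractions)
row_fraction <- unname(lookup[local_df$key])
row_fraction[is.na(row_fraction)] <- 0

# Assumed semantics: keep a row when its uniform draw falls below its fraction.
# Each row is considered once, so the sample is without replacement.
keep <- runif(nrow(local_df)) < row_fraction
sampled <- local_df[keep, ]

# Per-stratum counts of the sample; stratum "2" does not appear.
table(sampled$key)
```

Under this assumption the per-stratum sample sizes are only approximately fraction times stratum size, and any stratum absent from fractions contributes no rows.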
sampleBy since 1.6.0
Other stat functions: approxQuantile, approxQuantile,SparkDataFrame,character,numeric,numeric-method; corr, corr,Column-method, corr,SparkDataFrame-method; cov, cov,SparkDataFrame-method, cov,characterOrColumn-method, covar_samp, covar_samp,characterOrColumn,characterOrColumn-method; crosstab, crosstab,SparkDataFrame,character,character-method; freqItems, freqItems,SparkDataFrame,character-method
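## Not run:
##D df <- read.json("/path/to/file.json")
##D # Illustrative fractions: sample stratum "0" at 0.1 and stratum "1" at 0.2;
##D # strata not listed here are treated as fraction zero.
##D fractions <- list("0" = 0.1, "1" = 0.2)
##D sampled <- sampleBy(df, "key", fractions, 36)
## End(Not run)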