spark.assignClusters {SparkR} | R Documentation |
A scalable graph clustering algorithm. Users can call spark.assignClusters to run the Power Iteration Clustering (PIC) algorithm, which returns a cluster assignment for each input vertex.
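To make the idea behind PIC concrete, the following is a minimal in-memory sketch in base R. It is an illustration under stated assumptions, not the SparkR or Spark implementation: `pic_sketch`, `init_mode`, and `max_iter` are hypothetical names, the graph is assumed to be undirected with no isolated vertices, and the final k-means step of the published algorithm is replaced by a simpler deterministic cut at the largest gaps of the one-dimensional embedding.

```r
# Hypothetical helper for illustration only; it is NOT part of SparkR.
pic_sketch <- function(edges, k = 2L, init_mode = c("random", "degree"),
                       max_iter = 20L) {
  init_mode <- match.arg(init_mode)
  ids <- sort(unique(c(edges$src, edges$dst)))
  n <- length(ids)
  i <- match(edges$src, ids)
  j <- match(edges$dst, ids)

  # Symmetric affinity matrix built from the weighted edge list.
  A <- matrix(0, n, n)
  A[cbind(i, j)] <- edges$weight
  A[cbind(j, i)] <- edges$weight

  # Row-normalise so every row sums to 1 (assumes no isolated vertices).
  deg <- rowSums(A)
  W <- A / deg

  # Starting vector: normalised degrees, or random values (assumed scheme).
  v <- if (init_mode == "degree") deg / sum(deg) else abs(rnorm(n))
  v <- v / sum(abs(v))

  # Truncated power iteration: multiply by W, then rescale to unit L1 norm.
  for (iter in seq_len(max_iter)) {
    v <- as.vector(W %*% v)
    v <- v / sum(abs(v))
  }

  # Simplification: cut the sorted embedding at its k - 1 largest gaps
  # (a stand-in for the k-means step of the published algorithm).
  ord <- order(v)
  gaps <- diff(v[ord])
  cuts <- order(gaps, decreasing = TRUE)[seq_len(k - 1L)]
  cluster <- integer(n)
  cluster[ord] <- cumsum(c(1L, seq_along(gaps) %in% cuts))
  data.frame(id = ids, cluster = cluster)
}

# Same edge list as the example at the end of this page.
edges <- data.frame(src = c(0L, 0L, 1L, 3L, 4L),
                    dst = c(1L, 2L, 2L, 4L, 0L),
                    weight = c(1, 1, 1, 1, 0.1))
result <- pic_sketch(edges, k = 2L, init_mode = "degree")
print(result)
```

In this sketch, vertices 0, 1, and 2 are tightly connected and vertices 3 and 4 are joined to them only by a weak edge of weight 0.1, so the truncated iteration keeps the two groups apart in the embedding. The numeric cluster labels themselves are arbitrary.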
spark.assignClusters(data, ...)

## S4 method for signature 'SparkDataFrame'
spark.assignClusters(
  data,
  k = 2L,
  initMode = c("random", "degree"),
  maxIter = 20L,
  sourceCol = "src",
  destinationCol = "dst",
  weightCol = NULL
)
data | a SparkDataFrame.
... | additional argument(s) passed to the method.
k | the number of clusters to create.
initMode | the initialization algorithm; "random" or "degree".
maxIter | the maximum number of iterations.
sourceCol | the name of the input column for source vertex IDs.
destinationCol | the name of the input column for destination vertex IDs.
weightCol | weight column name. If this is not set or NULL, we treat all instance weights as 1.0.
A dataset that contains a column of vertex ids and a column with the corresponding cluster for each id.
Its schema is: id: integer, cluster: integer
spark.assignClusters(SparkDataFrame) since 3.0.0
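## Not run:
# Assumes SparkR is installed and a Spark session can be started.
library(SparkR)
sparkR.session()

# Weighted edge list: one row per edge (src, dst, weight).
df <- createDataFrame(list(list(0L, 1L, 1.0), list(0L, 2L, 1.0),
                           list(1L, 2L, 1.0), list(3L, 4L, 1.0),
                           list(4L, 0L, 0.1)),
                      schema = c("src", "dst", "weight"))
# Assign each vertex to one of k = 2 clusters (the default).
clusters <- spark.assignClusters(df, initMode = "degree", weightCol = "weight")
showDF(clusters)
## End(Not run)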