pyspark.RDD.cartesian
RDD.cartesian(other: pyspark.rdd.RDD[U]) → pyspark.rdd.RDD[Tuple[T, U]]
Return the Cartesian product of this RDD and another one, that is, the RDD of all pairs of elements (a, b) where a is in self and b is in other.
New in version 0.7.0.
Examples
>>> rdd = sc.parallelize([1, 2])
>>> sorted(rdd.cartesian(rdd).collect())
[(1, 1), (1, 2), (2, 1), (2, 2)]
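The pairing can also be illustrated without a Spark context. The following is a minimal local sketch, not PySpark code: it models the result with `itertools.product` from the standard library, and `cartesian_pairs` is a hypothetical helper introduced only for illustration. It assumes two small in-memory lists standing in for the two RDDs.

```python
from itertools import product


def cartesian_pairs(left, right):
    """Hypothetical local stand-in for the pairs an RDD Cartesian product contains.

    Returns every (a, b) with a taken from `left` and b taken from `right`.
    """
    return list(product(left, right))


numbers = [1, 2]
letters = ["a", "b", "c"]
pairs = cartesian_pairs(numbers, letters)

# The result holds len(left) * len(right) pairs, so it grows multiplicatively.
assert len(pairs) == len(numbers) * len(letters)

# Sort before displaying, as the doctest above does with collect().
print(sorted(pairs))
```

Because the number of pairs is the product of the two input sizes, the result can become very large even for moderately sized inputs.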