pyspark.pandas.Series.nlargest

Series.nlargest(n: int = 5) → pyspark.pandas.series.Series

Return the largest n elements.
Parameters
    n : int, default 5
        Return this many descending sorted values.
Returns
    Series
        The n largest values in the Series, sorted in decreasing order.
See also
    Series.nsmallest
        Get the n smallest elements.
    Series.sort_values
        Sort Series by values.
    Series.head
        Return the first n rows.
Notes
Faster than .sort_values(ascending=False).head(n) for small n relative to the size of the Series object. In pandas-on-Spark, thanks to Spark’s lazy execution and query optimizer, the two have the same performance; see the final example below.
Examples
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> data = [1, 2, 3, 4, np.nan, 6, 7, 8]
>>> s = ps.Series(data)
>>> s
0    1.0
1    2.0
2    3.0
3    4.0
4    NaN
5    6.0
6    7.0
7    8.0
dtype: float64
The n largest elements, where n=5 by default.

>>> s.nlargest()
7    8.0
6    7.0
5    6.0
3    4.0
2    3.0
dtype: float64
>>> s.nlargest(n=3)
7    8.0
6    7.0
5    6.0
dtype: float64
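As mentioned in the Notes, the same result can be obtained with sort_values followed by head. This is a sketch using the Series s defined above; the displayed output assumes the default NaN placement (last) when sorting, so the indices and values are expected to match nlargest(n=3).

>>> s.sort_values(ascending=False).head(3)
7    8.0
6    7.0
5    6.0
dtype: float64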