nxcals.api.extraction.data.builders.DataFrame.distinct

DataFrame.distinct() → DataFrame

Returns a new DataFrame containing the distinct rows in this DataFrame.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Returns:

DataFrame with distinct records.

Return type:

DataFrame

Examples

>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (23, "Alice")], ["age", "name"])

Count the distinct rows in the DataFrame. The duplicated (23, "Alice") row is counted only once:

>>> df.distinct().count()
2
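
The comparison behind "distinct rows" covers every column of a row, not a single field. The following is a minimal plain-Python sketch of that semantics, not Spark code and not how Spark implements it: rows are modelled as tuples, and `distinct_rows` is a hypothetical helper introduced only for illustration. An extra (23, "Bob") row is added to the sample data to show that sharing one column value does not make two rows duplicates.

```python
# Plain-Python model of row-level deduplication (illustrative only, no Spark involved).
rows = [(14, "Tom"), (23, "Alice"), (23, "Alice"), (23, "Bob")]


def distinct_rows(rows):
    """Return the unique rows, comparing all fields of each tuple.

    Hypothetical helper: it keeps first-seen order for readability,
    whereas a Spark DataFrame makes no ordering guarantee after distinct().
    """
    # dict.fromkeys drops repeated keys while preserving insertion order.
    return list(dict.fromkeys(rows))


unique = distinct_rows(rows)
print(len(unique))  # 3: (23, "Alice") collapses to one row, (23, "Bob") is kept
```

In this model, (23, "Alice") and (23, "Bob") both survive because they differ in the second field, even though they share the same first field.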