nxcals.api.extraction.data.builders.DataFrame.dropna

DataFrame.dropna(how: str = 'any', thresh: Optional[int] = None, subset: Optional[Union[str, Tuple[str, ...], List[str]]] = None) → DataFrame

Returns a new DataFrame omitting rows with null values. DataFrame.dropna() and DataFrameNaFunctions.drop() are aliases of each other.

New in version 1.3.1.

Changed in version 3.4.0: Supports Spark Connect.

Parameters:
  • how (str, optional) – ‘any’ or ‘all’. If ‘any’, drop a row if it contains any nulls. If ‘all’, drop a row only if all its values are null.

  • thresh (int, optional) – default None. If specified, drop rows that have fewer than thresh non-null values. This overrides the how parameter.

  • subset (str, tuple or list, optional) – column name, or tuple or list of column names, to consider when checking for nulls. If omitted, all columns are considered.
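
The way the three parameters interact can be sketched in plain Python. This is an illustrative model of the per-row decision only, not Spark's implementation: keeps_row and the sample rows are hypothetical names introduced here, rows are modelled as dicts, and only None is treated as null (NaN handling in floating-point columns is not modelled).

```python
from typing import Any, Dict, Optional, Sequence, Union


def keeps_row(
    row: Dict[str, Any],
    how: str = "any",
    thresh: Optional[int] = None,
    subset: Optional[Union[str, Sequence[str]]] = None,
) -> bool:
    """Hypothetical helper: True if dropna would keep this row."""
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")

    # Normalise subset to a list of column names; default is every column.
    if subset is None:
        cols = list(row)
    elif isinstance(subset, str):
        cols = [subset]
    else:
        cols = list(subset)

    non_null = sum(1 for c in cols if row[c] is not None)

    # An explicit thresh takes precedence; otherwise how picks the threshold:
    # 'any' requires every considered column to be non-null,
    # 'all' requires at least one.
    if thresh is None:
        thresh = len(cols) if how == "any" else 1
    return non_null >= thresh


rows = [
    {"age": 10, "height": 80, "name": "Alice"},
    {"age": 5, "height": None, "name": "Bob"},
    {"age": None, "height": None, "name": "Tom"},
    {"age": None, "height": None, "name": None},
]

kept_any = [r["name"] for r in rows if keeps_row(r)]
kept_all = [r["name"] for r in rows if keeps_row(r, how="all")]
kept_thresh = [r["name"] for r in rows if keeps_row(r, how="all", thresh=2)]
kept_subset = [r["name"] for r in rows if keeps_row(r, subset="name")]
```

In this model, how only selects a default threshold, which is why passing thresh makes the value of how irrelevant.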

Returns:

DataFrame with the rows selected by how, thresh and subset excluded.

Return type:

DataFrame

Examples

>>> from pyspark.sql import Row
>>> df = spark.createDataFrame([
...     Row(age=10, height=80, name="Alice"),
...     Row(age=5, height=None, name="Bob"),
...     Row(age=None, height=None, name="Tom"),
...     Row(age=None, height=None, name=None),
... ])
>>> df.na.drop().show()
+---+------+-----+
|age|height| name|
+---+------+-----+
| 10|    80|Alice|
+---+------+-----+