nxcals.api.extraction.data.builders.DataFrame.filter
- DataFrame.filter(condition: ColumnOrName) → DataFrame
Filters rows using the given condition.
where() is an alias for filter().
Added in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters:
condition (Column or str) – a Column of types.BooleanType or a string of SQL expressions.
- Returns:
Filtered DataFrame.
- Return type:
DataFrame
Examples
>>> df = spark.createDataFrame([
...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
Filter by Column instances.
>>> df.filter(df.age > 3).show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
+---+----+
>>> df.where(df.age == 2).show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
+---+-----+
Filter by SQL expression in a string.
>>> df.filter("age > 3").show()
+---+----+
|age|name|
+---+----+
|  5| Bob|
+---+----+
>>> df.where("age = 2").show()
+---+-----+
|age| name|
+---+-----+
|  2|Alice|
+---+-----+