nxcals.api.extraction.data.builders.DataFrame.intersect
- DataFrame.intersect(other: DataFrame) → DataFrame
Return a new DataFrame containing only the rows present in both this DataFrame and another DataFrame. Note that any duplicates are removed. To preserve duplicates use intersectAll().
New in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters:
other (DataFrame) – Another DataFrame that needs to be combined.
- Returns:
Combined DataFrame.
- Return type:
DataFrame
Notes
This is equivalent to INTERSECT in SQL.
Examples
>>> df1 = spark.createDataFrame([("a", 1), ("a", 1), ("b", 3), ("c", 4)], ["C1", "C2"])
>>> df2 = spark.createDataFrame([("a", 1), ("a", 1), ("b", 3)], ["C1", "C2"])
>>> df1.intersect(df2).sort(df1.C1.desc()).show()
+---+---+
| C1| C2|
+---+---+
|  b|  3|
|  a|  1|
+---+---+
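The difference between removing and preserving duplicates can be illustrated without a Spark session. The following is a minimal plain-Python sketch of the set semantics only, not Spark's implementation: the helper names `intersect_distinct` and `intersect_all` are hypothetical, rows are modelled as tuples, and the output order is an artifact of this model (Spark does not guarantee row order without an explicit sort).

```python
from collections import Counter


def intersect_distinct(left, right):
    """Model of INTERSECT: rows present in both inputs, duplicates removed."""
    right_rows = set(right)
    seen = set()
    result = []
    for row in left:
        # Keep a row once if it also occurs on the right-hand side.
        if row in right_rows and row not in seen:
            seen.add(row)
            result.append(row)
    return result


def intersect_all(left, right):
    """Model of INTERSECT ALL: each row kept min(count_left, count_right) times."""
    remaining = Counter(right)
    result = []
    for row in left:
        # Consume one matching occurrence from the right-hand side per kept row.
        if remaining[row] > 0:
            remaining[row] -= 1
            result.append(row)
    return result


# Same rows as df1 and df2 in the example above.
rows1 = [("a", 1), ("a", 1), ("b", 3), ("c", 4)]
rows2 = [("a", 1), ("a", 1), ("b", 3)]

distinct_rows = intersect_distinct(rows1, rows2)
all_rows = intersect_all(rows1, rows2)

print(distinct_rows)  # [('a', 1), ('b', 3)]
print(all_rows)       # [('a', 1), ('a', 1), ('b', 3)]
```

In this model the duplicated row ("a", 1) appears once under intersect semantics and twice under intersectAll semantics, because it occurs twice in both inputs.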