nxcals.api.extraction.data.builders.DataFrame.sameSemantics
- DataFrame.sameSemantics(other: DataFrame) → bool
Returns True when the logical query plans inside both DataFrames are equal and therefore return the same results.
New in version 3.1.0.
Changed in version 3.5.0: Supports Spark Connect.
Notes
The equality comparison here is simplified by tolerating cosmetic differences such as attribute names.
This API can compare two DataFrames very fast, but it can still return False for DataFrames that return the same results, for instance, when they are built from different plans. Such false-negative semantics can be useful when caching, for example.
This API is a developer API.
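As a rough illustration of the caching use case mentioned in the notes, the sketch below stores results keyed by DataFrame and reuses one when sameSemantics reports a match. The class name SemanticCache and its get_or_compute method are hypothetical and are not part of this API; the only call assumed on the DataFrame is sameSemantics itself. A false negative only causes a recomputation, so the cached value is never wrong for the plan it is returned for.

```python
from typing import Any, Callable, List, Tuple


class SemanticCache:
    """Hypothetical helper: reuse a stored result when a plan matches semantically."""

    def __init__(self) -> None:
        # Each entry pairs a DataFrame-like key with the result computed from it.
        self._entries: List[Tuple[Any, Any]] = []

    def get_or_compute(self, df: Any, compute: Callable[[Any], Any]) -> Any:
        # sameSemantics is a pairwise check, so scan the entries one by one.
        for cached_df, result in self._entries:
            if cached_df.sameSemantics(df):
                return result  # cache hit: plans are semantically equal
        # Cache miss (possibly a false negative): recompute and remember the result.
        result = compute(df)
        self._entries.append((df, result))
        return result
```

With a live session this might be used as `cache.get_or_compute(df, lambda d: d.count())`; the linear scan is a deliberate simplification that suits a small number of cached plans.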
- Parameters:
other (DataFrame) – The other DataFrame to compare against.
- Returns:
Whether the logical query plans of the two DataFrames are semantically equal.
- Return type:
bool
Examples
>>> df1 = spark.range(10)
>>> df2 = spark.range(10)
>>> df1.withColumn("col1", df1.id * 2).sameSemantics(df2.withColumn("col1", df2.id * 2))
True
>>> df1.withColumn("col1", df1.id * 2).sameSemantics(df2.withColumn("col1", df2.id + 2))
False
>>> df1.withColumn("col1", df1.id * 2).sameSemantics(df2.withColumn("col0", df2.id * 2))
True