nxcals.api.extraction.data.builders.DataFrame.withColumn
- DataFrame.withColumn(colName: str, col: Column) → DataFrame
Returns a new DataFrame by adding a column or replacing the existing column that has the same name.
The column expression must be an expression over this DataFrame; attempting to add a column from some other DataFrame will raise an error.
Added in version 1.3.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters:
colName (str) – string, name of the new column.
col (Column) – a Column expression for the new column.
- Returns:
DataFrame with new or replaced column.
- Return type:
DataFrame
Notes
This method introduces a projection internally. Calling it repeatedly, for instance in a loop to add multiple columns, can therefore generate large query plans that cause performance issues and even a StackOverflowException. To avoid this, use select() with multiple columns at once.
Examples
>>> df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], schema=["age", "name"])
>>> df.withColumn('age2', df.age + 2).show()
+---+-----+----+
|age| name|age2|
+---+-----+----+
|  2|Alice|   4|
|  5|  Bob|   7|
+---+-----+----+