nxcals.api.common.utils.array_utils.StructType

class nxcals.api.common.utils.array_utils.StructType(fields: List[StructField] | None = None)

Struct type, consisting of a list of StructField.

This is the data type representing a Row.

Iterating a StructType will iterate over its StructFields. A contained StructField can be accessed by its name or position.

Examples

>>> from pyspark.sql.types import *
>>> struct1 = StructType([StructField("f1", StringType(), True)])
>>> struct1["f1"]
StructField('f1', StringType(), True)
>>> struct1[0]
StructField('f1', StringType(), True)
>>> struct1 = StructType([StructField("f1", StringType(), True)])
>>> struct2 = StructType([StructField("f1", StringType(), True)])
>>> struct1 == struct2
True
>>> struct1 = StructType([StructField("f1", CharType(10), True)])
>>> struct2 = StructType([StructField("f1", CharType(10), True)])
>>> struct1 == struct2
True
>>> struct1 = StructType([StructField("f1", VarcharType(10), True)])
>>> struct2 = StructType([StructField("f1", VarcharType(10), True)])
>>> struct1 == struct2
True
>>> struct1 = StructType([StructField("f1", StringType(), True)])
>>> struct2 = StructType([StructField("f1", StringType(), True),
...     StructField("f2", IntegerType(), False)])
>>> struct1 == struct2
False

The example below demonstrates how to create a DataFrame from a schema built with StructType and StructField:

>>> data = [("Alice", ["Java", "Scala"]), ("Bob", ["Python", "Scala"])]
>>> schema = StructType([
...     StructField("name", StringType()),
...     StructField("languagesSkills", ArrayType(StringType())),
... ])
>>> df = spark.createDataFrame(data=data, schema=schema)
>>> df.printSchema()
root
 |-- name: string (nullable = true)
 |-- languagesSkills: array (nullable = true)
 |    |-- element: string (containsNull = true)
>>> df.show()
+-----+---------------+
| name|languagesSkills|
+-----+---------------+
|Alice|  [Java, Scala]|
|  Bob|[Python, Scala]|
+-----+---------------+

Methods

StructType.__init__([fields])

StructType.add()

Construct a StructType by adding new elements to it, to define the schema.

StructType.fieldNames()

Returns all field names in a list.

StructType.fromInternal(obj)

Converts an internal SQL object into a native Python object.

StructType.fromJson(json)

Constructs StructType from a schema defined in JSON format.

StructType.json()

StructType.jsonValue()

StructType.needConversion()

Whether this type needs conversion between Python objects and internal SQL objects.

StructType.simpleString()

StructType.toInternal(obj)

Converts a Python object into an internal SQL object.

StructType.typeName()