nxcals.api.extraction.data.builders.DataFrame.observe

DataFrame.observe(observation: Observation, *exprs: Column) → DataFrame

Observe (named) metrics through an Observation instance.

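A user can retrieve the metrics by accessing Observation.get once an action has run on the observed DataFrame. The metrics are collected during the first action; Observation.get blocks until that action has finished, and later actions do not change the values it returns.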

New in version 3.3.0.

Parameters:
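  • observation (Observation) – the Observation instance through which the metrics are retrieved.

  • exprs (list of Column) – one or more column expressions (Column) defining the metrics, passed as separate positional arguments. Each expression must be a literal or contain an aggregate function.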

Returns:

the observed DataFrame.

Return type:

DataFrame

Notes

This method does not support streaming datasets.

Examples

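>>> from pyspark.sql.functions import col, count, lit, max
>>> from pyspark.sql import Observation
>>> # df is assumed to be an existing two-row DataFrame with an "age" column, e.g.
>>> # df = spark.createDataFrame([(2, "Alice"), (5, "Bob")], ["age", "name"])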
>>> observation = Observation("my metrics")
>>> observed_df = df.observe(observation, count(lit(1)).alias("count"), max(col("age")))
>>> observed_df.count()
2
>>> observation.get
{'count': 2, 'max(age)': 5}
