Class LookupStrategy

  • All Implemented Interfaces:
    cern.nxcals.common.utils.TriFunction<java.util.function.Function<TimeWindow,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>,​java.lang.String,​ExtractionProperties,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>

    public final class LookupStrategy
    extends java.lang.Object
    implements cern.nxcals.common.utils.TriFunction<java.util.function.Function<TimeWindow,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>,​java.lang.String,​ExtractionProperties,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>
    • Method Summary

      Modifier and Type Method Description
      org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> apply​(java.util.function.Function<TimeWindow,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>> dataProvider, java.lang.String timestampField, ExtractionProperties properties)
      Retrieves the dataset containing the datapoint selected according to the lookup strategy.
      boolean equals​(java.lang.Object o)  
      @NonNull java.time.Duration getDuration()  
      @NonNull cern.nxcals.api.custom.service.extraction.LookupStrategy.Handler getHandler()  
      int hashCode()  
      java.lang.String toString()  
      LookupStrategy withLookupDuration​(long amount, @NonNull java.time.temporal.TemporalUnit unit)  
      • Methods inherited from class java.lang.Object

        clone, finalize, getClass, notify, notifyAll, wait, wait, wait
    • Field Detail

      • LAST_BEFORE_START

        public static final LookupStrategy LAST_BEFORE_START
        Performs a data lookup to extract the last available datapoint (if any) prior to the actual data that belong to the given time window. If such a datapoint exists, the dataset is queried with that datapoint's timestamp as the startTime, so the result contains the data of the time window plus that datapoint.
      • LAST_BEFORE_START_IF_EMPTY

        public static final LookupStrategy LAST_BEFORE_START_IF_EMPTY
        Performs a data lookup to extract the last available datapoint (if any) prior to the given time window, but only if no data exist within the window. In that case a single-row dataset containing that last datapoint is returned. Otherwise, if data exist for the given time window, this strategy performs no lookup, so only datapoints matching the query time window are included in the result.
      • NEXT_AFTER_START_ONLY

        public static final LookupStrategy NEXT_AFTER_START_ONLY
        Performs a data lookup to extract the first available datapoint (if any) after the given time window. A single-row dataset containing that first datapoint is returned.
      • LAST_BEFORE_START_ONLY

        public static final LookupStrategy LAST_BEFORE_START_ONLY
        Performs a data lookup to extract the last available datapoint (if any) prior to the given time window. A single-row dataset containing that last datapoint is returned.
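The three LAST_BEFORE_START variants differ only in when the earlier datapoint is looked up and what is returned with it. The following is a minimal sketch of those descriptions, not the NXCALS implementation: bare timestamps in a TreeSet stand in for dataset rows, the window is assumed to be closed on both ends, LAST_BEFORE_START_ONLY is read as "the closest datapoint before the window start", and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative model only: bare timestamps stand in for dataset rows, and the
// window is assumed to be closed, [start, end]. Not the NXCALS implementation.
class LookupStrategySketch {

    // Data that belongs to the window itself.
    static List<Long> window(TreeSet<Long> data, long start, long end) {
        return new ArrayList<>(data.subSet(start, true, end, true));
    }

    // LAST_BEFORE_START: re-query from the last point before start, if one exists.
    static List<Long> lastBeforeStart(TreeSet<Long> data, long start, long end) {
        Long previous = data.lower(start);
        long effectiveStart = (previous != null) ? previous : start;
        return window(data, effectiveStart, end);
    }

    // LAST_BEFORE_START_IF_EMPTY: fall back to the previous point only when
    // the window itself holds no data.
    static List<Long> lastBeforeStartIfEmpty(TreeSet<Long> data, long start, long end) {
        List<Long> inWindow = window(data, start, end);
        return inWindow.isEmpty() ? lastBeforeStartOnly(data, start) : inWindow;
    }

    // LAST_BEFORE_START_ONLY: at most one row, the last point before start.
    static List<Long> lastBeforeStartOnly(TreeSet<Long> data, long start) {
        Long previous = data.lower(start);
        List<Long> result = new ArrayList<>();
        if (previous != null) {
            result.add(previous);
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<Long> data = new TreeSet<>(Arrays.asList(10L, 20L, 55L));
        System.out.println(lastBeforeStart(data, 50, 60));        // window data plus the point at 20
        System.out.println(lastBeforeStartIfEmpty(data, 50, 60)); // window has data, no lookup
        System.out.println(lastBeforeStartIfEmpty(data, 30, 40)); // empty window, falls back to 20
        System.out.println(lastBeforeStartOnly(data, 50));        // only the point at 20
    }
}
```

With data at 10, 20 and 55 and a window of [50, 60], the first call returns the points 20 and 55, while the IF_EMPTY variant returns only 55 because the window already contains data.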
    • Method Detail

      • withLookupDuration

        public LookupStrategy withLookupDuration(long amount,
                                                 @NonNull java.time.temporal.TemporalUnit unit)
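The signature suggests the common "wither" convention: the call returns a LookupStrategy rather than void, and the predefined strategies are shared constants. The sketch below shows that convention with a hypothetical stand-in class. It assumes the amount and unit are combined with java.time.Duration.of and that the receiver is left unchanged; neither is stated on this page, and the default duration used here is an arbitrary placeholder.

```java
import java.time.Duration;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalUnit;

// Hypothetical immutable value class; it only mirrors the shape of the
// withLookupDuration/getDuration pair and is not the NXCALS class.
final class StrategySketch {

    // Placeholder default, chosen arbitrarily for this example.
    static final StrategySketch DEFAULT = new StrategySketch(Duration.ofDays(1));

    private final Duration duration;

    private StrategySketch(Duration duration) {
        this.duration = duration;
    }

    Duration getDuration() {
        return duration;
    }

    // Returns a copy carrying the new duration; the receiver is not modified,
    // so shared constants such as DEFAULT stay safe to reuse.
    StrategySketch withLookupDuration(long amount, TemporalUnit unit) {
        return new StrategySketch(Duration.of(amount, unit));
    }

    public static void main(String[] args) {
        StrategySketch twoHours = DEFAULT.withLookupDuration(2, ChronoUnit.HOURS);
        System.out.println(twoHours.getDuration()); // duration of the copy
        System.out.println(DEFAULT.getDuration());  // original is unchanged
    }
}
```

Note that Duration.of rejects units with an estimated length, such as ChronoUnit.MONTHS, so a sketch like this only accepts exact units (and DAYS, which is treated as 24 hours).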
      • apply

        public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> apply​(java.util.function.Function<TimeWindow,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>> dataProvider,
                                                                            java.lang.String timestampField,
                                                                            ExtractionProperties properties)
        Retrieves the dataset containing the datapoint selected according to the lookup strategy. All three parameters are needed because, when the strategy is LAST_BEFORE_START_IF_EMPTY, the startTime cannot be known beforehand.
        Specified by:
        apply in interface cern.nxcals.common.utils.TriFunction<java.util.function.Function<TimeWindow,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>,​java.lang.String,​ExtractionProperties,​org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>>
        Parameters:
        dataProvider - used to extract the data
        properties - used to retrieve the end time of the time window being queried
        Returns:
        the Dataset<Row> selected according to the lookup strategy
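Passing a data provider function instead of a ready-made dataset lets the strategy decide which time window to query after it has looked at the data. The sketch below illustrates that idea for the LAST_BEFORE_START_IF_EMPTY case. It is a model under stated assumptions, not the library's code: Window, providerOf, the lookback argument and the Map-based rows are all hypothetical stand-ins for TimeWindow, the real provider, the lookup duration and Dataset<Row>.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative stand-ins only; none of these types are the NXCALS classes.
class ApplySketch {

    // Hypothetical closed time window [start, end] in arbitrary time units.
    static final class Window {
        final long start;
        final long end;

        Window(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    // Models the LAST_BEFORE_START_IF_EMPTY description: return the rows of the
    // requested window, or, if there are none, the single latest row found in a
    // lookback window that ends just before the requested start.
    static List<Map<String, Long>> lastBeforeStartIfEmpty(
            Function<Window, List<Map<String, Long>>> dataProvider,
            String timestampField,
            Window requested,
            long lookback) {
        List<Map<String, Long>> rows = dataProvider.apply(requested);
        if (!rows.isEmpty()) {
            return rows; // data exists, so no lookup is performed
        }
        // Only now is it clear that an earlier window has to be queried.
        Window earlier = new Window(requested.start - lookback, requested.start - 1);
        Map<String, Long> latest = null;
        for (Map<String, Long> row : dataProvider.apply(earlier)) {
            if (latest == null || row.get(timestampField) > latest.get(timestampField)) {
                latest = row;
            }
        }
        List<Map<String, Long>> result = new ArrayList<>();
        if (latest != null) {
            result.add(latest);
        }
        return result;
    }

    // In-memory provider over a fixed set of timestamps, for demonstration.
    static Function<Window, List<Map<String, Long>>> providerOf(String field, long... timestamps) {
        return window -> {
            List<Map<String, Long>> rows = new ArrayList<>();
            for (long t : timestamps) {
                if (t >= window.start && t <= window.end) {
                    rows.add(Collections.singletonMap(field, Long.valueOf(t)));
                }
            }
            return rows;
        };
    }

    public static void main(String[] args) {
        Function<Window, List<Map<String, Long>>> provider = providerOf("ts", 10L, 20L, 55L);
        System.out.println(lastBeforeStartIfEmpty(provider, "ts", new Window(50, 60), 100));
        System.out.println(lastBeforeStartIfEmpty(provider, "ts", new Window(30, 40), 100));
    }
}
```

In the second call the requested window is empty, so the provider is invoked again with an earlier window and only the row at timestamp 20 is returned. A caller that handed over an already-materialized dataset could not support this second query.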
      • getHandler

        public @NonNull cern.nxcals.api.custom.service.extraction.LookupStrategy.Handler getHandler()
      • getDuration

        public @NonNull java.time.Duration getDuration()
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object