Extraction Thin API

This API lets a client send scripts, or perform specific queries (currently related to fills data), to the existing Spark Server. They are executed on the server side and the results are returned to the client. When working with scripts, results arrive as Avro bytes that need to be converted to the desired row format. Currently we support converting to Avro GenericRecord and CMW ImmutableData objects.

For fill-related queries, results can be converted to native CERN Extraction API domain objects using the provided methods:

  • *

The Thin API is currently supported only from Java. To use it from Python, please resort to JPype.

Important

Please make sure you don't mix the jars from the normal Extraction API with the Thin client jars. Such a configuration will not work due to jar clashes between the two APIs. To use the Thin API, please import the nxcals-extraction-api-thin product only.

Input script

The Thin API greatly reduces the number of jars required to access NXCALS data. As a limitation, the script passed to the server must return the Dataset type. Other return types are not supported for the moment.
The script is provided as a string that contains references to our normal API calls. The syntax of the script is JavaScript, but you can use Java objects in it. It has to be of the form:

String script = "DataQuery.builder(sparkSession)" + 
".byEntities().system('CMW')" + 
".startTime('2018-08-01 00:00:00.0').endTime('2018-08-01 01:00:00.0')" + 
".entity().keyValue('property', 'Acquisition').keyValue('device', 'SPSBQMSPSv1').build()";

There are appropriate Query builders provided in the Thin API that will generate some typical basic query strings.
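
To make the shape of such a script string concrete, here is a minimal sketch that assembles the same string from plain Java values. The class `ScriptSketch` and its `buildScript` method are hypothetical names introduced only for illustration; they are not part of the Thin API, and the query builders shipped with the Thin API should be preferred in real code. The sketch assumes that values never contain a single quote and rejects them otherwise.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (NOT part of the Thin API): assembles a script string
// of the form shown above from plain Java values.
final class ScriptSketch {

    // Wraps a value in single quotes, as used inside the script.
    // Assumption: values with a single quote are rejected rather than escaped.
    private static String quote(String value) {
        if (value.indexOf('\'') >= 0) {
            throw new IllegalArgumentException("Single quote not supported in: " + value);
        }
        return "'" + value + "'";
    }

    static String buildScript(String system, String startTime, String endTime,
                              Map<String, String> keyValues) {
        StringBuilder sb = new StringBuilder("DataQuery.builder(sparkSession)")
                .append(".byEntities().system(").append(quote(system)).append(")")
                .append(".startTime(").append(quote(startTime)).append(")")
                .append(".endTime(").append(quote(endTime)).append(")")
                .append(".entity()");
        // One keyValue(...) call per entry, in insertion order.
        for (Map.Entry<String, String> e : keyValues.entrySet()) {
            sb.append(".keyValue(").append(quote(e.getKey())).append(", ")
              .append(quote(e.getValue())).append(")");
        }
        return sb.append(".build()").toString();
    }

    public static void main(String[] args) {
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("property", "Acquisition");
        keys.put("device", "SPSBQMSPSv1");
        System.out.println(buildScript("CMW",
                "2018-08-01 00:00:00.0", "2018-08-01 01:00:00.0", keys));
    }
}
```

A `LinkedHashMap` is used so that the `keyValue` calls appear in the order they were added, which keeps the generated string reproducible.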

Java 11

This is currently the only API that supports Java 11. Please refer to the compatibility page for more information.

Authentication

The API uses RBAC authentication, so no Kerberos token is required. However, a valid RBAC token must be present in the environment, so some form of RBAC login has to be executed before the Thin API is used.

An example of an explicit RBAC login:

try {
    String user = ""; //obtain the user-name from somewhere
    String password = ""; //obtain the password from somewhere
    AuthenticationClient authenticationClient = AuthenticationClient.create();
    RbaToken token = authenticationClient.loginExplicit(user, password);
    ClientTierTokenHolder.setRbaToken(token);
} catch (AuthenticationException e) {
    throw new IllegalArgumentException("Cannot login", e);
}

For other methods of login please refer to the RBAC documentation on the wikis.

Example of use

Please find an example of the typical Thin API client call here:

final String SPARK_SERVERS_URL = "cs-ccr-nxcals5.cern.ch:15000,cs-ccr-nxcals6.cern.ch:15000,cs-ccr-nxcals7.cern.ch:15000,cs-ccr-nxcals8.cern.ch:15000";
//Create the service stub
ExtractionServiceGrpc.ExtractionServiceBlockingStub extractionService = ServiceFactory
        .createExtractionService(SPARK_SERVERS_URL);

String startTime = "2018-08-01 00:00:00.00";
String endTime = "2018-08-01 01:00:00.00";

//This is the meta-data query script (you can write your own, this is just a helper class).
String script = DevicePropertyDataQuery.builder().system("CMW")
        .startTime(startTime).endTime(endTime).entity()
        .parameter("SPSBQMSPSv1/Acquisition").build();

//Adding some custom operations to the generated script
script += ".select('acqStamp', 'bunchIntensities')";

//Build a query
AvroQuery avroQuery = AvroQuery.newBuilder()
        .setScript(script)
        .setCompression(AvroQuery.Codec.BZIP2) //optional bytes compression
        .build();

//Query for data (return type is just bytes that need to be converted to desired record type, currently Avro or CMW ImmutableData)
AvroData avroData = extractionService.query(avroQuery);
long nrOfRecordsInExampleDataset = avroData.getRecordCount();
//Convert to Avro GenericRecord
List<GenericRecord> records = Avro.records(avroData);
//Do something with the list of GenericRecord objects...

//Convert to ImmutableData
List<ImmutableData> cmwRecords = Datax.records(avroData);
//Do something with the list of ImmutableData objects...

Python: work in progress, please use JPype.

For the Spark Server addresses please use:

  • PRO: "cs-ccr-nxcals5.cern.ch:15000,cs-ccr-nxcals6.cern.ch:15000,cs-ccr-nxcals7.cern.ch:15000,cs-ccr-nxcals8.cern.ch:15000"
  • TESTBED: "cs-ccr-nxcalstbs2.cern.ch:15000,cs-ccr-nxcalstbs3.cern.ch:15000,cs-ccr-nxcalstbs4.cern.ch:15000"
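
As a small illustration of how the address lists above might be kept in one place, the following sketch selects the comma-separated string by environment name and checks that each entry looks like `host:port` before it is handed to `ServiceFactory.createExtractionService`. The class `SparkServersSketch` and its methods are hypothetical and not part of the Thin API; only the address strings themselves are taken from this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (NOT part of the Thin API): keeps the server lists
// from this page in one place and sanity-checks their format.
final class SparkServersSketch {

    static final String PRO =
            "cs-ccr-nxcals5.cern.ch:15000,cs-ccr-nxcals6.cern.ch:15000,"
            + "cs-ccr-nxcals7.cern.ch:15000,cs-ccr-nxcals8.cern.ch:15000";

    static final String TESTBED =
            "cs-ccr-nxcalstbs2.cern.ch:15000,cs-ccr-nxcalstbs3.cern.ch:15000,"
            + "cs-ccr-nxcalstbs4.cern.ch:15000";

    // Returns the comma-separated address list for the given environment name.
    static String serversFor(String environment) {
        switch (environment) {
            case "PRO":
                return PRO;
            case "TESTBED":
                return TESTBED;
            default:
                throw new IllegalArgumentException("Unknown environment: " + environment);
        }
    }

    // Splits the list and verifies that every entry has the form host:port.
    static List<String> splitAndCheck(String servers) {
        List<String> result = new ArrayList<>();
        for (String address : servers.split(",")) {
            String trimmed = address.trim();
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("Expected host:port but got: " + trimmed);
            }
            // Throws NumberFormatException (an IllegalArgumentException) if the port is not numeric.
            Integer.parseInt(trimmed.substring(colon + 1));
            result.add(trimmed);
        }
        return result;
    }

    public static void main(String[] args) {
        for (String address : splitAndCheck(serversFor("TESTBED"))) {
            System.out.println(address);
        }
    }
}
```

The string returned by `serversFor` can be passed unchanged as the `SPARK_SERVERS_URL` value used in the example above.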

A full working example of the Thin API usage can be seen in our Java examples project.

Javadoc can be found here.