# Allow DataFrame Actions to take formattable columns

#### Details

• Type: Story
• Status: Invalid
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels: None
• Story Points: 2
• Team: Data Release Production
• Urgent?: No

#### Description

Allow DataFrameActions to format columns based on passed-in keywords. This will allow downstream plotting code to work over multiple bands within one task.
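A minimal sketch of the idea, not the actual DataFrameAction API: the class `FormattableColumnAction`, its `columnTemplates` attribute, and the `{band}` placeholder are hypothetical names chosen for illustration. Column names are stored as format strings and resolved from keywords passed at call time, so one configured action can serve several bands.

```python
from dataclasses import dataclass, field


@dataclass
class FormattableColumnAction:
    """Hypothetical stand-in for an action whose column names are templates."""

    # Assumed convention: "{band}" marks where the band name is substituted.
    columnTemplates: list = field(
        default_factory=lambda: ["{band}_psfFlux", "{band}_psfFluxErr"]
    )

    def columns(self, **kwargs):
        """Return concrete column names by formatting each template."""
        return [template.format(**kwargs) for template in self.columnTemplates]


action = FormattableColumnAction()
# The same configured action resolves to different columns for each band.
columns_by_band = {band: action.columns(band=band) for band in ("g", "r", "i")}
```

With this shape, a plotting task could loop over bands and request `action.columns(band=...)` for each one instead of configuring a separate action per band.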

#### Activity

Keith Bechtol added a comment (edited)

Here is a summary of the current implementation of selector actions in faro as related to this ticket.

Problem we were trying to solve:
Imagine that we have an object catalog for a given tract and we want to compute one metric value per band. We iterate over bands, and for each band we want to load the band-specific columns named in the selector actions (e.g., load g_psfFlux in order to apply an SNR selection in the g band to get bright stars).

Selector definitions are located in https://github.com/lsst/faro/blob/main/python/lsst/faro/utils/selectors.py; see, for example, SNRSelector.

For an example of how selectors are used to identify which columns to load from objectTable_tract, see https://github.com/lsst/faro/blob/main/python/lsst/faro/measurement/TractTableMeasurement.py, in particular the runQuantum method of TractTableMeasurementTask:

    def runQuantum(self, butlerQC, inputRefs, outputRefs):
        inputs = butlerQC.get(inputRefs)
        kwargs = {"currentBands": butlerQC.quantum.dataId['band']}

        columns = list(self.config.measure.columns.values())
        for column in self.config.measure.columnsBand.values():
            columns.append(kwargs["currentBands"] + '_' + column)
        columnsWithSelectors = self._getTableColumnsSelectors(columns, kwargs["currentBands"])
        kwargs["catalog"] = inputs["catalog"].get(parameters={"columns": columnsWithSelectors})

The _getTableColumnsSelectors function is defined in https://github.com/lsst/faro/blob/main/python/lsst/faro/base/CatalogMeasurementBase.py, in the CatalogMeasurementBaseTask:

    def _getTableColumnsSelectors(self, columns, currentBands=None):
        """Given a list of selectors, return the columns required to apply
        these selectors.

        Parameters
        ----------
        columns : list [str]
            A list of columns required to calculate a metric. This list
            is appended with any additional columns required for the
            selectorActions.
        currentBands : list [str]
            The filter band(s) associated with the observations.

        Returns
        -------
        columnNames : set [str]
            The set of columns required to compute a metric, with any
            additional columns required for the selectorActions
            appended to the set.
        """
        columnNames = set(columns)
        for actionStruct in [self.config.measure.selectorActions]:
            for action in actionStruct:
                for col in action.columns(currentBands):
                    columnNames.add(col)

        return columnNames

The critical line of code above is:

    action.columns(currentBands)

Here we pass the relevant band or bands to the selector action, which returns the list of columns corresponding to those bands. For example, for the SNRSelector:

    def columns(self, currentBands=None):
        allCols = []
        if self.selectorBandType == "staticBandSet":
            bands = self.staticBandSet
        else:
            bands = currentBands

        if bands is not None:
            for band in bands:
                allCols += [band+'_'+self.fluxType, band+'_'+self.fluxType+'Err']
        else:
            allCols = [self.fluxType, self.fluxType+'Err']
        return allCols
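The run-time behaviour described above can be exercised without the LSST stack using a small stand-in. `SketchSNRSelector` and `gather_columns` below are simplified, hypothetical stand-ins for `SNRSelector` and `_getTableColumnsSelectors`; the default `fluxType` of `psfFlux`, the `"currentBands"` mode label, and the `coord_ra` column are assumptions made for illustration.

```python
class SketchSNRSelector:
    """Simplified stand-in mirroring the band logic of SNRSelector.columns."""

    def __init__(self, fluxType="psfFlux", selectorBandType="currentBands",
                 staticBandSet=None):
        self.fluxType = fluxType
        self.selectorBandType = selectorBandType
        self.staticBandSet = staticBandSet

    def columns(self, currentBands=None):
        # A static band set, fixed in configuration, overrides the run-time bands.
        if self.selectorBandType == "staticBandSet":
            bands = self.staticBandSet
        else:
            bands = currentBands
        if bands is None:
            return [self.fluxType, self.fluxType + "Err"]
        cols = []
        for band in bands:
            cols += [f"{band}_{self.fluxType}", f"{band}_{self.fluxType}Err"]
        return cols


def gather_columns(columns, selectors, currentBands=None):
    """Union the metric columns with the columns each selector needs."""
    columnNames = set(columns)
    for selector in selectors:
        columnNames.update(selector.columns(currentBands))
    return columnNames


runtime = SketchSNRSelector()
static = SketchSNRSelector(selectorBandType="staticBandSet", staticBandSet=["i"])

# "coord_ra" is a placeholder for a column the metric itself would need.
for_g = gather_columns(["coord_ra"], [runtime, static], currentBands=["g"])
```

In this sketch the run-time selector follows whichever band is passed in, while the static selector always requests its configured band. The bands are passed as a list on purpose: a bare Python string would be iterated character by character.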

As I understand it, the difference in the analysis_drp implementation of selector actions (https://github.com/lsst/analysis_drp/blob/main/python/lsst/analysis/drp/dataSelectors.py) is that the bands are not updated at run time but are set in advance in configuration.

Nate Lust added a comment

This has been completely superseded by the analysis_tools redesign, where the functionality is largely as desired.

#### People

• Assignee: Nate Lust
• Reporter: Nate Lust
• Watchers: Jeffrey Carlin, Keith Bechtol, Nate Lust, Peter Ferguson, Sophie Reed, Yusra AlSayyad