Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-30948

Change gen3 schema for visit to add seq_num

    XMLWordPrintable

    Details

    • Story Points:
      4
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      Robert Lupton has requested that the first sequence number in a visit be added to the visit dimension record definition in the gen3 registry schema. We also wish to store the group_id to allow grouping of visits.

      In the general case this will allow a user to specify a dataId for the raw that can also be used to obtain the calexp so long as they are specifying the first raw and not the second raw.

      day_obs is already in the visit definition.

      There are some caveats:

      • If a registry has multiple visit_system definitions the day_obs+seq_num may not be able to uniquely identify a calexp.
      • Visit definitions use the exposure dimension record (they do not look at files) so determining the first sequence in the visit depends on all the visit members being ingested.

      To make this robust I think ObservationInfo is going to have to be modified to have the first_seq and end_seq concepts from CAP-763 – this would require that we also change the exposure dimension record to include first_seq and end_seq since without the former we can't define the visit and without the latter the tooling can't warn about possible missing exposures in the visit definition.

        Attachments

          Issue Links

            Activity

            Hide
            jbosch Jim Bosch added a comment - - edited

            It may be worth noting that we do already have other visit fields that require us to have at least one detector from each snap ingested prior to constructing the visit - visit.timespan and visit.exposure_time come to mind.  I don't dispute that they are problematic, too, but unfortunately they are also harder to fix with metadata additions.

            So, either we don't need to consider this ticket blocked by metadata additions (because we already require at least one detector from each snap to be ingested), or we should think about working out a bigger schema change that resolves the full problem (maybe we could just drop those temporal fields from visit and nobody would care, especially, if queries automatically join in exposure for temporal information).  Another alternative is running UPDATE queries on the visit table once all of the information has landed, which is conceptually yucky but may not be a huge problem in practice.

            Show
            jbosch Jim Bosch added a comment - - edited It may be worth noting that we do already have other visit fields that require us to have at least one detector from each snap ingested prior to constructing the visit - visit.timespan and visit.exposure_time come to mind.  I don't dispute that they are problematic, too, but unfortunately they are also harder to fix with metadata additions. So, either we don't need to consider this ticket blocked by metadata additions (because we already require at least one detector from each snap to be ingested), or we should think about working out a bigger schema change that resolves the full problem (maybe we could just drop those temporal fields from visit and nobody would care, especially, if queries automatically join in exposure for temporal information).  Another alternative is running UPDATE queries on the visit table once all of the information has landed, which is conceptually yucky but may not be a huge problem in practice.
            Hide
            mfisherlevine Merlin Fisher-Levine added a comment -

            Having had a chat with Tim, just adding a vote to say this is is extremely important for observing/commissioning use (among many things). Not being able to retrieve anything with visit as a dimension by its day_obs/seq_num is extremely bad news. Right now we can hack around it with string-concatenating the ints with some padding, but 1) that's super hacky, but moreover 2) doesn't work in general/the future, so is incredibly fragile, not to mention awkward.

            Show
            mfisherlevine Merlin Fisher-Levine added a comment - Having had a chat with Tim, just adding a vote to say this is is extremely important for observing/commissioning use (among many things). Not being able to retrieve anything with visit as a dimension by its day_obs/seq_num is extremely bad news. Right now we can hack around it with string-concatenating the ints with some padding, but 1) that's super hacky, but moreover 2) doesn't work in general/the future, so is incredibly fragile, not to mention awkward.
            Hide
            tjenness Tim Jenness added a comment -

            Going back to Jim Bosch's comment. With start_seq/end_seq in the exposure table the visit definition code can work out if it has all the expected exposures when working out the exposure time field. The problem comes if the observation was aborted with only some exposures. Consider (seq_num, start_seq, end_seq):

            • 10, 10, 11
            • 11, 10, 11
            • 12, 12, 13
            • 13, 13, 13
            • 14, 14, 15
            • 16, 16, 16
            • 17, 17, 18

            There are 6 visits there. Visits 10 is complete with two snaps. Visit 12 was clearly aborted but the define-visit code can only tell that because visit 13 has turned up with a new start_seq. Visit 13 is complete (single snap). 14 was aborted but exposure 15 is missing. The define-visit code can't tell that this is not simply a data transfer issue. 16 is complete. 17 was meant to be 2 snaps but no more data has turned up for the night. Was it the end of observing and it didn't finish? Or is the file just slow coming.

            The automated data transfer code can therefore define visits 10, 12, 13, and 16 reliably but we need some scheme for handling visits 14 and 17. (assuming OODS will be defining visits) OODS can probably assume that it must define the visit for 14 once 16 turns up because OODS is going to have to assume that there are no data transfer issues. I don't think OODS can define visit 17 unless an operator kicks it or it is listening to the script queue and understanding that the script with that group_name (which is in the exposure record) has completed (or at least is not actively running). At the USDF end for these scenarios we will need to have some flag to indicate that all data have arrived before we can run define visits for 14 and 17.

            Show
            tjenness Tim Jenness added a comment - Going back to Jim Bosch 's comment. With start_seq/end_seq in the exposure table the visit definition code can work out if it has all the expected exposures when working out the exposure time field. The problem comes if the observation was aborted with only some exposures. Consider (seq_num, start_seq, end_seq): 10, 10, 11 11, 10, 11 12, 12, 13 13, 13, 13 14, 14, 15 16, 16, 16 17, 17, 18 There are 6 visits there. Visits 10 is complete with two snaps. Visit 12 was clearly aborted but the define-visit code can only tell that because visit 13 has turned up with a new start_seq. Visit 13 is complete (single snap). 14 was aborted but exposure 15 is missing. The define-visit code can't tell that this is not simply a data transfer issue. 16 is complete. 17 was meant to be 2 snaps but no more data has turned up for the night. Was it the end of observing and it didn't finish? Or is the file just slow coming. The automated data transfer code can therefore define visits 10, 12, 13, and 16 reliably but we need some scheme for handling visits 14 and 17. (assuming OODS will be defining visits) OODS can probably assume that it must define the visit for 14 once 16 turns up because OODS is going to have to assume that there are no data transfer issues. I don't think OODS can define visit 17 unless an operator kicks it or it is listening to the script queue and understanding that the script with that group_name (which is in the exposure record) has completed (or at least is not actively running). At the USDF end for these scenarios we will need to have some flag to indicate that all data have arrived before we can run define visits for 14 and 17.
            Hide
            ktl Kian-Tat Lim added a comment -

            Prompt Processing doesn't have the problems that Tim described because it will refuse to process an incomplete visit according to the number of snaps that were specified up front in the nextVisit event. After a timeout, it will drop the visit and any snap exposures that were part of it on the floor. Daytime catch-up reprocessing would be needed if the visit were later found to be complete.

            Show
            ktl Kian-Tat Lim added a comment - Prompt Processing doesn't have the problems that Tim described because it will refuse to process an incomplete visit according to the number of snaps that were specified up front in the nextVisit event. After a timeout, it will drop the visit and any snap exposures that were part of it on the floor. Daytime catch-up reprocessing would be needed if the visit were later found to be complete.
            Hide
            tjenness Tim Jenness added a comment -

            The work for this is happening on DM-33942 but with some test data from AT_O_20220405_000348 and surrounding observations define-visits is working properly with CURINDEX/MAXINDEX.

            For example visits (with columns elided):

            instrument id name day_obs visit_system_mask
            ---------- -------------- -------------------------- -------- -----------------
            LATISS 2022040500340 AT_O_20220405_000340 20220405 5
            LATISS 2022040500341 AT_O_20220405_000341 20220405 5
            LATISS 2022040500342 AT_O_20220405_000342 20220405 5
            LATISS 2022040500343 AT_O_20220405_000343 20220405 5
            LATISS  2022040500344 AT_O_20220405_000344 20220405 5
            LATISS  2022040500345 AT_O_20220405_000345 20220405 5
            LATISS  2022040500346 AT_O_20220405_000346 20220405 5
            LATISS  2022040500347 AT_O_20220405_000347 20220405 5
            LATISS  2022040500348 AT_O_20220405_000348 20220405 4
            LATISS  2022040500349 AT_O_20220405_000349 20220405 1
            LATISS 92022040500348 AT_O_20220405_000348_first 20220405 1
            

            visit_definition:

            instrument    exposure       visit     
            ---------- ------------- --------------
                LATISS 2022040500340  2022040500340
                LATISS 2022040500341  2022040500341
                LATISS 2022040500342  2022040500342
                LATISS 2022040500343  2022040500343
                LATISS 2022040500344  2022040500344
                LATISS 2022040500345  2022040500345
                LATISS 2022040500346  2022040500346
                LATISS 2022040500347  2022040500347
                LATISS 2022040500348  2022040500348
                LATISS 2022040500348 92022040500348
                LATISS 2022040500349  2022040500348
                LATISS 2022040500349  2022040500349
            

            visit_system_membership:

            instrument visit_system     visit     
            ---------- ------------ --------------
                LATISS            0  2022040500340
                LATISS            0  2022040500341
                LATISS            0  2022040500342
                LATISS            0  2022040500343
                LATISS            0  2022040500344
                LATISS            0  2022040500345
                LATISS            0  2022040500346
                LATISS            0  2022040500347
                LATISS            0  2022040500349
                LATISS            0 92022040500348
                LATISS            2  2022040500340
                LATISS            2  2022040500341
                LATISS            2  2022040500342
                LATISS            2  2022040500343
                LATISS            2  2022040500344
                LATISS            2  2022040500345
                LATISS            2  2022040500346
                LATISS            2  2022040500347
                LATISS            2  2022040500348
            

            Noting that 349 is a standalone visit (system=0) but not a multi-exposure visit (system=2) (it's part of 348).

            Currently I do not create the multi-exposure visit if the first exposure is missing but I do create it if the second exposure is missing (in theory define-visit can be run again when the second one turns up).

            Show
            tjenness Tim Jenness added a comment - The work for this is happening on DM-33942 but with some test data from AT_O_20220405_000348 and surrounding observations define-visits is working properly with CURINDEX/MAXINDEX. For example visits (with columns elided): instrument id name day_obs visit_system_mask ---------- -------------- -------------------------- -------- ----------------- LATISS 2022040500340 AT_O_20220405_000340 20220405 5 LATISS 2022040500341 AT_O_20220405_000341 20220405 5 LATISS 2022040500342 AT_O_20220405_000342 20220405 5 LATISS 2022040500343 AT_O_20220405_000343 20220405 5 LATISS 2022040500344 AT_O_20220405_000344 20220405 5 LATISS 2022040500345 AT_O_20220405_000345 20220405 5 LATISS 2022040500346 AT_O_20220405_000346 20220405 5 LATISS 2022040500347 AT_O_20220405_000347 20220405 5 LATISS 2022040500348 AT_O_20220405_000348 20220405 4 LATISS 2022040500349 AT_O_20220405_000349 20220405 1 LATISS 92022040500348 AT_O_20220405_000348_first 20220405 1 visit_definition: instrument exposure visit ---------- ------------- -------------- LATISS 2022040500340 2022040500340 LATISS 2022040500341 2022040500341 LATISS 2022040500342 2022040500342 LATISS 2022040500343 2022040500343 LATISS 2022040500344 2022040500344 LATISS 2022040500345 2022040500345 LATISS 2022040500346 2022040500346 LATISS 2022040500347 2022040500347 LATISS 2022040500348 2022040500348 LATISS 2022040500348 92022040500348 LATISS 2022040500349 2022040500348 LATISS 2022040500349 2022040500349 visit_system_membership: instrument visit_system visit ---------- ------------ -------------- LATISS 0 2022040500340 LATISS 0 2022040500341 LATISS 0 2022040500342 LATISS 0 2022040500343 LATISS 0 2022040500344 LATISS 0 2022040500345 LATISS 0 2022040500346 LATISS 0 2022040500347 LATISS 0 2022040500349 LATISS 0 92022040500348 LATISS 2 2022040500340 LATISS 2 2022040500341 LATISS 2 2022040500342 LATISS 2 2022040500343 LATISS 2 2022040500344 LATISS 2 2022040500345 LATISS 2 2022040500346 LATISS 2 2022040500347 LATISS 2 2022040500348 Noting that 349 is a standalone visit (system=0) but not a multi-exposure visit (system=2) (it's part of 348). Currently I do not create the multi-exposure visit if the first exposure is missing but I do create it if the second exposure is missing (in theory define-visit can be run again when the second one turns up).
            Hide
            tjenness Tim Jenness added a comment -

            Note that dataId (20220405, 348) has two matching visits. They have different visit_id values. The convention is that the multi-exposure visit has the more natural visit_id and visit name and the "one to one" visit definition has modified visit_id/name to make them distinct (the 9 prefix and _first suffix).

            This does mean that if you butler.get() using the day_obs and seq_num that something has to break the tie to decide which visit_id you actually meant. The idea is that the instrument record will have a default visit_system that could then be used to make the decision. Jim Bosch, how do I formulate a WHERE clause for query-dimension-records that would be able to say "where default visit_system bit is set in visit_system_mask"?

            Show
            tjenness Tim Jenness added a comment - Note that dataId (20220405, 348) has two matching visits. They have different visit_id values. The convention is that the multi-exposure visit has the more natural visit_id and visit name and the "one to one" visit definition has modified visit_id/name to make them distinct (the 9 prefix and _first suffix). This does mean that if you butler.get() using the day_obs and seq_num that something has to break the tie to decide which visit_id you actually meant. The idea is that the instrument record will have a default visit_system that could then be used to make the decision. Jim Bosch , how do I formulate a WHERE clause for query-dimension-records that would be able to say "where default visit_system bit is set in visit_system_mask"?
            Hide
            jbosch Jim Bosch added a comment -

            Now that I think about it, I don't think we have the binary OR and AND operators needed to make use of the mask field at all (well, you could probably use mod, but it's not a pattern I'd recommend). The simple thing is to not use the mask field at all, and just write visit_system=X. That will require another join in the query, but it should be totally transparent to users.

            Show
            jbosch Jim Bosch added a comment - Now that I think about it, I don't think we have the binary OR and AND operators needed to make use of the mask field at all (well, you could probably use mod, but it's not a pattern I'd recommend). The simple thing is to not use the mask field at all, and just write visit_system=X . That will require another join in the query, but it should be totally transparent to users.

              People

              Assignee:
              Unassigned Unassigned
              Reporter:
              tjenness Tim Jenness
              Watchers:
              Jim Bosch, Kian-Tat Lim, Merlin Fisher-Levine, Robert Lupton, Tim Jenness
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins Builds

                  No builds found.