  1. Data Management
  2. DM-18173

Auxtel Calibration System integration and test in the lab

    Details

      Description

      Now that the software components for the calibration system are up and running again, I'll be able to run some integration tests in the lab. I'll take this opportunity to properly debug and document any SAL/ScriptQueue-related issues I encounter.

      Required Components:

      • ATMonochromator
      • FiberSpectrograph
      • Electrometer
      • ATSpectrograph
      • ATCamera

      Additional components:

      • White Light Source
      • Chiller
      Note: additional components are those that are not required for the tests but will be added if available.

      The artifacts of this task will be, at a minimum, a script to run the Calibration system and probably one that also involves the ATCamera and ATSpectrograph, as well as a Confluence page describing the issues encountered and the associated Jira tickets.

        Attachments

          Activity

          tribeiro Tiago Ribeiro added a comment -

          In the first round of testing, I may have figured out why the late joiner stops working after some time. To start, I found that "some time" is 100 s and that it is probably due to a limit imposed by SAL on the age of the data. I added my comments on DM-18035.

          aclements Andy Clements added a comment -

          Reading through the DDS documentation, it states this could happen if the writers for this topic bring its datatype lifecycle to the NOT_ALIVE_DISPOSED state. Is it possible that the writers for this topic have essentially been "shut off" and disposed of, either implicitly through the QoS settings or explicitly through the Writer API call?

          http://download.prismtech.com/docs/Vortex/html/ospl/DDSTutorial/readandwrite.html

           

          tribeiro Tiago Ribeiro added a comment -

          I don't think so.

          Note that I did test changing the SAL code to ignore the topic age, and it then works for longer than the 100 seconds. I also expanded the debug information and could check that the topic arrives as expected and is later rejected due to age. So I'm confident that the issue is indeed SAL imposing an age limit on the topic.

          The code that I changed is this:

          https://github.com/lsst-ts/ts_sal/blob/818de4bcfd4f36810875d2ca060ca7710b04cff0/lsstsal/scripts/gensalgetput.tcl#L90-L97

              if (debugLevel > 8) \{
                cout << \"=== \[GetSample\] message received :\" << numsamp << endl;
                cout << \"    revCode  : \" << Instances\[j\].private_revCode << endl;
                cout << \"    sndStamp  : \" << Instances\[j\].private_sndStamp << endl;
                cout << \"    origin  : \" << Instances\[j\].private_origin << endl;
                cout << \"    host  : \" << Instances\[j\].private_host << endl;
              \}
              if ( (rcvdTime - Instances\[j\].private_sndStamp) < sal\[actorIdx\].sampleAge && (Instances\[j\].private_origin != 0)) \{
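
          To make the effect of that condition concrete, here is a minimal Python sketch of the same age check as seen by a late joiner. It is an illustration only, not the SAL implementation: the `Sample` class, both function names, and the 100 s default are assumptions (the default is taken from the roughly 100 s observed above), and the "ignoring age" variant assumes the origin check was kept when the age check was removed.

```python
from dataclasses import dataclass

# Assumed value, based on the ~100 s behaviour reported in this ticket.
SAMPLE_AGE_LIMIT_S = 100.0


@dataclass
class Sample:
    """Hypothetical stand-in for a SAL topic sample (not a real SAL class)."""

    private_sndStamp: float  # time the sample was sent, in seconds
    private_origin: int      # sender identifier; 0 is treated as invalid


def accept_sample(sample: Sample, rcvd_time: float,
                  sample_age: float = SAMPLE_AGE_LIMIT_S) -> bool:
    """Mirror the quoted condition: keep a sample only if it is younger
    than sample_age and its origin is non-zero."""
    age = rcvd_time - sample.private_sndStamp
    return age < sample_age and sample.private_origin != 0


def accept_sample_ignoring_age(sample: Sample) -> bool:
    """Sketch of the tested workaround: skip the age check.
    Assumption: the origin check is still applied."""
    return sample.private_origin != 0


# An event published once at t = 0 s and read later by a late joiner.
event = Sample(private_sndStamp=0.0, private_origin=1234)

for joined_at in (10.0, 99.0, 101.0, 600.0):
    print(joined_at,
          accept_sample(event, joined_at),
          accept_sample_ignoring_age(event))
```

          With the age check in place, a reader that joins more than `sample_age` seconds after the last publication discards the only sample available to it, which matches the symptom of the late joiner "stopping" after about 100 s.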
          

          aclements Andy Clements added a comment -

          OK, just wanted to throw that out there as a possibility.

          tribeiro Tiago Ribeiro added a comment -

          This task ended up being much more distributed than I had anticipated. As part of it, I started debugging some issues we've seen in the lab with SAL and discovered an age limit imposed by SAL on the topics that we were not aware of and that was incompatible with the use case we had for SAL.

          Next, I spent some time trying to debug another issue we had with commands being dropped by SAL. I started working on this independently, but it then became the main topic of discussion with the DDS consultant, so I shifted my focus to working with the consultant and the rest of the team on debugging this issue.

          The artifacts of this task are the information I added to the appropriate tickets about the SAL age issue and probably the container that uses the latest OpenSplice version. The container was used extensively during the DDS workshop to try to debug the command issue and will also serve as a road map for the OpenSplice/SAL RPM installation for SAL 3.9 and later.

          aclements Andy Clements added a comment -

          The age-limit issue has been addressed by increasing the limit so that it no longer interferes. This was implemented in 3.8.42 for release to the project and will be put to use in the ScriptQueue with the pLAN testing.

          The other issue, with commands being dropped in the ScriptQueue, was taken into triage during the DDS workshop meetings. A good couple of days were devoted to these problems. Although the exact problem was not determined, we were able to dissect the system to better understand what is going on.

          We now have a plan for Dave Mills to implement the Python-native OpenSplice code in SAL to allow for asyncio interaction.


            People

            • Assignee:
              tribeiro Tiago Ribeiro
              Reporter:
              tribeiro Tiago Ribeiro
              Reviewers:
              Andy Clements
              Watchers:
              Andy Clements, James Buffill [X] (Inactive), Patrick Ingraham, Tiago Ribeiro