Data Management / DM-36931

rotation data increasingly delayed over time


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: ts_main_telescope
    • Labels:

      Description

      The rotation telemetry topic is increasingly delayed over time.

      I added a diagnostic print statement to the MTRotator CSC in an effort to figure out if this was likely in the low-level controller or the CSC. The statement is in the callback function that is called whenever the CSC reads telemetry from the low-level controller. It prints two values:

      • telemetry_age: rcv_time - telemetry header timestamp, where rcv_time is the time at which the telemetry callback function is called to process low-level telemetry.
      • write rotation delay: the time just after the "rotation" SAL telemetry is written - rcv_time.
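      For reference, here is a minimal sketch of that diagnostic (the message layout, helper function, and counter are illustrative stand-ins, not the actual MTRotator CSC code):

      import time
      from types import SimpleNamespace

      def write_rotation_telemetry(message):
          """Stand-in for writing the "rotation" SAL telemetry (hypothetical helper)."""
          pass

      def telemetry_callback(message, counter):
          """Illustrative version of the diagnostic added to the telemetry callback."""
          rcv_time = time.time()  # time at which the callback is called
          telemetry_age = rcv_time - message.timestamp  # rcv_time - header timestamp

          write_rotation_telemetry(message)
          write_rotation_delay = time.time() - rcv_time  # time spent writing the SAL topic

          if counter % 25 == 0:
              print(f"telemetry_age={telemetry_age:0.3f}; "
                    f"write rotation delay={write_rotation_delay:0.3f}")

      # Example call with a fake message whose header timestamp is 0.05 s old.
      telemetry_callback(SimpleNamespace(timestamp=time.time() - 0.05), counter=0)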

      What I see is that telemetry_age increases slowly and apparently fairly linearly, whereas write rotation delay does not change.

      telemetry_age=0.084; write rotation delay=0.001
      telemetry_age=0.084; write rotation delay=0.001
      telemetry_age=0.084; write rotation delay=0.001
      telemetry_age=0.085; write rotation delay=0.001
      telemetry_age=0.085; write rotation delay=0.001
      telemetry_age=0.085; write rotation delay=0.001
      telemetry_age=0.086; write rotation delay=0.001
      telemetry_age=0.086; write rotation delay=0.001
      telemetry_age=0.086; write rotation delay=0.001
      telemetry_age=0.087; write rotation delay=0.001
      telemetry_age=0.087; write rotation delay=0.001
      telemetry_age=0.087; write rotation delay=0.001
      telemetry_age=0.088; write rotation delay=0.001
      telemetry_age=0.088; write rotation delay=0.001
      telemetry_age=0.088; write rotation delay=0.001
      telemetry_age=0.089; write rotation delay=0.001
      telemetry_age=0.089; write rotation delay=0.001
      telemetry_age=0.089; write rotation delay=0.001
      telemetry_age=0.090; write rotation delay=0.001
      telemetry_age=0.090; write rotation delay=0.001
      telemetry_age=0.091; write rotation delay=0.001
      telemetry_age=0.091; write rotation delay=0.001
      telemetry_age=0.091; write rotation delay=0.001
      telemetry_age=0.092; write rotation delay=0.001
      telemetry_age=0.092; write rotation delay=0.001
      telemetry_age=0.092; write rotation delay=0.001
      telemetry_age=0.093; write rotation delay=0.001
      telemetry_age=0.093; write rotation delay=0.001
      telemetry_age=0.093; write rotation delay=0.001
      telemetry_age=0.094; write rotation delay=0.001
      telemetry_age=0.094; write rotation delay=0.001
      telemetry_age=0.094; write rotation delay=0.001
      telemetry_age=0.095; write rotation delay=0.001
      telemetry_age=0.095; write rotation delay=0.001
      telemetry_age=0.095; write rotation delay=0.001
      telemetry_age=0.096; write rotation delay=0.001
      telemetry_age=0.096; write rotation delay=0.001
      telemetry_age=0.097; write rotation delay=0.001
      telemetry_age=0.097; write rotation delay=0.001
      telemetry_age=0.097; write rotation delay=0.001
      telemetry_age=0.098; write rotation delay=0.001
      telemetry_age=0.098; write rotation delay=0.001
      telemetry_age=0.098; write rotation delay=0.001
      telemetry_age=0.099; write rotation delay=0.001
      telemetry_age=0.099; write rotation delay=0.001
      telemetry_age=0.099; write rotation delay=0.001
      telemetry_age=0.100; write rotation delay=0.001
      ...
      

      The message is printed every 25 telemetry samples. After many minutes telemetry_age is up to 0.465 seconds.

      I strongly suspect the error is inside the low-level controller and I hope you will be willing to rule this out before I look for it in the CSC. I'm having trouble imagining how the CSC could do this (especially with such large delays).

      The only possible explanation I have come up with is an algorithm such as this:

      timestamp = get_current_time()
      while True:
          compute and send telemetry, stamped with timestamp
          sleep telemetry_interval
          timestamp = timestamp + telemetry_interval
      

      The timestamp would fall further behind in each iteration by the amount of the time spent in the "compute and send telemetry" step.
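      A minimal runnable sketch of that hypothesized loop (purely illustrative; this is not the actual controller code) shows the reported age growing by the per-iteration compute time:

      import time

      TELEMETRY_INTERVAL = 0.05  # hypothetical nominal telemetry period (s)
      COMPUTE_TIME = 0.002       # simulated cost of "compute and send telemetry" (s)

      timestamp = time.monotonic()
      for i in range(250):
          time.sleep(COMPUTE_TIME)         # compute and send telemetry
          time.sleep(TELEMETRY_INTERVAL)   # sleep telemetry_interval
          timestamp += TELEMETRY_INTERVAL  # advanced by the interval, never re-read from the clock
          if i % 25 == 0:
              print(f"telemetry_age={time.monotonic() - timestamp:0.3f}")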

        Attachments

        1. delay times after Te-Wei's fix.txt
          33 kB
        2. diff_sal_rcvStamp.png
          17 kB
        3. diff_sal_sndStamp_after_90sec.png
          18 kB
        4. diff_sal_sndStamp.png
          18 kB
        5. show_mismatched_delays_modified.py
          0.8 kB
        6. show_mismatched_delays.py
          0.7 kB
        7. timestamp_rotator_90sec.ipynb
          290 kB
        8. timestamp_rotator.ipynb
          288 kB

          Activity

          rowen Russell Owen created issue -
          rowen Russell Owen added a comment -

          I tried sending the CSC to standby and back to enabled and I see that the time delay more-or-less resets:

          1667946189.507: connected: connected=True
          1667946189.552: summaryState: summaryState=<State.DISABLED: 1>
          Finished command start
          telemetry_age=0.096; write rotation delay=0.001
          telemetry_age=0.096; write rotation delay=0.001
          telemetry_age=0.096; write rotation delay=0.001
          telemetry_age=0.097; write rotation delay=0.001
          telemetry_age=0.097; write rotation delay=0.001
          telemetry_age=0.097; write rotation delay=0.001
          telemetry_age=0.098; write rotation delay=0.001
          telemetry_age=0.098; write rotation delay=0.001
          telemetry_age=0.098; write rotation delay=0.001
          telemetry_age=0.099; write rotation delay=0.001
          telemetry_age=0.099; write rotation delay=0.001
          telemetry_age=0.099; write rotation delay=0.001
          telemetry_age=0.100; write rotation delay=0.001
          

          ttsai Te-Wei Tsai made changes -
          Labels: M2
          ttsai Te-Wei Tsai added a comment - - edited

          Did you check the TCP/IP packet times with Wireshark? The rotator control software uses the hard real-time functionality of the Xenomai dual kernel, so it is hard for me to imagine the problem coming from the low-level controller as well.

          In addition, I think you should check the time differences between consecutive "telemetry header timestamps" (not "rcv_time - telemetry header timestamp") to see whether that difference is increasing. If it is, maybe the real-time functionality has a problem in the low-level controller.
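          A quick sketch of that check, with a synthetic list of header timestamps standing in for the real data (the variable name and values are illustrative):

          import numpy as np

          # Stand-in for the header timestamps of consecutive telemetry samples,
          # nominally 0.05 s apart; replace with the values read by the CSC.
          header_timestamps = np.arange(0.0, 5.0, 0.05) + np.random.normal(0.0, 1e-4, 100)

          # If the controller keeps up, these intervals stay near 0.05 s even while
          # rcv_time - header timestamp grows; a drifting interval would instead point
          # at the real-time loop in the low-level controller.
          intervals = np.diff(header_timestamps)
          print(f"mean={intervals.mean():.4f} s  std={intervals.std():.4f} s  max={intervals.max():.4f} s")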

          rowen Russell Owen added a comment - - edited

          I am not sure what you mean by "check the time difference among the telemetry header timestamps (not rcv_time - telemetry header timestamp) to see the difference is increasing or not". The only timestamp I know about is the timestamp in the telemetry header. The only thing I can think to compare it to is the time at which I receive the data.

          After a long time (I don't know how long) the delay seems to have maxed out at 0.55 seconds. I have not seen it increase past that, which means we may still be able to make the CCW follow the rotator with it. I'm not sure.

          rowen Russell Owen made changes -
          Urgent? on off
          ttsai Te-Wei Tsai made changes -
          Epic Link DM-27721 [ 442128 ]
          ttsai Te-Wei Tsai added a comment -

          I mean that if you received telemetry_1, telemetry_2, telemetry_3, ..., you can compare the header timestamps among them. That will tell you the execution time in the low-level controller. If you have this information in ts_xml, I can do the comparison for you as long as you provide the timestamps of your test last night.

          ttsai Te-Wei Tsai made changes -
          Attachment timestamp_rotator.ipynb [ 64548 ]
          ttsai Te-Wei Tsai added a comment - - edited

          I analyzed the rotator timestamps on the summit in timestamp_rotator.ipynb.

          The standard deviation among timestamps is very small. After analyzing the controller timestamp, the SAL receiving timestamp, and the SAL sending timestamp, I could only find two odd data points, at 0.2 sec and 0.5 sec respectively. Otherwise, the distribution of timestamps looks good.

          After discussing with Russell, I realized the controller system has an intrinsic time delay of 0.05 sec. We agreed that it would be useful to log the times of setting the timestamp (get_clocktime()) and of sending to the TCP/IP buffer (send()) in the low-level controller.

          ttsai Te-Wei Tsai made changes -
          Sprint TSSW Sprint - Nov 07 - Nov 21 [ 1200 ]
          rowen Russell Owen added a comment - - edited

          Te-Wei Tsai and I walked through the code today and I'm fairly sure the problem is a very small issue in the way the low-level controller sends the telemetry: one thread puts telemetry messages onto a buffer and another thread takes them off and publishes them to TCP/IP. The issue is that both threads are loops that sleep for 0.05 seconds. If the reader thread loop takes a tiny bit longer than the writer loop, then eventually it will start to fall behind: first one extra message will accumulate in the buffer, then 2, then 3. As each extra message is added, the reported delay goes up by 0.05 seconds. This accumulation should be steady, which is exactly what we see.

          The solution is trivial: sleep for a significantly shorter time in the reader loop than in the writer loop. Any reader sleep that is shorter than the writer's by more than the run-time difference is sufficient, but I recommend sleeping for 0.01 or 0.02 seconds. That reduces the maximum lag in publishing the telemetry. It also reduces latency in handling commands (since the same loop processes commands).

          I have attached a trivial Python script that shows the issue. Run it and you will see that the number of items queued stays at 1 for a while, then goes to 2 and stays there for a while, then... The sleep time in the reader is slightly longer in order to model my hypothesis that the read loop takes more time than the write loop in the real controller.
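          The attached script is not reproduced in the ticket; the following is a minimal sketch of the same idea (sleep times and names are illustrative, not copied from show_mismatched_delays.py):

          import itertools
          import queue
          import threading
          import time

          INTERVAL = 0.05       # nominal writer period (20 Hz, like the controller)
          READER_EXTRA = 0.001  # assume the reader loop runs slightly slower than the writer

          buffer = queue.Queue()

          def writer():
              # Producer: put one telemetry message on the buffer every INTERVAL.
              for i in itertools.count():
                  buffer.put(i)
                  time.sleep(INTERVAL)

          threading.Thread(target=writer, daemon=True).start()

          # Consumer: take at most one message per iteration, like the controller loop.
          # Because each iteration is slightly longer than the writer's, the backlog
          # grows by one message every INTERVAL / READER_EXTRA iterations.
          for _ in range(500):
              try:
                  buffer.get_nowait()
              except queue.Empty:
                  pass
              print(f"items queued: {buffer.qsize()}")
              time.sleep(INTERVAL + READER_EXTRA)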

          rowen Russell Owen made changes -
          Attachment show_mismatched_delays.py [ 64550 ]
          ttsai Te-Wei Tsai added a comment -

          Although there should be some time delay, it cannot be more than 0.05 sec, because once the reader is 0.05 sec behind, it will always find a new message in the message queue to send as a TCP/IP packet.

          ttsai Te-Wei Tsai made changes -
          Attachment show_mismatched_delays_modified.py [ 64551 ]
          ttsai Te-Wei Tsai added a comment -

          BTW, I do not think the reader will always be slower. Instead, it is random. I modified your script to demonstrate this: show_mismatched_delays_modified.py

          rowen Russell Owen added a comment - - edited

          My point is that IF the reader (consumer of data from the queue) is usually slower than the writer (producer of data to the queue) then we will see this delay. Given the symptoms, I think your read loop is always (or almost always) slower.

          Please shorten the delay in the reader loop – preferably by a lot (e.g. run it at 50 Hz or 100 Hz instead of the current 20 Hz). If that fixes the problem then we are done. If not, we can think harder about it.

          This is very high priority – it is a potential blocker for the "soak test".

          ttsai Te-Wei Tsai added a comment - - edited

          If that were the case, you should not see the delay time saturate. This is why I am very confused. In addition, whether the consumer is always slower is another question.

          ttsai Te-Wei Tsai made changes -
          Status To Do [ 10001 ] In Progress [ 3 ]
          ttsai Te-Wei Tsai made changes -
          Story Points 1
          ttsai Te-Wei Tsai made changes -
          Attachment timestamp_rotator.ipynb [ 64548 ]
          ttsai Te-Wei Tsai made changes -
          Attachment timestamp_rotator.ipynb [ 64553 ]
          ttsai Te-Wei Tsai made changes -
          Attachment timestamp_rotator_90sec.ipynb [ 64554 ]
          ttsai Te-Wei Tsai made changes -
          Attachment diff_sal_sndStamp.png [ 64555 ]
          Attachment diff_sal_rcvStamp.png [ 64556 ]
          ttsai Te-Wei Tsai added a comment - - edited

          I restricted the EFD data to a 90-second window only in show_mismatched_delays_modified.py.

          When I checked the time differences in rcvStamp and sndStamp, I found that the median sndStamp interval is slightly higher than 0.05 sec. I think this might be a possible explanation of the gradually increasing time delay in the experiment:

          time_start = Time("2022-11-08T22:22:30", scale="utc", format="isot")
          time_end = Time("2022-11-08T22:24:00", scale="utc", format="isot")
          

          Note that if I analyze 20 minutes of data, I do not see this. Instead, I need to analyze the data in 90-second windows to see it. This implies that this is not a systematic error.

          Since the CSC uses a callback to write the SAL telemetry, there are three possible places that could introduce this delay:
          1. The control system has a time delay sending the TCP/IP packet.
          2. The OS in the controller, or the switch, has a time delay sending the TCP/IP packet.
          3. The CSC has a time delay receiving the TCP/IP packet.

          The debug order would be 1 --> 3 --> 2.
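          For reference, a windowed EFD query of this kind might look like the following (the EFD client alias, topic, and field names are assumptions based on this discussion, not taken from the attached notebook):

          import asyncio
          from astropy.time import Time
          from lsst_efd_client import EfdClient

          async def main():
              client = EfdClient("summit_efd")  # assumed EFD instance alias
              time_start = Time("2022-11-08T22:22:30", scale="utc", format="isot")
              time_end = Time("2022-11-08T22:24:00", scale="utc", format="isot")
              data = await client.select_time_series(
                  "lsst.sal.MTRotator.rotation",             # assumed topic name
                  ["private_sndStamp", "private_rcvStamp"],  # assumed field names
                  time_start,
                  time_end,
              )
              # Median intervals between consecutive samples; in this 90-second window
              # the sndStamp interval was observed to sit slightly above 0.05 s.
              print(data["private_sndStamp"].diff().median())
              print(data["private_rcvStamp"].diff().median())

          asyncio.run(main())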

          ttsai Te-Wei Tsai made changes -
          Attachment diff_sal_sndStamp_after_90sec.png [ 64557 ]
          ttsai Te-Wei Tsai added a comment - - edited

          Another interesting thing is that after 90 sec, the sndStamp interval comes back to normal:

          time_start = Time("2022-11-08T22:24:00", scale="utc", format="isot")
          time_end = Time("2022-11-08T22:25:30", scale="utc", format="isot")
          

          After checking some other time periods, it looks like this delay is intermittent. Sometimes I see it and sometimes I do not. Right now, the question is whether this intermittence is accidental or whether something actually happened at those times.

          ttsai Te-Wei Tsai added a comment -

          The following is the code in the low-level controller that writes the telemetry TCP/IP packets:

          // Send the telemetry and command status to the client.
          // Note: The data types of input and output are required for pthread_create().
          // The input needs to cast to the correct data type.
          static void *cmdTlmServer_sendTlmAndCmdStatus(void *pData) {
           
              // Get the server information
              serverInfo_t *pServerInfo = (serverInfo_t *)pData;
           
              // Loop delay time
              // 50 milliseconds (= 20 Hz)
              struct timespec ts50;
              ts50.tv_nsec = 50000000;
              ts50.tv_sec = 0;
           
              // Wait until begins to write the telemetry
              while (!pServerInfo->isReadyTlm) {
                  nanosleep(&ts50, NULL);
              }
           
              // Write the telemetry and command status
              syslog(LOG_NOTICE, "The telemetry thread is running in the %s server.",
                     pServerInfo->pName);
           
              int lastServerStatus = ServerStatus_Disconnected;
              while (pServerInfo->isReadyTlm) {
                  // Reset the record of finding the closed connection and update the last
                  // server status
                  if ((lastServerStatus == ServerStatus_Disconnected) &&
                      (pServerInfo->serverStatus == ServerStatus_Connected)) {
                      pServerInfo->isCloseConnDetected = false;
                  }
           
                  lastServerStatus = pServerInfo->serverStatus;
           
                  // Reply the last command status from commanding.c in controller code
                  int error = 0;
                  if (!pServerInfo->isCloseConnDetected && (error == 0)) {
                      error = cmdTlmServer_sendLastCmdStateInMsgQueue(pServerInfo);
                  }
           
                  // Send the telemetry if any
                  if (!pServerInfo->isCloseConnDetected && (error == 0)) {
                      error = cmdTlmServer_sendLastTlmInMsgQueue(pServerInfo);
                  }
           
                  // Report if sending failed. The server thread will notice this and
                  // change the server status to be disconnected in cmdTlmServer_run()
                  if (error == -1) {
                      pServerInfo->isCloseConnDetected = true;
                  }
           
                  nanosleep(&ts50, NULL);
              }
           
              return 0;
          }
          

          If we assume one command status and one telemetry packet need to be written into the TCP/IP pipe at each iteration, the loop time will certainly be more than 50 milliseconds. However, this value would be "steady" and could not gradually increase as we saw on the summit. The message queues are non-blocking, and the functions (cmdTlmServer_sendLastCmdStateInMsgQueue() and cmdTlmServer_sendLastTlmInMsgQueue()) only take one instance from the queue each time. There is no inner while loop (if there were, it would be easier to explain the increasing time delay).

          ttsai Te-Wei Tsai added a comment -

          Please help to review the PR:
          https://github.com/lsst-ts/ts_rotator_controller/pull/46

          Thanks!

          ttsai Te-Wei Tsai made changes -
          Reviewers Russell Owen [ rowen ]
          Status In Progress [ 3 ] In Review [ 10004 ]
          rowen Russell Owen added a comment - - edited

          Reviewed on github.

          Also, the change is running and seems to have worked. I will append a short log of delay times (the time MTRotator received the telemetry minus the telemetry header time). It increases and then resets to a small value and starts over. All delay values are much smaller than they used to be.

          Unlike the log in the original report, this shows every packet (so it is chatty enough that I did not want to leave it running). I am not sure what to make of the alternation between almost no delay and slightly longer delay, but I do not consider it to be a problem.

          rowen Russell Owen made changes -
          Status In Review [ 10004 ] Reviewed [ 10101 ]
          rowen Russell Owen made changes -
          Attachment delay times after Te-Wei's fix.txt [ 64570 ]
          ttsai Te-Wei Tsai added a comment -

          I am glad to hear it works, although I still do not understand why. I am not sure how long you have tested this already. Based on the data analysis of the EFD, this time delay is intermittent. My assumption is that this delay happened only when you issued the track command, not other commands (such as the point-to-point movement). Just let me know if you want me to take time to dig into it more. Thanks!

          ttsai Te-Wei Tsai made changes -
          Resolution Done [ 10000 ]
          Status Reviewed [ 10101 ] Done [ 10002 ]

            People

            Assignee:
            ttsai Te-Wei Tsai
            Reporter:
            rowen Russell Owen
            Reviewers:
            Russell Owen
            Watchers:
            Russell Owen, Te-Wei Tsai
