Data Management / DM-7390

Improve tests highly sensitive to random seeds

    Details

    • Type: Story
    • Status: Won't Fix
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Story Points: 2
    • Team: Alert Production
    • Urgent?: No

      Description

      While it is important to use random seeds to prevent a strange random value from causing a test to fail, a handful of tests seem to depend too heavily on the specific seed they are given. This ticket should serve both as a log of tests that may be too sensitive to the seed value and, eventually, as a record of corrections to those tests.
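
      The pattern at issue looks roughly like the sketch below. This is a hypothetical test, not taken from any package named in this ticket; NoiseStatisticsTestCase, the sample size, and the tolerances are all invented for illustration.

          import unittest

          import numpy as np


          class NoiseStatisticsTestCase(unittest.TestCase):
              """A statistical assertion whose outcome depends on the seed."""

              def testMeanNearZero(self):
                  # Pinning the seed makes the test reproducible, but it also
                  # hides how close the assertion sits to its tolerance.
                  rng = np.random.RandomState(42)
                  sample = rng.normal(loc=0.0, scale=1.0, size=1000)
                  # The standard error of the mean here is ~0.032, so a bound
                  # of 0.15 (~4.7 sigma) passes for essentially any seed,
                  # while a bound of 0.05 (~1.6 sigma) would fail for roughly
                  # one seed in nine -- the sensitivity this ticket tracks.
                  self.assertLess(abs(sample.mean()), 0.15)


          if __name__ == "__main__":
              unittest.main()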

            Activity

            Fred Moolekamp added a comment -

            So far the following tests have been noted to be sensitive to the random seed:

            1. afw/tests/testApCorrMap.py
            2. ip_diffim/tests/testDipole.py
            Tim Jenness added a comment -

            Also, afw/tests/testChebyshevBoundedField.py; see DM-7461. Changing the random seed can shift the results of testEvaluate enough for the difference to exceed the tolerance by a factor of 2, and the results are also sensitive to MKL vs no MKL.

            Tim Jenness added a comment -

            There's a test in pipe_tasks that can fail because of random number sensitivity.

            John Swinbank added a comment -

            I'm unsure of where to go with this ticket.

            I've confirmed that I can cause both test_apCorrMap.py and test_dipole.py to fail by changing the seed (although it took me several attempts to find a seed which fails). I didn't manage to get a failure in test_chebyshevBoundedField.py, but I'm prepared to believe I would if I kept playing with the seed for long enough.

            But... is this actually a problem? In all of those cases, we could avoid the problem by loosening the test tolerance. Would that be generally useful? It's not obvious to me that it would.

            I'm happy to hear thoughts, but I'm inclined to close this as “won't fix”, and invite folks to file bugs against specific tests describing exactly what they want changed.
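
            (Finding a failing seed by hand, as described above, can be mechanized. The sketch below reuses the hypothetical assertion from the description, not any real test, and runs it under a sweep of seeds with unittest's subTest so that every failing seed is reported rather than just the first.)

                import unittest

                import numpy as np


                class SeedSweepTestCase(unittest.TestCase):
                    """Run one statistical assertion under many seeds so a
                    'bad' seed is found mechanically rather than by hand."""

                    def testMeanNearZeroManySeeds(self):
                        for seed in range(50):
                            # subTest keeps going after a failure and labels
                            # each failing seed in the test report.
                            with self.subTest(seed=seed):
                                rng = np.random.RandomState(seed)
                                sample = rng.normal(0.0, 1.0, size=1000)
                                # Generous bound (~4.7 sigma): all 50 seeds
                                # pass. Tighten it to 0.05 and subTest lists
                                # exactly which seeds fail.
                                self.assertLess(abs(sample.mean()), 0.15)


                if __name__ == "__main__":
                    unittest.main()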

            Tim Jenness added a comment -

            I do worry that the random seed variation is not understood, and so we aren't entirely sure what the right number is for the tolerance of each test. Ideally I'd like to know the distribution of answers as we change the seed, and to determine from that distribution whether the answers are all acceptable or whether it's telling us that the algorithm itself is unstable and will cause us grief later on. I understand that this is a lot of work, though, so I'm not going to block a won't fix.
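
            (A minimal sketch of that distribution check, again using the hypothetical metric from the description rather than any real test output: sweep the seed, collect the metric, and read a tolerance off the tail of the resulting distribution.)

                import numpy as np


                def test_metric(seed, size=1000):
                    # Hypothetical stand-in for the quantity a real test
                    # compares against its tolerance.
                    rng = np.random.RandomState(seed)
                    return abs(rng.normal(0.0, 1.0, size).mean())


                values = np.array([test_metric(seed) for seed in range(1000)])
                # A narrow, short-tailed distribution suggests any seed is
                # acceptable and the tolerance can be set from a high
                # percentile; a long tail would point at algorithmic
                # instability rather than an unlucky seed.
                print(f"mean={values.mean():.4f}  std={values.std():.4f}  "
                      f"max={values.max():.4f}  "
                      f"p99.9={np.percentile(values, 99.9):.4f}")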

            John Swinbank added a comment -

            I'm nervous about the idea that we can post-facto assess the scientific validity of algorithms based on unit test outputs.

            If the test were carefully written together with the algorithm with an eye to being used for this purpose, I think it'd be a great idea. Where that wasn't the case, though, I think it'd be a lot of work for minimal gain — I'd rather treat these tests as effectively regression tests, and assess scientific validity of our algorithms by large-scale data processing campaigns.

            John Swinbank added a comment -

            Won't Fixing, as threatened.


              People

              Assignee: Unassigned
              Reporter: Fred Moolekamp
              Watchers: Fred Moolekamp, John Swinbank, Tim Jenness
              Votes: 0

