Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-4946

afw Wcs object copying does not copy exactly

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: afw, wcslib
    • Labels:
      None
    • Story Points:
      6
    • Team:
      Data Release Production

      Description

      A probable bug in WCSLIB is causing wcscopy to create copies of Wcs objects which are not the same as the object that was copied. In some cases when this object is passed to wcsset it fails, as the Wcs object contains impossible values.

      This has behaviour is non-deterministic (failure is only seen occasionally). The error has only been observed on OSX, but we do not believe it to be operating system dependent (except insofar as different systems and compilers produce different memory layouts and hence different failure modes). This reliably causes ci_hsc to fail when running on a Mac.

      Relevant lines in afw are image/Wcs.cc:140 and the Wcs copy constructor in image/Wcs.cc:468

      Additionally a bug has been found in image/Wcs.cc on line 485 where the flag property should be set on an element, and not on the object itself, ie _wcsInfo[i]->flag = -1;.

        Attachments

          Issue Links

            Activity

            Hide
            swinbank John Swinbank added a comment -

            Per OOB discussion, Nate will come back and add steps to reproduce and a potential fix (or, at least, workaround) for the issue.

            Show
            swinbank John Swinbank added a comment - Per OOB discussion, Nate will come back and add steps to reproduce and a potential fix (or, at least, workaround) for the issue.
            Hide
            swinbank John Swinbank added a comment -

            Nate Lust discovered that the WCSLIB bug has already been fixed upstream: at least versions 5.10 and later are good. If we can upgrade (DM-3793) we get this bug fix for free.

            Show
            swinbank John Swinbank added a comment - Nate Lust discovered that the WCSLIB bug has already been fixed upstream: at least versions 5.10 and later are good. If we can upgrade ( DM-3793 ) we get this bug fix for free.
            Hide
            nlust Nate Lust added a comment -

            Small fix for bug in our code, as the other fix will come in with 3793

            Show
            nlust Nate Lust added a comment - Small fix for bug in our code, as the other fix will come in with 3793
            Hide
            nlust Nate Lust added a comment - - edited

            For posterity the bug is in wcslib/C/wcs.c:841 ->

             835   npv = 0;
             836   for (k = 0; k < wcssrc->npv; k++) {
             837     i = wcssrc->pv[k].i;
             838     if (i == 0 || (i > 0 && map[i-1])) {
             839       /* i == 0 is a special code for the latitude axis. */
             840       wcsdst->pv[npv] = wcssrc->pv[k];
             841       wcsdst->pv[npv].i = map[i-1];
             842       npv++;
             843     }
             844   }
            

            when i is 0 map[ i-1] will try and read before the beginning of the array, which may or may not be a garbage value in memory.

            Show
            nlust Nate Lust added a comment - - edited For posterity the bug is in wcslib/C/wcs.c:841 -> 835 npv = 0; 836 for (k = 0; k < wcssrc->npv; k++) { 837 i = wcssrc->pv[k].i; 838 if (i == 0 || (i > 0 && map[i-1])) { 839 /* i == 0 is a special code for the latitude axis. */ 840 wcsdst->pv[npv] = wcssrc->pv[k]; 841 wcsdst->pv[npv].i = map[i-1]; 842 npv++; 843 } 844 } when i is 0 map[ i-1] will try and read before the beginning of the array, which may or may not be a garbage value in memory.
            Hide
            swinbank John Swinbank added a comment -

            The small fix looks fine. However, I don't think we can just assume that DM-3793 will close this – I'm not clear from the discussion there how much work it'll take to upgrade WCSLIB, and we don't want to be blocked on this for long.

            Since there are really two issues here, can we make a new ticket for the small fix you want to merge straight away, and leave this open/blocked on DM-3793 for the WCSLIB upgrade? (Sorry for the extra clicking around.)

            Show
            swinbank John Swinbank added a comment - The small fix looks fine. However, I don't think we can just assume that DM-3793 will close this – I'm not clear from the discussion there how much work it'll take to upgrade WCSLIB, and we don't want to be blocked on this for long. Since there are really two issues here, can we make a new ticket for the small fix you want to merge straight away, and leave this open/blocked on DM-3793 for the WCSLIB upgrade? (Sorry for the extra clicking around.)
            Hide
            price Paul Price added a comment -

            I'm so very sorry, but I've just found an e-mail I sent to Mark Calabretta a few years ago. Here's his response:

            Hi Paul,

            Thanks for your bug report and the fix which I can confirm is
            valid.

            The i == 0 case arises for PROJPn keywords which appeared in
            early drafts of the WCS papers before being replaced by PVi_ma.
            They are non-standard but supported for backwards compatibility,
            though I have never come across any in a FITS header which is
            probably why the bug was never triggered.

            PROJPn is associated exclusively with the latitude axis, but
            wcslib doesn't know which that is until the wcsprm struct is
            analysed by wcsset(). Hence i == 0 means "the latitude axis",
            to be filled in later by wcsset().

            I'll release a new version anon.

            Regards,
            Mark Calabretta

            On Fri 2013/09/27 17:14:05 -0400, Paul Price wrote
            in a message to: mcalabre@atnf.csiro.au

            Dear Mark,

            We're using wcslib in the LSST/HSC pipeline code. I ran the code
            through valgrind today trying to track down a memory bug, and it found a
            problem in wcslib. We're currently using 4.14, but it appears to be
            present in 4.18 as well. Here's line 834ff in 4.18, within wcssub():

            /* Parameter values. */
            npv = 0;
            for (k = 0; k < wcssrc->npv; k++) {
            i = wcssrc->pv[k].i;
            if (i == 0 || (i > 0 && map[i-1]))

            Unknown macro: { /* i == 0 is a special code for the latitude axis. */ wcsdst->pv[npv] = wcssrc->pv[k]; wcsdst->pv[npv].i = map[i-1]; npv++; }

            }
            wcsdst->npv = npv;

            Now consider the case where i = wcssrc->pv[k].i = 0 (I'm not sure how
            this comes about, but it apparently does come about in our code, and it
            must in at least some cases be legitimate, judging by the comment about
            i == 0). Then you're indexing map[-1], which is Bad.

            I've modified it in my testing to:

            /* Parameter values. */
            npv = 0;
            for (k = 0; k < wcssrc->npv; k++) {
            i = wcssrc->pv[k].i;
            if (i == 0)

            Unknown macro: { /* i == 0 is a special code for the latitude axis. */ wcsdst->pv[npv] = wcssrc->pv[k]; wcsdst->pv[npv].i = 0; npv++; }

            else if (i > 0 && map[i-1])

            Unknown macro: { wcsdst->pv[npv] = wcssrc->pv[k]; wcsdst->pv[npv].i = map[i-1]; npv++; }

            }
            wcsdst->npv = npv;

            That appears to get me past the problems, but I'm not sure it's correct.

            Thanks for all your work on wcslib,

            Paul.

            Show
            price Paul Price added a comment - I'm so very sorry, but I've just found an e-mail I sent to Mark Calabretta a few years ago. Here's his response: Hi Paul, Thanks for your bug report and the fix which I can confirm is valid. The i == 0 case arises for PROJPn keywords which appeared in early drafts of the WCS papers before being replaced by PVi_ma. They are non-standard but supported for backwards compatibility, though I have never come across any in a FITS header which is probably why the bug was never triggered. PROJPn is associated exclusively with the latitude axis, but wcslib doesn't know which that is until the wcsprm struct is analysed by wcsset(). Hence i == 0 means "the latitude axis", to be filled in later by wcsset(). I'll release a new version anon. Regards, Mark Calabretta On Fri 2013/09/27 17:14:05 -0400, Paul Price wrote in a message to: mcalabre@atnf.csiro.au Dear Mark, We're using wcslib in the LSST/HSC pipeline code. I ran the code through valgrind today trying to track down a memory bug, and it found a problem in wcslib. We're currently using 4.14, but it appears to be present in 4.18 as well. Here's line 834ff in 4.18, within wcssub(): /* Parameter values. */ npv = 0; for (k = 0; k < wcssrc->npv; k++) { i = wcssrc->pv [k] .i; if (i == 0 || (i > 0 && map [i-1] )) Unknown macro: { /* i == 0 is a special code for the latitude axis. */ wcsdst->pv[npv] = wcssrc->pv[k]; wcsdst->pv[npv].i = map[i-1]; npv++; } } wcsdst->npv = npv; Now consider the case where i = wcssrc->pv [k] .i = 0 (I'm not sure how this comes about, but it apparently does come about in our code, and it must in at least some cases be legitimate, judging by the comment about i == 0). Then you're indexing map [-1] , which is Bad. I've modified it in my testing to: /* Parameter values. */ npv = 0; for (k = 0; k < wcssrc->npv; k++) { i = wcssrc->pv [k] .i; if (i == 0) Unknown macro: { /* i == 0 is a special code for the latitude axis. */ wcsdst->pv[npv] = wcssrc->pv[k]; wcsdst->pv[npv].i = 0; npv++; } else if (i > 0 && map [i-1] ) Unknown macro: { wcsdst->pv[npv] = wcssrc->pv[k]; wcsdst->pv[npv].i = map[i-1]; npv++; } } wcsdst->npv = npv; That appears to get me past the problems, but I'm not sure it's correct. Thanks for all your work on wcslib, Paul.
            Hide
            swinbank John Swinbank added a comment -

            On second thoughts – it looks like the wcslib upgrade is going ahead on DM-3793. Let's just use this ticket to land the minor fix: no need to file another. Sorry for the noise.

            Show
            swinbank John Swinbank added a comment - On second thoughts – it looks like the wcslib upgrade is going ahead on DM-3793 . Let's just use this ticket to land the minor fix: no need to file another. Sorry for the noise.
            Hide
            swinbank John Swinbank added a comment -

            Merge at will.

            Show
            swinbank John Swinbank added a comment - Merge at will.

              People

              • Assignee:
                nlust Nate Lust
                Reporter:
                nlust Nate Lust
                Reviewers:
                John Swinbank
                Watchers:
                John Swinbank, Nate Lust, Paul Price
              • Votes:
                0 Vote for this issue
                Watchers:
                3 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: