  Data Management / DM-20127

Improve scarlet initialization

    Details

    • Type: Story
    • Status: In Progress
    • Resolution: Unresolved
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      47
    • Sprint:
      DRP F19-5, DRP F19-6 (Nov), DRP S20-2 (Jan), DRP S20-3 (Feb), DRP S20-5 (Apr), DRP S20-6 (May), DRP F20-1 (June), DRP F20-2 (July)
    • Team:
      Data Release Production

      Description

      A better initialization allows scarlet to converge much more quickly and/or reach a more accurate solution. The main issue with the current initialization is that sources are initialized in the observed seeing, which is much wider than the model PSF for ground-based telescopes. Testing has shown that it is easier for the optimization algorithm to make sources wider than it is to make them narrower, so it is to our advantage to use a more conservative initialization that produces narrow sources and allows them to grow.

      The goal of this ticket is to devise such an algorithm and demonstrate that it improves the performance of scarlet.

        Attachments

          Issue Links

            Activity

            fred3m Fred Moolekamp added a comment -

            This ticket probably should have been subdivided into several other tickets, but the work evolved over time and there was never a good place to demarcate it and create a new ticket until it was already completed. I just updated the story points with the number of story points used so far on this ticket; below is a summary of the work performed so far:

            • The first week or so was spent digging deep into the numpy.fft library to track down a single-pixel shift that frequently occurred between images before and after convolution. In short, a combination of the numpy.fft.fftshift function, the padding of images to be convolved, and the trimming and centering algorithm used to undo the padding were to blame (see the first sketch after this list).
            • The next week or two was spent naively applying deconvolutions directly to the initial models that scarlet created from HSC images. This included creating a new Fourier class in scarlet to do the bookkeeping that ensures images are properly padded, trimmed, and centered (second sketch after this list).
            • This ticket was about to be merged when it was realized that deconvolution in the initialization failed when using our simulated images created by galsim. The next week or so was spent investigating why deconvolution worked better on real images than simulated ones, the opposite of what one might expect.
            • It was discovered that the failure in the simulated images was not a problem with galsim or the models used therein, but occurred because we used a wider PSF than the ones in the real HSC images. This led to an investigation of why deconvolution failed with an idealized Gaussian PSF simply because it was wide, and of where exactly the algorithm broke down. The finding is that even with a Gaussian PSF, a Gaussian signal, and no noise, numerical artifacts introduced by floating-point precision dominate the high frequencies needed to deconvolve from a wider PSF to a narrower one, even when the two PSFs are nearly identical in width (third sketch after this list).
            • Next, a week or so was spent attempting to convert our initial (SDSS/HSC-like) templates into Sersic models, in the hope that they would be easier to deconvolve. I was able to successfully create Sersic models to match the observed galaxies (after a bit of work, since galsim uses special sauce to model the inner "peaky" part of Sersic profiles and it took some time to discover it), but it turns out that deconvolving a Sersic model is also non-trivial, and this approach was abandoned.
            • Another few days were spent implementing a Gaussian mixture model, again starting from the initial SDSS/HSC-like templates, since Gaussian mixtures can be analytically deconvolved when both the target and observation PSFs are Gaussian (fourth sketch after this list). Unfortunately this technique did not work without a large number of Gaussians: using only 2 or 3 resulted in sampling parts of the observed model that were dominated by the PSF rather than the signal, so deconvolution did not give the correct answer. The technique only worked when ~10-15 Gaussians were used, which has a runtime comparable to scarlet itself, so the initialization doesn't save us anything. I later found out that this is similar to the technique used by Erin Sheldon in his shredder code that is currently running in DES.
            • Before abandoning this project completely, I ran scarlet on several models that we use for testing, with multiple different algorithms: no deconvolution, deconvolution with the PSF difference kernel, and a few different mixture models. The total number of iterations and the runtime were calculated for each model, for each blend, and the results will be attached to this ticket. The main summary is that deconvolution works best, but only when the observed PSF has sigma less than ~1.2; for larger sigma the algorithm quickly diverges and takes significantly longer to run. However, there are a number of cases where no deconvolution performs significantly better. While we could investigate why this is the case, my recommendation at this point is to shelve this ticket and return to it only if we gain some new insight that makes it worth spending more time on.
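            First, a minimal sketch of the centering pitfall described in the first bullet. This is not the scarlet code, and it omits the padding part of the problem: for odd-length axes, numpy.fft.fftshift and numpy.fft.ifftshift give different results, so using the wrong one to move a kernel's center to the origin shifts the convolved image by one pixel.

```python
import numpy as np

n = 7  # odd size: fftshift and ifftshift differ by one sample here
image = np.zeros((n, n))
image[3, 3] = 1.0

# A delta-function kernel centered at (n // 2, n // 2); convolving with it
# should leave the image unchanged.
kernel = np.zeros((n, n))
kernel[n // 2, n // 2] = 1.0

def convolve(image, kernel, shift):
    """FFT convolution, using `shift` to move the kernel center to the origin."""
    kernel_f = np.fft.rfftn(shift(kernel))
    return np.fft.irfftn(np.fft.rfftn(image) * kernel_f, image.shape)

good = convolve(image, kernel, np.fft.ifftshift)  # correct centering
bad = convolve(image, kernel, np.fft.fftshift)    # wrong shift for odd n

print(np.unravel_index(np.argmax(good), good.shape))  # (3, 3): unchanged
print(np.unravel_index(np.argmax(bad), bad.shape))    # (2, 2): off by one pixel
```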
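            Second, the Fourier bookkeeping idea. This is a hypothetical, much-simplified sketch; the class name matches scarlet's, but the methods and signatures here are illustrative only. The point is to keep an image and its padded transform together so that padding, trimming, and centering happen in exactly one place.

```python
import numpy as np

class Fourier:
    """Pairs a real-space image with its (padded) Fourier transform."""

    def __init__(self, image):
        self.image = np.asarray(image)

    def fft(self, shape):
        """Transform of the image, zero-padded out to `shape`."""
        return np.fft.rfftn(self.image, shape)

def convolve(a, b):
    """Convolve two Fourier objects: pad to the full linear-convolution
    size, multiply transforms, then trim the centered result back to
    the shape of `a`."""
    shape = tuple(s1 + s2 - 1 for s1, s2 in zip(a.image.shape, b.image.shape))
    full = np.fft.irfftn(a.fft(shape) * b.fft(shape), shape)
    # Trim back to a's shape, keeping the center of the full convolution.
    starts = tuple((f - s) // 2 for f, s in zip(shape, a.image.shape))
    slices = tuple(slice(st, st + s) for st, s in zip(starts, a.image.shape))
    return Fourier(full[slices])

# Usage: convolve a 20x20 image with a centered 11x11 kernel.
image = Fourier(np.random.rand(20, 20))
h = np.hanning(11)  # peaks at index 5 = 11 // 2, i.e. a centered kernel
psf = Fourier(np.outer(h, h))
print(convolve(image, psf).image.shape)  # (20, 20)
```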
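            Third, the floating-point failure mode can be reproduced in one dimension. This is an assumed toy setup with made-up widths, not the ticket's actual test: a noise-free Gaussian signal observed through a wide Gaussian PSF, deconvolved to a narrower model PSF via the Fourier-space difference kernel.

```python
import numpy as np

n = 101
x = np.arange(n) - n // 2

def gaussian(sigma):
    g = np.exp(-0.5 * (x / sigma) ** 2)
    return g / g.sum()

sig_signal, sig_obs, sig_model = 1.0, 3.0, 1.0  # illustrative values

# Noise-free "observation": a Gaussian signal seen through the wide PSF
# (convolving Gaussians adds their variances).
observed = gaussian(np.hypot(sig_signal, sig_obs))
# The exact answer: the same signal seen through the narrow model PSF.
target = gaussian(np.hypot(sig_signal, sig_model))

# Fourier-space difference kernel: model PSF divided by observed PSF. Its
# magnitude grows like exp(+k^2 (sig_obs^2 - sig_model^2) / 2), which is
# roughly 1e17 at the Nyquist frequency for these widths.
obs_f = np.fft.rfft(np.fft.ifftshift(gaussian(sig_obs)))
model_f = np.fft.rfft(np.fft.ifftshift(gaussian(sig_model)))
deconvolved_f = np.fft.rfft(np.fft.ifftshift(observed)) * model_f / obs_f
deconvolved = np.fft.fftshift(np.fft.irfft(deconvolved_f, n))

# Float64 roundoff (~1e-16) in the high frequencies of `observed` is
# amplified by the difference kernel into artifacts that dwarf the true
# answer, even though no noise was added anywhere.
print(np.abs(deconvolved - target).max())
```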
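            Fourth, the analytic property behind the Gaussian-mixture attempt: convolving Gaussians adds variances, so a mixture fitted in the observed frame can be moved to the model frame by shrinking each component's variance, with no numerical deconvolution at all. The numbers below are made up for illustration, and the last comment points at the failure mode described above.

```python
import numpy as np

sig_obs, sig_model = 2.0, 0.8   # observed seeing vs. narrow model PSF

# A mixture fitted to the *observed* profile: (amplitude, sigma) pairs.
observed_mixture = [(0.7, 2.5), (0.3, 4.0)]

# Moving each component from the observed frame to the model frame just
# replaces sigma^2 with sigma^2 - sig_obs^2 + sig_model^2.
model_mixture = [(a, np.sqrt(s**2 - sig_obs**2 + sig_model**2))
                 for a, s in observed_mixture]
print(model_mixture)

# The failure mode with only 2-3 components: a component that mostly
# traces the PSF (s close to sig_obs) has almost no intrinsic width left,
# so small fitting errors push s^2 - sig_obs^2 to zero or negative and
# the "deconvolved" component is meaningless.
```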
            fred3m Fred Moolekamp added a comment -

            I'll also add that in my science time I have been working on two separate blog posts that describe in detail many of the findings mentioned in my previous comment. The first post will cover the numpy FFT fftshift/centering issue, since I've talked to several other people who have been bitten by the same issue, and the second will discuss the deconvolution issues. Since this is one of many science-time projects I'm working on right now, the ETA is unknown.

            fred3m Fred Moolekamp added a comment -

            Rather than continue to add to the already massive number of story points attached to this ticket, splinter tickets will now be created for specific smaller tasks associated with improved initialization.


              People

              • Assignee:
                fred3m Fred Moolekamp
              • Reporter:
                fred3m Fred Moolekamp
              • Watchers:
                Fred Moolekamp
              • Votes:
                0

                Dates

                • Created:
                  Updated:
