Details

Type: Story

Status: Done

Resolution: Done

Fix Version/s: None

Component/s: None

Labels: None

Story Points: 2

Epic Link:

Sprint: Science Pipelines DMW163, Science Pipelines DMW164, Science Pipelines DMW165

Team: Data Release Production
Description
This is a report of the time it takes to run ShapeletPsfApprox and CModel over 10,000 galaxies from GalSim.
Attachments
DM4368.odt (20 kB)
Full.max.2000.logs (24 kB)
Activity
The following study was done with two rounds of 7-sigma clipping, which should remove only extreme outliers. It shows that the large standard deviations in the previous runs were caused by a few extreme outliers.
psfs_0.5.fits
SingleGaussian: average for 387 psfs: 0.0024 stdev: 0.004687
clipped values > 0.1075: [0.28041911125183105, 0.8109719753265381]
DoubleGaussian: average for 387 psfs: 0.0051 stdev: 0.000779
clipped values > 0.0176: [1.1540520191192627, 0.03627610206604004, 0.5256130695343018]
DoubleShapelet: average for 387 psfs: 0.0141 stdev: 0.001438
clipped values > 0.0609: [1.7314538955688477, 1.0571620464324951, 0.06220102310180664, 0.13170194625854492]
Full: average for 387 psfs: 0.1134 stdev: 0.084518
clipped values > 1.7492: [15.541382074356079, 3.824735164642334, 2.155457019805908, 11.185655117034912]
Test1: average for 387 psfs: 0.2929 stdev: 0.097276
clipped values > 2.7615: [17.389684915542603, 4.001487970352173, 5.731444835662842, 22.650147199630737, 23.304425954818726]
Test2: average for 387 psfs: 0.2554 stdev: 0.348485
clipped values > 5.7946: [69.13292789459229, 14.105606079101562]
.psfs_0.7.fits
SingleGaussian: average for 369 psfs: 0.0021 stdev: 0.000040
clipped values > 0.0024: [0.002516031265258789, 0.7539308071136475, 0.7510437965393066, 0.7741520404815674, 0.8092930316925049]
DoubleGaussian: average for 369 psfs: 0.0051 stdev: 0.001552
clipped values > 0.0962: [2.000887155532837, 0.21076488494873047, 0.13927197456359863]
DoubleShapelet: average for 369 psfs: 0.0141 stdev: 0.000160
clipped values > 0.0360: [0.0733940601348877, 6.129197835922241, 6.105515956878662, 4.8287341594696045]
Full: average for 369 psfs: 0.1141 stdev: 0.099120
clipped values > 1.9049: [4.603094100952148, 34.23759579658508]
Test1: average for 369 psfs: 0.2943 stdev: 0.153182
clipped values > 5.2502: [11.893396139144897, 6.470481872558594, 115.38624501228333]
Test2: average for 369 psfs: 0.2175 stdev: 0.020017
clipped values > 1.1176: [5.853317975997925, 3.2104039192199707, 2.6327810287475586]
.psfs_0.9.fits
SingleGaussian: average for 352 psfs: 0.0021 stdev: 0.000094
clipped values > 0.0244: [0.03729820251464844, 0.18493080139160156, 0.0496518611907959]
DoubleGaussian: average for 352 psfs: 0.0050 stdev: 0.000046
clipped values > 0.0083: [2.0939059257507324, 0.00985407829284668, 2.063598871231079, 0.01220703125]
DoubleShapelet: average for 352 psfs: 0.0141 stdev: 0.000775
clipped values > 0.1152: [3.2652430534362793, 4.3427510261535645, 0.2819490432739258]
Full: average for 352 psfs: 0.1079 stdev: 0.015193
clipped values > 1.0762: [8.40607213973999, 2.660454034805298, 5.388958930969238]
Test1: average for 352 psfs: 0.3000 stdev: 0.252734
clipped values > 5.3710: [8.429428100585938, 125.1890299320221, 9.923501014709473]
Test2: average for 352 psfs: 0.2238 stdev: 0.131330
clipped values > 2.9361: [5.458249092102051, 93.49560594558716, 4.508358001708984]
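The clipping used for the study above can be sketched roughly as follows. This is only an illustration, not the script that produced these numbers: the function name `sigma_clip_times`, the one-sided (upper tail only) cut, and the use of the population standard deviation are all assumptions.

```python
import numpy as np

def sigma_clip_times(times, nsigma=7.0, rounds=2):
    """Repeatedly drop timings above mean + nsigma * stdev.

    Assumption: only the slow tail is clipped, since the logs above
    report 'clipped values > threshold'.
    Returns (kept, clipped, last_threshold).
    """
    kept = np.asarray(times, dtype=float)
    clipped = []
    threshold = None
    for _ in range(rounds):
        threshold = kept.mean() + nsigma * kept.std()
        too_slow = kept > threshold
        clipped.extend(kept[too_slow].tolist())
        kept = kept[~too_slow]
    return kept, clipped, threshold

# Made-up timings: 100 ordinary fits plus one extreme outlier.
example_times = [0.002 + 0.0001 * (i % 5) for i in range(100)] + [0.8]
kept, clipped, threshold = sigma_clip_times(example_times)
print(f"average for {len(kept)} psfs: {kept.mean():.4f} stdev: {kept.std():.6f}")
print(f"clipped values > {threshold:.4f}: {clipped}")
```

With a cut this wide, a single very slow fit is removed in the first round and the second round only catches values that were hidden by the inflated standard deviation.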
This is the result of a comparison of the time ShapeletPsfApprox takes vs. CModel for 100 galaxies with 0.7 arcsec seeing. The galaxies are randomly selected, as are the PSFs.
Note that the CModel time does not increase greatly with model complexity, while the ShapeletPsfApprox time does. However, it would probably be best to look at the previous comment for SPA, which was done with outlier rejection.
SingleGaussian: average for 100 exps: 0.0252 stdev: 0.010615
CModel: average for 100 exps: 0.0124 stdev: 0.003567
DoubleGaussian: average for 100 exps: 0.1466 stdev: 0.019766
CModel: average for 100 exps: 0.0230 stdev: 0.008743
DoubleShapelet: average for 100 exps: 3.0702 stdev: 1.194606
CModel: average for 100 exps: 0.0302 stdev: 0.007098
Full: average for 100 exps: 48.8366 stdev: 2.655909
CModel: average for 100 exps: 0.0848 stdev: 0.021048
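To put the scaling in numbers, the means listed above can be divided by the SingleGaussian baseline. This is a small sketch that assumes each CModel line is the CModel time obtained with the PSF model on the line before it; the dictionary and function names are just for illustration.

```python
# Mean times copied from the 100-galaxy comparison above.
spa = {
    "SingleGaussian": 0.0252,
    "DoubleGaussian": 0.1466,
    "DoubleShapelet": 3.0702,
    "Full": 48.8366,
}
cmodel = {
    "SingleGaussian": 0.0124,
    "DoubleGaussian": 0.0230,
    "DoubleShapelet": 0.0302,
    "Full": 0.0848,
}

def slowdown(times, baseline="SingleGaussian"):
    """Return each model's mean time relative to the baseline model."""
    return {model: t / times[baseline] for model, t in times.items()}

spa_slowdown = slowdown(spa)
cmodel_slowdown = slowdown(cmodel)
for model in spa:
    print(f"{model:15s} SPA x{spa_slowdown[model]:8.1f}   CModel x{cmodel_slowdown[model]:4.1f}")
```

On these means, CModel with Full comes out a little under 7x slower than with SingleGaussian, while ShapeletPsfApprox itself is slower by a factor of nearly 2000.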
Assuming all of these numbers are seconds per galaxy or PSF image (rather than seconds for a group), these numbers are indeed way too large, and it's clear I need to do some work on ShapeletPsfApprox to fix that.
I know you already added code to only run ShapeletPsfApprox once per subfield; I'm hoping that running ShapeletPsfApprox only once for every 10k galaxies means it's still not a large fraction of the overall time, even when it's horribly slow. If that's the case, I suggest you just proceed as is, while I work on speeding it up. I'm reasonably confident I can do that in a way that doesn't adversely affect the fitting results.
I'm more concerned that the fits that run incredibly slowly might also converge to a bad fit. To explore that possibility, it'd be useful to look for cases where a simple ShapeletPsfApprox model (e.g. DoubleGaussian) runs quite quickly, but a more complex one that's strictly a superset of the simple one (e.g. Full) is extremely slow. If we could then just look at the models compared to the original image, we might learn something more about what's going on. Just looking at the parameters could be enlightening as well: we may be spending a lot of time trying to constrain components with extremely low amplitudes.
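As a rough check of that hope, the per-subfield cost can be spread over the galaxies that share it. The sketch below takes the slowest means from the 100-galaxy comparison in this ticket and assumes they are seconds per PSF fit and seconds per galaxy respectively; the constant and function names are illustrative only.

```python
# Slowest (Full) means from the 100-galaxy comparison in this ticket.
SPA_FULL_SECONDS = 48.8366      # assumed: seconds per ShapeletPsfApprox fit
CMODEL_FULL_SECONDS = 0.0848    # assumed: seconds per galaxy for CModel
GALAXIES_PER_SUBFIELD = 10_000  # one ShapeletPsfApprox fit shared per subfield

def amortized_fraction(spa_seconds, cmodel_seconds, n_galaxies):
    """Fraction of per-galaxy time spent on a PSF fit shared by n_galaxies."""
    spa_per_galaxy = spa_seconds / n_galaxies
    return spa_per_galaxy / (spa_per_galaxy + cmodel_seconds)

fraction = amortized_fraction(
    SPA_FULL_SECONDS, CMODEL_FULL_SECONDS, GALAXIES_PER_SUBFIELD
)
print(f"ShapeletPsfApprox share of per-galaxy time: {fraction:.1%}")
```

Under those assumptions the shared fit is a few percent of the per-galaxy time. This ignores every other measurement step, so the true share of the overall run would be smaller still.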
If you can package up any of the extremely slow fits as unit tests on ShapeletPsfApprox, I'll take a look at them. I can't promise it will be soon, as December will be very busy, but this will be a fairly high priority for me, since getting the LSST versions of ShapeletPsfApprox and CModel working on real data is going to be important for the HSC merge that everyone else at Princeton is working on right now.
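One way to pick out candidate fits for such tests, assuming per-PSF timings for the simple and the complex model are available as parallel arrays, is sketched below. The function name, the threshold values, and the sample timings are all hypothetical.

```python
import numpy as np

def find_suspect_fits(simple_times, complex_times, simple_max=0.01, complex_min=1.0):
    """Return indices of PSFs where the simple model fit quickly
    but the superset model was extremely slow.

    The thresholds (in seconds) are arbitrary illustration values.
    """
    simple = np.asarray(simple_times, dtype=float)
    complex_ = np.asarray(complex_times, dtype=float)
    return np.flatnonzero((simple <= simple_max) & (complex_ >= complex_min))

# Made-up timings for three PSFs (DoubleGaussian vs. Full).
double_gaussian = [0.005, 0.005, 0.5]
full = [0.1, 15.5, 11.2]
suspects = find_suspect_fits(double_gaussian, full)
print(suspects.tolist())
```

Only the second PSF qualifies here: the first is fast with both models and the third is already slow with the simple one.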
These results differ in a few cases from the ones I sent you earlier. I reran several of the tests to be sure that they were correct.
I have also done the iteration counts you asked for in several cases, though I did not try to include them here.
The DM4368.odt file contains the latest results.
Review complete. I have nothing new to add, but I'll paste some of my statements from our previously offline conversation here for posterity:
It's interesting that the largest PSF images caused that many more failures (suggesting that it takes more iterations to fit more pixels on average), but looking at the other sizes I don't see a clear trend in terms of PSF image size vs. failure rate.
I'm actually encouraged to see that the CModel speed with Full is only ~8x slower than with SingleGaussian; I expected it to be much worse than that.
Mostly, it's clear that I need to do some work on speeding up ShapeletPsfApprox to make it usable in practice, even for simple models. I'm pretty confident I can do that if I can just get some time to work on it. (I'm much less optimistic about being able to speed up CModel, which is why I've been paying more attention to how it scales with the PSF approximation complexity).
I've added a couple of Full and DoubleShapelet model logs which have printouts of time and number of iterations from the size of the history recorder catalog. All but 2 are outer iterations.
These runs show that the times for the higher-order ShapeletPsfApprox fits vary widely. I will plot a distribution next, but the widely varying stdevs and averages are probably caused by some very large outliers.
psfs_0.5.fits
SingleGaussian: average for 306 psfs: 0.0023 stdev: 0.003821
DoubleGaussian: average for 306 psfs: 0.0118 stdev: 0.117214
DoubleShapelet: average for 306 psfs: 0.0300 stdev: 0.235367
Full: average for 306 psfs: 0.2188 stdev: 1.849481
Test1: average for 306 psfs: 0.5648 stdev: 4.341135
Test2: average for 306 psfs: 0.2538 stdev: 0.484847
.psfs_0.7.fits
SingleGaussian: average for 349 psfs: 0.0051 stdev: 0.043088
DoubleGaussian: average for 349 psfs: 0.0053 stdev: 0.004594
DoubleShapelet: average for 349 psfs: 0.0347 stdev: 0.343586
Full: average for 349 psfs: 0.2773 stdev: 2.568652
Test1: average for 349 psfs: 0.6854 stdev: 6.723544
Test2: average for 349 psfs: 0.5440 stdev: 5.086115
.psfs_0.9.fits
SingleGaussian: average for 354 psfs: 0.0046 stdev: 0.040344
DoubleGaussian: average for 354 psfs: 0.0084 stdev: 0.043921
DoubleShapelet: average for 354 psfs: 0.0158 stdev: 0.022809
Full: average for 354 psfs: 0.2617 stdev: 2.471886
Test1: average for 354 psfs: 0.7728 stdev: 5.927591
Test2: average for 354 psfs: 0.3068 stdev: 0.967577

Here is a second set of runs with different PSF libraries.
psfs_0.5.fits
SingleGaussian: average for 387 psfs: 0.0052 stdev: 0.043604
DoubleGaussian: average for 387 psfs: 0.0095 stdev: 0.064276
DoubleShapelet: average for 387 psfs: 0.0217 stdev: 0.102503
Full: average for 387 psfs: 0.1971 stdev: 0.993619
Test1: average for 387 psfs: 0.4779 stdev: 1.878722
Test2: average for 387 psfs: 0.4675 stdev: 3.582227
.psfs_0.7.fits
SingleGaussian: average for 369 psfs: 0.0105 stdev: 0.080098
DoubleGaussian: average for 369 psfs: 0.0115 stdev: 0.105081
DoubleShapelet: average for 369 psfs: 0.0605 stdev: 0.515485
Full: average for 369 psfs: 0.2194 stdev: 1.803057
Test1: average for 369 psfs: 0.6539 stdev: 6.036394
Test2: average for 369 psfs: 0.2457 stdev: 0.352249
.psfs_0.9.fits
SingleGaussian: average for 352 psfs: 0.0029 stdev: 0.010248
DoubleGaussian: average for 352 psfs: 0.0169 stdev: 0.156496
DoubleShapelet: average for 352 psfs: 0.0364 stdev: 0.288886
Full: average for 352 psfs: 0.1539 stdev: 0.543054
Test1: average for 352 psfs: 0.7054 stdev: 6.704170
Test2: average for 352 psfs: 0.5094 stdev: 4.921248