Perry Gee and Tony Johnson have recently reported high memory usage by SafeClipAssembleCoadd for examples with either large N (number of warps) or large patch sizes.
For simulated data processed by DESC, Tony Johnson reported processes using 70GB while coadding 6kX6k patches with unknown N. (Fill this in if you remember)
For DES data, Perry Gee reported memory usage >32GB with 10kX10k patches and unknown N. (Note: patches this large are not recommended). Masks were still uint16 at the time.
Profiling reveals that this memory usage was likely due to holding N masks in memory. In the profiled example, patches were 4200X4200, mask pixels were int32 (4 bytes), and N=33: 4200*4200*4*33/2**20 = 2220MiB. Full profiling printout attached.
For the simulated data example, N=~1000, patch size = 6kX6k, and mask pixel size = uint16 (May 2016) would have yielded a theoretical usage of 6e3*6e3*2*1000/2**30 = 67GiB. For the DES example with the 10kX10k patches, any more than 150 visits would not have fit on the 32G system.
Now that we've switched to int32 masks, this is even more important: the memory held in masks doubles relative to uint16, which matters most for examples with large N or large patches. Even with 4kX4k patches, it only takes N=180 to get memory usage > 10GB.
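The estimates above can be reproduced with a short script. This is only a back-of-the-envelope sketch: the helper name mask_stack_bytes is hypothetical (not part of the stack), patches are assumed square, and only the mask planes are counted, so real process memory will be higher.

```python
MIB = 2**20
GIB = 2**30


def mask_stack_bytes(patch_side, bytes_per_pixel, n_warps):
    """Bytes needed to hold n_warps square mask planes in memory at once.

    Hypothetical helper for illustration; counts mask pixels only.
    """
    return patch_side * patch_side * bytes_per_pixel * n_warps


# Profiled example: 4200X4200 patch, int32 masks (4 bytes), N=33
profiled_mib = mask_stack_bytes(4200, 4, 33) / MIB

# Simulated data: 6kX6k patch, uint16 masks (2 bytes), N~1000
simulated_gib = mask_stack_bytes(6000, 2, 1000) / GIB

# 4kX4k patch (taken here as 4000 pixels on a side), int32 masks, N=180
four_k_bytes = mask_stack_bytes(4000, 4, 180)

print(f"profiled:  {profiled_mib:.1f} MiB")
print(f"simulated: {simulated_gib:.1f} GiB")
print(f"4kX4k, N=180: {four_k_bytes / 1e9:.2f} GB")
```

Because the cost is linear in N and in bytes per pixel, the move from uint16 to int32 doubles every number here for the same patch size and N.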
This ticket will remove the assumption that N masks can fit in memory from the implementation of CompareWarpAssembleCoadd. Because the solution will likely be the same for SafeClipAssembleCoadd, it will likely be implemented there at the same time.