# Reduce memory footprint of meas_mosaic read

## Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
• Team: Data Release Production

## Description

I am running out of memory in meas_mosaic during the read. Usually we think of meas_mosaic as using too much memory during the calculation, but I'm hitting the limit while reading the catalogs:

    Mosaic INFO: Reading catalogs ...
    Mosaic INFO: Use 28 cores for reading source catalog
    slurmstepd: error: Job 43298 exceeded memory limit (150023884 > 131072000), being killed
    slurmstepd: error: Exceeded job memory limit
    slurmstepd: error: *** JOB 43298 ON perseus-r5c4n2 CANCELLED AT 2017-03-23T22:32:38 ***
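
For scale, the two numbers in the slurmstepd message can be converted to more familiar units. This is a rough sketch that assumes the values are reported in KiB, which the log itself does not state; the per-reader figure is only an average that assumes memory is split evenly across the 28 reading processes.

```python
# Assumption: slurmstepd reports both values in KiB (not stated in the log).
KIB_PER_GIB = 1024 * 1024

used_kib = 150023884   # memory in use when the job was killed (from the log)
limit_kib = 131072000  # job memory limit (from the log)
num_cores = 28         # "Use 28 cores for reading source catalog"

used_gib = used_kib / KIB_PER_GIB
limit_gib = limit_kib / KIB_PER_GIB

print(f"limit: {limit_gib:.1f} GiB, used: {used_gib:.1f} GiB")
print(f"overshoot: {used_gib - limit_gib:.1f} GiB")
# Crude average only: ignores the parent process and any uneven split.
print(f"rough average per reader: {used_gib / num_cores:.1f} GiB")
```

Under that assumption the limit works out to exactly 125 GiB, so even a few GiB held by each reader is enough to exceed it.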

## Activity

Paul Price added a comment -

Lauren MacArthur, would you, as the Queen of meas_mosaic, review these changes please?

    price@pap-laptop:~/LSST/meas_mosaic (tickets/DM-9926=) $ git sub
    commit 2a2c41c426ef630fc300250da58455d7aa6e0477
    Author: Paul Price <price@astro.princeton.edu>
    Date:   Fri Mar 24 12:22:33 2017 -0400

        SourceReader: don't read calexp

        Reading the calexp is slow, uses excessive memory and unnecessary:
        * the detector is available from the camera (no need to worry about
          cameras that don't index camera by dataId["ccd"]: we're only
          supporting HSC)
        * the dimensions of the image are available from the calexp_md.
          This is a little dirty, but it's used elsewhere in this file and
          we only expect to have to support meas_mosaic a little longer.

     python/lsst/meas/mosaic/mosaicTask.py | 8 ++++----
     1 file changed, 4 insertions(+), 4 deletions(-)

    commit af52d86947b46b62f227a1321225c23f226f4ea2
    Author: Paul Price <price@astro.princeton.edu>
    Date:   Fri Mar 24 12:24:21 2017 -0400

        use std_vector.reserve before doing a bunch of push_back

        Probably not important here, but good practice in general because
        it guards against memory fragmentation.

     python/lsst/meas/mosaic/mosaicTask.py | 2 ++
     1 file changed, 2 insertions(+)

    commit 401ed89d5e5093ee4bf322c5c9aec79118e6ec66
    Author: Paul Price <price@astro.princeton.edu>
    Date:   Fri Mar 24 12:28:41 2017 -0400

        MosaicTask.readCatalog: reduce memory footprint in multiprocessing

        We do this in two ways:
        1. Use Pool with maxtasksperchild=1. This stops memory fragmentation
           from compounding, because although memory may be fragmented in
           each process, after each read the process exits and is replaced
           with a fresh slate.
        2. Join the pool when done. This stops a bunch of python processes
           from hanging around when iterating over tracts in a single job.

     python/lsst/meas/mosaic/mosaicTask.py | 4 +++-
     1 file changed, 3 insertions(+), 1 deletion(-)

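
The pattern described in the third commit message (a `Pool` with `maxtasksperchild=1`, joined when done) can be sketched with the standard library alone. This is a minimal illustration, not the actual change to `mosaicTask.py`: `read_one` and `read_all` are hypothetical names, and the "catalog" is a placeholder list.

```python
import os
from multiprocessing import Pool


def read_one(index):
    """Hypothetical stand-in for reading one source catalog."""
    data = list(range(1000))  # placeholder for a large catalog
    return index, os.getpid(), len(data)


def read_all(indices, num_cores):
    """Run read_one over indices, retiring each worker after one task."""
    # maxtasksperchild=1: a worker exits after a single task and is
    # replaced by a fresh process, so fragmentation cannot accumulate.
    pool = Pool(processes=num_cores, maxtasksperchild=1)
    try:
        # chunksize=1 makes each task exactly one read; with larger
        # chunks a "task" would cover several reads.
        results = pool.map(read_one, indices, chunksize=1)
    finally:
        # Close and join so no worker processes linger after the read.
        pool.close()
        pool.join()
    return results


if __name__ == "__main__":
    results = read_all(range(8), num_cores=2)
    pids = {pid for _, pid, _ in results}
    print(len(results), "reads handled by", len(pids), "worker processes")
```

The explicit `close()` and `join()` are used instead of `with Pool(...)` because the context manager calls `terminate()` on exit and does not wait for the workers to finish.
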
Lauren MacArthur added a comment -

I adopted the modifications of your first commit (i.e. not reading calexps) on some recent meas_mosaic runs and can confirm that works nicely. Yusra AlSayyad, if you want to have a look at the other two, that would be great!

Yusra AlSayyad added a comment -

Your commit messages pre-answered all my questions. Nice. I might have written the note about "ccd" being OK because we're supporting only HSC as an inline comment in case someone is tempted to copy and paste it.

OK to merge.

Paul Price added a comment -

Added the inline comment, and merged to master.

Thanks, Lauren and Yusra!

## People

• Assignee: Paul Price
• Reporter: Paul Price
• Reviewers: Lauren MacArthur
• Watchers: Lauren MacArthur, Paul Price, Yusra AlSayyad