# Please do not strip tests/ from Pipelines Docker images


## Details

• Type: Story
• Status: To Do
• Resolution: Unresolved
• Fix Version/s: None
• Component/s:
• Labels: None
• Team: SQuaRE

## Description

The lsstsqre/centos Docker images are explicitly constructed without the tests directory.

Unfortunately, the tests for some packages rely on the contents of the tests directory in other packages. For example, when trying to build pipe_tasks against a Dockerized obs_base, I get:

    ____________________________________________ ReadDefectsTestCase.test_read_defects ____________________________________________
    [gw3] linux -- Python 3.7.2 /opt/lsst/software/stack/python/miniconda3-4.7.10/envs/lsst-scipipe-4d7b902/bin/python3.7

    self =

        def setUp(self):
    >       butler = dafPersist.ButlerFactory(mapper=BaseMapper()).create()

    tests/test_read_CuratedCalibs.py:61:
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    tests/test_read_CuratedCalibs.py:48: in __init__
        policy = dafPersist.Policy(os.path.join(ROOT, "BaseMapper.yaml"))
    /opt/lsst/software/stack/stack/miniconda3-4.7.10-4d7b902/Linux64/daf_persistence/19.0.0-1-g6fe20d0+1/python/lsst/daf/persistence/policy.py:80: in __init__
        self.__initFromFile(other)
    /opt/lsst/software/stack/stack/miniconda3-4.7.10-4d7b902/Linux64/daf_persistence/19.0.0-1-g6fe20d0+1/python/lsst/daf/persistence/policy.py:111: in __initFromFile
        self.__initFromYamlFile(path)
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    self = {}
    path = '/opt/lsst/software/stack/stack/miniconda3-4.7.10-4d7b902/Linux64/obs_base/19.0.0-9-ge91d8c4+1/tests/BaseMapper.yaml'

        def __initFromYamlFile(self, path):
            """Opens a file at a given path and attempts to load it in from yaml.

            :param path:
            :return:
            """
    >       with open(path, 'r') as f:
    E       FileNotFoundError: [Errno 2] No such file or directory: '/opt/lsst/software/stack/stack/miniconda3-4.7.10-4d7b902/Linux64/obs_base/19.0.0-9-ge91d8c4+1/tests/BaseMapper.yaml'

    /opt/lsst/software/stack/stack/miniconda3-4.7.10-4d7b902/Linux64/daf_persistence/19.0.0-1-g6fe20d0+1/python/lsst/daf/persistence/policy.py:145: FileNotFoundError

This is happening because $PIPE_TASKS_DIR/tests/test_read_CuratedCalibs.py depends upon $OBS_BASE_DIR/tests/BaseMapper.yaml, which has been removed from the Docker images.
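
The failure mode can be checked outside the stack with a minimal sketch. It assumes only that a package root is exposed through an environment variable such as `OBS_BASE_DIR`, as in the paths above. The helper name `find_cross_package_test_file` is hypothetical and is not part of any LSST package.

```python
import os


def find_cross_package_test_file(env_var, filename):
    """Return the path to ``filename`` in ``$env_var/tests``, or None.

    Hypothetical helper for illustration only; not an LSST API.
    """
    package_dir = os.environ.get(env_var)
    if package_dir is None:
        # The package is not set up in this environment.
        return None
    path = os.path.join(package_dir, "tests", filename)
    # In an image built without tests/, this check fails.
    return path if os.path.isfile(path) else None


if __name__ == "__main__":
    path = find_cross_package_test_file("OBS_BASE_DIR", "BaseMapper.yaml")
    if path is None:
        print("BaseMapper.yaml not found under $OBS_BASE_DIR/tests")
    else:
        print(f"Found {path}")
```

A test that borrows files from another package could use a check like this to skip cleanly instead of raising FileNotFoundError, although that would only hide the underlying problem described here.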

This renders the Docker images much less useful for development than they might otherwise be.

I don't know what the original motivation for stripping tests was (just to save space?). In general, I'd suggest that the Docker images should contain exactly the contents of the packages published at eups.lsst.codes — if it's appropriate to strip something from the Docker image, it must be appropriate to strip it from the package, and vice versa. Please stop special-casing this directory in image construction.

Adding Josh and Simon as watchers here, as respectively the author of the Docker image building code and the tests that are being broken.

## Activity

John Swinbank added a comment -

Maybe worth adding that a goodly chunk of that seems to be the results of executing the tests, rather than the tests and associated inputs (e.g. there's about 250MB in $PIPE_TASKS_DIR/tests after executing the tests, but only 14MB before). I'm a bit more on the fence about whether we can strip the test outputs.
Tim Jenness added a comment -

Installing output data from tests is one of the things I fixed for lsst_ci in DM-22305 so we may have to make the same fixes to pipe_tasks.

We do install all sorts of things in our eups installs that are completely unnecessary for a non-developer binary distribution. The tests/.tests directories should not be distributed.

Tim Jenness added a comment -

I've just done a fresh build (with some contamination from a sims build) and I get about 400MB of tests directories now. About 150MB is in .tests directories. obs_base is the winning package with nearly 80MB of data in it (half of that is for one test file and I'm not sure that file needs to be anything more than a few kB). afw has about 20MB of tests and 20MB of .tests. Deleting .tests and being a bit more careful with test files that don't need to be as big as they are could probably get us below 200MB of test files.

John Swinbank added a comment -

Thanks Tim!

Other than “smaller is better”, do we actually know what we're aiming for here?

Joshua Hoblitt added a comment (edited) -

AFAIK, Docker Hub does not publish a maximum size limit for images. As we know that Docker Hub is using AWS S3 to distribute layers, it seems probable that the maximum S3 object size of 5GiB will apply to the (compressed) Docker layer size. These Docker images are already extraordinarily large and are fairly slow to download and uncompress. I would rather see the image size going down than up, as the downstream Jupyter notebooks are layering on many more gigabytes.


## People

• Assignee: Frossie Economou
• Reporter: John Swinbank
• Watchers: John Swinbank, Joshua Hoblitt, Simon Krughoff, Tim Jenness

## Dates

• Created:
Updated: