A backup retention policy was implemented and deployed yesterday using S3 bucket lifecycle rules, so that backups no longer accumulate until someone cleans them up manually. The backup bucket now has rules attached to the object prefixes daily/ (8 days), weekly/ (35 days, 7x5), and monthly/ (217 days, 7x31). Objects under the monthly/ prefix are migrated to Glacier after 30 days.
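For reference, the rules described above could be expressed roughly as follows. This is only a sketch, not the deployed configuration: it assumes the day counts are expiration ages, the rule IDs are made up, and the resulting dict follows the shape accepted by boto3's `put_bucket_lifecycle_configuration` (as the `LifecycleConfiguration` argument).

```python
import json

# Prefixes and day counts come from the description above.
# Treating them as expiration ages is an assumption of this sketch.
RETENTION_DAYS = {
    "daily/": 8,
    "weekly/": 7 * 5,    # 35 days
    "monthly/": 7 * 31,  # 217 days
}

# Only the monthly/ prefix is moved to Glacier, after 30 days.
GLACIER_AFTER_DAYS = {"monthly/": 30}


def build_lifecycle_configuration():
    """Build an S3 lifecycle configuration dict with one rule per prefix."""
    rules = []
    for prefix, days in RETENTION_DAYS.items():
        rule = {
            "ID": "expire-" + prefix.rstrip("/"),  # hypothetical rule ID
            "Filter": {"Prefix": prefix},
            "Status": "Enabled",
            "Expiration": {"Days": days},
        }
        if prefix in GLACIER_AFTER_DAYS:
            rule["Transitions"] = [
                {"Days": GLACIER_AFTER_DAYS[prefix], "StorageClass": "GLACIER"}
            ]
        rules.append(rule)
    return {"Rules": rules}


if __name__ == "__main__":
    # Print the configuration as JSON for inspection.
    print(json.dumps(build_lifecycle_configuration(), indent=2))
```

Keeping the retention periods in one table makes the 7x5 and 7x31 arithmetic easy to check against the rules actually attached to the bucket.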
The sqre/backup/s3backup-eups job was split up into separate jobs for each backup period:
The new -daily-cron job was tested yesterday and worked. However, the cron-triggered build this morning exited non-zero.
copy failed: s3://****/stack/redhat/el7/devtoolset-6/miniconda3-4.3.21-10a4fa6/log4cxx-0.10.0.lsst7@Linux64.tar.gz to s3://****/daily/2018/03/29/2018-03-29T11:52:05Z/stack/redhat/el7/devtoolset-6/miniconda3-4.3.21-10a4fa6/log4cxx-0.10.0.lsst7@Linux64.tar.gz An error occurred (InvalidArgument) when calling the UploadPartCopy operation: Range specified is not valid for source object of size: 17601350
This is likely the same failure mode as described in DM-12861. It isn't clear whether this is an S3 server-side glitch or a bug in awscli.
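To help reason about the error, the sketch below computes the byte ranges a multipart copy would be expected to request for a source object of the reported size. It is an illustration only and not awscli's actual code: the function name is hypothetical, and the 8 MiB part size is an assumption. A valid `UploadPartCopy` range uses zero-based, inclusive offsets and must end at or before size - 1.

```python
def copy_part_ranges(size, part_size=8 * 1024 * 1024):
    """Return copy-source range strings that cover an object of `size` bytes.

    part_size defaults to 8 MiB, which is an assumption of this sketch.
    """
    if size <= 0 or part_size <= 0:
        raise ValueError("size and part_size must be positive")
    ranges = []
    start = 0
    while start < size:
        # The end offset is inclusive and must never exceed size - 1.
        end = min(start + part_size, size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


if __name__ == "__main__":
    # Source object size taken from the error message above.
    for r in copy_part_ranges(17601350):
        print(r)
```

Under these assumptions the object splits into three parts, the last ending at byte 17601349. A request for a range extending past that offset would be rejected, for example if the client was working from a different object size than the one S3 reports.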