Data Management / DM-11640

Summarize the CPU usage in one RC reprocessing


    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

      Description

      Based on the w_2017_30 run (DM-11185), summarize the computing resources needed to process the RC dataset: for example, total execution time, time spent in each driver step, and WIDE versus COSMOS.

        Attachments

        1. screenshot-1.png (64 kB)
        2. screenshot-2.png (102 kB)
        3. timecheckJira.sh (10 kB)


            Activity

            Samantha Thrush added a comment -

            After running the RC reprocessing code, I have come up with the following execution times. The times for the singleFrameDriver.py, coaddDriver.py, and multiBandDriver.py runs were extracted from Slurm using the sacct command. Since Slurm was not used to run mosaic.py or makeSkyMap.py, a seconds counter was used to track the elapsed time for each of those jobs.
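            For reference, a minimal sketch of how the elapsed times can be pulled out of Slurm accounting with sacct; the job IDs below are placeholders, not the actual IDs from this run:

            #!/bin/bash
            # Minimal sketch: query Slurm accounting for the wall time, CPU count,
            # and node count of the driver jobs. Job IDs are placeholders.
            JOBIDS="1234567 1234568 1234569"   # e.g. singleFrameDriver, coaddDriver, multiBandDriver

            for jid in $JOBIDS; do
                # -X: allocation line only, -n: no header, -P: '|'-delimited output
                sacct -j "$jid" -X -n -P --format=JobID,JobName,Elapsed,AllocCPUS,NNodes
            done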

            If there are any questions about which visits were used, how many CCDs were considered, or how many cores were used, please refer to the Python code that will be attached to this Jira ticket in the near future.

            As you can see, on average the Cosmos jobs take longer to complete than the Wide jobs, with all of the Cosmos jobs together taking more than twice as long as all of the Wide jobs. Of course, this information is based on a single trial run. In the near future, I will do more runs to see whether this trend holds or whether it was simply a fluke, although, given my experience with this code, that seems highly unlikely.

            Samantha Thrush added a comment - edited

            The attached timecheckJira.sh script was used to run this timed RC reprocessing trial. Note that the mosaic.py jobs were all run separately so that I could properly record each of their running times.
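            Because mosaic.py and makeSkyMap.py are not submitted through Slurm, their wall times were taken with a seconds counter. A minimal sketch of such a timing wrapper (the wrapper itself is hypothetical; timecheckJira.sh is the script actually used):

            #!/bin/bash
            # Hypothetical wrapper illustrating the seconds-counter timing used for
            # jobs run outside Slurm (e.g. mosaic.py, makeSkyMap.py).
            # Usage: ./timejob.sh <command> [args...]
            SECONDS=0          # bash built-in elapsed-seconds counter
            "$@"               # run the job with its own arguments
            status=$?
            echo "$1 finished with status ${status} after ${SECONDS} s"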

            Samantha Thrush added a comment -

            After some suggestions from Hsin-Fang Chiang, I have updated the table provided above. The Core Hours column is the wall time multiplied by the number of cores used by Slurm (if Slurm was not used, the wall time was multiplied by 1), and the Node Hours column was calculated by multiplying the wall time by the number of nodes used (again, multiplied by 1 if Slurm was not used). Additionally, I have added a final column that sums all of the core hours for one step (e.g. singleFrameDriver.py or mosaic.py) over one visit type (i.e. Cosmos or Wide).
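            As a concrete illustration of that bookkeeping, here is a sketch of the core-hours and node-hours calculation from the sacct fields (the job ID is a placeholder):

            #!/bin/bash
            # Sketch: core-hours = wall time x allocated CPUs, node-hours = wall time x nodes.
            # For jobs run outside Slurm both multipliers fall back to 1. Job ID is a placeholder.
            jid=1234567

            sacct -j "$jid" -X -n -P --format=Elapsed,AllocCPUS,NNodes |
            while IFS='|' read -r elapsed cpus nodes; do
                # Convert Slurm's [DD-]HH:MM:SS elapsed string to decimal hours
                hours=$(echo "$elapsed" | awk -F'[-:]' \
                    '{ if (NF == 4) print $1*24 + $2 + $3/60 + $4/3600;
                       else         print $1 + $2/60 + $3/3600 }')
                echo "core-hours: $(echo "$hours * $cpus" | bc -l)"
                echo "node-hours: $(echo "$hours * $nodes" | bc -l)"
            done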


              People

              Assignee:
              Samantha Thrush
              Reporter:
              Hsin-Fang Chiang
              Watchers:
              Hsin-Fang Chiang, Samantha Thrush

                Dates

                Created:
                Updated:
                Resolved: