Data Management / DM-327

Take RAM into account when computing NCORES to use in installs

    Details

    • Story Points: 1
    • Team: SQuaRE

      Description

      Darko Jevremovic reported that he had to switch off hyperthreading and manually override NCORES, MAKEFLAGS and SCONSFLAGS because his 8-core machine had too little RAM to build afw with -j 8.

      To fix this, the default eupspkg build routines should take available RAM into account when computing the level of build parallelism.

      In the meantime, we should document the workaround (contact darko@aob.rs).
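Until the workaround is documented properly, the override described above might look roughly like the sketch below. The helper name `limit_build_jobs` is hypothetical. NCORES is the variable named in this ticket, and whether eupspkg picks it up from the environment is an assumption, not something confirmed here. MAKEFLAGS and SCONSFLAGS are the standard environment variables read by GNU make and SCons.

```shell
# Hypothetical helper: cap build parallelism for the current shell session.
# Assumption: the build honors these variables when set in the environment.
limit_build_jobs() {
    n_jobs="$1"
    # Reject anything that is not a positive integer.
    case "$n_jobs" in
        ''|*[!0-9]*|0)
            echo "usage: limit_build_jobs N (N >= 1)" >&2
            return 1
            ;;
    esac
    export NCORES="$n_jobs"          # variable named in this ticket
    export MAKEFLAGS="-j $n_jobs"    # read by GNU make
    export SCONSFLAGS="-j $n_jobs"   # read by SCons
}

# Example: allow at most 4 parallel jobs on a machine that reports 8 cores.
limit_build_jobs 4
```

Running this in the shell before starting the install keeps the setting local to that session, so nothing in the EUPS installation has to be edited.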

            Activity

            shaw Richard Shaw [X] (Inactive) added a comment -

            Mario, can you post Darko's communication where he describes the problem and its solution?

            shaw Richard Shaw [X] (Inactive) added a comment -

            Comments offered by Darko:

            The issue was that the machine was hard crashing during installation of afw. I had to switch off hyperthreading in the BIOS, and then it compiled and installed afw fine. It could be that there is some underlying hardware problem that manifests only with very high CPU or memory usage. As it is my desktop machine, I did not have time to investigate further (...)

            Machine:
            CPU i7-2600 3.4 GHz quad core (but with hyperthreading it reports 8 cores)

            Memory 11G; Swap 24G

            Uname -a Linux servo3 3.11.10-7-default #1 SMP Mon Feb 3 09:41:24 UTC 2014 (750023e) x86_64 x86_64 x86_64 GNU/Linux

            /etc/issue
            openSUSE 13.1 "Bottle"

            shaw Richard Shaw [X] (Inactive) added a comment -

            Mario replied:

            I suspect this may have been an issue with insufficient RAM, but it's hard to tell (did you maybe see if the machine was swapping and/or with high CPU utilization)?

            I just looked at the EUPS source... Unfortunately, the only workaround I can think of is pretty blunt: to edit $EUPS_DIR/lib/eupspkg.sh after running newinstall.sh, and changing the NCORES=.... line (near the end of the file) to explicitly use a smaller number.
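On Linux, the question above about whether the machine was swapping can be checked by reading `/proc/meminfo` before and during the build. The sketch below is only an illustration: the function name `swap_used_kb` is hypothetical, while `SwapTotal` and `SwapFree` are standard meminfo fields reported in kB.

```shell
# Print the amount of swap in use, in kB, from a file in /proc/meminfo format.
# With no argument, read the live /proc/meminfo (Linux only).
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}

# Usage (on Linux): run in a second terminal while afw is building.
#   swap_used_kb
```

A value that grows steadily while the build runs would support the insufficient-RAM explanation. A value that stays near zero would point elsewhere, for example at the hardware problem Darko suspected.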

            shaw Richard Shaw [X] (Inactive) added a comment - edited

            I note that openSUSE 13.1 is not a supported platform. Do we have a way to test that the proposed solution actually fixes the problem?

            shaw Richard Shaw [X] (Inactive) added a comment -

            If the problem turns out to be a sub-optimal ratio of available memory per CPU, I doubt one could write simple logic in a script to fine-tune the number of cores employed and guarantee no further trouble. I suspect many users are like me and perform other tasks (some requiring significant memory) while waiting for the Stack to build. Also, even if tuning could be achieved in v8.0, the requirements may change with v9.0, etc. Maybe it's better to scale back how aggressively the script aims to employ cores (as a rough approximation), and to offer the user a chance to override (with -j or whatever)?
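One way to read the suggestion above is as a simple policy: pick a conservative default and let an explicit user value win. The sketch below assumes "scale back" means using half of the detected cores. That fraction, and the function name `choose_jobs`, are illustrative choices rather than anything eupspkg is known to do.

```shell
# Hypothetical policy: use half the detected cores unless the user overrides.
# Arguments: detected core count, optional user override (may be empty).
choose_jobs() {
    detected="$1"
    override="${2:-}"
    # An explicit user choice always wins.
    if [ -n "$override" ]; then
        echo "$override"
        return 0
    fi
    half=$(( detected / 2 ))
    # Never go below one job.
    if [ "$half" -lt 1 ]; then
        half=1
    fi
    echo "$half"
}

# Usage: choose_jobs 8      prints the conservative default for 8 cores
#        choose_jobs 8 2    prints the user's override
```

The default leaves headroom for whatever else the user is running during the build, and the override covers machines where the rough approximation is wrong in either direction.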

            mjuric Mario Juric added a comment -

            I agree we should have a way for the user to override.

            I do still think -j $NCORES is a better default than -j 1. It works for a large majority of users. With a bit of tweaking that takes RAM-per-core into account, my guess is we'll probably have this problem solved.

            Re question about testing openSUSE – it's not a platform we can test on, but if users contribute experiences/test workarounds for us, we can capture them and add them to the documentation.
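The RAM-per-core tweak mentioned above could be sketched as a cap on the job count, as below. The function name `ram_capped_jobs` and the 2048 MB-per-job budget in the usage comment are assumptions made for illustration. This ticket does not record how much memory a single afw compile job actually needs.

```shell
# Hypothetical: cap the job count so each job has at least mb_per_job of RAM.
# Arguments: core count, total RAM in MB, RAM budget per job in MB.
ram_capped_jobs() {
    cores="$1"
    ram_mb="$2"
    mb_per_job="$3"
    by_ram=$(( ram_mb / mb_per_job ))
    # Always allow at least one job, even on very small machines.
    if [ "$by_ram" -lt 1 ]; then
        by_ram=1
    fi
    # Use whichever limit is smaller: cores or RAM.
    if [ "$by_ram" -lt "$cores" ]; then
        echo "$by_ram"
    else
        echo "$cores"
    fi
}

# Usage: 8 reported cores, 11 GB of RAM, assumed budget of 2048 MB per job.
#   ram_capped_jobs 8 11264 2048
```

On machines with plenty of memory this leaves -j $NCORES unchanged, so the majority of users would see no difference.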

            tjenness Tim Jenness added a comment -

            This ticket has been open for 5 years. The eupspkg core-count calculation was adjusted in v2.0.0 of eups.


              People

              • Assignee:
                Unassigned
                Reporter:
                mjuric Mario Juric
                Watchers:
                Richard Shaw [X] (Inactive), Robert Lupton, Robyn Allsman [X] (Inactive), Tim Jenness