# Switch to using new partitioner, loader

XMLWordPrintable

#### Details

• Type: Story
• Status: Done
• Resolution: Done
• Fix Version/s: None
• Component/s:
• Labels:
None
• Story Points:
4
• Sprint:
DB_W15_11, DB_W15_12, DB_W15_01
• Team:
Data Access and Database

#### Description

Integrated tests procedure has to rely on new loader

#### Activity

Hide
Fabrice Jammes added a comment -
• dataloader option to keep unzipped data on disk
• dataloader create emptyChunks dir if not exists
Show
Fabrice Jammes added a comment - dataloader option to keep unzipped data on disk dataloader create emptyChunks dir if not exists
Hide
Fabrice Jammes added a comment - - edited

Happy new year.

FYI, i've refactored most of the integration test code so that is gets more simple.

Code is both in qserv and qserv_testdata repositories.

I've used this good practice to configure python logging:
http://victorlin.me/posts/2012/08/26/good-logging-practice-in-python
It's very powerful because all log levels for each modules can be easily configured via a single yaml configuration file. It would allow to remove all the code of --verbose-all and -vvv options in the loader, and also all logging configuration code from other Qserv python tools, which would lead to simpler code and add some standardization in our python logging management. For now I've let both techniques live aside.

Show
Fabrice Jammes added a comment - - edited Hi Andy Salnikov , Happy new year. FYI, i've refactored most of the integration test code so that is gets more simple. Code is both in qserv and qserv_testdata repositories. I've used this good practice to configure python logging: http://victorlin.me/posts/2012/08/26/good-logging-practice-in-python It's very powerful because all log levels for each modules can be easily configured via a single yaml configuration file. It would allow to remove all the code of --verbose-all and -vvv options in the loader, and also all logging configuration code from other Qserv python tools, which would lead to simpler code and add some standardization in our python logging management. For now I've let both techniques live aside.
Hide
Andy Salnikov added a comment - - edited

Fabrice,

comments on qserv commits are all in github, here are comments for qserv_testdata, comments are for both tickets on the branch (DM-1658 and this):

• hard-coded location for QSERV_RUN_DIR, there should be a command-line option to change current default value
• line 79:

 if [ -r "${EUPS_DIR}" -a -x "${EUPS_DIR}" ]; then  . "${EUPS_DIR}/bin/setups.sh" why not  if [ -f "${EUPS_DIR}/bin/setups.sh" ]; then  . "${EUPS_DIR}/bin/setups.sh" • line 87: again, I'd replace it with  if [ ! -f "${QSERV_DIR}/SConstruct" ]; then

• line 98: killall mysqld mysql-proxy xrootd java python ...
• this is very unfriendly thing to do Imagine I run some Java or Python application which is not qserv. Please fix this to kill only the things that need to be killed (this should be true for all services)
• line 119: proper cleanup should be done even if qserv-test-integration.py fails, with set -e the script will stop immediately on any error. If you do cleanup then you probably do not need killall
• line 122: avoid making those symlinks in current directory, if you have not removed this yet
• if you use printf instead of echo do not forget to add \n at the end (and echo is OK in most cases, printf is needed only when you want format numbers or insert special characters)
• line 126: /n should be \n, but that line should not be there

... To be continued ...

Show
Andy Salnikov added a comment - - edited Fabrice, comments on qserv commits are all in github, here are comments for qserv_testdata, comments are for both tickets on the branch ( DM-1658 and this): bin/qserv-test-head.sh: hard-coded location for QSERV_RUN_DIR, there should be a command-line option to change current default value line 79: if [ -r "${EUPS_DIR}" -a -x "${EUPS_DIR}" ]; then . "${EUPS_DIR}/bin/setups.sh" why not if [ -f "${EUPS_DIR}/bin/setups.sh" ]; then . "${EUPS_DIR}/bin/setups.sh" line 87: again, I'd replace it with if [ ! -f "${QSERV_DIR}/SConstruct" ]; then line 98: killall mysqld mysql-proxy xrootd java python ... this is very unfriendly thing to do Imagine I run some Java or Python application which is not qserv. Please fix this to kill only the things that need to be killed (this should be true for all services) line 119: proper cleanup should be done even if qserv-test-integration.py fails, with set -e the script will stop immediately on any error. If you do cleanup then you probably do not need killall line 122: avoid making those symlinks in current directory, if you have not removed this yet if you use printf instead of echo do not forget to add \n at the end (and echo is OK in most cases, printf is needed only when you want format numbers or insert special characters) line 126: /n should be \n , but that line should not be there ... To be continued ...
Hide
Andy Salnikov added a comment -

Continuing review for qserv_testdata:

• our coding guidelines say that python module names should be camelCase (mysqlLoader.py instead of mysqlloader.py), probably not worth fixing it now, but something to think about for the future

bin/qserv-check-integration.py:

• lines 1-2: shebang line repeated twice
• line 33: tarfile import is unused
• line 36: common import is unused
• line 80: __name__ is "__main__" when you run it as a script, I'm not sure it's the best option for logger name, I'd probably just use root logger
• line 103: return_code is always 0 here, should it be non-zero for failed tests?
• some variable names have underscores, I believe DM coding guidelines are to use camelCase for regular names

bin/qserv-check-integration.py:

• unused imports logging, os, commons

bin/qserv-testunit.py

tests/benchmark.py:

• copyright string needs an update
• line 60: regexp string needs r prefix or doubling of all backslashes
• line 124: format string has 2 formatting codes but there are 4 arguments to format()
• in general logging functions (especially debug) should use lazy interpolation (debug("message %s", message) instead of debug("message {0}".format(message)) or debug("message %s" % message)) to save some CPU cycles

• line 66: commons.run_command() does not return anything, out will be None

• unused imports logging, const

• line 69: QservLoader constructor only takes 4 arguments
• line 76, 80: QservLoader has no connectAndInitDatabase() or alterTable() methods

description.yaml (for all 3 cases):

• these should be merged with common.cfg once we have everything in yaml
• some information in description.yaml is redundant, the same info exists (or should be added) in Table.cfg files, we should have it only in one place, not two. This applies to tables.partitioned-tables, tables.views, tables.directors (for that we need DM-1720 done). I hope we can deduce table loading order from other information as well.
• we should document all configuration parameters used by partitioner and data loader(s), or maybe even better, have some form of schema for that so we can discover errors in config files (right now there is no checks that parameter names are sane)
Show
Andy Salnikov added a comment - Continuing review for qserv_testdata: our coding guidelines say that python module names should be camelCase (mysqlLoader.py instead of mysqlloader.py), probably not worth fixing it now, but something to think about for the future bin/qserv-check-integration.py: lines 1-2: shebang line repeated twice line 33: tarfile import is unused line 36: common import is unused line 80: __name__ is "__main__" when you run it as a script, I'm not sure it's the best option for logger name, I'd probably just use root logger line 103: return_code is always 0 here, should it be non-zero for failed tests? some variable names have underscores, I believe DM coding guidelines are to use camelCase for regular names bin/qserv-check-integration.py: unused imports logging, os, commons bin/qserv-testunit.py lsst.qserv.tests.testqservdataloader does not exist? some comments/docstring would be handy, plus standard copyright header tests/benchmark.py: copyright string needs an update line 60: regexp string needs r prefix or doubling of all backslashes line 124: format string has 2 formatting codes but there are 4 arguments to format() in general logging functions (especially debug) should use lazy interpolation ( debug("message %s", message) instead of debug("message {0}".format(message)) or debug("message %s" % message) ) to save some CPU cycles tests/mysqlloader.py: line 66: commons.run_command() does not return anything, out will be None tests/qservloader.py: unused imports logging, const tests/testqservloader.py: line 69: QservLoader constructor only takes 4 arguments line 76, 80: QservLoader has no connectAndInitDatabase() or alterTable() methods description.yaml (for all 3 cases): these should be merged with common.cfg once we have everything in yaml some information in description.yaml is redundant, the same info exists (or should be added) in Table.cfg files, we should have it only in one place, not two. This applies to tables.partitioned-tables , tables.views , tables.directors (for that we need DM-1720 done). I hope we can deduce table loading order from other information as well. we should document all configuration parameters used by partitioner and data loader(s), or maybe even better, have some form of schema for that so we can discover errors in config files (right now there is no checks that parameter names are sane)
Hide
Fabrice Jammes added a comment -

I've taken in account all your comments, but them related to yaml, and pushed them to u/fjammes/DM-627.
Jacek Becla want to talk tomorrow about yaml use in Qserv. For now do you agree if I move logging configuration to basic .ini file and let description.yaml as it is in order to close this ticket?

Thanks,

Fabrice

Show
Fabrice Jammes added a comment - Hi Andy Salnikov I've taken in account all your comments, but them related to yaml, and pushed them to u/fjammes/ DM-627 . Jacek Becla want to talk tomorrow about yaml use in Qserv. For now do you agree if I move logging configuration to basic .ini file and let description.yaml as it is in order to close this ticket? Thanks, Fabrice
Hide
Andy Salnikov added a comment -

Hi Fabrice,

if you can avoid importing yaml in qserv for now then it should be OK to merge. For qserv_testdata it's OK to keep yaml config files for now (I presume that people who need to run integration tests all have anaconda in stack) but check other issues in my comments above.

Andy

Show
Andy Salnikov added a comment - Hi Fabrice, if you can avoid importing yaml in qserv for now then it should be OK to merge. For qserv_testdata it's OK to keep yaml config files for now (I presume that people who need to run integration tests all have anaconda in stack) but check other issues in my comments above. Andy
Hide
Fabrice Jammes added a comment -

I've checked all the issues, it required some significant work but I think it is worth it. Thanks for helping me to improve the code quality.
I've done what was decided with yaml.
Please let me know if we can close this ticket.

FYI

 qserv-test-head.sh -R ~/qserv-run/test

pass successfully.

Cheers,

Fabrice

Show
Fabrice Jammes added a comment - Hi Andy Salnikov , I've checked all the issues, it required some significant work but I think it is worth it. Thanks for helping me to improve the code quality. I've done what was decided with yaml. Please let me know if we can close this ticket. FYI qserv- test - head .sh -R ~ /qserv-run/test pass successfully. Cheers, Fabrice
Hide
Andy Salnikov added a comment -

OK to merge.

Show
Andy Salnikov added a comment - OK to merge.

#### People

Assignee:
Fabrice Jammes
Reporter:
Fritz Mueller
Reviewers:
Andy Salnikov
Watchers:
Andy Salnikov, Fabrice Jammes, Jacek Becla