Data Management / DM-1570

Create integration test case using data duplicator

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: Qserv
    • Labels:
      None

      Description

      The integration tests should provide a new test case that uses sph-duplicate from the partition package.

            Activity

            smonkewitz Serge Monkewitz added a comment -

            See https://github.com/LSST/partition/blob/master/docs/duplication.md for a description of what the duplicator does, and a worked example that runs it.

            jammes Fabrice Jammes added a comment -

            Hi Jacek Becla and Vaikunth Thukral,

            We should be careful: DM-627 and this ticket touch the same code, so merging whichever ticket is reviewed last into master could be difficult.

            vaikunth Vaikunth Thukral added a comment -

            Hi Fabrice Jammes,

            Noted. Let's try to figure out the best way forward with these two tickets then. I'll be on HipChat.

            vaikunth Vaikunth Thukral added a comment - - edited

            I have a script that runs the duplicator and produces partitioned data, but I could not move on to creating a new test case because the data loader for the integration tests wasn't ready. In the email thread a few weeks ago we decided that the loader would use pre-partitioned data, so I spoke with Fabrice Jammes and we concluded that DM-627 was blocking this issue.

            jammes Fabrice Jammes added a comment -

            Hi Vaikunth Thukral

            DM-627 is now done and merged to master; the integration test code has been refactored a lot there. Sorry for blocking you for such a long time.

            Cheers,

            Fabrice

            vaikunth Vaikunth Thukral added a comment - - edited

            Changes to both qserv and qserv_testdata were added. Currently I cannot push my changes to qserv_testdata so Jacek Becla is reviewing with a manual diff of the commits. A few notes about this issue:

            1) The new test case is case05, and I tested it locally to be working correctly with the integration tests.
            2) The test is set up so that the level of duplication can be changed easily by editing "--htm-level" in common.cfg to duplicate more or less data as required. By default it is set to 4, which duplicates a ~2 MB dataset to more than 100 MB.
            3) The test will not scale well since currently it involves cat-ing the output chunks of the duplicator into 1 file for the data loader. Fabrice Jammes and I discussed alternatives to this but we may not need to expand it beyond this level of usage anyway.
            4) The tests can, in principle, be easily adapted to duplicate other test case datasets too now, but integrating that will need a new story.
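
The concatenation step described in note 3 can be sketched as follows. This is only an illustration under stated assumptions, not the code committed for case05: the function name, the directory layout, and the `chunk_*.txt` pattern in the usage note are hypothetical. It appends every matching chunk file to a single output file, which also shows why the approach scales poorly: all duplicated data is copied a second time before the loader sees it.

```python
import shutil
from pathlib import Path


def concatenate_chunks(chunk_dir, pattern, merged_path):
    """Append every file in chunk_dir matching pattern to merged_path.

    Files are processed in sorted name order so the result is reproducible.
    Returns the list of chunk files that were merged.
    """
    merged_path = Path(merged_path)
    # Skip the output file itself in case it lives in chunk_dir and matches.
    chunks = sorted(
        p for p in Path(chunk_dir).glob(pattern)
        if p.resolve() != merged_path.resolve()
    )
    with open(merged_path, "wb") as merged:
        for chunk in chunks:
            with open(chunk, "rb") as src:
                # Stream in blocks so a large chunk is never held fully in memory.
                shutil.copyfileobj(src, merged)
    return chunks
```

A call such as `concatenate_chunks("duplicated", "chunk_*.txt", "Object.txt")` (all three names are placeholders) would produce one input file for the data loader.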

            jbecla Jacek Becla added a comment -

            Actually I pushed your diff to u/jbecla/DM-1570 (and I was able to push to remotes easily, but maybe it is because I am an admin on gitolite?)

            What I don't like about it is that I am the committer, and you should be getting the credit. So it'd still be good if you'd push it to master from your branch.

            And I still can't do the review in github, bummer.

            I'll post comments here shortly.

            jbecla Jacek Becla added a comment -

            I made you the author of these commits.

            I also made a few tweaks to make things simpler / faster:

            • updated copyrights (years) for each file you touched
            • sorted imports alphabetically (in tests/benchmark.py, tests/dbLoader.py)
            • deleted trailing spaces

            See the commits in u/jbecla/DM-1570

            If it passes the tests, I'll merge with master, sounds ok?

            jbecla Jacek Becla added a comment -

            FYI, you need to rebase your qserv branch

            git checkout master
            git pull
            git checkout u/vaikunth/DM-1570
            git rebase master

            it rebases cleanly

            jbecla Jacek Becla added a comment - - edited

            regarding changes in qserv module:

            • please remove trailing spaces in code you committed. I saw it in admin/python/lsst/qserv/admin/dataDuplicator.py, didn't check elsewhere.
            • it'd be good to squash commits (do you know how to do that? If not, let's not mess around with it now)
            • update year in copyright lines for files you touched
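
The trailing-space cleanup requested above can be done with most editors; as a minimal sketch, assuming a plain-text source file, it amounts to the function below. The function name is hypothetical and this is not the tool that was actually used on the ticket.

```python
def strip_trailing_whitespace(text):
    """Return text with trailing spaces and tabs removed from every line."""
    cleaned = []
    for line in text.splitlines(keepends=True):
        # Separate the line ending so it is preserved exactly as it was.
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        cleaned.append(body.rstrip(" \t") + ending)
    return "".join(cleaned)
```

Leading indentation and line endings are left untouched, so the resulting diff only shows the lines that really had trailing whitespace.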
            jbecla Jacek Becla added a comment -

            Looks like you are good to go now: K-T tweaked gitolite, so go ahead and do the merging.

            jbecla Jacek Becla added a comment -

            All done: I updated headers, removed trailing spaces, squashed, and pushed everything to master (both qserv and qserv_testdata).

            Nice work Vaikunth! Thanks

            vaikunth Vaikunth Thukral added a comment -

            Great, thanks Jacek!

              People

              Assignee:
              vaikunth Vaikunth Thukral
              Reporter:
              jammes Fabrice Jammes
              Reviewers:
              Jacek Becla
              Watchers:
              Fabrice Jammes, Jacek Becla, Serge Monkewitz, Vaikunth Thukral
              Votes:
              0