Uploaded image for project: 'Data Management'
  1. Data Management
  2. DM-24721

w_2020_18 butler create does not work with Oracle

    XMLWordPrintable

    Details

    • Story Points:
      1
    • Team:
      Architecture
    • Urgent?:
      No

      Description

      Trying to run latest weekly (w_2020_18) with Oracle:

      python /usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/bin/butler create --repo /work/mgower/gen3work/weeklyCItest/git/ci_hsc_gen3/DATA
      /work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/dialects/oracle/base.py:1381: SAWarning: Oracle version (19, 4, 0, 0, 0) is known to have a maximum identifier length of 128, rather than the historical default of 30. SQLAlchemy 1.4 will use 128 for this database; please set max_identifier_length=128 in create_engine() in order to test the application with this new length, or set to 30 in order to assure that 30 continues to be used.  In particular, pay close attention to the behavior of database migrations as dynamically generated names may change. See the section 'Max Identifier Lengths' in the SQLAlchemy Oracle dialect documentation for background.
        % ((self.server_version_info,))
      Traceback (most recent call last):
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
          cursor, statement, parameters, context
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
          cursor.execute(statement, parameters)
      cx_Oracle.DatabaseError: ORA-01408: such column list already indexed
       
      The above exception was the direct cause of the following exception:
      Traceback (most recent call last):
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/bin/butler", line 28, in <module>
          sys.exit(main())
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/python/lsst/daf/butler/cli/butler.py", line 252, in main
          return cli()
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/click/core.py", line 764, in __call__
          return self.main(*args, **kwargs)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/click/core.py", line 717, in main
          rv = self.invoke(ctx)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/click/core.py", line 1137, in invoke
          return _process_result(sub_ctx.command.invoke(sub_ctx))
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/click/core.py", line 956, in invoke
          return ctx.invoke(self.callback, **ctx.params)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/click/core.py", line 555, in invoke
          return callback(*args, **kwargs)
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/python/lsst/daf/butler/cli/cmd/create.py", line 42, in create
          outfile=outfile)
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/python/lsst/daf/butler/_butler.py", line 382, in makeRepo
          Registry.fromConfig(config, create=createRegistry, butlerRoot=root)
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/python/lsst/daf/butler/registry/_registry.py", line 204, in fromConfig
          datasets=datasets, create=create)
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/python/lsst/daf/butler/registry/_registry.py", line 223, in __init__
          self._opaque = opaque.initialize(self._db, context)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/contextlib.py", line 119, in __exit__
          next(self.gen)
        File "/usr/local/lsst_stack/w_2020_18/stack/miniconda3-4.7.12-2deae7a/Linux64/daf_butler/19.0.0-65-ga8942968/python/lsst/daf/butler/registry/interfaces/_database.py", line 387, in declareStaticTables
          self._metadata.create_all(self._connection)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/schema.py", line 4321, in create_all
          ddl.SchemaGenerator, self, checkfirst=checkfirst, tables=tables
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1627, in _run_visitor
          visitorcallable(self.dialect, self, **kwargs).traverse_single(element)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 144, in traverse_single
          return meth(obj, **kw)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/ddl.py", line 781, in visit_metadata
          _is_metadata_operation=True,
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 144, in traverse_single
          return meth(obj, **kw)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/ddl.py", line 833, in visit_table
          self.traverse_single(index)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/visitors.py", line 144, in traverse_single
          return meth(obj, **kw)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/ddl.py", line 862, in visit_index
          self.connection.execute(CreateIndex(index))
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 984, in execute
          return meth(self, multiparams, params)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/sql/ddl.py", line 72, in _execute_on_connection
          return connection._execute_ddl(self, multiparams, params)
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1046, in _execute_ddl
          compiled,
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1288, in _execute_context
          e, statement, parameters, cursor, context
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1482, in _handle_dbapi_exception
          sqlalchemy_exception, with_traceback=exc_info[2], from_=e
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 178, in raise_
          raise exception
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
          cursor, statement, parameters, context
        File "/work/mgower/miniconda/envs/bps-lsst-scipipe-2deae7a/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 590, in do_execute
          cursor.execute(statement, parameters)
      sqlalchemy.exc.DatabaseError: (cx_Oracle.DatabaseError) ORA-01408: such column list already indexed
      [SQL: CREATE INDEX run_fkidx_name ON run (name)]
      (Background on this error at: http://sqlalche.me/e/4xp6)
      scons: *** [DATA/butler.yaml] Error 1
      scons: building terminated because of errors.
      
      

      Got the create table and index SQL by setting SQLAlchemy's echo=True.   Can repeat the error message.

        Attachments

          Issue Links

            Activity

            Hide
            mgower Michelle Gower added a comment - - edited

            PostgreSQL and others also implicitly create indexes for primary keys.   They just don't seem to care if one creates exact duplicate indexes with different names.  Even if the RDBMS allows it, we aren't going to want duplicate indexes for performance reasons if not maintainability.

            Show
            mgower Michelle Gower added a comment - - edited PostgreSQL and others also implicitly create indexes for primary keys.   They just don't seem to care if one creates exact duplicate indexes with different names.  Even if the RDBMS allows it, we aren't going to want duplicate indexes for performance reasons if not maintainability.
            Hide
            mgower Michelle Gower added a comment -

            Alembic docs recommend the following for implicit naming (found by Christopher Stephens [X]):

            But there is also a better way to go, which is to make use of a feature new in SQLAlchemy 0.9.2 which makes use of the events behind the scenes known as naming_convention. Here, we can create a new MetaData object while passing a dictionary referring to a naming scheme:convention = {
              "ix": "ix_%(column_0_label)s",
              "uq": "uq_%(table_name)s_%(column_0_name)s",
              "ck": "ck_%(table_name)s_%(constraint_name)s",
              "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
              "pk": "pk_%(table_name)s"
            }metadata = MetaData(naming_convention=convention)
            If we define our models using a MetaData as above, the given naming convention dictionary will be used to provide names for all constraints and indexes.

            I'm not sure what it means in terms of PostgreSQL naming limits.

            Show
            mgower Michelle Gower added a comment - Alembic docs  recommend the following for implicit naming (found by Christopher Stephens [X] ): But there is also a better way to go, which is to make use of a feature new in SQLAlchemy 0.9.2 which makes use of the events behind the scenes known as naming_convention. Here, we can create a new MetaData object while passing a dictionary referring to a naming scheme:convention = { "ix": "ix_%(column_0_label)s", "uq": "uq_%(table_name)s_%(column_0_name)s", "ck": "ck_%(table_name)s_%(constraint_name)s", "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s", "pk": "pk_%(table_name)s" }metadata = MetaData(naming_convention=convention) If we define our models using a MetaData as above, the given naming convention dictionary will be used to provide names for all constraints and indexes. I'm not sure what it means in terms of PostgreSQL naming limits.
            Hide
            jbosch Jim Bosch added a comment -

            PostgreSQL and others also implicitly create indexes for primary keys.

            I assume you meant "foreign keys"?

            Also, do you happen to know if PostgreSQL and SQLite actually create a duplicate index when asked to create an explicit index that duplicates an implied one from a foreign key? (If you don't know, I'll look into it myself, but I figured there was a chance you already saw the answer somewhere.)

            Show
            jbosch Jim Bosch added a comment - PostgreSQL and others also implicitly create indexes for primary keys. I assume you meant "foreign keys"? Also, do you happen to know if PostgreSQL and SQLite actually create a duplicate index when asked to create an explicit index that duplicates an implied one from a foreign key? (If you don't know, I'll look into it myself, but I figured there was a chance you already saw the answer somewhere.)
            Hide
            mgower Michelle Gower added a comment -

            I meant primary keys because name is the primary key for the run table.   I didn't look closely enough at the index name pattern to realize the one being explicitly made was intended to be a foreign key index.   I (perhaps wrongly) thought the collision was because the primary key created an index for the name column and the explicitly created index was a duplicate index also only on the name column.

            From the postgresql 12 documentation:
            A foreign key must reference columns that either are a primary key or form a unique constraint. This means that the referenced columns always have an index (the one underlying the primary key or unique constraint); so checks on whether a referencing row has a match will be efficient. Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too. Because this is not always needed, and there are many choices available on how to index, declaration of a foreign key constraint does not automatically create an index on the referencing columns.

            Show
            mgower Michelle Gower added a comment - I meant primary keys because name is the primary key for the run table.   I didn't look closely enough at the index name pattern to realize the one being explicitly made was intended to be a foreign key index.   I (perhaps wrongly) thought the collision was because the primary key created an index for the name column and the explicitly created index was a duplicate index also only on the name column. From the postgresql 12 documentation: A foreign key must reference columns that either are a primary key or form a unique constraint. This means that the referenced columns always have an index (the one underlying the primary key or unique constraint); so checks on whether a referencing row has a match will be efficient. Since a  DELETE  of a row from the referenced table or an  UPDATE  of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns too. Because this is not always needed, and there are many choices available on how to index, declaration of a foreign key constraint does not automatically create an index on the referencing columns.
            Hide
            jbosch Jim Bosch added a comment -

            Ah, so we do have cases where a foreign key is also the primary key on the local table - I wonder if those are the only ones causing trouble with Oracle?

            Show
            jbosch Jim Bosch added a comment - Ah, so we do have cases where a foreign key is also the primary key on the local table - I wonder if those are the only ones causing trouble with Oracle?
            Hide
            jbosch Jim Bosch added a comment -

            Michelle Gower, asking you to review since you're already familiar with the context and general approach, and at least some of this code is familiar to you as well.

            PR is here: https://github.com/lsst/daf_butler/pull/274

            Show
            jbosch Jim Bosch added a comment - Michelle Gower , asking you to review since you're already familiar with the context and general approach, and at least some of this code is familiar to you as well. PR is here: https://github.com/lsst/daf_butler/pull/274
            Hide
            mgower Michelle Gower added a comment -

            One question.   Good to merge.

            Show
            mgower Michelle Gower added a comment - One question.   Good to merge.

              People

              Assignee:
              jbosch Jim Bosch
              Reporter:
              mgower Michelle Gower
              Reviewers:
              Michelle Gower
              Watchers:
              Jim Bosch, Michelle Gower, Tim Jenness
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved:

                  Jenkins

                  No builds found.