Data Management / DM-16694

Determine whether and how to use OAuth2 proxy

    Details

    • Type: Story
    • Status: Done
    • Resolution: Done
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Sprint:
      Arch 2018-12-03, Arch 2018-12-10, Arch 2019-01-07
    • Team:
      Architecture

      Description

      Should/can we use an off-the-shelf OAuth2 proxy along with Kubernetes ingress to handle the OAuth2 OpenID Connect process for all LSP services, passing the resulting token through? If so, how should it be configured?

      Concretely, the OAuth2 proxy would listen to all LSP endpoints; detect whether an OAuth2 (OpenID Connect) authentication token is present; (reverse) proxy to the appropriate upstream service if so; and conduct the OAuth2 token acquisition process with CILogon, including providing a callback URL, if not. The upstream connection should pass the token along so that authorization can be performed.
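For illustration only, here is a minimal sketch of the flow described above, written as a tiny Flask reverse proxy rather than an off-the-shelf component. The CILogon endpoint URLs, client credentials, callback URL, and upstream address are placeholders, not a proposed configuration:

```python
# Sketch of the desired behavior: pass through requests that already carry a
# token, otherwise run the OIDC authorization-code flow with CILogon.
from urllib.parse import urlencode

import requests
from flask import Flask, redirect, request, session

app = Flask(__name__)
app.secret_key = "change-me"  # placeholder session-signing key

AUTHORIZE_URL = "https://cilogon.org/authorize"           # assumed endpoint
TOKEN_URL = "https://cilogon.org/oauth2/token"            # assumed endpoint
CLIENT_ID = "cilogon:/client_id/example"                  # placeholder
CLIENT_SECRET = "example-secret"                          # placeholder
CALLBACK_URL = "https://lsp.example.org/oauth2/callback"  # placeholder
UPSTREAM = "http://upstream-service:8080"                 # placeholder LSP service

@app.route("/oauth2/callback")
def callback():
    # Exchange the authorization code for tokens (OIDC code flow).
    resp = requests.post(TOKEN_URL, data={
        "grant_type": "authorization_code",
        "code": request.args["code"],
        "redirect_uri": CALLBACK_URL,
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
    })
    session["id_token"] = resp.json()["id_token"]
    return redirect(request.args.get("state", "/"))

@app.route("/", defaults={"path": ""})
@app.route("/<path:path>")
def proxy(path):
    token = session.get("id_token")
    if token is None:
        # No token: start the OIDC flow with CILogon, providing the callback.
        query = urlencode({
            "response_type": "code",
            "client_id": CLIENT_ID,
            "redirect_uri": CALLBACK_URL,
            "scope": "openid email profile",
            "state": request.full_path,
        })
        return redirect(f"{AUTHORIZE_URL}?{query}")
    # Token present: reverse-proxy to the upstream service, passing it along.
    upstream = requests.get(f"{UPSTREAM}/{path}",
                            headers={"Authorization": f"Bearer {token}"},
                            params=request.args)
    return upstream.content, upstream.status_code
```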


            Activity

Brian Van Klaveren added a comment -

After some experimenting with nginx auth requests, I'm convinced we may be able to leverage OAuth2 proxy, but we can also write our own authorizers and deploy them as containers, at which point we could potentially use a custom authorizer microservice as well, such as this:

            https://github.com/brianv0/nginx-scitokens/blob/master/authorizer.py

            Nginx PLUS also has a built-in jwt authorizer we can potentially leverage.

I'll be testing this out on Kubernetes on my personal machine today.
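For reference, here is a minimal sketch of the auth_request contract such an authorizer microservice has to satisfy; this is not the linked authorizer.py, and the header handling shown is an assumption:

```python
# Minimal auth_request-style authorizer sketch (Flask). nginx issues a
# subrequest to /auth and only looks at the status code: 2xx allows the
# original request through, 401/403 denies it.
from flask import Flask, request

app = Flask(__name__)

@app.route("/auth")
def auth():
    header = request.headers.get("Authorization", "")
    if header.startswith("Bearer ") and header.split(" ", 1)[1]:
        # A real authorizer would verify the token here (signature, issuer,
        # scopes/capabilities) instead of just checking presence.
        return "", 200
    return "", 401

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```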

Brian Van Klaveren added a comment -

Note - There's a difference in methodology if we do this in GKE with the GKE Ingress Controller vs. other Kubernetes deployments, because the GKE Ingress Controller doesn't support the "auth-url" parameter.

It's much simpler with an nginx ingress controller, and we could still run an nginx ingress controller in GKE (https://cloud.google.com/community/tutorials/nginx-ingress-gke).

If we aren't going to use an nginx ingress in GKE, a GKE-based deployment would probably need at least:

1. Public ingress controller, routing all requests to a
2. Auth container/auth proxy, authenticating (and potentially authorizing) all requests, routing to a
3. Private ingress controller for the LSP

In this scenario, all requests are proxied through the auth container/proxy, which largely functions as auth termination.

The last two can potentially be collapsed into a custom nginx container, but then we lose the autoconfig features of the ingress controller.

Brian Van Klaveren added a comment -

It seems so far that using NGINX, even in GKE, is uncontroversial within the LSP.

Brian Van Klaveren added a comment -

This issue may be relevant in the future:

            https://github.com/kubernetes/ingress-nginx/issues/2862

Brian Van Klaveren added a comment -

To a first approximation, I have this working. There are several issues, however.

• oauth2_proxy doesn't actually store the ID token in a header. Some code would be needed to enable this, but it's probably about 25 lines of code - not much. oauth2_proxy does make the access token available (in the ploxiln fork, see below), which can be used against the /userinfo endpoint in OpenID Connect to get the userinfo, which is mostly the same as the ID token; but ideally an ID token should still be signed/generated as a JWT so we can verify it in different places in the stack. We already need a token issuer for the capability tokens, so maybe it makes sense to do this ourselves. Adam Thornton currently uses the userinfo endpoint of CILogon, so I do know that info is complete. It's not clear to me whether an ID token, as issued by CILogon, actually has all the same information as the userinfo endpoint, but it's likely. We need a JWT if we want to use the JWT authenticator of Jupyter.
• I need to add a new OAuth2 client (and get credentials) for our domains with CILogon. Currently I'm testing with my own domain on GKE in GCP (https://datasets.science).
• oauth2_proxy is close to abandonware, though working. There's no clear winner among the forks; there's https://github.com/pusher/oauth2_proxy, https://github.com/ploxiln/oauth2_proxy, https://github.com/buzzfeed/sso, and some talk about getting the pusher fork into the CNCF.
  The ploxiln fork was easiest to get working because it has some PRs applied that improve OIDC support, but their docker container isn't great. I built my own docker container from a ploxiln release and that's mostly working. buzzfeed/sso seems to be getting some revamped OIDC support as of two days ago (via a Microsoft Azure AD integration PR). The pusher version also has some fixes. It's not clear which is the best one to base off of, though the ploxiln branch seems to have the most activity, even if most of the work appears to come from one maintainer.
• With nginx, I haven't been able to get the authenticator (oauth2_proxy) and authorizer (python) working in conjunction, though there are a few ways of doing this:
  (Note: In this context, the authorizer is only for authorizing access based on group membership or capability.)

1. Use auth_request for oauth2_proxy, which gets evaluated first, then use access_by_lua_block, which will get evaluated after.
2. Use auth_request only for the authorizer. Redirect to /oauth2/sign_in if there's no token.
3. Abandon oauth2_proxy, bring the authentication code into the python authorizer and just use that. oauth2_proxy is more general than OpenID Connect, so that's where some of the impedance mismatch is coming from (OAuth2 doesn't have ID tokens, and oauth2_proxy doesn't have first-class support for ID tokens, mostly just access tokens from OAuth2).
4. Fork oauth2_proxy, make some changes, and rewrite the authorizer code in Go. Submit PRs to an oauth2_proxy fork, but they might not be accepted upstream since they are fairly specific to LSST. Some work could be done to make them more generic. Group membership would be fine; capabilities may be fine. We would also need to add code to oauth2_proxy to pass a request through if it sees a JWT Bearer token (See: https://github.com/bitly/oauth2_proxy/issues/530)
5. Write an auth module for nginx (in C). This is close to (1); working with Lua is generally easier, but a C module would be the higher-quality way (and take more time).

There are a few variants on some of these options as well. I'm still pushing forward, mostly with (1) and (2). (1) is hacky but might be quickest. (3) means we own all the code, and might not be so bad.

Again, I'm still confident I can get something working this week, in conjunction with the authorizer, which protects access. I can provide the access token to backend services (Firefly, JupyterLab), but they'd need to get the userinfo themselves.
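As a concrete illustration of option (2), here is a rough, untested sketch of the authorizer side of that flow: it sits alone behind auth_request, returns 401 when no token has been forwarded (so nginx can redirect the browser to /oauth2/sign_in), and otherwise resolves the user via the OIDC userinfo endpoint using the forwarded access token. The header name and the CILogon userinfo URL are assumptions, not tested configuration:

```python
# Option (2) sketch: authorizer behind auth_request; 401 triggers the
# nginx-side redirect to /oauth2/sign_in.
import requests
from flask import Flask, request

app = Flask(__name__)
USERINFO_URL = "https://cilogon.org/oauth2/userinfo"  # assumed endpoint

@app.route("/auth")
def auth():
    access_token = request.headers.get("X-Forwarded-Access-Token")  # assumed header
    if not access_token:
        # No token forwarded: deny, so nginx can send the user to sign in.
        return "", 401
    # Resolve the user via the OIDC userinfo endpoint, since the access token
    # (unlike an ID token) is opaque to us.
    resp = requests.get(USERINFO_URL,
                        headers={"Authorization": f"Bearer {access_token}"})
    if resp.status_code != 200:
        return "", 401
    # A real authorizer would now check group membership or capabilities.
    return "", 200
```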

Brian Van Klaveren added a comment -

With some code edits in OAuth2 proxy to support forwarding ID tokens (when authenticated) to backends (i.e. Firefly, DAX, JLD), additional "bypass" authenticators that skip the OAuth2 proxy login flow when a valid JWT Bearer token is found in the Authorization header, a reconfiguration of how we use OAuth2 proxy, and a few small edits in the python authorizer, I've been able to get just about everything working together.

            Guarantees

            An application which follows this method is guaranteed that:

1. A user is authenticated by the time a request reaches the application

Optionally, an application can choose that:

1. OAuth2 Access, Refresh, and ID tokens are made available to the application
2. A user is guaranteed to be authorized to access the application by the time the request reaches it. The refresh token can be used to generate new JWT ID tokens when the initial one expires, though it's likely OAuth2 Proxy (and optionally the LSST authorizer) can also do that, in which case an application would just need to harvest the ID token from UI requests and could then reuse that token for API requests.

OAuth2 Proxy can optionally manage refresh tokens itself and refresh the session, but this hasn't been thoroughly tested. In theory the LSST authorizer could also be modified to manage tokens and forward them.
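For reference, an untested sketch of that refresh step using the standard OIDC refresh grant; the token endpoint URL and client credentials are assumptions:

```python
# Trade a refresh token for a fresh ID/access token via the OIDC refresh grant.
import requests

TOKEN_URL = "https://cilogon.org/oauth2/token"  # assumed token endpoint

def refresh_tokens(refresh_token, client_id, client_secret):
    """Return (id_token, access_token); either may be absent in the response."""
    resp = requests.post(TOKEN_URL, data={
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    })
    resp.raise_for_status()
    tokens = resp.json()
    return tokens.get("id_token"), tokens.get("access_token")
```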

            Authentication

            The flow is this:
A request comes in to the ingress. The ingress forwards the request to the OAuth2 Proxy service (internal DNS name, bypassing the ingress), which is running in proxy mode (as opposed to "auth_request" mode), with the path of the LSST Authorizer (python) (e.g. http://oauth2-proxy.default.svc.cluster.local:4180/auth).

At this point, oauth2_proxy checks whether it was configured to bypass JWT Bearer tokens when they are valid. It checks the Authorization header for a Bearer token and then attempts to verify it against one of the Token Issuers it is configured with. If the token verifies, the request is passed on to the LSST authorizer service (e.g. http://lsst-authorizer.default.svc.cluster.local:8080/auth). If it doesn't verify, oauth2_proxy falls back to its standard behavior of checking the HTTP session to see whether the user has previously authenticated. If so, it also sets the ID token and passes the request along to the LSST authorizer.

            This two-tiered approach should allow us to add more complex authorization models in python easily without having to touch the oauth2_proxy code.
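A sketch of what the second tier could look like: the python authorizer re-verifying the forwarded ID token against the issuer's JWKS (here using PyJWT) before applying any capability checks. The header name, issuer, and JWKS URL are assumptions:

```python
# Second-tier authorizer sketch: verify the forwarded JWT, then authorize.
import jwt  # PyJWT
from flask import Flask, request
from jwt import PyJWKClient

app = Flask(__name__)
ISSUER = "https://cilogon.org"                 # assumed issuer
JWKS_URL = "https://cilogon.org/oauth2/certs"  # assumed JWKS endpoint
jwks_client = PyJWKClient(JWKS_URL)

@app.route("/auth")
def auth():
    token = request.headers.get("X-Auth-Request-Id-Token", "")  # assumed header
    if not token:
        return "", 401
    try:
        signing_key = jwks_client.get_signing_key_from_jwt(token)
        claims = jwt.decode(token, signing_key.key, algorithms=["RS256"],
                            issuer=ISSUER, options={"verify_aud": False})
    except jwt.PyJWTError:
        return "", 403
    # Group / capability checks based on `claims` would go here.
    return "", 200
```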

            Authorization

As said before, authorization is handled in the python code we control.

            The LSST authorizer can be configured in a few authorization modes:

1. For tokens which contain group information, it will map groups to capabilities
2. It can take tokens which have capabilities in them under the scp claim
3. It can be configured to not skip authorization.

            Applications can declare the capabilities they require in the auth url for their ingress rule. Following from the above example, this looks like:

            http://oauth2-proxy.default.svc.cluster.local:4180/auth?capability=exec:portal

            Read on workspace, for example, might look like this:
            http://oauth2-proxy.default.svc.cluster.local:4180/auth?capability=read:workspace

The group mapping code would take the groups a user is in and map them to capabilities, taking into account the LSST group naming rules. The authorizer is configured with a group prefix (e.g. lsst_int_pdac_), a mapping of the resource to a group name (e.g. portal -> portal, workspace -> ws), and a mapping of the capability to a postfix (e.g. exec -> _x).

Following this:
exec:portal => lsst_int_pdac_ + portal + _x == lsst_int_pdac_portal_x
read:workspace => lsst_int_pdac_ + ws + _r == lsst_int_pdac_ws_r
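A short sketch of these mapping rules; the prefix, resource map, and postfix map mirror the examples above, and anything beyond that is an assumption:

```python
# Map capabilities like "exec:portal" to LSST group names per the rules above.
GROUP_PREFIX = "lsst_int_pdac_"
RESOURCE_MAP = {"portal": "portal", "workspace": "ws"}
CAPABILITY_POSTFIX = {"exec": "_x", "read": "_r"}

def capability_to_group(capability):
    """Translate an action:resource capability into the expected group name."""
    action, resource = capability.split(":", 1)
    return GROUP_PREFIX + RESOURCE_MAP[resource] + CAPABILITY_POSTFIX[action]

def is_authorized(capability, groups=(), scp=()):
    """Allow if the token's scp claim carries the capability (mode 2) or the
    user's groups map onto it (mode 1)."""
    return capability in scp or capability_to_group(capability) in groups

# Worked examples matching the text above:
assert capability_to_group("exec:portal") == "lsst_int_pdac_portal_x"
assert capability_to_group("read:workspace") == "lsst_int_pdac_ws_r"
```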
             
The authorizer can also authenticate the tokens it receives, but that's only interesting for API workflows.

            Branch where work was done on oauth2_proxy:
            https://github.com/lsst-dm/oauth2_proxy/tree/oidc_id_tokens 

            Kubernetes and Login Flow

This was tested in Kubernetes on GCP, using a private domain (datasets.science).

There are a few ways to initiate login. It's not clear which is best, and it might not become clear until we start notebook, API, and portal integration.

Brian Van Klaveren added a comment -

In addition, last week I received tokens for lsst-lsp-int, and I will work on testing with them.

Brian Van Klaveren added a comment -

            Closing this issue for now. I'm currently working on integration work based on it.


People

• Assignee: Brian Van Klaveren
• Reporter: Kian-Tat Lim
• Watchers: Brian Van Klaveren, Christopher Clausen, Gregory Dubois-Felsmann, Kian-Tat Lim
