With some code edits in OAuth2 Proxy to forward ID tokens (when authenticated) to backends (i.e. Firefly, DAX, JLD), additional "bypass" authenticators that skip OAuth2 Proxy's normal flow when a valid JWT Bearer token is found in the Authorization header, a reconfiguration of how we use OAuth2 Proxy, and a few small edits to the Python authorizer, I've been able to get just about everything working together.
An application which follows this method is guaranteed that:
- A user is authenticated by the time a request reaches the application

Optionally, an application can also choose that:
- OAuth2 Access, Refresh, and ID tokens are made available to the application
- A user is guaranteed to be authorized to access the application by the time a request reaches it

The refresh token can be used to generate new JWT ID tokens when the initial one expires, though OAuth2 Proxy (and optionally the LSST authorizer) can likely do that as well, in which case an application would only need to harvest the ID token from UI requests and could then reuse it for API requests.
OAuth2 Proxy can optionally manage refresh tokens itself and update them, though this hasn't been thoroughly tested. In theory, the LSST authorizer could also be modified to manage tokens and forward them.
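As a minimal sketch of the refresh decision, assuming the ID token is a standard JWT carrying an `exp` claim (the helper names below are illustrative and not taken from the actual authorizer code):

```python
# Sketch: decide when an ID token needs refreshing by inspecting its
# "exp" claim. The demo token is unsigned and fabricated; nothing here
# contacts a real issuer or verifies signatures.
import base64
import json
import time


def jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # re-pad base64url
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def needs_refresh(token: str, leeway: int = 60) -> bool:
    """True when the ID token expires within `leeway` seconds."""
    exp = jwt_payload(token).get("exp", 0)
    return exp - time.time() < leeway


def make_token(exp: int) -> str:
    """Build an unsigned demo JWT (header.payload.signature)."""
    def seg(d):
        return base64.urlsafe_b64encode(
            json.dumps(d).encode()).rstrip(b"=").decode()
    return f'{seg({"alg": "none"})}.{seg({"exp": exp})}.sig'
```

A caller would check `needs_refresh()` before each API request and exchange the refresh token for a new ID token only when it returns true.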
The flow is this:
A request comes in to the ingress. The ingress forwards the request to the OAuth2 Proxy service (by internal DNS name, bypassing the ingress), which is running in proxy mode (as opposed to "auth_request" mode), with the path of the LSST authorizer (Python) (e.g. http://oauth2-proxy.default.svc.cluster.local:4180/auth).
At this point, oauth2_proxy checks whether it was configured to skip valid JWT bearer tokens. It checks the Authorization header for a Bearer token and attempts to verify it against one of the token issuers it is configured with. If verification succeeds, it passes the request on to the LSST authorizer service (e.g. http://lsst-authorizer.default.svc.cluster.local:8080/auth). If the token doesn't verify, it falls back to the standard oauth2_proxy behavior and checks the HTTP session to see whether the user has previously authenticated. If so, it also sets the ID token and passes the request along to the LSST authorizer.
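The two-tier check above can be sketched like this (the `verify_jwt` and `session_lookup` callables are placeholders standing in for oauth2_proxy's real issuer verifiers and session store, not its actual API):

```python
# Sketch of the two-tier decision: try the JWT bearer path first, then
# fall back to the HTTP-session path used by standard oauth2_proxy.
from typing import Callable, Optional


def resolve_id_token(headers: dict,
                     verify_jwt: Callable[[str], Optional[str]],
                     session_lookup: Callable[[dict], Optional[str]],
                     ) -> Optional[str]:
    """Return an ID token to forward to the authorizer, or None
    (None would mean the user is sent through the login flow)."""
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        token = auth[len("Bearer "):]
        verified = verify_jwt(token)  # try each configured issuer
        if verified is not None:
            return verified           # bypass: forward straight on
    # Fall back to the session established by a previous login.
    return session_lookup(headers)
```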
This two-tiered approach should allow us to add more complex authorization models in Python easily, without having to touch the oauth2_proxy code.
As noted above, authorization is handled in the Python code we control.
The LSST authorizer can be configured in a few authorization modes:
- For tokens which contain group information, it maps groups to capabilities
- It can take tokens that carry capabilities in the scp claim
- It can be configured not to skip authorization
Applications can declare the capabilities they require in the auth URL of their ingress rule. Following from the above example, this looks like:
Read on workspace, for example, might look like this:
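The original examples were elided here; as a hedged illustration, with the NGINX ingress controller's external-auth annotation the required capability might ride on the auth URL as a query parameter (the `capability` parameter name is an assumption, not confirmed by this document):

```yaml
# Hypothetical ingress fragment for an application requiring read:workspace.
# The "capability" query parameter name is illustrative only.
metadata:
  annotations:
    nginx.ingress.kubernetes.io/auth-url: "http://oauth2-proxy.default.svc.cluster.local:4180/auth?capability=read:workspace"
```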
The group mapping code takes the groups a user is in and maps them to capabilities, taking into account the LSST group naming rules. The authorizer is configured with a group prefix (e.g. lsst_int_pdac_), a mapping of resource to group name (e.g. portal -> portal, workspace -> ws), and a mapping of capability to postfix (e.g. exec -> _x, read -> _r).
exec:portal => lsst_int_pdac_ + portal + _x == lsst_int_pdac_portal_x
read:workspace => lsst_int_pdac_ + ws + _r == lsst_int_pdac_ws_r
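Using the example prefix, resource map, and postfix map from the text (with read -> _r inferred from the second example), the mapping might be sketched as:

```python
# Sketch of the group-mapping rules described above, using the example
# configuration values from the text. Function names are illustrative.
GROUP_PREFIX = "lsst_int_pdac_"
RESOURCE_MAP = {"portal": "portal", "workspace": "ws"}
CAPABILITY_POSTFIX = {"exec": "_x", "read": "_r"}


def capability_to_group(capability: str) -> str:
    """Map a capability like 'exec:portal' to its LSST group name."""
    action, resource = capability.split(":")
    return GROUP_PREFIX + RESOURCE_MAP[resource] + CAPABILITY_POSTFIX[action]


def groups_to_capabilities(groups: list) -> set:
    """Invert the mapping: which capabilities do these groups grant?"""
    known = {capability_to_group(f"{a}:{r}"): f"{a}:{r}"
             for r in RESOURCE_MAP for a in CAPABILITY_POSTFIX}
    return {known[g] for g in groups if g in known}
```

For example, `capability_to_group("exec:portal")` yields `lsst_int_pdac_portal_x`, matching the first worked example above.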
The authorizer can also authenticate the tokens it receives, but that's only interesting for API workflows.
Branch where work was done on oauth2_proxy:
Kubernetes and Login Flow
This was tested in Kubernetes on GCP, using a private domain (datasets.science).
There are a few ways to initiate login. It's not clear which is best, and it may not become clear until we start integrating with the notebook, API, and portal.