Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(team): Add sensitive data auth, store secrets securely #22294

Draft
wants to merge 3 commits into
base: zenhog
Choose a base branch
from

Conversation

neilkakkar
Copy link
Collaborator

@neilkakkar neilkakkar commented May 15, 2024

Problem

Still need to test everything, add migration, but this should add support for:

  1. Storing sensitive fields in our database (instance setting was storing things in plain text for secrets 🙈 - but not bad since those were only our secrets)
  2. Validate a payload based on a shared secret set

Any API endpoint can make use of these to validate a given payload + hash.

Changes

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

How did you test this code?

def create(self, validated_data: dict[str, Any], **kwargs) -> Team:
serializers.raise_errors_on_nested_writes("create", self, validated_data)
request = self.context["request"]
organization = self.context["view"].organization # Use the org we used to validate permissions

validated_data = self.encrypt_sensitive_fields(validated_data)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, instead of doing it like this, which is a bit wonky / easy to forget to encrypt and then bork, maybe better to take the API token approach, where we generate the shared secret, vs. depending on the user for it, and they can input it in their systems. And then we can reset it. This also implies we don't necessarily need encryption on these fields, since they're not coming from the user anyway.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.. only reason I can think of not to use the personal API key as the secret is because it's at the org level, and per user, rather than per team.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what you think Ben, currently I'm thinking:

  1. Keep the encryption
  2. Move it to a separate patch request which is the only way to generate / reset this key, and encryption happens here
  3. Don't show it on the UI after generation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the demo will continue like it is right now

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're being overly careful about it. Most UIs that I've seen with this kind of thing display it in plain text.

As it's only a shared secret for signing the level of risk is low. Versus an api key that is actually an authentication key which is what makes it super sensitive

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said I think we should encrypt it because why not 😅 but I wouldn't stress about never making it displayable again

Copy link
Contributor

@benjackwhite benjackwhite left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just talking about the specific hmac part of the solution space, this looks good 👍

@@ -59,6 +59,9 @@
# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY: str = os.getenv("SECRET_KEY", DEFAULT_SECRET_KEY)

# SECURITY WARNING: keep the encryption secret key used in production secret!
ENCRYPTION_SECRET_KEY: str = os.getenv("ENCRYPTION_SECRET_KEY", "sNxC-b8AkSkNqs8nVfUz4_gn89MzuQmtckgmxm81LP8=")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might want to add a debug thing here so it only uses a default if in DEBUG otherwise throws?

@@ -390,6 +391,14 @@ def render_template(
posthog_app_context["frontend_apps"] = get_frontend_apps(user.team.pk)
posthog_app_context["default_event_name"] = get_default_event_name(user.team)

if user.email:
# TODO: Convert to an SDK function call
posthog_app_context["current_user_hash"] = hmac.new(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One change we should make here is to have two clear values current_user_id_hash and current_user_email_hash as otherwise it will be confusing what they are used for. The ID one would be great for "secure mode" in the future and the email for anything that requires emails (like this feature or maybe sending emails in the future)

"""
Based on hmac validation
"""
# TODO: Do I need to consider any time-based validation here?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally yeah although its definitely harder for the implementer. Would be awesome for very sensitive operations to include a ?t=NOW param as part of the message that can then be validated both in terms of time and content. That can always be added later though when we have a really high need for validation.

@posthog-bot
Copy link
Contributor

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

@Gilbert09
Copy link
Contributor

Just out of interest, how does this differ from using EncryptedTextField on models? We use this in data warehouse to encrypt users access tokens/secrets at the moment - https://github.com/PostHog/posthog/blob/master/posthog/warehouse/models/credential.py#L10-L11

@neilkakkar
Copy link
Collaborator Author

oh I saw that package, didn't realise we were already using it! ( https://pypi.org/project/django-encrypted-model-fields/ ). Seeing the 0.X.X versions and the 2 year ago last update didn't inspire too much confidence 🙈

Tech wise, just that not doing it at the model level gives us more freedom to customise and recover from problems (like rotating encryption keys), and also customise when we want to allow decrypting / sending back a serialised decrypted field. But, not married to the current PR, potentially this customisation is unnecessary for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants