Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Do not initialize snowplow tracker if anything goes wrong with it #8302

Closed
wants to merge 5 commits into from

Conversation

jaceksan
Copy link

@jaceksan jaceksan commented Dec 5, 2023

Closes issue #8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.

Copy link

netlify bot commented Dec 5, 2023

Deploy Preview for meltano canceled.

Name Link
🔨 Latest commit a0b7183
🔍 Latest deploy log https://app.netlify.com/sites/meltano/deploys/65ce54a98c023c0008495cc2

@jaceksan jaceksan changed the title Log warning if import of snowplow_tracker fails feat: log warning if import of snowplow_tracker fails Dec 5, 2023
Copy link

codecov bot commented Dec 5, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (18b32ed) 91.51% compared to head (a0b7183) 91.50%.
Report is 2 commits behind head on main.

Files Patch % Lines
src/meltano/core/tracking/tracker.py 86.66% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8302      +/-   ##
==========================================
- Coverage   91.51%   91.50%   -0.01%     
==========================================
  Files         245      245              
  Lines       19146    19151       +5     
  Branches     2139     2140       +1     
==========================================
+ Hits        17522    17525       +3     
- Misses       1345     1346       +1     
- Partials      279      280       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Dec 5, 2023

Hi @jaceksan!

Correct me if I'm wrong but I don't think this fixes #8256 because the error does not occur at import time, but rather at initialization time in

for endpoint in endpoints:
if not check_url(endpoint):
logger.warning("invalid_snowplow_endpoint", endpoint=endpoint)
continue
parsed_url = urlparse(endpoint)
emitters.append(
Emitter(
endpoint=parsed_url.hostname + parsed_url.path,
protocol=parsed_url.scheme or "http",
port=parsed_url.port,
request_timeout=request_timeout,
),
)
.

What do you think? The change in this PR fixed the problem you were having with dbt?

@jaceksan
Copy link
Author

jaceksan commented Dec 5, 2023

Hi @jaceksan!

Correct me if I'm wrong but I don't think this fixes #8256 because the error does not occur at import time, but rather at initialization time in

for endpoint in endpoints:
if not check_url(endpoint):
logger.warning("invalid_snowplow_endpoint", endpoint=endpoint)
continue
parsed_url = urlparse(endpoint)
emitters.append(
Emitter(
endpoint=parsed_url.hostname + parsed_url.path,
protocol=parsed_url.scheme or "http",
port=parsed_url.port,
request_timeout=request_timeout,
),
)

.
What do you think? This the change in this PR fixed the problem you were having with dbt?

Hey @edgarrmondragon!
The issue I experienced was already described in one of the comments in the issue:

ImportError: cannot import name 'SelfDescribing' from 'snowplow_tracker'

This happens when you install dbt-core into a venv where you already installed Meltano.
I added dbt-core to the env created by poetry in meltano repo and executed:

meltano init --no-usage-stats --force .

It did not fail anymore.

But I am quite new in poetry, I may use it wrong.

But I think I get your point.
When you install both tools in a different way, it may NOT fail on imports and may fail later in the section you linked.
I could implement a check there with can_track(). Actually, I already tried it as an alternative solution.
I realized that it was not that easy, because can_track() calls other class methods which require some of the imports.
I could split imports and import the problematic libs as late as possible to make this alternative solution work as much as possible.
Should I extend it this way?

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Dec 9, 2023

@jaceksan So meltano init does work, but I fear other commands would still be broken with this change.

How about meltano install? It imports PluginsTrackingContext, which depends on SelfDescribingJson so I'm afraid that'd still be broken.

@jaceksan
Copy link
Author

meltano install

You are right. I re-tested it and even meltano init does not work. It reports the warning as expected but fails when initiating the Emitters.

I try to implement a better solution as soon as possible.

@jaceksan
Copy link
Author

@edgarrmondragon I analyzed the whole code base and I am afraid I can't solve this.

SelfDescribingJson is used in too many places. A couple of classes are inherited from it, which is not easy to override.

Do you have any idea how to handle such cases?

@edgarrmondragon
Copy link
Collaborator

@edgarrmondragon I analyzed the whole code base and I am afraid I can't solve this.

SelfDescribingJson is used in too many places. A couple of classes are inherited from it, which is not easy to override.

Do you have any idea how to handle such cases?

@jaceksan One option that comes to mind is

  1. banning imports of any imports from meltano.core.tracking.* subpackages and
  2. instead put all those imports in src/meltano/core/tracking/__init__.py,
  3. wrap them in a try/except block and export a boolean constant TRACKER_AVAILABLE if any import fails,
  4. elsewhere import things from that root module
  5. use TRACKER_AVAILABLE to conditionally instantiate the tracker, add contexts, etc.

@jaceksan
Copy link
Author

jaceksan commented Jan 9, 2024

I was notified by the related dbt-core issue that they are not going to work on it.
I try to find time to continue working on this, it blocks me from orchestrating both Meltano and dbt by Dagster.

@edgarrmondragon
Copy link
Collaborator

I was notified by the related dbt-core issue that they are not going to work on it.
I try to find time to continue working on this, it blocks me from orchestrating both Meltano and dbt by Dagster.

Thanks @jaceksan. Let me know if the plan above makes sense.

@jaceksan jaceksan requested a review from a team as a code owner January 30, 2024 07:34
@jaceksan jaceksan changed the title feat: log warning if import of snowplow_tracker fails feat: ignore incorrect version of snowplow_tracker Jan 30, 2024
@jaceksan jaceksan force-pushed the supress_snowplow branch 3 times, most recently from ebea181 to dae4d83 Compare January 30, 2024 07:42
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
src/meltano/core/tracking/tracker.py Show resolved Hide resolved
src/meltano/core/tracking/tracker.py Outdated Show resolved Hide resolved
src/meltano/core/tracking/tracker.py Outdated Show resolved Hide resolved
@edgarrmondragon
Copy link
Collaborator

The code looks good but it's probably worth doing some manual testing with common ways of installing Meltano along with dbt in a venv to confirm everything's working as expected.

@edgarrmondragon edgarrmondragon changed the title feat: ignore incorrect version of snowplow_tracker fix: Do not initialize snowplow tracker if anything goes wrong with it Jan 31, 2024
@jaceksan
Copy link
Author

The code looks good but it's probably worth doing some manual testing with common ways of installing Meltano along with dbt in a venv to confirm everything's working as expected.

I tested it.
I installed Meltano and THEN dbt, which overrides snowplow-tracker by their minimal-snowplow-tracker.
Then I executed meltano run on top of my quite complex project consisting of tap-github, tap-csv-s3, tap-jira, target-snowflake, and target-duckdb. It finished without any issues.
Then I executed dbt run on top of the same project and it finished OK too.

The problem with my solution is that developers have to install Meltano first and then dbt. When they do it in opposite order, dbt won't work properly affected by snowplow-tracker delivered by Meltano.

I can share with you how exactly I tested it, so you can reproduce it.
The project is open-sourced.

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Feb 6, 2024

@jaceksan can you give #8400 a try?

TL;DR: dbt uses the same package name snowplow_tracker but a different distribution name minimal_snowplow_tracker so we can ensure the right one is imported by using importlib.util.import_module.

The problem with my solution is that developers have to install Meltano first and then dbt. When they do it in opposite order, dbt won't work properly affected by snowplow-tracker delivered by Meltano.

That should be fixed by dbt-labs/dbt-core#9528

@jaceksan
Copy link
Author

@jaceksan can you give #8400 a try?

TL;DR: dbt uses the same package name snowplow_tracker but a different distribution name minimal_snowplow_tracker so we can ensure the right one is imported by using importlib.util.import_module.

The problem with my solution is that developers have to install Meltano first and then dbt. When they do it in opposite order, dbt won't work properly affected by snowplow-tracker delivered by Meltano.

That should be fixed by dbt-labs/dbt-core#9528

Great contribution to dbt-core, thanks for that!

I installed Meltano from your branch and executed my project. Unfortunately, it does not work for me.
Screenshot from 2024-02-13 11-09-08

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Feb 14, 2024

I installed Meltano from your branch and executed my project. Unfortunately, it does not work for me.

@jaceksan ah you're right. I was under the impression that import_module used the distribution name, but I was wrong!

I'll try a similar approach to the dbt PR here and instantiate the emitters in a compatible way.

@edgarrmondragon
Copy link
Collaborator

Closing this now based on #8256 (comment). We can revisit later if we still feel this is needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants