Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - Glue integration #529

Draft
wants to merge 26 commits into
base: master
Choose a base branch
from

Conversation

Arun-kc
Copy link
Contributor

@Arun-kc Arun-kc commented Sep 7, 2023

{# Glue - truncate and insert (non-atomic) #}
{% macro glue__replace_table_data(relation, rows) %}
{% set intermediate_relation = elementary.create_intermediate_relation(relation, rows, temporary=True) %}
{% do dbt.glue_exec_query(dbt.get_insert_overwrite_sql(intermediate_relation, relation)) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally should be using run_query, but not able to pass the flag as 'False' for DDL and DML statements which in turn is causing issues at run_query when it tries to change the case of columns into lowercase. So using glue_exec_query for now.

Note: When DDL and DML statements are passed to run_query in dbt it will return none

{% endfor %}
{% elif insert_rows_method == 'chunk' %}
{% set rows_chunks = elementary.split_list_to_chunks(rows, chunk_size) %}
{% for rows_chunk in rows_chunks %}
{% set insert_rows_query = elementary.get_chunk_insert_query(table_relation, columns, rows_chunk) %}
{% do elementary.run_query(insert_rows_query) %}
{% if target.type == 'glue' %}
{% do dbt.glue_exec_query(insert_rows_query) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as #r1318786335

@@ -22,13 +22,21 @@
{% set queries_len = insert_rows_queries | length %}
{% for insert_query in insert_rows_queries %}
{% do elementary.file_log("[{}/{}] Running insert query.".format(loop.index, queries_len)) %}
{% do elementary.run_query(insert_query) %}
{% if target.type == 'glue' %}
{% do dbt.glue_exec_query(insert_query) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reason as #r1318786335

@@ -57,7 +65,7 @@
{% do rendered_column_values.append(column_value) %}
{% else %}
{% set column_value = elementary.insensitive_get_dict_value(row, column.name) %}
{% do rendered_column_values.append(elementary.render_value(column_value)) %}
{% do rendered_column_values.append(elementary.render_value(column_value, column.data_type)) %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing column data_type in order to cast timestamp columns for glue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant