-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Box] add logs #2554
base: main
Are you sure you want to change the base?
[Box] add logs #2554
Conversation
From the log lines:
|
connectors/sources/box.py
Outdated
logger.debug("Fetching content starting from root folder") | ||
await self.fetchers.put(partial(self._fetch, doc_id=root_folder)) | ||
self._logger.info("Fetching data from Box's Free Account") | ||
await self.fetchers.put(partial(self._fetch, doc_id="0")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think having "0"
be defined with a variable name root_folder
(basically as it was before) is better for clarity. Someone without understanding of box will not know what 0
is at first glance. Could you revert that please?
connectors/sources/box.py
Outdated
already_processed_ids = set() | ||
root_folder = "0" | ||
|
||
stored_id = set() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think already_processed_ids
was clearer than stored_id
. If I look at this code later, I will think "why is it stored?"
If you want to shorten the variable name, perhaps seen_ids
is better?
buildkite test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good, just left one final change request
connectors/sources/box.py
Outdated
query_string = ( | ||
"&".join(f"{key}={value}" for key, value in params.items()) | ||
if params | ||
else "" | ||
) | ||
self._logger.debug(f"Calling GET {url}?{query_string}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think query_string
should only built when debug logs are enabled. How it is here, this string is being built every run even though it isn't used unless debug logs are enabled. It's minor, but every bit counts when looking at performance.
This could be moved to a func elsewhere called something like debug_query_string
and could be called in-line like this:
query_string = ( | |
"&".join(f"{key}={value}" for key, value in params.items()) | |
if params | |
else "" | |
) | |
self._logger.debug(f"Calling GET {url}?{query_string}") | |
self._logger.debug(f"Calling GET {url}?{debug_query_string(params)}") |
Part Of #2299
Adding more logs in Box connector.
Log file: https://drive.google.com/file/d/1TnFFvRBvMb2wGKQWz9N6dVkQtXUO7AIK/view?usp=drive_link
Checklists
Pre-Review Checklist
config.yml.example
)v7.13.2
,v7.14.0
,v8.0.0
)