Simplify Reading Partitioned Data with Automatic Glob Expansion #12048
gabrieldernbach started this conversation in Ideas
Currently, reading partitioned data, especially Hive-partitioned data, requires a glob pattern that spells out one wildcard per partition level.
In comparison, other libraries such as pandas accept the root directory of the dataset directly.
Proposal
It would be helpful to introduce automatic glob expansion to simplify reading partitioned data. Users could then pass only the root directory, without having to count the partitioning levels and write a matching pattern by hand.
Expected Behavior
A read call that is given only the root directory should automatically expand to include all relevant partitioned files.
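To make the intended behavior concrete, here is a minimal sketch in plain Python using only `pathlib`. The helper name `expand_partitioned_root` and the default `.parquet` suffix are assumptions made for illustration; this is not an existing API of any library, only one way the expansion could behave.

```python
from pathlib import Path

# Characters that mark a path as an explicit glob pattern.
GLOB_CHARS = set("*?[")


def expand_partitioned_root(path: str, suffix: str = ".parquet") -> list[str]:
    """Hypothetical helper: expand a dataset root into its data files.

    - An explicit glob pattern is returned untouched, so existing
      behavior for users who write their own pattern does not change.
    - A directory is searched recursively, at any partition depth.
    - Anything else (for example a single file) is returned as-is.
    """
    if any(ch in GLOB_CHARS for ch in path):
        return [path]
    root = Path(path)
    if root.is_dir():
        # rglob descends through every level, so the caller does not
        # need to know how many partition levels exist.
        return sorted(str(p) for p in root.rglob(f"*{suffix}") if p.is_file())
    return [path]
```

With a layout such as `data/year=2023/month=01/a.parquet`, calling `expand_partitioned_root("data")` would return every `.parquet` file below `data`, while marker files with a different suffix are skipped.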
Considerations
Are there potential issues with implementing automatic glob expansion?
Are there edge cases I didn't consider?
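One edge case that comes to mind is a directory tree whose files do not all share the same partition layout, for example some files nested under `year=/month=` and others only under `year=`. A recursive expansion would silently pick up both. The sketch below shows how such a mismatch could be detected from the file paths alone; the function names are hypothetical and the `key=value` directory convention is assumed.

```python
from pathlib import PurePosixPath


def partition_keys(root: str, file: str) -> tuple[str, ...]:
    """Return the Hive-style partition keys found between root and file."""
    rel = PurePosixPath(file).relative_to(PurePosixPath(root))
    # Every directory component of the form key=value contributes its key.
    return tuple(part.split("=", 1)[0] for part in rel.parts[:-1] if "=" in part)


def find_inconsistent_layouts(root: str, files: list[str]) -> dict[tuple[str, ...], list[str]]:
    """Group files by partition layout; return {} when all files agree."""
    layouts: dict[tuple[str, ...], list[str]] = {}
    for f in files:
        layouts.setdefault(partition_keys(root, f), []).append(f)
    return layouts if len(layouts) > 1 else {}
```

Whether a mismatch should raise an error, emit a warning, or be accepted is a design decision this sketch leaves open.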
Thank you for considering this and for sharing your insights.