Releases: andrjas/data_check
0.19.0
0.18.0
0.17.0
Added
- pipeline YAML validation via pydantic
- more breakpoint step features and documentation
Changed
- replaced 'overall result' with 'summary'
Fixed
- load_template and load_lookups called twice in run
- generating sorted csv for checks
- updated SQLAlchemy links to 2.0
- print exception if merging non-unique columns
0.16.0
0.15.0
Added
- 'data_check init' to create projects and pipelines
- 'append' as alias for append-mode in cli and pipelines
- 'ping --wait' and --timeout/--retry
- Python 3.11 support
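The 'ping --wait' entry above describes waiting for a database to become reachable. A minimal sketch of such a wait loop follows. It assumes that the timeout is an overall deadline in seconds and that retry is the pause between attempts; the function name and the `ping` callable are hypothetical and are not data_check's actual implementation.

```python
import time
from typing import Callable


def wait_until_reachable(
    ping: Callable[[], bool],
    timeout: float = 5.0,
    retry: float = 1.0,
) -> bool:
    """Call ping() repeatedly until it succeeds or the deadline passes.

    Hypothetical helper: 'timeout' is assumed to be the total time budget
    and 'retry' the pause between two attempts, both in seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        if ping():
            return True
        # Give up if another pause would run past the deadline.
        if time.monotonic() + retry > deadline:
            return False
        time.sleep(retry)
```

A caller would pass any function that returns True once a connection succeeds, for example a wrapper around a trivial query.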
Changed
- io module is renamed to file_ops
- running a csv file without a matching sql file fails; with a matching sql file, the csv check is run
- MSSQL uses arm64 image for CI
Fixed
- NA/NaT should be treated equally in checks
- CTRL+C should work in Windows
- 'data_check gen' works with full table checks
Removed
- custom docker images for CI
0.14.0
Added
- pre-commit hooks with various tools for code quality
- project wide default_load_mode configuration
- pipelines: added 'files' for 'sql' to deprecate 'sql_files'
- pipelines: added 'run' as alias for 'check'
- tests that pipeline steps match cli
- pipelines: 'write_check' for 'sql'
- documentation for 'fake' pipeline step
- pipelines: added 'table' and 'file' for 'load' to deprecate 'load_table'
- running data_check_pipeline.yml directly to execute the pipeline
Changed
- refactored TableInfo into Table
- moved integration tests into pytest
- upgraded dependencies
Fixed
- load fails if csv doesn't have all columns
Deprecated
- pipelines: 'sql_files' is deprecated, use 'sql' instead
- pipelines: 'load_table' is deprecated
0.13.0
Added
- upsert mode for loading data into tables
- pipelines: added 'mode' to deprecate 'load_mode'
- env variable DATA_CHECK_CONNECTION can override default connection
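The DATA_CHECK_CONNECTION entry above describes an environment variable taking precedence over the configured default connection. The following sketch shows one way such an override can be resolved. The function name `resolve_connection` and its parameters are hypothetical and only illustrate the precedence rule; how the variable interacts with other options is not stated in these notes.

```python
import os
from typing import Mapping

ENV_VAR = "DATA_CHECK_CONNECTION"  # variable name taken from the release notes


def resolve_connection(
    default_connection: str,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the connection name to use (hypothetical helper).

    A non-empty DATA_CHECK_CONNECTION value overrides the default;
    an unset or blank variable leaves the default untouched.
    """
    override = environ.get(ENV_VAR, "").strip()
    return override or default_connection
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.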
Changed
- printing exception on failure without --traceback
- upgraded dependencies
- documentation theme
Fixed
- Oracle: using VARCHAR2 instead of CLOB to load strings and large decimals
- bug in runner.executor when calculating max_workers
Deprecated
- pipelines: 'load_mode' is deprecated, use 'mode' instead
Removed
- workaround for replace mode
- support for python 3.7
- importlib-metadata dependency
0.12.0
0.11.1
0.11.0
Added
- '--sql' and '--sql-files' use lookups
- full table checks
- '--print --diff' to print only changed columns
- '--write-check' to generate a CSV check
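The idea of printing only changed columns can be illustrated with a small sketch. It assumes that expected and actual rows are already aligned pairwise and represented as dictionaries; the function `changed_columns` is hypothetical and does not reflect how data_check compares results internally.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def changed_columns(expected: List[Row], actual: List[Row]) -> List[str]:
    """Return the columns whose value differs in at least one row pair.

    Hypothetical helper: rows are assumed to be aligned by position,
    and the column order is taken from the first expected row.
    """
    if not expected:
        return []
    columns = list(expected[0].keys())
    return [
        col
        for col in columns
        if any(e.get(col) != a.get(col) for e, a in zip(expected, actual))
    ]
```

With such a list, a report could show only the differing columns (plus any key columns) instead of every column of the result.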
Changed
- example project moved into subfolder
- split main into cli module
- rewrote cli testing using click.testing.CliRunner
- '--sql' with '--output' doesn't print on console
Fixed
- recursive process spawning
- pipeline does not stop on error
- log file is written into project path
- '--print' with empty set prints result when failing