Dinky is a real-time data development platform based on Apache Flink, enabling agile data development, deployment and operation.
Updated May 23, 2024 - Java
The project aims to enhance NLP capabilities for the Amharic language by developing a data corpus for various NLP applications. It involves collecting, cleaning, and processing data, developing APIs, and automating the pipeline.
Hydra: Column-oriented Postgres. Add scalable analytics to your project in minutes.
Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)
A robust ETL pipeline for Next Cola Pvt. Ltd. data that extracts data from many different OLTP sources, converts it into dimensions and facts, and loads it into a data warehouse for analytical workloads.
Computer Science and Engineering (CSE) is a multidisciplinary field that combines elements of computer science and computer engineering to design, develop, and maintain computer systems and software. It is a rapidly evolving field that plays a crucial role in shaping the modern world.
A Data Warehouse project based on Microsoft Northwind Database.
Repository for tutorials, information and notes on technology in general.
A free-to-use dbt package for creating and loading Data Vault 2.0 compliant data warehouses (powered by dbt, an open source data engineering tool and registered trademark of dbt Labs)
This project outlines the final project requirements for DAV6100 - Information Architectures, focusing on group assignments, scoring criteria, topic selection, core requirements, and project components such as design, development, visualization, and executive presentation.
Roadmap for Data Engineering
DIRECT, the Data Integration Run-time Execution Control Tool, is a data logistics control framework that can be used to monitor, log, audit and control data integration / ETL processes.
An open source, free-to-use, generic (basic) Microsoft SQL Server data warehouse
Soccer Players Data Analyst and Similar Players Finder
Generic interface exchange format for Data Warehouse Automation and ETL generation.
The Virtual Data Warehouse is a code generation and template management tool. It is part of the data solution automation ecosystem - the 'engine' for data solution automation.
This repository contains a collection of database projects and code samples showcasing my skills and experience in SQL and PostgreSQL development. It serves as a portfolio to demonstrate my proficiency in various aspects of database programming. It mostly includes tasks on SQL, PostgreSQL, and GIS.
Data warehouse for CouchDB
Data Analysis, Analytics, Science, AI & ML, LLM etc.