datalake-ingestion

Here are 7 public repositories matching this topic...

KeeplerIO / de-identification-framework

Application of our De-identification Framework with open source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.

data privacy-protection data-security pii de-identification datalake-ingestion

Updated Nov 15, 2021
Python

anthager / Anton.Pizza

security microservices big-data ai game-engine machine-learning-algorithms distributed-computing fintech graph-database monolith crt scaleable clustering-algorithm datalake blockchain-technology container-engine edge-computing monorepository datalake-ingestion

Updated May 10, 2019
JavaScript

ac-gomes / data_engineer_with_airflow

This project is an adaptation based on a real test for a Junior Data Engineer position.

postgres airflow json-api aws-s3 python3 azure-storage datalake datalake-ingestion

Updated Jun 16, 2023
Python

fabricioasn / EnemInData_TCC_Unicarioca

This repository hosts a data science project focused on education. It uses a public database from a Brazilian educational research institute covering the national high school exam, and applies ETL and association-rule data mining to that dataset.

data sql azure powerbi tsql mssqlserver azuredatafactory analysis-services datalake-ingestion

Updated Jun 24, 2021

andyvroberts / smoke

Acquisition of energy industry balancing and settlement calculation data into a data lake.

c-sharp azure-functions datalake-ingestion

Updated Nov 4, 2024
C#

hannah0wang / end-to-end-data-reporting

End-to-end data reporting project using Azure services like Azure Data Factory for data orchestration, Azure Synapse Analytics for data warehousing, Databricks for data transformations, and Power BI for data visualization and reporting.

sql azure end-to-end end-to-end-testing powerbi databricks tsql databricks-notebooks datafactory datalake-ingestion datalake-etl datalake-storage

Updated Dec 30, 2023
Jupyter Notebook

BurakCakan / gcs-data-ingestion

This repo shows how to read and write data from/to Google Cloud Storage with PySpark. The raw data is ingested, transformed, and stored in the data lake in snapshot format.

unit-testing spark ci-cd python3 google-cloud-platform datalake-ingestion

Updated Feb 27, 2023
Python
