Application of our De-identification Framework with open source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.
Updated Nov 15, 2021 - Python
This project is an adaptation based on a real test for a Junior Data Engineer position.
This repository holds an education-focused data science project. It uses a public database from a Brazilian educational research institute on the national high school exam, and applies ETL and association rule mining to the dataset.
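To make "association rules" concrete, here is a minimal sketch of pairwise rule mining using support and confidence. It is not taken from the repository: the function name `association_rules`, the thresholds, and the sample records are all hypothetical, and a real project would more likely rely on an Apriori or FP-Growth implementation.

```python
from itertools import combinations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Illustrative only: considers single-item antecedents and consequents.
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of transactions that contain every item in itemset.
        return sum(1 for t in sets if itemset <= t) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for ante, cons in ((a, b), (b, a)):
            confidence = pair_support / support({ante})
            if confidence >= min_confidence:
                rules.append((ante, cons, pair_support, confidence))
    return rules


# Hypothetical, hand-made records (not real exam data).
records = [
    ["public_school", "low_income", "score_below_median"],
    ["public_school", "low_income", "score_below_median"],
    ["private_school", "high_income", "score_above_median"],
    ["public_school", "high_income", "score_above_median"],
]
rules = association_rules(records, min_support=0.5, min_confidence=0.9)
for ante, cons, sup, conf in rules:
    print(f"{ante} -> {cons} (support={sup:.2f}, confidence={conf:.2f})")
```

Support measures how often the pair appears together, while confidence measures how often the consequent appears given the antecedent, so a rule can pass in one direction and fail in the other.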
Acquisition of energy industry balancing and settlement calculation data into a data lake.
End-to-end data reporting project using Azure services: Azure Data Factory for data orchestration, Azure Synapse Analytics for data warehousing, Databricks for data transformations, and Power BI for data visualization and reporting.
This repo shows how to read and write data from/to Google Cloud Storage with PySpark. The raw data is ingested, transformed, and stored in the data lake in snapshot format.
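The description does not say what "snapshot format" looks like on disk. One common reading, assumed here, is that each run writes a full copy of the table under a date-stamped prefix. The sketch below builds such paths and picks the newest one. The bucket name `example-data-lake`, the table name `orders`, the `snapshot_date=` prefix, and both helper functions are hypothetical and not taken from the repository.

```python
from datetime import date


def snapshot_path(bucket, table, snapshot_date):
    """Build a date-stamped prefix for one full snapshot of a table."""
    return f"gs://{bucket}/{table}/snapshot_date={snapshot_date.isoformat()}/"


def latest_snapshot(paths):
    """Pick the newest snapshot path.

    ISO 8601 dates sort lexicographically in chronological order,
    so comparing the date strings is enough.
    """
    return max(paths, key=lambda p: p.rstrip("/").rsplit("snapshot_date=", 1)[1])


# Hypothetical bucket and table names, for illustration only.
paths = [
    snapshot_path("example-data-lake", "orders", d)
    for d in (date(2021, 11, 13), date(2021, 11, 15), date(2021, 11, 14))
]
newest = latest_snapshot(paths)
print(newest)

# In a PySpark job, a path like this could be passed to
# df.write.parquet(path) or spark.read.parquet(path), provided the
# GCS connector is configured for the cluster.
```

Keeping each snapshot under its own prefix means a rerun for a given date can overwrite only that date's copy, and readers can choose between the latest state and any historical one.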