Jeferson Machado Santos

Porto Alegre / RS

Summary

GCP Certified Professional Data Engineer with experience building data ecosystems that gather data from different systems and interfaces and make it available to business users. Building these ecosystems gave me the opportunity to work with technologies such as data lakes and Delta Lake, Python, SQL, PySpark, GCP, Dataproc, Dataflow, Pub/Sub, BigQuery, Bigtable, Cloud Functions, PrestoDB, Kubernetes, Docker, Dagster, Datahub, GitHub and ETL, among others.

Overview

3 years of professional experience
1 certification

Work History

Senior Data Engineer

Xepelin
03.2023 - Current
  • Member of the remote Data Engineering team, composed of 9 Data and Software Engineers
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability
  • Deployed the open-source data orchestrator Dagster (Python and Kubernetes) to orchestrate all data pipelines, migrating them off Cloud Scheduler
  • Integrated the orchestrator with dbt, enabling more flexible scheduling and monitoring of individual models
  • Implemented Datahub (Kubernetes) as the data governance tool, ingesting metadata and lineage from all entities across data domains so that people from different areas can look up, in one central place, which tables, columns and information exist in the data ecosystem and how they relate to each other

Data Engineer

iVoy
08.2021 - 03.2023
  • Proposed and deployed the complete architecture for a data ecosystem on GCP to extract data from the company's systems and interfaces, transform it and make it available to analysts and business users
  • Created a batch layer, updated daily, composed of data ingestion with Python scripts, Delta Lake on Google Cloud Storage, PySpark on Dataproc to process data across layers and BigQuery to store final reports and provide access to end users. This batch layer was orchestrated with Airflow, with infrastructure provisioned through Terraform
  • Created a real-time layer for specific reports, using Debezium Server on Docker for CDC, Pub/Sub and Dataflow for data processing and Bigtable and BigQuery for final storage
  • Created an address validation pipeline using Debezium Server on Docker for CDC, Pub/Sub and Dataflow for data processing and Elasticsearch for storage and text search, reducing the number of manual address validations by 50%

Data Engineer, IT Senior Consultant

Peers Consulting & Technology
09.2020 - 08.2021
  • Corporate Big Data: data lake solution holding the data most used by the business consulting practice, with self-service access to it. Technologies used: Azure Data Lake Gen, Data Factory, Databricks, Azure Functions, Synapse Analytics and Power BI
  • Exploratory Data Analysis (EDA) with Python on Brazilian education quality data for governments
  • Created a BI tool with Python and Plotly Dash, offering online access and a user experience that lets non-technical users extract data (queries) from BigQuery
  • Web scraping with Python to obtain data for commercial proposals and specific projects

Education

Bootcamp - Cloud Data Engineer

IGTI
Belo Horizonte / Brazil
10.2021

Bootcamp - Data Engineer

IGTI
Belo Horizonte / Brazil
05.2021

MBA - Business Process Management

Unisinos
Porto Alegre / Brazil
03.2015

Bachelor's degree - Business Administration

Federal University of Rio Grande do Sul
Porto Alegre / Brazil
08.2012

Skills

  • Python
  • PySpark
  • Extract, Transform, Load (ETL)
  • Apache Airflow
  • Bigtable
  • Apache Beam
  • Google Cloud Dataflow
  • Google Kubernetes Engine (GKE)
  • Kubernetes
  • Docker
  • Terraform

Certification

  • GCP Professional Data Engineer, Google Cloud - 01.2023 to 01.2025

Languages

English
Bilingual or Proficient (C2)
Spanish
Advanced (C1)
Portuguese
Bilingual or Proficient (C2)
