Adriano Ferraro

São Paulo

Summary

Senior data engineer with over 5 years of experience building cloud data pipelines with Python, SQL, Apache Spark, and Apache Airflow on Azure and GCP. At act digital, I led a team that migrated an on-premises platform to the cloud, completing the project one month ahead of schedule and improving system processing efficiency by 60%. I also developed a data pipeline for integrating CSV files with dynamic headers, allowing the table structure to evolve without breaking the process, using Azure Data Factory, Databricks, Delta, Azure Data Lake, Python, and SQL. My approach focuses on innovation and efficiency, applying best practices and current technologies to ensure data quality and integrity. I work well collaboratively and believe in the power of multidisciplinary teams to achieve exceptional results. Throughout my career, I have demonstrated the ability to solve complex problems, adapt quickly to new technologies, and contribute significantly to project success. I am always looking for new challenges and opportunities to apply my knowledge and skills, helping organizations maximize the value of their data.

Overview

5 years of professional experience
1 certification

Work History

Data Engineer Specialist

act digital
04.2021 - Current

Responsible for helping data teams build scalable, fault-tolerant data pipelines.


• Developed a batch data pipeline for integrating CSV files with dynamic layouts, delivering the data without disruption when new columns appeared in the files and allowing the table structure to evolve without breaking the process. Technologies: Azure Data Factory, Databricks, Delta, Azure Data Lake, Python, and SQL.
• Developed a data pipeline to split a table column containing multiple embedded fields and values into separate columns. The transformed data gave the business area a significant speed gain: a separation that previously took 3 hours now runs in under 15 minutes. Technologies: BigQuery, Composer, Python, and SQL.
• Developed a data platform that automates the entire data integration process in the Data Lake, letting analytics engineers deliver data to production more quickly: a data ingestion that used to take 120 hours to reach production now takes 48 hours. Technologies: Google Cloud Storage, Composer, BigQuery, Cloud Functions, Python, and SQL.

Data Engineer

MJV Technology & Innovation
11.2020 - 04.2021

• Developed data pipelines to integrate Salesforce API data into the Data Lake, delivering automated information to the business area, which previously had to extract the data and place it in a spreadsheet manually. That process took about 4 hours every day; with the automation, the data is loaded and transformed in 20 minutes. Technologies: Python, SQL, Composer, Dataproc, and Spark.
• Developed a system to monitor the quality of data landing in the Data Lake, giving the analytics engineers consistent, quality-checked information. Technologies: Python, Pandas, SQL, Composer, BigQuery, Cloud Functions, Cloud Storage, and Power BI.

Data Engineer

GESTO
04.2019 - 11.2020

Responsible for helping data teams build a new data platform on GCP.


• Developed a new data platform to support all of the company's business areas, which saved 2 million per year in infrastructure costs and cut customer-base processing from 4 days to 2 hours. Technologies: Talend, Airflow, BigQuery, and Cloud Storage.
• Developed an automation to fix files that arrived from healthcare providers with incorrect headers. The process previously took 4 hours because data platform operators had to normalize the file headers manually; with the automation it now runs in 4 minutes. Technologies: Python and Pandas.

Education

Technical Degree - Data Modeling/Warehousing and Database Administration

Uninove - Universidade Nove de Julho
05.2021

Skills

  • Python
  • SQL
  • Apache Spark
  • PySpark
  • Apache Airflow
  • Microsoft Azure
  • Synapse
  • Azure Data Factory
  • Azure Databricks
  • Azure Data Lake

Certification

  • Big Data Fundamentals 2.0 - Data Science Academy, 5d2697555e4cdeebe38b456e
  • MySQL Course - Curso em Vídeo, 4110-1091-246989
  • Data Engineering Bootcamp - How Bootcamps
  • Google Cloud Platform Big Data and Machine Learning Fundamentals (Brazilian Portuguese) - Coursera, P23DRDZFV99E
  • Modernizing Data Lakes and Data Warehouses with GCP (Brazilian Portuguese) - Coursera, MLZ2TJJTUHY3
  • Building Batch Data Pipelines on GCP (Brazilian Portuguese) - Coursera, DXZBW8MY25FN
  • Building Resilient Streaming Analytics Systems on GCP (Brazilian Portuguese) - Coursera, FHS65NPV9RFT
  • Data Engineering with Apache Spark - One Way Solution
  • Fundamentals of Delta Lake (Certificate of Completion) - Databricks
  • Workshop: Implementing SQL Pipelines with dbt - Engenharia de Dados Academy
  • Data Engineering with dbt Specialization - Engenharia de Dados Academy

Languages

English
Upper intermediate (B2)
