
Thais Andrade

Summary

Skilled and collaborative data engineer with experience in building data pipelines from the ground up using a variety of technologies. I have a solid foundation in Data Quality practices, Data Analysis, and Data Discovery, and I am well-versed in the modern data stack, platform migrations, and reporting. I am currently expanding my expertise in DevOps, messaging architectures, distributed systems, and data governance practices. With strong communication and interpersonal skills, I am easy to work with and thrive in team environments. I am passionate about growing as a data professional and continuously learning.

Overview

5 years of professional experience
1 Certification

Work History

Data Engineer

Guideline
11.2023 - Current
  • I worked on migrating a data platform from Microsoft Server to a cloud-based architecture, ensuring a smooth transition and scalability. I was responsible for designing and creating data warehouse models to streamline data storage and access. Additionally, I conducted cost analyses and produced detailed reports to optimize data operations and improve decision-making.
  • Tools: Python, MySQL, Airflow, AWS SES, SQLAlchemy, DuckDB, ClickHouse, SSIS Packages, Mage.ai, Docker

Data Engineer

Haistack.ai
07.2023 - 10.2023
  • As the sole Data Engineer, I led the definition and implementation of the data architecture. I was responsible for designing and building data warehouse models to organize and manage data efficiently, supporting seamless data storage and access for consuming applications.
  • Tools: Python, SQL, MySQL, PostgreSQL, Airflow, AWS, Terraform, GitHub, GitHub Actions, SQLAlchemy

Data Engineer

Quanto
12.2021 - 05.2023
  • In this role, I structured the scope and activities of the Data Quality team, ensuring effective processes for data governance. I created Soda tests and Airflow DAGs for automated execution and integration with the existing stack. I planned and executed the development of a data observability platform to monitor and ensure data reliability. Additionally, I built data cleaning and transformation pipelines, ensuring compliance with data privacy principles based on the LGPD (Brazilian General Data Protection Law). I also led data ELT operations using a CDC solution for real-time transformation in the analytical platform.
  • Tools: Airflow, BigQuery, dbt, PostgreSQL, Grafana, GitHub, GitHub Actions, Docker, GKE, Soda, Alvin, Looker

Jr Data Engineer

Itaú Unibanco
03.2020 - 11.2021
  • In this role, I was responsible for acquiring data from the data lake and distributing it to the credit platform using shell scripts, DataStage, and Sqoop. I oversaw the distribution and execution of credit models and participated in automation projects to enable direct consumption of data lake data from customer service screens. I built and managed data lake processing pipelines, loading processed data into SQL Server. Additionally, I conducted ad hoc studies to enhance communication channels with customers and developed data lake pipelines for specialized studies, supporting data-driven insights across the organization.
  • Tools: Microsoft SQL Server, Alteryx, C#, RTC IBM, Hive, Excel, SAS, Python, Teradata, Hadoop, Shell Script, SQL, DataStage, Sqoop

Education

Bachelor’s - Computer Science

Federal University of ABC

Bachelor’s - Science And Technology

Federal University of ABC
01.2021

Skills

  • Languages: Shell Script, Python, SQL, Java
  • Storage: Hive, HDFS, Teradata, BigQuery, SQL Server, PostgreSQL, MySQL, DuckDB
  • ETL & orchestration: Apache Spark, Apache Airflow, Alteryx, DataStage, Sqoop, dbt, Mage.ai
  • DevOps & Cloud: Git, GCP, Docker, Helm, Terraform, GitHub Actions, AWS Lambda, AWS SES
  • Other: Linux OS, Soda, Great Expectations, ClickHouse, SSIS Packages
  • ETL development, Data warehousing, Data modeling, Data migration

Certification

Astronomer Certification for Apache Airflow Fundamentals

ACADEMIC ACTIVITIES

  • Researcher - Genealogy studies group
  • Identification of gender patterns in Brazilian academic coauthorship networks through family tree analysis based on graph theory.

COURSES

  • Big Data Modeling and Management Systems - University of California San Diego/18h
  • Getting Started with Hive for Relational Database Developers - Pluralsight/3h
  • Star Schema Foundations - Pluralsight/2h30
  • Data Quality Fundamentals - Udemy/3h
  • Getting to know Apache Airflow - Alura/12h
  • dbt Fundamentals - dbt Labs/5h
  • Docker, creating and managing containers - Alura/10h
  • How Git Works - Pluralsight/2h

Languages

English
Bilingual or Proficient (C2)
Spanish
Beginner (A1)
