Tag

data engineering

5 posts collected under data engineering.

data analytics

Why Manual Data Checking Fails: My Guide to Automated ETL Testing

A data analyst's practical guide to ETL pipeline automation, data validation, and testing frameworks. Learn how to catch nulls and duplicates before they break dashboards.

data analytics

Mastering Distributed Data: My Honest Experience Building Pipelines with Java

A data analyst's practical guide to learning Apache Spark with Java, covering the Dataset API, ETL pipelines, performance tuning, and distributed computing.

data analytics

My Data Cleaning Experience: Escaping the Memory Limit Trap

A data analyst's honest experience switching to modern DataFrame libraries. Learn how lazy execution and optimized queries solve massive memory bottlenecks.

data analytics

Mastering Big Data Scale: A Guide to Databricks and Apache Spark for Analysts

Learn Databricks and Apache Spark fundamentals. Michael Park shares insights on Lakehouse Architecture, Spark SQL, and optimizing big data ETL pipelines.

data analytics

Mastering SQL for Data Analytics: A Professional Review of the Ultimate MySQL Bootcamp

Data analyst Michael Park reviews the Ultimate MySQL Bootcamp. Learn about SQL vs. NoSQL, RDBMS concepts, and how to transition from Excel to professional data analytics.