Senior / Staff Data Engineer • Databricks • Spark • Cloud Data Platforms
I am a data engineer with over 12 years of experience building scalable, reliable, and cost-efficient data platforms. My primary focus is on Databricks-based lakehouse architectures using Apache Spark, Delta Lake, and cloud-native services on AWS and Azure.
I enjoy solving complex data problems, optimizing large-scale pipelines, and designing systems that support analytics and business decision-making. I prefer hands-on individual contributor roles where I can own architecture and implementation end to end.
Selected projects that reflect my work as a senior data engineer:
Built an end-to-end ETL pipeline for sales data on Databricks, following the medallion architecture (Bronze, Silver, Gold). The code is organized into classes and objects so that transformations stay scalable, reusable, and easy to test, producing business-ready analytics tables. The pipeline uses Databricks Auto Loader, Structured Streaming, Delta Lake, and Unity Catalog, and applies SCD Type 1 and SCD Type 2 patterns for dimension handling, with data quality, reliability, and governance enforced throughout.
Key features: a scalable Databricks lakehouse processing multi-terabyte batch data, with a focus on Spark performance tuning, AQE, Z-ORDER, and cost optimization.
Case study (coming soon)

Designed CDC-based incremental ingestion pipelines into Delta tables with SCD Type 1 and Type 2 modeling to support analytics and BI workloads.
Case study (coming soon)

Migrated legacy analytics workloads to Databricks on AWS and Azure, applying auto-scaling, cluster policies, and storage optimizations.
Case study (coming soon)
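To make the SCD Type 2 modeling mentioned above concrete, here is a minimal in-memory sketch of the expire-and-append logic. This is illustrative only: in the actual pipelines this is done with a Delta Lake MERGE on Databricks, and the row shape (a business key `id`, a tracked attribute `value`, and `start_date`/`end_date`/`is_current` columns) is an assumed example, not the production schema.

```python
from datetime import date

# Minimal in-memory sketch of SCD Type 2 (illustrative only; the real
# pipeline expresses the same logic as a Delta Lake MERGE on Databricks).
# Rows are plain dicts; `key` is the business key, "value" the tracked attribute.
def scd2_merge(dim_rows, incoming, key, today):
    """Expire changed current rows and append new current versions."""
    current = {r[key]: r for r in dim_rows if r["is_current"]}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and old["value"] == new["value"]:
            continue  # unchanged: keep the existing current row
        if old is not None:
            old["is_current"] = False  # close out the old version
            old["end_date"] = today
        dim_rows.append({**new, "start_date": today,
                         "end_date": None, "is_current": True})
    return dim_rows

dim = [{"id": 1, "value": "A", "start_date": date(2024, 1, 1),
        "end_date": None, "is_current": True}]
changes = [{"id": 1, "value": "B"}, {"id": 2, "value": "C"}]
dim = scd2_merge(dim, changes, key="id", today=date(2024, 6, 1))
# The changed row for id 1 is expired, and two new current rows are appended.
```

The same key/compare/expire/append steps map one-to-one onto a Delta MERGE with `whenMatchedUpdate` and `whenNotMatchedInsert` clauses; SCD Type 1 is the simpler case where the old row is overwritten instead of expired.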
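On the performance-tuning side, the knobs involved can be sketched as follows. The configuration keys are standard Spark AQE settings, and the `OPTIMIZE ... ZORDER BY` statement is the standard Delta Lake form; the table and column names are hypothetical examples, and on a real cluster the settings would be applied with `spark.conf.set(...)` or cluster policies rather than held in a plain dict.

```python
# Real Spark Adaptive Query Execution settings (applied on a cluster via
# spark.conf.set(...) or cluster policies; listed here as plain key/value pairs).
AQE_CONFS = {
    "spark.sql.adaptive.enabled": "true",                      # runtime plan re-optimization
    "spark.sql.adaptive.coalescePartitions.enabled": "true",   # merge small shuffle partitions
    "spark.sql.adaptive.skewJoin.enabled": "true",             # split skewed join partitions
}

def optimize_statement(table, zorder_cols):
    """Build a Delta OPTIMIZE ... ZORDER BY statement.

    The table and column names passed in are hypothetical examples.
    """
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(zorder_cols)})"

stmt = optimize_statement("gold.sales_daily", ["customer_id", "order_date"])
# -> "OPTIMIZE gold.sales_daily ZORDER BY (customer_id, order_date)"
```

Z-ORDER clustering on high-cardinality filter columns reduces files scanned per query, which is where much of the cost optimization comes from on multi-terabyte tables.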