Shiva Manhar

Senior / Staff Data Engineer • Databricks • Spark • Cloud Data Platforms

About Me

I am a data engineer with over 12 years of experience building scalable, reliable, and cost-efficient data platforms. My primary focus is on Databricks-based lakehouse architectures using Apache Spark, Delta Lake, and cloud-native services on AWS and Azure.

I enjoy solving complex data problems, optimizing large-scale pipelines, and designing systems that support analytics and business decision-making. I prefer hands-on individual contributor roles where I can own architecture and implementation end to end.

What I Work On

Typical areas I work on day to day:

Projects

Selected projects that reflect my work as a senior data engineer:

Real-Time Lakehouse Pipeline

Kafka → Databricks Structured Streaming pipeline implementing Bronze–Silver–Gold layers with Delta Lake, schema evolution, and checkpointing for reliable near real-time analytics.

Case study (coming soon)

Enterprise Analytics Lakehouse

Built a scalable Databricks lakehouse processing multi-terabyte batch data. Focused on Spark performance tuning, Adaptive Query Execution (AQE), Z-ORDER data clustering, and cost optimization.

Case study (coming soon)

CDC & Incremental Data Platform

Designed CDC-based incremental ingestion pipelines into Delta tables with SCD Type 1 and Type 2 modeling to support analytics and BI workloads.
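For readers unfamiliar with the modeling terms: SCD Type 1 overwrites a changed attribute in place, while SCD Type 2 closes the current row and appends a new one so history is preserved. The sketch below illustrates only the Type 2 idea in plain Python rather than Spark or Delta Lake. It is not the code from this project, and the function name `apply_scd2` and the fields `customer_id`, `city`, `valid_from`, `valid_to`, and `is_current` are hypothetical names chosen for illustration.

```python
from datetime import date


def apply_scd2(dimension, changes, effective):
    """Return a new dimension list with SCD Type 2 history applied.

    All field names here are hypothetical, chosen only for illustration.
    """
    # Copy rows so the caller's input list is left unmodified.
    result = [dict(row) for row in dimension]
    # Index the currently active row for each business key.
    current = {row["customer_id"]: row for row in result if row["is_current"]}

    for change in changes:
        key = change["customer_id"]
        existing = current.get(key)

        # Tracked attribute unchanged: nothing to record.
        if existing is not None and existing["city"] == change["city"]:
            continue

        # Close out the previous version instead of overwriting it.
        if existing is not None:
            existing["valid_to"] = effective
            existing["is_current"] = False

        # Append the new version as the active row.
        new_row = {
            "customer_id": key,
            "city": change["city"],
            "valid_from": effective,
            "valid_to": None,
            "is_current": True,
        }
        result.append(new_row)
        current[key] = new_row

    return result
```

Re-applying the same change set leaves the result unchanged, because rows whose tracked attribute already matches are skipped. In a Delta table the same close-and-append pattern is usually expressed as a merge rather than a Python loop.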

Case study (coming soon)

Cloud Migration & Optimization

Migrated legacy analytics workloads to Databricks on AWS and Azure, applying auto-scaling, cluster policies, and storage optimizations.

Case study (coming soon)

Core Technologies

Databricks
Apache Spark
PySpark
Delta Lake
Structured Streaming
AWS
Azure
Snowflake
Python
SQL

Certifications