Advanced Data Engineer

Honeywell • 3 hours ago

Location

Bengaluru, Karnataka, India

Job Type

Full-Time

Experience Level

Mid-level, Manager-level (3-5 Years)

Salary Range

Not disclosed

Job Description

Advanced Data Engineer - Value Engineering & Component Engineering COE

Location: Bangalore, IN (Hybrid)

Role Overview

Honeywell's VECE COE is building a next-generation, AI-ready data platform to power advanced analytics, predictive insights, and data science at enterprise scale. As a Senior Data Engineer, you will be a founding technical pillar of this platform, designing and building the data infrastructure that transforms raw, multi-source data into governed, high-quality, analytics-ready assets.

This is not a maintenance role. You will architect, build, and own end-to-end data pipelines using Azure Databricks as the primary platform, following Medallion Architecture principles, and delivering trusted data to downstream consumers in Google Cloud Platform (GCP). You will directly shape how Honeywell's VECE organization transitions from traditional descriptive analytics to proactive, AI-driven decision-making.

Responsibilities

What you will build

Data Pipelines & Ingestion

- Implement end-to-end ingestion pipelines from heterogeneous sources (e.g. Snowflake, SQL Server, Excel, REST APIs, and unstructured files) into Azure Databricks, following defined architecture patterns
- Build and maintain Bronze → Silver → Gold Medallion layers, applying transformation logic, business rules, and quality checks at each stage
- Implement incremental loading patterns (e.g. CDC, watermarking, Delta Lake MERGE/UPSERT) to ensure efficient, scalable, and reliable data delivery
- Develop pipelines for structured and unstructured data (e.g. documents, JSON, Parquet, Excel) supporting downstream AI and ML consumption

Data Modeling & Semantic Layer

- Implement and extend data models (e.g. fact/dimension tables, domain data marts) following designs defined by the Senior DE and AI team
- Write clean, modular, reusable PySpark and SQL transformation logic that is testable, documented, and deployable via CI/CD
- Contribute to the semantic layer that powers Power BI dashboards and GCP-connected analytics consumers
- Maintain and improve existing models as business requirements evolve

Orchestration and Data Ops

- Build and manage Databricks Workflows: configure task dependencies, retry policies, and failure alerting
- Follow and contribute to CI/CD practices: version control, pull requests, automated testing, and deployment to Dev/QA/Prod environments using Azure DevOps or GitHub Actions
- Package and deploy reusable logic as Python libraries following team standards
- Monitor pipeline health, investigate failures, and resolve data issues within SLA

Data Governance & Quality

- Apply data quality rules (e.g. validation, deduplication, null checks, reconciliation) within pipelines to ensure data arrives fit for purpose
- Operate within the Unity Catalog governance framework, respecting the RBAC, namespace structure, and tagging standards defined by platform leads
- Ensure data delivered to GCP is schema-consistent, validated, and documented
- Flag and escalate data quality issues proactively, not reactively

FinOps Awareness

- Write cost-conscious PySpark: avoid unnecessary full scans, optimize joins, and use appropriate cluster types
- Apply Delta table best practices (e.g. VACUUM, OPTIMIZE, compaction) to manage storage costs
- Follow cluster policies defined by platform leads and flag unusual resource consumption

Must Have

- Databricks: 2+ years hands-on with PySpark, Delta Lake, Workflows, and Unity Catalog
- Demonstrated expertise in data strategy, for example Medallion Architecture, Domain Data Modeling, and Functional Data Architecture
- Data Quality Frameworks (e.g. rule-based validation, anomaly detection)
- Data Pipelines: incremental loading, CDC, CI/CD, observability
- Advanced Python/PySpark and advanced SQL

Strongly preferred: DLT, UC, GCP, Azure, Kafka.

Highly valued: Databricks Certified Professional

Qualifications

Experience

- 4-6+ years of overall data engineering experience
- 2+ years of hands-on Azure Databricks experience in production environments
- Demonstrated ability to build and deliver pipelines, not just maintain or support them
- Experience working within a defined architecture and contributing to its improvement
- Comfortable working with multiple data source types: relational, file-based, API

About Honeywell: Honeywell Industrial Automation enhances process industry operations, creates sensor technologies, automates supply chains, and improves worker safety. The VECE COE focuses on optimizing operational processes and driving sustainable growth.

Required Skills

- Data Pipelines
- Data Warehousing
- Google Cloud
- Microsoft Azure
- Python and PySpark
- Structured Query Language

About Honeywell

Honeywell is a Fortune 500 company that invents and manufactures technologies to address tough challenges linked to global macrotrends such as safety, security, and energy. With approximately 110,000 employees worldwide, including more than 19,000 engineers and scientists, we have an unrelenting focus on quality, delivery, value, and technology in everything we make and do.
