Senior Data Engineer | Greenfield Data Engineering Role | Unifying 100+ Security Tools | Led by Seasoned Cybersecurity Founders | Pune

Job Details
Full Time | 8+ Years
Full Job Description

Role Overview

This is a greenfield Senior Data Engineer role (8+ years) focused on building the core data foundation for a cybersecurity product. You’ll design, build, and operate large-scale data pipelines and connector systems that handle high-volume security data from multiple external platforms. Ownership is end to end: ingestion, normalization, processing, reliability, and data quality. You’ll work closely with backend, platform, and AI teams to ensure the data is production-grade, scalable, and ML-ready.


What you’ll own

  • Build and run scalable, fault-tolerant data pipelines for cybersecurity data such as logs, events, alerts, and vulnerabilities

  • Design and maintain Apache Airflow DAGs for batch, incremental, and event-driven ingestion, including backfills and reprocessing

  • Define schemas and normalization layers across diverse security data sources

  • Implement strong data quality checks including validation, deduplication, and consistency guarantees

  • Build and extend a connector framework to ingest data from SIEM, EDR, Vulnerability Management (VM), CSPM, and CNAPP platforms

  • Design API-based, streaming, and batch ingestion pipelines with proper auth, secrets management, rate limits, retries, and state handling

  • Optimize pipelines for performance, scale, and cost in cloud environments

  • Set up observability for pipelines with metrics, logging, alerting, and health monitoring

  • Own data systems from design to production support

  • Partner with backend and AI/ML teams to deliver analytics- and ML-ready datasets and validate data accuracy using real security tools


What you bring

  • 8+ years of hands-on data engineering experience

  • Strong track record of building production-grade data pipelines at scale

  • Deep expertise with Apache Airflow including DAG design, scheduling, monitoring, and backfills

  • Strong Python skills are mandatory; Go is a plus

  • Experience working with large volumes of structured and semi-structured data like JSON, logs, and events

  • Solid understanding of distributed systems and data processing patterns

  • Experience with cloud platforms such as AWS, GCP, or Azure

  • Familiarity with CI/CD, Docker, Kubernetes, and cloud-native deployments

  • Experience with cybersecurity or observability data (SIEM, EDR, Vulnerability Management, CSPM, CNAPP, or SOC platforms) is strongly preferred
