About

Hi, I'm Snehangsu, a Data Engineer who transforms raw information into valuable insights. With over 7 years of experience, I excel at crafting efficient data pipelines with data governance, observability, and automation built in. Let's collaborate and turn your business data into a strategic asset.

The Journey: Working in customer service gave me a valuable lens into real-world business complexities and user challenges. This exposure fueled my passion for data, and I moved into data engineering, eager to learn and to apply my skills to building better solutions.

Recent Work:
- Data pre-processing for an open-source platform (DIKSHA) to monitor educational activities.
- Designing the data pipeline and data model for Udaan, a commodities trading platform.
- Performing the ETL process for HSBC using Apache Spark, BigQuery, and Dataproc to fulfill ad-hoc dashboard requests.

Explore some of my notable work and projects in the links below!

Contact

Skills

Frequent

  • Python
  • BASH
  • GCP
  • BigQuery
  • Docker
  • Apache Spark / PySpark
  • DBT
  • SQL
  • Apache Kafka
  • Looker Studio
  • Cassandra DB
  • Terraform

Occasional

  • AWS Lambda
  • Git/GitHub
  • Power BI
  • Apache Flink
  • MongoDB
  • Prefect
  • Mage
  • DLT
  • Selenium
  • Apache Druid

Concepts

  • ER Diagram
  • Star Schema (Kimball)
  • Data Warehouse Architecture
  • Map-Reduce Concepts
  • SOLID Principles
  • OOP Principles
  • Distributed Architecture

Work Experience

  • Data Engineer, Cloudcraftz Solutions

    2022-Present
    • Shikshalokam
      • Bridged the gap between business needs and data by maintaining a robust 97.8% uptime for our data pipelines.
        Leveraged PySpark, Druid, Azure Storage, and Python to seamlessly ingest streaming and transactional data from 8 diverse sources.

      • Collaborated on the migration of data from Apache Druid to CassandraDB.
        Contributed to a 17% performance increase through strategic data modeling and a switch to event-based data capture.

      • Enabled real-time reporting by building a data pipeline with Apache Flink and Kafka that feeds a dynamic Cassandra database.
        Utilized my expertise in Apache Flink, Kafka, and Cassandra to achieve this.

      • Handled data transformation and governance using DBT to maintain a bronze-silver-gold data hierarchy.
        Leveraged my proficiency in DBT and data warehouse best practices to achieve this.

      • Developed a cloud-agnostic data storage solution with automated monitoring.
        Leveraged Slack, database design, and data modelling skills to achieve this.

    • Udaan
      • Established a centralized data pipeline to analyze transactional data.
        Utilized Selenium and data warehousing (BigQuery) to build this efficient data pipeline.

      • Optimized commodity procurement through a data model designed for vendor deals, commodities, prices, and purchases.
        Contributed to a Kimball data modeling architecture and designed the database schema, enabling efficient vendor and commodity management.

      • Analyzed commodity prices for reports, enabling short-term price forecasts and data governance for sustainable inventory and JIT models.
        Leveraged my expertise in market analysis to achieve this.

    • Data Services, Concentrix (Intuit)

      2019-2021
      • Ensured data integrity by implementing and enforcing robust governance and security frameworks.
        Safeguarded data through critical thinking (database issues, data migration, security) and comprehensive DB health assessments.

      • Provided data insights by optimizing small business data organization using star schema and ER diagram expertise.

    • Customer Operation, Amazon

      2016-2019
      • Managed customer relations, resolved disputes, and prevented abuse in Concession Abuse Prevention.
        Showcased excellent communication and problem-solving skills.

      • Served as a Subject Matter Expert in understanding and redefining customer policies, streamlining the process.
        Showcased a deeper understanding of customer requirements along with data analysis.