Data Projects
Here are a few projects that show my expertise in the data field.


Spotify ETL Pipeline
Developed an ETL pipeline on Amazon Web Services that fetches data from Spotify's API.
Transformed the data with AWS Lambda and stored the data files in S3.
Used a Glue Crawler to infer the schema from the files and created a database with AWS Glue.
Used Athena for analytical workloads.
Triggers refresh the data weekly and run the transformations on the raw data.
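The transformation step can be illustrated with a minimal Python sketch. It assumes the raw files hold playlist-item responses from Spotify's API (an `items` list whose entries carry a `track` object); the function names and the chosen output columns are hypothetical, and the S3 read/write calls a Lambda handler would need are left out.

```python
import csv
import io


def flatten_playlist_items(raw: dict) -> list[dict]:
    """Flatten a raw playlist response into one row per track."""
    rows = []
    for item in raw.get("items", []):
        track = item.get("track") or {}
        if not track.get("id"):
            continue  # skip entries without a usable track (e.g. removed tracks)
        rows.append({
            "track_id": track["id"],
            "track_name": track.get("name"),
            "album_name": (track.get("album") or {}).get("name"),
            "artists": ", ".join(a["name"] for a in track.get("artists", [])),
            "popularity": track.get("popularity"),
            "added_at": item.get("added_at"),
        })
    return rows


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize rows to CSV text, ready to be written to a storage bucket."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample payload, shaped like a playlist-items response.
sample = {
    "items": [
        {
            "added_at": "2024-01-05T10:00:00Z",
            "track": {
                "id": "t1",
                "name": "Song A",
                "popularity": 80,
                "album": {"name": "Album A"},
                "artists": [{"name": "Artist 1"}, {"name": "Artist 2"}],
            },
        },
        {"added_at": "2024-01-06T10:00:00Z", "track": None},
    ]
}

rows = flatten_playlist_items(sample)
csv_text = rows_to_csv(rows)
```

Flat CSV rows like these are easy for a crawler to infer a schema from, which is why the sketch flattens the nested artist and album objects into plain columns.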
GCP Pipeline and Automated Reporting
Used GCP services to automate a sales data pipeline and its reporting.
A Python script generates the data files, which are stored in Cloud Storage.
Cloud Data Fusion, a GUI-based data transformation tool, masks sensitive information and hashes passwords.
The transformed data is stored in Google BigQuery, and Looker Studio turns it into actionable insights.
Cloud Composer orchestrates the entire workflow.
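In this project the masking and hashing are configured in the Data Fusion GUI, so no code was involved. As a rough illustration of what an equivalent transformation does, here is a plain-Python sketch. The field choice (an email address), the masking rule, and the use of salted PBKDF2 are assumptions made for the example, not a description of what Data Fusion runs internally.

```python
import hashlib


def mask_email(email: str) -> str:
    """Keep the first character and the domain; mask the rest of the local part."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        return "*" * len(email)  # not a valid address: mask everything
    return local[0] + "*" * (len(local) - 1) + "@" + domain


def hash_password(password: str, salt: bytes) -> str:
    """Return 'salt:digest' in hex, using salted PBKDF2-HMAC-SHA256."""
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt.hex() + ":" + digest.hex()


masked = mask_email("alice@example.com")
# A fixed salt keeps the example reproducible; a real pipeline would
# generate a random salt per record (for example with os.urandom(16)).
hashed = hash_password("s3cret", b"\x00" * 16)
```

The point of both steps is that the values stored downstream in the warehouse can no longer be read back as the original email or password.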


Data Engineering Projects
ETL Pipeline for Automatic Transaction Reporting
Used GCP services to create a sales report.
Data is uploaded as CSV files through a web portal and stored in Cloud Storage.
Cloud Functions transform the data and load it into Google BigQuery.
Looker Studio turns the data into actionable insights.
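A minimal sketch of the kind of CSV transformation a Cloud Function could apply before loading is shown here. The column names (`order_id`, `order_date`, `amount`), the input date format, and the function name are hypothetical, and the BigQuery load itself is omitted.

```python
import csv
import io
from datetime import datetime


def transform_sales_csv(csv_text: str) -> list[dict]:
    """Parse uploaded CSV text into clean, typed rows; skip malformed records."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        try:
            rows.append({
                "order_id": record["order_id"].strip(),
                # Normalize DD/MM/YYYY (assumed input format) to ISO dates.
                "order_date": datetime.strptime(
                    record["order_date"].strip(), "%d/%m/%Y"
                ).date().isoformat(),
                "amount": round(float(record["amount"]), 2),
            })
        except (KeyError, ValueError, AttributeError, TypeError):
            continue  # drop rows with missing columns or unparseable values
    return rows


uploaded = "order_id,order_date,amount\nA1,05/01/2024,19.999\nA2,not-a-date,5\n"
clean_rows = transform_sales_csv(uploaded)
```

Returning a list of plain dictionaries keeps the transform independent of the loading step, so it can be tested locally without any cloud resources.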
Data Analysis Projects


Business Performance Dashboard – AtliQ Hardware
Processed 200,000+ transaction records to uncover key business trends and insights.
Utilized SQL Workbench for data cleaning, transformation, and in-depth performance analysis.
Developed an interactive dashboard with 7+ key performance indicators (KPIs) to track revenue, profit margins, and contribution metrics.
Integrated dynamic filters for year-wise and month-wise comparisons.
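To make the KPI arithmetic concrete, here is a small Python sketch of how revenue, profit margin, and revenue contribution can be computed per year. The record fields (`year`, `revenue`, `profit`) and the definition of contribution as a group's share of total revenue are assumptions for illustration; the dashboard itself may define these measures differently.

```python
from collections import defaultdict


def kpis_by_group(transactions: list[dict], key: str) -> dict:
    """Aggregate revenue and profit per group, then derive percentage KPIs."""
    totals = defaultdict(lambda: {"revenue": 0.0, "profit": 0.0})
    for t in transactions:
        totals[t[key]]["revenue"] += t["revenue"]
        totals[t[key]]["profit"] += t["profit"]

    total_revenue = sum(v["revenue"] for v in totals.values())
    result = {}
    for group, v in totals.items():
        result[group] = {
            "revenue": v["revenue"],
            # Profit margin: profit as a percentage of the group's revenue.
            "profit_margin_pct": 100 * v["profit"] / v["revenue"] if v["revenue"] else 0.0,
            # Contribution: the group's share of total revenue.
            "revenue_contribution_pct": 100 * v["revenue"] / total_revenue if total_revenue else 0.0,
        }
    return result


# Hypothetical sample records.
sample = [
    {"year": 2023, "revenue": 100.0, "profit": 20.0},
    {"year": 2023, "revenue": 100.0, "profit": 30.0},
    {"year": 2024, "revenue": 200.0, "profit": 20.0},
]
yearly = kpis_by_group(sample, "year")
```

Passing a different `key` (for example a month field) gives the same KPIs at another granularity, which mirrors the year-wise and month-wise filters.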
Workforce Retention Analysis
Analyzed 100,000+ employee records and trained a machine learning model on them, considering factors such as satisfaction level, salary, and experience.
Utilized PyCaret for automated model selection, achieving 85%+ accuracy in predicting workforce attrition.
Processed and stored data in Google BigQuery, ensuring efficient scalability and fast querying.
Made predictions on 15,000 employees to identify potential attrition patterns.
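The automated model selection can be sketched with PyCaret's classification module as follows. This is a minimal outline rather than the project's actual code: the target column name `left`, the `session_id` value, and the function name are assumptions, and the accuracy figure above comes from the original project, not from this sketch.

```python
def train_and_score(train_df, new_df, target="left"):
    """Select the best classifier with PyCaret and score unseen employees.

    Both arguments are expected to be pandas DataFrames; `target` is the
    (assumed) name of the attrition label column in `train_df`.
    """
    # Imported lazily so this module loads even where PyCaret is not installed.
    from pycaret.classification import compare_models, predict_model, setup

    # Prepare the experiment: PyCaret handles the train/test split and preprocessing.
    setup(data=train_df, target=target, session_id=42)

    # Train and cross-validate the available classifiers, keeping the top one.
    best_model = compare_models()

    # Return new_df with the model's predictions appended as extra columns.
    return predict_model(best_model, data=new_df)
```

A call such as `train_and_score(history_df, current_df)` would cover both stages described above: selecting a model on the historical records and predicting attrition for the current employees.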

