Work Experience
Data Engineer
Greenphire LLC (Thoma Bravo portfolio company) • Atlanta, GA
September 2023 – Present
- Design, build, and maintain scalable data pipelines, machine learning models, cloud infrastructure, APIs, DevOps practices, and visualization solutions for clinical trial operations.
- Delivered a Patient Spend Budgeting solution on AWS to forecast clinical trial participant expenses, improving recruitment and trial integrity.
- Built time-series forecasting models (DeepAR on AWS SageMaker) for budget forecasting, incorporating seasonality and trial-phase indicators.
- Architected a company-wide reporting framework (FastAPI, Angular, Qlik, SSO) deployed on Kubernetes for secure, self-service analytics.
- Migrated legacy SQL ETL pipelines to PySpark on AWS Glue, improving scalability and reducing run times.
- Developed reusable Terraform modules for AWS Glue, enabling rapid deployment across environments.
- Implemented CI/CD pipelines (GitHub Actions) for automated testing, provisioning, builds, and deployments.
- Built data quality validation frameworks and self-healing workflows using AWS Lambda, CloudWatch, S3, EventBridge, and Athena.
- Led containerization and GitOps-based deployment using Kubernetes, Helm, and FluxCD.
- Recognition: Nominated for the Phirestarter Award for exceptional contributions and innovation.
Software Engineer
Checkbook.io • San Mateo, CA
May 2022 – December 2022
- Enhanced internal CMS performance using Gatsby and Netlify, improving site speed and deployment workflow.
- Designed and implemented an address verification system (Angular, Python) for new users, improving data accuracy and compliance.
- Built and released new features for digital check creation, streamlining user experience.
- Upgraded the frontend to React 18, enabling modern features and improved performance.
- Diagnosed and resolved critical UI/UX bugs, boosting platform reliability and reducing support tickets.
Machine Learning Engineer Intern
SemiCab • Atlanta, GA
May 2021 – August 2021
- Developed and evaluated forecasting models (Moving Average, SES, Holt-Winters, SARIMA, DeepAR) to predict truck demand.
- Identified DeepAR as the most effective model, significantly outperforming traditional methods.
- Tuned hyperparameters to improve model performance, reducing RMSE and final loss.
- Deployed DeepAR on AWS SageMaker and exposed it via a scalable API endpoint.
- Built a Flask web app for business analysts to test forecasting models via a user-friendly interface.
Computer Science Summer Institute
Google • Atlanta, GA
July 2019 – August 2019
- Developed a full-stack web app aggregating real-time inventory data from multiple merchants using Google APIs.
- Designed a dynamic interface for product comparison (price, specs, availability).
- Collaborated in a team environment, following best practices in web development and software design.
Big Data Intern
Cardlytics • Atlanta, GA
May 2018 – August 2018
- Developed scalable data pipelines to process real-time consumer purchase data for targeted marketing campaigns.
- Built and optimized batch and streaming queries using Kafka and Spark (Scala) for low-latency data ingestion.
- Designed and implemented ETL workflows with Spark (Scala), transforming raw data into actionable insights.
- Deployed and configured a Hadoop cluster for large-scale data storage and distributed processing.
- Engineered data loading from Hadoop into Vertica for advanced analytics and reporting.