Dhruvil Dave

dxd210049@utdallas.edu ; Kaggle ; LinkedIn ; +1 551-888-6315

Education

University of Texas at Dallas Dallas, Texas, USA
M.S. Dec 2023
  • GPA: 3.78 / 4.00
  • Major: Business Analytics - Data Science concentration
  • Minor: Applied Machine Learning
  • Awarded Dean's Excellence Scholarship
Ahmedabad University Ahmedabad, Gujarat, India
B.Tech. May 2022
  • Major: Information and Communication Technology
  • Additional coursework:
    • Data Engineering, Big Data, and Machine Learning on Google Cloud Platform Specialization (Coursera)
    • Introduction to Database Engineering (Udemy)

Experience

Gāyatrī Corporation Ahmedabad, Gujarat, India
Data Science Intern Jun '22 - Aug '22
  • Identified key metrics in the sales data that helped cut sales costs by 7%.
  • Optimized long-running queries in PostgreSQL and BigQuery, achieving a 3x speedup.
Ahmedabad University Ahmedabad, Gujarat, India
Teaching Assistant Jul '21 - Jun '22
  • Served as Teaching Assistant for Advanced Statistics, Computer Networks, and Operating Systems. Taught, handled coursework, and managed a class of 150 students.
PedalsUp Ahmedabad, Gujarat, India
Backend Golang Intern Oct '20 - Nov '20
  • Helped port the Continuous Integration pipeline using Golang, speeding up builds roughly 5.5x, from 1 hour to 11 minutes. Helped coordinate the migration of the production database of 10 million records from SQLite to PostgreSQL on Docker with zero downtime.

Technical Skills

  • Data Science and Machine Learning: Python, R, PostgreSQL, Tidyverse, NumPy, Pandas, Apache Spark, Apache Cassandra, Puppeteer, Google BigQuery, Plotly, Seaborn, Docker
  • Web Development: Golang, TypeScript, C, Linux, Node.js, Deno, Bash, HTML, CSS, Next.js
  • Languages: Gujarati, Hindi, and Sanskrit
  • Hobbies: Pursuing a degree in Indian Classical Music (Vocals and Harmonium), exploring philosophy and various aspects of history, and playing football.


Kaggle Notebooks and Datasets Master
  • Ranked in the top 300 in the Notebooks category and the top 30 in the Datasets category globally among more than a million users on the platform.
  • Secured 5th position in the Song Popularity Contest. Participated in various Machine Learning and Data Science competitions.
  • Strengthened skills in Data Preprocessing, Model Testing, Statistical Modelling, Hypothesis Testing, A/B Testing, and Feature Engineering.
  • Published over 15 datasets and 20 articles, analyses, and tutorial notebooks.
Spotify Charts Dataset
  • A complete dataset of all “Top 200” and “Viral 50” charts published by Spotify, covering daily statistics of the top tracks on the platform. The data was scraped, processed, and curated entirely from scratch. Built a full pipeline using Apache Spark, TypeScript, and Golang to ingest and parse 40 GB of data (approximately 26 million rows) quickly and reliably, update it daily, host it on Kaggle and BigQuery, and create PostgreSQL dumps.
Wikibooks Dataset
  • Created a dataset from a complete dump of Wikibooks in 12 languages. The dataset contains every page of every chapter of the books, along with metadata such as title, abstract, and body in text and HTML, totaling over 12 GB. Hosted on Kaggle, it has been downloaded over 3,000 times and has been selected as a research dataset by various institutes and universities.
Warehouse Storage Optimization
  • Developed a warehouse optimization system as part of a Machine Learning course to understand how clustering and classification algorithms work, using Scikit-Learn, Pandas, and Optuna.

Leadership Experience

  • IEEE Ahmedabad University Student Branch
  • Red-Black Decision Tree Machine Learning Group

Publications

  • Unicode Aware Sanskrit Transliteration: UAST (https://arxiv.org/abs/2203.14277)

Additional Information

  • Eligible to work in the USA for internships and full-time employment for up to 36 months without sponsorship