Dhruvil Dave

Kaggle Master, Machine Learning Engineer, Classical Musician

LinkedIn | Kaggle | GitHub | Mail

Education

University of Texas at Dallas, Dallas, Texas, USA
M.S. (Business Analytics and Data Science)
  • Pursuing an M.S. in Business Analytics with a concentration in Data Science
Ahmedabad University, Ahmedabad, Gujarat, India (भारत)
B.Tech. (Information and Communication Technology)
  • Additional coursework:
    • Completed Data Engineering, Big Data, and Machine Learning on Google Cloud Platform Specialization (Coursera)
    • Introduction to Database Engineering (Udemy)

Publications

Unicode Aware Sanskrit Transliteration: UAST
  • https://arxiv.org/abs/2203.14277 The paper discusses the incompatibilities between the various transliteration and encoding schemes for the Sanskrit language and the Devanagari script, and provides a solution to them. The Unicode standard works in a fundamentally different way from the Sanskrit language. The paper presents what it describes as the easiest possible solution to this problem, which spans domains from Natural Language Processing to Human-Computer Interaction.

Work Experience

Ahmedabad University, Ahmedabad, Gujarat, India
Teaching Assistant, Jul '21 - Jun '22
  • Worked as a teaching assistant for the Advanced Statistics, Computer Networks, and Operating Systems courses, handling coursework and teaching a class of 150 students
PedalsUp, Ahmedabad, Gujarat, India
Backend Golang Intern, Oct '20 - Nov '20
  • Helped port the Continuous Integration pipeline to Golang, speeding up builds roughly 5.5 times, from 1 hour to 11 minutes. Helped coordinate the migration of the production database of 10 million records from SQLite to PostgreSQL on Docker with zero downtime

Projects

Spotify Charts Dataset
  • A complete dataset of all “Top 200” and “Viral 50” charts published by Spotify, containing daily statistics of the top tracks on the platform. The data was scraped, processed, and curated completely from scratch. An entire pipeline was written using Apache Spark, TypeScript, and Golang to ensure smooth and fast ingestion and parsing of 40 GB of data (approximately 26 million rows), update it daily, host it on Kaggle and BigQuery, and create PostgreSQL dumps
Wikibooks Dataset
  • Created a dataset from a complete dump of Wikibooks in 12 languages. The dataset, over 12 GB in size, contains all the pages of all the chapters of the books, along with metadata such as title, abstract, and body in both text and HTML. Hosted on Kaggle, it has been downloaded more than 3,000 times and has been selected as a research dataset by various institutes and universities
Warehouse Storage Optimization
  • Developed a warehouse optimization system as part of a Machine Learning course to understand the workings of clustering and classification algorithms, using Scikit-Learn, Pandas, and Optuna

Leadership Experience

IEEE Ahmedabad University Student Branch, Ahmedabad, Gujarat, India
  • Served as Treasurer of the Women in Engineering branch of IEEE Ahmedabad University
Red-Black Decision Tree, Ahmedabad, Gujarat, India
  • Creator of a community of Machine Learning and Data Science enthusiasts where we host weekly talks and activities

Skills & Interests

Skills
  • Data Science and Machine Learning: Python, R, PostgreSQL, Tidyverse, NumPy, PyTorch, Pandas, Apache Spark, Apache Cassandra, Puppeteer, Google BigQuery, Plotly, Seaborn, Docker
  • Web Development: Golang, TypeScript, C, Linux, Node.js, Deno, Bash, HTML, CSS, Next.js
Interests
  • Fluent in English and Gujarati, and can comfortably communicate in Sanskrit and Hindi. I spend a lot of time exploring new technologies and programming languages, frameworks, and libraries. I am pursuing a degree in Indian Classical Music (Vocals and Harmonium), and I enjoy exploring philosophies and various aspects of history, as well as playing football