Rushikesh Gholap
I am a Data Science aspirant with four years of expertise in data analysis, machine learning, and forecasting, with a focus on AI-driven solutions in the Fintech industry. I am currently a Master's student in Data Science at Drexel University, set to graduate in May 2025. Here, I am learning and practicing data tools with cutting-edge techniques to process data for optimal and relevant analysis.
As a child, delving into the realm of computers felt akin to venturing into an uncharted universe, where programs and games danced to the tune of my commands. This ignited a profound fascination with the inner workings of these machines. With an inherent inclination towards science and a penchant for logical reasoning, I found myself drawn towards the world of analytics; for instance, predicting how a person behaves given we have past information about them.
At Markytics, I led projects and provided mentoring in data preprocessing and ML metrics. We developed a multilingual voice bot with custom-trained ASR models that replicates how human agents handle calls. We built it with Python, R, SQL, and various data analysis and machine learning libraries, and hosted the entire project on cloud services such as AWS.
Post-graduation, I aim to work as a Data Scientist to deepen my understanding of the finance and banking domain. In the long run, I envision starting my own Fintech firm, applying my Data Science expertise to drive product excellence, specifically assisting in the analysis of customer loan credibility.
Outside of my studies, I enjoy trekking, running marathons, and DIY projects; my latest hobbies are singing and learning swimming strokes.
Stack and Algorithms
- Deep Learning - Image segmentation, Language modelling, NLP, Word2vec, Transformers, Variational Auto-encoders
- Classic ML - Generalized linear models, Ensemble models (StackNet, XGBoost, CatBoost, Random Forests), Tree based models (Decision Trees), SVMs, K-means, t-SNE, PCA, Probabilistic models, Forecasting (ARIMA, SARIMAX)
- Libraries - NumPy, Pandas, OpenCV, Pillow, Collections, Itertools, Scikit-learn, TensorFlow 2, Keras, PyTorch, Matplotlib, SciPy, Django, Flask, PySpark, Selenium, Dplyr, Tidyr, Ggplot2, VueJs, AppScripts
- Languages - Python3, R, SQL
- Cloud - Ngrok, AWS, Twilio, Azure, GCP, Google Analytics
- Data Systems and Tools - Anaconda3, Jupyter, Postgres, MSSQL, MySQL, SQLite, MS PowerBI, Tableau, Superset, Postman, VS Code, Notebooks
Calling Voice-Bot
Data Scientist, Markytics
Spearheaded the development of a multilingual voice bot, aiming to enhance call handling efficiency and user interaction.
Details: The project focused on addressing the challenge of efficient call management in multilingual contexts. My role involved developing custom Whisper ASR models integrated with an Asterisk server for sophisticated call handling and creating dynamic intent recognition and object-oriented dialogue flows.
Technologies Used: Python3, ASR, Whisper, TensorFlow 2, Sub-graph embeddings.
Challenges and Solutions: One of the significant challenges was ensuring natural, lifelike responses in multiple languages. This was achieved by leveraging Google's TTS with SSML, enabling the bot to deliver responses that closely mimic human interaction.
Results and Impact: The voice bot achieved an average response time ratio of 0.68, automating level 1 & 2 calls and saving over 4 hours daily for human agents. This allowed agents to focus more on critical calls, significantly improving operational efficiency.
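The SSML step mentioned under Challenges and Solutions can be illustrated with a minimal sketch. It only builds the SSML string that would be handed to a text-to-speech service and does not call any Google API. The function name `build_ssml`, the pause length, and the speaking rate are illustrative assumptions, not details from the project.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences, pause_ms=300, rate="medium"):
    """Wrap plain-text sentences in SSML with a short pause between them.

    Hypothetical helper: parameter names and defaults are assumptions.
    """
    parts = []
    for sentence in sentences:
        # Escape &, < and > so caller-supplied text cannot break the markup.
        parts.append(f"<s>{escape(sentence)}</s>")
    # A <break> between sentences gives the reply a more natural rhythm.
    pause = f'<break time="{pause_ms}ms"/>'
    body = pause.join(parts)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'
```

For example, `build_ssml(["Hello.", "Your payment is due tomorrow."])` returns one `<speak>` document with two `<s>` sentences separated by a 300 ms break.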
Bank Loan Assist Portal
Data Scientist, Markytics
Developed an automated portal for new bank account creation and loan processing, integrated with the bank’s vendor API.
Details: This project addressed the need for efficient bank account creation and loan application processing. My role involved automating document data extraction (like CIBIL and ITR) using Tika and conducting KYC verifications for multiple entities. The project aimed to streamline the loan approval process with a comprehensive scoring system.
Technologies Used: Brickstream, SQL, R, Video Processing.
Challenges and Solutions: A key challenge was accurately processing and evaluating a large volume of financial documents. This was overcome by employing advanced data extraction tools and algorithms, enabling precise and rapid processing of sensitive financial data.
Results and Impact: The portal significantly improved the speed and accuracy of bank account creation and loan processing, enhancing customer satisfaction and operational efficiency. The scoring system based on 100+ parameters provided a robust framework for loan approval decisions.
Loan Management Portal
Data Scientist, Markytics
Created a Django-based platform for bank employees to monitor and track loan customers, integrating a WhatsApp chatbot for improved communication.
Details: The project aimed to enhance the efficiency of loan management in banks. My responsibilities included developing a comprehensive platform that allowed employees to monitor and track loan customers and their performance. Additionally, I built a WhatsApp chatbot using Google’s MuRIL for seamless loan repayment interactions and managed user notifications using Twilio.
Technologies Used: Django, Google’s MuRIL, Twilio, PowerBI.
Challenges and Solutions: A significant challenge was creating a bot that could understand and respond to text and audio messages in various Indian regional languages. This was addressed by utilizing advanced NLP techniques and integrating multi-language support, enabling effective and natural communication with customers.
Results and Impact: The portal and chatbot significantly improved the loan management process, allowing for better tracking of customer performance and more efficient communication. This led to improved customer service and more effective management of loan repayments.
Promotion/Offers/Coupon Recommender
Data Scientist, Markytics
Developed an ETL tool for recommending personalized promotions and offers to customers through a mobile app API.
Details: This project focused on enhancing customer engagement by providing tailored offers and coupons. My role encompassed building an ETL tool that extracted relevant customer and merchant data, and creating a recommendation engine using Azure-hosted Django with a CatBoost model. This system used customers' redemption history and merchant offers to generate personalized recommendations.
Technologies Used: Azure, Django, CatBoost, SQL.
Challenges and Solutions: The main challenge was accurately predicting customer preferences and offer relevance. This was tackled by employing a machine learning model that analyzed historical data patterns to forecast customer interests and offer appeal effectively.
Results and Impact: The recommender system successfully improved customer engagement by delivering more relevant and appealing offers, leading to an increase in offer redemption rates and overall customer satisfaction.
Recaptured Image Identification of Meter for Billing
Data Scientist, Markytics
Developed a Progressive Web App (PWA) using VueJs and Django for identifying recaptured images of meters for accurate billing.
Details: The goal was to enhance the accuracy and reliability of billing through image analysis. I was responsible for building a PWA that utilized OpenCV and PIL for image processing, and creating a machine learning model with Keras and TensorFlow (mCNN). The focus was on identifying Moiré patterns in images to distinguish between original and recaptured images.
Technologies Used: VueJs, Django, OpenCV, PIL, TensorFlow, Keras.
Challenges and Solutions: A major challenge was differentiating between original and recaptured images with high accuracy. This was overcome by training the model on a dataset of 300 images for each class, using frequency-based decomposition and augmentation techniques to enhance the model's ability to recognize subtle patterns.
Results and Impact: The system achieved an accuracy of 93% and an F1 score of 0.91, significantly improving the accuracy of billing processes and reducing the instances of billing errors due to misidentified images.
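The frequency-based decomposition mentioned under Challenges and Solutions could look like the following minimal sketch. It assumes a single-level 2D Haar wavelet transform, which is one common way to split an image into low- and high-frequency bands; the write-up above does not say which decomposition the project used. The function names `haar_decompose` and `band_energy` are hypothetical, and plain Python lists stand in for real image arrays.

```python
def haar_decompose(image):
    """Single-level 2D Haar decomposition of a grayscale image.

    `image` is a list of rows with even height and width. Returns four
    half-size bands: (approx, col_detail, row_detail, diag_detail).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    approx, col_detail, row_detail, diag_detail = [], [], [], []
    for r in range(0, rows, 2):
        a_row, c_row, r_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block: top-left, top-right, bottom-left, bottom-right.
            tl, tr = image[r][c], image[r][c + 1]
            bl, br = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((tl + tr + bl + br) / 4)  # local average
            c_row.append((tl - tr + bl - br) / 4)  # left minus right column
            r_row.append((tl + tr - bl - br) / 4)  # top minus bottom row
            d_row.append((tl - tr - bl + br) / 4)  # diagonal difference
        approx.append(a_row)
        col_detail.append(c_row)
        row_detail.append(r_row)
        diag_detail.append(d_row)
    return approx, col_detail, row_detail, diag_detail


def band_energy(band):
    """Mean squared value of a band; larger values mean more fine detail."""
    values = [v for row in band for v in row]
    return sum(v * v for v in values) / len(values)
```

On a flat image all three detail bands have zero energy, while fine periodic stripes of the kind Moiré interference produces show up as energy in the detail bands. In a pipeline like the one described, such bands would be inputs to the CNN rather than a classifier on their own.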
Track & Report Sales Insights
Data Scientist, Markytics
Built an ETL tool for generating insightful sales reports from large datasets, enhancing data-driven decision-making in business operations.
Details: The project's objective was to provide comprehensive sales insights by processing extensive data. My role included developing an ETL tool that gathered data from PostgreSQL, and using data analysis libraries like NumPy and Pandas to process and summarize this data. I also developed tailored Excel reports with PyExcelerate and a Django API for report generation within specific time windows.
Technologies Used: PostgreSQL, NumPy, Pandas, PyExcelerate, Django.
Challenges and Solutions: A significant challenge was managing and summarizing the large corpus of data efficiently. I addressed this by utilizing advanced data processing techniques and creating optimized SQL views to handle the data effectively.
Results and Impact: The solution enabled the production of detailed, customized sales reports, providing critical insights that supported strategic business decisions and performance improvements.
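The combination of a summarizing SQL view and a time-windowed report, as described above, might be sketched as follows. To keep the example runnable anywhere it uses Python's built-in `sqlite3` in place of PostgreSQL. The table, column, and view names and the sample rows are all hypothetical and not taken from the project.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with toy sales rows and a summary view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sold_on TEXT, product TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2024-01-01", "A", 100.0),
            ("2024-01-01", "B", 50.0),
            ("2024-01-02", "A", 70.0),
            ("2024-01-05", "B", 30.0),
        ],
    )
    # The view pre-aggregates per day and product, so report queries
    # read a much smaller summary instead of scanning raw rows.
    conn.execute(
        """
        CREATE VIEW daily_product_sales AS
        SELECT sold_on, product, SUM(amount) AS total, COUNT(*) AS orders
        FROM sales
        GROUP BY sold_on, product
        """
    )
    return conn


def sales_report(conn, start, end):
    """Return (product, total, orders) rows for an inclusive date window."""
    cursor = conn.execute(
        "SELECT product, SUM(total), SUM(orders) "
        "FROM daily_product_sales "
        "WHERE sold_on BETWEEN ? AND ? "
        "GROUP BY product ORDER BY product",
        (start, end),
    )
    return cursor.fetchall()
```

Calling `sales_report(build_demo_db(), "2024-01-01", "2024-01-02")` yields one row per product for that window. In a setup like the one described, the resulting rows would then be written to an Excel report.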