Who I Am
I am an undergraduate student majoring in Electronics & Telecommunications at VIIT Pune. I have a keen interest in deep learning, with a specific focus on the intersection of language and vision modalities. I enjoy exploring and implementing research ideas, and I like communicating my work through articles and teaching others through courses and talks.
I like to write highly readable and self-contained code, with Python being my primary programming language. I am an intermediate learner in TensorFlow and PyTorch.
I am currently working on building open-source models on Hugging Face 🤗 and learning about machine learning in production. I enjoy writing technical articles and giving talks about deep learning.
Outside of work, I enjoy watching movies and TV shows like Suits, and I play cricket regularly. If you have any questions or need assistance, feel free to reach out to me. I am always happy to help!
Actively Looking for an Internship 👨💻
I am generally interested in deep learning, and more specifically in language and vision modalities. I enjoy implementing research ideas, sometimes incorporating them into practical applications, and communicating my implementation details through open-source models.
A replication of the architecture proposed in the Swin Transformer paper, applied to medical image semantic segmentation. Swin Transformer is an architecture introduced in the paper "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" by Ze Liu et al. It has shown promising results in various computer vision tasks, including image classification and object detection.
Implementation of a multi-head YOLOv9 model for clothes detection and instance segmentation. The model is trained on the DeepFashion dataset and evaluated using MSCOCO.
A translation model that translates English sentences to Hindi.
A neural machine translation approach to building a scalable fine-tuned model.
ScriptForge is a base model series that allows users to generate YouTube video scripts. These scripts can be used to create new and exciting content.
QA-BERT is a question-answering model designed to be lighter than typical question-answering models. It is trained on the SQuAD dataset.
Hinglish MaskedLM is a masked language model custom trained on Hinglish data. Hinglish is a combination of Hindi and English.
Yes, I am a Patent Holder! A Showcase of My Patented Solutions
Indian Patent | Published 2023
The solution is designed to assist blind individuals in gaining awareness of their surroundings by providing auditory access to real-time video information.
Authors - Mr. Shreyas Dixit, Dr. Pradnya Dixit
Indian Patent | Published 2023
Our solution can help visually impaired people access and experience the world around them by providing them with information about and descriptions of images.
Authors - Mr. Shreyas Dixit, Dr. Parikshit Mahalle, Dr. Pradnya Dixit, Mr. Yashwant Ingle
Experience
Worked on stutter detection using deep learning. The project included data gathering from schools, colleges, and universities, as well as data preprocessing and exploratory data analysis.
Developed a system for stutter detection that identifies when stuttering occurs in a recording and calculates the percentage of stuttering. This work used modern deep learning architectures such as Wav2Vec2 and Agnostic BERT.
Contributed to the development of "Real-time Accent Conversion System: Indian to US Accent".
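To illustrate the stuttering-percentage calculation mentioned above, here is a minimal Python sketch. It is not the project's actual code: it assumes the detector outputs stuttered intervals as (start, end) pairs in seconds, and the function name `stutter_percentage` and its input format are hypothetical.

```python
from typing import Iterable, Tuple


def stutter_percentage(
    segments: Iterable[Tuple[float, float]], total_duration: float
) -> float:
    """Return the share of a recording (0-100) covered by stuttered segments.

    `segments` is an assumed format: (start, end) times in seconds.
    Overlapping segments are merged so no time is counted twice.
    """
    if total_duration <= 0:
        raise ValueError("total_duration must be positive")

    # Clip every segment to the recording bounds, then sort by start time.
    clipped = sorted(
        (max(0.0, start), min(total_duration, end)) for start, end in segments
    )

    stuttered = 0.0
    cur_start = cur_end = None
    for start, end in clipped:
        if end <= start:
            continue  # skip empty or inverted intervals
        if cur_end is None or start > cur_end:
            # Close the previous merged interval and start a new one.
            if cur_end is not None:
                stuttered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps the current interval: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        stuttered += cur_end - cur_start

    return 100.0 * stuttered / total_duration
```

For example, calling `stutter_percentage([(1.0, 2.0), (1.5, 3.0), (8.0, 9.0)], 10.0)` merges the first two intervals into one and reports that 3 of the 10 seconds are stuttered.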
During my time at BVIRAL, I will be actively involved in building a deep learning pipeline for their content company. This exciting project entails developing a comprehensive system that generates relevant titles and categorizes millions of short-form videos on Instagram. As a Deep Learning Engineering Intern, my responsibilities will include designing and implementing algorithms, optimizing model performance, and collaborating with a talented team to ensure the success and effectiveness of the pipeline. I am thrilled to contribute my skills and knowledge to this project and make a significant impact in the field of content generation and categorization.
Leading the Microsoft Club at Vishwakarma Institute of Information Technology Pune.
During my internship, I developed a multi-page website for a holiday home or villa. The project involved working in an Agile team, following an iterative and collaborative approach to project management. We analyzed the client's requirements, developed a project plan, and used HTML, CSS, and JavaScript to build the website. We incorporated features such as an attractive landing page, an image gallery, and a contact form. Through collaborative teamwork and regular meetings, we successfully delivered a website that met the client's expectations.
Certificates
Projects
Conference, Talks & Other
AutoSprout is an automatic irrigation system that lets users log in to a website or app and monitor their plants as well as agricultural land.
This research explores the use of convolutional neural networks (CNNs) to accurately classify aerial images of crop fields. By creating an annotated dataset and training the CNN model, the study achieved over 90% accuracy in differentiating various crop types and road areas. The successful classification of crops using aerial imagery has significant implications for precision agriculture, enabling targeted interventions and optimized resource allocation to enhance crop yields.
IEEE | December 22
Federated Learning is a new approach to machine learning that allows users to train models on their data without the data leaving the edge device. However, this approach is vulnerable to data manipulation by attackers. This paper explores the data integrity of Federated Learning and models different types of cyberattacks. Conclusions are presented along with the characteristics and behaviors of the attacks.
Talks
I had the opportunity to teach at Symbiosis Institute of Technology Pune, where my fellow members and I taught more than 100 students for over a month on topics such as NumPy, pandas, Matplotlib, and other data analysis tools. It has been a fulfilling experience to share my knowledge and see my students develop new skills.
Newsletter