I am a final-year undergraduate student at Sharda University, Greater Noida, majoring in Computer Science and Data Science. My academic pursuits focus primarily on Large Language Models (LLMs), especially for legal and financial applications.
I am currently engaged in a Research Internship at IIIT-Naya Raipur under Dr. Santosh Kumar, where I explore real-world deployments of AI in data-driven environments.
Last summer, I interned at Infosys under Trapti Singhal, where I developed a time-series-based demand forecasting model that improved forecast accuracy by 5% and presented the results to board-level stakeholders.
Under the guidance of Dr. Ali Imam Abidi, I co-developed Legal Assist AI, which outperformed GPT-3.5 on legal reasoning tasks. Building on that, I helped create CA-Ben, a benchmark for financial reasoning in LLMs, which I presented at MoStart 2025 in Bosnia and Herzegovina.
I have co-authored multiple book chapters in the field of AI, contributing to academic research and interdisciplinary knowledge sharing.
Please feel free to check out my resume.
You can also find me on the platforms below.
Awarded Merit Scholarship for Academic Year 2022-2023.
Sep '22
Joined Sharda University, Greater Noida for B.Tech in CS with specialization in Data Science and Analytics.
Sharda University, Greater Noida Bachelor of Technology in CSE - Data Science and Analytics
September '22 - June '26
Awards: Merit Scholarship Awardee 2022-2023
Student Societies:
Member | National Service Scheme (NSS)
Research Intern | IIIT Naya Raipur
May '25 - Present
Naya Raipur, Chhattisgarh, India — Onsite [GitHub]
Currently working as a Research Intern at IIIT Naya Raipur under Dr. Santosh Kumar through the Outreach Internship Programme (OIP). I'm focused on research and development in Generative AI and NLP (Natural Language Processing), addressing real-world challenges in data-centric domains. The work involves exploring novel methods, contributing to research outputs, and collaborating on academic publications.
Bengaluru, Karnataka, India — Remote
May '24 - July '24 [GitHub]
Our project, "Demand Forecasting for E-commerce", conducted under the guidance of Trapti Singhal at Infosys, involved designing a demand forecasting model using time series analysis, leading to a 5% improvement in product sales forecast reliability. I optimized the model through feature engineering and collaborated with a team of 10 to deliver a comprehensive presentation of our findings and methodology to board members.
Jatin Gupta, Akhil Sharma, Saransh Singhania, Mohammad Adnan, Sakshi Deo, Dr. Ali Imam Abidi*, Keshav Gupta
MOSTART-2025: International Conference on Digital Transformation in Education and Artificial Intelligence Applications. Paper available soon.
We introduce CA-Ben, a benchmark for testing the financial, legal, and quantitative reasoning of language models in the Indian context, using data from ICAI's Chartered Accountancy exams. Six top LLMs are evaluated, with GPT-4o and Claude 3.5 Sonnet leading in legal and conceptual tasks. However, challenges remain in numerical accuracy and legal interpretation, underscoring the need for hybrid reasoning and retrieval-augmented methods in financial AI.
Rajat Singhal, Jatin Gupta, Akhil Sharma, Anushka Gupta, Navya Sharma
RAIT-2025: International Conference on Recent Advances in Information Technology. Paper available soon.
We propose a vision-based solution for Indian Sign Language (ISL) recognition to support the deaf and mute communities. Using CNNs and Mediapipe for hand gesture detection, our model achieves 99.95% accuracy on the ISL dataset. Evaluated with F1 score and precision-recall, the system offers a complete pipeline for inclusive communication technology.
Jatin Gupta, Rajat Singhal, Navya Sharma, Brijesh Yadav, Vedant Tiwari
ICRTICC-2025: International Conference on Recent Trends in Intelligent Computing and Communication. Paper available soon.
We introduce Vaani, a real-time, multilingual communication tool for individuals with hearing and speech impairments, designed for critical settings like healthcare. It combines speech-to-text, text-to-speech, offline access, and secure encryption to ensure effective, accessible communication, especially in rural areas. Vaani enhances interaction, bridging communication gaps in both medical and everyday contexts.
Jatin Gupta, Akhil Sharma, Saransh Singhania, Dr. Ali Imam Abidi*
arXiv preprint. [preprint]
We present Legal Assist AI, a transformer-based model built to enhance legal access in India. Trained on datasets like Lawyer_GPT_India and AIBE, it answers legal queries with high accuracy. In evaluations, it scored 60.08% on the AIBE, outperforming models like GPT-3.5 Turbo and Mistral 7B in legal reasoning and reliability. Unlike others, it minimizes hallucinations, making it suitable for real-world use. Aimed at both legal professionals and the public, future versions will support multilingual and case-specific queries.
Text-to-3D Scene Generator is a web application that uses generative AI to create 3D scenes and models from text descriptions. It integrates Three.js for rendering, the Azure OpenAI API for generating simpler models, and the Tripo3D API for creating realistic 3D models. Built with a Flask backend and a modern HTML/CSS/JavaScript frontend, it allows users to create, view, interact with, and download 3D models through a responsive web interface.
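The backend pattern described above can be sketched as a minimal Flask route. This is an illustrative stub only: the route name and the `generate_model` helper are assumptions, and the real app calls the Azure OpenAI and Tripo3D APIs rather than returning a placeholder.

```python
# Minimal sketch of the Flask backend pattern; generate_model is a
# hypothetical stand-in for the external 3D-generation API calls.
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_model(prompt: str) -> dict:
    """Stub for the external generation call (Azure OpenAI / Tripo3D)."""
    return {"prompt": prompt, "model_url": "/static/models/example.glb"}

@app.route("/generate", methods=["POST"])
def generate():
    # The frontend posts a text description and loads the returned
    # model URL into a Three.js viewer.
    prompt = request.get_json(force=True).get("prompt", "")
    return jsonify(generate_model(prompt))
```

A JSON POST to `/generate` returns the prompt along with a URL the Three.js frontend can load.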
CA-ThinkFlow is an AI-powered financial consulting application designed to assist users with various financial queries. Built using Streamlit and Langchain, this application leverages advanced language models to provide accurate and context-aware responses to user questions related to finance. The system implements Retrieval Augmented Generation (RAG) to enhance response accuracy by referencing a curated knowledge base of financial documents and regulations. A robust fallback mechanism switches between different language models when confidence scores are low, ensuring reliable responses even for complex queries.
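The fallback mechanism described above can be sketched as a simple confidence gate. This is a hedged illustration with stand-in scoring functions, not the application's actual models or threshold; `query_primary`, `query_fallback`, and the 0.7 cutoff are hypothetical names and values.

```python
# Illustrative sketch of confidence-based model fallback; the scoring
# functions are stubs standing in for real LLM calls (hypothetical).
CONFIDENCE_THRESHOLD = 0.7  # assumed cutoff, not the app's actual value

def query_primary(question: str) -> tuple[str, float]:
    """Stand-in for the primary LLM: returns (answer, confidence)."""
    score = 0.9 if "GST" in question else 0.4
    return f"primary answer to: {question}", score

def query_fallback(question: str) -> tuple[str, float]:
    """Stand-in for the secondary model used when confidence is low."""
    return f"fallback answer to: {question}", 0.8

def answer_with_fallback(question: str) -> str:
    answer, confidence = query_primary(question)
    if confidence < CONFIDENCE_THRESHOLD:
        # Low confidence: route the same query to the backup model.
        answer, confidence = query_fallback(question)
    return answer

print(answer_with_fallback("What is the GST rate on services?"))
print(answer_with_fallback("Explain deferred tax liability"))
```

In the real system the confidence score would come from the model or a retrieval-relevance check rather than a keyword stub.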
The code establishes a robust framework for detecting tampered images using Error Level Analysis (ELA) and a Convolutional Neural Network (CNN). ELA highlights compression artifacts indicative of tampering, and images are preprocessed, normalized, and resized to a fixed dimension. The dataset, containing authentic and tampered images, is labeled and split into training, validation, and test sets. Data augmentation is applied to enhance generalization, and the model is trained with early stopping. Evaluation includes confusion matrices, accuracy, precision, and F1-scores, achieving ~90% accuracy. The pipeline also visualizes tampered regions and predicts authenticity with high confidence, demonstrating its efficacy for digital image forensics.
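The ELA preprocessing step described above can be sketched with Pillow: re-save the image at a known JPEG quality and amplify the difference, so regions with inconsistent compression history stand out. The quality level and resize dimensions here are illustrative assumptions, not the project's exact settings.

```python
# Minimal sketch of Error Level Analysis (ELA) preprocessing with Pillow;
# quality=90 and the 128x128 CNN input size are assumed values.
import io
from PIL import Image, ImageChops

def ela_image(img: Image.Image, quality: int = 90, size=(128, 128)) -> Image.Image:
    """Re-save at a fixed JPEG quality and diff against the original;
    tampered regions tend to show different compression error levels."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, "JPEG", quality=quality)
    buf.seek(0)
    resaved = Image.open(buf)
    diff = ImageChops.difference(img.convert("RGB"), resaved)
    # Normalize so the strongest error level maps to full brightness.
    extrema = diff.getextrema()
    max_diff = max(hi for _, hi in extrema) or 1
    scale = 255.0 / max_diff
    diff = diff.point(lambda p: min(255, int(p * scale)))
    return diff.resize(size)  # fixed input size for the CNN

demo = Image.new("RGB", (64, 64), (120, 60, 200))
ela = ela_image(demo)
print(ela.size, ela.mode)
```

The resulting ELA map, rather than the raw image, is what the CNN consumes for authentic-vs-tampered classification.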