Projects
Sentiment Analysis of Movie Reviews
Predicting sentiment (positive/negative) of movie reviews using NLP and machine learning. (Scikit-learn, NLTK, Pandas)
This project applied natural language processing (NLP) techniques to classify the sentiment (positive or negative) of movie reviews. Using scikit-learn, NLTK, and Pandas, I preprocessed the text with tokenization, stop word removal, and stemming, then extracted features using the Term Frequency-Inverse Document Frequency (TF-IDF) weighting scheme.
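As a rough illustration of the preprocessing and weighting steps described above, the following standard-library sketch tokenizes text, removes stop words, and computes TF-IDF weights. It is not the project's code: the tiny stop word list, the function names `tokenize` and `tfidf`, and the sample reviews are assumptions made for this example, stemming is omitted, and the smoothed IDF formula is one common variant chosen here for simplicity.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop word list (an assumption for this sketch);
# the project itself relied on NLTK for preprocessing.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "this", "it"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = raw term count * (ln((1 + n) / (1 + df)) + 1),
    with no length normalization.
    """
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return weights


# Hypothetical sample reviews.
reviews = ["The movie was great", "The movie was dull and slow"]
vectors = tfidf(reviews)
```

In this toy corpus, "movie" appears in both reviews and so receives a lower weight than "great", which appears in only one. That is the property that makes TF-IDF useful for separating reviews.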
Model comparison ended in a tie between two top performers: Support Vector Machines (SVMs) with linear and RBF kernels both achieved 89.9% accuracy. I selected the linear SVM as the final model because it is simpler and easier to deploy. The confusion matrix shows a slight bias toward positive classifications, which leaves room for further optimization, including techniques that address class imbalance.
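One way to quantify the kind of positive bias mentioned above is to compare how often a model predicts the positive class with how often that class actually occurs. The sketch below uses only the standard library; the function names, the "pos"/"neg" label strings, and the toy labels are assumptions for illustration and are not taken from the project's results.

```python
def confusion_counts(y_true, y_pred, positive="pos"):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def accuracy(counts):
    """Fraction of predictions that match the true label."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


def positive_bias(counts):
    """Predicted-positive rate minus actual-positive rate.

    A value above zero means the model predicts "positive" more often
    than the class actually occurs.
    """
    total = sum(counts.values())
    predicted_pos = counts["tp"] + counts["fp"]
    actual_pos = counts["tp"] + counts["fn"]
    return (predicted_pos - actual_pos) / total


# Toy labels, made up for this example.
y_true = ["pos", "pos", "neg", "neg", "neg"]
y_pred = ["pos", "pos", "pos", "neg", "neg"]
counts = confusion_counts(y_true, y_pred)
```

Here one negative review is misclassified as positive and no positive review is missed, so `positive_bias(counts)` comes out above zero.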
StudyBuddy
An AI-powered study companion application built with React and Cloudflare Workers, integrating Anthropic’s Console, OpenAI’s API, and GroqCloud for its AI features.
StudyBuddy AI is an innovative educational platform designed to revolutionize the way students learn and interact with study materials. Leveraging React for a dynamic frontend and Cloudflare Workers for a serverless backend, this project seamlessly integrates Anthropic’s Claude AI model to provide intelligent, context-aware study assistance. The application offers personalized learning experiences, real-time explanations across various subjects, and chat-based interactions for in-depth topic exploration.
A key technical challenge was implementing a scalable architecture capable of handling real-time AI interactions while maintaining low latency. This was achieved through strategic use of Cloudflare Workers and the D1 database for efficient data management and API routing. The project has a user authentication system with tiered access levels, integrated with Stripe for subscription management. Other features include chat history management, PDF document analysis, and adaptive responses based on user subscription levels. Ongoing development focuses on enhancing the AI’s contextual understanding of educational content and implementing advanced analytics to track and optimize user learning paths.
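The tiered access idea can be sketched as a small policy table that the backend consults before answering a request. The example below is written in Python for illustration only; the application itself runs on Cloudflare Workers. The tier names, the limits, and the helper names (`TierPolicy`, `policy_for`, `can_send_message`) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierPolicy:
    daily_messages: int       # chat messages allowed per day
    max_response_tokens: int  # upper bound on AI response length
    pdf_analysis: bool        # whether PDF document analysis is enabled


# Hypothetical tiers and limits, for illustration only.
POLICIES = {
    "free": TierPolicy(daily_messages=20, max_response_tokens=512, pdf_analysis=False),
    "pro": TierPolicy(daily_messages=500, max_response_tokens=2048, pdf_analysis=True),
}


def policy_for(tier):
    """Return the policy for a tier, falling back to the most restrictive one."""
    return POLICIES.get(tier, POLICIES["free"])


def can_send_message(tier, messages_sent_today):
    """Check the daily message quota for the given tier."""
    return messages_sent_today < policy_for(tier).daily_messages
```

Defaulting unknown tiers to the most restrictive policy is a deliberate choice in this sketch: a missing or malformed subscription record then never grants paid features by accident.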
Bible-Based Language Model (TBC)
Developing a specialized T5-based language model for biblical text analysis and generation using PyTorch and Hugging Face Transformers.
Current Status: Awaiting compute resources
This project focuses on creating a sophisticated language model tailored for biblical texts and theological analysis. Utilizing the T5 architecture and PyTorch, I implemented a comprehensive data preprocessing pipeline to handle multiple Bible versions and theological texts. The model is designed to perform various tasks including verse completion, thematic analysis, and contextual interpretation of biblical passages.
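T5 treats every task as text-to-text, so a task like verse completion can be framed as an (input, target) string pair. The sketch below shows one plausible way to build such a pair using only the standard library. The task prefix `"complete verse: "`, the `keep_ratio` parameter, and the function name are assumptions for illustration, not the project's actual pipeline.

```python
def verse_completion_pair(verse, prefix="complete verse: ", keep_ratio=0.5):
    """Split a verse into a prefixed model input and a target continuation.

    The first part of the verse becomes the input; the rest becomes the
    text the model should generate.
    """
    words = verse.split()
    if len(words) < 2:
        raise ValueError("verse needs at least two words to split")
    # Keep at least one word on each side of the cut.
    cut = max(1, min(len(words) - 1, int(len(words) * keep_ratio)))
    return prefix + " ".join(words[:cut]), " ".join(words[cut:])


source, target = verse_completion_pair(
    "In the beginning God created the heaven and the earth."
)
```

Pairs in this form could then be tokenized and passed to a sequence-to-sequence training loop. Other tasks would use a different prefix so that a single model can distinguish between them.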
A hybrid training approach was employed, leveraging transfer learning from smaller to larger models to optimize performance. Custom evaluation metrics were developed to assess the model’s capability in tasks specific to biblical understanding. The evaluation framework provides insights into the model’s performance across different aspects of biblical knowledge and interpretation. While initial results show promise in capturing biblical language patterns, ongoing work focuses on refining the model’s accuracy and contextual understanding.
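The custom metrics themselves are not described here. As a generic stand-in, the sketch below computes a token-level F1 score between a generated completion and a reference, a common way to score generated text when exact matches are too strict. The function name and the whitespace tokenization are assumptions for this example.

```python
from collections import Counter


def token_f1(prediction, reference):
    """Token-overlap F1 between a prediction and a reference string."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A partial completion such as "the heaven" scored against "the heaven and the earth" has perfect precision but low recall, so it scores well below 1.0.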