Projects
Sentiment Analysis of Movie Reviews
Predicting sentiment (positive/negative) of movie reviews using NLP and machine learning. (Scikit-learn, NLTK, Pandas)
This project applies natural language processing (NLP) techniques to classify movie reviews as positive or negative. Using scikit-learn, NLTK, and Pandas, I preprocessed the text through tokenization, stop-word removal, and stemming, then extracted features with Term Frequency-Inverse Document Frequency (TF-IDF) weighting.
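As a rough illustration of the preprocessing and TF-IDF steps described above, here is a minimal sketch that uses only the Python standard library. It is not the project's code: the stop-word list, the `crude_stem` helper, and the toy reviews are all hypothetical stand-ins for NLTK's tokenizer, stop-word corpus, and stemmer. The weighting is the plain textbook form tf * log(N / df), so the values will differ from those of scikit-learn's `TfidfVectorizer`, which applies smoothing and normalization by default.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); NLTK's English list is much longer.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "it", "this", "of", "to"}


def crude_stem(token: str) -> str:
    """Strip a few common suffixes; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Lowercase, tokenize on runs of letters, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy reviews, not taken from the project's dataset.
reviews = [
    "This movie was amazing and the acting was stunning",
    "The movie was boring and the plot dragged",
]
vectors = tfidf([preprocess(r) for r in reviews])
```

A term that appears in every document, such as "movie" here, gets a weight of zero, while terms unique to one review keep a positive weight. That down-weighting of uninformative words is what makes TF-IDF features useful for a classifier.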
Model comparison ended in a tie between two top performers: Support Vector Machines (SVMs) with linear and RBF kernels both reached 89.9% accuracy. I chose the linear SVM as the final model because it is simpler and easier to deploy. The confusion matrix shows a slight bias toward positive predictions, which points to further work such as checking for and addressing class imbalance.
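One way to quantify a bias toward positive predictions is to compare the share of reviews predicted positive with the share that are actually positive. The sketch below does this from scratch. The labels are hypothetical toy data chosen only to show the calculation; they are not the project's results.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive="pos"):
    """Count true/false positives and negatives for a binary task."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summarize(counts):
    """Derive accuracy and the predicted vs. actual positive rates."""
    total = sum(counts.values())
    return {
        "accuracy": (counts["tp"] + counts["tn"]) / total,
        "predicted_positive_rate": (counts["tp"] + counts["fp"]) / total,
        "actual_positive_rate": (counts["tp"] + counts["fn"]) / total,
    }


# Hypothetical labels for illustration only.
y_true = ["pos", "pos", "pos", "neg", "neg", "neg", "neg", "pos"]
y_pred = ["pos", "pos", "pos", "pos", "neg", "neg", "pos", "pos"]

counts = confusion_counts(y_true, y_pred)
summary = summarize(counts)
```

If `predicted_positive_rate` is noticeably higher than `actual_positive_rate`, the errors are mostly false positives, which is the pattern described above.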
StudyBuddy
An AI-powered study companion application built with React and Cloudflare Workers, integrating multiple AI APIs for intelligent educational assistance.
StudyBuddy AI is an educational platform that helps students learn from and interact with study materials. It uses React for the frontend and Cloudflare Workers for a serverless backend, and integrates multiple AI APIs, including Anthropic’s Claude, to provide context-aware study assistance. The application offers personalized learning experiences, real-time explanations across various subjects, and chat-based interactions for in-depth topic exploration.
Key technical achievements include a scalable architecture for real-time AI interactions, user authentication with JWT and role-based access control, bcrypt password hashing, and a tiered subscription model integrated with Stripe. Features such as PDF document analysis, smart text selection, and version control for study materials support the learning experience. Performance optimizations, including React Query for server-state caching, code splitting, and lazy loading, reduced load times by 40%.
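To make the JWT and role-based access control idea concrete, here is a minimal, language-agnostic sketch written in Python with only the standard library. The real application runs on Cloudflare Workers, so this is not its implementation: the function names, the `role` claim, and the use of HS256 are assumptions made for illustration, and production code would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(token: str, secret: bytes, required_role: str) -> dict:
    """Return the claims if signature, expiry, and role all check out; else raise."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise PermissionError("malformed token")
    # Always recompute with HMAC-SHA256; the header's "alg" field is never trusted.
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise PermissionError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) < time.time():
        raise PermissionError("token expired")
    if claims.get("role") != required_role:  # hypothetical "role" claim
        raise PermissionError("insufficient role")
    return claims
```

A request handler would call `verify_token` before serving a protected route, for example requiring the hypothetical role `"premium"` for subscription-only features. The signature is checked before the payload is parsed, so untrusted claims are never acted on.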
Bible-Based Language Model (TBC)
Developing a specialized T5-based language model for biblical text analysis and generation using PyTorch and Hugging Face Transformers.
Current Status: Awaiting compute resources
This project aims to build a language model tailored to biblical texts and theological analysis. Using the T5 architecture and PyTorch, I implemented a data preprocessing pipeline that handles multiple Bible versions and theological texts. The model is designed for tasks including verse completion, thematic analysis, and contextual interpretation of biblical passages.
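T5 frames every task as text-to-text, usually with a short task prefix on the input. A verse-completion example could therefore be prepared roughly as in the sketch below. The `make_completion_pair` helper, the `"complete verse: "` prefix, and the half-and-half split are assumptions for illustration, not the project's actual preprocessing pipeline.

```python
def make_completion_pair(
    verse: str,
    keep_ratio: float = 0.5,
    prefix: str = "complete verse: ",  # hypothetical task prefix
) -> tuple[str, str]:
    """Split a verse into a prefixed source string and the target continuation."""
    words = verse.split()
    if len(words) < 2:
        raise ValueError("verse needs at least two words to split")
    # Keep at least one word on each side of the cut.
    cut = min(max(1, int(len(words) * keep_ratio)), len(words) - 1)
    source = prefix + " ".join(words[:cut])
    target = " ".join(words[cut:])
    return source, target


# Genesis 1:1 (KJV), used here only as a sample input.
source, target = make_completion_pair(
    "In the beginning God created the heaven and the earth."
)
# source == "complete verse: In the beginning God created"
# target == "the heaven and the earth."
```

Pairs like this can then be tokenized and fed to a sequence-to-sequence model, with `source` as the encoder input and `target` as the decoder labels.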
Training follows a hybrid approach that applies transfer learning from smaller to larger models. I developed custom evaluation metrics for tasks specific to biblical understanding, and the evaluation framework reports performance across different aspects of biblical knowledge and interpretation. Initial results show promise in capturing biblical language patterns; ongoing work focuses on improving accuracy and contextual understanding.