Deep Learning / NLP

Urdu Sentiment Analysis — FYP & Publication

CNN · RNN · Python · NLP · TensorFlow

The Challenge

Urdu is one of the world's most widely spoken languages, yet it is severely underrepresented in NLP research. Existing sentiment analysis tools perform poorly on Urdu text because labeled datasets are scarce, the language is morphologically complex, and its right-to-left script needs special handling.

The Solution

Hybrid CNN-RNN Architecture: Combined convolutional layers for local feature extraction with recurrent layers for sequential context. The hybrid outperformed either architecture alone on Urdu text classification.
Custom Preprocessing Pipeline: Handled Urdu-specific tokenization, normalization, and script encoding to address the unique challenges of right-to-left text.
Evaluation Rigor: Evaluated on a curated Urdu sentiment dataset; the results passed peer review and were published in the Journal on Artificial Intelligence, 2024.
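
To make the preprocessing step concrete, here is a minimal sketch of Urdu normalization and tokenization using only the Python standard library. It is an illustration under stated assumptions, not the project's actual pipeline. The function names (normalize_urdu, tokenize_urdu), the character mappings, and the punctuation set are all hypothetical choices made for this example.

```python
import re
import unicodedata
from typing import List

# Assumption: Arabic code points that are often typed in place of their
# Urdu counterparts. A real pipeline would likely need a larger table.
_CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # Arabic Yeh -> Farsi Yeh (used in Urdu)
    "\u0643": "\u06A9",  # Arabic Kaf -> Keheh (used in Urdu)
})

# Optional short-vowel marks (U+064B to U+0652) and superscript alef.
_DIACRITICS = re.compile("[\u064B-\u0652\u0670]")

# Tatweel (kashida) only stretches a word visually.
_TATWEEL = "\u0640"

# Urdu full stop, comma, question mark, semicolon, plus common ASCII marks.
_PUNCT = re.compile("[\u06D4\u060C\u061F\u061B.,!?;:()]")


def normalize_urdu(text: str) -> str:
    """Return text with a canonical Unicode form and Urdu letter variants."""
    text = unicodedata.normalize("NFC", text)
    text = text.translate(_CHAR_MAP)
    text = text.replace(_TATWEEL, "")
    return _DIACRITICS.sub("", text)


def tokenize_urdu(text: str) -> List[str]:
    """Normalize, replace punctuation with spaces, split on whitespace."""
    cleaned = _PUNCT.sub(" ", normalize_urdu(text))
    return cleaned.split()


if __name__ == "__main__":
    # Strings are stored in logical order, so right-to-left display is a
    # rendering concern and needs no special handling here.
    print(tokenize_urdu("فلم بہت اچھی ہے۔"))
```

Mapping variant code points to one form keeps the vocabulary from splitting a single word into several entries, which matters when labeled data is scarce.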

Key Results

96% F1 score on the Urdu sentiment classification benchmark.
Published in the Journal on Artificial Intelligence, 2024.
Final Year Project — Bachelor of Science in Software Engineering, City University Peshawar.
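
For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for a single positive class. The page does not say whether the reported figure is binary, macro-averaged, or weighted, so the binary form is an assumption, and the labels are toy values for illustration, not project data.

```python
from typing import Hashable, Sequence


def f1_score(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
             positive: Hashable = 1) -> float:
    """Binary F1 for the class `positive` (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # With no true positives, F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels only: 1 = positive sentiment, 0 = negative sentiment.
    y_true = [1, 1, 1, 0, 0]
    y_pred = [1, 1, 0, 1, 0]
    print(f1_score(y_true, y_pred))
```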

Project Details

Category: Deep Learning / NLP

Role: Lead Developer

Client: Academic · Published

Completed: June 2024
