Md Arid Hasan
About Me
Hi! I'm a Ph.D. student in the Department of Computer Science and a graduate affiliate of the Schwartz Reisman Institute for Technology and Society (SRI) at the University of Toronto. I am advised by Prof. Ishtiaque Ahmed. I am also a member of the Department of Computer Science's Third Space research group and Dynamic Graphics Project. Before coming to UofT, I completed my Master's at the Faculty of Computer Science at the University of New Brunswick, where I was supervised by Paul Cook, PhD. Prior to this, I worked as a Lecturer at Daffodil International University (DIU) in Bangladesh, and before that as a Research Programmer at Cognitive Insight Limited. I completed my Bachelor's degree in Computer Science and Engineering at Daffodil International University, one of the top engineering universities in Bangladesh.
- ACL 2026 Full Conference Student Registration Award (in-person), 11th Workshop on Computational Linguistics and Clinical Psychology - May, 2026
- 2026–2027 SRI Graduate Fellow, SRI, University of Toronto - April, 2026
- OpenAI's Researcher Access Program and API, OpenAI - March, 2024
- SGS Travel Awards - October, 2023
- Masters International Differential Tuition Scholarship/Waiver - September, 2023 to August, 2025
- Graduate Academic Award (GAA) / Graduate Research Award (GRA) - September, 2023 to April, 2025
- Research Award, Division of Research, Daffodil International University - 2021-2023
- (B.Sc.) - 2015-2018
My research advances human-centered and responsible AI, with a particular focus on natural language processing and generative models. I investigate how these systems encode cultural, linguistic, and subjective knowledge, especially in multilingual and low-resource settings, and how these factors shape their reliability and societal impact. My work spans LLM evaluation, culturally grounded AI, and applications in sensitive domains such as mental health and narrative dominance. A central focus of my research is understanding both the potential and the risks of deploying LLMs in mental health contexts, particularly for underserved populations in the Global South, where access to professional care is limited and cultural context is critical. I study how culturally aware LLMs can support para-counselors (also known as community health workers and lay counselors) by augmenting their ability to provide scalable, context-sensitive, and empathetic care, while also addressing challenges such as hallucinations, cultural misalignment, and biased ground truth. Through building and evaluating context-aware systems, my goal is to develop AI-driven mental health tools that are safe, inclusive, and responsive to diverse communities.
Recent News
2026
- May 2026 Paper accepted at CLPsych 2026: Enhancing Mental Health Counseling Support in Bangladesh using Culturally-Grounded Knowledge.
- April 2026 Awarded the 2026–2027 SRI Graduate Fellowship at the SRI, University of Toronto.
- April 2026 Paper accepted at ACL 2026 (Main): LLM-Based Multi-Task Bangla Hate Speech Detection: Type, Severity, and Target.
2025
- 2025 Paper accepted at EMNLP 2025 (Main): MemeIntel: Explainable Detection of Propagandistic and Hateful Memes.
- 2025 Paper accepted at EMNLP 2025 (Findings): PropXplain: Can LLMs Enable Explainable Propaganda Detection?
- 2025 Paper accepted at Interspeech 2025: SpokenNativQA: Multilingual Everyday Spoken Queries for LLMs.
- 2025 Paper accepted at ACL 2025 (Findings): NativQA: Multilingual Culturally-Aligned Natural Query for LLMs.
- 2025 Co-organized BLP-2025 Task 1: Bangla Hate Speech Identification, at the Second Workshop on Bangla Language Processing.
2024
- 2024 Awarded access to OpenAI's Researcher Access Program.
- 2024 Paper accepted at EMNLP 2024: ArMeme: Propagandistic Content in Arabic Memes.
- 2024 Co-organized ArAIEval 2024 Shared Task at ArabicNLP, ACL 2024, Thailand.
- 2024 Paper accepted at COLING 2025: AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs.
Experiences
- Working with GPU and Bash Scripts.
- Working with ML frameworks and toolkits to handle large-scale, complex datasets.
Courses conducted as a GTA
- Data Mining and Machine Learning (Tutorial Instructor and Grading)
- Foundation of Artificial Intelligence (Grading)
- Programming Languages (Grading)
Publications
For a more up-to-date list of my publications, please check my Google Scholar.
* Equal Contributions
2026
2025
2024
2023
FakeDTML at CheckThat! 2023: Identifying Check-worthiness of Tweets and Debate Snippets
2022
2021
2020
2019
2018
Teaching
During my tenure at Daffodil International University, I taught a diverse range of courses, including Artificial Intelligence, Data Mining and Machine Learning, Programming and Problem Solving, Digital Image Processing, and Object Oriented Programming. As an instructor, I worked to foster a dynamic learning environment and to guide students toward academic growth and success.
2023
2022
2021
Projects
Depthwise Separable Convolutions with Deep Residual Convolutions
XceptionNet, Depthwise Separable Convolutions, Deep Residual, CNN, CIFAR-10
In this project, we propose an optimized Xception architecture tailored for edge devices, aiming for lightweight and efficient deployment. We combine the depthwise separable convolutions of the Xception architecture with deep residual convolutions to develop a small and efficient model. The resulting architecture reduces the parameter count, memory usage, and computational load. We evaluate it on the CIFAR-10 image classification dataset. Our experiments show that the proposed architecture has fewer parameters and requires less training time while outperforming the original Xception architecture.
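As a rough illustration of why depthwise separable convolutions shrink a model, the sketch below compares the weight counts of a standard convolution and a depthwise separable one. The layer sizes (128 input channels, 256 output channels, 3x3 kernel) are illustrative assumptions, not values from the project, and bias terms are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 128, 256, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.2f}x")
```

The saving grows with the number of output channels and the kernel size, since the separable form replaces the product `k * k * c_out` with the sum `k * k + c_out` per input channel.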
Ensemble Language Models for Multilingual Sentiment Analysis
BERT multilingual, AraBERT, XLM-RoBERTa, Instructions
In this project, I explore sentiment analysis on tweets from SemEval-17 and the Arabic Sentiment Tweet Dataset (ASTD). I investigated four pretrained language models and proposed two ensemble language models. The findings show that monolingual models perform best, that the ensemble models outperform the baseline, and that the majority-voting ensemble performs best on the English-language data.
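A majority-voting ensemble of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the project's code: the function name, the example labels, and the tie-breaking rule (the earliest model's label among the tied ones wins) are all assumptions.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by majority vote.

    predictions: list of lists, one per model, all of the same length.
    Ties go to the tied label predicted by the earliest model (an assumption).
    """
    voted = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Scan in model order so tie-breaking is deterministic.
        voted.append(next(label for label in labels if counts[label] == top))
    return voted


# Hypothetical predictions from three models on three tweets.
model_a = ["positive", "negative", "neutral"]
model_b = ["positive", "neutral", "neutral"]
model_c = ["negative", "negative", "positive"]
print(majority_vote([model_a, model_b, model_c]))
```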
Multiplatform Bangla Sentiment Analysis
Dataset, Transformers, LLMs, Instructions
The MUBASE dataset is a multiplatform dataset consisting of tweets and Facebook posts that are manually annotated with sentiment polarity. The inter-annotator agreement score is 0.84, indicating almost perfect agreement among the annotators.
MEDIC: a multi-task learning dataset for disaster image classification
Dataset, ResNet, VGG, EfficientNet, SqueezeNet, DenseNet
MEDIC is the largest multi-task learning dataset for disaster-related images and is an extended version of the crisis image benchmark dataset. It consists of data from several sources, such as CrisisMMD, AIDR, and the Damage Multimodal Dataset (DMD). The dataset contains 71,198 images.
Resources for Bangla Natural Language Processing (BanglaNLP)
Dataset, Transformers, BiLSTM, LMs
In our work A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models, we provide a review of Bangla NLP tasks, resources, and tools available to the research community; we benchmark datasets collected from various platforms for nine NLP tasks using current state-of-the-art algorithms (i.e., transformer-based models). We provide comparative results for the studied NLP tasks by comparing monolingual vs. multilingual models of varying sizes. We report our results using both individual and consolidated datasets and provide data splits for future research. We reviewed a total of 108 papers and conducted 175 sets of experiments. Our results show promising performance using transformer-based models while highlighting the trade-off with computational costs. We hope that such a comprehensive survey will motivate the community to build on and further advance the research on Bangla NLP.
AmaderCAT
Language: PHP, JavaScript
Framework: CodeIgniter, jQuery, Bootstrap
Database: MySQL
AmaderCAT stands for Amader Computer-assisted Translation. The application was developed to build parallel corpora for machine translation systems. It includes a Translation Memory (TM) and a glossary, which help translators by providing TM and glossary suggestions. The application is collaborative, highly configurable for the translation task, and supports crowd translation. It can be used by a single user or by a group or team. In the future, we plan to add a neural machine translation system to the application.
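To illustrate the general idea behind Translation Memory suggestions, the sketch below ranks stored source segments by fuzzy similarity to a new segment and returns their stored translations. It is not AmaderCAT's implementation (which is written in PHP): the function name, the similarity threshold, the use of Python's `difflib`, and the transliterated example entries are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def tm_suggestions(source, memory, threshold=0.7, limit=3):
    """Return up to `limit` (score, tm_source, tm_target) tuples, best match first.

    memory: list of (source_segment, target_segment) pairs.
    threshold: minimum similarity ratio in [0, 1] (hypothetical default).
    """
    scored = []
    for tm_source, tm_target in memory:
        # Case-insensitive character-level similarity between the two segments.
        score = SequenceMatcher(None, source.lower(), tm_source.lower()).ratio()
        if score >= threshold:
            scored.append((score, tm_source, tm_target))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]


# Hypothetical memory with transliterated Bangla targets.
memory = [
    ("Good morning", "shubho sokal"),
    ("Thank you very much", "apnake onek dhonnobad"),
]
for score, tm_source, tm_target in tm_suggestions("Good morning!", memory):
    print(f"{score:.2f}  {tm_source} -> {tm_target}")
```

A production system would typically match on tokens rather than characters and index the memory for speed; the linear scan here only keeps the example short.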
Skills
Programming Languages
ML & NLP Tools
LLMs Explored
Frameworks (Front- and back-end)
Database
Web Server
Operating System
IDE
Others
Professional Services
Reviewer (Notable)
2026
2025
2024
2023
2022
2021
Reviewed one article titled "A systematic review of sentiment analysis using machine learning and deep learning approaches"
Professional Development
Extracurricular Activities
Professional Services
Reviewer (Notable)
2024
Full Year
2023
Reviewed three articles
2022
Reviewed one article
Reviewed one article
2021
Reviewed one article titled "A systematic review of sentiment analysis using machine learning and deep learning approaches"