Md. Arid Hasan

MCS Student at UNB
Research Assistant under the supervision of Paul Cook, PhD.
Teaching Assistant at FCS

Fredericton,

CV [Updated 21/02/2024]

arid[DOT]hasan [AT] unb.ca arid[DOT]hasan[DOT]h [AT] gmail[DOT]com

Google Scholar

ResearchGate

Twitter

GitHub

GitLab

About Me

SGS Travel Awards -
Masters International Differential Tuition Scholarship/Waiver -
Graduate Academic Award (GAA) / Graduate Research Award (GRA) -
Research Award from the Division of Research, Daffodil International University -
(B.Sc.) -

Hobbies

Education

Experiences

Working with GPU and Bash Scripts.
Working with ML frameworks & toolkits in handling large-scale and complex data sets

Skills: Large Language Models (LLM), Natural Language Processing (NLP), Transformer Models, BERT (Language Model), Data Preparation

Courses conducted as a GTA

Data Mining and Machine Learning (Tutorial Instructor and Grading)
Foundation of Artificial Intelligence (Grading)
Programming Languages (Grading)

Skills: Decision Trees, SVM, Random Forest, Long Short-term Memory (LSTM), Convolutional Neural Networks (CNN), University Lecturing, Python (Programming Language), Artificial Neural Networks

Publications

* Equal Contributions

2024

NativQA: Multilingual Culturally-Aligned Natural Query for LLMs

Md Arid Hasan*, Maram Hasanain, Fatema Ahmed, Sahinur Rahman Laskar, Sunaya Upadhyay, Vrunda N Sukhadia, Mucahid Kutlu, Shammur Absar Chowdhury and Firoj Alam*

Submitted to ICLR 2025

PDF BibTeX

ArMeme: Propagandistic Content in Arabic Memes

Firoj Alam, Abul Hasnat, Fatema Ahmed, Md Arid Hasan and Maram Hasanain

EMNLP 2024

PDF BibTeX

ArAIEval Shared Task: Propagandistic Techniques Detection in Unimodal and Multimodal Arabic Content

Maram Hasanain*, Md Arid Hasan*, Fatema Ahmed, Reem Suwaileh, Md Rafiul Biswas, Wajdi Zaghouani and Firoj Alam

ArabicNLP24 at ACL

PDF BibTeX

AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs

Basel Mousi, Nadir Durrani, Fatema Ahmed, Md Arid Hasan, Maram Hasanain, Tameem Kabbani, Fahim Dalvi, Shammur Absar Chowdhury and Firoj Alam

Submitted to COLING 2025

PDF BibTeX

Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings

Md Arid Hasan, Krishno Dey, Prerona Tarannum, Imran Razzak and Usman Naseem

Submitted to COLING 2025

PDF BibTeX

Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings

Krishno Dey, Prerona Tarannum, Md Arid Hasan, Imran Razzak and Usman Naseem

Submitted to COLING 2025

PDF BibTeX

2023

Zero- and Few-Shot Prompting with LLMs: A Comparative Study with Fine-tuned Models for Bangla Sentiment Analysis

Md Arid Hasan, Shudipta Das, Afiyat Anjum, Firoj Alam, Anika Anjum, Avijit Sarker and Sheak Rashed Haider Noori

LREC-COLING 2024

PDF BibTeX

BLP 2023 Task 2: Sentiment Analysis

Md. Arid Hasan, Firoj Alam, Anika Anjum, Shudipta Das and Afiyat Anjum

Proceedings of the 1st International Workshop on Bangla Language Processing (BLP-2023), 6-11 December 2023, EMNLP, Singapore

PDF BibTeX

Role of Social Media Imagery in Disaster Informatics

Firoj Alam, Kashif Ahmad, Md. Arid Hasan, Ferda Ofli and Mohammad Imran

In book: International Handbook of Disaster Research

PDF BibTeX

Z-Index at BLP-2023 Task 2: A Comparative Study on Sentiment Analysis

Prerona Tarannum, Md. Arid Hasan, Krishno Dey

Proceedings of the 1st International Workshop on Bangla Language Processing (BLP-2023), 6-11 December 2023, EMNLP, Singapore

PDF BibTeX

Semantics Squad at BLP-2023 Task 2: Sentiment Analysis of Bengali Text with Fine Tuned Transformer Based Models

Krishno Dey, Md. Arid Hasan, Prerona Tarannum, and Francis Palma

Proceedings of the 1st International Workshop on Bangla Language Processing (BLP-2023), 6-11 December 2023, EMNLP, Singapore

PDF BibTeX

Semantics Squad at BLP-2023 Task 1: Violence Inciting Bengali Text Detection with Fine-Tuned Transformer-Based Models

Krishno Dey, Prerona Tarannum, Md. Arid Hasan, Francis Palma

Proceedings of the 1st International Workshop on Bangla Language Processing (BLP-2023), 6-11 December 2023, EMNLP, Singapore

PDF BibTeX

Z-Index at CheckThat! 2023: Unimodal and Multimodal Checkworthiness Classification

Prerona Tarannum, Md. Arid Hasan, Firoj Alam, and Sheak Rashed Haider Noori

CLEF 2023: Conference and Labs of the Evaluation Forum, 18-21 September 2023, Thessaloniki - Greece

PDF BibTeX

NN at CheckThat! 2023: Subjectivity in News Articles Classification with Transformer Based Models

Krishno Dey, Prerona Tarannum, Md. Arid Hasan and Sheak Rashed Haider Noori

CLEF 2023: Conference and Labs of the Evaluation Forum, 18-21 September 2023, Thessaloniki - Greece

PDF BibTeX

FakeDTML at CheckThat! 2023: Identifying Check-worthiness of Tweets and Debate Snippets

Abdullah Al Mamun Sardar, Md. Ziaul Karim, Krishno Dey and Md. Arid Hasan

CLEF 2023: Conference and Labs of the Evaluation Forum, 18-21 September 2023, Thessaloniki - Greece

PDF BibTeX

2022

MEDIC: a multi-task learning dataset for disaster image classification

Firoj Alam, Tanvirul Alam, Md. Arid Hasan, Abul Hasnat, Muhammad Imran, and Ferda Ofli

Journal: Neural Computing and Applications, Springer Nature

PDF BibTeX

SemEval-2022 Task 3: PreTENS-Evaluating Neural Networks on Presuppositional Semantic Knowledge

Roberto Zamparelli, Shammur Chowdhury, Dominique Brunato, Cristiano Chesi, Felice Dell’Orletta, Md Arid Hasan, Giulia Venturi

Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)

PDF BibTeX

Z-Index at CheckThat! Lab 2022: Check-Worthiness Identification on Tweet Text

Prerona Tarannum, Md. Arid Hasan, Firoj Alam, and Sheak Rashed Haider Noori

CLEF 2022: Conference and Labs of the Evaluation Forum, 05-08 September 2022, Bologna, Italy

PDF BibTeX

2021

A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models

Firoj Alam, Md. Arid Hasan, Tanvir Alam, Akib Khan, Jannatul Tajrin, Naira Khan, Shammur Absar Chowdhury

arXiv preprint, submitted to TALLIP

PDF BibTeX

Multi Class Fake News Detection using LSTM Approach

Bhaskar Majumdar, Md Rafiuzzaman Bhuiyan, Md Arid Hasan, Md Sanzidul Islam, Sheak Rashed Haider Noori

2021 10th International Conference on System Modeling & Advancement in Research Trends (SMART)

PDF BibTeX

M82B at CheckThat! 2021: Multiclass Fake News Detection Using BiLSTM.

Sohel Siddique Ashik, Abdur Rahman Apu, Nusrat Jahan Marjana, Md Sanzidul Islam, Md Arid Hassan

CLEF 2021: Conference and Labs of the Evaluation Forum, 21-24 September 2021, Bucharest, Romania

PDF BibTeX

Qword at CheckThat! 2021: An Extreme Gradient Boosting Approach for Multiclass Fake News Detection.

Rudra Sarker Utsha, Mumenunnessa Keya, Md Arid Hassan, Md Sanzidul Islam

CLEF 2021: Conference and Labs of the Evaluation Forum, 21-24 September 2021, Bucharest, Romania

PDF BibTeX

BlackOps at CheckThat! 2021: User Profiles Analyze of Intelligent Detection on Fake Tweets Notebook for PAN.

SM Sohan, Sharun Akter Khushbu, Md Sanzidul Islam, Md Arid Hassan

CLEF 2021: Conference and Labs of the Evaluation Forum, 21-24 September 2021, Bucharest, Romania

PDF BibTeX

Team Sigmoid at CheckThat! 2021 Task 3a: Multiclass fake news detection with Machine Learning.

Abdullah Al Mamun Sardar, Shahalu Akter Salma, Md Sanzidul Islam, Md Arid Hassan, Touhid Bhuiyan

CLEF 2021: Conference and Labs of the Evaluation Forum, 21-24 September 2021, Bucharest, Romania

PDF BibTeX

2020

Sentiment Classification in Bangla Textual Content: A Comparative Study

Md. Arid Hasan, Jannatul Tajrin, Shammur Absar Chowdhury, Firoj Alam

2020 23rd International Conference on Computer and Information Technology (ICCIT)

PDF BibTeX

2019

Neural Machine Translation for the Bangla-English Language Pair

Md. Arid Hasan, Firoj Alam, Shammur Absar Chowdhury, Naira Khan

2019 22nd International Conference on Computer and Information Technology (ICCIT)

PDF BibTeX

Neural vs Statistical Machine Translation: Revisiting the Bangla-English Language Pair

Md. Arid Hasan, Firoj Alam, Shammur Absar Chowdhury, Naira Khan

2019 International Conference on Bangla Speech and Language Processing (ICBSLP)

PDF BibTeX

2018

A collaborative platform to collect data for developing machine translation systems

Md. Arid Hasan, Firoj Alam, and Sheak Rashed Haider Noori

Proceedings of International Joint Conference on Computational Intelligence: IJCCI 2018

PDF BibTeX

Teaching

During my tenure at Daffodil International University, I taught a diverse range of courses, including Artificial Intelligence, Data Mining and Machine Learning, Programming and Problem Solving, Digital Image Processing, and Object Oriented Programming. As an instructor, I dedicated myself to fostering a dynamic learning environment and guiding students toward academic growth and success.

2023

2022

2021

Projects

Depthwise Separable Convolutions with Deep Residual Convolutions

XceptionNet, Depthwise Separable Convolutions, Deep Residual, CNN, CIFAR-10

In this project, we propose an optimized Xception architecture tailored for edge devices, aiming for lightweight and efficient deployment. We combine depthwise separable convolutions with the deep residual convolutions of the Xception architecture to develop a small and efficient model. The resulting architecture reduces parameters, memory usage, and computational load. We evaluate it on the CIFAR-10 image classification dataset. The results show that the proposed architecture has fewer parameters and requires less training time while outperforming the original Xception architecture.
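As a rough illustration of why depthwise separable convolutions shrink a model, the sketch below compares the weight count of a standard convolution with that of a depthwise convolution followed by a 1x1 pointwise convolution. The layer sizes are hypothetical and are not taken from the project; biases are ignored for simplicity.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard 2D convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise convolution plus a 1x1 pointwise convolution."""
    # Depthwise step: one kernel_size x kernel_size filter per input channel.
    depthwise = kernel_size * kernel_size * in_channels
    # Pointwise step: a 1x1 convolution that mixes channels.
    pointwise = in_channels * out_channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3x3 kernel, 128 input channels, 256 output channels.
    k, c_in, c_out = 3, 128, 256
    standard = standard_conv_params(k, c_in, c_out)
    separable = depthwise_separable_params(k, c_in, c_out)
    print(f"standard conv weights:       {standard}")
    print(f"depthwise separable weights: {separable}")
    print(f"reduction factor:            {standard / separable:.2f}x")
```

For a 3x3 kernel the saving approaches a factor of nine as the number of output channels grows, which is the main source of the parameter reduction described above.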

Ensemble Language Models for Multilingual Sentiment Analysis

BERT multilingual, AraBERT, XLM-RoBERTa, Instructions

In this project, I explored sentiment analysis on tweets from SemEval-17 and the Arabic Sentiment Tweet Dataset (ASTD). I investigated four pretrained language models and proposed two ensemble language models. The findings show that monolingual models perform best and that the ensemble models outperform the baseline, with the majority voting ensemble performing better on the English data.
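A majority voting ensemble of the kind mentioned above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name, the label strings, and the tie-breaking rule (ties go to the label seen first, i.e. the earliest model in the list) are all assumptions.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions_per_model: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model label predictions by majority vote.

    predictions_per_model[i][j] is the label model i assigns to example j.
    Ties are resolved in favour of the label that appears first among the
    votes, because Counter.most_common keeps first-seen order for equal counts.
    """
    if not predictions_per_model:
        raise ValueError("at least one model's predictions are required")
    n_examples = len(predictions_per_model[0])
    if any(len(preds) != n_examples for preds in predictions_per_model):
        raise ValueError("all models must predict the same number of examples")

    ensemble = []
    # zip(*...) groups the votes of all models for each example.
    for votes in zip(*predictions_per_model):
        label, _count = Counter(votes).most_common(1)[0]
        ensemble.append(label)
    return ensemble


if __name__ == "__main__":
    # Hypothetical predictions from three models on three tweets.
    model_a = ["positive", "negative", "neutral"]
    model_b = ["positive", "positive", "neutral"]
    model_c = ["negative", "negative", "positive"]
    print(majority_vote([model_a, model_b, model_c]))
```

With an odd number of models and two classes no ties occur; with three or more classes, a confidence-weighted vote is a common alternative to the simple tie-breaking rule used here.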

Multiplatform Bangla Sentiment Analysis

Dataset, Transformers, LLMs, Instructions

The MUBASE dataset is a multiplatform dataset consisting of Tweets and Facebook posts, which are manually annotated with sentiment polarity. The inter-annotator agreement score is 0.84, indicating almost perfect agreement among the annotators.
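The page does not say which agreement metric produced the 0.84 score. As one common choice for two annotators, the sketch below computes Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The function name and the toy labels are hypothetical and unrelated to the MUBASE annotations.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label at random,
    # estimated from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label everywhere.
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy example: two annotators, four posts.
    annotator_1 = ["pos", "pos", "neg", "neg"]
    annotator_2 = ["pos", "pos", "neg", "pos"]
    print(f"kappa = {cohen_kappa(annotator_1, annotator_2):.2f}")
```

On the widely used Landis and Koch scale, kappa values between 0.81 and 1.00 are read as almost perfect agreement.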

MEDIC: a multi-task learning dataset for disaster image classification

Dataset, ResNet, VGG, EfficientNet, SqueezeNet, DenseNet

MEDIC is the largest multi-task learning disaster-related dataset and is an extended version of the crisis image benchmark dataset. It consists of data from several sources, such as CrisisMMD, AIDR, and the Damage Multimodal Dataset (DMD). The dataset contains 71,198 images.

Resources for Bangla Natural Language Processing (BanglaNLP)

Dataset, Transformers, BiLSTM, LMs

In our work A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models, we provide a review of Bangla NLP tasks, resources, and tools available to the research community; we benchmark datasets collected from various platforms for nine NLP tasks using current state-of-the-art algorithms (i.e., transformer-based models). We provide comparative results for the studied NLP tasks by comparing monolingual vs. multilingual models of varying sizes. We report our results using both individual and consolidated datasets and provide data splits for future research. We reviewed a total of 108 papers and conducted 175 sets of experiments. Our results show promising performance using transformer-based models while highlighting the trade-off with computational costs. We hope that such a comprehensive survey will motivate the community to build on and further advance the research on Bangla NLP.

AmaderCAT

Language: PHP, JavaScript
Framework: CodeIgniter, jQuery, Bootstrap
Database: MySQL

AmaderCAT stands for Amader Computer-assisted Translation. The application was developed to build parallel corpora for machine translation systems. It includes a Translation Memory (TM) and a glossary, which help translators by providing TM and glossary suggestions. The application is collaborative, highly configurable for the translation task, and supports crowd translation. It can be used by a single user or by a group or team. In the future, we plan to add a neural machine translation system to the application.
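The idea behind Translation Memory suggestions can be sketched with fuzzy string matching. AmaderCAT itself is written in PHP and JavaScript, and its matching logic is not described here, so this Python example is only an illustration: the function name, the similarity cutoff, and the sample segments (romanized Bangla) are assumptions.

```python
import difflib
from typing import Dict, List, Tuple


def suggest_translations(
    segment: str,
    memory: Dict[str, str],
    max_suggestions: int = 3,
    cutoff: float = 0.6,
) -> List[Tuple[str, str]]:
    """Return (stored source, stored translation) pairs similar to `segment`.

    `memory` maps previously translated source segments to their translations.
    Matches are ranked by difflib's similarity ratio, best first.
    """
    matches = difflib.get_close_matches(
        segment, list(memory), n=max_suggestions, cutoff=cutoff
    )
    return [(source, memory[source]) for source in matches]


if __name__ == "__main__":
    # Hypothetical memory of English segments with romanized Bangla translations.
    memory = {
        "good morning": "shuvo sokal",
        "thank you": "dhonnobad",
    }
    for source, target in suggest_translations("good morning!", memory):
        print(f"{source} -> {target}")
```

A production TM would normally match on tokens rather than characters and index segments for speed; the character-level ratio here keeps the sketch dependency-free.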

Skills

Programming Languages

Python
PHP
JavaScript
Java

ML & NLP Tools

Transformers
PyTorch
LM-Harness
LLMeBench
OpenNMT
Keras
scikit-learn
NLTK

LLMs Explored

GPT-4, GPT-4o, and GPT-4V
GPT-3.5
Gemini
Llama 2, 3, and 3.1
Jais
Bloomz
Claude-3.1
Mistral
FlanT5

Frameworks (Front- and back-end)

CodeIgniter
Vue.js
jQuery
Bootstrap
Laravel

Database

MySQL
SQLite
MS SQL Server

Web Server

Apache
Nginx

Operating System

macOS
Ubuntu
Debian
Windows

IDE

PyCharm
PhpStorm
IntelliJ IDEA
NetBeans
Code::Blocks

Others

Git
Docker
LaTeX
Anaconda
Jupyter Notebook

Extracurricular Activities

Co‑organizer, 2024 ArAIEval Shared Task at Arabic NLP: Propagandistic Techniques Detection in Unimodal and Multimodal Arabic Content

Proceedings of the Second Arabic Natural Language Processing Conference (ArabicNLP 2024), August 2024, ACL, Thailand

Co‑organizer, BLP‑2023 Task 2: Sentiment Analysis

Proceedings of the 1st International Workshop on Bangla Language Processing (BLP-2023), December 2023, EMNLP, Singapore

Talk on Artificial Intelligence in Natural Language Processing

7th Bangladesh School of Internet Governance, Dhaka, Bangladesh - February 2023

Co‑organizer, SemEval‑2022 Task 3: PreTENS‑Evaluating Neural Networks on Presuppositional Semantic Knowledge

2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics - July 2022

Supervisor, DIU-NLP and Machine Learning Research Lab