Publications

This is a collection of the research work I have published so far.

BanglaLekha-Isolated: A multi-purpose comprehensive dataset of Handwritten Bangla Isolated characters. (Journal)

Publisher: Elsevier Data in Brief, 12, 103-107.
Publication Year: 2017
Authors: Biswas, M., Islam, R., Shom, G. K., Shopon, M., Mohammed, N., Momen, S., & Abedin, A
Link: https://www.sciencedirect.com/science/article/pii/S2352340917301117
Abstract
BanglaLekha-Isolated, a Bangla handwritten isolated character dataset, is presented in this article. This dataset contains 84 different characters comprising 50 Bangla basic characters, 10 Bangla numerals and 24 selected compound characters. 2000 handwriting samples for each of the 84 characters were collected, digitized and pre-processed. After discarding mistakes and scribbles, 166,105 handwritten character images were included in the final dataset. The dataset also includes labels indicating the age and the gender of the subjects from whom the samples were collected. This dataset could be used not only for optical handwriting recognition research but also to explore the influence of gender and age on handwriting. The dataset is publicly available at https://data.mendeley.com/datasets/hf6sf8zrkc/2.

Bangla handwritten digit recognition using autoencoder and deep convolutional neural network.

Conference: International Workshop on Computational Intelligence
Publication Year: 2016
Authors: Shopon, M., Mohammed, N., & Abedin, M. A
Link: https://ieeexplore.ieee.org/document/7860340
Abstract
Handwritten digit recognition is a typical image classification problem. Convolutional neural networks, also known as ConvNets, are powerful classification models for such tasks. As different languages have different styles and shapes of their numeral digits, accuracy rates of the models vary from each other and from language to language. However, unsupervised pre-training in such situations has shown improved accuracy for classification tasks, though no such work has been found for Bangla digit recognition. This paper presents the use of unsupervised pre-training using an autoencoder with a deep ConvNet in order to recognize handwritten Bangla digits, i.e., 0-9. The datasets used in this paper are CMATERDB 3.1.1 and a dataset published by the Indian Statistical Institute (ISI). This paper studies four different combinations of these two datasets: two experiments are done against their own training and testing images, and the other two are done by cross-validating the datasets. In one of these four experiments, the proposed approach achieves 99.50% accuracy, which is so far the best for recognizing handwritten Bangla digits. The ConvNet model is trained with 19,313 images of the ISI handwritten character dataset and tested with images of the CMATERDB dataset.

Image augmentation by blocky artifact in deep convolutional neural network for handwritten digit recognition.

Conference: 2017 IEEE International Conference on Imaging, Vision & Pattern Recognition (icIVPR), pp. 1–6
Publication Year: 2017
Authors: Shopon, M., Mohammed, N., & Abedin, M. A.
Link: https://ieeexplore.ieee.org/document/7890867/
Abstract
Deep Convolutional Neural Networks, also known as DCNNs, are powerful models for different visual pattern classification problems. Many works in this field use image augmentation at the training phase to achieve better accuracy. This paper presents blocky artifact as an augmentation technique to increase the accuracy of DCNNs for handwritten digit recognition, for both English and Bangla digits, i.e., 0-9. This paper conducts a number of experiments on three different datasets: the MNIST dataset, the CMATERDB 3.1.1 dataset and the Indian Statistical Institute (ISI) dataset. For each dataset, DCNNs with the proposed augmentation technique give better results than those without such augmentation. Unsupervised pre-training with the blocky artifact achieves 99.56%, 99.83% and 99.35% accuracy respectively on the MNIST, CMATERDB and ISI datasets, producing in the process the best accuracy rate so far for the CMATERDB and ISI datasets.

Krill herd based clustering algorithm for wireless sensor networks

Conference: International Workshop on Computational Intelligence
Publication Year: 2016
Authors: Shopon, M., Adnan, M. A., & Mridha, M. F
Link: https://ieeexplore.ieee.org/document/7860346
Abstract
Wireless sensor networks are principally characterized by limited energy resources. Naturally, communication between the nodes is the most energy-consuming act that they perform. Hence, development of a well-organized clustering algorithm can play a vital part in enhancing the lifetime of the network. Currently, nature-inspired methodologies are very common in dealing with it. This work presents a centralized approach that deals with energy-awareness of wireless sensor networks using the Krill Herd algorithm. The performance of the suggested algorithm is assessed against well-known clustering protocols. The simulation results show that the suggested approach can maximize sensor network lifetime over other algorithms of the same category.

An incremental clustered gradient method for wireless sensor networks

Conference: 21st Saudi Computer Society National Computer Conference
Publication Year: 2018
Authors: Mahmud, A., Akhtaruzzaman, A., & Shopon, M.
Link: https://ieeexplore.ieee.org/document/8593074
Abstract
In wireless sensor networks, clustering is a crucial problem. Basically, clustering means grouping some specific objects based on their behavior and functionality. Clustering can be formulated for different optimization problems, such as nonsmooth, nonconvex problems. This paper is based on a review of the optimization algorithm proposed in the paper "A Convergent Incremental Gradient Method With Constant Step Size" by Blatt et al., called the Incremental Aggregate Gradient method. A novel algorithm called the Incremental Clustered Aggregate Gradient Method is proposed in this paper to counter the shortcomings of the previous one. It has many similarities with the earlier method but it is more efficient for wireless sensor networks. The main aim of the Incremental Aggregate Gradient method was to minimize a sum of continuously differentiable functions; it required a single gradient evaluation per iteration and used a constant step size. For quadratic functions, a global linear rate of convergence was proved. It was claimed to be more suitable for sensor networks. Although the experiments performed in this work confirm its convergence properties, it was found that it is not suitable for sensor networks. The proposed method addresses the flaws of the previous method with regard to sensor networks. When both algorithms operate with their respective optimal step sizes, they require approximately the same number of gradient evaluations for convergence.
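
To make the idea concrete, here is a minimal sketch of an incremental aggregate gradient update of the kind the abstract describes: one gradient evaluation per iteration and a constant step size. It is an illustration only, not the code from the paper. The function name, the scalar quadratic test problem, and the step size are assumptions chosen for the example, and the clustered variant proposed in the paper is not shown.

```python
def incremental_aggregate_gradient(grads, x0, step, n_iters):
    """Minimize sum_i f_i(x) given a list of gradient functions (scalar x).

    Hypothetical helper for illustration: keeps the last gradient seen for
    each component and refreshes only one of them per iteration.
    """
    n = len(grads)
    # Initialization (assumption): evaluate every component gradient once at x0.
    memory = [g(x0) for g in grads]
    aggregate = sum(memory)
    x = x0
    for k in range(n_iters):
        i = k % n                          # cycle through the components
        new_grad = grads[i](x)             # the single gradient evaluation
        aggregate += new_grad - memory[i]  # swap the stale gradient for the new one
        memory[i] = new_grad
        x = x - step * aggregate / n       # constant step size
    return x


# Toy problem (assumption): f_i(x) = 0.5 * (x - a_i)^2, so the sum is
# minimized at the mean of the a_i values.
targets = [1.0, 2.0, 6.0]
grads = [lambda x, a=a: x - a for a in targets]
x_star = incremental_aggregate_gradient(grads, x0=0.0, step=0.1, n_iters=2000)
print(x_star)
```

On this toy problem the iterate settles at the mean of the targets, which is the minimizer of the summed quadratics.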

Hand Sign to Bangla Speech: A Deep Learning in Vision based system for Recognizing Hand Sign Digits and Generating Bangla Speech.

Conference: International Conference on Sustainable Computing in Science, Technology & Management (SUSCOM-2019)
Publication Year: 2019
Authors: Ahmed, S., Islam, M., Hassan, J., Ahmed, M. U., Ferdosi, B. J., Saha, S., & Shopon, M
Link: https://arxiv.org/abs/1901.05613
Abstract
Recent advancements in the field of computer vision with the help of deep neural networks have led us to explore and address many existing challenges that were once unattended due to the lack of necessary technologies. Hand Sign/Gesture Recognition is one of the significant areas where deep neural networks are making a substantial impact. In the last few years, a large amount of research has been conducted to recognize hand signs and hand gestures, which we aim to extend to our mother tongue, Bangla (also known as Bengali). The primary goal of our work is to make an automated tool to aid people who are unable to speak. We developed a system that automatically detects hand sign based digits and speaks out the result in the Bangla language. According to a report of the World Health Organization (WHO), 15% of people in the world live with some kind of disability. Among them, individuals with communication impairments such as speech disabilities experience substantial barriers in social interaction. The proposed system can be invaluable in mitigating such barriers. The core of the system is built with a deep learning model based on convolutional neural networks (CNN). The model classifies hand sign based digits with 92% accuracy over validation data, which makes it a highly trustworthy system. Upon classification of the digits, the resulting output is fed to the text-to-speech engine and the translator unit, which eventually generates audio output in the Bangla language.

End to End Optical Character Recognition Using Synthetic Dataset Generator For Noisy Conditions

Conference: International Joint Conference on Computational Intelligence, IJCCI 2019
Publication Year: 2019
Authors: Shopon, M., Diput, N. H., & Mohammed, N.
Link: https://link.springer.com/chapter/10.1007/978-981-15-3607-6_41
Abstract
Optical Character Recognition has been one of the most prevalent research fields since the 1970s. Numerous research works have been conducted on Optical Character Recognition. The problem of Optical Character Recognition is to convert images of text into editable text. Recent advances in Deep Learning have accelerated the improvements in this field, particularly for languages with large annotated datasets. Bangla, a language with a large number of character classes and complex cursive alphabet shapes, is unfortunately not included in these advancements due to the lack of a large annotated dataset. This work concentrates on attempting to perform OCR in noisy conditions for Bangla text. We have created a dataset of 5000 noisy Bangla text samples. To augment this small collection we use a strategy to pre-train our proposed End-to-End model on synthetically generated data and then optionally fine-tune on a part of the collected dataset. Our results indicate that attempting to perform noisy OCR is an extremely challenging task and the best results are obtained when models trained on synthetic data are fine-tuned with some real-world data.

Unsupervised Pretraining and Transfer Learning Based Bangla Sign Language Recognition

Conference: International Joint Conference on Computational Intelligence, IJCCI 2019
Publication Year: 2019
Authors: Nishat, Z. K., & Shopon, M.
Link: https://link.springer.com/chapter/10.1007/978-981-15-3607-6_42
Abstract
For hearing-impaired people in Bangladesh, Bangladeshi Sign Language (BdSL) is a common medium for day-to-day conversation. In this work we have developed a system for BdSL recognition. We have used transfer learning and unsupervised pre-training for recognition. A dataset of 2080 images was used for conducting the experiment. As the number of samples in the dataset was very small, we have performed augmentation to increase the amount of data samples. This dataset consists of 46 Bangla Characters Sign Language. Among them 10 are Bangla digits, 6 are vowels and 36 are consonants. We have conducted two different experiments on the dataset. In the first we have used unsupervised pre-training, which has shown excellent performance in the field of image classification. We have acquired 94.86% accuracy using unsupervised pre-training. Our second experiment was done using transfer learning. Transfer learning is mostly used when the amount of data available is very limited. We have attained a state-of-the-art accuracy of 96.57% using transfer learning.

Synthetic Class Specific Bangla Handwritten Character Generation Using Conditional Generative Adversarial Networks

Conference: International Conference on Bangla Speech and Language Processing, ICBSLP 2019
Publication Year: 2019
Authors: Nishat, Z. K., & Shopon, M.
Link: https://ieeexplore.ieee.org/abstract/document/9084031/
Abstract
Bangla handwritten character recognition is known to be one of the most classical problems in the field of machine learning. To solve a machine learning problem, a dataset is a must. The more varied data a model sees, the better it learns. Generative adversarial networks (GANs) are a group of neural networks that are used in unsupervised machine learning. They help to solve many difficult tasks such as image generation from a description, transforming a low-resolution image into a high-resolution one, retrieving image contents given a small pattern, etc. GANs have many other promising applications in machine learning. There are many variations of GAN available. One of them is the Conditional Generative Adversarial Network (cGAN). This kind of GAN is used for generating a specific type of image. In this work we have used a cGAN for class-based character generation. This work can help researchers to generate handwritten characters to enhance the performance of deep learning models. We have trained this model to generate 50 Basic Bangla Characters, 10 Bangla Numerals and 24 Compound characters.

Bidirectional LSTM with Attention Mechanism for Automatic Bangla News Categorization In Terms of News Captions

Conference: International Conference on Electronic Systems and Intelligent Computing
Publication Year: 2020
Authors: Shopon, M
Link: Will Publish Soon
Abstract
The aim of any classification problem is to create a set of models that can classify the class of different texts and objects. Text classification is one such application. It can be used in various classification tasks, e.g. news category classification, language identification, classification of text genre, recommendation systems, etc. In this paper we propose a text classification method using Bidirectional LSTM with an Attention mechanism to classify Bangla news articles. These news articles were collected from a renowned news portal, Prothom-Alo. The dataset consists of 383304 news articles in total, across 12 different categories. Traditionally, news classification is done in terms of news content. In our work, however, we have performed classification based on the news captions, which takes less training time. We have achieved 91.37% accuracy using our approach. This is the state-of-the-art result achieved on this dataset.

Automatic Violence Detection Method Using Convolution Neural Network

Conference: International Conference On Sentimental Analysis And Deep Learning (ICSADL 2020)
Publication Year: 2020
Authors: Karim, A., Razin, J., Ahmed, N., Shopon, M., & Alam, T.
Link: Will Publish Soon
Abstract
Automatic violence detection using video surveillance systems is a necessity for everyday life. There are frequent incidents of snatching, fights, murders and many other misdeeds in various important places of the country such as bus stands, railway stations, launch gateways, deserted highways, universities, hospitals and many more areas. For this purpose, a violence dataset has been proposed for automatically detecting whether a situation is violent or nonviolent. All the data has been collected in the Bangladeshi context and includes both violent and nonviolent samples. Different types of machine learning and deep learning algorithms have been applied in this field, with differing results. Here, a convolutional neural network model is used for detecting violence automatically. 80% of the data has been used for training the model and 20% for testing. Our model achieved about 96.16% accuracy.

Classification of Bangla News Articles Using Bidirectional Long Short Term Memory

Conference: IEEE TENSYMP (2020 IEEE Region 10 Symposium)
Publication Year: 2020
Authors: Shahin, M., Ahmed, T., Rahman, S., & Shopon, M.
Link: Will Publish Soon
Abstract
Classification is a method of assigning input vectors to one of a set of discrete classes. It can be used to identify related content, and e-commerce sites, news agencies, content curators, blogs, directories, and the like can use such automated technologies. In this paper, we have proposed a classification method using bi-directional LSTM to classify Bangla news headlines. We have used a Bangla stop word corpus to remove stop words and get a better result from our classification method. We have used Gensim and fastText models to vectorize our text to make it compatible with our machine learning model. We have built a dataset that contains around 10 lakh (1 million) articles from different renowned newspapers of Bangladesh, in 8 different categories. We then trained 3 different models on this data. Among those models, Bi-LSTM achieved 85.14 percent accuracy, which is better than any other method.
---