Abstract:
Real-time, computer vision-based recognition of natural Bangla sign language is challenging because of occlusion, illumination variation, cluttered backgrounds and high computational cost. This thesis presents a computer vision-based Bangla sign language recognition system. The proposed system contains three major modules: Bangla hand-sign segmentation and classification; Bangla sign word recognition; and recognition of hand-sign-spelled Bangla words, composite numerals and sentences.
In the first module, the system uses Haar classifiers to initialize the region of interest (ROI) by detecting predefined hand-signs. It then segments human skin color using the proposed Fuzzy Rule Based RGB (FRB-RGB) model and extracts probable binary hand-signs from the segmented skin-color pixels that exhibit specific motions. A morphological closing operation is applied to the binary image to remove noise, followed by a Gaussian smoothing filter. The binary image is then rotated, clipped and normalized to make it rotation- and scale-invariant. The system generates a Window-Grid Vector (WGV) by applying a 5×5 window-grid mask to the image, and forms a feature vector from the WGV, the area and the mean height of the probable hand-sign. Hand-signs are classified by finding the maximum Inter-Correlation Coefficient (ICC) between the test feature vector and the pre-trained feature vectors. The system is trained using 4600 (10×10×46) images for 46 hand-signs, comprising 36 two-handed Bangla alphabet signs and 10 one-handed numeral signs (Set-1); 3800 (10×10×38) images for 38 one-handed Bangla alphabet signs (Set-2); and 3000 (10×10×30) images for 30 hand shapes of the main elements in BdSL (Set-3). It is tested in six different environments using 27600 (4600×6) images for Set-1, 22800 (3800×6) images for Set-2 and 18000 (3000×6) images for Set-3. The system achieves a mean recognition accuracy of 95.67% for Set-1, 95.60% for Set-2 and 95.10% for Set-3, with an average computational cost of 8.01 ms per frame (ms/f).
In the second module, the system segments the skin-like area from the ROI using the same FRB-RGB model. After noise removal and Gaussian smoothing, the segmented region is converted to a grayscale image. The system then extracts the outer boundaries (contours) using the Canny edge detector and encodes them as a Normalized Outer Boundary Vector (NOBV). Sign words are recognized by finding the maximum ICC between the test NOBV and the pre-trained NOBVs. The system is trained using 1800 (18×10×10) outer boundary templates for 18 BdSL words from ten signers, and tested using another 1800 images of the same 18 BdSL words from ten different signers. It achieves a mean recognition accuracy of 90.11% with an average computational cost of 26.063 ms/f.
The third module interprets hand-sign-spelled BdSL as Bangla words, composite numerals and sentences using the Bangla Language Modeling Algorithm (BLMA). The system tracks the ROI using an Adaptive Kalman Filter (AKF) and extracts probable binary hand-signs. It first classifies each hand-sign using the NOBV; if the classification score does not satisfy a specific ICC threshold, the system falls back to the WGV-based classifier. For the hand-sign classification phase, the system is trained using 5200 images for 52 hand-signs and tested using 31200 (5200×6) images for the same 52 hand-signs in six different environments. It achieves a mean hand-sign classification accuracy of 95.83% with an average computational cost of 39.972 ms/f. The BLMA is then tested using 500 hand-sign-spelled words, 100 composite numerals and 80 sentences in BdSL. In this experiment the system achieves a mean recognition accuracy of 93.50% for words, 95.50% for composite numerals and 90.50% for sentences.
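
The classification scheme described above can be illustrated with a minimal Python sketch. It is not the thesis implementation, and it rests on several assumptions that the abstract does not confirm: the ICC is taken to be the Pearson correlation coefficient, each WGV entry is taken to be the fraction of foreground pixels in one cell of the 5×5 grid, area and mean height are normalized by the image size, the NOBV is treated as an already-computed numeric vector, and the threshold value 0.9 is hypothetical. All function names are illustrative.

```python
import numpy as np

def window_grid_vector(binary, grid=5):
    """Assumed WGV: foreground fraction in each cell of a grid x grid mask."""
    img = (np.asarray(binary) > 0).astype(float)
    rows = np.array_split(np.arange(img.shape[0]), grid)
    cols = np.array_split(np.arange(img.shape[1]), grid)
    return np.array([img[np.ix_(r, c)].mean() for r in rows for c in cols])

def feature_vector(binary, grid=5):
    """WGV plus (assumed) normalized area and mean height of the hand-sign."""
    img = (np.asarray(binary) > 0).astype(float)
    area = img.mean()                       # fraction of foreground pixels
    col_heights = img.sum(axis=0)           # foreground pixels per column
    occupied = col_heights[col_heights > 0]
    mean_height = occupied.mean() / img.shape[0] if occupied.size else 0.0
    return np.concatenate([window_grid_vector(img, grid), [area, mean_height]])

def correlation(a, b):
    """Pearson correlation, used here as a stand-in for the ICC."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def classify(test_vec, templates):
    """Return (label, score) of the template with the maximum correlation.

    templates maps each label to a list of pre-trained vectors.
    """
    best_label, best_score = None, -np.inf
    for label, vectors in templates.items():
        for vec in vectors:
            score = correlation(test_vec, vec)
            if score > best_score:
                best_label, best_score = label, score
    return best_label, best_score

def classify_with_fallback(nobv, wgv_features, nobv_templates, wgv_templates,
                           threshold=0.9):  # hypothetical threshold
    """Try the NOBV classifier first; fall back to WGV below the threshold."""
    label, score = classify(nobv, nobv_templates)
    if score >= threshold:
        return label, score
    return classify(wgv_features, wgv_templates)
```

With a 5×5 grid this sketch produces a 27-element feature vector (25 grid cells, area and mean height). The two-stage function mirrors the third module's order of operations: the NOBV-based score is checked against the threshold before the WGV-based classifier is consulted.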