TEXT-LINE SEGMENTATION METHODS AND ALGORITHMS IN HANDWRITTEN DOCUMENT IMAGES
DOI: https://doi.org/10.5281/zenodo.17511827
Keywords: handwritten text image, text line segmentation, projection profile, handwritten document, parchment
Abstract
Automatic line segmentation plays a crucial role in the digitization and automatic recognition of handwritten
and historical documents. The variability of handwriting styles, curved baselines, touching characters, ink diffusion,
and uneven backgrounds make this task particularly challenging. Traditional methods based on projection profiles or
morphological operations are effective for printed and well-structured texts but often fail when applied to degraded or
historical documents. In recent years, advanced approaches based on probabilistic modeling, energy minimization, graph
theory, and deep neural networks have emerged. This study presents a detailed review and comparative analysis of twenty-five
leading line segmentation methods. The analysis encompasses language-independent probabilistic approaches,
morphological hybrid models, the Mumford–Shah variational framework, and state-of-the-art deep learning architectures
such as ARU-Net, Adaptive U-Net, Mask R-CNN, and GAN-based models. For each method, working principles,
preprocessing requirements, evaluation datasets, and accuracy metrics are systematically examined. The results reveal
a clear evolution from rule-based and geometry-driven algorithms toward data-centric, binarization-free methods capable
of processing multilingual and noisy manuscripts. Furthermore, the analysis consolidates existing research achievements
and identifies open research directions, including the integration of multimodal cues, self-supervised learning, and adaptive
segmentation strategies. Overall, this comprehensive review contributes to a systematic understanding of progress and
remaining challenges in the field of handwritten text line segmentation.
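The projection-profile baseline mentioned in the abstract can be illustrated with a minimal sketch. It assumes an already binarized page stored as a list of rows in which 1 marks ink, and it uses hypothetical names (`horizontal_projection`, `segment_lines`, `min_ink`) that are not taken from any of the reviewed methods.

```python
def horizontal_projection(binary):
    """Count ink pixels (value 1) in each row of a binarized page."""
    return [sum(row) for row in binary]


def segment_lines(binary, min_ink=1):
    """Return (top, bottom) row ranges whose profile reaches min_ink.

    The bottom index is exclusive. min_ink is an illustrative threshold,
    not a value recommended by any reviewed method.
    """
    profile = horizontal_projection(binary)
    lines, start = [], None
    for y, count in enumerate(profile):
        if count >= min_ink and start is None:
            start = y                      # entering a text line
        elif count < min_ink and start is not None:
            lines.append((start, y))       # leaving a text line
            start = None
    if start is not None:                  # text line touching the bottom edge
        lines.append((start, len(profile)))
    return lines


# Toy page: two text lines separated by one blank row.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
print(segment_lines(page))
```

The sketch depends on blank rows between lines. When baselines curve or neighbouring lines touch, those valleys in the profile disappear, which is consistent with the abstract's remark that such methods often fail on degraded or historical documents.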
References
Li, Y., Zheng, Y., Doermann, D. and Jaeger, S., 2008. Script-independent text line segmentation in freestyle handwritten
documents. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(8), pp.1313-1329.
Louloudis, G., Gatos, B., Pratikakis, I. and Halatsis, C., 2008. Text line detection in handwritten documents. Pattern
Recognition, 41(12), pp.3758-3772.
Louloudis, G., Gatos, B., Pratikakis, I. and Halatsis, C., 2009. Text line and word segmentation of handwritten
documents. Pattern Recognition, 42(12), pp.3169-3183.
dos Santos, R.P., Clemente, G.S., Ren, T.I. and Cavalcanti, G.D., 2009, July. Text line segmentation based on
morphology and histogram projection. In 2009
th International Conference on Document Analysis and Recognition (pp. 651-655). IEEE.
Du, X., Pan, W. and Bui, T.D., 2009. Text line segmentation in handwritten documents using Mumford–Shah model.
Pattern Recognition, 42(12), pp.3136-3145.
Papavassiliou, V., Stafylakis, T., Katsouros, V. and Carayannis, G., 2010. Handwritten document image segmentation
into text lines and words. Pattern Recognition, 43(1), pp.369-377.
Alaei, A., Pal, U. and Nagabhushan, P., 2011. A new scheme for unconstrained handwritten text-line segmentation.
Pattern Recognition, 44(4), pp.917-928.
Ryu, J., Koo, H.I. and Cho, N.I., 2014. Language-independent text-line extraction algorithm for handwritten documents.
IEEE Signal Processing Letters, 21(9), pp.1115-1119.
Saabni, R., Asi, A. and El-Sana, J., 2014. Text line extraction for historical document images. Pattern Recognition
Letters, 35, pp.23-33.
Kavitha, A.S., Shivakumara, P., Kumar, G.H. and Lu, T., 2016. Text segmentation in degraded historical document
images. Egyptian Informatics Journal, 17(2), pp.189-197.
Khare, V., Shivakumara, P., Navya, B.J., Swetha, G.C., Guru, D.S., Pal, U. and Lu, T., 2018, August. Weighted-gradient
features for handwritten line segmentation. In 2018 24th international conference on pattern recognition
(ICPR) (pp. 3651-3656). IEEE.
Grüning, T., Leifert, G., Strauß, T., Michael, J. and Labahn, R., 2019. A two-stage method for text line detection in
historical documents. International Journal on Document Analysis and Recognition (IJDAR), 22(3), pp.285-302.
Kurar Barakat, B., Cohen, R., Droby, A., Rabaev, I. and El-Sana, J., 2020. Learning-free text line segmentation for
historical handwritten documents. Applied Sciences, 10(22), p.8276.
Hu, P., Wang, W., Li, Q. and Wang, T., 2021. Touching text line segmentation combined local baseline and connected
component for uchen Tibetan historical documents. Information Processing & Management, 58(6), p.102689.
Sahare, P., Tembhurne, J.V., Parate, M.R., Diwan, T. and Dhok, S.B., 2023. Script independent text segmentation of
document images using graph network based shortest path scheme. International Journal of Information Technology,
(4), pp.2247-2261.
Kumar, J., Abd-Almageed, W., Kang, L. and Doermann, D., 2010, June. Handwritten Arabic text line segmentation
using affinity propagation. In Proceedings of the 9th IAPR international workshop on document analysis systems (pp.
-142).
Garz, A., Fischer, A., Sablatnig, R. and Bunke, H., 2012, March. Binarization-free text line segmentation for historical
documents based on interest point clustering. In 2012 10th IAPR International Workshop on Document Analysis
Systems (pp. 95-99). IEEE.
Renton, G., Chatelain, C., Adam, S., Kermorvant, C. and Paquet, T., 2017, November. Handwritten text line
segmentation using fully convolutional network. In 2017 14th IAPR International Conference on Document Analysis
and Recognition (ICDAR) (Vol. 5, pp. 5-9). IEEE.
Vo, Q.N., Kim, S.H., Yang, H.J. and Lee, G.S., 2018. Text line segmentation using a fully convolutional network in
handwritten document images. IET Image Processing, 12(3), pp.438-446.
Daldali, M. and Souhar, A., 2019. Handwritten Arabic documents segmentation into text lines using seam carving.
IJIMAI, 5(5), pp.89-96.
Gader, T.B.A. and Echi, A.K., 2020, September. Unconstrained handwritten Arabic text-lines segmentation based on
AR2U-Net. In 2020 17th international conference on frontiers in handwriting recognition (ICFHR) (pp. 349-354). IEEE.
Mechi, O., Mehri, M., Ingold, R. and Essoukri Ben Amara, N., 2021. A two-step framework for text line segmentation
in historical Arabic and Latin document images. International Journal on Document Analysis and Recognition (IJDAR),
(3), pp.197-218.
Barakat, B.K., Droby, A., Alaasam, R., Madi, B., Rabaev, I. and El-Sana, J., 2021, January. Text line extraction using
fully convolutional network and energy minimization. In International Conference on Pattern Recognition (pp. 126-
. Cham: Springer International Publishing.
Droby, A., Kurar Barakat, B., Alaasam, R., Madi, B., Rabaev, I. and El-Sana, J., 2022. Text line extraction in historical
documents using mask R-CNN. Signals, 3(3), pp.535-549.
Özşeker, İ., Demir, A.A. and Özkaya, U., 2025. GAN-based text line segmentation method for challenging handwritten
documents. International Journal on Document Analysis and Recognition (IJDAR), 28(1), pp.59-69.