A Survey of Deep Learning Models for Image Annotation
Image annotation is the task of generating a human-understandable natural language sentence that describes an image. It is a computer vision problem within artificial intelligence, and it works by combining computer vision with natural language processing. Image annotation comes in two types: sentence-based annotation and single-word annotation. Deep learning models can produce more accurate sentences for an image than earlier approaches. This paper surveys deep learning models applied to image annotation, discussing existing methods, technical difficulties, popular datasets, and the evaluation metrics most commonly used for image annotation.
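To make the evaluation side concrete: BLEU, one of the metrics most commonly used to score generated captions against human references, is the geometric mean of clipped n-gram precisions multiplied by a brevity penalty. The sketch below is a minimal single-sentence illustration of that idea, not the full corpus-level BLEU used in benchmarks (which aggregates counts over the whole test set and usually applies smoothing).

```python
from collections import Counter
import math

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU: clipped n-gram precisions
    combined by a geometric mean, times a brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        if not cand_counts:
            return 0.0  # candidate too short to form any n-gram
        # Clip each candidate n-gram count by its maximum count
        # in any single reference caption.
        max_ref = Counter()
        for ref in references:
            for g, c in Counter(ngrams(ref, n)).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        precisions.append(clipped / sum(cand_counts.values()))
    if min(precisions) == 0:
        return 0.0  # log of zero precision is undefined; score collapses
    # Brevity penalty against the closest-length reference.
    closest = min(references, key=lambda r: abs(len(r) - len(candidate)))
    bp = 1.0 if len(candidate) > len(closest) else \
        math.exp(1 - len(closest) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

For example, a candidate caption identical to its reference scores 1.0, while a caption sharing no words with any reference scores 0.0; real system outputs fall between the two.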