Metadata Extraction from References of Different Styles

  • Olugbenga A Madamidola Department of Computer Sciences, Dominion University, Ibadan, Oyo State, Nigeria.
  • Olatunde, Ibikunle Rubber Research Institute of Nigeria, Benin City, Nigeria.
  • Olawale T Adeboje Department of Computer Sciences, Dominion University, Ibadan, Oyo State, Nigeria.
  • Promise I Ayansola Department of Computer Sciences, Dominion University, Ibadan, Oyo State, Nigeria.
Keywords: Metadata Extraction, Strings, Research, Reference, Regular Expression


Metadata extraction is the process of describing the extrinsic and intrinsic qualities of a resource such as a document, image, or video, and it includes obtaining data from references. References form an essential part of electronic scholarly publications. A reference acknowledges individuals for the creative and intellectual works that an author has used in his or her research. It can also be used to locate particular sources and to combat plagiarism. A reference style dictates the information required in a reference and the order in which that information appears. Accurate and automatic reference metadata generation provides scalability, interoperability and usability for the digital libraries of both public and private institutions and for their collections. Accurate reference metadata extraction is an intriguing task for researchers who want to collect data on scientific publications; therefore, this research work proposes a method for extracting metadata from references of different styles using regular expressions. The work accurately extracts metadata such as author, article title, volume, year of publication and institution from references, and it is limited to six referencing styles.
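As a rough illustration of the regular-expression approach described above, the following Python sketch tries one pattern per reference style and returns the named fields of the first pattern that matches. It is not the authors' implementation: the two patterns, the style labels `apa_like` and `quoted_title`, and the function name `extract_metadata` are assumptions made for this example, and only two styles are covered rather than six. The sample strings are this article's own APA-style citation and a quoted-title journal reference.

```python
import re
from typing import Optional

# Hypothetical patterns, one per reference style (assumed for illustration).
PATTERNS = {
    # Authors (Year). Title. Journal, Volume(Issue), Pages.
    "apa_like": re.compile(
        r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
        r"(?P<title>.+?)\.\s+"
        r"(?P<journal>.+?),\s+(?P<volume>\d+)(?:\((?P<issue>\d+)\))?,\s+"
        r"(?P<pages>\d+\s*[-\u2013]\s*\d+)\.?$"
    ),
    # Authors, "Title", Journal Volume (Year).
    "quoted_title": re.compile(
        r"^(?P<authors>.+?),\s+[\u201c\"](?P<title>.+?)[\u201d\"],\s+"
        r"(?P<journal>.+?)\s+(?P<volume>\d+)\s+\((?P<year>\d{4})\)\.?$"
    ),
}


def extract_metadata(reference: str) -> Optional[dict]:
    """Return the fields of the first matching style, or None if no style matches."""
    text = reference.strip()
    for style, pattern in PATTERNS.items():
        match = pattern.match(text)
        if match:
            # Drop optional groups that did not participate in the match.
            fields = {k: v for k, v in match.groupdict().items() if v is not None}
            fields["style"] = style
            return fields
    return None


APA_SAMPLE = (
    "Madamidola, O. A., Ibikunle, O., Adeboje, O. T., & Ayansola, P. I. (2021). "
    "Metadata Extraction from References of Different Styles. "
    "International Journal of Computer (IJC), 40(1), 83-90."
)
QUOTED_SAMPLE = (
    "B.A. Ojokoh, \u201cRule-based metadata extraction for heterogeneous "
    "references\u201d, Oriental Journal of Computer Science and Technology 2 (2009)."
)

if __name__ == "__main__":
    for sample in (APA_SAMPLE, QUOTED_SAMPLE):
        print(extract_metadata(sample))
```

Keeping one pattern per style makes each rule easy to test in isolation, at the cost of needing a new pattern for every additional style. Styles that separate authors and title only by commas would need further heuristics that this sketch does not attempt.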



How to Cite
Madamidola, O. A., Ibikunle, O., Adeboje, O. T., & Ayansola, P. I. (2021). Metadata Extraction from References of Different Styles. International Journal of Computer (IJC), 40(1), 83-90. Retrieved from