Query-Annotation Tool
One of the main challenges in natural language processing (NLP) is converting unstructured data into a structured format. Structured data can in turn be used to build knowledge graphs, train other machine learning models, and so on. A widely used method for this is Named Entity Recognition (NER), which identifies and extracts particular entities of interest from text. In our case, we had a corpus of 600 research texts on the different types of coatings applied to steel. Our task was to populate a domain model consisting of predefined entities such as the ingredients used, their quantities, the conditions under which the steel coating process took place, the coating type, the substrate, etc. We started by creating our own corpus to train an NER model that could identify some of the basic entities: occurrences of molecules, processes, conditions, actions and quantities. Using the trained model, we implemented a combined annotation and search tool in which a user can enter a query that is run against the database and returns focused results.
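To make the annotate-then-search idea concrete, here is a minimal sketch in Python. It is not the project's code: a few hand-written regular expressions stand in for the trained NER model, and the labels (`QUANTITY`, `MOLECULE`, `SUBSTRATE`), the function names and the two example sentences are all hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

# ASSUMPTION: these regex patterns are a toy stand-in for a trained NER model.
# The labels and vocabularies are illustrative, not the project's domain model.
PATTERNS = {
    "QUANTITY": re.compile(r"\b\d+(?:\.\d+)?\s?(?:wt%|g|mL|K|h|min)(?!\w)"),
    "MOLECULE": re.compile(r"\b(?:zinc phosphate|epoxy resin|silane)\b", re.IGNORECASE),
    "SUBSTRATE": re.compile(r"\b(?:mild steel|stainless steel|carbon steel)\b", re.IGNORECASE),
}


def annotate(text):
    """Return (start, end, label, surface) tuples sorted by position in the text."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label, match.group()))
    return sorted(spans)


def to_record(text):
    """Group the annotated entities by label, giving one structured record per text."""
    record = defaultdict(list)
    for _, _, label, surface in annotate(text):
        record[label].append(surface)
    return dict(record)


def search(corpus, label, term):
    """Return ids of documents that contain an entity of `label` matching `term`."""
    term = term.lower()
    return [
        doc_id
        for doc_id, text in corpus.items()
        if any(term in surface.lower() for surface in to_record(text).get(label, []))
    ]


# Hypothetical two-document corpus, invented for this example.
corpus = {
    "doc1": "Mild steel panels were coated with epoxy resin containing "
            "5 wt% zinc phosphate and cured at 353 K for 2 h.",
    "doc2": "A silane film was deposited on stainless steel and dried for 30 min.",
}

if __name__ == "__main__":
    print(to_record(corpus["doc1"]))
    print(search(corpus, "SUBSTRATE", "stainless"))
```

Because the search runs over labelled entities rather than raw text, a query for "stainless" as a substrate only matches documents where that phrase was tagged as a substrate. With a trained model in place of the regular expressions, `annotate` would be the only function that needs to change.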