Named Entity Recognition (NER) has the goal of recognizing relevant entities in text, such as persons, locations, or organizations, but it can also be applied to specific domains. For example, in the biomedical domain, NER can be used to identify proteins, disease names, biological processes, drugs, and chemical compounds, among others. NER is essential for many other Natural Language Processing applications, such as text summarization, question answering, and machine translation.
Unicage NER combines a rule-based NER approach with Unicage's powerful tools to rapidly identify key entities in large quantities of scientific text.
Unicage NER Architecture
With Unicage NER, we can identify key entities such as genes, proteins, and drugs in large amounts of scientific text, and extract additional information about these entities from databases and biomedical ontologies.
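To make the rule-based idea concrete, here is a minimal Python sketch of dictionary-based entity matching. It is not the Unicage NER implementation: the lexicon entries, the labels, and the function names `build_pattern` and `find_entities` are assumptions chosen purely for illustration.

```python
import re

# Hypothetical lexicon mapping surface forms to entity types (illustrative only).
LEXICON = {
    "BRCA1": "GENE",
    "p53": "PROTEIN",
    "aspirin": "DRUG",
    "tamoxifen": "DRUG",
}


def build_pattern(lexicon):
    """Compile one case-insensitive regex that matches any lexicon term."""
    # Longest terms first, so a longer name wins over a shorter one it contains.
    terms = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def find_entities(text, lexicon=LEXICON):
    """Return (surface form, label, start, end) for every lexicon hit in text."""
    pattern = build_pattern(lexicon)
    labels = {term.lower(): label for term, label in lexicon.items()}
    return [
        (match.group(0), labels[match.group(0).lower()], match.start(), match.end())
        for match in pattern.finditer(text)
    ]


if __name__ == "__main__":
    for entity in find_entities("Tamoxifen is used in BRCA1 carriers."):
        print(entity)
```

A real system would load its terms from curated databases and ontologies rather than a hard-coded dictionary, and would add rules for spelling variants and ambiguous names.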
The main advantage of Unicage NER over traditional Python approaches is that it processes the data efficiently with lower computational requirements. One example is the normalization of the JSON input data: with Unicage, we can use a single command, whereas in Python one first has to open and load the file before it can be processed and normalized.
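For reference, the Python side of that comparison might look like the sketch below, which opens a JSON file, loads it into memory, and flattens it into path/value pairs. It assumes that "normalizing" means flattening nested structures. The helper names `flatten` and `normalize_json_file` and the sample field names are hypothetical, and the Unicage command is not shown because it is not named here.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into a {dotted.path: value} dict."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # Scalar leaf: store it under the path built so far.
        return {prefix: obj}
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat


def normalize_json_file(path):
    """Open the file, parse the whole document, then flatten it."""
    with open(path, encoding="utf-8") as handle:
        return flatten(json.load(handle))


if __name__ == "__main__":
    # Hypothetical record shape, used only to illustrate the flattening.
    sample = json.loads('{"title": "Example", "authors": [{"name": "A. Author"}]}')
    for path, value in flatten(sample).items():
        print(path, value)
```

The whole document is parsed into memory before any field can be inspected, which is the extra step the comparison above refers to.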
Unicage NER is still a prototype under development. We plan to tune this tool and perform additional testing, comparing it with other existing NER software. If you want to know more about Unicage NER, you can check our blog post, where we present Unicage NER and examples of its output. Additionally, you can book a meeting with us, and we will gladly explain in more detail how Unicage NER works.