Main types of markup for text corpora of information resources

IRSTI 28.23.11                                                                                                 No. 4 (2020)

Mamyrbayev O.Zh., Shayakhmetova A.S., Kurmanbek Zh.A.

 

The research examines the development of modern web content corpora in Kazakh, Russian and English, a model of the grammatical expression of semantic identity for inducement to action in English, and a technology for automatically extracting synonymous collocation pairs from the corpus texts. Developing technology for searching, extracting and analyzing forensically meaningful information from unstructured and weakly structured data is relevant to solving various industrial and economic problems.

Keywords: automatic natural language processing, corpus of texts, criminally significant information.

 
