eKoNLPy: Korean NLP Python Library for Economic Analysis
eKoNLPy is a Korean Natural Language Processing (NLP) Python library specifically designed for economic analysis. It extends the functionality of the Mecab tagger from KoNLPy to improve the handling of economic terms, financial institutions, and company names by classifying them as single nouns. Additionally, it incorporates sentiment analysis features to determine the tone of monetary policy statements, such as hawkish or dovish.
Note
From version 2.0.0, eKoNLPy integrates the extended tagger with the original tagger. If you want to use the original tagger, set use_original_tagger=True when creating an instance of the Mecab class. Additionally, the Mecab class can be imported directly from the ekonlpy module. The default input text parameter of Mecab.pos() has been renamed from phrase to text to be consistent with the original tagger.
Note
eKoNLPy is built on the fugashi and mecab-ko-dic libraries. For more information on using the Mecab tagger, refer to the fugashi documentation. Since eKoNLPy no longer relies on the KoNLPy library, Java is not required for its use. This makes eKoNLPy compatible with Windows, Linux, and macOS without the need for Java installation. You can also use eKoNLPy on Google Colab.
If you wish to tokenize general Korean text with eKoNLPy, you do not need to install the KoNLPy library. Instead, use the same ekonlpy.Mecab class with the use_original_tagger=True option.
However, if you plan to use the Korean Sentiment Analyzer (KSA), which employs the Kkma morpheme analyzer, you will need to install the KoNLPy library.
Installation
To install eKoNLPy, run the following command:
pip install ekonlpy
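Once the package is installed, the tagger options described in the notes above can be exercised roughly as follows. This is a minimal sketch rather than an official snippet: tag_text is a hypothetical helper name introduced here for illustration, and the return value is assumed to be a list of (token, tag) pairs, as in KoNLPy. Only the ekonlpy.Mecab class, its use_original_tagger option, and the pos(text) method come from this README.

```python
from typing import List, Tuple


def tag_text(text: str, use_original_tagger: bool = False) -> List[Tuple[str, str]]:
    """Tag `text` with eKoNLPy's Mecab class (hypothetical helper).

    With use_original_tagger=False the extended, economics-aware tagger
    is used; with True, the original Mecab tagger is used instead.
    """
    # Imported lazily so this function can be defined even on a machine
    # where ekonlpy has not been installed yet.
    from ekonlpy import Mecab

    mecab = Mecab(use_original_tagger=use_original_tagger)
    # Assumption: pos() returns (token, tag) pairs, as in KoNLPy.
    return mecab.pos(text)
```

Calling tag_text(sentence) runs the extended tagger, while tag_text(sentence, use_original_tagger=True) falls back to the original tagger for general Korean text.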
Usage
Part of Speech Tagging
To use the part-of-speech tagging feature, call Mecab.pos(text), just as you would in KoNLPy. The input is first processed by the Mecab morpheme analyzer. Then, if a run of consecutive tokens matches a term in the user dictionary, those tokens are merged into a single compound noun.
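The dictionary-matching step can be pictured with a small, self-contained sketch. It is not eKoNLPy's actual implementation: the function name merge_compound_nouns, the max_len window, the dictionary layout (a mapping from token sequences to a tag), and the sample tags are all assumptions made for illustration. It simply scans the tagged tokens and, wherever a run of consecutive tokens matches a dictionary term, emits that run as one token.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]  # (surface form, part-of-speech tag)


def merge_compound_nouns(
    tokens: List[Token],
    user_dict: Dict[Tuple[str, ...], str],
    max_len: int = 4,
) -> List[Token]:
    """Merge runs of consecutive tokens that match a user-dictionary term."""
    merged: List[Token] = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate window first, down to two tokens.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            key = tuple(word for word, _ in tokens[i:i + n])
            if key in user_dict:
                # Emit the matched run as a single token with the dictionary tag.
                merged.append(("".join(key), user_dict[key]))
                i += n
                break
        else:
            # No dictionary term starts here: keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


# Illustrative input: tags follow the mecab-ko style but are made up here.
sample_tokens = [("금리", "NNG"), ("정책", "NNG"), ("을", "JKO")]
sample_dict = {("금리", "정책"): "NNG"}
sample_result = merge_compound_nouns(sample_tokens, sample_dict)
```

Trying the longest window first means that a longer dictionary term wins over a shorter one that shares its prefix, which is one reasonable design choice among several.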
For sentiment analysis, you can set the intensity_cutoff parameter to adjust the intensity threshold used to classify low-confidence sentences as neutral (default: 1.3).
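To make the role of such a cutoff concrete, here is a hedged sketch of one plausible reading. The intensity formula is an assumption, not the library's documented behavior: it takes intensity to be the ratio of the stronger to the weaker of two non-negative tone scores and treats anything below the cutoff as too close to call. The function name classify_tone and its arguments are hypothetical; eKoNLPy's real computation may differ.

```python
import math


def classify_tone(hawkish: float, dovish: float, intensity_cutoff: float = 1.3) -> str:
    """Label a sentence from two non-negative tone scores (illustrative only)."""
    if hawkish == dovish:
        # Covers the case where both scores are zero as well as exact ties.
        return "neutral"
    stronger, weaker = max(hawkish, dovish), min(hawkish, dovish)
    # Assumed definition: how many times larger the dominant score is.
    intensity = math.inf if weaker == 0 else stronger / weaker
    if intensity < intensity_cutoff:
        # The signal is too weak to call: treat the sentence as neutral.
        return "neutral"
    return "hawkish" if hawkish > dovish else "dovish"
```

Under this reading, raising intensity_cutoff makes the classifier more conservative, since more sentences fall below the threshold and are labeled neutral.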
Korean Sentiment Analyzer (KSA)
For general Korean sentiment analysis, use the KSA class. The morpheme analyzer used in this class is Kkma, developed by Seoul National University's IDS Lab. The sentiment dictionary is also from the same lab (reference: http://kkma.snu.ac.kr/).
License
eKoNLPy is open-source software released under the MIT License, which allows developers and researchers to use, modify, and distribute it freely.
Citation
If you use eKoNLPy in your work or research, please cite the following sources:
Lee, Young Joon, eKoNLPy: A Korean NLP Python Library for Economic Analysis, 2018. Available at: https://github.com/entelecheia/eKoNLPy.
Lee, Young Joon, Soohyon Kim, and Ki Young Park. "Deciphering Monetary Policy Board Minutes with Text Mining: The Case of South Korea." Korean Economic Review 35 (2019): 471-511.
You can also use the following BibTeX entry for citation:
@misc{lee2018ekonlpy,
  author = {Lee, Young Joon},
  year   = {2018},
  title  = {{eKoNLPy: A Korean NLP Python Library for Economic Analysis}},
  note   = {\url{https://github.com/entelecheia/eKoNLPy}}
}
By citing eKoNLPy in your work, you acknowledge the efforts and contributions of its creators and help promote further development and research in Korean NLP for economic analysis.