Provides automatic download of some datasets and pre-trained models, along with built-in Loader and Pipe classes for various datasets.
Simplifies data preprocessing through a unified tabular data loader.
Provides various NLP tools, such as embedding loading and intermediate data caching.
Provides various neural network components and reproduced models, covering Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, coreference resolution, text summarization, etc.
Provides various built-in callback functions.
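
To illustrate what caching intermediate data means in practice, here is a minimal standard-library sketch. It is not fastNLP's own API: the `cache_to_disk` decorator, the `preprocess` function, and the file name `tokens.pkl` are all hypothetical names chosen for this example. The idea is only that an expensive preprocessing step runs once and later calls reload the saved result from disk.

```python
import os
import pickle
import tempfile
from functools import wraps


def cache_to_disk(path):
    """Hypothetical helper: store a function's return value in a pickle file.

    Simplification: the cache ignores the call arguments, so the file must be
    deleted by hand whenever the inputs change.
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Reuse the saved result if the cache file already exists.
            if os.path.exists(path):
                with open(path, "rb") as f:
                    return pickle.load(f)
            # Otherwise compute the result and save it for the next call.
            result = func(*args, **kwargs)
            with open(path, "wb") as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator


# Counter used only to show how often the real preprocessing runs.
calls = {"count": 0}
cache_path = os.path.join(tempfile.mkdtemp(), "tokens.pkl")


@cache_to_disk(cache_path)
def preprocess(sentences):
    """Stand-in for an expensive preprocessing step (lowercase + split)."""
    calls["count"] += 1
    return [s.lower().split() for s in sentences]


first = preprocess(["Hello World", "caching saves time"])   # computed, then saved
second = preprocess(["Hello World", "caching saves time"])  # loaded from the file
```

The second call returns the same tokens without running `preprocess` again, which is the behavior an intermediate data cache is meant to give during repeated experiments.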
fastHan is a Chinese NLP tool based on fastNLP and PyTorch, available in two versions: base and large. Its core is a joint model based on BERT. It is trained on 13 corpora and handles four tasks: Chinese word segmentation, part-of-speech tagging, dependency parsing, and named entity recognition.
A Java Chinese NLP open-source project that provides NLP tools, including tokenization, part-of-speech tagging, parsing, text similarity calculation, etc.
This project is no longer maintained.