Open Source Tools


    • Introduction

      fastNLP is a lightweight NLP toolkit that aims to make it quick to implement NLP tasks and to build complex models.

    • Features

        • Automatically downloads some datasets and pre-trained models, and provides a built-in Loader and Pipe for various datasets.
        • Simplifies data preprocessing through a unified tabular data loader.
        • Provides various NLP tools, such as embedding loading and caching of intermediate data.
        • Provides various neural network components and reproduced models, covering Chinese word segmentation, named entity recognition, syntactic analysis, text classification, text matching, coreference resolution, text summarization, etc.
        • Provides various built-in callback functions.
    • Structure

          fastNLP               Functions
          fastNLP.core          Core functions, such as data processing components, trainer, tester, etc.
          fastNLP.models        Neural network models
          fastNLP.modules       Components used to build neural network models
          fastNLP.embeddings    Functions that turn sentence token indices into vectors
                                Read and write functions
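
To make the fastNLP.embeddings entry above more concrete, the following is a minimal sketch of the general idea of mapping tokens to indices and then indices to vectors. It is plain Python and does not use fastNLP's actual API; the helper names (build_vocab, to_indices, embed), the "<unk>" entry, and the toy 2-dimensional table are all assumptions made for illustration.

```python
from typing import Dict, List

# Illustrative only: plain Python, NOT fastNLP's API. All names are hypothetical.

def build_vocab(sentences: List[List[str]]) -> Dict[str, int]:
    """Assign each distinct token an integer index; index 0 is reserved for unknown tokens."""
    vocab = {"<unk>": 0}
    for sentence in sentences:
        for token in sentence:
            # setdefault only inserts when the token has not been seen before
            vocab.setdefault(token, len(vocab))
    return vocab

def to_indices(sentence: List[str], vocab: Dict[str, int]) -> List[int]:
    """Replace each token by its index, falling back to the unknown index."""
    return [vocab.get(token, vocab["<unk>"]) for token in sentence]

def embed(indices: List[int], table: List[List[float]]) -> List[List[float]]:
    """Look up one vector per index in the embedding table."""
    return [table[i] for i in indices]

corpus = [["fast", "nlp"], ["nlp", "toolkit"]]
vocab = build_vocab(corpus)

# Toy embedding table: one 2-dimensional vector per vocabulary entry.
table = [[float(i), float(i) * 0.5] for i in range(len(vocab))]

indices = to_indices(["nlp", "toolkit", "rocks"], vocab)
vectors = embed(indices, table)
```

In a real toolkit the table would typically be a trainable or pre-trained weight matrix rather than a hand-built list; the lookup step itself has the same shape.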
    • Start


    fastHan is a Chinese NLP tool based on fastNLP and PyTorch. It comes in two versions, base and large, and its kernel is a joint model based on BERT. It is trained on 13 corpora and can handle four tasks: Chinese word segmentation, part-of-speech tagging, dependency parsing, and named entity recognition.


    An open-source Chinese NLP project written in Java that provides NLP tools, including tokenization, part-of-speech tagging, parsing, text similarity calculation, etc.

    This project is no longer maintained.