ansj_seg Download - ansj_seg Source code download

Ansj Chinese word segmentation

Help

Development documentation is available for version 3.x and earlier, and for version 5.x and later.

Summary

This is a Java implementation of Chinese word segmentation based on n-gram + CRF + HMM.

Segmentation speed reaches about 2 million words per second (tested on a MacBook Air), and accuracy can exceed 96%.

Implemented features currently include Chinese word segmentation, Chinese name recognition, user-defined dictionaries, keyword extraction, automatic summarization, and keyword tagging.

It can be used for natural language processing and related tasks, and suits projects that need high-quality word segmentation.
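
To give a feel for the n-gram part of the approach, here is a toy sketch. It is not ansj's implementation: it uses only a unigram model (the n = 1 case), the class name `ToyUnigramSegmenter` and the word counts are made up for illustration, and the CRF and HMM stages are left out. It picks the segmentation with the highest total log-probability using dynamic programming.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy class, not part of ansj_seg.
class ToyUnigramSegmenter {
    // Word counts; a real system would estimate these from a large corpus.
    private final Map<String, Integer> counts = new HashMap<>();
    private long total = 0;
    private int maxWordLength = 1;

    void addWord(String word, int count) {
        counts.merge(word, count, Integer::sum);
        total += count;
        maxWordLength = Math.max(maxWordLength, word.length());
    }

    // Log-probability of a word; unseen words get a small smoothed count.
    private double logProb(String word) {
        int c = counts.getOrDefault(word, 0);
        double numerator = (c > 0) ? c : 0.5;
        return Math.log(numerator / (total + 1));
    }

    List<String> segment(String text) {
        int n = text.length();
        double[] best = new double[n + 1]; // best score of a segmentation ending at each position
        int[] prev = new int[n + 1];       // start index of the last word on that best path
        Arrays.fill(best, Double.NEGATIVE_INFINITY);
        best[0] = 0.0;
        for (int end = 1; end <= n; end++) {
            for (int start = Math.max(0, end - maxWordLength); start < end; start++) {
                String candidate = text.substring(start, end);
                // Unknown strings are only allowed as single characters.
                if (candidate.length() > 1 && !counts.containsKey(candidate)) {
                    continue;
                }
                double score = best[start] + logProb(candidate);
                if (score > best[end]) {
                    best[end] = score;
                    prev[end] = start;
                }
            }
        }
        // Walk back from the end of the text to recover the words.
        List<String> words = new ArrayList<>();
        for (int end = n; end > 0; end = prev[end]) {
            words.add(text.substring(prev[end], end));
        }
        Collections.reverse(words);
        return words;
    }
}
```

With counts in which 中文 and 分词 are far more frequent than their individual characters, `segment("中文分词")` prefers the two-word split over four single characters.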

Maven

        
        <dependency>
            <groupId>org.ansj</groupId>
            <artifactId>ansj_seg</artifactId>
            <version>5.1.1</version>
        </dependency>

Call demo

If you are downloading it for the first time and just want to try out the segmentation, you can call this simple interface:


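 import org.ansj.splitWord.analysis.ToAnalysis;

 // Segment a sample sentence and print the tokens with their tags.
 String str = "欢迎使用ansj_seg,(ansj中文分词)在这里如果你遇到什么问题都可以联系我.我一定尽我所能.帮助大家.ansj_seg更快,更准,更自由!";
 System.out.println(ToAnalysis.parse(str));

 // Output: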
 欢迎/v,使用/v,ansj/en,_,seg/en,,,(,ansj/en,中文/nz,分词/n,),在/p,这里/r,如果/c,你/r,遇到/v,什么/r,问题/n,都/d,可以/v,联系/v,我/r,./m,我/r,一定/d,尽我所能/l,./m,帮助/v,大家/r,./m,ansj/en,_,seg/en,更快/d,,,更/d,准/a,,,更/d,自由/a,!
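
In the printed result each token is followed by a slash and a tag (for example `欢迎/v`), while some tokens such as `_` carry no tag. As a rough illustration of that format only, the sketch below splits one printed token into its word and tag. The class `TaggedToken` is hypothetical and works on plain strings; it does not use the ansj_seg API.

```java
// Hypothetical helper for reading one "word/tag" token from the printed output.
class TaggedToken {
    final String word;
    final String tag; // null when the token carries no tag

    TaggedToken(String word, String tag) {
        this.word = word;
        this.tag = tag;
    }

    // Splits at the last slash; a token with no usable slash is treated as untagged.
    static TaggedToken parse(String token) {
        int slash = token.lastIndexOf('/');
        if (slash <= 0 || slash == token.length() - 1) {
            return new TaggedToken(token, null);
        }
        return new TaggedToken(token.substring(0, slash), token.substring(slash + 1));
    }
}
```

Splitting at the last slash rather than the first keeps a word that itself contains a slash intact.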

Join Us

I have been thinking about these items for a long time. Whether or not anyone can help, I am writing them down here; if you are interested or enthusiastic, you can contact me.

Expand the documentation with calling examples and instructions.
Add some rule-based recognition, for example ID card number recognition. Items still unfinished include time recognition, IP address recognition, email address recognition, URL recognition, part-of-speech recognition, and so on.
Provide a better-optimized CRF model to replace ansj's default model.
Add test cases; testing is incomplete in many places. If you are interested, you can help!
Rebuild the name recognition model and add models such as organization name recognition.
Add syntax and grammar analysis.
Implement an LSTM-based word segmentation method.
Fill in the gaps...

Additional Information

Version: ansj_seg
Type: JAVA source code
Update Time: 2024-12-21
Size: 24.14MB
From: Github
