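A curated list of Python libraries, pretrained models, dictionaries, and corpora for Japanese natural language processing.
This list contains 558 Japanese NLP repositories. A tool for searching these repositories is available on Hugging Face Spaces. Models hosted on Hugging Face are listed separately.
A Japanese NLP classification dataset has been released.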
English | 日本語 (Japanese) | 繁體中文 (Chinese) | 简体中文 (Chinese) |
Python
Updated on May 04, 2024

Name | downloads/week | total downloads | stars |
---|---|---|---|
SudachiPy | |||
Janome | |||
mecab-python3 | |||
mecab | |||
fugashi | |||
nagisa | |||
pyknp | |||
Mykytea-python | |||
konoha | |||
natto-py | |||
rakutenma-python | |||
python-vaporetto | |||
dango | |||
rhoknp | |||
python-vibrato | |||
jagger-python | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
ginza | |||
cabocha | |||
UniDic2UD | |||
camphr | |||
SuPar-UniDic | |||
depccg | |||
bertknp | - | - | |
esupar | |||
yomikata | |||
jdepp-python | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
pykakasi | |||
cutlet | |||
alphabet2kana | |||
Convert-Numbers-to-Japanese | - | - | |
mozcpy | |||
jamorasep | |||
text2phoneme | - | - | |
jntajis-python | |||
wiredify | |||
mecab-text-cleaner | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
neologdn | |||
jaconv | |||
mojimoji | |||
text-cleaning | - | - | |
HojiChar | |||
utsuho | |||
python-habachen | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
bunkai | |||
japanese-sentence-breaker | |||
sengiri | |||
budoux | |||
ja_sentence_segmenter | |||
hasami | |||
kuzukiri | |||
ja-senter-benchmark | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
oseti | |||
negapoji | - | - | |
pymlask | |||
asari | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
jparacrawl-finetune | - | - | |
JASS | - | - | |
PheMT | - | - | |
VISA | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
namaco | - | - | |
entitypedia | - | - | |
noyaki | |||
bert-japanese-ner-finetuning | - | - | |
joint-information-extraction-hs | - | - | |
pygeonlp | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
manga-ocr | |||
mokuro | |||
handwritten-japanese-ocr | - | - | |
OCR_Japanease | - | - | |
ndlocr_cli | - | - | |
donut | |||
JMTrans | - | - | |
Kindai-OCR | - | - | |
text_recognition | - | - | |
Poricom | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
JGLUE | - | - | |
ginza-transformers | |||
t5_japanese_dialogue_generation | - | - | |
japanese_text_classification | - | - | |
Japanese-BERT-Sentiment-Analyzer | - | - | |
jmlm_scoring | - | - | |
allennlp-shiba-model | |||
evaluate_japanese_w2v | - | - | |
gector-ja | - | - | |
Japanese-BPEEncoder | - | - | |
Japanese-BPEEncoder_V2 | - | - | |
transformer-copy | - | - | |
japanese-stable-diffusion | - | - | |
nagisa_bert | |||
prefix-tuning-gpt | - | - | |
JGLUE-benchmark | - | - | |
jptranstokenizer | |||
jp-stable | - | - | |
compare-ja-tokenizer | - | - | |
lm-evaluation-harness-jp-stable | - | - | |
llm-lora-classification | - | - | |
jp-stable | - | - | |
rinna_gpt-neox_ggml-lora | - | - | |
japanese-llm-roleplay-benchmark | - | - | |
japanese-llm-ranking | - | - | |
llm-jp-eval | - | - | |
llm-jp-sft | - | - | |
llm-jp-tokenizer | - | - | |
japanese-lm-fin-harness | - | - | |
ja-vicuna-qa-benchmark | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
mecab | - | - | |
jumanpp | - | - | |
kytea | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
cabocha | - | - | |
knp | - | - | |

trimatch - Trimatch: (Exact | Prefix | Approximate) String Matching Library

Name | downloads/week | total downloads | stars |
---|---|---|---|
jsc | - | - | |
aquaskk | - | - | |
mozc | - | - | |
trimatch | - | - | |
resembla | - | - | |
corvusskk | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
lindera | - | ||
vaporetto | - | ||
goya | - | ||
vibrato | - | ||
yoin | - | ||
mecab-rs | - | ||
awabi | - | ||

Name | downloads/week | total downloads | stars |
---|---|---|---|
wana_kana_rust | - | ||
unicode-jp-rs | - | ||
kana | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
lindera-tantivy | - | ||
tantivy-vibrato | - | ||

Name | downloads/week | total downloads | stars |
---|---|---|---|
daachorse | - | ||
find-simdoc | - | ||
crawdad | - | ||
tokenizer-speed-bench | - | - | |
stringmatch-bench | - | - | |
vime | - | - | |
voicevox_core | - | - | |
akaza | - | - | |
Jotoba | - | - | |
dvorakjp-romantable | - | - | |
niinii | - | - | |
cskk | - | - | |
japanki | - | - | |
jpreprocess | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
kuromoji.js | |||
rakutenma | |||
node-mecab-ya | |||
juman-bin | |||
node-mecab-async | |||

Name | downloads/week | total downloads | stars |
---|---|---|---|
kuroshiro | |||
kuroshiro-analyzer-kuromoji | |||
hepburn | |||
japanese-numerals-to-number | |||
jslingua | |||
WanaKana | |||
node-romaji-name | |||
kyujitai.js | |||
normalize-japanese-addresses | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
bangumi-data | |||
yomichan | - | - | |
proofreading-tool | - | - | |
kanjigrid | - | - | |
japanese-toolkit | - | - | |
analyze-desumasu-dearu | |||
hatsuon | |||
sentiment_ja_js | - | - | |
mecab-ipadic-seed | |||
Japanese-Word-Of-The-Day | |||
oskim | - | - | |
tweetMapping | - | - | |
pitch-accent | |||
kana2ipa | - | - | |
voicevox | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
kagome | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
ojosama | - | - | |
nihongo | - | - | |
yomichan-import | - | - | |
imas-ime-dic | - | - | |
go-kakasi | - | - | |
go-moji | - | - | |
ojichat | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
kuromoji | - | - | |
Sudachi | - | - | |
SudachiDict | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
kanjitomo-ocr | - | - | |
jakaroma | - | - | |
kakasi-java | - | - | |
Kamite | - | - | |
react-native-japanese-tokenizer | - | - | |
elasticsearch-analysis-japanese | - | - | |
moji4j | - | - | |
neologdn-java | - | - | |
elasticsearch-sudachi | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
japanese-words-to-vectors | - | - | |
chiVe | - | - | |
elmo-japanese | - | - | |
embedrank | - | - | |
aovec | |||
dependency-based-japanese-word-embeddings | - | - | |
jawikivec | - | - | |
jawiki_word_vector_updater | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
bert-japanese | - | - | |
japanese-pretrained-models | - | - | |
bert-japanese | - | - | |
SudachiTra | |||
japanese-dialog-transformers | - | - | |
shiba | |||
Dialog | - | - | |
language-pretraining | - | - | |
medbertjp | - | - | |
ILYS-aoba-chatbot | - | - | |
t5-japanese | - | - | |
pytorch_bert_japanese | - | - | |
Laboro-BERT-Japanese | - | - | |
RoBERTa-japanese | - | - | |
aMLP-japanese | - | - | |
bert-japanese-aozora | - | - | |
sbert-ja | - | - | |
BERT-Japan-vaccination | - | - | |
gpt2-japanese | - | - | |
text2text-japanese | - | - | |
gpt-ja | - | - | |
friendly_JA-Model | - | - | |
albert-japanese | - | - | |
ja_text_bert | - | - | |
DistilBERT-base-jp | - | - | |
bert | - | - | |
Laboro-DistilBERT-Japanese | - | - | |
luke | - | - | |
GPTSAN | - | - | |
japanese-clip | - | - | |
AcademicBART | - | - | |
AcademicRoBERTa | - | - | |
LINE-DistilBERT-Japanese | - | - | |
Japanese-Alpaca-LoRA | - | - | |
albert-japanese-tinysegmenter | - | - | |
japanese-llama-experiment | - | - | |
easylightchatassistant | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
VRChatGPT | - | - | |
AITuberDegikkoMirii | - | - | |
wanna | |||
ChatdollKit | - | - | |
ChuanhuChatGPTJapanese | - | - | |
AISisterAIChan | - | - | |
vrchatbot | - | - | |
gptuber-by-langchain | - | - | |
openai-chatfriend | - | - | |
chrome-ext-translate-to-hiragana-with-chatgpt | - | - | |
azure-search-openai-demo | - | - | |
chatvrm | - | - | |
sftly-replace | - | - | |
summarize_arxv | - | - | |
aiavatarkit | - | - | |
pva-aoai-integration-solution | - | - | |
jp-azureopenai-samples | - | - | |
character_chat | - | - | |
chatgpt-slackbot | - | - | |
chatgpt-prompt-sample-japanese | - | - | |
kanji-flashcard-app-gpt4 | - | - | |
IgakuQA | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
mecab-ipadic-neologd | - | - | |
tdmelodic | - | - | |
jamdict | |||
unidic-py | |||
Japanese-Company-Lexicon | - | - | |
manbyo-sudachi | - | - | |
jawiki-kana-kanji-dict | - | - | |
JIWC-Dictionary | - | - | |
JumanDIC | - | - | |
ipadic-py | |||
unidic-lite | |||
emoji-ime-dictionary | - | - | |
google-ime-dictionary | - | - | |
dic-nico-intersection-pixiv | - | - | |
google-ime-user-dictionary-ja-en | - | - | |
emoticon | - | - | |
mecab-mozcdic | - | - | |
denonbu-ime-dic | - | - | |
nijisanji-ime-dic | - | - | |
pokemon-ime-dic | - | - | |
EJDict | - | - | |
Ayashiy-Nipongo-Dic | - | - | |
genshin-dict | - | - | |
jmdict-simplified | - | - | |
mozcdict-ext | - | - | |
mh-dict-jp | - | - | |
jitenbot | - | - | |
mecab-unidic-neologd | - | - | |
hololive-dictionary | - | - | |
jmdict-yomitan | - | - | |
yomichan-jlpt-vocab | - | - | |
Jitendex | - | - | |
jiten | - | - | |
pixiv-yomitan | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
ner-wikipedia-dataset | - | - | |
IOB2Corpus | - | - | |
TwitterCorpus | - | - | |
UD_Japanese-PUD | - | - | |
UD_Japanese-GSD | - | - | |
KWDLC | - | - | |
AnnotatedFKCCorpus | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
small_parallel_enja | - | - | |
Web-Crawled-Corpus-for-Japanese-Chinese-NMT | - | - | |
CourseraParallelCorpusMining | - | - | |
JESC | - | - | |
AMI-Meeting-Parallel-Corpus | - | - | |
giant_ja-en_parallel_corpus | - | - | |
jesc_small | - | - | |
graded-enja-corpus | - | - | |
cjk-compsci-terms | - | - | |
Laboro-ParaCorpus | - | - | |
google-vs-deepl-je | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
JMRD | - | - | |
open2ch-dialogue-corpus | - | - | |
BSD | - | - | |
asdc | - | - | |
japanese-corpus | - | - | |
BPersona-chat | - | - | |
japanese-daily-dialogue | - | - | |
llm-japanese-dataset | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
spacy_tutorial | - | - | |
fastTextJapaneseTutorial | - | - | |
allennlp-NER-ja | - | - | |
chariot-PyTorch-Japanese-text-classification | - | - | |
ginza-examples | - | - | |
DocumentClassificationUsingBERT-Japanese | - | - | |
BERT_Japanese_Google_Colaboratory | - | - | |
bert-book | - | - | |
janome-tutorial | - | - | |
handson-language-models | - | - | |
JapaneseNLI | - | - | |
deep-learning-with-pytorch-ja | - | - | |
bert-classification-tutorial | - | - | |
python-nlp-book | - | - | |
llm-book | - | - | |
nlp2024-tutorial-3 | - | - | |

Name | downloads/week | total downloads | stars |
---|---|---|---|
awesome-bert-japanese | - | - | |
GEC-Info-ja | - | - | |
dataset-list | - | - | |
tuning_playbook_ja | - | - | |
japanese-pitch-accent-resources | - | - | |
awesome-japanese-llm | - | - | |