#
# This is a sample user dictionary for Kuromoji (JapaneseTokenizer)
#
# Add entries to this file in order to override the statistical model in terms
# of segmentation, readings and part-of-speech tags. Notice that entries do
# not have weights since they are always used when found. This is by design
# in order to maximize ease of use.
#
# Entries are defined using the following CSV format:
#  <text>,<token 1> ... <token n>,<reading 1> ... <reading n>,<part-of-speech tag>
#
# Notice that a single half-width space separates the individual tokens and
# the individual readings, and that the number of tokens and the number of
# readings must match exactly.
#
# Also notice that the behavior for multiple entries with the same <text> is
# undefined.
#
# Whitespace-only lines are ignored. Comments are not allowed on entry lines.
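#
# As an illustration of the format, using one of the sample entries in this
# file, the line
#  関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞
# has the <text> 関西国際空港, three tokens (関西, 国際, 空港), three matching
# readings (カンサイ, コクサイ, クウコウ) and the part-of-speech tag カスタム名詞.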
#

# Custom segmentation for kanji compounds
日本経済新聞,日本 経済 新聞,ニホン ケイザイ シンブン,カスタム名詞
関西国際空港,関西 国際 空港,カンサイ コクサイ クウコウ,カスタム名詞

# Custom segmentation for compound katakana
トートバッグ,トート バッグ,トート バッグ,かずカナ名詞
ショルダーバッグ,ショルダー バッグ,ショルダー バッグ,かずカナ名詞

# Custom reading for former sumo wrestler
朝青龍,朝青龍,アサショウリュウ,カスタム人名