Language Analyzers

A set of analyzers for analyzing text in specific languages. The following types are supported: arabic, armenian, basque, brazilian, bulgarian, catalan, chinese, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai.

Configuring language analyzers

Stopwords

All analyzers support setting custom stopwords either internally in the config, or by using an external stopwords file by setting stopwords_path. Check Stop Analyzer for more details.
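As a minimal sketch of both options, the settings below define two analyzers based on the english analyzer: one with an inline stopword list and one that reads its list from a file. The analyzer names (english_inline_stops, english_file_stops), the three stopwords, and the file path are hypothetical placeholders. A relative stopwords_path is resolved against the Elasticsearch config location, and the file is expected to contain one stopword per line.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "english_inline_stops": {
          "type":      "english",
          "stopwords": [ "a", "an", "the" ]
        },
        "english_file_stops": {
          "type":           "english",
          "stopwords_path": "stopwords/english.txt"
        }
      }
    }
  }
}
```

Either form replaces the default list for that analyzer rather than extending it, so the inline list above would leave words such as "and" in the token stream.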

Excluding words from stemming

The stem_exclusion parameter allows you to specify an array of lowercase words that should not be stemmed. Internally, this functionality is implemented by adding the keyword_marker token filter with the keywords set to the value of the stem_exclusion parameter.

The following analyzers support setting a custom stem_exclusion list: arabic, armenian, basque, bulgarian, catalan, czech, dutch, english, finnish, french, galician, german, hindi, hungarian, indonesian, irish, italian, norwegian, portuguese, romanian, russian, sorani, spanish, swedish, turkish.
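A short sketch of the stem_exclusion parameter on the english analyzer follows. The analyzer name my_english and the two excluded words are hypothetical and chosen only for illustration.

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_english": {
          "type":           "english",
          "stem_exclusion": [ "organization", "organizations" ]
        }
      }
    }
  }
}
```

With this configuration the two listed words are marked as keywords and pass through the stemmer unchanged, while all other tokens are stemmed as usual.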

Reimplementing language analyzers

The built-in language analyzers can be reimplemented as custom analyzers (as described below) in order to customize their behaviour.

Note

If you do not intend to exclude words from being stemmed (the equivalent of the stem_exclusion parameter above), then you should remove the keyword_marker token filter from the custom analyzer configuration.
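For instance, a reimplementation of the english analyzer (shown in full later in this section) with no stem exclusions might look like the following sketch, in which the keyword_marker filter is simply left out. The analyzer name english_no_exclusions is a hypothetical placeholder; the filter definitions are copied from the english example.

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_"
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        },
        "english_possessive_stemmer": {
          "type":       "stemmer",
          "language":   "possessive_english"
        }
      },
      "analyzer": {
        "english_no_exclusions": {
          "tokenizer":  "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        }
      }
    }
  }
}
```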

arabic analyzer

The arabic analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "arabic_stop": {
          "type":       "stop",
          "stopwords":  "_arabic_" 
        },
        "arabic_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "arabic_stemmer": {
          "type":       "stemmer",
          "language":   "arabic"
        }
      },
      "analyzer": {
        "arabic": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "arabic_stop",
            "arabic_normalization",
            "arabic_keywords",
            "arabic_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The arabic_keywords filter should be removed unless there are words that should be excluded from stemming.

armenian analyzer

The armenian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "armenian_stop": {
          "type":       "stop",
          "stopwords":  "_armenian_" 
        },
        "armenian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "armenian_stemmer": {
          "type":       "stemmer",
          "language":   "armenian"
        }
      },
      "analyzer": {
        "armenian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "armenian_stop",
            "armenian_keywords",
            "armenian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The armenian_keywords filter should be removed unless there are words that should be excluded from stemming.

basque analyzer

The basque analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "basque_stop": {
          "type":       "stop",
          "stopwords":  "_basque_" 
        },
        "basque_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "basque_stemmer": {
          "type":       "stemmer",
          "language":   "basque"
        }
      },
      "analyzer": {
        "basque": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "basque_stop",
            "basque_keywords",
            "basque_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The basque_keywords filter should be removed unless there are words that should be excluded from stemming.

brazilian analyzer

The brazilian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "brazilian_stop": {
          "type":       "stop",
          "stopwords":  "_brazilian_" 
        },
        "brazilian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "brazilian_stemmer": {
          "type":       "stemmer",
          "language":   "brazilian"
        }
      },
      "analyzer": {
        "brazilian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "brazilian_stop",
            "brazilian_keywords",
            "brazilian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The brazilian_keywords filter should be removed unless there are words that should be excluded from stemming.

bulgarian analyzer

The bulgarian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "bulgarian_stop": {
          "type":       "stop",
          "stopwords":  "_bulgarian_" 
        },
        "bulgarian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "bulgarian_stemmer": {
          "type":       "stemmer",
          "language":   "bulgarian"
        }
      },
      "analyzer": {
        "bulgarian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "bulgarian_stop",
            "bulgarian_keywords",
            "bulgarian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The bulgarian_keywords filter should be removed unless there are words that should be excluded from stemming.

catalan analyzer

The catalan analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "catalan_elision": {
          "type":       "elision",
          "articles":   [ "d", "l", "m", "n", "s", "t" ]
        },
        "catalan_stop": {
          "type":       "stop",
          "stopwords":  "_catalan_" 
        },
        "catalan_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "catalan_stemmer": {
          "type":       "stemmer",
          "language":   "catalan"
        }
      },
      "analyzer": {
        "catalan": {
          "tokenizer":  "standard",
          "filter": [
            "catalan_elision",
            "lowercase",
            "catalan_stop",
            "catalan_keywords",
            "catalan_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The catalan_keywords filter should be removed unless there are words that should be excluded from stemming.

chinese analyzer

The chinese analyzer cannot be reimplemented as a custom analyzer because it depends on the ChineseTokenizer and ChineseFilter classes, which are not exposed in Elasticsearch. These classes are deprecated in Lucene 4 and the chinese analyzer will be replaced with the Standard Analyzer in Lucene 5.

cjk analyzer

The cjk analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_" 
        }
      },
      "analyzer": {
        "cjk": {
          "tokenizer":  "standard",
          "filter": [
            "cjk_width",
            "lowercase",
            "cjk_bigram",
            "english_stop"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

czech analyzer

The czech analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "czech_stop": {
          "type":       "stop",
          "stopwords":  "_czech_" 
        },
        "czech_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "czech_stemmer": {
          "type":       "stemmer",
          "language":   "czech"
        }
      },
      "analyzer": {
        "czech": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "czech_stop",
            "czech_keywords",
            "czech_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The czech_keywords filter should be removed unless there are words that should be excluded from stemming.

danish analyzer

The danish analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "danish_stop": {
          "type":       "stop",
          "stopwords":  "_danish_" 
        },
        "danish_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "danish_stemmer": {
          "type":       "stemmer",
          "language":   "danish"
        }
      },
      "analyzer": {
        "danish": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "danish_stop",
            "danish_keywords",
            "danish_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The danish_keywords filter should be removed unless there are words that should be excluded from stemming.

dutch analyzer

The dutch analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "dutch_stop": {
          "type":       "stop",
          "stopwords":  "_dutch_" 
        },
        "dutch_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "dutch_stemmer": {
          "type":       "stemmer",
          "language":   "dutch"
        },
        "dutch_override": {
          "type":       "stemmer_override",
          "rules": [
            "fiets=>fiets",
            "bromfiets=>bromfiets",
            "ei=>eier",
            "kind=>kinder"
          ]
        }
      },
      "analyzer": {
        "dutch": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "dutch_stop",
            "dutch_keywords",
            "dutch_override",
            "dutch_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The dutch_keywords filter should be removed unless there are words that should be excluded from stemming.

english analyzer

The english analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_" 
        },
        "english_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        },
        "english_possessive_stemmer": {
          "type":       "stemmer",
          "language":   "possessive_english"
        }
      },
      "analyzer": {
        "english": {
          "tokenizer":  "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "english_stop",
            "english_keywords",
            "english_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The english_keywords filter should be removed unless there are words that should be excluded from stemming.

finnish analyzer

The finnish analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "finnish_stop": {
          "type":       "stop",
          "stopwords":  "_finnish_" 
        },
        "finnish_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "finnish_stemmer": {
          "type":       "stemmer",
          "language":   "finnish"
        }
      },
      "analyzer": {
        "finnish": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "finnish_stop",
            "finnish_keywords",
            "finnish_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The finnish_keywords filter should be removed unless there are words that should be excluded from stemming.

french analyzer

The french analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "french_elision": {
          "type":       "elision",
          "articles": [
            "l", "m", "t", "qu", "n", "s",
            "j", "d", "c", "jusqu", "quoiqu",
            "lorsqu", "puisqu"
          ]
        },
        "french_stop": {
          "type":       "stop",
          "stopwords":  "_french_" 
        },
        "french_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "french_stemmer": {
          "type":       "stemmer",
          "language":   "light_french"
        }
      },
      "analyzer": {
        "french": {
          "tokenizer":  "standard",
          "filter": [
            "french_elision",
            "lowercase",
            "french_stop",
            "french_keywords",
            "french_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The french_keywords filter should be removed unless there are words that should be excluded from stemming.

galician analyzer

The galician analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "galician_stop": {
          "type":       "stop",
          "stopwords":  "_galician_" 
        },
        "galician_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "galician_stemmer": {
          "type":       "stemmer",
          "language":   "galician"
        }
      },
      "analyzer": {
        "galician": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "galician_stop",
            "galician_keywords",
            "galician_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The galician_keywords filter should be removed unless there are words that should be excluded from stemming.

german analyzer

The german analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "german_stop": {
          "type":       "stop",
          "stopwords":  "_german_" 
        },
        "german_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "german_stemmer": {
          "type":       "stemmer",
          "language":   "light_german"
        }
      },
      "analyzer": {
        "german": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "german_stop",
            "german_keywords",
            "german_normalization",
            "german_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The german_keywords filter should be removed unless there are words that should be excluded from stemming.

greek analyzer

The greek analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "greek_stop": {
          "type":       "stop",
          "stopwords":  "_greek_" 
        },
        "greek_lowercase": {
          "type":       "lowercase",
          "language":   "greek"
        },
        "greek_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "greek_stemmer": {
          "type":       "stemmer",
          "language":   "greek"
        }
      },
      "analyzer": {
        "greek": {
          "tokenizer":  "standard",
          "filter": [
            "greek_lowercase",
            "greek_stop",
            "greek_keywords",
            "greek_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The greek_keywords filter should be removed unless there are words that should be excluded from stemming.

hindi analyzer

The hindi analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "hindi_stop": {
          "type":       "stop",
          "stopwords":  "_hindi_" 
        },
        "hindi_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "hindi_stemmer": {
          "type":       "stemmer",
          "language":   "hindi"
        }
      },
      "analyzer": {
        "hindi": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "indic_normalization",
            "hindi_normalization",
            "hindi_stop",
            "hindi_keywords",
            "hindi_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The hindi_keywords filter should be removed unless there are words that should be excluded from stemming.

hungarian analyzer

The hungarian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "hungarian_stop": {
          "type":       "stop",
          "stopwords":  "_hungarian_" 
        },
        "hungarian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "hungarian_stemmer": {
          "type":       "stemmer",
          "language":   "hungarian"
        }
      },
      "analyzer": {
        "hungarian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "hungarian_stop",
            "hungarian_keywords",
            "hungarian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The hungarian_keywords filter should be removed unless there are words that should be excluded from stemming.

indonesian analyzer

The indonesian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "indonesian_stop": {
          "type":       "stop",
          "stopwords":  "_indonesian_" 
        },
        "indonesian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "indonesian_stemmer": {
          "type":       "stemmer",
          "language":   "indonesian"
        }
      },
      "analyzer": {
        "indonesian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "indonesian_stop",
            "indonesian_keywords",
            "indonesian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The indonesian_keywords filter should be removed unless there are words that should be excluded from stemming.

irish analyzer

The irish analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "irish_hyphenation": {
          "type":        "stop",
          "stopwords":   [ "h", "n", "t" ],
          "ignore_case": true
        },
        "irish_elision": {
          "type":       "elision",
          "articles":   [ "d", "m", "b" ]
        },
        "irish_stop": {
          "type":       "stop",
          "stopwords":  "_irish_"
        },
        "irish_lowercase": {
          "type":       "lowercase",
          "language":   "irish"
        },
        "irish_keywords": {
          "type":       "keyword_marker",
          "keywords":   []
        },
        "irish_stemmer": {
          "type":       "stemmer",
          "language":   "irish"
        }
      },
      "analyzer": {
        "irish": {
          "tokenizer":  "standard",
          "filter": [
            "irish_hyphenation",
            "irish_elision",
            "irish_lowercase",
            "irish_stop",
            "irish_keywords",
            "irish_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The irish_keywords filter should be removed unless there are words that should be excluded from stemming.

italian analyzer

The italian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "italian_elision": {
          "type":       "elision",
          "articles": [
            "c", "l", "all", "dall", "dell",
            "nell", "sull", "coll", "pell",
            "gl", "agl", "dagl", "degl", "negl",
            "sugl", "un", "m", "t", "s", "v", "d"
          ]
        },
        "italian_stop": {
          "type":       "stop",
          "stopwords":  "_italian_" 
        },
        "italian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "italian_stemmer": {
          "type":       "stemmer",
          "language":   "light_italian"
        }
      },
      "analyzer": {
        "italian": {
          "tokenizer":  "standard",
          "filter": [
            "italian_elision",
            "lowercase",
            "italian_stop",
            "italian_keywords",
            "italian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The italian_keywords filter should be removed unless there are words that should be excluded from stemming.

norwegian analyzer

The norwegian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "norwegian_stop": {
          "type":       "stop",
          "stopwords":  "_norwegian_" 
        },
        "norwegian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "norwegian_stemmer": {
          "type":       "stemmer",
          "language":   "norwegian"
        }
      },
      "analyzer": {
        "norwegian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "norwegian_stop",
            "norwegian_keywords",
            "norwegian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The norwegian_keywords filter should be removed unless there are words that should be excluded from stemming.

persian analyzer

The persian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "char_filter": {
        "zero_width_spaces": {
          "type":       "mapping",
          "mappings":   [ "\\u200C=> " ]
        }
      },
      "filter": {
        "persian_stop": {
          "type":       "stop",
          "stopwords":  "_persian_" 
        }
      },
      "analyzer": {
        "persian": {
          "tokenizer":     "standard",
          "char_filter": [ "zero_width_spaces" ],
          "filter": [
            "lowercase",
            "arabic_normalization",
            "persian_normalization",
            "persian_stop"
          ]
        }
      }
    }
  }
}

The zero_width_spaces character filter replaces zero-width non-joiners (U+200C) with an ASCII space.

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

portuguese analyzer

The portuguese analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "portuguese_stop": {
          "type":       "stop",
          "stopwords":  "_portuguese_" 
        },
        "portuguese_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "portuguese_stemmer": {
          "type":       "stemmer",
          "language":   "light_portuguese"
        }
      },
      "analyzer": {
        "portuguese": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "portuguese_stop",
            "portuguese_keywords",
            "portuguese_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The portuguese_keywords filter should be removed unless there are words that should be excluded from stemming.

romanian analyzer

The romanian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "romanian_stop": {
          "type":       "stop",
          "stopwords":  "_romanian_" 
        },
        "romanian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "romanian_stemmer": {
          "type":       "stemmer",
          "language":   "romanian"
        }
      },
      "analyzer": {
        "romanian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "romanian_stop",
            "romanian_keywords",
            "romanian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The romanian_keywords filter should be removed unless there are words that should be excluded from stemming.

russian analyzer

The russian analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "russian_stop": {
          "type":       "stop",
          "stopwords":  "_russian_" 
        },
        "russian_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "russian_stemmer": {
          "type":       "stemmer",
          "language":   "russian"
        }
      },
      "analyzer": {
        "russian": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "russian_stop",
            "russian_keywords",
            "russian_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The russian_keywords filter should be removed unless there are words that should be excluded from stemming.

sorani analyzer

The sorani analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "sorani_stop": {
          "type":       "stop",
          "stopwords":  "_sorani_" 
        },
        "sorani_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "sorani_stemmer": {
          "type":       "stemmer",
          "language":   "sorani"
        }
      },
      "analyzer": {
        "sorani": {
          "tokenizer":  "standard",
          "filter": [
            "sorani_normalization",
            "lowercase",
            "sorani_stop",
            "sorani_keywords",
            "sorani_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The sorani_keywords filter should be removed unless there are words that should be excluded from stemming.

spanish analyzer

The spanish analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "spanish_stop": {
          "type":       "stop",
          "stopwords":  "_spanish_" 
        },
        "spanish_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "spanish_stemmer": {
          "type":       "stemmer",
          "language":   "light_spanish"
        }
      },
      "analyzer": {
        "spanish": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "spanish_stop",
            "spanish_keywords",
            "spanish_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The spanish_keywords filter should be removed unless there are words that should be excluded from stemming.

swedish analyzer

The swedish analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "swedish_stop": {
          "type":       "stop",
          "stopwords":  "_swedish_" 
        },
        "swedish_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "swedish_stemmer": {
          "type":       "stemmer",
          "language":   "swedish"
        }
      },
      "analyzer": {
        "swedish": {
          "tokenizer":  "standard",
          "filter": [
            "lowercase",
            "swedish_stop",
            "swedish_keywords",
            "swedish_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The swedish_keywords filter should be removed unless there are words that should be excluded from stemming.

turkish analyzer

The turkish analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "turkish_stop": {
          "type":       "stop",
          "stopwords":  "_turkish_" 
        },
        "turkish_lowercase": {
          "type":       "lowercase",
          "language":   "turkish"
        },
        "turkish_keywords": {
          "type":       "keyword_marker",
          "keywords":   [] 
        },
        "turkish_stemmer": {
          "type":       "stemmer",
          "language":   "turkish"
        }
      },
      "analyzer": {
        "turkish": {
          "tokenizer":  "standard",
          "filter": [
            "apostrophe",
            "turkish_lowercase",
            "turkish_stop",
            "turkish_keywords",
            "turkish_stemmer"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.

The turkish_keywords filter should be removed unless there are words that should be excluded from stemming.

thai analyzer

The thai analyzer could be reimplemented as a custom analyzer as follows:

{
  "settings": {
    "analysis": {
      "filter": {
        "thai_stop": {
          "type":       "stop",
          "stopwords":  "_thai_" 
        }
      },
      "analyzer": {
        "thai": {
          "tokenizer":  "thai",
          "filter": [
            "lowercase",
            "thai_stop"
          ]
        }
      }
    }
  }
}

The default stopwords can be overridden with the stopwords or stopwords_path parameters.