Mastodon
  • What is Mastodon?
  • Using Mastodon
    • Signing up for an account
    • Setting up your profile
    • Posting toots
    • Using the network features
    • Dealing with unwanted content
    • Promoting yourself and others
    • Set your preferences
    • More settings
    • Using Mastodon externally
    • Moving or leaving accounts
    • Running your own server
  • Running Mastodon
    • Preparing your machine
    • Installing from source
    • Configuring your environment
    • Installing optional features
      • Full-text search
      • Hidden services
      • Single Sign On
    • Setting up your new instance
    • Using the admin CLI
    • Upgrading to a new release
    • Backing up your server
    • Migrating to a new machine
    • Scaling up your server
    • Moderation actions
    • Troubleshooting errors
      • Database index corruption
  • Developing Mastodon apps
    • Getting started with the API
    • Playing with public data
    • Obtaining client app access
    • Logging in with an account
    • Guidelines and best practices
    • Libraries and implementations
  • Contributing to Mastodon
    • Technical overview
    • Setting up a dev environment
    • Code structure
    • Routes
    • Bug bounties and responsible disclosure
  • Spec compliance
    • ActivityPub
    • WebFinger
    • Security
    • Microformats
    • OAuth
    • Bearcaps
  • REST API
    • OAuth Scopes
    • Rate limits
  • API Methods
    • apps
      • oauth
    • accounts
      • bookmarks
      • favourites
      • mutes
      • blocks
      • domain_blocks
      • filters
      • reports
      • follow_requests
      • endorsements
      • featured_tags
      • preferences
      • suggestions
    • statuses
      • media
      • polls
      • scheduled_statuses
    • timelines
      • conversations
      • lists
      • markers
      • streaming
    • notifications
      • push
    • search
    • instance
      • trends
      • directory
      • custom_emojis
    • admin
    • announcements
    • proofs
    • oembed
  • API Entities
    • Account
    • Activity
    • Admin::Account
    • Admin::Report
    • Announcement
    • AnnouncementReaction
    • Application
    • Attachment
    • Card
    • Context
    • Conversation
    • Emoji
    • Error
    • FeaturedTag
    • Field
    • Filter
    • History
    • IdentityProof
    • Instance
    • List
    • Marker
    • Mention
    • Notification
    • Poll
    • Preferences
    • PushSubscription
    • Relationship
    • Report
    • Results
    • ScheduledStatus
    • Source
    • Status
    • Tag
    • Token

Full-text search

Setting up ElasticSearch to search for statuses authored, favourited, or mentioned in.

    • Installing ElasticSearch
    • Configuring Mastodon
    • Search optimization for other languages

Mastodon supports full-text search when ElasticSearch is available. Mastodon’s full-text search allows logged in users to find results from their own toots, their favourites, and their mentions. It deliberately does not allow searching for arbitrary strings in the entire database.

Installing ElasticSearch

ElasticSearch requires a Java runtime. If you don’t have Java already installed, do it now. Assuming you are logged in as root:

apt install openjdk-17-jre-headless

Add the official ElasticSearch repository to apt:

wget -O /usr/share/keyrings/elasticsearch.asc https://artifacts.elastic.co/GPG-KEY-elasticsearch
echo "deb [signed-by=/usr/share/keyrings/elasticsearch.asc] https://artifacts.elastic.co/packages/7.x/apt stable main" > /etc/apt/sources.list.d/elastic-7.x.list

Now you can install ElasticSearch:

apt update
apt install elasticsearch
Security warning: By default, ElasticSearch is supposed to bind to localhost only, i.e. be inaccessible from the outside network. You can check which address ElasticSearch binds to by looking at network.host within /etc/elasticsearch/elasticsearch.yml. Consider that anyone who can access ElasticSearch can access and modify any data within it, as there is no authentication layer. So it’s really important that the access is secured. Having a firewall that only exposes the 22, 80 and 443 ports is advisable, as outlined in the main installation instructions. If you have a multi-host setup, you must know how to secure internal traffic.
Security warning: ElasticSearch versions between 2.0 and 2.14.1 are affected by an exploit in the log4j library. If affected, please refer to the temporary mitigation from the ElasticSearch issue tracker.

To start ElasticSearch:

systemctl daemon-reload
systemctl enable --now elasticsearch

Configuring Mastodon

Edit .env.production to add the following variables:

ES_ENABLED=true
ES_HOST=localhost
ES_PORT=9200

If you have multiple Mastodon servers on the same machine, and you are planning to use the same ElasticSearch installation for all of them, make sure that all of them have unique REDIS_NAMESPACE in their configurations, to differentiate the indices. If you need to override the prefix of the ElasticSearch indices, you can set ES_PREFIX directly.

After saving the new configuration, restart Mastodon processes for it to take effect:

systemctl restart mastodon-sidekiq
systemctl reload mastodon-web

Now it’s time to create the ElasticSearch indices and fill them with data:

su - mastodon
cd live
RAILS_ENV=production bin/tootctl search deploy

Search optimization for other languages

Chinese search optimization

The default analyzer of the ElasticSearch is the standard analyzer, which may not be the best especially for Chinese. To improve search experience, you can install a language specific analyzer. Before creating the indices in ElasticSearch, install the following ElasticSearch extensions:

  • elasticsearch-analysis-ik
  • elasticsearch-analysis-stconvert

And then modify Mastodon’s index definition as follows:

diff --git a/app/chewy/accounts_index.rb b/app/chewy/accounts_index.rb
--- a/app/chewy/accounts_index.rb
+++ b/app/chewy/accounts_index.rb
@@ -4,7 +4,7 @@ class AccountsIndex < Chewy::Index
   settings index: { refresh_interval: '5m' }, analysis: {
     analyzer: {
       content: {
-        tokenizer: 'whitespace',
+        tokenizer: 'ik_max_word',
         filter: %w(lowercase asciifolding cjk_width),
       },

diff --git a/app/chewy/statuses_index.rb b/app/chewy/statuses_index.rb
--- a/app/chewy/statuses_index.rb
+++ b/app/chewy/statuses_index.rb
@@ -16,9 +16,17 @@ class StatusesIndex < Chewy::Index
         language: 'possessive_english',
       },
     },
+    char_filter: {
+      tsconvert: {
+        type: 'stconvert',
+        keep_both: false,
+        delimiter: '#',
+        convert_type: 't2s',
+      },
+    },
     analyzer: {
       content: {
-        tokenizer: 'uax_url_email',
+        tokenizer: 'ik_max_word',
         filter: %w(
           english_possessive_stemmer
           lowercase
@@ -27,6 +35,7 @@ class StatusesIndex < Chewy::Index
           english_stop
           english_stemmer
         ),
+        char_filter: %w(tsconvert),
       },
     },
   }
diff --git a/app/chewy/tags_index.rb b/app/chewy/tags_index.rb
--- a/app/chewy/tags_index.rb
+++ b/app/chewy/tags_index.rb
@@ -2,10 +2,19 @@

 class TagsIndex < Chewy::Index
   settings index: { refresh_interval: '15m' }, analysis: {
+    char_filter: {
+      tsconvert: {
+        type: 'stconvert',
+        keep_both: false,
+        delimiter: '#',
+        convert_type: 't2s',
+      },
+    },
     analyzer: {
       content: {
-        tokenizer: 'keyword',
+        tokenizer: 'ik_max_word',
         filter: %w(lowercase asciifolding cjk_width),
+        char_filter: %w(tsconvert),
       },

       edge_ngram: {

Last updated October 30, 2022 · Improve this page
Also available in: 简体中文

Sponsored by

Dotcom-Monitor LoadView Stephen Tures Swayable

Join Mastodon · Blog · ·

View source · CC BY-SA 4.0 · Imprint