Upgrade Elasticsearch from 2.3 to 7.4

Photo by Marten Newhall on Unsplash

Sometimes it is common just to deploy a Elasticsearch domain at AWS and forget to keep it up-to-date with the latest version. After a while, you realize that it’s five major releases behind the stable version and there’s no straight path to upgrade.

The idea of this post is to show the problems that you can face during this process of upgrading and also give you a more simplified way than following the AWS documentation.

Pre-requirements

 

  • ES domain v2.3

From 2.3 to 5.1

 

According to the AWS documentation, the first upgrade needs to be done manually using the ES snapshot API. So to successfully upgrade, you will need to create an S3 bucket and a set of permissions that allow your ES domains to dump data into the bucket.

To save your time, we have created a python script that communicates with the AWS API and setup everything for us. Focused on avoiding problems with your root domain, this script will create a brand new domain and execute all the steps in there.

Script steps

 

  1. Create an S3 bucket to store snapshot from 2.3 domain
  2. Create new Elasticsearch domain (Optional)
  3. Create IAMpermissions (Policy, Role, and attachments)
  4. Register snapshot in both domains
  5. Take snapshot from 2.3 and save it to S3 bucket
  6. Restore snapshot from S3 into the new domain

Teardown

 

  1. Delete S3 bucket
  2. Delete IAMpermissions (Policy, Role, and detachments)

Dependencies

 

  • Docker

Variables

# Elasticsearchexport OLD_DOMAIN_NAME=test
export AWS_REGION=ap-southeast-2
export NEW_DOMAIN_NAME=test-newexport CREATE_NEW_DOMAIN=True # Optional
export NEW_INSTANCE_TYPE=m5.xlarge.elasticsearch # Optional# S3export BUCKET_NAME=es-automated-update # Optional

To run all of these steps at once just run:

# Download the required variables, you should edit this file.
wget https://raw.githubusercontent.com/DNXLabs/es-auto-upgrade/master/var.envdocker run -it --env-file vars.env dnxsolutions/es-auto-upgrade upgrade.py

When the script finishes running, you should have your brand new ES domain at version 5.1 with all your data from 2.3.

From 5.1 to 7.4

From now on Amazon ES offers in-place Elasticsearch upgrades for domains that run versions 5.1 and later and we can proceed using the In-place along with the Elasticsearch reindex API to get to version 7.4.

Currently, Amazon ES supports the following upgrade paths.

5.1 to 5.6

 

5.6 to 6.8

Important Indices created in version 6.x no longer support multiple mapping types. Indices created in version 5.x still support multiple mapping types when restored into a 6.x cluster. Check that your client code creates only a single mapping type per index. To minimize downtime during the upgrade from Elasticsearch 5.6 to 6.x, Amazon ES reindexes the .kibana index to .kibana-6, deletes .kibana, creates an alias named .kibana, and maps the new index to the new alias.

6.8 to 7.4

Important
Elasticsearch 7.0 includes numerous breaking changes. Before initiating an in-place upgrade, we recommend taking a manual snapshot of the 6.8 domain, restoring it on a test 7.x domain, and using that test domain to identify potential upgrade issues.
Like Elasticsearch 6.x, indices can only contain one mapping type, but that type must now be named _doc. As a result, certain APIs no longer require a mapping type in the request body (such as the _bulk API).
For new indices, self-hosted Elasticsearch 7.x has a default shard count of one. Amazon ES 7.x domains retain the previous default of five.

This next steps will repeat until you get to the latest ES version.

1. Reindex

docker run -it --env-file vars.env dnxsolutions/es-auto-upgrade reindex.py

2. Rollout Upgrade

Go to your domain actions and select the Upgrade Domain button.

Image for post

Then select the Upgrade checkbox and Submit.

Image for post

AWS now will take care of the rest taking snapshots and upgrading to the next ES version. You can check the progress at the Upgrade History tab.

Image for post

Once this step is finished, repeat the reindex and upgrade until you get to 7.4.

Troubleshooting

 

Your domain sometimes might be ineligible for an upgrade or fail to upgrade for a wide variety of reasons. So AWS has provided one table showing the most common issues.

Mapping Issues

 

When you get to 7.4, you might notice some changes with the mapping. Since 2.3, a lot of new features were implemented, and this causes a lot of changes with mapping.

So what could happen is that you have a variable title with called id and it has changed to string.

What you can do it take the new mapping consulting the ES API at the endpoint.

https://$ES_DOMAIN_URL/<indice>/_mapping

Then compare with the original mapping at version 2.3 and adapt to your needs.

file: mapping.json{
    "mappings":{
        "properties":{
            "style": {
                "type": "text",
                "analyzer": "keyword"
            }
        }
    }
}
# Apply the new mappingfor index in <indice>; do
    curl -HContent-Type:application/json -XPUT "$ES_DOMAIN_URL/$index-new" -d @mapping.json
done

String -> Text

{
    "mappings":{
        "properties":{
            "id": {
                "type": "text"
            }
        }
    }
}

Float -> Long

{
    "mappings":{
        "properties":{
            "price":{
                "type":"long"
            }
        }
    }
}

Completion type

 

If you use completion type with payload and output, you may notice that since v5.1, ES does not support these two fields anymore. So to find a workaround for this, the following issue did this very well.

 

Now, instead of returning the payload, the whole hit is returned from Elasticsearch, so you can use any regular property of the document. In order to save bandwidth and memory, it makes sense to return only properties which you care about, by using the “_source filtering” feature (see _source_include and _source_exclude features.

So, for example, this is one way of using context, payload, and output:

"brand_suggestion":{
   "properties":{
      "context":{
         "properties":{
            "visible_context":{
               "type":"boolean"
            }
         }
      },
      "input":{
         "type":"completion",
         "analyzer":"simple",
         "preserve_separators":true,
         "preserve_position_increments":true,
         "max_input_length":50,
         "contexts":[
            {
               "name":"visible_context",
               "type":"CATEGORY",
               "path":"is_visible"
            }
         ]
      },
      "output":{
         "type":"text",
         "fields":{
            "keyword":{
               "type":"keyword",
               "ignore_above":256
            }
         }
      },
      "payload":{
         "properties":{
            "id":{
               "type":"long"
            },
            "urlKey":{
               "type":"text",
               "fields":{
                  "keyword":{
                     "type":"keyword",
                     "ignore_above":256
                  }
               }
            }
         }
      }
   }
}

At DNX Solutions, we work to bring a better cloud and application experience for digital-native companies in Australia.

Our current focus areas are AWS, Well-Architected Solutions, Containers, ECS, Kubernetes, Continuous Integration/Continuous Delivery, and Service Mesh and Data Solutions (movement, transformation, lakes, warehouses and analytics).

We are always hiring cloud engineers for our Sydney office, focusing on cloud-native concepts.

Check our open-source projects at https://github.com/DNXLabs and follow us on Linkedin or Facebook.

Stay informed on the latest
insights and tech-updates

No spam - just releases, updates, and tech information.

Related Posts

Leave a Reply

Your email address will not be published. Required fields are marked *