How to take and restore an OpenSearch snapshot

An OpenSearch snapshot is a backup of an index taken from a running cluster. It's better to use snapshots instead of disk backups due...

OpenSearch, Nov 8, 2023

Data is the lifeblood of any organization, and the importance of protecting it cannot be overstated. Starting with OpenSearch v2.5 in Amazon OpenSearch Service, we introduced Snapshot Management, which automates the process of taking snapshots of your domain. Snapshot Management helps you create point-in-time backups of your domain using OpenSearch Dashboards, including both data and configuration settings (for visualizations and dashboards). You can use these snapshots to restore your cluster to a specific state, recover from potential failures, and even clone environments for testing or development purposes.

Before this release, to automate the process of taking snapshots, you needed to use the snapshot action of OpenSearch’s Index State Management (ISM) feature. With ISM, you could only back up a particular index. Automating backup for multiple indexes required you to write custom scripts or use external management tools. With Snapshot Management, you can automate snapshotting across multiple indexes to safeguard your data and ensure its durability and recoverability.

In this post, we share how to use Snapshot Management to take automated snapshots using OpenSearch Service.

Solution overview

We demonstrate the following high-level steps:

Register a snapshot repository in OpenSearch Service (a one-time process).
Configure a sample ISM policy to migrate the indexes from hot storage to the UltraWarm storage tier after the indexes meet a specific condition.
Create a Snapshot Management policy to take an automated snapshot for all indexes present across different storage tiers within a domain.
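As a rough illustration of the third step, the following sketch builds the request that would create a Snapshot Management policy through the OpenSearch `_plugins/_sm/policies` API. The policy name, cron schedule, and index pattern are hypothetical values chosen for the example, and the repository name betarepo is borrowed from the registration example later in this post. Treat it as a starting point rather than the exact policy used here.

```javascript
// Build the request for creating a Snapshot Management policy.
// Policy name, schedule, and index pattern are illustrative assumptions.
function buildSnapshotPolicyRequest({ policyName, repository, indices, cron, timezone = 'UTC' }) {
  return {
    method: 'POST',
    path: `/_plugins/_sm/policies/${encodeURIComponent(policyName)}`,
    body: {
      description: `Automated snapshots of ${indices}`,
      creation: {
        // When snapshots are taken
        schedule: { cron: { expression: cron, timezone } },
      },
      snapshot_config: {
        repository, // a repository that is already registered
        indices,    // index names, patterns, or an alias
      },
    },
  };
}

const policyRequest = buildSnapshotPolicyRequest({
  policyName: 'daily-snapshots', // hypothetical policy name
  repository: 'betarepo',
  indices: 'logs-*',             // hypothetical index pattern
  cron: '0 8 * * *',             // every day at 08:00
});

console.log(policyRequest.method, policyRequest.path);
console.log(JSON.stringify(policyRequest.body, null, 2));
```

The function only assembles the request. Sending it to an OpenSearch Service domain still requires a signed HTTP call.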

As of this writing, Snapshot Management doesn’t support single snapshot creation for all indexes present across different storage tiers within OpenSearch Service. For example, if you try to create a snapshot on multiple indexes with * and some indexes are in the warm tier, the snapshot creation will fail.

To overcome this limitation, you can use index aliases, with one index alias for each type of storage tier. For example, every new index created in the cluster will belong to the hot alias. When the index is moved to the UltraWarm tier via ISM, the alias for the index will be modified to warm, and the index will be removed from the hot alias.
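The alias change described above can be sketched as a request body for the OpenSearch `_aliases` API. This is a minimal illustration of making the same change by hand, not the ISM configuration itself. The index name is hypothetical, and the alias names hot and warm follow the convention described in this post.

```javascript
// Build the body for POST /_aliases that moves one index
// from the alias of its old tier to the alias of its new tier.
function buildTierAliasSwap(index, fromAlias = 'hot', toAlias = 'warm') {
  return {
    actions: [
      { remove: { index, alias: fromAlias } },
      { add: { index, alias: toAlias } },
    ],
  };
}

// Hypothetical index that has just been migrated to UltraWarm
const aliasBody = buildTierAliasSwap('logs-2023.11.01');
console.log(JSON.stringify(aliasBody, null, 2));
```

Both actions travel in one request, so the index is never in both aliases or in neither. A snapshot that targets the hot alias then covers only hot indexes, and a second snapshot that targets the warm alias covers the rest.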

Register a manual snapshot repository

To register a manual snapshot repository, you must create and configure an Amazon Simple Storage Service (Amazon S3) bucket and AWS Identity and Access Management (IAM) roles. For more information, refer to Prerequisites. Complete the following steps:

Create an S3 bucket to store snapshots for your OpenSearch Service domain.
Create an IAM role called SnapshotRole with the following IAM policy to delegate permissions to OpenSearch Service (provide the name of your S3 bucket):

{
    "Version": "2012-10-17",
    "Statement": [{
        "Action": ["s3:ListBucket"],
        "Effect": "Allow",
        "Resource": ["arn:aws:s3:::<s3-bucket-name>"]
    }, {
        "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
        "Effect": "Allow",
        "Resource": ["arn:aws:s3:::<s3-bucket-name>/*"]
    }]
}

Set the trust relationship for SnapshotRole as follows:

{
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "",
        "Effect": "Allow",
        "Principal": {
            "Service": "es.amazonaws.com"
        },
        "Action": "sts:AssumeRole"
    }]
}

Create a new IAM role called RegisterSnapshotRepo, which delegates iam:PassRole and es:ESHttpPut (provide your AWS account and domain name):

{
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": "iam:PassRole",
        "Resource": "arn:aws:iam::<aws-account-id>:role/SnapshotRole"
    }, {
        "Effect": "Allow",
        "Action": "es:ESHttpPut",
        "Resource": "arn:aws:es:<region>:<aws-account-id>:domain/<domain-name>/*"
    }]
}
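Because the permission policies above differ only in the bucket name, account ID, Region, and domain name, they can be generated from those values. The following sketch does that and rejects empty values, so no ARN ends up with a missing segment. The function names are hypothetical helpers, and the bucket and domain names are placeholders for illustration.

```javascript
// Fail early when a value is missing, so no ARN is left with an empty segment.
function requireValue(name, value) {
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing value for ${name}`);
  }
  return value;
}

// Permissions policy for SnapshotRole: list the bucket, and read, write, and delete its objects.
function buildSnapshotRolePolicy(bucketName) {
  const bucketArn = `arn:aws:s3:::${requireValue('bucketName', bucketName)}`;
  return {
    Version: '2012-10-17',
    Statement: [
      { Action: ['s3:ListBucket'], Effect: 'Allow', Resource: [bucketArn] },
      {
        Action: ['s3:GetObject', 's3:PutObject', 's3:DeleteObject'],
        Effect: 'Allow',
        Resource: [`${bucketArn}/*`],
      },
    ],
  };
}

// Permissions policy for RegisterSnapshotRepo: pass SnapshotRole and send PUT requests to the domain.
function buildRegisterRepoPolicy({ accountId, region, domainName }) {
  const account = requireValue('accountId', accountId);
  const domainArn = `arn:aws:es:${requireValue('region', region)}:${account}:domain/${requireValue('domainName', domainName)}`;
  return {
    Version: '2012-10-17',
    Statement: [
      { Effect: 'Allow', Action: 'iam:PassRole', Resource: `arn:aws:iam::${account}:role/SnapshotRole` },
      { Effect: 'Allow', Action: 'es:ESHttpPut', Resource: `${domainArn}/*` },
    ],
  };
}

// Placeholder values; replace them with your own.
const snapshotRolePolicy = buildSnapshotRolePolicy('my-snapshot-bucket');
const registerRepoPolicy = buildRegisterRepoPolicy({
  accountId: '123456789012',
  region: 'us-east-1',
  domainName: 'my-domain',
});
console.log(JSON.stringify(snapshotRolePolicy, null, 2));
console.log(JSON.stringify(registerRepoPolicy, null, 2));
```

The printed JSON can be pasted into the IAM console or saved to a file for whichever tool you use to create the roles.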

If you have enabled fine-grained access control for your domain, map the snapshot role manage_snapshots to your RegisterSnapshotRepo IAM role in OpenSearch Service.
Now you can use Node.js code like the following example, which uses the AWS SDK for JavaScript (v2), to register the S3 bucket you created as a snapshot repository for your domain and then take a manual snapshot of selected indexes. Provide your host name, Region, snapshot repo name, snapshot name, and S3 bucket. Replace "arn:aws:iam::123456789012:role/SnapshotRole" with the ARN of your SnapshotRole, and use your own account ID in the RegisterSnapshotRepo ARN.
The example assumes the RegisterSnapshotRepo IAM role through AWS STS and signs both requests with the temporary credentials it returns.

const AWS = require('aws-sdk'); // AWS SDK for JavaScript v2

// AWS configuration
const region = 'us-east-1'; // Replace with your AWS Region
const service = 'es';
const host = 'your-opensearch-domain'; // Replace with your OpenSearch endpoint host name
AWS.config.update({ region });

// Registering the repository and taking a snapshot are two separate requests
const repositoryPath = '/_snapshot/betarepo'; // Replace betarepo with your snapshot repo name
const snapshotPath = `${repositoryPath}/sn4`; // Replace sn4 with your snapshot name

// Body for registering the S3 bucket as a snapshot repository
const repositoryPayload = {
  type: 's3',
  settings: {
    bucket: 'your-s3-bucket', // Replace with your S3 bucket name
    region,
    role_arn: 'arn:aws:iam::123456789012:role/SnapshotRole', // Replace with your SnapshotRole ARN
  },
};

// Body for taking a snapshot of selected indexes
const snapshotPayload = {
  indices: 'regulation,risklevel,ruleset,sendemail,substatus,tepl,typesdata,url,user,affiliates,bh,case,casenotice,category,client,clientthree,clienttwo,collection,contact,domaindata,domweb,gbh,googlebatch,googleinfraction,groupbatchdomain,industry,infractiontype,infrdomain,keywords,merchant,murls,minfr,notes,permission',
  ignore_unavailable: true,
  include_global_state: false,
  metadata: {
    taken_by: 'user123',
    taken_because: 'backup before upgrading',
  },
};

// Sign a PUT request with Signature Version 4 and send it with the SDK's Node.js HTTP client
function signedPut(credentials, path, payload) {
  return new Promise((resolve, reject) => {
    const request = new AWS.HttpRequest(new AWS.Endpoint(host), region);
    request.method = 'PUT';
    request.path = path;
    request.headers['Host'] = host;
    request.headers['Content-Type'] = 'application/json';
    request.body = JSON.stringify(payload);
    request.headers['Content-Length'] = Buffer.byteLength(request.body);

    // Sign the request using the AWS SDK signer
    const signer = new AWS.Signers.V4(request, service);
    signer.addAuthorization(credentials, new Date());

    const client = new AWS.NodeHttpClient();
    client.handleRequest(request, null, (response) => {
      let responseBody = '';

      response.on('data', (chunk) => {
        responseBody += chunk;
      });

      response.on('end', () => {
        console.log('Response Status Code:', response.statusCode);
        console.log('Response Body:', responseBody);
        resolve(response.statusCode);
      });
    }, reject);
  });
}

// Create an AWS STS client and assume the IAM role to obtain temporary AWS credentials
const sts = new AWS.STS();
const params = {
  RoleArn: 'arn:aws:iam::123456789012:role/RegisterSnapshotRepo', // Replace with your IAM role ARN
  RoleSessionName: 'Snapshot-Session', // Provide a session name
};

sts.assumeRole(params, async (err, data) => {
  if (err) {
    console.error('Error assuming IAM role:', err);
    return;
  }
  const { AccessKeyId, SecretAccessKey, SessionToken } = data.Credentials;
  const credentials = {
    accessKeyId: AccessKeyId,
    secretAccessKey: SecretAccessKey,
    sessionToken: SessionToken,
  };

  try {
    // Register the repository first, then take the snapshot only if registration succeeded
    const status = await signedPut(credentials, repositoryPath, repositoryPayload);
    if (status === 200) {
      await signedPut(credentials, snapshotPath, snapshotPayload);
    }
  } catch (requestError) {
    console.error('Request failed:', requestError);
  }
});
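To restore from a snapshot, OpenSearch exposes a `_restore` endpoint under the snapshot's path. The following sketch only builds that request. The function name is a hypothetical helper, the repository and snapshot names betarepo and sn4 come from the example above, and the choice of indexes and the restored_ prefix are assumptions for illustration.

```javascript
// Build a restore request for selected indexes of one snapshot.
// Renaming on restore avoids clashing with indexes that already exist in the cluster.
function buildRestoreRequest({ repository, snapshot, indices, renamePattern, renameReplacement }) {
  const body = {
    indices: indices.join(','),
    ignore_unavailable: true,
    include_global_state: false,
  };
  if (renamePattern && renameReplacement) {
    body.rename_pattern = renamePattern;
    body.rename_replacement = renameReplacement;
  }
  return {
    method: 'POST',
    path: `/_snapshot/${encodeURIComponent(repository)}/${encodeURIComponent(snapshot)}/_restore`,
    body,
  };
}

const restoreRequest = buildRestoreRequest({
  repository: 'betarepo',
  snapshot: 'sn4',
  indices: ['user', 'case'],        // two of the indexes named in the snapshot request
  renamePattern: '(.+)',
  renameReplacement: 'restored_$1', // user becomes restored_user
});

console.log(restoreRequest.method, restoreRequest.path);
console.log(JSON.stringify(restoreRequest.body, null, 2));
```

The request could be signed and sent in the same way as the PUT request above, with the method changed to POST. The RegisterSnapshotRepo policy shown earlier grants only es:ESHttpPut, so the role would also need es:ESHttpPost for this call. Without the rename settings, the restore fails for any index whose name is already in use by an open index in the cluster.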