How-To Configure the AWS S3 Connection

What Does This Article Cover?

Intelligence Hub includes a configurable Amazon S3 Connection that allows Intelligence Hub solutions to write data payloads to Amazon S3 buckets. The procedures for doing so are described below. These descriptions assume intermediate-level knowledge of building Intelligence Hub component objects, including Models, Instances, Flows, Pipelines, and Connectors.

  • What is AWS S3?
  • Configuring Amazon S3 Token Authentication
  • Writing Data to an Amazon S3 Bucket
  • Other related material

What is AWS S3?

Amazon Simple Storage Service (Amazon S3) is an object storage service. Customers use Amazon S3 to store and protect data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics. Amazon S3 provides management features so that you can optimize, organize, and configure access to your data to meet your specific business, organizational, and compliance requirements.

Configuring Amazon S3 Token Authentication:

An AWS Access Key and Secret Key pair may be used to authenticate an Intelligence Hub Amazon S3 Connection to Amazon S3. The configuration process is as follows.

  • Use AWS Identity and Access Management to create a new policy with access to Amazon S3.
  • Create a new user with AWS Identity and Access Management and assign the policy to the user.
  • Generate the Access and Secret Keys for the user.
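
As an illustration of the first step, the sketch below builds a minimal IAM policy document in Python and prints it as JSON. The bucket name example-hub-bucket and the choice of s3:PutObject as the only allowed action are assumptions made for this example. The permissions an Intelligence Hub Connection actually requires are not listed in this article, so check them against the product documentation before use.

```python
import json


def build_s3_write_policy(bucket_name: str) -> dict:
    """Return an IAM policy document that allows writing objects to one bucket.

    Assumption: write-only access (s3:PutObject) is sufficient for this example.
    """
    return {
        "Version": "2012-10-17",  # current IAM policy language version
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                # Object-level actions apply to keys inside the bucket, hence "/*".
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "example-hub-bucket" is a hypothetical bucket name.
    print(json.dumps(build_s3_write_policy("example-hub-bucket"), indent=2))
```

The printed JSON can be pasted into the JSON editor when creating the policy in AWS Identity and Access Management.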

Writing Data to an Amazon S3 Bucket:

Virtually any Intelligence Hub payload may be written to an Amazon S3 bucket. Data written to an Amazon S3 bucket is often aggregated or buffered first and then written in a file format. An Intelligence Hub Pipeline is an effective way to configure this. The process consists of the following steps.

  • Configure an Amazon S3 Connection with the required authentication information as described above.
  • Create an Intelligence Hub Pipeline that includes the Stages needed to perform the desired actions.
  • Add a Pipeline Write or Write New Stage.
  • Configure the Stage’s Connection Output Settings:
      • The bucket name must be included. This must be an existing Amazon S3 bucket.
      • The key is the default file name in S3. It often includes a unique identifier and a date and time stamp. Dynamic references are supported. If the field is left empty, the file name is a GUID with a timestamp.
      • The payload reference is used when working with complex payloads. This setting uses dynamic outputs to specify the attribute that contains the file payload.
      • The storage class of the payload may be specified. Amazon S3 offers a range of storage classes that you can choose from based on the data access, resiliency, and cost requirements of your workloads.
      • Finally, a time may be added in front of the key to logically separate files in S3, in the form yyyy/MM/dd/HH/key.
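
The output settings above correspond closely to the parameters of an S3 PutObject request. The sketch below is not how Intelligence Hub is implemented. It only illustrates, under stated assumptions, how a bucket, key, storage class, and optional time prefix combine. The function name build_put_object_args, the use of UTC for the time prefix, and the sample values are hypothetical. Bucket, Key, Body, and StorageClass are the parameter names used by boto3's put_object call.

```python
from datetime import datetime, timezone
from typing import Optional


def build_put_object_args(
    bucket: str,
    key: str,
    payload: bytes,
    storage_class: str = "STANDARD",
    add_time_prefix: bool = False,
    now: Optional[datetime] = None,
) -> dict:
    """Assemble keyword arguments for an S3 PutObject request."""
    if not bucket:
        raise ValueError("bucket name must be included")
    if add_time_prefix:
        # Assumption: UTC. The article does not state which time zone is used.
        now = now or datetime.now(timezone.utc)
        # Produces yyyy/MM/dd/HH/key, e.g. 2024/03/05/07/file.json
        key = f"{now.strftime('%Y/%m/%d/%H')}/{key}"
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": payload,
        "StorageClass": storage_class,
    }


if __name__ == "__main__":
    # All values below are hypothetical sample data.
    args = build_put_object_args(
        bucket="example-hub-bucket",
        key="line1-telemetry.json",
        payload=b'{"temperature": 21.5}',
        storage_class="STANDARD_IA",
        add_time_prefix=True,
        now=datetime(2024, 3, 5, 7, 30, tzinfo=timezone.utc),
    )
    print(args["Key"])  # 2024/03/05/07/line1-telemetry.json
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as boto3.client("s3").put_object(**args) to check that the bucket, key layout, and storage class behave as expected before configuring the Pipeline Stage.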

Other related material