
Extract ServiceNow data using AWS Glue Studio in an Amazon S3 data lake and analyze using Amazon Athena


Many different cloud-based software as a service (SaaS) offerings are available in AWS. ServiceNow is one of the most common cloud-based workflow automation platforms used by AWS customers. In the past few years, we have seen many customers who wanted to extract and integrate data from IT service management (ITSM) tools like ServiceNow for various use cases:

  • Generate insight from data – You can combine ServiceNow data with data from other services, such as CRM (for example, Salesforce) or Martech data (for example, Amazon Pinpoint), to generate better insights (for example, building a complete customer 360 view).
  • Archive data for future business or regulatory requirements – You can archive the data in raw form in your data lake to work on future use cases, or simply keep it to satisfy regulatory requirements such as auditing.
  • Improve performance by decoupling reporting or machine learning use cases from ITSM – When you move your ITSM reporting from ServiceNow to an Amazon Simple Storage Service (Amazon S3) data lake, there is no performance impact on your ServiceNow instance.
  • Data democratization – You can extract the data and put it into a data lake so it is available to other business users and units to explore and use.

Many customers have been building modern data architectures on AWS, which includes building data lakes on Amazon S3 and using broad and deep AWS analytics and AI/ML services to extract meaningful information by combining data from different data sources.

In this post, we provide a step-by-step guide to bring data from ServiceNow to an S3 data lake using AWS Glue Studio and analyze the data with Amazon Athena.

Solution overview

In this solution, AWS Glue extracts ServiceNow data using a Marketplace connector. AWS Glue provides built-in support for the most commonly used data stores (such as Amazon Redshift, Amazon Aurora, Microsoft SQL Server, MySQL, MongoDB, and PostgreSQL) using JDBC connections. AWS Glue also allows you to use custom JDBC drivers in your extract, transform, and load (ETL) jobs. For data stores that are not natively supported, such as SaaS applications, you can use connectors. The extracted data is stored in Amazon S3 and cataloged in the AWS Glue Data Catalog, and we use Athena to query the data.

AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning (ML), and application development. AWS Glue provides all the capabilities needed for data integration so you can start analyzing your data and put it to use in minutes instead of months.

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

ServiceNow is a cloud-based software platform for ITSM that helps to automate IT business management. It’s designed based on ITIL guidelines to provide service orientation for tasks, activities, and processes.

The following diagram illustrates our solution architecture.
[Figure: solution architecture diagram]

To implement the solution, we complete the following high-level steps:

  1. Subscribe to the AWS Glue Marketplace Connector for ServiceNow from AWS Marketplace.
  2. Create a connection in AWS Glue Studio.
  3. Create an AWS Identity and Access Management (IAM) role for AWS Glue.
  4. Configure and run an AWS Glue job that uses the connection.
  5. Run the query against the data lake (Amazon S3) using Athena.

Prerequisites

For this walkthrough, you should have the following:

  • An AWS account.
  • A ServiceNow account. To follow along with this post, you can sign up for a developer account, which is pre-populated with sample data in many of the ServiceNow objects.
  • ServiceNow connection properties and credentials stored in AWS Secrets Manager. On the Secrets Manager console, create a new secret (select Other type of secrets) with a key-value pair for each property, for example:
    • Username – ServiceNow instance account user name (for example, admin)
    • Password – ServiceNow instance account password
    • Instance – ServiceNow instance name without https and .service-now.com

Copy the secret name to use when configuring the connection in AWS Glue Studio.
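As a rough illustration of the secret described above, the following Python sketch builds the key-value payload and strips the https prefix and .service-now.com suffix from the instance value. The function names and the sample instance dev12345 are hypothetical; the key names simply mirror the list above, so confirm them against the connector's documentation before relying on them. The resulting JSON string is the kind of value you could paste into the console or pass as the SecretString of a Secrets Manager secret.

```python
import json


def normalize_instance(value: str) -> str:
    """Reduce a ServiceNow URL or host name to the bare instance name."""
    name = value.strip()
    # Drop the scheme if the caller pasted a full URL.
    for prefix in ("https://", "http://"):
        if name.lower().startswith(prefix):
            name = name[len(prefix):]
    # Drop any path after the host name.
    name = name.split("/", 1)[0]
    # Drop the ServiceNow domain suffix.
    suffix = ".service-now.com"
    if name.lower().endswith(suffix):
        name = name[: -len(suffix)]
    return name


def build_secret_string(username: str, password: str, instance: str) -> str:
    """Return the JSON payload for the secret (key names follow the post)."""
    return json.dumps(
        {
            "Username": username,
            "Password": password,
            "Instance": normalize_instance(instance),
        }
    )


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(build_secret_string("admin", "example-password",
                              "https://dev12345.service-now.com/"))
```

Normalizing the instance name in code avoids a common mistake, which is storing the full URL where only the instance name is expected.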

Subscribe to the AWS Glue Marketplace Connector for ServiceNow

To connect, we use the AWS Glue Marketplace Connector for ServiceNow. You need to subscribe to the connector from AWS Marketplace.

The AWS Glue Marketplace Connector for ServiceNow is provided by a third-party independent software vendor (ISV) listed on AWS Marketplace. Associated subscription fees and AWS usage fees apply once subscribed.

To use the connector in AWS Glue, you need to activate the subscribed connector in AWS Glue Studio. The activation process creates a connector object and connection in your AWS account.

  1. On the AWS Glue console, choose AWS Glue Studio.
  2. Choose Connectors.
  3. Choose Marketplace.
  4. Search for the CData AWS Glue Connector for ServiceNow.


After you subscribe to the connector, a new config tab appears on the AWS Marketplace connector page.

  1. Review the pricing and other relevant information.
  2. Choose Continue to Subscribe.
  3. Choose Accept Terms.

After you subscribe to the connector, the next steps are to configure it.

  1. Retain the default selections for Delivery Method and Software Version to use the latest connector software version.
  2. Choose Continue to Launch.

  1. Choose Usage Instructions.

A pop-up appears with a link to activate the connector with AWS Glue Studio.

  1. Choose this link to start configuring the connection to your ServiceNow account in AWS Glue Studio.

Create a connection in AWS Glue Studio

Create a connection in AWS Glue Studio with the next steps:

  1. For Name, enter a unique name for your ServiceNow connection.
  2. For Connection credential type, choose username_password.
  3. For AWS Secret, choose the Secrets Manager secret you created as a prerequisite.

Don’t provide any additional details in the optional Credentials section because the connection retrieves the values from Secrets Manager.

  1. Choose Create connection and activate connector to finish creating the connection.

You should now be able to view the ServiceNow connector you subscribed to and its associated connection.

Create an IAM role for AWS Glue

The next step is to create an IAM role with the necessary permissions for the AWS Glue job. The name of the role must start with the string AWSGlueServiceRole for AWS Glue Studio to use it correctly. You need to grant your IAM role permissions that AWS Glue can assume when calling other services on your behalf. For more information, see Create an IAM Role for AWS Glue.

Attach the required AWS managed policies to the role, and add the following policy so that AWS Glue can read the secret you created in Secrets Manager:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": [
                "{secret name arn}"
            ]
        }
    ]
}

For more information about permissions, see Review IAM permissions needed for the AWS Glue Studio user.
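If you prefer to generate the Secrets Manager policy rather than edit the JSON by hand, a minimal Python sketch might look like the following. It reproduces the four actions from the policy above for a given secret ARN and checks the AWSGlueServiceRole naming rule. The function names, the example ARN, and the example role names are hypothetical and used only for illustration.

```python
import json

# Prefix that the post says the role name must start with.
ROLE_PREFIX = "AWSGlueServiceRole"

# Actions copied from the policy shown in the post.
SECRET_READ_ACTIONS = [
    "secretsmanager:GetResourcePolicy",
    "secretsmanager:GetSecretValue",
    "secretsmanager:DescribeSecret",
    "secretsmanager:ListSecretVersionIds",
]


def secret_read_policy(secret_arn: str) -> dict:
    """Build the policy document that allows reading one secret."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(SECRET_READ_ACTIONS),
                "Resource": [secret_arn],
            }
        ],
    }


def is_valid_glue_studio_role_name(role_name: str) -> bool:
    """Check the naming rule described in the post."""
    return role_name.startswith(ROLE_PREFIX)


if __name__ == "__main__":
    # Hypothetical ARN; replace with the ARN of your own secret.
    arn = "arn:aws:secretsmanager:us-east-1:111122223333:secret:servicenow-example"
    print(json.dumps(secret_read_policy(arn), indent=4))
    print(is_valid_glue_studio_role_name("AWSGlueServiceRole-ServiceNow"))
```

Scoping Resource to a single secret ARN, as the post does, keeps the role from reading unrelated secrets in the account.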

Configure and run the AWS Glue job

After you configure your connection, you can create and run an AWS Glue job.

Create a job that uses the connection

To create a job, complete the following steps:

  1. In AWS Glue Studio, choose Connectors.
  2. Select the connection you created.
  3. Choose Create job.

The visual job editor appears. A new source node, derived from the connection, is displayed on the job graph. In the node details panel on the right, the Data source properties tab is selected for user input.

Configure the source node properties

You can configure the access options for your connection to the data source on the Data source properties tab. For this post, we provide a simple walkthrough. Refer to the AWS Glue Studio User Guide for more information.

  1. On the Source menu, choose CData AWS Glue Connector for ServiceNow.

  1. On the Data source properties – Connector tab, make sure the source node for your connector is selected.

The Connection field is populated automatically with the name of the connection associated with the marketplace connector.

  1. Enter either a source table name or a query to use to retrieve data from the data source. For this post, we enter the table name incident.
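The "table name or query" choice in the previous step can be expressed as a small helper. This is only a sketch: it assumes the option names connectionName, dbTable, and query, which AWS Glue documents for marketplace JDBC connections, and the connection name servicenow-connection is hypothetical. Compare the result with the script that AWS Glue Studio generates for your job before depending on it.

```python
from typing import Optional


def source_connection_options(
    connection_name: str,
    table: Optional[str] = None,
    query: Optional[str] = None,
) -> dict:
    """Build source options from either a table name or a query, not both."""
    # Exactly one of table or query must be supplied.
    if (table is None) == (query is None):
        raise ValueError("Provide exactly one of table or query")
    options = {"connectionName": connection_name}
    if table is not None:
        options["dbTable"] = table
    else:
        options["query"] = query
    return options


if __name__ == "__main__":
    # Mirrors the walkthrough, which reads the incident table.
    print(source_connection_options("servicenow-connection", table="incident"))
```

Using a query instead of a table name lets you limit the columns or rows pulled from ServiceNow, which can reduce the amount of data the job has to process.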

  1. On the Transform menu, choose Apply Mapping.
  2. On the Node properties tab, for Node parents, select CData AWS Glue Connector for ServiceNow.
  3. Because you are connecting to an external data source, the Transform and Output schema tabs don’t show the schema extracted from the source when you first open them.
  4. To retrieve the schema, go to the Data preview tab, choose Start data preview session, and select the IAM role you created for this job.
  5. When the data preview is complete, go to the Data source section and choose Use datapreview schema.
  6. Go to the Transform tab and check all the columns where Data type shows as NULL.

  1. On the Target menu, choose Amazon S3.
  2. On the Data target properties – S3 tab, for Format, choose Parquet.
  3. For Compression Type, choose GZIP.
  4. For S3 Target Location, enter the Amazon S3 location to store the data.
  5. For Data Catalog update options, select Create a table in the Data Catalog and on subsequent runs, keep existing schema and add new partitions.
  6. For Database, enter sampledb.
  7. For Table name, enter incident.

Edit, save, and run the job

Edit the job by adding and editing the nodes in the job graph. See Editing ETL jobs in AWS Glue Studio for more information.

After you edit the job, enter the job properties.

  1. Choose the Job details tab above the visual graph editor.
  2. For Name, enter a job name.
  3. For IAM Role, choose an IAM role with the necessary permissions, as described previously.
  4. For Type, choose Spark.
  5. For Glue version, choose Glue 3.0 – Supports spark 3.1, Scala 2, Python 3.
  6. For Language, choose Python 3.
  7. For Worker type, choose G.1X.
  8. For Requested number of workers, enter 2.
  9. For Number of retries, enter 1.
  10. For Job timeout (minutes), enter 3.
  11. Use the default values for the other parameters.

For more information about job parameters, see Defining Job Properties for Spark Jobs.

  12. After you save the job, choose Run to run the job.

Note – Running the AWS Glue job incurs cost. For details, see AWS Glue Pricing.

To view the generated script for the job, choose the Script tab at the top of the visual editor. The Job runs tab shows the job run history for the job. For more information about job run details, see View information for recent job runs.
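For readers who script their job setup, the job properties entered above map roughly onto the parameters of the AWS Glue CreateJob API. The sketch below only assembles those parameters as a Python dictionary and does not call AWS. The function name, job name, role name, script location, and connection name are hypothetical placeholders, and the worker, retry, and timeout values are the ones used in this walkthrough.

```python
def glue_job_definition(
    name: str,
    role: str,
    script_location: str,
    connection_name: str,
) -> dict:
    """Assemble CreateJob-style parameters matching the walkthrough settings."""
    return {
        "Name": name,
        "Role": role,
        # "glueetl" is the command name for a Spark ETL job.
        "Command": {
            "Name": "glueetl",
            "ScriptLocation": script_location,
            "PythonVersion": "3",
        },
        "GlueVersion": "3.0",
        "WorkerType": "G.1X",
        "NumberOfWorkers": 2,
        "MaxRetries": 1,
        "Timeout": 3,  # minutes, as entered in the Job details tab
        "Connections": {"Connections": [connection_name]},
    }


if __name__ == "__main__":
    # Hypothetical names; substitute your own role, bucket, and connection.
    definition = glue_job_definition(
        name="servicenow-incident-extract",
        role="AWSGlueServiceRole-ServiceNow",
        script_location="s3://example-bucket/scripts/servicenow_incident.py",
        connection_name="servicenow-connection",
    )
    for key, value in definition.items():
        print(key, "=", value)
```

A dictionary like this could be passed as keyword arguments to an SDK call that creates the job. Keeping it as plain data also makes it easy to review the settings before anything is created in your account.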

Query against the data lake using Athena

After the job is complete, you can query the data in Athena.

  1. On the Athena console, choose the sampledb database.

You can view the newly created table called incident.

  1. Choose the options icon (three vertical dots) and choose Preview table to view the data.

Now let’s perform some analyses.

  1. Find all the incident tickets that are escalated by running the following query:
    SELECT task_effective_number FROM "sampledb"."incident"
    WHERE escalation = 2;

  1. Find the ticket count by priority:
    SELECT priority, count(distinct task_effective_number) FROM "sampledb"."incident"
    GROUP BY priority
    ORDER BY priority ASC;
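To see what the two queries above return without running anything in Athena, you can try them locally. The following Python sketch uses an in-memory SQLite database as a stand-in for Athena, which is an assumption made purely for illustration because the two engines differ in many respects. The five sample rows are invented, and only the three columns used by the queries are modeled. An ORDER BY is added to the first query so that the output order is predictable.

```python
import sqlite3


def demo_incident_queries():
    """Run the post's two queries against a tiny in-memory sample table."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database named sampledb so the
    # "sampledb"."incident" qualifier used in the post resolves.
    conn.execute("ATTACH DATABASE ':memory:' AS sampledb")
    conn.execute(
        "CREATE TABLE sampledb.incident ("
        "task_effective_number TEXT, escalation INTEGER, priority INTEGER)"
    )
    # Invented sample rows: (ticket number, escalation, priority).
    rows = [
        ("INC001", 0, 1),
        ("INC002", 2, 1),
        ("INC003", 2, 3),
        ("INC004", 1, 3),
        ("INC005", 0, 5),
    ]
    conn.executemany("INSERT INTO sampledb.incident VALUES (?, ?, ?)", rows)

    # Escalated tickets (escalation = 2, as in the post).
    escalated = [
        row[0]
        for row in conn.execute(
            'SELECT task_effective_number FROM "sampledb"."incident" '
            "WHERE escalation = 2 ORDER BY task_effective_number"
        )
    ]

    # Distinct ticket count per priority.
    by_priority = conn.execute(
        "SELECT priority, count(DISTINCT task_effective_number) "
        'FROM "sampledb"."incident" GROUP BY priority ORDER BY priority ASC'
    ).fetchall()

    conn.close()
    return escalated, by_priority


if __name__ == "__main__":
    escalated, by_priority = demo_incident_queries()
    print(escalated)    # ['INC002', 'INC003']
    print(by_priority)  # [(1, 2), (3, 2), (5, 1)]
```

With the sample rows shown, two tickets have escalation = 2, and priorities 1, 3, and 5 have two, two, and one distinct tickets respectively.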

Conclusion

In this post, we demonstrated how you can use an AWS Glue Studio connector to connect to ServiceNow and bring data into your data lake for further use cases.

AWS Glue provides built-in support for the most commonly used data stores (such as Amazon Redshift, Amazon Aurora, Microsoft SQL Server, MySQL, MongoDB, and PostgreSQL) using JDBC connections. AWS Glue also allows you to use custom JDBC drivers in your extract, transform, and load (ETL) jobs. For data stores that are not natively supported, such as SaaS applications, you can use connectors.

To learn more, refer to the AWS Glue Studio Connector, AWS Glue Studio User Guide, and Athena User Guide.


About the Authors

Navnit Shukla is an AWS Specialist Solution Architect in Analytics. He is passionate about helping customers uncover insights from their data. He builds solutions to help organizations make data-driven decisions.

Srikanth Sopirala is a Principal Solutions Architect at AWS. He is a seasoned leader with over 20 years of experience, who is passionate about helping customers build scalable data and analytics solutions to gain timely insights and make critical business decisions. In his spare time, he enjoys reading, spending time with his family, and road biking.

Naresh Gautam is a Principal Solutions Architect at AWS. His role is helping customers architect highly available, high-performance, and cost-effective data analytics solutions to empower customers with data-driven decision-making. In his free time, he enjoys meditation and cooking.
